What's the problem you're trying to solve? It's hard for anyone to give useful advice without any more context.
That said, I recommend whichever language is easiest for you. I use R and have not fully learned Python, so I have an obvious bias. If you're performing complicated statistical analysis, I'd recommend R, but for more traditional programming, I get the impression that Python interfaces more efficiently with other languages.
I use both. I like R for its charting capability and the sheer number of packages for different use cases. I use Python to pre-process data because it is a lot easier than in R. Also, scikit-learn, NumPy and pandas are really nice.
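To make the "pre-process in Python, analyse in R" split concrete, here is a minimal sketch. It uses only the standard library so it runs anywhere; the comment above mentions pandas, which would be the more usual choice. The raw data, column names and the helper names `clean_rows` and `to_csv` are all made up for illustration.

```python
import csv
import io

# Hypothetical raw export: stray whitespace, a missing score, inconsistent case.
RAW = """id, group ,score
1, A ,3.5
2,b,
3, a,4.25
"""


def clean_rows(text):
    """Yield tidy dicts: trimmed cells, upper-case group, float score or None."""
    reader = csv.reader(io.StringIO(text))
    header = [name.strip() for name in next(reader)]
    for raw in reader:
        if not raw:
            continue  # skip blank lines
        row = dict(zip(header, (cell.strip() for cell in raw)))
        yield {
            "id": int(row["id"]),
            "group": row["group"].upper(),
            "score": float(row["score"]) if row["score"] else None,
        }


def to_csv(rows):
    """Serialize cleaned rows to CSV text, writing missing scores as NA."""
    out = io.StringIO()
    writer = csv.DictWriter(
        out, fieldnames=["id", "group", "score"], lineterminator="\n"
    )
    writer.writeheader()
    for row in rows:
        score = "NA" if row["score"] is None else row["score"]
        writer.writerow({**row, "score": score})
    return out.getvalue()


cleaned = list(clean_rows(RAW))
print(to_csv(cleaned))
```

Missing values are written as `NA` because that is what R's `read.csv` treats as missing by default, so the cleaned file can be loaded on the R side without extra arguments.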
As everyone says, depends on what you are trying to do.
For all interactive/exploratory analysis, for statistical graphics, for more advanced statistics, for most statistics-related research work in general, I would definitely pick R out of these two.
If statistics is only a small part of the application, if you already know exactly what you have to do (i.e. no data exploration), if you have to do a lot of web/text processing -- probably Python.
Also, check which one has more/better packages related to what you are doing.
For some stats projects I would go with something else entirely though.
I'm not sure that's the distinction -- there is no shortage of general-purpose programming code written in R, as well as statistics done in Python.
It's more about use cases -- is it all about statistics, or is actual statistics only a small part? Is the focus primarily on research and exploration, or on implementation and deployment?
It really all boils down to what problem you're trying to solve, what kind of analysis you're trying to do, and what the performance requirements are. For basic stats, R and Python will be comparable in terms of library availability and functionality. If you start getting into more specialized and/or esoteric statistics, you will find more R packages (libraries) than you will Python libraries.
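As a rough illustration of the "basic stats" end of that spectrum, the sketch below computes summary statistics and a Welch t statistic with nothing but Python's standard library. The two samples are invented, and `summarize` and `welch_t` are hypothetical helper names. In R the same ground is covered by `summary()`, `sd()` and `t.test()`, which also gives you the p-value that this sketch leaves out.

```python
import math
import statistics

# Made-up measurements for two groups, purely for illustration.
control = [4.1, 3.9, 4.4, 4.0, 4.2]
treated = [4.8, 5.1, 4.6, 5.0, 4.9]


def summarize(values):
    """Return the sample size, mean, median and sample standard deviation."""
    return {
        "n": len(values),
        "mean": statistics.mean(values),
        "median": statistics.median(values),
        "sd": statistics.stdev(values),
    }


def welch_t(a, b):
    """Welch's t statistic for two independent samples (unequal variances)."""
    var_a, var_b = statistics.variance(a), statistics.variance(b)
    standard_error = math.sqrt(var_a / len(a) + var_b / len(b))
    return (statistics.mean(a) - statistics.mean(b)) / standard_error


print(summarize(control))
print(summarize(treated))
print("Welch t:", welch_t(treated, control))
```

Anything beyond this (p-values, mixed models, specialised tests) is where you would reach for SciPy or statsmodels on the Python side, or for the corresponding R package.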
I use both for my data analysis (though I'm not doing anything too fancy these days). R is great at statistics (that's what it was designed for) but a bit of a terrible programming language. So I combine the two: I do all the I/O etc. in Python and run the actual statistical analysis in R (RPy is a great Python interface to R).
...although, even at the risk of getting laughed at, Excel is an awesome tool for a lot of quick-and-dirty data analysis (depending on what you do). Throw in a little Python/R (e.g. via DataNitro) and it's even more powerful.
[+] [-] aaren|12 years ago|reply
I read this [blog] earlier and as a result I don't think I'll bother to learn R unless I have to.
[pandas]: http://pandas.pydata.org/
[statsmodels]: http://statsmodels.sourceforge.net/
[blog]: http://www.talyarkoni.org/blog/2013/11/18/the-homogenization...
[+] [-] svjunkie|12 years ago|reply
[+] [-] stadeschuldt|12 years ago|reply
[+] [-] xixi77|12 years ago|reply
[+] [-] code_scrapping|12 years ago|reply
You could compare R to MATLAB, or R to a Python library with similar scope (NumPy or pandas).
[+] [-] xixi77|12 years ago|reply
[+] [-] floppydisk|12 years ago|reply
[+] [-] bjoerns|12 years ago|reply
[+] [-] bjoerns|12 years ago|reply