top | item 25317338

Huntsecker | 5 years ago

Personally, I'm surprised R is still in active development, given that the main use case for R (at least when I was using it) was statistical analysis. Python with its libraries (a lot of them, I believe, ported from R) just does it nicer and faster.

peatmoss|5 years ago

R vs. Python flamewars always strike me as a Budweiser vs. Miller kind of argument. Neither is really a “craft beer” of programming languages. Neither is super remarkable as a programming language. Both made a bunch of pragmatic tradeoffs to appeal to large audiences that share similar values; both are “average joe” beers.

Python has a comparative advantage over R in production roles. R has a comparative advantage in statistical libraries, visualization, and metaprogramming. Neither is an exemplar for production deployment or metaprogramming (R is an exemplar for stats libraries, however).

canjobear|5 years ago

Tidyverse absolutely has a hipster craft beer feel to it. I think it's great, but it's true.

civilized|5 years ago

I'm really into this package that lets you manipulate tabular data using dozens of different systems with the exact same code

Yeah, you've probably never heard of it

civilized|5 years ago

Nah, it's not nicer. dplyr is way better than pandas. But there is no end to the supply of Python fanbois who only know Python and assume that whatever's in Python just has to be better

CameronNemo|5 years ago

I don't mind pandas so much, although dplyr is quite nice IMO (it feels like natural language and is declarative/SQL-like, whereas pandas ends up with lots of procedural idioms).

ggplot is something that I don't think matplotlib is comparable to at all, though. I am so much faster at iterating on a visualization with R/ggplot than Python/matplotlib. Maybe it is my tooling, though. How about others who have used both? What are your experiences?

hated|5 years ago

Pandas is used in some top 10 banks for analytics. Its performance is abysmal at the scale used there. Nobody wants to invest resources in training analysts to write high performance code so here we are. I have never viewed SQL more highly after seeing the mess that analysts make when writing imperative code.

fithisux|5 years ago

I use Python at work, but R is the uber weapon.

SubiculumCode|5 years ago

Yeah. Pick one, learn it, and you'll be fine, no matter whether you choose Python or R.

Personally, I prefer R for my use case, which is longitudinal analysis of experimental data.

canjobear|5 years ago

About 8 years ago I agreed with this point, but with the development of tidyverse, R has become far superior to Python for anything involving dataframes.

I teach classes involving data analysis, some in Python and some in R (different topics). The amount of time the Python students spend fighting pandas---looking up errors, trying to parse the docs, trying out new arcane indexing strategies---is obscene. On the other hand, the R students progress rapidly. I'd move everything to R if I could, but Python is still better for NLP pipelines.

iaw|5 years ago

I know R because that's what we used at my first company. I would love to switch to Python/Pandas but I'm comfortable with R and it does everything I need it to with one exception over ten years of heavy use.

Python is wonderful but the cognitive load for switching in industry and academia without a clear cost benefit isn't worth it to most people I know in my shoes. I encourage new coders to learn Python but discounting R feels a bit asinine.

Hadley is still actively doing work for R, which has led to a graphing package that is substantially better than anything in Python (last I checked). I have no doubt that Python will steal it and implement it eventually (as they should), but R is still doing firsts that Python hasn't (note the native implementation of piping; they're late to the party on lambda functions, obviously).

Icathian|5 years ago

I made the switch years ago and there is a lot that Python does better. I really, really wish for a perfect port of dplyr and ggplot2. Those are what I truly miss; everything else I'm pretty happy with.

civilized|5 years ago

R already has a better lambda than Python, simply by virtue of having first class functions. This is just a bit shorter notation for something that already existed.
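For anyone comparing the two, here is a minimal Python sketch of the lambda point. It only shows the Python side: a `lambda` is limited to a single expression, anything with statements needs a named `def`, and both are ordinary values that can be passed around. The `pipe` helper and the function names are made up for illustration and are not part of any library.

```python
from functools import reduce

# A Python lambda can only hold a single expression.
square = lambda x: x * x


# Anything that needs statements has to be a named def,
# which is still a value you can pass to other functions.
def clamp_then_square(x):
    if x < 0:
        x = 0
    return x * x


def pipe(value, *funcs):
    """Hypothetical helper: feed value through funcs left to right."""
    return reduce(lambda acc, f: f(acc), funcs, value)


negative_input = pipe(-3, clamp_then_square, square)
positive_input = pipe(3, clamp_then_square, square)
```

Chaining with a helper like this is roughly what a pipe operator gives you as syntax; Python itself has no built-in equivalent.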

zwaps|5 years ago

I use Python whenever I can, but R has loads and loads of statistical libraries that Python doesn’t. It is not even close.

Emphere|5 years ago

Yeah, basically this. I assume HN has a higher number of people who work in ML jobs in fields like finance, etc. If you're working in any sort of social/public health research, then most new methods seem to be implemented as R packages. I'm thinking of things like new propensity score methods, sequential trial designs, etc. It also seems to be the preferred language in Stats Stack Exchange posts.

free2OSS|5 years ago

What kind of stat problems?

Also, I used to love Python... until I got a full-time job and learned why static typing exists.