I'm not sure what would lead you to believe this. I've worked in the data science/ML space for over a decade now, and I see the majority of pure analytics projects started in R, including at big tech companies I've worked at recently.

Of course, ML projects and other things that need to result in production-grade models are almost always done in Python. This is currently the most visible form of "data project" due to all the ML/AI hype, but it is far from the only data work going on.

boringg|1 year ago

I'm curious about the people who use R at the big tech companies you've worked at. Were the R users people who had just come out of school and were still working in their academic dev environment before weaning themselves off it?

I always found that was the group who used R: a kind of "use what you're used to" approach, until it gets out of step with the rest of the workflow.

I would also say that the amount of R I see is far less than Python.

disgruntledphd2|1 year ago

So, (speaking as someone who started with R and now predominantly writes Python), I think there's a bunch of things going on here.

1. R is 100% better for analytics work and statistical modelling. There's just no contest.

2. Python is much, much better for getting data (APIs, scraping, etc.) and dealing with non-tabular data. Again, there's basically no contest here.

3. Software engineers hate R (in most cases), which means that it's easier to hand over work for production in Python.

This leads to a situation where it looks like most of the prod-level work is being done in Python, but if you look under the covers you'll discover that most prototyping/analysis/exploration is done in R and then ported to Python if it works.

Like, Python is a great language for lots of things, but it's pretty terrible for exploratory DS work (pandas is like the worst features of base R and base Python mashed together in an unholy hybrid).

There's also the fact that all the NN stuff is predominantly Python, so lots of companies believe that they need Python people, which reinforces the stereotype.

And finally, while I love R, Python has more guardrails, and it's harder to make an unmaintainable mess with it (relative to R), particularly when people use all the various lazy evaluation packages that the tidyverse has used over the past decade. (I once maintained a codebase that used all of these in different places; it was not a fun experience.)

aydyn|1 year ago

You're wrong. Python is outpacing R in usage. Every metric you can find proves it. R also has fundamental issues and lacks serious development.

vixen99|1 year ago

I'm not disputing this; I have no idea, so I'll assume you're correct. But how many metrics did you find, and how were they obtained? And how would you know they are representative of all R users?

stoperaticless|1 year ago

s/serious/hyped