
Ask HN: Learn Python, R? Or something else?

27 points| socrates1998 | 9 years ago

Hi Hacker News, I am doing a statistics project for some American football teams and I think it's going to require me to learn R.

Should I just learn R right away? Or should I learn another programming language first (like Python), then learn R?

I have some limited experience in web development, doing web design and building websites mainly in WordPress. As such, I know some HTML, CSS, and PHP.

Just looking to see if I could learn R without knowing much else.

43 comments

[+] RockyMcNuts|9 years ago|reply
You can learn R without knowing anything else. It's a good statistics package but a dated and quirky programming language.

I'd consider doing the Stanford Statistical Learning class which starts off with teaching you R. https://lagunita.stanford.edu/courses/HumanitiesSciences/Sta...

Also recommend Swirl, which is an interactive tutorial - http://swirlstats.com/students.html

Or you can go straight to Julia, which is a modern statistical package and language without R's quirks http://julialang.org/

If you're eventually going to put your project on the Web, or just want to learn programming the right way, you might be better off doing it in Python.

[+] phillc73|9 years ago|reply
What makes R a dated and quirky language?

I know a reasonable amount of R, but I don't really know any other programming language. I did once go looking at Julia, and had thought to write a few simple things using it, but in the end just ran out of time and did it with R.

Learning R has been one of the most challenging and enjoyable things I've done over the last three years. However, I would be interested to know where Python, Go or Julia would make my life better or easier.

I mostly write R scripts to analyse and graphically display interpretations of data. There are so many contributed libraries, it feels like I am really spoiled for choice.

I also write Shiny web apps. Is there a Python equivalent of Shiny? I have heard of Django, but is that really as easy as just writing a Python script and deploying on a server? I always thought Django was more of a complete framework which also required server side coding.

Anyway, I would genuinely like to know what advantage Python or Julia or Go would provide me with over just continuing with R.

[+] mindcrime|9 years ago|reply
If you definitely need to use R, I'd say just learn R. R is different enough from most other languages that I don't think you'd get a lot of value from, say, learning Python first.

Why do I say that? First of all, R's syntax is quirky and different enough from (Python|Java|C|Ruby|etc.) that you might almost find it harder to learn R if you're already used to something else. Second, aside from the syntax the biggest thing to get used to with R is that it's very much vector oriented. Basically you're always working with vectors, even when you only have what you would otherwise think of as a single scalar value. You just put in a vector of length 1. Anyway, that whole paradigm is different enough from other programming languages, that you might as well just learn it that way from the beginning.

Now to be fair, there are libraries and things that let other languages act and feel a bit more like R, but I'm intentionally not considering those right now, as that would just be one more complication to deal with. And if you are locked in on using R for whatever reason, there's no need to complicate life.

The only other question I would have, is whether or not you absolutely must use R at all. If you have the option to choose your language, you can do pretty much anything that you can do in R, using Python, or Octave, or probably many other languages. If that's an option, then you just need to decide which would be easier / more useful for you. And while I won't take sides in general, I will say that Python may be a little bit easier to learn in general, but then you're back to using external libraries for more of the statistical / numerical stuff.

> Just looking to see if I could learn R without knowing much else.

My guess is that you can. R has some quirks, but there's nothing especially scary about it. Depending on how much you already know about statistics, you may find that learning and understanding the math is more difficult than learning to use R.

[+] mateo411|9 years ago|reply
What dataset are you using? I would be interested in checking it out.

If you don't know R or Python, I would say that learning Python might work out better for you. Python is a general purpose programming language, whereas R is really good at stats and visualization. Python is also pretty good at this, you can use pandas, matplotlib, and scikit-learn.

[+] socrates1998|9 years ago|reply
It's a string of qualitative data that I want to interpret, like the team runs the ball Right, then Right, then Left.

My goal is to be able to analyze the next play and possibly give a probability attached to where the coach will call the play.

But that's just the first step. I would probably need to do a lot of different stuff with the data.
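To make that first step concrete, here is a minimal sketch of one way to do it: count how often each play direction follows each other direction, then turn the counts into probabilities. This is only an illustration in plain Python, not the method I have settled on. The function name `transition_probabilities` and the play sequence are made up for the example, and a real analysis would likely condition on more than just the previous play.

```python
from collections import Counter, defaultdict


def transition_probabilities(plays):
    """Estimate P(next play | current play) from an ordered list of play labels."""
    counts = defaultdict(Counter)
    # Count how often each play is immediately followed by each other play.
    for current, following in zip(plays, plays[1:]):
        counts[current][following] += 1
    # Normalize each row of counts so it sums to 1.
    return {
        current: {nxt: n / sum(followers.values()) for nxt, n in followers.items()}
        for current, followers in counts.items()
    }


# Hypothetical play sequence, invented purely for illustration.
plays = ["Right", "Right", "Left", "Right", "Left", "Left", "Right", "Right"]
probs = transition_probabilities(plays)
print(probs["Right"])  # estimated distribution of the play that follows a "Right"
```

The same counting idea carries over to R or pandas; the point is just that the "probability of the next play" can start out as a simple table of observed frequencies.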

[+] syntexis|9 years ago|reply
Learn R first, it's ideal for your project and not difficult.
[+] kasperset|9 years ago|reply
R should not be ignored if you are doing statistics. It has first-class support for stats built in, plus there are so many packages available to do more of the heavy lifting.
[+] dec0dedab0de|9 years ago|reply
If you're just analyzing the data R seems to be the right choice. If you're going to need to gather and parse the data then Python is a much better general purpose language.
[+] clumsysmurf|9 years ago|reply
I've been asking myself the same question. I often hear of Go as a better Python, so I was hoping to find a good numerical / statistical story for Go ...

Bruce Eckel had a blog about this, I think you'll find the discussion around R / Python / Mathematica interesting

http://bruceeckel.github.io/2015/02/15/why-not-go-there/

[+] 77ko|9 years ago|reply
I started and found Python easier than R. Python is a lot more 'english readable' while R is more like the code you see on Hollywood screens, somewhat indecipherable with magic incantations.

As a starter, you probably need something like dataquest[0] or Udacity's[1] data courses.

[0] https://www.dataquest.io [1] www.udacity.com

[+] anotheryou|9 years ago|reply
And if Python: 2 or 3?

(I'm kinda decided and want to do a bit of statistics, but also have one general purpose language I know a bit better)

[+] Sir_Substance|9 years ago|reply
R is good for statistics and nothing else. It is, however, really good at statistics.

Python is more general purpose. If you want to do things that are not statistics in the future, Python makes more sense.

If you're going to get started with python, get started with python 3.

[+] brunosaboia|9 years ago|reply
As others said, R is the way to go if you decide to work solely with statistics. My opinion of R is that it needs some improvements; sometimes it feels like a language made for prototyping only. But maybe that is its goal, so in the end it could be a good thing to match academic needs.
[+] eb0la|9 years ago|reply
For me the best part of R is documentation.

Using Python for data will help you learn something that can be used for more... but be warned that Dataframes on stuff like Spark don't work with list comprehensions (my favorite Python feature).

[+] busterarm|9 years ago|reply
If this is a one-off and you don't expect to need to do more of this kind of thing, I'd say learn Python, otherwise R.

Fortran is always fun though. :)

[+] probinso|9 years ago|reply
R is a bad programming language. Learn R.