
15 Page Tutorial for R

194 points| sndean | 9 years ago |studytrails.com

29 comments

[+] minimaxir|9 years ago|reply
This tutorial is much more basic and has far fewer practical statistical applications than the R tutorial posted last week (https://news.ycombinator.com/item?id=12264360), which itself is out of date relative to the R for Data Science book (http://r4ds.had.co.nz/).

I really am curious why anything "R" and "Tutorial" gets massively upvoted to the Top 3 of HN like clockwork nowadays. I might have to restart my R tutorial screencasts since there appears to be a demand. :P

[+] blahi|9 years ago|reply
Because it's exploding. Microsoft backing does wonders, for one. But it was growing even without them. Every analytics vendor standardizes on R.
[+] ekianjo|9 years ago|reply
> I really am curious why anything "R" and "Tutorial" gets massively upvoted to the Top 3 of HN like clockwork nowadays.

Same thing I was wondering. Is it simply that R is gaining popularity among a more mainstream audience, even though it hasn't changed that much recently?

[+] fxj|9 years ago|reply

  > a[1-3]
  [1] 1 3
  > a[[1-3]]
  Error in a[[1 - 3]] : attempt to select more than one element
The first line evaluates to a[-2]. I think what the author meant was a[1:3]. a[[-2]] gives an error, a[[2]] is valid R.
[+] capnrefsmmat|9 years ago|reply
a[-2] is valid R syntax. A negative index means that element is omitted from the result, so a[-2] is a without the second element.
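
For anyone following along, here is a minimal sketch of the indexing behaviour discussed above. The thread never shows how `a` was defined, so the three-element vector is an assumption chosen to match the printed output `[1] 1 3`.

```r
# Assumed example vector; the thread never shows how `a` was defined.
a <- c(1, 2, 3)

dropped <- a[1 - 3]   # 1 - 3 is -2, so this is a[-2]: everything except element 2
ranged  <- a[1:3]     # the colon operator builds the index vector 1, 2, 3
single  <- a[[2]]     # [[ ]] extracts exactly one element

# a[[-2]] would leave two elements, which [[ ]] refuses;
# capture the error instead of stopping the script.
double_neg <- tryCatch(a[[-2]], error = function(e) "error")

print(dropped)
print(ranged)
print(single)
print(double_neg)
```

The exact wording of the `a[[-2]]` error message differs between R versions, so the sketch only checks that an error is raised.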
[+] danso|9 years ago|reply
As an aspiring R-learner, I have a few quibbles in the first pages I've looked at:

re: the Assignment operator http://www.studytrails.com/R/Core/AssignmentOperator.jsp

> However, there is a difference between the two operators. = is only allowed at the top level i.e. if the complete expression is written at the prompt. so = is not allowed in control structures. Here's an example:

      > if (aa=0) {print ("test")}
      Error: unexpected '=' in "if (aa="
      > aa
      Error: object 'aa' not found
      > if (bb<-0) {print ("test")}
      > bb
      [1] 0
What in the LOLWTF... This is a bad example because even as an experienced programmer, I have no idea what is supposed to happen, or why this pattern would even be used -- and even then, I was surprised by the result. This is an overly complicated explanation even if it is correct. I prefer Hadley Wickham's style guide:

http://adv-r.had.co.nz/Style.html

> Use <-, not =, for assignment.

(an assignment operator that isn't an equals sign is one of the things I miss most when switching away from R)
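
A case where the two operators visibly differ is inside a function call. The following is a rough sketch, not taken from the tutorial: `scratch` is a hypothetical throwaway environment used only so the check does not depend on whatever is already in the workspace.

```r
# `scratch` is a hypothetical environment used to keep the demo isolated.
scratch <- new.env()

# Inside a call, `=` names an argument; no variable `x` is created.
evalq(median(x = 1:10), scratch)
eq_created <- exists("x", envir = scratch, inherits = FALSE)

# Inside a call, `<-` assigns `x` in the calling environment, then passes the value.
evalq(median(x <- 1:10), scratch)
arrow_created <- exists("x", envir = scratch, inherits = FALSE)

# `=` in an `if` condition is rejected by the parser, as in the quoted example.
parsed <- try(parse(text = "if (aa = 0) print('test')"), silent = TRUE)
parse_failed <- inherits(parsed, "try-error")

print(c(eq_created, arrow_created, parse_failed))
```

This is one reason style guides tend to recommend `<-` for assignment and reserve `=` for naming arguments.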

From the next chapter, Listing Objects: http://www.studytrails.com/R/Core/ListingObjects.jsp

> All entities in R are called objects. They can be arrays, numbers, strings, functions.

This may technically be the case, but any R tutorial that does not open up with what makes R different from other mainstream languages is doing the reader a major disservice. This Listing Objects chapter shows patterns that use R in ways that I haven't seen used in other R examples (beginner and advanced). Even if it's correct, what's the point in showing esoteric examples unless this tutorial is meant to teach R to someone interested in the design of languages?

Again, Wickham's Advanced R book handles this topic well (in fact, Advanced R is probably the best book you can read if you already know how to program -- it is incredibly accessible) in his early chapter on Data structures: http://adv-r.had.co.nz/Data-structures.html

> R’s base data structures can be organised by their dimensionality (1d, 2d, or nd) and whether they’re homogeneous (all contents must be of the same type) or heterogeneous (the contents can be of different types). This gives rise to the five data types most often used in data analysis...Almost all other objects are built upon these foundations. In the OO field guide you’ll see how more complicated objects are built of these simple pieces.

And this next sentence is the one thing I wished someone had printed out and stapled to my forehead before I started to learn R:

> Note that R has no 0-dimensional, or scalar types. Individual numbers or strings, which you might think would be scalars, are actually vectors of length one.

Maybe that's self-evident to other programmers, but even as someone who once programmed in MATLAB, I was stunningly ignorant of how every return value I interacted with was a vector, even a single simple string. In retrospect, the interactive shell alludes to this...but I didn't even bother looking up the details about the shell:

         > 2 + 2
         [1] 4
         > 'a'
         [1] "a"

R is wonderfully easy to start up with and produce visualizations with. But skipping over the language's fundamentals was incredibly painful for me. A few minutes skimming "Advanced R" would have easily saved me hours of confusion.
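
The "no scalars" point from the Advanced R quote can be checked directly at the prompt. A small sketch, with `n` and `s` as illustrative names that do not appear in the original comment:

```r
# `n` and `s` are illustrative names, not from the original comment.
n <- 2 + 2
s <- "a"

print(length(n))          # a "single number" has length 1
print(is.vector(n))       # and it is a vector
print(length(s))          # the same holds for a single string
print(identical(n, c(4))) # wrapping it in c() changes nothing

# Because it is a vector, indexing works, and going past the end yields NA.
print(n[1])
print(n[2])
```

The `[1]` printed in front of every result in the shell is the index of the first element on that line, which is the hint that the value is a vector.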
[+] stewbrew|9 years ago|reply
wrt assignments: = does assignments too but not eagerly. It's used for late bindings. This isn't a matter of style as your reference to adv-r would suggest but a different use case.

I wish people would spend more time reading the official language reference instead of all these half-baked tutorials.

[+] nerdponx|9 years ago|reply
The only place I ever see the "if (x<-y$x)" style used is deep in base R source that probably hasn't been touched in a decade.
[+] poisonarena|9 years ago|reply
I prefer Python
[+] mhuffman|9 years ago|reply
Also, if you are using models and algorithms on the backend, R is painful compared to Python.

If you are just doing interactive work, RStudio and Jupyter with well-stocked support libraries are about equivalent to me.

If you want to do something not all that popular (eg. conjoint analysis) or something cutting edge (duplicating recent research results), R is pretty much the only way to go.

[+] kimi|9 years ago|reply
Agree. While R's statistical stuff is much more polished, doing anything but running the regressions is a royal PITA.