One really underappreciated aspect of R is that it's a lisp at heart. This enables the user (and enterprising package writer) to build really clean abstractions for the task at hand.
The tidyverse suite of Hadley Wickham is a great example of this, notably with the pipe operator %>% (similar to |> in F#), which is not part of the base language and yet could be very easily implemented. Julia's macros probably enable the same type of implementation, but I don't see how one would achieve it as easily in Python, for example. Non-standard evaluation is another example of R's lispiness in action [0].
Also, consider how easy it is to walk R's S-exp. Expressions in R can only be one of four things: an atomic value, a name, a call or a pairlist. Wickham's Advanced R has a great intro on this [1].
I believe Wickham's amazing work with tidyverse (which really changes the way you code in R) is just the beginning of a rediscovery of R's inner lisp power, a kind of "R: the good parts" moment.
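To make the "could be very easily implemented" point concrete, here is a minimal sketch of a pipe-like operator in base R. The operator name %then% and the placeholder .lhs are made up for illustration, and this is not how magrittr implements %>%; it is just one way to express the idea with substitute() and eval().

```r
# Hypothetical pipe-like operator: x %then% f(y) is evaluated as f(x, y).
`%then%` <- function(lhs, rhs) {
  rhs_expr <- substitute(rhs)              # capture the right-hand side unevaluated
  if (!is.call(rhs_expr)) {
    rhs_expr <- as.call(list(rhs_expr))    # allow a bare function name: x %then% sqrt
  }
  # Rebuild the call with a placeholder symbol spliced in as the first argument.
  parts <- as.list(rhs_expr)
  piped <- as.call(c(list(parts[[1]], quote(.lhs)), parts[-1]))
  # Evaluate with .lhs bound to the left-hand value; every other name is
  # looked up in the caller's environment.
  eval(piped, list(.lhs = lhs), parent.frame())
}

total  <- c(1, 4, 9) %then% sqrt() %then% sum()        # sum(sqrt(c(1, 4, 9)))
sorted <- c(3, 1, 2) %then% sort(decreasing = TRUE)    # sort(c(3, 1, 2), decreasing = TRUE)
```

The whole thing rests on being able to take a call apart and put it back together as an ordinary list, which is the lispy part. A right-hand side that itself uses a variable called .lhs would clash with the placeholder; a real implementation has to be more careful.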
I have seen the HN crowd hate R in much the same way it hates JS. While I'm not getting into those details, I'd like to list a few reasons why I like R:
- RStudio is simply great. I know Python has got Jupyter notebook but RStudio makes a good IDE for anyone (even beginners).
- Python is considered great because beginners can start doing magic without getting frustrated, which makes it a good beginner's language. I think that applies even more to R: for anyone who wants to begin with data analytics, R is a lot easier, without having to figure out how to install a new package, load a new package, make a plot or anything of that sort. Hence the dropout rate would be lower.
- Tidyverse. Without denial, it's a better Universe than Marvel's cinematic universe. Not a single day in my job goes without using dplyr.
- While I've cited the tidyverse in general, ggplot2, by embracing the grammar of graphics, has set a very nice standard for visualization libraries, where matplotlib (the go-to library in Python) doesn't offer much.
- Pandas is essentially a library built on NumPy to offer R-like data wrangling functions, hence I consider dplyr and R's built-in data manipulation functions superior.
There is no doubt that Python has its own advantages, such as the single library scikit-learn and web services, but R is in no way something to be hated.
I know many people think otherwise, but I hate R for many reasons. Here are some of them:
- You can use '=' and '<-' to assign values to variables and both do the same, except in a few edge-cases where you now spend one week finding the error
- It confuses and mixes functional programming and oop not only per entity but also between the usage of them. Want to get a value of entity X? use x.getValue(). Want to get a value of entity Y? Use Y.getValue(y).
- The ide crashes once an hour and does not detect file-changes which forces you to restart it manually.
- People say R is the best and optimized for data-analytics which is simply not true. It's a marketing-lie spread by the creators. There is no data-analytics-task that you cannot do with the same ease in other programming languages.
Disclaimer: My big-data profs forced me to use R even for tasks where R should not be used.
I've been a heavy R user for about 7 years, and I only slightly disagree with one of your points.
(In my opinion) R is best for traditional statistics, as opposed to AI, machine learning, predictive analytics, data science, data analysis or any other variant thereof.
If you're more concerned with Chi-squared tests than unit tests, or if you need to teach a mathematician or a biologist how to fit regression models and analyse residuals, goodness-of-fit statistics, p-values etc, then R is the best language for the job.
If you need to build a program (as opposed to just do a thing), or if you're more interested in accuracy than inference (as per most machine learning tasks), then Python with sklearn and pandas blows R out of the water.
> People say R is the best and optimized for data-analytics which is simply not true. It's a marketing-lie spread by the creators. There is no data-analytics-task that you cannot do with the same ease in other programming languages.
Really? Maybe you worked with R before data.table, dplyr, and the tidyverse packages? I'm not that familiar with pandas in Python, but there is an incredible amount of productivity to be gained from knowing your way around a set of just around 5 packages in R that I never had when working with C++, Java, Perl, or Ruby.
Also, it could just be me, but I overcome the = and <- confusion by simply never using =.
I feel your pain! It took me a long time to get used to R. The only reason I tolerate it is I used SAS before that, so my point of comparison is an even more obtuse programming framework! Some general advice should you want to work with R some more:
- For assignment, always use '<-'. Read it as "set to". For example, "x <- runif(10)" means "set x to a vector of 10 uniform random numbers". When passing arguments in function calls, use '='.
- If the IDE gives you problems, try using the command line. R Studio or the R GUI app are not necessary. Simply type 'R' in a shell and you have an interactive read-line environment. Use the shell for exploratory work, then write code in your favorite editor and copy/paste after developing a series of commands you want to run.
- Use base R as much as possible; don't install a new package just for one function that you could replace with base R functions, even if the result is not elegant. Package bloat is one reason for inconsistencies in APIs. Some package developers will make you do x.getValue() and others getValue(x). But remember these are 3rd party packages. You can do a lot using just base R and a few select packages that are well respected (ggplot2, dplyr, Hmisc, reshape).
I complain about this every time a post on R programming comes up here, but my favorite thing to hate (out of many) about R is that there's no way to find out what the directory of the current script is. Imagine someone wanting to use relative paths to their data files so that they could version control their scripts and run them unmodified on different machines! We wouldn't want to enable such abominations now, would we!
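For what it's worth, there is a workaround that covers one case: when a script is launched with Rscript, its path shows up in commandArgs() as a --file= argument. A sketch under that assumption; the function name script_dir and the file name data.csv are made up, and this does not cover source() or an interactive session, where it just returns NA.

```r
# Hypothetical helper: directory of the script when launched via Rscript.
script_dir <- function() {
  args <- commandArgs(trailingOnly = FALSE)
  file_arg <- grep("^--file=", args, value = TRUE)
  if (length(file_arg) == 0) {
    return(NA_character_)  # interactive session or source(): not covered here
  }
  script_path <- sub("^--file=", "", file_arg[1])
  dirname(normalizePath(script_path, mustWork = FALSE))
}

data_dir <- script_dir()
# If data_dir is not NA, a data file next to the script could then be read
# with something like read.csv(file.path(data_dir, "data.csv")).
```

That it takes a helper like this at all rather supports the complaint.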
> - The ide crashes once an hour and does not detect file-changes which forces you to restart it manually.
There's more than one IDE for R [0], and strictly that's not a problem with R itself, but the people who built the IDE.
> - People say R is the best and optimized for data-analytics which is simply not true. It's a marketing-lie spread by the creators. There is no data-analytics-task that you cannot do with the same ease in other programming languages.
I think there's very little marketing behind R. It's predominantly a statistics package, but clearly a lot of statisticians are using it for data analysis. So I think it's the people using it for data analysis who talk about it, not bloggers paid to write about it.
I’ve known many quants who use both R and python/numpy/pandas for complementary tasks. The R standard library was generally spoken about in positive terms, but for data massaging and manipulation beyond pure maths/stats analysis a python environment probably offers much more flexibility.
Note that I don’t claim expertise in the above, but a bunch of very talented people I’ve worked directly with, and who were very directly incentivized to be productive, used R.
Perhaps your profs were trying to help you learn R, including its limitations, when they were setting you tasks?
I do not really disagree with you, except for the '<-' bit, just map it to a keyboard shortcut, and move on :).
But I would give R a try with the tidyverse, it made me go from hating R to just not caring about it.
While libraries are extremely inconsistent, if you want to use cutting-edge statistical methods as a researcher, you pretty much have no other option. Finally, data wrangling is quite well developed in the R environment.
So long story short, after many years of hating R, now I just find it a handy tool to do my work despite it being old, inconsistent and sometimes annoying.
Nice list of R's WTFs. I've recently hit the issue that you access properties of "S4 classes" (what are those? "The S4 object system. R has three object oriented (OO) systems: [[S3]], [[S4]] and [[R5]].") using @ instead of $.
I think that the users/library creators are also to blame for why working with R is such a pain. Giving them the option to overload operators was a major mistake. C++ programmers are more often engineers who have more concern for the code reader, and even they probably overuse it.
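For anyone hitting the same @ versus $ surprise, a small sketch of the difference. The class Person and its name slot are invented for the example.

```r
library(methods)  # usually attached already; made explicit so the sketch is self-contained

# An S4 class with a single slot.
setClass("Person", representation(name = "character"))
p <- new("Person", name = "Ada")

slot_value <- p@name            # S4 slots are read with @
same_value <- slot(p, "name")   # functional form of the same access

# Plain lists (and S3 objects built on top of them) use $ instead.
record <- list(name = "Ada")
list_value <- record$name
```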
"You can use '=' and '<-' to assign values to variables and both do the same, except in a few edge-cases where you now spend one week finding the error"
Can you tell us a bit about those edge cases that can lead to hard-to-find bugs?
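One commonly cited case (an illustration, not necessarily the one the parent ran into) is inside a function call: there '=' names an argument, while '<-' performs an assignment in the calling environment and then passes the value along. A minimal sketch, using a scratch environment so the side effect can be checked:

```r
# Run both calls in a scratch environment so the effect on variables is visible.
scratch <- new.env()

# '=' inside a call only names the argument; no variable x is created.
res_eq <- evalq(median(x = 1:10), scratch)
x_after_eq <- exists("x", envir = scratch, inherits = FALSE)

# '<-' inside a call assigns x in the calling environment, then passes the value.
res_arrow <- evalq(median(x <- 1:10), scratch)
x_after_arrow <- exists("x", envir = scratch, inherits = FALSE)
```

Both calls return the same median, which is why the difference is easy to miss: only the second one leaves an x behind, possibly overwriting one you already had.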
- There is also '->' which can be even more confusing (or helpful if you use pipes)
- Aren't those methods defined by each package/object? Or you mean '@'?
- RStudio (the most used IDE) is one of the reasons that I use R so heavily. Never encountered your problems. I can even git checkout to another branch without problems and the new versions are loaded without problem.
- For a quick descriptive analysis or some tests I don't know of anything easier, compared to SQL or Python. But that's probably just personal preference and/or knowledge of the language.
And I can see why some people like R. They are end users for whom the language was explicitly designed, so they like the ergonomics (to use the term Rustaceans are popularizing.)
The thing is, like Perl and Latex and other products you could think of, R was initially written by people with a good idea of the end uses and how to enable those end uses, but not a good idea on how to reconcile those ergonomics with the need for a clean parseable syntax.
So if you rely too extensively on R, you wind up having to hire someone like me.
I really enjoy R and the more I learn programming the more I enjoy it. Best thing I ever did was learn the language Racket and How to Design Programs and Hadley Wickham's tidyverse.
I moved from Python to R about six years ago. Before that I did most of my work on the command line. R's rise in popularity has been driven by the libraries in the tidyverse and data.table, the millions of dollars invested into R by many companies, and an amazing ecosystem.
> You can use '=' and '<-' to assign values to variables and both do the same, except in a few edge-cases where you now spend one week finding the error
It's just R's semantics. It is almost universally recommended to just use <- for style consistency and to avoid the edge cases. Use the RStudio shortcut `Alt` and `-`. The reason you spent a week is the reason why it is recommended that all users just use <-.
> It confuses and mixes functional programming and oop not only per entity but also between the usage of them. Want to get a value of entity X? use x.getValue(). Want to get a value of entity Y? Use Y.getValue(y).
R comes from S, and the creators of R were also inspired by Scheme. Personally I learned the language Racket to be a better R programmer, and I pretty much live on the functional side of R. I actually like the fact that they added a more functional core to R compared with S+. http://r.cs.purdue.edu/pub/ecoop12.pdf
> The ide crashes once an hour and does not detect file-changes which forces you to restart it manually.
Then use a different system. R does not equal RStudio. I have never experienced this and I have worked on Linux, Mac and Windows 7 - 10. To me RStudio is the best example of an electron app and the only IDE whose built-in git feature I actually use. R Projects and RWorkbooks are the best features of RStudio.
> People say R is the best and optimized for data-analytics which is simply not true. It's a marketing-lie spread by the creators. There is no data-analytics-task that you cannot do with the same ease in other programming languages.
So R is just as easy to use for data analysis and has best in class statistics? Also I think tidyverse is much easier than any other data analysis system I have ever seen, but I guess that is just my opinion.
> It's a marketing-lie spread by the creators.
What do Ross Ihaka and Robert Gentleman have to gain from marketing, and what lie have they ever told? This falls into conspiracy theory territory.
There is a large community of R users and we like R. There are a ton of
A few weeks ago I had to do some data transformation (just a few thousand lines of data). Because I have some history with Excel I started LibreOffice and wrote some formulas. After a few days I reached the point where LibreOffice required one and a half hours to recalculate the formulas.
That was the moment when I asked a friend of mine who has some R experience to help me with the basics (yes, the syntax is kinda weird at the beginning). After 4 hours of learning by doing we had the same result as what I had reached in a few days of work with LibreOffice, and it calculated everything in about 17 seconds. Yes, this time I knew exactly what I wanted, and R can do much more efficient transformations than you could ever do with a spreadsheet calculator. Nevertheless I was quite happy with the result.
As I am normally used to coding with vim and tmux, I use R just like a (bash) script with the following shebang:
#!/usr/bin/env Rscript
That way I can throw it into a watch myScript.R while I write it in vim in a different tmux pane. That might have some disadvantages compared to RStudio (e.g. can't view graphics in a terminal), but as it fits very nicely into my normal workflow and performs very well, I am very happy with that solution.
You can send a line to the R console using <space>. I've assigned loads of keyboard shortcuts beginning with your local leader that will do things like str(), levels(), head(), tail(), sum() on the object under the cursor.
It works fine with plotting figures, and I think you can set it up with tmux, though I use vim's buffers.
Haven't seen any disadvantages compared to RStudio yet. I guess you could even do :!git add ... from vim.
The book "R for Data Science" by Garrett Grolemund and Hadley Wickham (O'Reilly, 2017) [1] provides a comprehensive introduction to modern R and a set of packages known as the tidyverse. Highly recommended.
I second this - the tidyverse packages make R feel the way it's supposed to feel: tables make sense, string manipulation makes sense, and it's all grown out of one consistent approach, kind of the opposite of base R.
Just a shame that R's approach to namespace is so bad that importing tidyverse leads to a few name-clashes with bioconductor...
To save others some of the head-banging sessions I've had with R:
R has an integer division operator, %/%. R gives you the ability to define your own infix operators, as long as you give them symbols that start and end with %. Here's the kicker--all such operators have a higher precedence than multiply and divide, which can lead to unexpected results.
R as a programming language can be frustrating. It has scalar values; you just can't store one in a variable (it becomes a vector of length one). Some functions and operators will work with vectors of arbitrary length... but some require a vector of length one.
(Speaking of which, binary operations on vectors are done element by element, BUT if one operand runs out first, R will start picking its elements off from the beginning again, with a warning if the length of the longer one isn't a multiple of the length of the shorter one. This may be surprising.)
The wonky list notation takes time to get used to: foo[1] gives you a sublist; chances are you want foo[[1]].
Deciding which of the *apply() functions you want can be a pain. What passes for lambda expressions in R is clunky.
m:n gives you a vector of m, m + 1, ..., n... unless m > n, in which case it assumes you want m, m - 1, ..., n, so 1:0 won't give you an empty vector. This makes for clumsy special-case code.
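A few of the gotchas above, as a small sketch that can be pasted into a console. The variable names are only for illustration; seq_len() is the usual base R way around the 1:0 problem.

```r
# %any% operators (including %/%) bind tighter than * and /
unparenthesized <- 2 * 7 %/% 2     # parsed as 2 * (7 %/% 2), i.e. 6
parenthesized   <- (2 * 7) %/% 2   # 7

# Recycling: the shorter operand is reused from its start
recycled <- 1:6 + 1:2              # same as 1:6 + c(1, 2, 1, 2, 1, 2)

# [ keeps the list wrapper, [[ extracts the element itself
foo <- list(10, "b")
sub_list <- foo[1]                 # a list of length one
element  <- foo[[1]]               # the number 10

# 1:n counts down when n is 0; seq_len(n) gives an empty vector instead
n <- 0
counts_down <- 1:n                 # c(1, 0), so a for loop over it runs twice
empty <- seq_len(n)                # integer(0), so a for loop over it runs zero times
```

Writing for (i in seq_len(n)) instead of for (i in 1:n) avoids the special-case code for n = 0.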
Man I dislike R for its syntax. It does a terrible disservice to people who start coding in R and then think that they "know programming" while they have missed most of the basic programming paradigms any "normal" programming language has.
I think R has a lot of similar ideology as PHP and well everyone has their own opinion about PHP.
Also I found the tutorial seriously lacking. I mean, no data.frames, matrices, vectors, tables or factors? How to iterate over a data.frame might be the biggest thing a beginner needs to know before shooting themselves in the head. apply, lapply, sapply or vapply - which one do I need? Well, IMO apply is the best one to start with as it's the basis of them all. sapply is almost the same but it just transforms the result into a vector or matrix.
apply is NOT the basis for the other *apply functions; in fact apply is the exception. There's apply and there's lapply. The rest are variations of lapply.
As for people mistakenly believing that they "know programming" I don't think this has anything to do with R's syntax. R is a programming language but it's also a system for interactive data analysis thus the syntax had to be adapted to that end.
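On the *apply question, a short sketch of how the variants relate. The inputs are arbitrary, picked only to show the return types.

```r
# lapply always returns a list
squares_list <- lapply(1:3, function(i) i^2)

# sapply runs lapply and then tries to simplify the result to a vector
squares_vec <- sapply(1:3, function(i) i^2)

# vapply is the stricter variant: the expected type and length are declared
squares_checked <- vapply(1:3, function(i) i^2, numeric(1))

# apply is the odd one out: it works over the margins of a matrix or array
m <- matrix(1:6, nrow = 2)
row_sums <- apply(m, 1, sum)   # sums of (1, 3, 5) and (2, 4, 6)
```

For a beginner iterating over the columns of a data.frame, lapply or vapply is usually the safer starting point, since apply first converts its input to a matrix.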
I'll probably get downvoted for this, but let me tell you - Please don't use R in production. Please don't use R for any serious work.
Over the years, I've come to appreciate the fact that languages are just tools. You simply use the right tool for the job. If you let your personal bias, love/hate get in the way, it will cause you a lot of pain in the long run. By the same token, R is one of the most fucked up languages to work with if you use it simply because you assume it's good for all analytics-related projects. It's not.
In one of my previous companies, we had a hipster, always used everything that's on trend. Against all advice, he decided to use R for many of our internal and client facing projects.
For what would have taken a week if Rails were used, he'd write everything in R Shiny. Yes, he used a statistical programming language to write a web application and serve APIs(!). Performance was terrible. There were a lot of breakdowns. Development dragged on, and even his own team members lost morale. I unfortunately had the bad luck of having to maintain some of his codebases and those days were the worst in my life. Worse yet, he didn't have a formal software engineering background, so he loved the idea that you are able to code everything inside of this black box called RStudio. Fuck tests, there were no tests written because he didn't understand the importance of tests. The projects he worked on lasted for nearly 1.5 years without completion. Almost every project had an instance on the cloud running an R server and it also cost a LOT simply because it was eating a lot of memory. Even our Ruby projects didn't consume as much.
Eventually most of the projects failed and we lost a lot of customers. Many team members quit. All because of one singular mistake of choosing a language that's not right for the job. Eventually, one of our competitors came up with a working prototype in production using Python and Flask, with much better analytic capability at scale, in less than 3 months. Python can do a LOT that R can do and cannot do and the code is much, much easier to read.
For example, string concatenation:
Python:
"hello" + "world"
R:
paste("hello", "world", sep="")
If you're really interested in data science and/or analytics, I sincerely urge you to start with Python and Pandas together rather than R. It is much, much more performant, easier to reason about, and much, much easier to maintain and scale. Please consider this as heartfelt advice based on my mistakes rather than a rant. Thank you.
I have started really enjoying R (with tidyverse) because it allows me to present complicated topics in a very simple manner. I can easily embed short R snippets and LaTeX equations in an Emacs Org mode document, and then export it as a very nice-looking easy-to-read HTML or PDF document with basically no effort other than coming up with the text itself.
As the other comments on this submission imply, if you’re learning R from scratch, start with tidyverse.
You can use base R, but when people talk about how much they hate R, it’s usually because of base R, not tools like dplyr/ggplot2. (I had learned R and used it in college, and nearly quit R entirely until dplyr was released)
And over the last summer, I started using forcats/lubridate, and I am kicking myself for wasting my time not using them sooner and using ugly hacks for the appropriate functionality instead.
> For someone like me, who has only had some programming experience in Python, the syntax of R feels alienating initially. However, I believe it’s just a matter of time before adapting to the unique logicality of a new language.
I preferred R to Python right from the start. However, R is anything but logical, and its syntax is the least of its problems.
> And indeed, the grammar of R flows more naturally to me after having to practice for a while, and I began to grasp its kind of remarkable beauty, that has captivated the heart of countless statisticians throughout the years.
Wow, statisticians care about beauty? This is a shocking scientific discovery! (In the social sciences, but don't let this detract from your achievement.) What data do you use to support your theory?
I use R (or want to use it) whenever I find myself using Excel or Google Spreadsheets. If I were more fluent in R I would use it much more often. I found that using it instead of a standard spreadsheet was much more robust. Spreadsheets have their role; however, R is an amazing tool to have in your programming toolset.
It's clear there are a lot of strong opinions about R!
One kind of obscure problem I run into is R's embrace of a global namespace. Package developers sometimes assume people are using this namespace, and access it via the globalenv() function. This means that to use the package anywhere else, you basically have to patch their code.
(in contrast, I don't even think about problems like this occurring in python packages. Worst case scenario, can just use a subprocess )
R can be an annoying programming language, but for some reason I've found it easier to use for prototyping than even Python. I think it's because I can sloppily copy and paste between notepad and repl without much issue, whereas in Python I have to be concerned about the whitespace and things are a bit more verbose. I also get more out of the graphing capability of R, but that's probably because I don't understand Python's graphing well enough. Be that as it may, R just seems to have what I need to get things done as sloppily as I need. My workflow tends to be a combination of Python or Java spitting out numbers, and then using R to analyze and graph those numbers, all glued together with Bash scripts.
R is free software - see the R site above for the terms of use. It runs on a wide variety of platforms including UNIX, Windows and MacOS.
That's silly. If you're doing research using new statistical methods, they're almost certainly available on R first. And ggplot2 remains the best plotting library I've ever seen.
[+] [-] othello|8 years ago|reply
[0] http://adv-r.had.co.nz/Computing-on-the-language.html
[1] http://adv-r.had.co.nz/Expressions.html
[+] [-] amrrs|8 years ago|reply
Even millennial companies have found interest in R https://medium.com/airbnb-engineering/using-r-packages-and-e...
Edit:
Missed RShiny to simply create a web app (unlike in Python starting a Flask server and then writing stuff on top of it)
[+] [-] realPubkey|8 years ago|reply
[+] [-] ploika|8 years ago|reply
[+] [-] vijucat|8 years ago|reply
[+] [-] czep|8 years ago|reply
[+] [-] roel_v|8 years ago|reply
[+] [-] icc97|8 years ago|reply
[0]: https://stackoverflow.com/questions/1097367/what-ides-are-av...
[+] [-] tomalpha|8 years ago|reply
[+] [-] Yeikoff|8 years ago|reply
[+] [-] cshenton|8 years ago|reply
- documentation is all in PDF format
- can only install packages from the interpreter
- testing libraries not feature complete
- weird namespacing
- poor test coverage in popular packages
- no mature webserver
[+] [-] yread|8 years ago|reply
[+] [-] thousandautumns|8 years ago|reply
[+] [-] hutzlibu|8 years ago|reply
[+] [-] tugash|8 years ago|reply
[+] [-] ocschwar|8 years ago|reply
[+] [-] baldfat|8 years ago|reply
I really enjoy R and the more I learn programming the more I enjoy it. Best thing I ever did was learn the language Racket and How to Design Programs and Hadley Wickham's tidyverse.
I moved from Python to R about six years ago. Before that I did most of my work on the command line. R's rise in popularity has been driven by the libraries in the tidyverse and data.table, the millions of dollars invested into R by many companies, and an amazing ecosystem.
> You can use '=' and '<-' to assign values to variables and both do the same, except in a few edge-cases where you now spend one week finding the error
It's just R's semantics. It is almost universally recommended to just use <- for style consistency and to avoid the edge cases. Use the RStudio shortcut `Alt` and `-`. The reason you spend a week is the reason why it is recommended for all users to just use <-.
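A minimal sketch of one such edge case, assuming a hypothetical helper `square()` that is not from the thread: inside a function call, `=` names an argument, while `<-` performs an assignment in the calling environment and passes the value positionally.

```r
# Hypothetical helper, used only for illustration.
square <- function(value) value^2

# '=' inside a call names the argument; no variable called 'value' is created.
a <- square(value = 3)
created_by_equals <- exists("value", inherits = FALSE)

# '<-' inside a call assigns 'value' in the calling environment,
# then passes the result as the first positional argument.
b <- square(value <- 4)
created_by_arrow <- exists("value", inherits = FALSE)
```

Because R evaluates arguments lazily, the assignment in the second call only happens when the function actually uses that argument, which is one more reason such bugs can be hard to track down.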
> It confuses and mixes functional programming and oop not only per entity but also between the usage of them. Want to get a value of entity X? use x.getValue(). Want to get a value of entity Y? Use Y.getValue(y).
R comes from S, and the creators of R were also inspired by Scheme. Personally, I learned Racket to be a better R programmer, and I pretty much live on the functional side of R. I actually like the fact that they added a more functional core to R from S+. http://r.cs.purdue.edu/pub/ecoop12.pdf
> The ide crashes once an hour and does not detect file-changes which forces you to restart it manually.
Then use a different system. R does not equal RStudio. I have never experienced this, and I have worked on Linux, Mac and Windows 7 - 10. To me RStudio is the best example of an Electron app and the only IDE whose built-in git feature I actually use. R Projects and RWorkbooks are the best features of RStudio.
> People say R is the best and optimized for data-analytics which is simply not true. It's a marketing-lie spread by the creators. There is no data-analytics-task that you cannot do with the same ease in other programming languages.
So R is just as easy to use for data analysis and has best in class statistics? Also I think tidyverse is much easier than any other data analysis system I have ever seen, but I guess that is just my opinion.
> It's a marketing-lie spread by the creators.
What do Ross Ihaka and Robert Gentleman have to gain from marketing, and what lie have they ever told? This falls into conspiracy theory territory.
There is a large community of R users and we like R. There are a ton of
[+] [-] dcl|8 years ago|reply
[+] [-] Gatsky|8 years ago|reply
Regarding your 3rd point, I have never seen or heard anyone say R is the best, and I use it almost daily.
[+] [-] Zariff|8 years ago|reply
[+] [-] JepZ|8 years ago|reply
That was the moment when I asked a friend of mine who has some R experience to help me with the basics (yes, the syntax is kinda weird at the beginning). After 4 hours of learning by doing we had the same result as what I had reached in a few days of work with LibreOffice, and it calculated everything in about 17 seconds. Yes, this time I knew exactly what I wanted, and R can do much more efficient transformations than you could ever do with a spreadsheet calculator. Nevertheless I was quite happy with the result.
As I am normally used to coding with vim and tmux, I use R just like a (bash) script with the following shebang:
That way I can throw it into a watch myScript.R while I write it in vim in a different tmux pane. That might have some disadvantages compared to RStudio (e.g. can't view graphics in a terminal), but as it fits very nicely into my normal workflow and performs very well, I am very happy with that solution.
[+] [-] dm319|8 years ago|reply
I love it.
You can send a line to the R console using <space>. I've assigned loads of keyboard shortcuts beginning with your local leader that will do things like str(), levels(), head(), tail(), sum() on the object under the cursor.
It works fine with plotting figures, and I think you can set it up with tmux, though I use vim's buffers.
Haven't seen any disadvantages compared to RStudio yet. I guess you could even do :!git add ... from vim.
[1] https://github.com/jalvesaq/Nvim-R
[+] [-] jeroenjanssens|8 years ago|reply
[1] http://r4ds.had.co.nz/
[+] [-] a_bonobo|8 years ago|reply
Just a shame that R's approach to namespace is so bad that importing tidyverse leads to a few name-clashes with bioconductor...
[+] [-] icc97|8 years ago|reply
[0]: http://adv-r.had.co.nz/
[+] [-] cecilialee|8 years ago|reply
[+] [-] jejones3141|8 years ago|reply
R has an integer division operator, %/%. R gives you the ability to define your own infix operators, as long as you give them symbols that start and end with %. Here's the kicker--all such operators have a higher precedence than multiply and divide, which can lead to unexpected results.
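A short sketch of the precedence surprise described above. The `%plus%` operator is a hypothetical example defined only for this illustration.

```r
# Built-in integer division binds tighter than '*'.
tight <- 2 * 7 %/% 2        # parsed as 2 * (7 %/% 2)
explicit <- (2 * 7) %/% 2   # parentheses force the multiplication first

# Hypothetical user-defined infix operator that simply adds its operands.
`%plus%` <- function(lhs, rhs) lhs + rhs
custom <- 2 * 3 %plus% 4    # parsed as 2 * (3 %plus% 4)
```

Here `tight` is 6 while `explicit` is 7, and `custom` is 14 rather than the 10 one might expect from an "addition" operator.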
R as a programming language can be frustrating. It has scalar values; you just can't store one in a variable (it becomes a vector of length one). Some functions and operators will work with vectors of arbitrary length... but some require a vector of length one.
(Speaking of which, binary operations on vectors are done on corresponding elements, BUT if one operand runs out first, it will start picking them off from the beginning again, with a warning if the length of the longer one isn't a multiple of the length of the shorter one. This may be surprising.)
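The recycling rule can be sketched as follows; the warning is captured with a handler here only so the example runs quietly.

```r
# Lengths 4 and 2: the shorter vector is recycled silently.
even <- c(1, 2, 3, 4) + c(10, 20)

# Lengths 3 and 2: recycling still happens, but R emits a warning
# because 3 is not a multiple of 2. The warning is captured here.
warned <- FALSE
uneven <- withCallingHandlers(
  c(1, 2, 3) + c(10, 20),
  warning = function(w) {
    warned <<- TRUE
    invokeRestart("muffleWarning")
  }
)
```

`even` comes out as 11 22 13 24, and `uneven` as 11 22 13 with `warned` set to TRUE.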
The wonky list notation takes time to get used to: foo[1] gives you a sublist; chances are you want foo[[1]].
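For illustration, with a made-up list `foo`, the difference between the two kinds of brackets looks like this:

```r
foo <- list(a = 1:3, b = "text")

sub <- foo[1]     # still a list, of length one, holding element 'a'
elem <- foo[[1]]  # the element itself: the integer vector 1:3

sub_is_list <- is.list(sub)    # TRUE
elem_is_list <- is.list(elem)  # FALSE
```

Passing `sub` to a function that expects a plain vector is a common source of confusing errors, which is why `[[` is usually what is wanted.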
Deciding which of the *apply() functions you want can be a pain. What passes for lambda expressions in R is clunky.
m:n gives you a vector of m, m + 1, ..., n... unless m > n, in which case it assumes you want m, m - 1, ..., n, so 1:0 won't give you an empty vector. This makes for clumsy special case code.
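A small sketch of the pitfall, together with base R's `seq_len()`, which is commonly used to sidestep the special case (`seq_along()` plays the same role for an existing vector):

```r
n <- 0

# 1:0 is c(1, 0), so this loop body runs twice even though n is zero.
iterations_colon <- 0
for (i in 1:n) iterations_colon <- iterations_colon + 1

# seq_len(0) is an empty vector, so this loop body never runs.
iterations_seq <- 0
for (i in seq_len(n)) iterations_seq <- iterations_seq + 1
```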
[+] [-] tekkk|8 years ago|reply
I think R shares a lot of ideology with PHP, and, well, everyone has their own opinion about PHP.
Also, I found the tutorial seriously lacking. I mean, no data.frames, matrices, vectors, tables or factors? How to iterate over a data.frame might be the biggest thing a beginner needs to know before shooting themselves in the head. apply, lapply, sapply or vapply - which one do I need? Well, IMO lapply is the best one to start with, as sapply and vapply are built on it. sapply is almost the same, but it simplifies the result into a vector or matrix when it can.
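As a rough guide to the question above, here is a sketch on made-up data showing what each function returns. In base R, `sapply()` and `vapply()` behave like `lapply()` with simplification, while `apply()` is meant for matrices and arrays.

```r
values <- list(a = 1:3, b = 4:6)

as_list <- lapply(values, sum)              # always returns a list
as_vector <- sapply(values, sum)            # simplifies to a named vector when it can
checked <- vapply(values, sum, integer(1))  # like sapply, but the result type is declared

m <- matrix(1:6, nrow = 2)
row_totals <- apply(m, 1, sum)              # apply() works over rows (1) or columns (2)
```

`vapply()` is often preferred in non-interactive code because it fails loudly if a result does not match the declared type and length.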
[+] [-] yummy|8 years ago|reply
[+] [-] lottin|8 years ago|reply
As for people mistakenly believing that they "know programming", I don't think this has anything to do with R's syntax. R is a programming language, but it's also a system for interactive data analysis, thus the syntax had to be adapted to that end.
[+] [-] neya|8 years ago|reply
Over the years, I've come to appreciate the fact that languages are just tools. You simply use the right tool for the job. If you let your personal bias, love/hate get in the way, it will cause you a lot of pain in the long run. By the same token, R is one of the most fucked up languages to work with if you use it simply because you assume it's good for all analytics-related projects. It's not.
In one of my previous companies, we had a hipster, always used everything that's on trend. Against all advice, he decided to use R for many of our internal and client facing projects.
For what would have taken a week if Rails were used, he'd write everything in R Shiny. Yes, he used a statistical programming language to write a web application and serve APIs(!). Performance was terrible. There were a lot of breakdowns. Development dragged on, and even his own team members lost morale. I unfortunately had the ill luck of having to maintain some of his codebases, and those days were the worst in my life. Worse yet, he didn't have a formal software engineering background, so he loved the idea that you are able to code everything inside of this black box called RStudio. Fuck tests, there were no tests written because he didn't understand the importance of tests. The projects he worked on lasted for nearly 1.5 years without completion. Almost every project had an instance on the cloud running an R server, and it also cost a LOT simply because it was eating a lot of memory. Even our Ruby projects didn't consume as much.
Eventually most of the projects failed, and we lost a lot of customers. Many team members quit. All because of one singular mistake of choosing a language that's not right for the job. Eventually, one of our competitors came up with a working prototype in production using Python and Flask, with much better analytic capability at scale, in less than 3 months. Python can do a LOT that R can do and cannot do, and the code is much, much easier to read.
For example, string concatenation:
Python:
R:
If you're really interested in data science and/or analytics, I sincerely urge you to start with Python and Pandas together rather than R. It is much, much more performant, easier to reason about, and much, much easier to maintain and scale. Please consider this as heartfelt advice based on my mistakes rather than a rant. Thank you.
[+] [-] kqr|8 years ago|reply
It is incredibly liberating.
[+] [-] minimaxir|8 years ago|reply
You can use base R, but when people talk about how much they hate R, it’s usually because of base R, not tools like dplyr/ggplot2. (I had learned R and used it in college, and nearly quit R entirely until dplyr was released)
And over the last summer, I started using forcats/lubridate, and I am kicking myself for wasting my time not using them sooner and using ugly hacks for the appropriate functionality instead.
[+] [-] catnaroek|8 years ago|reply
I preferred R to Python right from the start. However, R is anything but logical, and its syntax is the least of its problems.
> And indeed, the grammar of R flows more naturally to me after having to practice for a while, and I began to grasp its kind of remarkable beauty, that has captivated the heart of countless statisticians throughout the years.
Wow, statisticians care about beauty? This is a shocking scientific discovery! (In the social sciences, but don't let this detract from your achievement.) What data do you use to support your theory?
[+] [-] tomerbd|8 years ago|reply
[+] [-] closed|8 years ago|reply
One kind of obscure problem I run into is R's embrace of a global namespace. Package developers sometimes assume people are using this namespace, and access it via the globalenv() function. This means that to use the package anywhere else, you basically have to patch their code.
(in contrast, I don't even think about problems like this occurring in python packages. Worst case scenario, can just use a subprocess )
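A hedged sketch of the pattern being described. Both functions and the `threshold` variable are hypothetical, not taken from any real package: the first hard-codes the global environment, the second accepts an environment so it also works when called from inside another function.

```r
# Hypothetical package function that reads its configuration from the global
# environment: it only works if the caller defined 'threshold' there.
above_threshold_global <- function(x) {
  threshold <- get("threshold", envir = globalenv())
  x[x > threshold]
}

# Variant that takes the environment as an argument, defaulting to the
# caller's frame, so it can be used from inside other functions.
above_threshold <- function(x, env = parent.frame()) {
  threshold <- get("threshold", envir = env)
  x[x > threshold]
}

run_analysis <- function() {
  threshold <- 2          # local to run_analysis(), not global
  above_threshold(1:5)    # finds the local 'threshold'
}
result <- run_analysis()
```

Calling `above_threshold_global(1:5)` from `run_analysis()` would ignore the local `threshold` and look only in the global environment, which is the kind of assumption that forces users to patch package code.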
[+] [-] uptownfunk|8 years ago|reply
[+] [-] doggydogs94|8 years ago|reply
[+] [-] abakus|8 years ago|reply
[+] [-] yters|8 years ago|reply
[+] [-] jerianasmith|8 years ago|reply
[+] [-] gregman1|8 years ago|reply
[+] [-] pmyteh|8 years ago|reply
[+] [-] icc97|8 years ago|reply
Perhaps you missed off a word at the end of the sentence? (Not that this makes it much better, but at least comprehensible)
... unable to learn Python?