mshron | 4 years ago
Over the last decade, the R community has largely standardized around tools like dplyr, ggplot2, tibble, purrr, and so on, which make data science work much easier to reason about and far more ergonomic. At my company we switched from Python to R for most analytical data science work because the Tidyverse tools make it so much easier to avoid bugs and weird join issues than in a more imperative programming environment.
uryga|4 years ago
Consider this example:
(Source: https://tidyeval.tidyverse.org/sec-why-how.html)
Where'd `height` and `gender` come from in the dplyr version? They're just columns in a data frame, not variables, and yet they act like variables... Well, that's the dplyr magic, baby!
dplyr (and other tidystuff) achieves this "niceness" by doing a whole bunch of what amounts to gnarly metaprogramming[1]. That example was taken from a whole big chapter about "Tidy evaluation", describing how it does all this quote()-ing and eval()-ing under the hood to make the "nicer" version work. It's (arguably) more pleasant to read and write, but much harder to actually understand: "easy, but not simple", to paraphrase a slightly tired phrase.
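For the curious, here is a minimal base-R sketch of that quote()-then-eval() idea. It is not dplyr's actual implementation (tidy eval is built on rlang's quosures and data masks); `my_filter` and the `people` data frame are made up purely for illustration.

```r
# Hypothetical toy data; the column names mirror the ones discussed above.
people <- data.frame(
  name   = c("a", "b", "c"),
  height = c(180, 210, 165),
  gender = c("male", "male", "female")
)

# quote() captures an expression without evaluating it...
expr <- quote(height < 200)

# ...and eval() can run it with the data frame's columns visible as variables.
eval(expr, people)
# → TRUE FALSE TRUE

# my_filter() is a made-up helper, not part of dplyr.
my_filter <- function(df, condition) {
  # Grab the caller's argument as an unevaluated expression.
  cond_expr <- substitute(condition)
  # Look names up in df first, then fall back to the caller's environment.
  keep <- eval(cond_expr, df, parent.frame())
  df[keep, , drop = FALSE]
}

# `height` and `gender` are never defined as variables, yet this works.
result <- my_filter(people, height < 200 & gender == "male")
print(result)
```

Base R's own `subset()` works on the same principle, which is also why it carries a warning in its docs about being meant for interactive use.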
---
[1] IIRC it works something like this: the expressions are actually passed to `filter` as unevaluated ASTs (think Lisp's `quote`), and then evaluated in a specially constructed environment with added variables like `height` and `gender` corresponding to your data frame's columns. IIRC this means it can do some cool things like run on an SQL backend (similar to C#'s LINQ), but it's not something I'd expose a beginner to.
canjobear|4 years ago
buixuanquy|4 years ago
jstx1|4 years ago
melling|4 years ago
|>
https://www.r-bloggers.com/2021/05/new-features-in-r-4-1-0/
tarsinge|4 years ago
tpoacher|4 years ago
Avoid tidyverse like the plague, except when you can't, or when you don't actually care about the sanity of your code and are happy copy/pasting prescribed snippets without needing to understand them, let alone modify them.
mr_toad|4 years ago
One day I’ll have a whole week free so I can sit down and learn an entire graphical grammar so that I can remove the egregious amounts of chart-junk in the ggplot defaults.