top | item 21189755

maltelau | 6 years ago

A popular counterpoint in the R community is that in many data cleaning tasks, the bottleneck is human understanding and coding time, not computation time. In other words, we'd rather spend 1 hour writing a script that runs in 10 minutes and needs to be run a handful of times at most than spend 6 hours writing something that runs in 10 seconds.

Edit: This of course goes hand in hand with the claim that R scripts are easier and faster to write. If you're not familiar with it, the tidyr and dplyr packages in particular (part of the tidyverse) provide fantastic verbs for thinking about data cleaning.
