A popular counterpoint in the R community is that in many data cleaning tasks, the bottleneck is human understanding and coding time, not computation time. In other words, we'd rather spend 1 hour writing a script that runs in 10 minutes and needs to be run a handful of times at most than spend 6 hours writing something that runs in 10 seconds.

Edit: This of course goes hand-in-hand with the claim that R scripts are easier and faster to write. If you're not familiar with them, the tidyr and dplyr packages in particular (part of the tidyverse) provide fantastic verbs for thinking about data cleaning.