rlh2 | 3 years ago
I often chuckle when people complain about R in production and how it isn't a good general purpose programming language; my experience has been the polar opposite. You can write bad code in any language, and R is no exception, but R allows you to write so much less code, and R-core is truly exceptional at backwards compatibility. Our approach to R is basically:
- Don't have a lot of dependencies, and when you do have dependencies, make sure they themselves don't have a lot of dependencies. While we do use shiny as mentioned above, our core models are very dependency light and shiny is just a basic front end.
- data.table (which was designed by quants) is a zero-dependency package that is by far the best tabular data manipulation package that has ever been created since the dawn of time. We generally work on an EC2 instance running Linux with a ton of memory. In the < 0.01% of cases where a dataset doesn't fit in memory (e.g. tick data), we do initial parsing with awk if file-based or SQL if DB-based and then work in R.
- Check/coerce argument types and lengths on function input to catch and avoid all the quirky edge cases that drive people nuts - it's so easy!
- I hate OOP and I love that R doesn't encourage it. Mutable state, especially for non-software engineers, is the devil. Don't get me wrong, OOP has its place, but the fact that R encourages functional programming is one of the best things about it. The slight inefficiency this produces is almost never a problem.
- R is not slow at all when used correctly. Additionally, the C API is a joy to use when necessary.
- Stick to the base types: vectors, matrices, lists, environments and data.tables (the only exception). The fact that you can name the elements of all of the above, and then index them by name, is stunningly powerful. The only "objects" we really create are lightweight extensions of lists with an S3 print method.
- We have an internal version of renv/packrat that creates a plain text "dependency file" for projects, and we pin package versions in Docker containers. RStudio Connect doesn't use Docker right now, but it does have a versioning system that works quite well in my experience.
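For anyone who wants the argument-checking and "list plus S3 print method" points spelled out, here is a minimal sketch in base R with no packages. The names `make_position`, `print.position` and `position_value` are hypothetical, chosen only for illustration; the checks shown are one reasonable way to do it, not the only one.

```r
# Hypothetical constructor: check and coerce inputs up front so that
# odd types and lengths fail loudly here instead of somewhere downstream.
make_position <- function(ticker, qty, price) {
  stopifnot(is.character(ticker), length(ticker) == 1L, !is.na(ticker))
  # Coerce numeric-like input (e.g. an integer qty) to double.
  qty <- as.numeric(qty)
  price <- as.numeric(price)
  stopifnot(length(qty) == 1L, length(price) == 1L,
            !is.na(qty), !is.na(price), price >= 0)
  # A plain named list with a class attribute: no mutable state.
  structure(list(ticker = ticker, qty = qty, price = price),
            class = "position")
}

# The only "object" behaviour: an S3 print method.
print.position <- function(x, ...) {
  cat(sprintf("<position> %s: %g @ %g\n", x$ticker, x$qty, x$price))
  invisible(x)
}

# Ordinary function over the list; elements are accessed by name.
position_value <- function(p) p[["qty"]] * p[["price"]]

# Named vectors can be indexed by name in the same way.
prices <- c(ABC = 2.5, XYZ = 4)
pos <- make_position("ABC", 10L, prices[["ABC"]])
print(pos)
```

Because the result is still just a list, `pos$qty`, `pos[["price"]]` and `names(pos)` all work as usual, and a bad call such as `make_position(c("A", "B"), 1, 1)` stops with an error at the boundary.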
I definitely wouldn't want to build something like a company website in R, but I wouldn't want to build that in C either. R definitely has its place as a server-side language, even outside its assumed domain of statistics.
Haters gonna hate, but joke is on them.