top | item 34567305

rlh2 | 3 years ago

The obsession with CPU speed almost always confuses me in these topics. The time it takes to program is way more important, and that’s where a terse language like R shines. The base/most common functions are almost always executing C anyway. It’s kind of like Lisp in that it’s easy to write slow code, but who cares if it’s “fast enough”? Also, it’s almost always easy to speed things up at the R level if necessary, and R’s C API is easy to use for numeric computing/optimization, which is exposed at the C level if you want to use it.

kkoncevicius|3 years ago

It depends. Take, for example, any omic dataset where you might need to run a GLM on ~500,000 rows. Code I've seen for this operation can take anywhere from 30 minutes to 2 days.

My takeaway here is that, sure, for one operation the speed is not that critical, but there is always the case where that one operation will be used close to a million times in one analysis, and then it all adds up. On top of that, if it's implemented in C, then the invocation from R to C and back will happen that many times, which adds to the slowness.
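A rough sketch of that per-call overhead, written in Python rather than R and using only the standard library. It is an illustration under stated assumptions, not a benchmark of any GLM code: the function names `total_per_call` and `total_batched` are hypothetical, and integer addition stands in for the real per-row operation. Both functions do their arithmetic in C; the difference is whether the interpreter crosses into C once per element or once for the whole batch.

```python
import operator
import timeit


def total_per_call(values):
    """Cross the interpreter-to-C boundary once per element."""
    total = 0
    for v in values:
        # operator.add is implemented in C, but it is invoked from the
        # interpreted loop once for every element.
        total = operator.add(total, v)
    return total


def total_batched(values):
    """Cross the boundary once; the loop itself runs inside C."""
    return sum(values)


# Row count borrowed from the comment above; integers keep both results exact.
values = list(range(500_000))

if __name__ == "__main__":
    # Timings vary by machine, so none are quoted here.
    per_call = timeit.timeit(lambda: total_per_call(values), number=3)
    batched = timeit.timeit(lambda: total_batched(values), number=3)
    print(f"per-element calls: {per_call:.3f}s")
    print(f"single batched call: {batched:.3f}s")
```

Both functions return the same total, so any gap in the printed timings comes from how the work is dispatched rather than from the arithmetic itself.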

derbOac|3 years ago

Yes, I use R, Julia, and Python from time to time depending on the case and my mood and they all have their advantages and disadvantages.

R is more than fast enough for straightforward prototypical analyses where a lot of the code is calling C or something lower level and you're not introducing something "new" to the interpreter system. But if you want to do some unusual optimization, there's going to be something that bottlenecks everything unless you go into C/C++/Fortran yourself, and then Julia is a good compromise. I've had times when Julia didn't save any time whatsoever, and other times when it took something that would literally run for at least a week in R and finished it in 30 minutes.

Having said that, the more I use Julia the more I find myself scratching my head about it. It's very elegant but it's just low-level enough that sometimes I wonder if it's worth it over, say, modern C++ or something similarly low level, which tends to have nice abstracted libraries that have accumulated over the years. I also have the general impression, mentioned in a controversial post discussed here on HN, that a lot of Julia libraries I've used just don't quite work for mysterious reasons I've never been able to figure out. Everything with Julia has gotten better with time but I still have this sense that I could put a lot of time into some codebase, and have it just hit a wall because of some dependency that's not operating as documented.

There's kind of an embarrassment of riches in numerical computing today, and yet I still have the feeling there's room for something else. Maybe that's the mythical golden language that's lured all sorts of language developers since the beginning though.

Hasnep|3 years ago

One of the key points of Julia is that the language you use for performance critical parts is also Julia. That applies to both the libraries like DataFrames.jl and for situations where you'd drop to a lower level language when optimising. I think being productive in Fortran or C++ is unrealistic for most scientific programmers.

freehorse|3 years ago

It is a trade-off, and the sweet spot has a lot to do with the specific context and background. Run speed matters a lot when the difference is between running your code on a dataset for half an hour vs. through the whole night. Once you have prototyped your code, you are going to use it more and more (not to mention runs to tweak parameters or validate results), and R's speed is not satisfying enough for my work. Python and MATLAB are easy and fast enough to program in, and much faster for tasks that are computing-heavy. If I were getting into C, I would not have saved as much time as I would have put into learning how to run, e.g., parallel tasks there safely. Moreover, R is not always faster to program in; real (i.e. tidyverse-style) R is quite idiosyncratic, and if you come from a programming rather than a statistics background, it will probably take more time to learn than it is worth unless it is something important in your work environment.

CyberDildonics|3 years ago

When someone understands what is happening when their program executes, they will write faster programs without much more effort.

You might like writing slow programs, but that doesn't mean people like using them.