top | item 6823240

tsahyt | 12 years ago

I've already posted this on /r/haskell on reddit, but since the article turned up here as well, I might as well post it again. Since we're talking optimization, I had a quick look at the C code, found it to be a very naive implementation, and managed to cut the execution time considerably by employing a form of memoization: once you reach a number in the sequence that you have already seen before, the length of the chain from there down to 1 is the same as before, so it can be looked up instead of recomputed.

So here is the code: http://lpaste.net/96397

Here are my measurements:

    time ./original 1000000
    (329, 837799)

    real   0m0.183s
    user   0m0.180s
    sys    0m0.001s

    time ./new 1000000
    (329, 837799)
    
    real   0m0.017s
    user   0m0.015s
    sys    0m0.001s

deletes | 12 years ago

The idea is to optimize while keeping the algorithm the same across all the languages, at least if you want a meaningful comparison. You are comparing languages, not different algorithms. +1 for effort though.

anon1385 | 12 years ago

It doesn't seem that meaningful to compare a totally naive C implementation to an optimised Haskell one either (well, other than to show how fast C is even without any effort to optimise the code). Insisting that the algorithm stay the same seems like a rather arbitrary limitation on the comparison, especially since some algorithms are much less of a natural fit in some languages (see: quicksort in Haskell vs C [1]).

I would suggest it's best to show at least four versions of the code to compare two languages: the naive implementation in both, and a seriously optimised version in both (plus intermediate levels of optimisation if you feel dedicated). That gives a picture of how fast the typical code one would write will be, of the headroom for optimisation if you run into bottlenecks, and of how much effort that optimisation takes.

[1] http://augustss.blogspot.co.uk/2007/08/quicksort-in-haskell-...

ejgottl | 12 years ago

Agreed. Optimizing the code without first fixing the asymptotic performance of the algorithm detracts significantly from the article. I've solved this problem before on Project Euler, in both Haskell and C++, and as a beginning Haskell programmer, what interests me here is figuring out how to construct an asymptotically efficient algorithm for a dynamic programming problem in Haskell.

tsahyt | 12 years ago

I know what you mean, but it's early in the morning here and I feel like being a bit pedantic: as far as we can tell, neither of the algorithms actually terminates for arbitrary inputs. If we could show that, we'd have proven the Collatz conjecture. The notion of asymptotic performance therefore doesn't make much sense, since it requires an upper bound on how long the algorithm takes to terminate. But finding such a bound would amount to solving the problem itself, because the runtime depends on how long a given sequence is; put that into a closed form and we'd trivially have an O(1) algorithm.

What we can do is measure complexity empirically and extrapolate from there. I haven't run many tests with this code, but it seemed to scale the same as the original while having a much lower constant factor.