top | item 47007966

(no title)

shoo | 16 days ago

The underlying argument this article seems to be making is that an appropriate algorithm for any given application isn't always the one with the most efficient asymptotic performance for sufficiently large n -- for a given application (in this case routing), we have data on typical values of n that appear in reality and we can choose an algorithm that offers good enough (or optimal) performance for n in that constrained range, as well as potentially having other benefits such being simpler to implement correctly in a short amount of engineering time.

This argument is very much in line with Mike Acton's data-driven design philosophy [1] -- understand the actual specific problem you need to solve, not the abstract general problem. Understand the statistical distribution of actual problem instances, they'll have parameters in some range. Understand the hardware you're building writing software for, understand its finite capacity & capability.

It's common that new algorithms or data-structures with superior asymptotic efficiency are less performant for smaller problem sizes vs simpler alternatives. As always, it depends on the specifics of any given application.

[1] see Mike's CppCon 2014 talk "Data-Oriented Design and C++" https://www.youtube.com/watch?v=rX0ItVEVjHc

discuss

rramadass|16 days ago

> understand the actual specific problem you need to solve, not the abstract general problem. Understand the statistical distribution of actual problem instances, they'll have parameters in some range. Understand the hardware you're building writing software for, understand its finite capacity & capability.

Very much in line with what James Coplien and colleagues described with "Commonality and Variability Analysis" and "Family-oriented Abstraction, Specification, and Translation" (FAST) for Software Engineering in the 90's. Coplien's PhD thesis titled Multi-Paradigm Design and book titled Multi-Paradigm Design for C++ is based on this approach.

Commonality and Variability in Software Engineering (pdf) - https://www.dre.vanderbilt.edu/~schmidt/PDF/Commonality_Vari...

Multi-Paradigm Design (pdf of PhD thesis) - https://tobeagile.com/wp-content/uploads/2011/12/CoplienThes...

GuB-42|16 days ago

I came to realize that logs don't matter. O(log(n)) and O(1) are effectively the same thing.

The reason is that real computers have memory, accessing memory takes time, and bigger n needs more memory, simply to store the bits that represent the number. Already, we have a factor of O(log(n)) here.

But also each bit of memory takes physical space, we live in a 3D world, so, best case, on average, the distance to a memory cell is proportional to the cube root of the amount of memory cells we will have to access. And since the speed of light is finite, it is also a cube root in time.

So "constant time" is actually cube root on a real computer. And I am generous by saying "real". Real real computers have plenty of stuff going on inside of them, so in practice, complexity analysis at the O(log(n)) level is meaningless without considering the hardware details.

Going from log(n) to log2/3(n) is just an exercise in mathematics, it may lead to practical applications, but by itself, it doesn't mean much for your software.

fsckboy|16 days ago

>logs don't matter. O(log(n)) and O(1) are effectively the same thing.

ever heard of a hashtable? that's because O(c) is better than O(log(N)). if they were the same, you would only have heard of binary search.

MoltenMan|16 days ago

Most of this is very true, except for the one caveat I'll point out that a space complexity O(P(n)) for some function P implies at least a O(cubedroot(P(n))) time complexity, but many algorithms don't have high space complexity. If you have a constant space complexity this doesn't factor in to time complexity at all. Some examples would be exponentiation by squaring, miller-rabin primality testing, pollard-rho factorization, etc.

Of course if you include the log(n) bits required just to store n, then sure you can factor in the log of the cubed root of n in the time complexity, but that's just log(n) / 3, so the cubed root doesn't matter here either.

layer8|16 days ago

> Already, we have a factor of O(log(n)) here.

Doesn’t that mean that O(log(n)) is really O(log²(n))?

justinhj|15 days ago

All of Knuth's Art of Computer Programming covers this too.