GuB-42 | 16 days ago

I came to realize that logs don't matter. O(log(n)) and O(1) are effectively the same thing.

The reason is that real computers have memory, accessing memory takes time, and bigger n needs more memory, simply to store the bits that represent the number. Already, we have a factor of O(log(n)) here.

But each bit of memory also takes physical space, and we live in a 3D world. So, in the best case, the average distance to a memory cell is proportional to the cube root of the number of memory cells we have to access. And since the speed of light is finite, it is a cube root in time as well.

So "constant time" is actually cube root on a real computer. And I am generous by saying "real". Real real computers have plenty of stuff going on inside of them, so in practice, complexity analysis at the O(log(n)) level is meaningless without considering the hardware details.

Going from log(n) to log^(2/3)(n) is just an exercise in mathematics. It may lead to practical applications, but by itself, it doesn't mean much for your software.

fsckboy|15 days ago

>logs don't matter. O(log(n)) and O(1) are effectively the same thing.

ever heard of a hashtable? that's because O(c) is better than O(log(N)). if they were the same, you would only have heard of binary search.
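
a toy way to see the gap, counting operations rather than wall time (a sketch, with the dict standing in for the hash table):

    # Toy sketch: comparisons per lookup. A dict lookup does one hash plus
    # an expected O(1) number of probes; binary search does ~log2(n)
    # comparisons. Operation counts, not wall-clock time.
    def binary_search_comparisons(keys, target):
        lo, hi, comps = 0, len(keys), 0
        while lo < hi:
            mid = (lo + hi) // 2
            comps += 1
            if keys[mid] < target:
                lo = mid + 1
            else:
                hi = mid
        return comps

    for n in (2**10, 2**20, 2**30):
        print(f"n=2^{n.bit_length() - 1}: "
              f"{binary_search_comparisons(range(n), n - 1)} comparisons")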

GuB-42|13 days ago

Hash tables are indeed O(1) in theory while binary search is O(log(N)) in theory, no problem with that.

But in practice, accessing a physical memory cell is not O(1), it is O(cuberoot(N)).

For a binary search, you need O(log(N)) operations, but each operation (accessing a memory cell) is O(cuberoot(N')), where N' is the number of elements up to the level of the tree you are at, ending at N for the last iteration. So that's O(sum(i=0..log2(N), cuberoot(N/2^i))), and since the geometric factor sum(2^(-i/3)) converges to a constant, it simplifies to O(cuberoot(N)). Exactly the same!
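
A quick numeric check of that simplification (a sketch, nothing more):

    # Numeric check: sum_{i=0..log2(N)} cuberoot(N / 2^i) vs cuberoot(N).
    # The geometric factor sum(2^(-i/3)) converges to 1 / (1 - 2^(-1/3)),
    # about 4.85, so the whole sum is a constant times cuberoot(N).
    import math

    for N in (2**10, 2**20, 2**40, 2**60):
        total = sum((N / 2**i) ** (1 / 3) for i in range(int(math.log2(N)) + 1))
        print(f"N=2^{int(math.log2(N)):>2}: total / cuberoot(N) = "
              f"{total / N ** (1 / 3):.3f}")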

In real practice, the choice of binary search vs hash table depends on considerations like cache behavior. Hybrid methods combining hash tables with hierarchical data structures are often used when N is large. But that's my point: you won't choose a hash table over a binary search because one is O(1) and the other is O(log(n)); you choose one over the other because of how it fits your hardware architecture.

Contrast with, say, O(n.log(n)) over O(n^2). When n is large, barring space-time tradeoffs, you should always pick the O(n.log(n)) algorithm, because the reduction in complexity will always beat any sane hardware considerations.
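
A quick timing sketch of that gap (wall-clock, Python, constants obviously machine-dependent):

    # Sketch: O(n^2) insertion sort vs the built-in O(n log n) Timsort.
    # The asymptotic gap swamps constant-factor hardware effects quickly.
    import random
    import time

    def insertion_sort(a):
        for i in range(1, len(a)):
            x, j = a[i], i - 1
            while j >= 0 and a[j] > x:
                a[j + 1] = a[j]   # shift larger elements right
                j -= 1
            a[j + 1] = x
        return a

    for n in (500, 2_000, 8_000):
        data = [random.random() for _ in range(n)]
        t0 = time.perf_counter(); insertion_sort(data[:]); t1 = time.perf_counter()
        t2 = time.perf_counter(); sorted(data); t3 = time.perf_counter()
        print(f"n={n:>5}: insertion {t1 - t0:7.3f}s   built-in {t3 - t2:7.3f}s")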

chowells|15 days ago

Please explain to me how you can hash n distinct strings into O(n) buckets in O(1) time. Please note that this process needs to work as n goes to infinity.

Hash tables are O(log n) structures when you don't hand-wave away the "compute a hash" part. The thing is, search trees are far worse than that in practice when you aren't hand-waving away the "compare two elements" part. That's where the real speed savings come from.
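
Concretely: n distinct keys force each key to be at least log(n) symbols long, and the hash has to read all of them. A rough CPython-specific sketch (str hashes are cached after first use, hence the fresh strings for every measurement):

    # Sketch: a single hash computation reads the whole key, and n distinct
    # keys need keys of length >= log(n) symbols. CPython caches each str's
    # hash after first use, so build fresh strings each time.
    import time

    def ns_per_hash(length, reps=1_000):
        keys = ["k%0*d" % (length, i) for i in range(reps)]  # fresh, distinct
        t0 = time.perf_counter()
        for k in keys:
            hash(k)
        return (time.perf_counter() - t0) / reps * 1e9

    for length in (16, 256, 4_096, 65_536):
        print(f"key length {length:>6}: {ns_per_hash(length):10.1f} ns per hash")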

MoltenMan|15 days ago

Most of this is very true, with one caveat I'll point out: a space complexity of O(P(n)) for some function P implies at least an O(cuberoot(P(n))) time complexity, but many algorithms don't have high space complexity. If you have constant space complexity, this doesn't factor into time complexity at all. Some examples would be exponentiation by squaring, Miller-Rabin primality testing, Pollard's rho factorization, etc.
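
A concrete sketch of the first example: exponentiation by squaring does O(log n) multiplications with O(1) extra state beyond the operands (this mirrors Python's built-in pow(base, n, mod)):

    # Exponentiation by squaring: O(log n) multiplications and O(1) extra
    # space beyond the operands. Equivalent to Python's built-in pow(b, n, m).
    def power(base, n, mod):
        result = 1
        base %= mod
        while n > 0:
            if n & 1:                     # this bit of n is set: fold in base
                result = result * base % mod
            base = base * base % mod      # square for the next bit of n
            n >>= 1
        return result

    assert power(3, 10**18, 10**9 + 7) == pow(3, 10**18, 10**9 + 7)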

Of course, if you include the log(n) bits required just to store n, then sure, you can factor in the cube root of log(n) in the time complexity, but that grows even more slowly than log(n) itself, so the cube root doesn't matter here either.

layer8|15 days ago

> Already, we have a factor of O(log(n)) here.

Doesn’t that mean that O(log(n)) is really O(log²(n))?

thfuran|15 days ago

You have to define what n is.