top | item 19758726

(no title)

gameguy43 | 6 years ago

Original author here, happy to answer questions about data structures, algorithms, and coding interviews!

discuss

strainer|6 years ago

It might be notable that on very small lists (up to around 50, maybe 100 depending on cpu/cache) insertion sort is fastest of all, even on difficult input distributions. Its so simple that cpu can skip through it rapidly. I noticed this myself and have also read other developers mention it in sort design discussions. A popular general purpose sort called 'Timsort' defaults to it for small sequences. Its possible to tweak the insert algorithm to sum distance of elements moved so far and escape to a more substantial algorithm when it gets excessive. If little rearrangement is required insertion sort is about as fast as scanning through memory can be.

I think in practice the answer to the question "which should I use" (rather than "which should ideally be used") is the best developed hybrid sort library for the platform. Beyond the suitable case for insertion sort, its a really substantial job for an individual to develop, test and optimize a high performance sort library that can handle a range of distribution types.

btilly|6 years ago

One of the big factors behind this is that the cost of pipeline stalls in a theoretically good algorithm can significantly exceed the cost of extra comparisons.

Whether that is true depends on a combination of how much data you have, what kind of data you have, and your programming language's low-level representation.

For ints in C++ and a list of 20 elements, it will be reliably true. For strings in Perl with a list of 500 elements, it reliably won't be true.

gfdhnk|6 years ago

LSB radix sort can be slow, but MSB radix sort is the fastest sorting algorithm for an array of machine words.

pstuart|6 years ago

"No CS degree necessary." lured me in. I'm subscribed and looking forward to learning more.