top | item 9460863


sinwave | 11 years ago

May I please have your faster-than-n log n sorting algorithm?


titanomachy|11 years ago

Perhaps he meant something along the lines of using sorting for the sake of development expediency when a different approach would have been more efficient. For example, finding the kth-largest element of a list: for large lists it might be more efficient [O(kn)] to repeatedly drop the max element than to sort and then index [O(nlogn)]. Of course, the first approach requires taking the time to write your own function, whereas the second may be as simple as `my_list.sort()[k]` depending on your language.
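A sketch of the first approach (Python chosen for illustration; the comment doesn't name a language — and note that in Python specifically, `list.sort()` sorts in place and returns None, so the one-liner would actually be `sorted(my_list)[-k]`):

```python
def kth_largest(items, k):
    """Find the k-th largest element (1-indexed) by repeatedly
    dropping the current maximum: O(k*n) total, versus O(n log n)
    for sorting the whole list first."""
    remaining = list(items)          # copy so the caller's list is untouched
    for _ in range(k - 1):
        remaining.remove(max(remaining))  # one O(n) pass per dropped max
    return max(remaining)
```

For k much smaller than n this wins; once k approaches n, sorting (or a heap-based selection) is the better tool.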

alecco|11 years ago

Radix sort is amazing, but it's kinda O(n log n), see

https://en.wikipedia.org/wiki/Radix_sort#Efficiency

  > Radix sort complexity is O(wn) for n keys which are integers of
  > word size w. Sometimes w is presented as a constant, which would
  > make radix sort better (for sufficiently large n) than the best
  > comparison-based sorting algorithms, which all perform O(n log n)
  > comparisons to sort n keys. However, in general w cannot be
  > considered a constant: if all n keys are distinct, then w has to
  > be at least log n for a random-access machine to be able to store
  > them in memory, which gives at best a time complexity O(n log n).[2]
  > That would seem to make radix sort at most equally efficient as the
  > best comparison-based sorts (and worse if keys are much longer than
  > log n).
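For concreteness, here is a minimal LSD radix sort over non-negative integers (Python used for illustration; the digit width in bits per pass corresponds to the word size w in the quoted O(wn) bound):

```python
def radix_sort(nums, bits=8):
    """LSD radix sort for non-negative integers, processing `bits`
    bits per pass. Each pass is a stable bucket distribution, so
    after processing all digits the list is fully sorted. Total
    work is O((w / bits) * n) for keys of word size w."""
    if not nums:
        return []
    base = 1 << bits
    max_val = max(nums)
    shift = 0
    while max_val >> shift:          # stop once all digits are consumed
        buckets = [[] for _ in range(base)]
        for n in nums:
            buckets[(n >> shift) & (base - 1)].append(n)
        nums = [n for bucket in buckets for n in bucket]  # stable concat
        shift += bits
    return nums
```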

vvanders|11 years ago

+1. Radix sort will destroy n log n sorts (sometimes by multiple orders of magnitude) for any data set you can fit in memory (with the radix size tuned to your cache line length).

Never underestimate the power of the prefetcher and of correct data access patterns.

marcosdumay|11 years ago

For bounded countable sets (like integers between x and y), there are linear-time sorting algorithms.
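Counting sort is the classic example: when keys are integers known to lie in [x, y], one counting pass plus one output pass gives O(n + r) time, where r is the range size. A sketch in Python (the parameter names lo/hi for the bounds are my own):

```python
def counting_sort(nums, lo, hi):
    """Sort integers known to lie in [lo, hi] in O(n + r) time,
    where r = hi - lo + 1 is the size of the value range."""
    counts = [0] * (hi - lo + 1)
    for n in nums:
        counts[n - lo] += 1          # tally each value, O(n)
    result = []
    for offset, count in enumerate(counts):
        result.extend([lo + offset] * count)  # emit in order, O(n + r)
    return result
```

The catch is that r must be comparable to n; sorting a handful of 64-bit keys this way would allocate an astronomically large counts array.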

lorenzhs|11 years ago

It's pretty clear that he's not talking about asymptotically faster than n log n, but faster on real machines. While optimisations such as in-place partitioning, tail recursion elimination, and switching to insertion sort when the problem size falls below a threshold (e.g. fewer than 16 elements) won't change the asymptotic complexity, they will make your implementation run five times faster.

Asymptotic notation hides the constant factors, which is good when you're talking about asymptotic growth. But when you want a practically fast algorithm, constant factors matter!
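A sketch combining the optimisations mentioned above (Python for illustration; the cutoff of 16 echoes the threshold from the comment, and the partition scheme is an assumption, not a specific implementation being referenced):

```python
import random

CUTOFF = 16  # below this size, insertion sort's tiny constant factor wins

def insertion_sort(a, lo, hi):
    """Sort a[lo..hi] in place; O(n^2) but very fast on tiny slices."""
    for i in range(lo + 1, hi + 1):
        key = a[i]
        j = i - 1
        while j >= lo and a[j] > key:
            a[j + 1] = a[j]
            j -= 1
        a[j + 1] = key

def partition(a, lo, hi):
    """Lomuto partition with a random pivot; returns the pivot's index."""
    r = random.randint(lo, hi)
    a[r], a[hi] = a[hi], a[r]
    pivot = a[hi]
    i = lo
    for j in range(lo, hi):
        if a[j] < pivot:
            a[i], a[j] = a[j], a[i]
            i += 1
    a[i], a[hi] = a[hi], a[i]
    return i

def quicksort(a, lo=0, hi=None):
    """In-place quicksort: small subarrays fall through to insertion
    sort, and the tail call on the larger half is turned into a loop,
    bounding recursion depth at O(log n)."""
    if hi is None:
        hi = len(a) - 1
    while hi - lo >= CUTOFF:
        p = partition(a, lo, hi)
        if p - lo < hi - p:          # recurse on smaller half only
            quicksort(a, lo, p - 1)
            lo = p + 1
        else:
            quicksort(a, p + 1, hi)
            hi = p - 1
    insertion_sort(a, lo, hi)
```

None of this touches the O(n log n) bound; the speedup comes entirely from the hidden constants.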