top | item 27826221

(no title)

To get more gonzo, in CPython's list.sort(), the C code that actually merges two lists (which happen, in context, to be two contiguous array slices) is in listobject.c's `merge_at(()` function. That brings in the world of "galloping" (exponential search) optimizations, much faster than one-pair-at-a-time compares in many real-world cases.

So that's a whole lot of additional complication, but it's the heart of what _really_ makes timsort shine in its "jaw dropping" cases.

Tim (of "timsort" - which was an inside joke name at the start, because I never expected anything outside of CPython would use it ;-) ).

discuss

adamgordonbell|4 years ago

Wow. Thanks for writing it. I believe this is the part you are referring to: https://github.com/python/cpython/blob/main/Objects/listobje...

tim-peters|4 years ago

Right, `merge_at()`. But that in turn calls small mountains of other C code. It is, alas, a complicated approach. Folding it in would be substantially more work than you've already done.

The point: for inputs like your:

a = list(range(-100, 1700)) b = list(range(1400, 1800))

it would first use a variant of binary search to find where b[0] belongs in a. "Almost all of a" is consumed by that via a relatively tiny number of compares. It would similarly do a binary-ish search to find where a[-1] belongs in b. The tail end of b, from 1700 through 1799, then also never needs to be looked at again.

In the lucky case that two input lists don't overlap at all, and a[-1] <= b[0], that's enough so that a "one-pair-at-a-time" compare _never_ needs to be done. In context (the two lists are contiguous slices of an array), no pointers at all need to be copied in that case either - it deduces that the combined array slices are already in sorted order in total time O(log(len(a)) + log(len(b))), and it's done.

borski|4 years ago

Gosh, comments like these from the actual author are a reminder of why I love HN. :)

barosl|4 years ago

I'm also amazed by the fact that that John Nagle from the popular Nagle's algorithm is still active on HN!

j-pb|4 years ago

Is there a reason to use galloping over an array vs a seek in a search tree?

Better real world factors?

Galloping, Demaine set intersection, worst case optimal joins, they all seem to be different aspects of the same underlying principle.

So I'm very curious about the peculiarities in the sorting case.

tim-peters|4 years ago

See CPython's https://github.com/python/cpython/blob/main/Objects/listsort... for details about the sort. In fact, its "galloping" was inspired by a paper of which Demaine was a co-author (reference in the file already linked to).

CPython's lists are implemented as contiguous C arrays of pointers (to Python objects). Any way of grafting a tree-ish structure on top of that would be unacceptably wasteful of space and/or time. So sorting "gallops" over arrays because it's array-in, array-out.

cratermoon|4 years ago

I had the same idea, that re-using a lot of the non-sort parts of the TimSort code in this merge, the result would be just "Tim".