(no title)
exyi | 11 months ago
It's the classic, why is processing sorted array faster than unsorted one
def f(arr):
r = True
for x in arr:
if x > 128: r = not r
return r
arr = [ random.randint(0, 255) for i in range(0, 1000_000) ]
arr_sorted = list(sorted(arr))
%timeit f(arr)
%timeit f(arr_sorted)
Results are (on my machine): 17.5 ms for unsorted, and 13.5 ms for sorted. For comparison, in a compiled language, the difference is 4x
ammar2|11 months ago
This depends on the Python version, but if it has the specializing interpreter changes, the `COMPARE_OP` comparing the integers there is probably hitting a specialized `_COMPARE_OP_INT` [1].
This specialization has a ternary that does `res = (sign_ish & oparg) ? PyStackRef_True : PyStackRef_False;`. This might be the branch that ends up getting predicted correctly?
Older versions of Python go through a bunch of dynamic dispatch first and then end up with a similar sort of int comparison in `long_richcompare`. [2]
[1] https://github.com/python/cpython/blob/561965fa5c8314dee5b86...
[2] https://github.com/python/cpython/blob/561965fa5c8314dee5b86...
dzaima|11 months ago
LPisGood|11 months ago
I wonder if you could do Spectre type vulnerabilities in python. You would need some way to leak micro-architectural state, so without being particularly clever maybe python code could only be used as a gadget or something.
cma|11 months ago
exyi|11 months ago
I'm now thinking that the difference might be even larger if we instead avoid small integers and let the CPU get stuck chasing pointers. The idea is that it gets stuck on a memory access, which forces it to speculate much further, which in turn makes it backtrack a longer path if a branch was mispredicted. I'm obviously no expert on this, feel free to correct me
The results for 1B range instead of 255 are 17.6 ms for unsorted / 68.2 ms for sorted! We are back to what the original article observed and it's a way stronger effect than what branch prediction can offer. So don't sort your arrays, keep them in the order the boxed values were allocated ;)
bgwalter|11 months ago
GBhpkmY|11 months ago
[deleted]