top | item 44816390

andylei | 6 months ago

I'll answer your argument with the initial paragraph you quoted:

> A compiler for C/C++/Rust could turn that kind of expression into three operations: load the value of x, multiply it by two, and then store the result. In Python, however, there is a long list of operations that have to be performed, starting with finding the type of p, calling its __getattribute__() method, through unboxing p.x and 2, to finally boxing the result, which requires memory allocation. None of that is dependent on whether Python is interpreted or not, those steps are required based on the language semantics.

immibis | 6 months ago

Typically a dynamic language JIT handles this by observing which types the operation actually acts on, then hardcoding a fast path for the one type that is actually used (the common case), or for a few different types. When the type is different each time, it has to do the full lookup each time - but that's very rare.

i.e.

if(a->type != int_type || b->type != int_type) abort_to_interpreter();

result = ((intval*)a)->val + ((intval*)b)->val;

The CPU does have to execute both lines, but it does them in parallel, so it's not as bad as you'd expect. Unless you abort to the interpreter, of course.