top | item 34100781

Sirened | 3 years ago

Have you actually benchmarked a context switch on modern hardware? A full switch (including register spilling and page table swap) can be had in <150 cycles on even cheap, older Arm A-series cores like the Cortex-A72. We no longer live in a world where a context switch forces you to flush the TLB; you just pay for the trap, spill, page table swap, unspill, and return. The cost is even lower on modern Arm processors that support speculative exceptions, where the entire context switch can be performed speculatively.
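For anyone who wants to try the benchmarking question themselves, here is a minimal sketch of the classic pipe ping-pong test, assuming a Unix-like system with `os.fork`. The function name `pipe_ping_pong` and the round count are made up for illustration. It reports wall-clock nanoseconds per round trip, not cycles, and each round trip includes four syscalls plus Python interpreter overhead, so treat the number as a loose upper bound on two switches rather than a measurement of the bare switch.

```python
import os
import time


def pipe_ping_pong(rounds: int = 10_000) -> float:
    """Mean round-trip time in ns for a 1-byte ping-pong between two processes."""
    if rounds <= 0:
        raise ValueError("rounds must be positive")

    # Pin to a single CPU where supported (Linux), so every hand-off forces
    # a real context switch instead of a cross-core wakeup. The child
    # inherits this affinity across fork().
    if hasattr(os, "sched_setaffinity"):
        os.sched_setaffinity(0, {min(os.sched_getaffinity(0))})

    parent_to_child_r, parent_to_child_w = os.pipe()
    child_to_parent_r, child_to_parent_w = os.pipe()

    pid = os.fork()
    if pid == 0:
        # Child: echo one byte back for every byte received, then exit
        # without running the parent's cleanup handlers.
        os.close(parent_to_child_w)
        os.close(child_to_parent_r)
        for _ in range(rounds):
            os.read(parent_to_child_r, 1)
            os.write(child_to_parent_w, b"x")
        os._exit(0)

    # Parent: send a byte, block until the echo arrives, repeat.
    os.close(parent_to_child_r)
    os.close(child_to_parent_w)
    start = time.perf_counter_ns()
    for _ in range(rounds):
        os.write(parent_to_child_w, b"x")
        os.read(child_to_parent_r, 1)
    elapsed = time.perf_counter_ns() - start

    os.waitpid(pid, 0)
    os.close(parent_to_child_w)
    os.close(child_to_parent_r)
    return elapsed / rounds


if __name__ == "__main__":
    print(f"{pipe_ping_pong():.0f} ns per round trip (two switches plus syscalls)")
```

A C version of the same loop would strip out the interpreter overhead and get closer to the hardware cost being discussed.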

paulmd|3 years ago

Netflix definitely benchmarked their context switches, and they determined the cost was high enough to be worth engineering out.

https://www.phoronix.com/news/Netflix-NUMA-FreeBSD-Optimized

https://2019.eurobsdcon.org/slides/NUMA%20Optimizations%20in...

zorgmonkey|3 years ago

If context switches are a performance problem, you are probably pretty far down the optimization rabbit hole. But the articles you linked have nothing to do with context switches; they are about NUMA optimizations to sendfile on FreeBSD.

ilyt|3 years ago

So there is massive overhead when you need to context switch between a network process, an I/O process, and an FS process just to pass some bytes, whereas in a monolithic kernel that's just the cost of a few function calls.
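A rough back-of-the-envelope sketch of the trade-off being argued in this thread. It takes the <150-cycle figure quoted upthread as an upper bound per switch. Everything else is an assumption for illustration: the ~5 cycles per function call, the chain of three servers, and the idea that a request crosses each boundary once on the way in and once on the way back. It ignores cache and TLB effects, message copying, and scheduling latency, which may well dominate in practice.

```python
# Upper bound per context switch, taken from the Cortex-A72 claim upthread.
CYCLES_PER_SWITCH = 150
# Assumed cost of a plain function call and return; not a measured value.
CYCLES_PER_CALL = 5


def microkernel_cycles(servers: int, switch_cycles: int = CYCLES_PER_SWITCH) -> int:
    """Cycles spent on switches when a request passes through `servers`
    separate processes in a chain and the reply comes back the same way."""
    return 2 * servers * switch_cycles


def monolithic_cycles(layers: int, call_cycles: int = CYCLES_PER_CALL) -> int:
    """Cycles spent on call overhead when the same layers are plain
    function calls inside one kernel."""
    return layers * call_cycles


if __name__ == "__main__":
    hops = 3  # network, I/O, and FS, as in the comment above
    micro = microkernel_cycles(hops)
    mono = monolithic_cycles(hops)
    print(f"microkernel: {micro} cycles, monolithic: {mono} cycles, "
          f"ratio: {micro / mono:.0f}x")
```

Under these assumptions the switch overhead is 900 cycles against 15, so both sides have a point: the ratio is large, but the absolute cost is small unless the path is extremely hot.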