The problem isn't so much locking as the fact that you have to dispatch to a kernel thread whenever you request or send data, paying the cost of that context switch every time. In userspace you can spin a polling thread on its own core and DMA data to and from the hardware all day long without ever yielding that thread to another one.
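To make the polling idea concrete, here's a rough sketch of the loop shape only. Everything in it is made up for illustration: `Ring`, `pin_to_core`, `poll_loop` and `run_demo` are hypothetical names, and the "device" is just another thread writing into an in-memory ring rather than real DMA. Python's interpreter lock also means this won't show the actual latency win; it only shows a consumer that spins on shared memory instead of blocking in the kernel.

```python
import os
import threading
import time


class Ring:
    """Single-producer/single-consumer ring standing in for a device queue."""

    def __init__(self, size=1024):
        assert size & (size - 1) == 0, "size must be a power of two"
        self.slots = [None] * size
        self.mask = size - 1
        self.head = 0  # written only by the producer (the pretend device)
        self.tail = 0  # written only by the consumer (the polling thread)

    def try_push(self, item):
        if self.head - self.tail == len(self.slots):
            return False  # ring is full
        self.slots[self.head & self.mask] = item
        self.head += 1  # publish only after the slot has been written
        return True

    def try_pop(self):
        if self.tail == self.head:
            return None  # nothing new yet
        item = self.slots[self.tail & self.mask]
        self.tail += 1
        return item


def pin_to_core(core):
    """Best-effort pinning of the calling thread (Linux only, assumption)."""
    if hasattr(os, "sched_setaffinity"):
        try:
            os.sched_setaffinity(0, {core})
        except OSError:
            pass  # core not available to us; carry on unpinned


def poll_loop(ring, expected, out, core=None):
    if core is not None:
        pin_to_core(core)
    received = 0
    while received < expected:
        item = ring.try_pop()
        if item is None:
            continue  # spin: no sleep and no blocking call
        out.append(item)
        received += 1


def run_demo(n=1000, core=None):
    ring = Ring(1024)
    out = []
    poller = threading.Thread(target=poll_loop, args=(ring, n, out, core))
    poller.start()
    for i in range(n):
        # The pretend device may back off when the ring is full;
        # the poller never does.
        while not ring.try_push(i):
            time.sleep(0)
    poller.join()
    return out


if __name__ == "__main__":
    print(len(run_demo(1000)), "items received by the polling thread")
```

The point of the shape is that the consumer never makes a blocking call: it burns a core checking `head` against `tail`, which is the trade you accept to avoid the switch into the kernel on every transfer.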
bogomipz|9 years ago
hendzen|9 years ago
For 99% of use cases this isn't a problem, but if you're trying to save every possible microsecond, it definitely is.