near | 4 years ago

They're incredibly useful for writing emulators. You have to simulate 3-8 processors all running in parallel, but synchronizing them with locks and mutexes tens of millions of times a second is excruciatingly slow and painful, so you have to do this in a single thread (unless you're talking about very modern designs that have lower expectations of cycle-based timings).

Cooperative threads like this let you completely avoid having to develop state machines for each cycle within state machines for each instruction, etc. They let you suspend a thread four levels into the call stack, and then immediately resume at that point once other emulated processors have caught up to it in time. That lets you do fun tricks like only synchronizing components when required, so it can in some instances end up not only far more elegant, but also much faster than state machines, when they're used well.
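
A rough sketch of the idea, not the code from any real emulator: each emulated processor keeps its own clock, and a scheduler always resumes whichever one is furthest behind. All names here (Component, bus_read, cpu_main, video_main, run) and the cycle counts are made up for illustration. Python generators are stackless, so every layer has to pass the suspension along with `yield from`; a stackful fiber library would let `bus_read` be a plain function and suspend from inside it directly.

```python
from typing import Callable, Generator, List, Tuple

Log = List[Tuple[str, int]]


class Component:
    """One emulated processor: a clock plus a suspended generator (hypothetical)."""

    def __init__(self, name: str, body: Callable[["Component", Log], Generator], log: Log):
        self.name = name
        self.clock = 0                 # elapsed cycles for this processor
        self.thread = body(self, log)  # created suspended; starts on the first next()


def bus_read(cpu, log):
    cpu.clock += 4  # cycles spent on the bus access (made-up number)
    yield           # suspend here, three frames deep, until the others catch up
    log.append((cpu.name, cpu.clock))  # execution resumes at exactly this point


def instruction(cpu, log):
    cpu.clock += 2  # internal work that needs no synchronization
    yield from bus_read(cpu, log)


def cpu_main(cpu, log):
    for _ in range(2):
        yield from instruction(cpu, log)


def video_main(video, log):
    for _ in range(3):
        video.clock += 5  # render for a while without touching shared state
        yield
        log.append((video.name, video.clock))


def run(components):
    """Always resume the component that is furthest behind in time."""
    active = list(components)
    while active:
        current = min(active, key=lambda c: c.clock)
        try:
            next(current.thread)
        except StopIteration:
            active.remove(current)  # this processor's program has finished


events: Log = []
run([Component("cpu", cpu_main, events), Component("video", video_main, events)])
print(events)
```

This prints [('video', 5), ('cpu', 6), ('video', 10), ('cpu', 12), ('video', 15)]: every shared access lands in clock order, even though each processor ran ahead on its own between synchronization points.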

I wrote a bit more about this and showed some examples here if anyone's interested: https://near.sh/articles/design/cooperative-threading

I also use them for my web server because I like them, but there are probably better ways of doing that.

vlovich123|4 years ago

Seems like adopting async/await throughout would achieve the same benefits (letting you cooperatively yield whenever you want) while maintaining the performance of the state machine (since that's what async/await is in a single-threaded context).
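
For reference, the hand-written state machine form being compared against looks roughly like the sketch below. It is only an illustration: the State names, the cycle counts, and the single `video_clock` integer standing in for the other processor are all assumptions. The point is that every place the code might need to pause becomes an explicit state, and anything that has to survive the pause is stored on the object instead of in local variables.

```python
from enum import Enum, auto


class State(Enum):
    FETCH = auto()
    BUS_READ = auto()
    DMA_WAIT = auto()
    DONE = auto()


class InstructionStateMachine:
    """Hypothetical instruction that must pause partway through a bus read."""

    def __init__(self):
        self.state = State.FETCH
        self.clock = 0     # cycles consumed so far by this processor
        self.value = None  # result of the read, filled in at the end

    def step(self, video_clock: int) -> bool:
        """Advance by at most one state; return True while work remains."""
        if self.state is State.FETCH:
            self.clock += 2
            self.state = State.BUS_READ
        elif self.state is State.BUS_READ:
            self.clock += 4
            self.state = State.DMA_WAIT
        elif self.state is State.DMA_WAIT:
            # Stay here until the other processor has caught up in time.
            if video_clock >= self.clock:
                self.value = video_clock  # stand-in for the data being read
                self.state = State.DONE
        return self.state is not State.DONE


sm = InstructionStateMachine()
video_clock = 0
trace = []
while sm.step(video_clock):
    trace.append((sm.state.name, sm.clock, video_clock))
    if sm.state is State.DMA_WAIT:
        video_clock += 5  # let the (simulated) video processor run for a bit
print(trace, sm.value)
```

The caller has to keep calling `step` and re-entering through the state dispatch each time, which is the bookkeeping that async/await generates for you and that a cooperative thread avoids by keeping its own stack.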

near|4 years ago

The key thing is that I need to be able to suspend 3-5 layers deep into the call stack. The instruction dispatcher calls into an instruction, which calls into a bus memory read function, which triggers a DMA transfer that then needs to switch to the video processor, and then I need to resume right there inside the DMA transfer function once the video processor has caught up in time. So the separate stack for each fiber/thread is essential.