In 1985... yes, I said 1985, the Amiga did all I/O by sending and receiving messages. You queued a message to the port of the device or disk you wanted; when the I/O was complete, you received a reply on your port.
The same message-port system was used to receive UI messages. Filesystems, layered on top of the drive system, also used ports and messages. So did serial devices. Everything.
Simple, asynchronous by nature.
As a matter of fact, it was even more elegant than that: devices were just shared libraries (DLLs) with a message port.
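In modern terms, the pattern is a reply-port request queue. Here's a toy Python model of the idea (not AmigaOS code; the device task, message format, and all names are invented for illustration):

```python
import queue
import threading

class MessagePort:
    """Toy model of an Amiga-style message port: a queue you can
    post requests to and receive replies on."""
    def __init__(self):
        self._q = queue.Queue()

    def put(self, msg):
        self._q.put(msg)

    def get(self):
        return self._q.get()

def disk_device(request_port):
    """Pretend device task: services I/O requests and replies to the
    sender's own port, roughly like a device answering an IORequest."""
    while True:
        msg = request_port.get()
        if msg is None:          # shutdown sentinel for this demo
            break
        reply_port, payload = msg
        reply_port.put(("done", payload.upper()))  # fake "I/O result"

disk_port = MessagePort()        # the device's port
my_port = MessagePort()          # our task's reply port
threading.Thread(target=disk_device, args=(disk_port,), daemon=True).start()

disk_port.put((my_port, "read sector 0"))  # queue the request...
# ...the caller is free to do other work here, then collect the reply:
status, result = my_port.get()
disk_port.put(None)              # stop the device task
```

The caller never blocks inside the "device"; it only blocks (or polls) on its own reply port, which is what made the whole system naturally asynchronous.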
The multitasking was preemptive, but there was no paging or memory protection. That part didn't work as well (but worked surprisingly well, especially compared to Win3.1, which came 5-6 years later and needed much more memory to be usable).
I suspect that if Commodore/Amiga had done a cheaper version and not sucked so badly at planning and management, we would be much farther along on software and hardware by now. The Amiga had 4-channel 8-bit DMA stereo sound in 1985 (which with some effort could become 13-bit 2-channel DMA stereo sound), a working multitasking system, 12-bit color high-resolution graphics, and more. I think the PC had these specs as "standard" only in 1993 or so, and by "standard" I mean "you could assume there was hardware to support them, but your software needed to include specific support for at least two or three different vendors, such as the Creative Labs Sound Blaster and Gravis UltraSound for sound."
A friend of mine was amazed by this capability of the Amiga when I showed him that, on one screen, I could play mod.DasBoot in NoiseTracker, pull the screen partly down, then get on the BBS in the terminal by manually dialing atdt454074, all without my A500 skipping a beat...
All I had was the 512 kB expander; he had a 386 with a 387 and could only run a single-tasking OS.
I remember NetWare's IPX/SPX network stack used a similar async mechanism. The caller submits a buffer for a read and continues doing whatever it was doing. When the network card receives the data, it puts it in the caller's buffer, and the caller is notified via a callback that the data is ready. All of this fit in a few KB of memory in a DOS TSR.
All the DOS games at the time used IPX for network play for a reason. TCP was too "big" to fit in memory.
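The submit-a-buffer-and-get-a-callback pattern described above can be sketched like this (a toy Python model; the `ipx_listen` helper and everything else here is invented for illustration, real IPX lived in a C/asm TSR):

```python
import threading

def ipx_listen(buffer, on_ready):
    """Toy model of the IPX pattern: the caller hands over a buffer and
    a callback, then continues running; a 'driver' thread fills the
    buffer and invokes the callback when data arrives."""
    def driver():
        data = b"hello from the wire"    # stand-in for a received packet
        buffer[:len(data)] = data        # DMA-like write into caller's buffer
        on_ready(len(data))              # notify the caller
    threading.Thread(target=driver).start()

received = threading.Event()
nbytes = 0
buf = bytearray(1024)                    # caller-owned receive buffer

def on_ready(n):
    global nbytes
    nbytes = n
    received.set()

ipx_listen(buf, on_ready)                # submit the buffer and keep going
# ... the caller does other work here ...
received.wait(timeout=5)                 # for the demo, wait for the callback
```

The key property is the same one the comment highlights: between submit and callback, the caller owns its own time, and no memory beyond the buffer it supplied is needed.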
"In 1985... yes I said 1985, the Amiga did all I/O through sending and receiving messages"
I do remember that, and it was cool. But, lightweight efficient message passing is pretty easy when all processes share the same unprotected memory space :)
When you want to squeeze every bit of performance out of a system, you want to avoid system calls as much as possible. io_uring lets you check whether some I/O is done by just checking a piece of memory, instead of calling read, poll, or the like.
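That completion check boils down to comparing two ring indices in memory shared with the kernel. A toy userspace model of the idea (this is not the real io_uring ABI; the class and field names are invented):

```python
# Toy model of a completion ring: the producer bumps `tail`, the
# consumer reads entries up to it and bumps `head`. In real io_uring
# these indices live in memory mmap'd from the kernel, so peeking for
# completions costs no syscall at all.
class CompletionRing:
    def __init__(self, size=8):
        self.entries = [None] * size
        self.size = size
        self.head = 0              # consumer position
        self.tail = 0              # producer ("kernel") position

    def push(self, cqe):           # kernel side: post a completion
        self.entries[self.tail % self.size] = cqe
        self.tail += 1

    def peek(self):                # userspace side: just reads memory
        if self.head == self.tail:
            return None            # nothing completed yet
        cqe = self.entries[self.head % self.size]
        self.head += 1
        return cqe

ring = CompletionRing()
assert ring.peek() is None                   # nothing done, no syscall made
ring.push({"user_data": 42, "res": 4096})    # a completion "arrives"
done = ring.peek()                           # picked up by a memory read
```

The real ring also needs memory barriers and wraparound-safe unsigned arithmetic; this sketch only shows why checking for completed I/O can be syscall-free.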
One thing that doesn't change is that every decade people will look at the Amiga and admire it just the same, no matter how many advances have been made since.
This over-romanticizes the Amiga (a beautiful system, no doubt), because there have been message-passing OSes since the 1960s (see Brinch Hansen's Nucleus, for example). The key difference with io_uring is that it is an incredibly efficient and general mechanism for async everything. It really is a wonderful piece of technology and an advance over the long line of "message passing" OSes (which were always too slow).
I don't think io_uring and ebpf will revolutionize programming on Linux. In fact I hope they don't. The most important aspect of a program is correctness, not speed. Writing asynchronous code is much harder to get right.
Sure, I still write asynchronous code. Mostly to find out if I can.
My experience has been that async code is hard to write, is larger, is hard to read, is hard to verify as correct, and may not even be faster for many common use cases.
I also wrote some kernel code, for the same reason. To find out if I could.
Most programmers have this drive, I think. They want to push themselves.
And sure, go for it! Just realize that you are experimenting, and you are probably in over your head.
Most of us are most of the time.
Someone will have to be able to fix bugs in your code when you are unavailable. Consider how hard it is to maintain other people's code even if it is just a well-formed, synchronous series of statements. Then consider how much worse it is if that code is asynchronous and maybe has subtle timing bugs, side channels and race conditions.
If I haven't convinced you yet, let me try one last argument.
I invite you to profile how much actual time you spend doing syscalls. Syscalls are amazingly well optimized on Linux. The overhead is practically negligible. You can do hundreds of thousands of syscalls per second, even on old hardware. You can also easily open thousands of threads. Those also scale really well on Linux.
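One quick way to sanity-check that claim is to time a batch of cheap syscalls yourself. A rough sketch; the absolute numbers vary by machine, kernel, and mitigations, and here the interpreter's loop overhead dominates, so the true per-syscall cost is even lower than what this measures:

```python
import os
import time

N = 100_000
t0 = time.perf_counter()
for _ in range(N):
    os.getpid()                    # one cheap syscall per iteration
elapsed = time.perf_counter() - t0
rate = N / elapsed                 # syscalls (plus loop overhead) per second
# On typical hardware this lands in the millions per second, which is
# why syscall overhead is rarely the first thing worth optimizing.
```

Profiling your actual workload (e.g. with `perf` or `strace -c`) tells you whether syscall time is even visible next to everything else the program does.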
I don't know what kind of programming you're doing, but in network apps, if you have a thread per client and lots of clients (like a web server), you end up with lots of threads waiting on responses from slow clients, and that takes up memory. The time blocked on the syscall has nothing to do with your own machine's performance.
But on the other hand, if your server is behind a buffering proxy so it's not streaming directly over the Internet, it might not be a problem.
Writing asynchronous code is trying to fix, in the code itself, how your code is executed. It is the wrong solution to a real problem.
But I think what many people get wrong (not the person I'm replying to) is that how you write code and how you execute code do not have to be the same.
This is essentially why Google made their N:M threading patches: https://lore.kernel.org/lkml/20200722234538.166697-1-posk@po...
This is why Go uses goroutines. This is why JavaScript added async/await. This is why Project Loom exists. This is why Erlang uses Erlang processes.
All of these initiatives make it possible to write synchronous code and execute it as if it were written asynchronously.
And I think all of this also makes it clear that how you write code and how code is executed are not the same. So yes, I'm in agreement with the person I'm replying to: I don't think this will change how code is written that much, because none of it makes hand-writing asynchronous code any less of a bad idea than it is now.
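The async/await case shows the point in a few lines: each function body reads like straight-line synchronous code, while the runtime interleaves many instances of it. A minimal sketch:

```python
import asyncio

async def fetch(name, delay):
    # Reads like synchronous code: do a thing, wait, return a result.
    await asyncio.sleep(delay)     # stands in for a blocking I/O call
    return f"{name}: done"

async def main():
    # Three synchronous-looking calls, executed concurrently; total
    # time is roughly the longest delay, not the sum of all three.
    return await asyncio.gather(
        fetch("a", 0.03), fetch("b", 0.02), fetch("c", 0.01)
    )

results = asyncio.run(main())
```

`gather` returns results in call order regardless of completion order, which is exactly the "write it synchronously, run it asynchronously" separation being described.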
Coincidentally last night I announced [0] a little io_uring systemd-journald tool I've been hacking on recently for fun.
No eBPF component at this time, but I do wonder if eBPF could perform journal searches on the kernel side and send only the matches back to userspace.
Another thing this little project brought to my attention is the need for a compatibility layer on pre-io_uring kernels. I asked on io_uring@vger [1] last night, but nobody has responded yet; does anyone here know if such a thing already exists?
[0] https://lists.freedesktop.org/archives/systemd-devel/2020-No...
[1] https://lore.kernel.org/io-uring/20201126043016.3yb5ggpkgvuz...
I'd like something roughly similar, to make the rr reverse debugger support io_uring. That likely can't work the way it does for most other syscalls, due to the memory-only interface...
I was thinking about doing this for an event loop I was working on, but no code to show yet... you probably can get away easily with using pthreads and a sparse memfd to store the buffers.
This feels very similar to I/O completion ports (IOCP) on Windows. More modern versions of Windows even have registered buffers for completion, which can be even more performant in certain scenarios. I'm looking forward to trying this out on Linux.
I'm curious to see how this might work its way into libuv and c++ ASIO libraries, too.
There's currently a lot of talk about io_uring, but most articles about it and most usage still seem to be in the exploration, research, and toy-project stage.
I'm wondering, however, what the actual quality level is, whether people have used it successfully in production, and whether there is an overview of which features work without any known bugs at which kernel version.
Looking at the mailing list at https://lore.kernel.org/io-uring/ it seems like it is still a very fast-moving project, with a fair amount of bugfixes. Given that, is it realistic to consider using a kernel version between 5.5 and 5.7 in production where any bug would incur an availability impact, or should this rather be considered an ongoing implementation effort, to be revisited at some 5.xy version?
An extensive set of unit tests would make it a bit easier to trust that everything works reliably and stays working, but unfortunately those are still not a thing in most low-level projects.
> Things will never be the same again after the dust settles. And yes, I’m talking about Linux.
One has to be in quite a techie bubble to equate Linux kernel features with actual world-changing events, as the author goes on to do.
More on-topic though, having read the rest of the article, my guess is that while these features will let companies squeeze some more efficiency out of high-end servers, they won't change how most of us develop applications.
I am impressed with the level of Linux knowledge in this thread. How do people become Linux kernel hackers? Most of the developers I know (including myself) use Linux but have very little awareness beyond application-level programming.
You don't necessarily have to be a kernel hacker to be familiar with many of the features the kernel provides. Just doing application debugging often requires digging deeper until you hit some kernel balrogs.
Container problems? Namespaces, Cgroups, ...
Network problems? Netfilter, tc, lots of sysctl knobs, tcp algorithms (cue 1287947th thread on nagle/delayed acks/cork)
Slow disk I/O? Now you need to read up on syscalls and maybe find more efficient uses. copy_file_range doesn't work as expected? Suddenly you're reading kernel release notes or source code.
Honestly, by hacking it. There's a famous book about Linux internals whose name I don't remember (but it has "Linux" and "internals" in it). But I have never seen anybody do it by reading a book (however excellent the book may be). You just go change what you want, or read the subsystem you're interested in understanding, and use the book, site, or whatever when you have a problem.
For the most part, it's just software. If you have the time and the interest, you can learn it like anything else. At some level, it requires an awareness of how the hardware works (page tables/MMUs/IOMMUs, interrupts, SMP, NUMA, etc.).
I don't mean to downplay the investment, but if you're already an experienced software engineer you can get into it if it interests you. There is a different mindset among systems software programmers though. Reliability comes first, performance and functionality come second. It's a world away from hacking python scripts that only need to run once to perform their function.
I learned a TON about the Linux kernel through writing custom device drivers for FPGAs. Granted most of my experience is in the driver area and not in any of the subsystems, but even still I have a much better grasp of how the kernel operates now (and even more importantly, I know how to navigate it and how to find relevant documentation).
As others have said, hacking it, certainly. But if you're not up for that and would like something more passive, read LWN.net (and possibly subscribe!)
Today I am grateful for the brilliant minds around the world that continually open up fundamentally revolutionary new ways to develop applications. To Jens, to Alexei, and to Glauber, and to all of their kindred and ilk, we raise a glass!
"Few niche applications" being any application that touches files or the network, or wants to run code in the kernel. Sounds like a bigger target than just "niche", but I'm no Linux developer, so what do I know.
At SCO in the mid-90s we were playing with very similar ideas to boost DB performance. The main motivation was the same then as it is now, don't block and avoid making system calls into the kernel once up and running. Don't recall if any of the work made it into product.
eBPF is still a bit rough, but what you can already do with it is very cool.
It would be nice to see it at a higher level at the syscall interface, i.e. currently if I want to attach a probe I have to find the function myself or use a library; it would be nice to have it understand ELF files.
io_uring reduces but doesn't remove the system call overhead.
Only with in-kernel polling mode is it close to removed. But kernel polling mode has its own cost. If system call overhead is nowhere close to being a bottleneck, i.e. you don't make system calls that often, e.g. because your endpoints take longer to complete, then using kernel polling mode can degrade overall system performance, and potentially increase power consumption and, as such, heat generation.
Besides that, user-mode TCP stacks can be more tailored to your use case, which can increase performance.
So, all in all, I would say it depends on your use case. For some, io_uring will make user-mode TCP useless, or at least not worth it; for others it won't.
I'm genuinely curious; both of these changes seem exciting due to the ability for people to extend and implement specialized code/features using the kernel. Since the Linux kernel is GPLed (v2, I believe?), does this mean the number of GPL requests related to products' operating systems is likely to increase, since groups using this extensibility will be writing GPL-covered code that might actually be of value to other people? Or are io_uring and eBPF implemented in a way that isolates extensions built on their frameworks such that the GPL license won't affect them?
nonesuchluck | 5 years ago:
- in the late 80s, Commodore ports AmigaOS to 386
- re-engineers Original Chipset as an ISA card
- OCS combines VGA output and multimedia (no SoundBlaster needed)
- offers AmigaOS to everyone, but it requires their ISA card to run
- runs DOS apps in Virtual 8086 mode, in desktop windows or full-screen
amelius | 5 years ago:
Reminds me of: https://en.wikipedia.org/wiki/Unikernel
sigjuice | 5 years ago:
https://en.wikipedia.org/wiki/Indirection
ganafagol | 5 years ago:
My work is "programming in Linux", but it's not impacted by any of this since I'm working in a different area.
I'm sure this is important work, but maybe tone down such claims a bit.