Hello,
I am the creator of Coinboot and I will try to answer some of your questions.
Coinboot was initially created with only one use case in mind: running the nodes of GPU crypto-mining farms diskless, because you don't need disks for mining.
But these farms typically have only cheap commodity 1 Gbit/s network hardware, so the file sizes to transfer need to be as small as possible.
Being stateless is also a benefit for operations - all software and configuration is loaded into RAM during boot, so you always have the most recent version in place and don't need a configuration management tool.
There is also a plugin system which makes it easy to extend the functionality with further software.
Just put the updated plugin archives in the plugin directory on the Coinboot server and reboot your nodes.
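As a rough illustration of what such a plugin archive might look like - assuming a plugin is simply a gzipped tarball of a filesystem overlay, packed relative to "/"; the layout, naming, and server paths here are my assumptions, not the documented Coinboot format:

```python
import io
import tarfile

# Hypothetical plugin: a gzipped tar of a filesystem overlay that gets
# unpacked over the root filesystem at boot. All paths are illustrative.
buf = io.BytesIO()
with tarfile.open(fileobj=buf, mode="w:gz") as tar:
    script = b"#!/bin/sh\necho mining\n"
    info = tarfile.TarInfo(name="opt/miner/run.sh")  # extracts to /opt/miner/run.sh
    info.size = len(script)
    info.mode = 0o755
    tar.addfile(info, io.BytesIO(script))

# Deploying would then mean writing this archive into the server's plugin
# directory (e.g. something like /srv/coinboot/plugins/) and rebooting nodes.
plugin = buf.getvalue()

# Sanity check: the archive really contains the overlay path.
with tarfile.open(fileobj=io.BytesIO(plugin), mode="r:gz") as tar:
    assert "opt/miner/run.sh" in tar.getnames()
```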
Coinboot is still in beta but has been running for months on hundreds of GPU nodes.
My plan is to make the first stable release by the end of the year.
Centralised logging is already on the roadmap, as are load-balancing the file downloads via local BitTorrent during the boot phase, a web UI, and many other things.
Ubuntu was chosen as the base because the proprietary GPU driver for OpenCL only supports Ubuntu and RHEL, and RHEL was not an option.
Coinboot uses TFTP only for the NBP (network bootstrap program); all other file transfers happen over HTTPS. If your hardware supports UEFI HTTP boot, then all file transfers happen over HTTP. Boot time in the production environment is less than 15 seconds.
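As a concrete illustration of that split (not Coinboot's actual configuration), a dnsmasq setup that serves only the NBP over TFTP and then lets an iPXE-capable NBP fetch everything else over HTTP could look roughly like this; addresses, paths, and filenames are placeholders:

```conf
# Serve the NBP over TFTP.
enable-tftp
tftp-root=/srv/tftp

# Plain PXE firmware gets the iPXE NBP via TFTP ...
dhcp-boot=undionly.kpxe

# ... and iPXE clients (detected via DHCP option 175) are pointed at an
# HTTP URL for the kernel/initramfs instead.
dhcp-match=set:ipxe,175
dhcp-boot=tag:ipxe,http://192.168.0.1/boot.ipxe
```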
Running diskless is done since decades and seems to be quite common in HPC - ages before everything got *-less. ;-)
There was no existing implementation that fit the requirements of the use case without being outdated. Coinboot also does not use NFS, simply because there is only cheap commodity 1 Gbit/s network hardware in place. The whole root filesystem is placed on a tmpfs; after boot, ~500 MB of RAM is used for the root filesystem.
CoreOS is very impressive, but its image size of ~380 MB was way too big. Coinboot currently uses ~130 MB for the kernel, the initramfs archive, and the GPU driver Coinboot plugin.
Can you maybe summarize in a few words (for a non-expert on operating systems) what the "secret sauce" is here, i.e. how it differs from running a classic Linux distro without permanent storage media?
Are you old enough to remember diskless terminals? After a quick look at this, it sounded to me like a more modern, "DevOps-ified" version of diskless terminals/workstations that 1) is more difficult to maintain, 2) lacks any type of persistence (for things like, you know, $HOME), and 3) supports only a single user whose account must be baked into the container.
As for how it works, well, it looks like it's just your everyday, run-of-the-mill PXE booting, AFAICT.
----
(FWIW, if you know how PXE booting works, there's no reason to read any further; the rest of this wall of text is just my poor, long-winded attempt at a "simplified" explanation of how it works.)
Basically: on startup, the host sends out a message to everyone else on the network saying, "Attention, please! I'm looking for a DHCP server that has an IP address I can use for a little while -- oh, and I also need a kernel to boot. Can anyone please tell me the filename of the kernel I should boot, the IP address of the TFTP server I can download it from, and what IP address I should use to communicate with that TFTP server!?" (Technically, it also asks for a subnet mask, the IP address of a default router, the IP address of a DNS server or two, and a few other things, but those are the ones most relevant to this explanation.)
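Those requested items map directly onto DHCP option codes. A rough sketch of what such a DHCPDISCOVER payload looks like on the wire (simplified; real clients fill in more fields):

```python
import struct

def build_discover(mac: bytes, xid: int) -> bytes:
    """Build a minimal BOOTP/DHCP DISCOVER payload (simplified sketch)."""
    # Fixed BOOTP header: op=1 (request), htype=1 (Ethernet), hlen=6,
    # hops=0, transaction id, secs=0, flags=0x8000 (broadcast the reply).
    pkt = struct.pack("!BBBBIHH", 1, 1, 6, 0, xid, 0, 0x8000)
    pkt += b"\x00" * 16          # ciaddr/yiaddr/siaddr/giaddr, all 0.0.0.0
    pkt += mac + b"\x00" * 10    # chaddr, padded to 16 bytes
    pkt += b"\x00" * 192         # sname + file fields, left empty by the client
    pkt += b"\x63\x82\x53\x63"   # DHCP magic cookie
    pkt += bytes([53, 1, 1])     # option 53: message type = DISCOVER
    # Option 55: parameter request list -- subnet mask (1), router (3),
    # DNS server (6), TFTP server name (66), bootfile name (67).
    pkt += bytes([55, 5, 1, 3, 6, 66, 67])
    pkt += bytes([255])          # end-of-options marker
    return pkt

pkt = build_discover(b"\xaa\xbb\xcc\xdd\xee\xff", 0x12345678)
assert pkt[0] == 1                           # BOOTP "request" opcode
assert pkt[236:240] == b"\x63\x82\x53\x63"   # magic cookie after the file field
```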
Assuming there is a DHCP server on the same network -- or another host whose job it is to listen for such requests and pass them along to such a DHCP server -- it will receive a response with the information it asked for. (The computer wants to make sure it heard everything correctly, so it asks for "confirmation" from the DHCP server -- or its delegate.)
[In case it wasn't clear, I was trying to explain (simply) the four-way DHCP transaction (a.k.a. "DORA").]
Having all of the information it needs, the computer then sends out a message, via TFTP, to the IP address it was told, saying, "Please send me the file named '/boot/kernel'" and (since that's its job) the TFTP server does so. (Note: In some cases, the client that's trying to boot may ask for a few files instead of just one, or it might ask for the file(s) via a different method, such as HTTP.)
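That "please send me a file" message is tiny: a TFTP read request (RRQ, per RFC 1350) is just an opcode plus NUL-terminated filename and transfer-mode strings. A sketch:

```python
import struct

def tftp_rrq(filename: str, mode: str = "octet") -> bytes:
    """Build a TFTP read request (RRQ) packet per RFC 1350 (sketch)."""
    # Opcode 1 = RRQ, followed by NUL-terminated filename and mode.
    return struct.pack("!H", 1) + filename.encode() + b"\x00" + mode.encode() + b"\x00"

pkt = tftp_rrq("/boot/kernel")
assert pkt[:2] == b"\x00\x01"                # RRQ opcode
assert b"/boot/kernel\x00octet\x00" in pkt   # filename + mode, NUL-terminated
```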
Very shortly -- well, maybe not THAT shortly, TFTP can be slooooooow! -- this trying-to-boot-itself-up computer will have most of what it needs to start the booting process. At this point, that's usually a kernel and an "initial ramdisk" that contains the drivers it needs (or may possibly need) in order to get up and running.
Now, remember that this computer (or whatever it is) is "diskless". It doesn't have a hard drive of its own that it can mount its root filesystem ("/") from. Now what? Well, lucky for us, some folks at a company named Sun recently (well, "recently-ish"... four decades or so ago, perhaps) came up with a protocol that they called the Network File System ("NFS"). This particular trying-to-boot-itself-up computer doesn't NEED a hard drive of its own. As long as someone that's close by -- let's call them an "NFS server" -- has a copy of the files that the "boot client" needs and is willing to share them, well, that'll do. With the much larger amounts of RAM that computers have nowadays, it's also possible for the client to download a specially-prepared-for-exactly-this-purpose file that contains an entire (read-only, of course) filesystem with everything it will need, keep that in its RAM, and use it as its root filesystem. (This is how many (most?) "live CDs" work nowadays.)
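Those two approaches show up directly on the kernel command line. Roughly (addresses and paths are placeholders, and the exact parameters vary by distro and initramfs tooling):

```conf
# NFS root: mount "/" from an NFS server at boot
... root=/dev/nfs nfsroot=192.168.0.1:/srv/nfsroot ip=dhcp

# RAM root: fetch a read-only squashfs image over HTTP and keep it in RAM
# (Debian-live style; other initramfs systems use different options)
... boot=live fetch=http://192.168.0.1/live/filesystem.squashfs
```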
(I probably should have just pointed you towards the "Network Boot and Exotic Root HOWTO" [0] instead -- even though the last revision was almost 17 years ago, the basic premise really isn't much different today.)
Yeah, I was going to say... the "-less" terms are pretty stupid. "Serverless" is interesting for its no-implied-context work-packet approach, which is also interesting in e.g. highly threaded game engines, but it's not about not using servers to run the work. Who comes up with this shit?
Single-purpose or embedded systems. Eliminating disks means one less thing to go wrong. Many persistent security threats cannot occur. If you want to deploy a software update, reboot the machine. No need to worry about remote updates/reboots/fails. Disk state is irrelevant if there are no disks.
I shipped a system like this many years ago (scoreboard displays). Most netboot schemes require the network to be up for the client machine to operate. If you run from RAM, like this system, you only need the network to boot the machine. It can then tolerate outages or run disconnected.
Any datacenter usage, where you save money on hard drives by running your OS in RAM and mounting the work storage over the network. This also provides faster recovery from errors, since you're not storing any state on the working node.
It's not. You save money (OPEX and CAPEX) using diskless boot.
For example, Debian in memory, including everything you need on a server, is just below 400 MB.
We've built a lot of datacenter services on PXE over the past years. It's always handy if you require more resources than a single server can provide. It's like a container on steroids.
Any kind of thin client/zero client usage, where local storage is not required or allowed. Many years ago I used a similar solution for a finance company. The users just needed to get access to a browser to connect to a specific web service.
I’ve used CoreOS tools in the past to achieve the same goals. It’s really useful once it works! I hope CoreOS lives on after the Red Hat and IBM acquisitions.
You can do this with pretty much any tool you want. I also did it about 10 years ago with livecd-iso-to-pxeboot (and running from the generated initramfs)
The primary obstacle in 2018 is the lackluster support for read-only root filesystems. CoreOS/Alpine are fine. Writable memory filesystems are fine.
CoreOS is not even a little bit dead inside Red Hat after the acquisition. If you're not paying attention to the news from Fedora, CoreOS is alive and well, which says a lot about how it will go downstream.
[0]: https://www.tldp.org/HOWTO/Network-boot-HOWTO/index.html
The IBM acquisition is not final yet.