
I summarized my understanding of Linux systems

342 points | lsc4719 | 1 year ago | github.com

82 comments


tmalsburg2|1 year ago

I learned a lot about this from the book "The Design of the UNIX Operating System" by Maurice J. Bach.¹ It's an old book and many details deviate from present-day Linux, but it nonetheless gives a great overview of the key components and ideas.

¹ https://books.google.de/books/about/The_Design_of_the_UNIX_O...

guerrilla|1 year ago

This is one of my favorite books. A true classic. There are follow-ups in that style for Linux and FreeBSD as well. I think Robert Love wrote the former.

cookiengineer|1 year ago

I can recommend taking a look at /proc and /sys, because that will clear up a lot of how things are intertwined and connected.

procfs is what's used by pretty much all tools that do something with processes, like ps, top etc.

The everything is a file philosophy becomes much more clear then. Even low level syscalls and their structs are offered by the kernel as file paths so you can interact with them by parsing those files as a struct, without actually needing to use kernel headers for compilation.
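As a rough illustration of the /proc point, here is a minimal Python sketch that reads /proc/self/status much as a tool like ps might. The helper names `parse_status` and `read_self_status` are made up for this example, and it assumes a Linux system with procfs mounted at /proc; elsewhere it simply returns None.

```python
import os

def parse_status(text):
    """Parse the 'Key:<tab>value' lines of a /proc/<pid>/status file into a dict."""
    fields = {}
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if sep:  # skip any line that is not of the form 'Key: value'
            fields[key.strip()] = value.strip()
    return fields

def read_self_status(path="/proc/self/status"):
    """Return the parsed status of the current process, or None without procfs."""
    if not os.path.exists(path):
        return None  # not Linux, or /proc is not mounted
    with open(path) as f:
        return parse_status(f.read())

if __name__ == "__main__":
    status = read_self_status()
    if status is not None:
        # Name and Pid are plain text fields; no kernel headers are needed.
        print(status["Name"], status["Pid"], status.get("VmRSS", "n/a"))
```

Everything here goes through ordinary open() and read() calls on a text file, which is the part that makes the "everything is a file" idea tangible.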

eBPF and its bytecode VM are a little over the top, but they are essential to know about in the coming years because a lot of tools are moving towards using their own BPF modules.

Cloudef|1 year ago

> The everything is a file philosophy becomes much more clear then.

To be honest, everything is a file is kind of a lie in unix. /proc and /sys are pretty much plan9 inspiration.

richardwhiuk|1 year ago

I don't understand what the boxes on this diagram are meant to represent.

It feels like an elaborate mechanism to draw something wrong in the hopes people will correct it.

projektfu|1 year ago

FWIW, I also don't really understand what the boxes are supposed to represent, given that the arrows represent dependencies like PID <-- process. I thought a PID was an attribute of a process?

To me, a block diagram might show [CPU Scheduler], [Virtual Device Manager], [VFS Manager], [Memory Manager], [Interrupt Handlers], etc...

Of course, my knowledge of Linux internals is limited and perhaps it has a separation of the concept of PID and process where there is a literal dependency.

sevagh|1 year ago

Interview prep.

dfc|1 year ago

My mental model of Linux does not have the CPU/Memory in user space. What am I missing?

suprjami|1 year ago

Nor does mine.

Userspace assembly runs directly on the CPU* executing in the unprivileged ring. When the userspace program makes a system call by calling a kernel entry function which is mapped into the process's address space by the dynamic loader, then part of that entry into kernelspace is to put the CPU into the privileged ring and kernel assembly then runs on the CPU.

The process scheduler can stop execution to kick a task off the CPU and switch to another one. Depending on the OS and kernel, some things can be kicked off the CPU and some cannot.

Userspace memory allocations are serviced by virtual memory where the page tables track the translation of virtual memory pages into physical memory pages using the MMU.

The kernel is involved during allocation and page fault, but iiuc a regular successful virtual memory access is a hardware operation only.
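To make the translation step concrete, here is a toy Python model of a single-level page table. It is only a sketch: the 4 KiB page size, the dict-based table, and the names `translate` and `PageFault` are assumptions for illustration, and real MMUs use multi-level tables walked in hardware.

```python
PAGE_SIZE = 4096  # assumed 4 KiB pages; real systems may use other sizes

class PageFault(Exception):
    """Raised when a virtual page has no mapping (where the kernel would step in)."""

def translate(page_table, vaddr):
    """Translate a virtual address using a dict of virtual page -> physical frame."""
    vpn, offset = divmod(vaddr, PAGE_SIZE)  # split into page number and offset
    if vpn not in page_table:
        raise PageFault(f"no mapping for virtual page {vpn}")
    return page_table[vpn] * PAGE_SIZE + offset

if __name__ == "__main__":
    table = {0: 7, 1: 3}  # hypothetical mappings
    print(hex(translate(table, 0x1234)))  # page 1, offset 0x234, frame 3 -> 0x3234
```

The successful path is pure arithmetic plus a lookup, which matches the point that an ordinary access needs no kernel involvement; only the missing-mapping case does.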

I don't have a diagram of how this works. Neither processes nor memory are my usual area of kernel.

You'd be better off reading the x86 version of the XV6 book to learn how this stuff really works. It's really well written and implements enough to be tangibly useful. Reading the code is optional when just learning concepts. Reading the XV6 code will hopefully help you understand the O'Reilly Linux books better, which will hopefully help you understand the actual Linux kernel better.

(*yes I'm aware CPUs don't directly execute assembly anymore, but the microcode guarantees the observable CPU state at any Instruction Pointer matches the expectation as if you were running assembly on a PDP or C64, or close enough for 99.999% purposes and definitely enough for debugging your program in gdb)

vbezhenar|1 year ago

Userspace program directly uses CPU and memory (unless you're using VM). In contrast to that, your userspace program does not directly access your network device or SSD, but uses kernel routines to access those indirectly.

knorker|1 year ago

Whenever I've made notes like this, it's never been useful to my future self nor to anyone else.

The only use I've had of this kind of documentation is that the process of writing it made me understand it better. Basically write-only documentation.

I would call myself a Linux expert, and while I can kinda see what you mean with this diagram, it would not have been useful to me back before I was an expert.

codelobe|1 year ago

Usually I would agree. I typically make a "Crash-Course in $PLATFORM" document while keeping notes. These I very commonly reference in order to externalize my memory since it seems to be approaching capacity. I don't care about Ruby on Rails, but once I did, and I can reference my notes if I ever need to touch that platform again.

persolb|1 year ago

It almost resembles mind mapping. It is a useful ‘process’ to figure out what you think/know. And it might be a pretty picture. But it isn’t very useful as documentation.

falserum|1 year ago

I found it useful. It allowed me to check whether my idea is similar to the author's.

timeforcomputer|1 year ago

Nice! I want to do something similar and map my understanding of Linux. I find some diagrams on Wikipedia fascinating (example: https://en.m.wikipedia.org/wiki/File:Linux_Graphics_Stack_20..., but more to do with the user library ecosystem rather than kernel and program runtime). These diagrams make me want to learn about each part and be able to comprehend in principle what is happening on my machine. Eventually...

Jasper_|1 year ago

Any diagram by ScotXW on Wikipedia is somewhere between misleading and completely wrong, and they're a constant pain for the Linux graphics community.

If you're curious about the details in this case, ScotXW confuses EGL and OpenGL, the arrows aren't quite right, and the labels aren't quite right either (DRM is labeled "hardware-specific" but KMS isn't? The label for "hardware specific Userspace interface to hardware specific direct rendering manager" is quite weird), and some important boxes are flat out missing. It's nitpicking for sure, but when the diagram goes out of its way to add extremely weird details, it demands nitpicking.

Nobody in the Linux graphics community would draw the diagram like this.

smitty1e|1 year ago

I think it needs three areas, not two:

1. User space

2. Kernel

3. Hardware/network

The kernel protects users from hardware, and hardware from users.

topspin|1 year ago

This is reasonable and correct. I would also have found places in that map for: dcache, block devices, character devices, scheduler, page cache and console/tty/pty. The first two replace "filesystem hierarchy". The second and third are ancient and fundamental classes of UNIX devices.

t1tos|1 year ago

this is analogous to the fs hierarchy: root protects from the user

thesuperbigfrog|1 year ago

"The Linux Programming Interface" by Michael Kerrisk is one of the best technical resources I have found and used to understand Linux:

https://man7.org/tlpi/

Description from the book's website:

"The Linux Programming Interface (TLPI) describes system programming on Linux and UNIX.

TLPI is both a guide and reference book for system programming:

If you are new to system programming, you can read TLPI linearly as an introductory guide: each chapter builds on concepts presented in earlier chapters, with forward references kept to a minimum. Most chapters conclude with a set of exercises intended to consolidate the reader's understanding of the topics covered in the chapter.

If you are an experienced system programmer, TLPI provides a comprehensive reference that you can consult for details of nearly the entire Linux and UNIX (i.e., POSIX) system programming interface. To support this use, the book is thoroughly cross referenced and has an extensive index."

peter_d_sherman|1 year ago

A future simple linux-like (or unix-like) OS -- could theoretically be created with only 4 syscalls:

open() read() write() close()

Such a theoretical linux-like or unix-like OS would assume quite literally that "everything is a file" -- including the ability to perform all other syscall/API calls/functions via special system files, probably located in /proc and/or /sys and/or other special directories, as other posters have previously alluded to...

Also, these 4 syscalls could theoretically be combined into a single syscall -- something like (I'll use a Pascal-like syntax here because it will read easier):

FileHandleOrResult = OpenOrReadOrWriteOrClose(FileHandle: Integer; Mode: Integer; pData: Pointer; pDataLen: Integer);

if Mode = 1 then open();

if Mode = 2 then read();

if Mode = 3 then write();

if Mode = 4 then close();

FileHandle is the handle for the file IF we have one; that's for read() write() and close() -- for open() it could be -1, or whatever...

Mode is the mode, as previously enumerated.

pData is a pointer to a pre-allocated data buffer, the data to read or write, or the full filename when opening...

(And of course, the OS could overwrite it with text strings of any error messages that occur... if errors should occur...)

pDataLen is the size of that buffer in bytes.

When the Mode is open(), pData contains a string of the path and file to open.

When Mode is read(), pData is read to, that is, overwritten.

When Mode is write(), pData is used to write from.

All in all, pretty simple, pretty straightforward...
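For what it's worth, the multiplexed call described above can be sketched in Python on top of the existing os.open/os.read/os.write/os.close wrappers. This is a toy under stated assumptions: the MODE_* constants, the snake_case function name, the fixed open flags (read/write, create if missing) and the `demo` helper are all invented here, since the description above does not specify them.

```python
import os
import tempfile

MODE_OPEN, MODE_READ, MODE_WRITE, MODE_CLOSE = 1, 2, 3, 4

def open_or_read_or_write_or_close(file_handle, mode, data, data_len):
    """Single entry point that multiplexes open/read/write/close by mode.

    data is a bytearray standing in for the pData buffer.
    Returns a new handle (open), a byte count (read/write), or 0 (close).
    """
    if mode == MODE_OPEN:
        path = bytes(data[:data_len]).decode()
        # Assumed flags: the original sketch has no way to pass them.
        return os.open(path, os.O_RDWR | os.O_CREAT, 0o600)
    if mode == MODE_READ:
        chunk = os.read(file_handle, data_len)
        data[:len(chunk)] = chunk  # overwrite the caller's buffer in place
        return len(chunk)
    if mode == MODE_WRITE:
        return os.write(file_handle, bytes(data[:data_len]))
    if mode == MODE_CLOSE:
        os.close(file_handle)
        return 0
    raise ValueError(f"unknown mode {mode}")

def demo():
    """Write b'hello' to a temporary file and read it back via the one call."""
    with tempfile.TemporaryDirectory() as d:
        path = bytearray(os.path.join(d, "demo.txt").encode())
        fd = open_or_read_or_write_or_close(-1, MODE_OPEN, path, len(path))
        msg = bytearray(b"hello")
        open_or_read_or_write_or_close(fd, MODE_WRITE, msg, len(msg))
        os.lseek(fd, 0, os.SEEK_SET)  # rewind; note this is a fifth operation
        out = bytearray(16)
        n = open_or_read_or_write_or_close(fd, MODE_READ, out, len(out))
        open_or_read_or_write_or_close(fd, MODE_CLOSE, bytearray(), 0)
        return bytes(out[:n])

if __name__ == "__main__":
    print(demo())
```

The demo has to reach for os.lseek to rewind the file, which hints at where a strict four-operation interface would need either a special control file or reopening the path.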

A "one syscall Linux or Unix (or Linux-like or Unix-like) operating system", if you will... for simplicity and understanding!

(Andrew Tanenbaum would be pleased!)

Related: "One-instruction set computer" (OISC): https://en.wikipedia.org/wiki/One-instruction_set_computer

richardwhiuk|1 year ago

That's already kind of how syscalls work - you shove the syscall number in a register, and then call an interrupt.

zzo38computer|1 year ago

I had considered that too, but what I had also considered, and that I think is better, is a different single syscall, which is more like an actor model or a capability-based system. (One problem with "everything is a file" as in Plan9 is that the operating system then has to parse the file paths every time you want to do any I/O; what I describe below ignores that problem since you can link directly to objects instead.)

A process has access to a set of capabilities (if it does not have any capabilities, then it is automatically terminated (unless a debugger is attached), since there is nothing for the program to do).

A "message" consists of a sequence of bytes and/or capabilities. (The message format will be system-independent (e.g. the endianness is always the same) so that it works with emulation and network transparency, described below.)

A process can send messages to capabilities it has access to, receive messages from capabilities it has access to, create new capabilities (called "proxy capabilities"), discard capabilities, and wait for capabilities.

Terminating the process is equivalent to a mandatory blocking wait for an empty set of capabilities; discarding all capabilities also terminates the process. A non-blocking wait for an empty set of capabilities means that you wish to yield, so that other processes get a chance to run, before this process continues.

Some further options may be needed to handle multiple locking and transactions, and to avoid some kinds of race conditions, but mostly that is just it.
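A toy Python model may help picture the rules above. It is a loose sketch, not the design itself: the class and method names are invented, each capability is collapsed into a single shared mailbox, and waiting, locking and transactions are left out entirely.

```python
from collections import deque

class Capability:
    """A mailbox-like object; holding a reference is what grants access."""
    def __init__(self, name):
        self.name = name
        self.inbox = deque()

class Process:
    def __init__(self, capabilities):
        self.capabilities = set(capabilities)
        # A process with no capabilities has nothing to do and is terminated.
        self.terminated = not self.capabilities

    def _check(self, cap):
        if self.terminated or cap not in self.capabilities:
            raise PermissionError("no access to this capability")

    def send(self, cap, payload=b"", caps=()):
        """Send bytes and/or capabilities; every attached capability must be held."""
        self._check(cap)
        for c in caps:
            self._check(c)
        cap.inbox.append((bytes(payload), tuple(caps)))

    def receive(self, cap):
        """Take the next message; capabilities it carries become accessible."""
        self._check(cap)
        if not cap.inbox:
            return None  # a real system would block or yield here
        payload, caps = cap.inbox.popleft()
        self.capabilities.update(caps)
        return payload, caps

    def create_proxy(self, name):
        """Create a new capability owned by this process."""
        if self.terminated:
            raise PermissionError("process is terminated")
        cap = Capability(name)
        self.capabilities.add(cap)
        return cap

    def discard(self, cap):
        """Drop a capability; discarding the last one terminates the process."""
        self.capabilities.discard(cap)
        if not self.capabilities:
            self.terminated = True
```

In this model a process can only hand on capabilities it already holds, which is the property that makes sandboxing fall out of the message-passing rule rather than needing separate permission checks.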

This is useful for many things, including sandboxing, emulation, network transparency (this can be done by one program keeping track of which capabilities need to be sent across the network link and assigning an index number to each one, and then the other end will create a proxy capability for each index number and use that number when it wants to send back), security with user accounts, etc; the kernel does not need to know about all of these things since they can be implemented in user code.

Other things (outside of the kernel) can also be implemented in terms of proxy capabilities, and I had ideas about those other parts of the operating system too. For example, it has a hypertext file system (with no file names, but files can contain multiple numbered streams, which can include both bytes and links to other files (which can be either to the current version or to a fixed version; if to a fixed version then copy-on-write will be used if the file is modified)), the "foreign links table", a common (binary) data format, and a command shell with some similarities to Nushell (but also many differences). The system uses the "Extended TRON Code" character set, and there are details about the working of the package manager, IME, window manager, etc.

xwowsersx|1 year ago

Could someone suggest hands-on resources for learning about kernels, such as a book or series on writing your own kernel? I'd like to gain a deeper understanding of their workings and I think hands-on or project based learning would help.

hnthrowaway0328|1 year ago

osdev would help. It's not a book but a website.

pjmlp|1 year ago

UNIX IPC is kind of missing: streams, pipes, message boxes, shared memory.

SUN RPC for NFS, yellow pages,...

begueradj|1 year ago

Why is a UML book listed as a reference?