abiloe|3 years ago
This is a somewhat confusing analysis you have here. Direct reads and writes from memory, for all intents and purposes, don't block. Why do you say that reads and writes may also block?
The reason memory blocks is that it needs to page in or out from secondary storage -- which makes the statement "the only real difference between main memory blocking and disk blocking is the amount of time they may block" not really true.
kentonv|3 years ago
Sure it does! Main memory is much slower than cache, so on a cache miss the CPU has to stop and wait for main memory to respond. The CPU may even switch to executing another hardware thread in the meantime (that's what hyperthreading is). But if there isn't another hyperthread ready, the CPU sits idle, wasting resources.
It's not a form of blocking implemented by the OS scheduler, but it's pretty similar conceptually.
> The reason memory blocks is because it needs to page in or out from secondary storage
Nope, that's not what I was referring to (other than in the line mentioning swap).
abiloe|3 years ago
Cache is memory too. And which cache, by the way? Even L1 cache on modern processors doesn't have zero latency. And this is a rather poor way of describing hyperthreading: the CPU doesn't really "switch" -- the context for the alternate hardware thread is already available, and the resource stealing can occur for any kind of stall (including cache loads), not just memory. Calling this a "switch", suggesting it is like a context switch, is very misleading. It's not similar conceptually.
In any event, by this definition even a mispredicted branch or a divide becomes "blocking" - which sort of tortures any meaningful definition of blocking.
The essential difference is that memory accesses to paged-in memory (and branch mispredictions, cache misses) are not something you typically or reasonably trap outside of debugging. mmaps, swap, disk I/O, and network accesses are all delegated to the OS, and at that point are orders of magnitude more expensive than even most NUMA memory accesses. I sort of see where you're coming from, but I don't think it's a useful point.
bch|3 years ago
You're turning traditional blocking/non-blocking distinctions on their ear.
tremon|3 years ago
Let's define "may block" first, perhaps? What do we mean when we say "network I/O may block"? Usually, this means that the kernel may see your network request and raise you a context switch while it waits for the network response on your behalf. In your last sentence you appear to argue that the reason why the kernel performs a context switch is relevant in determining if an operation "may block", and the GP is arguing that that's a distinction without a difference.
If the definition of "may block" is really just "the kernel may decide to context-switch away from your program", then yes, the GP's assertion that file I/O, memory I/O (mmap), and memory access (swap) are all operations that may block is correct -- the only difference is one of degree: from microsecond delays for NVM-backed swap to multi-second delays for network transfers.
Or, of course, I may have misunderstood the GP's train of thought.
jesboat|3 years ago
Reads and writes from actual, physical, hardware memory might block, depending on how you define "block", in the sense that some reads may miss the CPU cache. But once you get to that point, you could argue that every branch might block, since a misprediction causes a pipeline stall. This is not a useful definition of "block".
The thing is, most programs are almost never low-level enough to be dealing with memory in that sense: they read and write virtual memory. And virtual memory can block for any number of reasons, including some pretty non-obvious ones. For example:
- the system is under memory pressure and that page is no longer in RAM because it got written to a swap file
- the system is under memory pressure and that page is no longer in RAM because it was a read-only mapping from a file and could be purged
-- e.g. it's part of your executable's code
- this is your first access to a page of anonymous virtual memory and the kernel hadn't needed to allocate a physical page until now
- you're in a VM and the VMM can do whatever it wants
- the page is COW from another process
kentonv|3 years ago
I think what I'm saying is that calling file I/O "blocking" is also not a useful definition of "block". Because I don't really see the fundamental difference between "we have to wait for main memory to respond" and "we have to wait for disk to respond".
> this is your first access to a page of anonymous virtual memory and the kernel hadn't needed to allocate a physical page until now
And said allocation could block on all sorts of things you might not expect. Once upon a time I helped debug a problem where memory allocation would block waiting for the XFS filesystem driver to flush dirty inodes to disk. Our system generated lots of dirty inodes, and we were seeing programs randomly hang on allocation for minutes at a time.
cout|3 years ago
To copy the page, the kernel needs to first find a free page frame. If there are no free page frames, the kernel will attempt to reclaim pages that are in use (https://www.kernel.org/doc/gorman/html/understand/understand...). This may result in process-mapped pages being swapped to disk, but it may also trigger disk writeback activity (https://lwn.net/Articles/396561/). In either case, control cannot be returned to the process until there is a free page to map into the process.
p12tic|3 years ago
Consider that regular memory access via cache takes around 1 nanosecond.
If the data is not in top-level cache, then we're looking at roughly 10 nanoseconds access latency.
If the data is not in cache at all, we're looking at roughly 50-150 nanoseconds access latency.
If the data is in memory, but that memory is attached to another CPU socket, the latency is higher still.
Finally, if the data access is via atomic instruction and there are many other CPUs accessing the same memory location, then the latency can be as high as 3000 nanoseconds.
It's not very hard to find NVMe attached storage that has latencies of tens of microseconds, which is not very far off memory access speeds.
slashdev|3 years ago
In addition to the cache misses you mention, there are also TLB misses.
Memory is not actually random access; locality matters a lot. SSD reads, on the other hand, are much closer to random access, but much more expensive.