A few months ago one of my friends sent a picture of some disassembled x86 they didn't understand. They guessed it was something like `alloca`, because it modified the stack pointer based on the function parameters, but before doing that it looped over the size in 0x1000-byte chunks. The body of the loop, however, was just `test [ecx], eax; cmp eax, 0x1000; jae ...`, which looks trivially like a no-op! `test` exists purely to set flags, but `cmp` overwrites the flags immediately afterwards, so the conditional jump doesn't depend on anything the `test` does.
But in reality, `test` does have a side effect: it dereferences ecx, which touches a memory page and can potentially trap. Compilers insert these seemingly pointless loops ("stack probes") for `alloca` because if you stack-allocate multiple pages' worth of memory at once and then touch only the later bytes, you can skip right over the stack guard pages that the kernel uses to page in memory on demand, or to crash the program if the stack would otherwise clash into the heap.
It's a really nice example of the "there's always something further down" kind of thinking you need for low-level stuff, imo. This came from reading x86 disassembly, which is an advanced topic that 90%(?) of programmers never have to care about because it's so "low level"... until you have to start caring about virtual memory, or your kernel's implementation details.
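To make the probe loop concrete, here is a minimal C sketch of the same idea. It is a simulation, not what any compiler actually emits: `probe_region` and `PAGE_SIZE` are names made up for illustration, the 4 KiB page size is assumed from the 0x1000 in the disassembly, and it walks an ordinary buffer rather than the real stack.

```c
#include <stddef.h>

#define PAGE_SIZE 0x1000u  /* assumption: 4 KiB pages, as in the disassembly */

/* Hypothetical helper: touch one byte per page of [base, base + size),
 * walking downward the way a downward-growing stack is extended.
 * Returns how many probes (reads) were performed. */
static size_t probe_region(volatile unsigned char *base, size_t size)
{
    volatile unsigned char *p = base + size;  /* stand-in for the old stack pointer */
    size_t remaining = size;
    size_t probes = 0;

    while (remaining >= PAGE_SIZE) {
        p -= PAGE_SIZE;
        (void)*p;              /* the read is the point, like `test [ecx], eax` */
        remaining -= PAGE_SIZE;
        probes++;
    }
    if (remaining > 0) {
        p -= remaining;        /* final partial page: land on the new "stack pointer" */
        (void)*p;
        probes++;
    }
    return probes;
}
```

The pointer is `volatile` so the compiler cannot discard the reads as dead code, which is the same reason the `test` in the disassembly is not really a no-op. On a real stack, each read would hit the guard page in order instead of jumping past it.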
Holy moley, you know I'd never thought very hard about how alloca would work if it skipped a stack guard. Here's a question though: wouldn't you have the same problem if you just had a really, really big stack frame in your function? If you enter the function and then call another function before touching any locals, you could jump past the stack guards while pushing arguments onto the stack, no? Presumably the compiler needs to anticipate this and grab those pages like in the alloca case?
This was a good article. For my money, one of the best conceptual introductions to virtual memory was an article by Jeff Berryman, over 40 years ago. It's been reprinted many times, including at https://en.m.wikisource.org/wiki/The_Paging_Game.
Curiously, you do not need memory virtualization if all you want is memory protection and process isolation: this can be achieved by a simpler piece of hardware that just ensures that a range of the upper bits of the memory address used by the process matches a certain pattern, or tag. This effectively partitions the physical RAM.
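A rough sketch of that check in C, under made-up assumptions: 32-bit addresses whose top 8 bits act as the tag, and the hypothetical names `TAG_SHIFT` and `access_allowed`. Real hardware would do this comparison in logic on every access; this only illustrates the arithmetic.

```c
#include <stdbool.h>
#include <stdint.h>

#define TAG_SHIFT 24u  /* assumption: top 8 bits of a 32-bit address are the tag */

/* Hypothetical check: an access is allowed only if the upper bits of the
 * address match the tag assigned to the running process. */
static bool access_allowed(uint32_t addr, uint32_t process_tag)
{
    return (addr >> TAG_SHIFT) == process_tag;
}
```

With these assumptions, a process tagged `0x05` may touch `0x05001234` but not `0x06000000`, so each tag owns one fixed 16 MiB slice of physical RAM.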
That sounds like a capability machine, e.g. CHERI [1].
It seems those might become relatively mainstream in a few years, as ARM seems to be jumping on board [2].
You could also implement something like software-isolated processes to get memory safety. However, the idea of virtual memory really fascinates me. It's such an elegant layer, and it brings you many other benefits besides memory safety.
chc4 | 5 years ago
CountSessine | 5 years ago
vincent-manis | 5 years ago
Koshkin | 5 years ago
fuklief | 5 years ago
[1]: https://www.cl.cam.ac.uk/research/security/ctsrd/cheri/
[2]: https://www.cl.cam.ac.uk/research/security/ctsrd/cheri/cheri...
monocasa | 5 years ago
t0350 | 5 years ago