Interestingly enough, Rust does not allow you to access undefined memory, not even if you do not care about the value stored there. People have been proposing a `freeze` operation that replaces uninitialized memory with garbage but initialized data (i.e. a no-op in assembly).
But there is tension about this: Not allowing access to uninitialized memory, ever, means that you get more guarantees about what foreign (safe) Rust can do, for instance.
My impression was that there was some kind of optimization in LLVM that relied on being able to assume values were never undef[0], which is why undefined memory access was always illegal in Rust[1].
Putting that aside, a deliberate "read uninitialized memory with bounded UB" primitive like freeze would only work for types where all possible bit patterns are valid. So no freezing chars[2], references, or sum types. And any transparent wrapper type that has invariants - like, say, slices, vecs, strs, and/or range-restricted integer types - would see them utterly broken when frozen. I suppose you could define some operation to "validate" the underlying bit pattern, but I'm not sure if that would defeat the point of reading uninitialized memory.
[0] LLVM concept that represents uninitialized memory, among other things.
[1] I believe a few other unsafe Rust concepts are actually leaky abstractions around LLVM things
[2] Rust's char must hold valid UTF-8 and will UB if you stick surrogates in there
One thing worth pointing out is that Linux makes it pretty difficult for userspace to access uninitialized memory; the MAP_UNINITIALIZED flag for mmap has to be specifically configured but generally isn't, so the memory does get zeroed at some point. Best you can hope for is that your heap allocator re-uses some un-munmapped memory. The kernel will zero pages on-demand, which helps, but you're still paying a cost for that zeroing.
An elegant optimization, but how would you intersect two of these efficiently? It sounds like you'd need to iterate over the entire dense vector and do a sparse-vector check for each and every value (O(m) with a very high constant factor). Either that, or sort both sparse vectors (O(n log n)).
jojomodding|7 months ago
But there is tension about this: Not allowing access to uninitialized memory, ever, means that you get more guarantees about what foreign (safe) Rust can do, for instance.
thrance|7 months ago
kmeisthax|7 months ago
Putting that aside, a deliberate "read uninitialized memory with bounded UB" primitive like freeze would only work for types where all possible bit patterns are valid. So no freezing chars[2], references, or sum types. And any transparent wrapper type that has invariants - like, say, slices, vecs, strs, and/or range-restricted integer types - would see them utterly broken when frozen. I suppose you could define some operation to "validate" the underlying bit pattern, but I'm not sure if that would defeat the point of reading uninitialized memory.
[0] LLVM concept that represents uninitialized memory, among other things.
[1] I believe a few other unsafe Rust concepts are actually leaky abstractions around LLVM things
[2] Rust's char must hold valid UTF-8 and will UB if you stick surrogates in there
dooglius|7 months ago
Sesse__|7 months ago
dzaima|7 months ago
dooglius|7 months ago