lhecker | 9 months ago

What I meant is that if I write a UTF-8 to UTF-16 conversion function for my editor in C, I can write:

  size_t convert(state_t* state, const void* inp, void* out)
In practice, this function works with both initialized and uninitialized data. It is also agnostic to whether the output buffer is a `u8` buffer (bytes to write out into a `File`) or a `u16` buffer (for then using the UTF-16 directly). I've never had to worry about whether this works (in this particular context; let's ignore any alignment concerns for writes into `out` in this example), and I don't recall running into any issues writing such code in a very long time.

If I write the equivalent code in Rust, I may write:

  fn convert(&mut self, inp: &[u8], out: &mut [MaybeUninit<u8>]) -> usize
The problem is now obvious to me, but at least my intention is clear: "Come here! Give me your uninitialized arrays! I don't care!". But this is not the end of the problem, because writing this code is theoretically unsafe. If you have a `[u8]` slice for `out`, you have to convert it to `[MaybeUninit<u8>]`, but then the function could theoretically write uninitialized data into it, leaving the caller's `[u8]` holding uninitialized bytes, and that's UB, isn't it? So now I have to think about this problem and write this instead:

  fn convert(&mut self, inp: &[u8], out: &mut [u8]) -> usize
...and that will also be unsafe, because now I have to convert my actual `[MaybeUninit<u8>]` buffer (for file writes) to `[u8]` for calls to this API.
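
A minimal sketch of that second situation, under stated assumptions: `convert_init` is a hypothetical stand-in that only copies bytes rather than doing a real UTF-8 to UTF-16 conversion, and `zero_init` is a hypothetical helper. It shows one way a caller holding a `[MaybeUninit<u8>]` buffer might reach the `&mut [u8]` signature without an unchecked cast, at the price of an extra pass that writes every byte first.

```rust
use std::mem::MaybeUninit;

/// Hypothetical stand-in for the real conversion: copies as many input
/// bytes as fit into `out` and returns how many were written.
fn convert_init(inp: &[u8], out: &mut [u8]) -> usize {
    let n = inp.len().min(out.len());
    out[..n].copy_from_slice(&inp[..n]);
    n
}

/// Hypothetical helper: writes a zero into every slot, then exposes the
/// buffer as ordinary initialized bytes.
fn zero_init(buf: &mut [MaybeUninit<u8>]) -> &mut [u8] {
    for slot in buf.iter_mut() {
        slot.write(0);
    }
    let len = buf.len();
    // SAFETY: every element was written by the loop above, and
    // `MaybeUninit<u8>` has the same size and alignment as `u8`.
    unsafe { std::slice::from_raw_parts_mut(buf.as_mut_ptr().cast::<u8>(), len) }
}
```

A caller would pass the result of `zero_init(&mut storage)` as `out`. The zeroing pass is exactly the work that an uninitialized buffer was meant to avoid, which is the trade-off being described here.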

Long story short, this is a problem that occupies my mind when writing in Rust, but not in C. That doesn't mean that C's many unsafeties don't worry me; it just means that this _particular_ kind of problem doesn't come up as an issue in the C code that I write.

Edit: Also, what usefulcat said.

ninkendo|9 months ago

Why wouldn’t you accept a &mut [MaybeUninit<u8>] and return a &mut [u8], hiding the unsafe bits that transmute the underlying reference?

Something like:

  fn convert<'i, 'o>(inp: &'i [u8], buf: &'o mut [MaybeUninit<u8>]) -> &'o mut [u8]
(Honest question, actually… because the above may be impossible to write and I’m on my phone and can’t try it.)

Edit: it works: https://play.rust-lang.org/?version=stable&mode=debug&editio...
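
For readers without access to the playground link, here is a rough sketch of what such a function could look like. It is not the linked code: the body is an assumption that only copies bytes in place of a real conversion, and it returns just the prefix of `buf` that was actually written.

```rust
use std::mem::MaybeUninit;

/// Sketch only: copies as many input bytes as fit into `buf` and returns
/// the initialized prefix as a plain `&mut [u8]`.
fn convert<'o>(inp: &[u8], buf: &'o mut [MaybeUninit<u8>]) -> &'o mut [u8] {
    let n = inp.len().min(buf.len());
    for (dst, &src) in buf[..n].iter_mut().zip(inp) {
        dst.write(src);
    }
    // SAFETY: the first `n` elements were initialized by the loop above, and
    // `MaybeUninit<u8>` has the same size and alignment as `u8`.
    unsafe { std::slice::from_raw_parts_mut(buf.as_mut_ptr().cast::<u8>(), n) }
}
```

The caller never touches `unsafe`: it passes something like `&mut [MaybeUninit::<u8>::uninit(); 8]` and only ever sees bytes that the function has written.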

lhecker|9 months ago

That's a fair workaround for my specific example. But I believe it's possible to contrive a different example where such a solution would not be possible. Put differently, I only tried to convey the overall idea of what I think is a shortcoming in Rust at the moment.

Edit: Also, I believe your code would fail my second section, as the `convert` function would have difficulty accepting a `[u8]` slice. Converting `[u8]` to `[MaybeUninit<u8>]` is not safe per se.
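
To illustrate that last point, here is a sketch with hypothetical helper names (`as_uninit_mut`, `fill`, `deinit`). The cast from `&mut [u8]` to `&mut [MaybeUninit<u8>]` is only sound if the callee never stores an uninitialized value through the view, which is why the sketch marks it as an `unsafe fn` with that contract spelled out.

```rust
use std::mem::MaybeUninit;

/// Views initialized bytes as possibly-uninitialized storage.
///
/// # Safety
/// Nothing may store an uninitialized value through the returned slice,
/// because the caller's `[u8]` must still be fully initialized afterwards.
unsafe fn as_uninit_mut(buf: &mut [u8]) -> &mut [MaybeUninit<u8>] {
    let len = buf.len();
    // SAFETY: `MaybeUninit<u8>` has the same size and alignment as `u8`;
    // the contract above covers what may be written through the view.
    unsafe { std::slice::from_raw_parts_mut(buf.as_mut_ptr().cast::<MaybeUninit<u8>>(), len) }
}

/// A callee that honors the contract: it only ever writes initialized values.
fn fill(out: &mut [MaybeUninit<u8>], value: u8) {
    for slot in out.iter_mut() {
        slot.write(value);
    }
}

/// An entirely safe function that would break the contract. Calling it on a
/// view from `as_uninit_mut` leaves the original `[u8]` holding uninitialized
/// bytes, so it is deliberately never called in this sketch.
#[allow(dead_code)]
fn deinit(out: &mut [MaybeUninit<u8>]) {
    for slot in out.iter_mut() {
        *slot = MaybeUninit::uninit();
    }
}
```

Both `fill` and `deinit` have the same safe signature, so the type system cannot tell them apart; the burden falls on whoever performs the cast.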