Zig as an alternative to writing unsafe Rust

bpolverini|3 years ago

I buy the premise that Zig is better if you know you will have lots of pointer arithmetic going on. Having written a fair amount of unsafe C interop code in Rust, I feel like these critiques of the ergonomics are valid. The new #![feature(strict_provenance)] adds a new layer of complexity, that, I hope, improves some of this experience while adding safety. Rust's benefits are not free.

The benefits of Rust's (wonderful) model around references and lifetimes come at a significant cost to ergonomics when having to go into the Mordor of some C library and back. I usually find myself wishing I could have some macro where I just write in C and have it exposed as an unsafe back in Rust. I know I can do this by just writing a C dylib and integrating that, but now I've got two problems.

Even still, I prefer writing unsafe Rust to writing C. std::ptr forces me to ask the right questions and reminds me of just how easy it is to fall into UB in C as well.

WalterBright|3 years ago

> I usually find myself wishing I could have some macro where I just write in C and have it exposed as an unsafe back in Rust.

In D you can just import a .c file mycfile.c with:

    import mycfile; // C file filled with C functions

and they'll be treated as @system code by the D semantics. They're even inlinable.

swsieber|3 years ago

Here's one that does that: https://lib.rs/crates/inline-c

I'm not sure it's what you're looking for, but it seems like a good starting point.

As for the general thrust of your comment and the article, I agree. It'll be interesting to see what changes come to make things nicer.

zdimension|3 years ago

Shameless self-plug: https://github.com/zdimension/embed-c

This transpiles C to Rust at compile time, though it requires a nightly compiler and hasn't been updated in some time. But it's exactly what you're looking for. C code in, unsafe Rust code out.

Scarbutt|3 years ago

You are not alone. The other day I was checking out a new programming language and the author rewrote the unsafe rust part to zig: https://github.com/roc-lang/roc/blob/main/FAQ.md#why-does-ro...

Asooka|3 years ago

One big difference between unsafe Rust and C is that C compilers have flags to turn off the UB, so you have a lot less mental load when writing it. You go from e.g. "if this index calculation overflows, we may read from outside the array, because the bounds check was deleted" to "if this index calculation overflows, we may read from the wrong index, but never outside of the array bounds". UB is Damocles's sword and the speed gains are usually not worth it. With UB your program can enter a buggy state that you cannot detect, because "it cannot happen". Without UB, your program can still enter a buggy state, but you can detect it and potentially recover or crash immediately before even more things go wrong.
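For comparison, here is a minimal Rust sketch of the same distinction. The function names are hypothetical, made up for illustration: one index calculation wraps on overflow (so it may land on the wrong index) but still goes through a bounds-checked lookup, and the other detects the overflow explicitly.

```rust
/// Computes `base + offset` with wrapping arithmetic, then does a
/// bounds-checked lookup. An overflow can produce the wrong index,
/// but never a read outside of `data`.
fn lookup_wrapping(data: &[u8], base: usize, offset: usize) -> Option<u8> {
    let index = base.wrapping_add(offset);
    data.get(index).copied()
}

/// Same lookup, but an overflowing index calculation is reported
/// as `None` instead of silently wrapping around.
fn lookup_checked(data: &[u8], base: usize, offset: usize) -> Option<u8> {
    let index = base.checked_add(offset)?;
    data.get(index).copied()
}
```

With `data = [10, 20, 30]`, calling `lookup_wrapping(&data, usize::MAX, 2)` wraps around to index 1 and reads a valid but wrong element, while `lookup_checked` with the same arguments returns `None`.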

likeabbas|3 years ago

> I usually find myself wishing I could have some macro where I just write in C and have it exposed as an unsafe back in Rust.

There’s a macro for doing this with Assembly, I can imagine one could be made for C. But why wouldn’t you just write unsafe rust at that point?

ok123456|3 years ago

This pretty much mirrors my experience. Rust is the inverse of Perl: It makes the easy stuff hard.

Writing basic data structures isn't a niche, esoteric edge case. There may be a crate that "solves" what you're trying to do. But does it rely on the std---(i.e., is it unusable for systems programming)? Is it implemented making gratuitous copies of data everywhere? Does it have a hideous interface which will then pollute all of your interfaces? Does it rely on 'unstable' features?

Then, there's the 'community.' It seems to consist solely of extremely online people who get a dopamine hit from both telling people they're doing things wrong and creating the most complex solutions possible. They do this under a thin veneer of forced niceness, but it's not nice at all.

hgs3|3 years ago

> It seems to consist solely of extremely online people who get a dopamine hit from both telling people they're doing things wrong and creating the most complex solutions possible.

I've observed that certain programming languages have a culture of complexity. I'm not sure why this is. I can only speculate it's because these programmers are working on "boring" problems so they make busy work for themselves OR they're beginners who think this is how "real programmers" work.

While I think calling them "idiots" is a bit strong, I think this quote from the late Terry A. Davis is worth remembering: “An idiot admires complexity, a genius admires simplicity [...] for an idiot anything the more complicated it is the more he will admire it, if you make something so clusterfucked he can't understand it he's gonna think you're a god cause you made it so complicated nobody can understand it.”

mustache_kimono|3 years ago

> Rust is the inverse of Perl: It makes the easy stuff hard.

It's a bargain. Rust is pretty great, but it doesn't make some things easy because it would make everything else hard.

This comment is amusing, mostly because Perl is so full of tradeoffs. Do you want to write something to do some string parsing quickly? Great language, maybe. Do you want to understand what you've written later? Maybe not so great.

> Writing basic data structures isn't a niche, esoteric edge case.

Not if you're using C, because the batteries are definitely not included. Want a resizable array, or hashmap? The answer is DIY. Not in the std library. Whereas all are provided by default in Rust. Picking a basic data structure off the shelf is a pretty nice feature for most applications.

That said -- should you really need a custom implementation of a linked list, and you need to write and rewrite such an implementation all the time -- I'd understand if Rust wasn't your first choice.

> Then, there's the 'community.'

And I'm not sure anyone loves this attitude either. Keep it technical.

loeg|3 years ago

> [Rust] makes the easy stuff hard.

I totally disagree with that claim. The easy thing to do is not use unsafe Rust. You can use stringly-typed datastructures with lots of refcounting or copying, just like Perl, without ever venturing into unsafe Rust.

> Writing basic data structures isn't a niche, esoteric edge case. There may be a crate that "solves" what you're trying to do. But does it rely on the std---(i.e., is it unusable for systems programming)?

In what world is Perl suitable for systems where possible memory allocation is a problem?

eldenring|3 years ago

> Writing basic data structures isn't a niche, esoteric edge case

Maybe it isn't an edge case (although it should be), but it also isn't `easy` in a non-GC'd language, and it's a huge source of memory bugs.

I wouldn't say it makes the 'easy stuff hard' as much as the 'hard stuff appropriately difficult'.

dathinab|3 years ago

> Writing basic data structures isn't a niche, esoteric edge case

writing data structures _properly/well_ was never easy

it just looks easy and is nearly always a sub-par solution

e.g. a list in many lisp like languages seems simple, until you look under the hood at what magic tends to be used by more advanced compilers to make that list work fast

the thing people most commonly got wrong, which was supposedly easy, back when I was programming in school/university was data structures, even comparatively simple ones like doubly linked lists

it's like sorting, sure you find dozens of easy-to-implement sorting algorithms everywhere, but then when you look at the properly implemented sorting built-ins of standard libraries their complexity is hundreds of times that of quick sort or similar

xxpor|3 years ago

>Writing basic data structures isn't a niche, esoteric edge case.

It very much should be though! That's exactly the type of thing that should be written once by someone who knows what they're doing, and then reused 1000000 times. People slapping together a quick data structure is a huge problem in C.

ilrwbwrkhv|3 years ago

On top of that Rust might be the ugliest modern language.

Karrot_Kream|3 years ago

One technical criticism I have of the "community" is that a non-trivial portion of the community thinks that if the borrow checker forbids a program then the program is a bad program. None of the core contributors do this and many prominent library authors are upfront that this is not true, but I've read lots of comments and posts in Rust rooms about how the borrow checker should be the way to write and architect correct programs.

The Rust borrow checker imposes a style that is safe, but is not the only safe way to write that program.

nextaccountic|3 years ago

> It makes the easy stuff hard.

> Writing basic data structures isn't a niche, esoteric edge case.

Writing basic data structures isn't "easy stuff" in any low level language. It isn't in Zig by any means, nor in C nor in C++

pjmlp|3 years ago

There is already C for when one wants to do otherwise.

woodruffw|3 years ago

> Unsafe Rust is hard. A lot harder than C, this is because unsafe Rust has a lot of nuanced rules about undefined behaviour (UB) — thanks to the borrow checker — that make it easy to perniciously break things and introduce bugs.

I don't think this is correct: Rust makes writing unsafe Rust correctly more onerous than writing C, but the actual rules for undefined behavior are virtually the same as in C: if you alias where you must not, or mutate where you must not, etc. you're in exactly the same boat.

In other words: Rust makes it hard to write unsafe Rust correctly, but no harder than writing well-defined C. The only difference is that Rust raises the safety expectations by default, making unsafe Rust look more difficult than C.

hra5th|3 years ago

I don't agree that the rules for UB are virtually the same as in C. One example: if your unsafe Rust code modifies any memory address for which there exists a reference elsewhere, that is instantly UB. In C, that is not necessarily the case. https://www.youtube.com/watch?v=DG-VLezRkYQ has some good details on this.

Similarly, in Rust you have to be careful to never instantiate a value that is out-of-range for a given type (e.g. a bool with value > 1), even if you will never read or access that value before it is changed to something valid. In C this same concern does not exist since it is not insta-UB in the same way.

brundolf|3 years ago

This is not correct. Here's a really good video that goes into the differences: https://youtu.be/DG-VLezRkYQ

guipsp|3 years ago

The rust undefined behaviour rules are stricter than C: mutating a non mutable reference is UB, for example. Non mutable references don't exist in C.
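To make the point concrete: writing through a pointer derived from a plain `&T` is UB in Rust, and the sanctioned route for mutation behind a shared reference is `UnsafeCell` or one of its safe wrappers. A small sketch using `Cell`, with a hypothetical `Counter` type invented for this example:

```rust
use std::cell::Cell;

/// A counter that can be bumped through a shared reference.
struct Counter {
    hits: Cell<u32>,
}

/// Takes `&Counter`, not `&mut Counter`, yet may legally mutate `hits`
/// because `Cell` is built on `UnsafeCell`, which opts out of the
/// "no mutation behind a shared reference" rule.
fn record_hit(counter: &Counter) -> u32 {
    let next = counter.hits.get() + 1;
    counter.hits.set(next);
    next
}
```

C has no equivalent opt-in, because a `const T *` there does not promise that the pointee is immutable.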

kprotty|3 years ago

> the actual rules for undefined behavior are virtually the same as in C

1. Creating a mutable reference when there's other references to the same memory around, even if you don't use/deref that mutable reference, is considered UB in Rust; References there have the `dereferenceable` LLVM attribute, so the compiler is allowed to insert use/derefs at will to facilitate optimizations [0]. C's pointers are more like Rust's raw pointers: they only have to be valid upon use not at creation.

2. References in Rust are transitive (as noted in the blogpost) so holding a mut ref to T means you also hold a mut ref to all its fields/subfields semantically. If you're doing intrusive or self-referential data structures, it often requires having UnsafeCell fields to soundly create isolated mut refs from top-level shared refs. Problem being that core, language-level traits in Rust like Iterator and Future (generated by async blocks) take mut refs so implementing them (which is practically useful) on types with intrusive fields potentially being used elsewhere is UB [1]. This doesn't exist in C with no `dereferenceable` & opt-in `restrict`. It's still an unresolved issue in Rust though [2] where they had to disable LLVM annotations on problematic types/traits to avoid miscompilations [3]. Some of these footguns can be avoided by not using references and the core language traits (like the blogpost did), but they found that to not be a great programming experience.

3. Because of `dereferenceable` (again), instances of a type must be valid in-memory representations at all times, even when unused [4]. If you want invalid/uninit representations, you wrap the type in `MaybeUninit`, which is fairly unergonomic. C doesn't have this issue as it's only UB to deref invalid pointers or branch on invalid values (same case in Rust), not to have invalid values at all.

[0]: https://github.com/rust-lang/rust/issues/94133

[1]: https://gist.github.com/Darksonn/1567538f56af1a8038ecc3c664a...

[2]: https://github.com/rust-lang/rust/issues/63818

[3]: https://github.com/rust-lang/rust/pull/106180

[4]: https://doc.rust-lang.org/std/primitive.reference.html
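As a rough illustration of point 3, this is what the `MaybeUninit` dance tends to look like. It is only a sketch with a made-up function name, not code from the article:

```rust
use std::mem::MaybeUninit;

/// Fills a small array in place without ever materializing an
/// invalid `bool`. The slots stay `MaybeUninit<bool>` until each one
/// has been written.
fn make_flags() -> [bool; 4] {
    let mut flags = [MaybeUninit::<bool>::uninit(); 4];
    for (i, slot) in flags.iter_mut().enumerate() {
        // `write` initializes the slot without reading the old contents.
        slot.write(i % 2 == 0);
    }
    // SAFETY: the loop above wrote every one of the four slots.
    flags.map(|slot| unsafe { slot.assume_init() })
}
```

The C equivalent would simply declare the array uninitialized and assign into it; the extra wrapper type and the `unsafe` call are the ergonomic cost being described.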

Animats|3 years ago

If you're writing much unsafe code in Rust, you're doing it wrong.

OK, for a garbage collector, maybe you have to, because you're taking over memory management yourself. But very, very rarely do you need to do that. And when you do, you need very thorough testing, test tools, and documentation.

I just got done chasing someone else's pointer bugs with valgrind and gdb, in C code from a public crate three levels down from my code. Valgrind was useful in locating the area of trouble. The code there had too much unnecessary pointer manipulation, and offsets obtained from input which might be un-initialized memory. This never happens in safe Rust. Which is the whole point.

Most things for which C programmers use pointer arithmetic can be expressed as slices. Slices are pointer arithmetic, but with size information and sound rules.
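A small sketch of that claim, with hypothetical function names: the same sum written C-style over a raw pointer and then over a slice, plus a "skip the first N" variant where the bounds check replaces manual offset bookkeeping.

```rust
/// C-style: walk `len` elements starting at a raw pointer.
///
/// # Safety
/// `ptr` must point to `len` readable, initialized `i32` values.
unsafe fn sum_raw(ptr: *const i32, len: usize) -> i64 {
    let mut total = 0i64;
    for i in 0..len {
        // SAFETY: the caller guarantees `ptr.add(i)` is in bounds for `i < len`.
        total += i64::from(unsafe { *ptr.add(i) });
    }
    total
}

/// Slice version: the length travels with the pointer, so no `unsafe`.
fn sum_slice(values: &[i32]) -> i64 {
    values.iter().map(|&v| i64::from(v)).sum()
}

/// "Pointer plus offset" as a checked sub-slice: an out-of-range
/// `skip` yields 0 instead of reading past the end.
fn sum_tail(values: &[i32], skip: usize) -> i64 {
    values.get(skip..).map_or(0, sum_slice)
}
```

Only `sum_raw` puts a proof obligation on the caller; the other two cannot read out of bounds regardless of the arguments.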

(I'm a bit cranky this week. I've spent the last few weeks finding bugs in Rust crates that ought to Just Work.)

satvikpendem|3 years ago

Note, this is specifically talking about unsafe Rust versus Zig. Personally, I think unsafe does have some rough edges, and I'm looking forward to seeing how the Rust team manages to make it better in the future.

j16sdiz|3 years ago

UB in unsafe Rust sometimes "leaks" outside the unsafe scope and causes crashes elsewhere.

If rust can pair with a proof checker and let user write some correctness proof, it can be way more useful than the current borrow checker.

slaymaker1907|3 years ago

I think most of the difficulty people experience is when they try to naïvely use references anywhere they would normally use a pointer. That mostly works for functions, but this ends up getting really confusing and difficult for data objects. Instead, people should really be using things like Rc<T>, which makes certain patterns much simpler. People seem to have this ridiculous notion that using Rc<T> or heaven forbid Rc<Box<T>> is going to make their code slow, but in reality it can greatly simplify code at minimal performance cost when used in places where references would get complex. People generally don't say Swift is slow, but it uses reference counting all over the place.

lll-o-lll|3 years ago

People do say that Swift is slow though, and it has a whole bunch of optimisations to ensure the RC is elided whenever possible.

nextaccountic|3 years ago

Rc<Box<T>> really doesn't make sense (Rc is like a specialized Box), I guess you meant Rc<RefCell<T>>?
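For readers who have not used the pattern, here is a minimal sketch of `Rc<RefCell<T>>` on a data structure where plain references get awkward: a graph in which one node has two parents. The `Node` type and helper names are invented for this example.

```rust
use std::cell::RefCell;
use std::rc::Rc;

/// A node that may be shared by several parents.
struct Node {
    value: i32,
    children: Vec<Rc<RefCell<Node>>>,
}

fn new_node(value: i32) -> Rc<RefCell<Node>> {
    Rc::new(RefCell::new(Node { value, children: Vec::new() }))
}

/// Adds another owner of `child` under `parent` (bumps the refcount).
fn add_child(parent: &Rc<RefCell<Node>>, child: &Rc<RefCell<Node>>) {
    parent.borrow_mut().children.push(Rc::clone(child));
}

/// Sums values over every path from `node`; shared nodes count once per path.
fn sum_tree(node: &Rc<RefCell<Node>>) -> i32 {
    let n = node.borrow();
    n.value + n.children.iter().map(sum_tree).sum::<i32>()
}
```

No lifetimes appear anywhere, and a shared node can be mutated through any of its handles with `borrow_mut()`. The price is a refcount update per clone and a runtime borrow check per access. Note that a cycle of `Rc` handles would leak, which is what `Weak` is for.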

pjmlp|3 years ago

Using custom allocators to taint memory for security checks is no better than what C and C++ toolchains have been providing for decades.

I was already using debug allocators in Visual C++ 5.0, with a memory report at the program exit.

For the latest documentation,

https://learn.microsoft.com/en-us/cpp/c-runtime-library/crt-...

Karellen|3 years ago

Wait, the benchmark to find the 35th fibonacci number took 1.077s for the Zig VM vs 1.657s for the Rust VM?

I realise that these VMs are going to be totally idiomatic Zig/Rust, with the most straightforward implementation possible, and little-to-no performance tuning, but even so - that's gotta be a typo, right? Or it's actually finding `fib(350)`? Or each "run" is actually finding `fib(35)` 100 (1000?) times?

maxbond|3 years ago

I think the VM is operative here. My native Rust implementation found fib(35) in 51ms, but my Python implementation took about 1500ms (similar to their measurements).

(I eyeballed the assembly to make sure the native implementation was recursive, but that's the extent of my rigor - consider these napkin numbers.)

TheRealPomax|3 years ago

No caching, every recursive call runs the entire chain from scratch. fib(35) without caching takes tens of millions of calls to resolve (roughly 30 million with the usual `n < 2` base case).

fizx|3 years ago

It's intentionally using the naive 2^N solution to stress test lots of tiny function calls.
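The benchmark program itself is not quoted in this thread, so the following is only an assumed shape of it: the textbook doubly recursive definition with an `n < 2` base case, plus a call counter added here to show how much work it generates.

```rust
/// Naive recursive Fibonacci. `calls` is incremented once per
/// invocation so the total amount of work can be inspected afterwards.
fn fib_counted(n: u32, calls: &mut u64) -> u64 {
    *calls += 1;
    if n < 2 {
        return u64::from(n);
    }
    fib_counted(n - 1, calls) + fib_counted(n - 2, calls)
}
```

With this base case the number of calls is `2 * fib(n + 1) - 1`, so `fib_counted(35, &mut calls)` returns 9227465 after 29,860,703 calls. A different base case changes the exact count but not the exponential growth, which is why an interpreter's per-call overhead dominates the timing.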

chubot|3 years ago

The way I would frame this is that Rust has static (compile-time) memory management, and that conflicts with dynamic memory management (garbage collection).

The boundary is awkward and creates complexity.

I wrote a post about problems writing a garbage collector in C++, e.g. annotating the root set, and having precise metadata for tracing.

http://www.oilshell.org/blog/2023/01/garbage-collector.html

https://news.ycombinator.com/item?id=34350260

I linked to this 2016 post about Rust, which makes me think the problem could be worse in Rust, although I haven't tried it:

http://blog.pnkfx.org/blog/2016/01/01/gc-and-rust-part-2-roo...

I didn't write as much about bindings to native C++ code, but that's also an issue that you have to think about carefully. CPython has kind of been "stuck" with their API for decades, which exposes reference counting. So it's extraordinarily difficult to move to tracing GC, let alone moving GC, etc.

---

On the other hand, there was also a paper that said Rust can be good for writing GCs.

Rust as a Language for High Performance GC Implementation

https://dl.acm.org/doi/pdf/10.1145/2926697.2926707

However, I'm not sure it addresses the interface issue. One lesson I learned is that GCs are NOT modular pieces of code -- they have "tentacles" that touch the entire program!

That said, C++ is pretty good at "typed memory" as well, and I think it's more pleasant than C. That is, you get more than void* and macros. So I can believe that Rust has benefits for writing GC.

Not sure about Zig -- I can believe it's a nice middle ground.

kaba0|3 years ago

Dynamic memory management is an entirely different axis from manual/automatic, though.

(Safe) Rust indeed limits the user to a compile-time deallocable subset of what’s expressible (RC being an escape hatch), which depending on the problem domain may be too limiting.

andrewstuart|3 years ago

Zig seems more simple than Rust.

I’m willing to forgo certain Rust features in exchange for simplicity.

nvrspyx|3 years ago

The link is down for me. Perhaps HN hug of death? Below is a link to the site on the Wayback Machine.

https://web.archive.org/web/20230307172822/https://zackoverf...

jrsyo|3 years ago

Try ping "zackoverflow.dev" and "cname.vercel-dns.com". If only the former times out, it's possible your local ISP is blocking Vercel's IP. If that's the case, you could contact Vercel support for help.

mercurywells|3 years ago

Maybe you don't have *.dev lookups working properly?

graypegg|3 years ago

Check your hosts file, you might be mapping *.dev to localhost

lerno|3 years ago

This claim just killed me: "Apart from [Zig] not having crazy UB like in unsafe Rust". Zig has more UB than even C. Yes, Zig has safety checks you can turn on, but then it's not fast anymore. The claim saying Zig has no UB is like saying C has no UB because you can run it with UB-sanitizers.

Don't get me wrong, it's great to have this directly enabled in "safe" mode like Zig does it. But to use that to say Zig is more safe is extremely misleading.

kristoff_it|3 years ago

You should read the rest of the article, where the appropriate context is given to that sentence.

pyrolistical|3 years ago

Isn’t

    var ptr: [*]u8 = @ptrCast([*]u8, &slice[0]);

the same as

    var ptr = slice.ptr;

?

Or am I missing something

WalterBright|3 years ago

The `slice.ptr` version is allowed in D, but `&slice[0]` is preferred because that comes with a check that the slice has a non-zero length and the pointer will actually point to something valid. That's why the former is allowed in @system code, and the latter is used in @safe code.

nektro|3 years ago

yes it would be the same
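The same checked-versus-unchecked split exists in Rust, for what it's worth. A hedged sketch with made-up function names: `as_ptr()` plays the role of `slice.ptr`, and `first()` plays the role of a checked `&slice[0]`.

```rust
/// Unchecked: always hands back a pointer, even for an empty slice.
/// For an empty slice the pointer is non-null but must not be dereferenced.
fn first_ptr_unchecked(bytes: &[u8]) -> *const u8 {
    bytes.as_ptr()
}

/// Checked: only yields a pointer when there is a first element to point at.
fn first_ptr_checked(bytes: &[u8]) -> Option<*const u8> {
    bytes.first().map(|b| b as *const u8)
}
```

For a non-empty slice both return the same address; they differ only in what happens when the slice is empty.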

nyanpasu64|3 years ago

> If I have a raw pointer to an array of data (*mut T), I can turn it into a slice &mut [T], and I get to use a for ... in loop on it or any of the handy iterators (.for_each(), .map(), etc.).

I wonder if *mut [u8] would be a workable alternative to &mut [u8]. I haven't looked into creating such pointers yet, but `fn f(a: *mut [u8]) {}` is legal while `fn f(a: *mut [u8]) { a.len(); }` doesn't compile on stable due to https://github.com/rust-lang/rust/issues/71146. Looks like raw slice pointers aren't fully baked yet.
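For reference, a minimal sketch of both directions using only stable standard-library calls (`std::slice::from_raw_parts_mut` and `std::ptr::slice_from_raw_parts_mut`). The wrapper function names are hypothetical.

```rust
use std::{ptr, slice};

/// Turns a raw pointer plus length into `&mut [u8]` and iterates over it.
///
/// # Safety
/// `ptr` must be non-null, aligned, and point to `len` initialized bytes
/// that nothing else accesses while this function runs.
unsafe fn double_all(ptr: *mut u8, len: usize) {
    // SAFETY: guaranteed by the caller, see above.
    let bytes: &mut [u8] = unsafe { slice::from_raw_parts_mut(ptr, len) };
    for b in bytes.iter_mut() {
        *b = b.wrapping_mul(2);
    }
}

/// Builds a raw slice pointer (`*mut [u8]`) without creating a reference.
fn as_raw_slice(buf: &mut [u8]) -> *mut [u8] {
    ptr::slice_from_raw_parts_mut(buf.as_mut_ptr(), buf.len())
}
```

The raw `*mut [u8]` carries its length along like a slice does, but doing anything useful with it still means going through `unsafe`, for example by reborrowing it as `&mut *raw`.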

evnix|3 years ago

Look at Deno and Bun, two TS/JS runtimes: one is written in Rust and the other in Zig.

Bun repo is filled with issues surrounding segfaults, but I guess it gave them the advantage to get up and running quickly.

sundarurfriend|3 years ago

How significant is this to embedded development, eg. automotive software? I've been learning Rust on the side, and one of the main applications I have in mind is to get back into some embedded programming. I saw that there were libraries and even whole books about this [1], but I'm curious what the actual experience is like, and how much you have to wrangle raw pointers there.

[1] https://docs.rust-embedded.org/book/

steveklabnik|3 years ago

We do embedded work at Oxide. There's unsafe code, but not a ton of it. I haven't dug into this project's codebase at all, but the way it's described it sounds like there's way more than we use.

noncoml|3 years ago

A hammer is better at hammering nails than a screwdriver.

unknown|3 years ago

[deleted]

DeathArrow|3 years ago

I get that Zig is better than unsafe Rust. But what about general use? Is Zig's simplicity and speed of writing code worth the trade-off in safety?

SoraNoTenshi|3 years ago

I would say that different technologies serve a different purpose.

So saying that, e.g., Zig is generally better or worse than Rust doesn't make any sense to me, as both languages have a different purpose.

I would much rather use Rust than Zig to write a webserver (in production), simply because Rust serves this purpose better than Zig does. And I personally consider Zig to be my favourite language.

I would also never (at least right now) write website frontends in either Zig or Rust. I think JavaScript is the obvious choice in this case, because it has been developed (over years) for this (and unfortunately other) use case(s).

ksec|3 years ago

>There are endless debates online about Rust vs. Zig

I have never read anything that suggested or argued that Zig is better than Rust, or framed it as "Rust vs Zig". Not on HN, not on Reddit, not on Twitter. In fact this link / title is the first one. (I do wish the title was "Unsafe Rust" to better reflect the content.)

There are however plenty who still prefer Zig over Rust, even knowing when Rust is better.

I also want to note RESF generally does not consider "unsafe" Rust to be Rust.

Edit: LOL I knew this would be heavily downvoted.

infamouscow|3 years ago

[deleted]

henry_viii|3 years ago

[deleted]

dang|3 years ago

We've already asked you to stop taking HN threads into programming language flamewar. We don't want that here—it's tedious, and leads to repetitive discussions and then nasty ones.

No more of this on HN, please.

217 comments