Async Rust: Panics vs. Cancellation

[+] pornel|4 years ago|reply

I really like cancellation in Rust, and I don't mind that it's so forceful. Maybe it's because I've worked with Node.js which is all the way at the other end of the spectrum with no way of aborting promises externally. In Node if you want to abort an async operation, you need to thread a DIY signal/flag/callback throughout your entire program, or you'll be left with orphaned tasks still running. Compared to that, Rust's futures that just immediately stop when you abort them are a luxury.

I've written quite a bit of async Rust code, and I haven't hit the problem described in the blog post. I think there are a couple of reasons:

• Rust already has programming patterns (like Drop guards) that avoid leaving the program in an inconsistent state in case of panics, and this solves the problem for aborted futures too. If your task owns a TCP stream, then it will always close the stream when it's aborted.

• Aborting and error handling are usually tied together, e.g. `timeout(copy()).await?` aborts a future, but turns that into an error. Especially for network-related tasks it's natural to treat timeouts as yet another case of an I/O error. Each end needs to handle suddenly closed connections anyway, regardless of whether they're closed by Rust or by a network error along the way.

• Rust uses layered abstractions for async code, e.g. you have request/response servers or streams for protocols. This way you can't leave the network protocol in an inconsistent state, even if your request handling code is aborted. If you were writing to a multiplexed HTTP/2 stream, then the underlying protocol handler will send an end-of-stream packet for you when your higher-level response stream is dropped.
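The Drop-guard point can be sketched with the standard library alone. This is a minimal illustration, not how any particular runtime does it: `CloseOnDrop`, `StallForever`, `NoopWaker` and `cancel_mid_await` are hypothetical names, and the future is polled by hand where a runtime such as Tokio would normally do it.

```rust
use std::future::Future;
use std::pin::Pin;
use std::sync::atomic::{AtomicBool, Ordering};
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};

/// Waker that does nothing; enough to poll a future by hand (hypothetical helper).
struct NoopWaker;
impl Wake for NoopWaker {
    fn wake(self: Arc<Self>) {}
}

/// Future that never completes, standing in for a stalled network read (hypothetical).
struct StallForever;
impl Future for StallForever {
    type Output = ();
    fn poll(self: Pin<&mut Self>, _cx: &mut Context<'_>) -> Poll<()> {
        Poll::Pending
    }
}

/// Stand-in for a resource such as a TCP stream: records that it was closed.
struct CloseOnDrop {
    closed: Arc<AtomicBool>,
}
impl Drop for CloseOnDrop {
    fn drop(&mut self) {
        self.closed.store(true, Ordering::SeqCst);
    }
}

/// Polls a task once so it is suspended at an await, then aborts it by dropping it.
/// Returns whether the resource owned by the task was closed.
fn cancel_mid_await() -> bool {
    let closed = Arc::new(AtomicBool::new(false));
    let flag = Arc::clone(&closed);
    let mut task = Box::pin(async move {
        let _conn = CloseOnDrop { closed: flag };
        StallForever.await; // the task is suspended here when it gets aborted
    });

    let waker = Waker::from(Arc::new(NoopWaker));
    let mut cx = Context::from_waker(&waker);
    assert!(task.as_mut().poll(&mut cx).is_pending());
    assert!(!closed.load(Ordering::SeqCst)); // still open while the task is alive

    drop(task); // "cancellation": the future is simply dropped
    closed.load(Ordering::SeqCst)
}
```

Calling `cancel_mid_await()` returns `true`: dropping the suspended future drops `_conn`, so the cleanup in `Drop` runs without the task ever being resumed.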

[+] wging|4 years ago|reply

One gap is that async drops aren't currently a thing: you cannot do any async work to clean up a resource. I imagine Niko will point that out in the next post. See also https://boats.gitlab.io/blog/post/poll-drop/

[+] codeflo|4 years ago|reply

Maybe the use of the word “cancel” can mislead some people, because the term means a much more controlled way of shutdown in other ecosystems. For example, you can’t really stop Tasks in C# unless you explicitly pass around a CancellationToken and check it at strategic points — very similar to the one in Tokio: https://docs.rs/tokio-util/latest/tokio_util/sync/struct.Can...

What dropping futures does in Rust is much more forceful, and possibly meant for a different use case, or as a lower-level primitive.
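For contrast, cooperative cancellation in the CancellationToken style can be sketched roughly like this. `CancelFlag` and `process` are hypothetical names for illustration; this is not the tokio_util API, only the same idea built on an `AtomicBool`.

```rust
use std::sync::atomic::{AtomicBool, Ordering};
use std::sync::Arc;

/// Hypothetical cooperative cancellation flag, cheap to clone and share.
#[derive(Clone, Default)]
struct CancelFlag(Arc<AtomicBool>);

impl CancelFlag {
    fn cancel(&self) {
        self.0.store(true, Ordering::SeqCst);
    }
    fn is_cancelled(&self) -> bool {
        self.0.load(Ordering::SeqCst)
    }
}

/// Sums the items, checking the flag at a strategic point (once per item).
/// Returns Err with the partial sum if cancellation was requested.
fn process(items: &[u32], flag: &CancelFlag) -> Result<u32, u32> {
    let mut sum = 0;
    for item in items {
        if flag.is_cancelled() {
            return Err(sum); // the work decides where and how it stops
        }
        sum += item;
    }
    Ok(sum)
}
```

The difference from dropping a future is that the code being cancelled chooses its own stopping points, and it keeps running if it never checks the flag.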

[+] SigmundA|4 years ago|reply

Cancelling in Rust sounds a lot like Thread.Abort() in .NET [1], which just injects an exception arbitrarily into the thread. It is generally frowned upon because it can corrupt shared state, and it has been completely removed in later versions of .NET.

Unless you treat threads like a full process (no shared state), you need cooperative cancellation (CancellationToken), which generally works well. It's nice to have a commonly agreed-upon cancellation message. It is most commonly used in async I/O calls, like a long-running DB query that you want to cancel if the client disconnects because they navigated away.

If you have uncooperative code that may, say, go into an infinite loop, then you need to somehow preempt it if it runs too long. That leaves you with Thread.Abort or, really, Process.Kill, which is the only safe way to do this in .NET (run it in another process).

[1] https://docs.microsoft.com/en-us/dotnet/api/system.threading...

[+] noitpmeder|4 years ago|reply

I find it really interesting that the author's example assumes that the read() and send() calls are the ones you need to worry about w/r/t exceptions. To me, the parse() call seems the most volatile -- what guarantees are there that the bytes you just read are parsable?

I usually code my programs assuming the system will work (reads/writes/sends). While I know this isn't guaranteed, it's a lot more likely my filesystem will work than that a file is assured to contain parsable data.

[+] maxwell86|4 years ago|reply

> To me, the parse() call seems the most volatile -- what guarantees are there that the bytes you just read are parsable?

None, which is why, in Rust, parse never throws an exception on a parsing error and instead returns a `Result<T, ParseError>` (with `ParseError` standing in for the target type's own error type). That is an ADT holding either an Ok(T), which means parse succeeded, or an Err(ParseError), which means that an error happened, and which contains state about what went wrong.

See its documentation: https://doc.rust-lang.org/stable/std/primitive.str.html#meth...

The author isn't talking about this, probably because it's not the point of the article, but in Rust, you can't avoid handling parsing errors. If you want to actually get the value, you need to handle the possibility that an error happened. The programming model does not give the user a choice here.
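A small sketch of what that looks like in practice, using `str::parse` into a `u16`. The function names and the 8080 fallback are made up for the example.

```rust
use std::num::ParseIntError;

/// Parses a port number; the caller receives a Result, not a bare value.
fn parse_port(text: &str) -> Result<u16, ParseIntError> {
    text.trim().parse::<u16>()
}

/// To get at the value, the Err case has to be dealt with somehow.
/// Here it falls back to a default (8080 is an arbitrary choice for the sketch).
fn port_or_default(text: &str) -> u16 {
    match parse_port(text) {
        Ok(port) => port,
        Err(_) => 8080,
    }
}
```

`parse_port("443")` yields `Ok(443)`, while `parse_port("https")` and `parse_port("70000")` both yield an `Err` (the second because the value does not fit in a `u16`).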

[+] wallacoloo|4 years ago|reply

it’s a blog about async, and especially cancelation. parse presumably doesn’t do any I/O. read/send are the interesting bits because they hit the network, where latency is practically unbounded. a first approach at cancelation might be to make precisely those I/O routines cancelable via (say) error injection, in such a way that none of the non-I/O code ever needs to care specifically about cancelation. so read/send is where the “interesting” cancelation logic lives or hooks into.

[+] berkes|4 years ago|reply

> assuming the system will work (reads/writes/sends)

I encounter these issues quite commonly. Permissions, being in the wrong dir, buggy setup that forgot to create directories or copy files etc.

Aren't `read()` and `send()` in Rust concerned with those less-exceptional exceptions as well?

[+] msopena|4 years ago|reply

> I find it really interesting that the author's example assumes that the read() and send() calls are the ones you need to worry about w/r/t exceptions.

I didn't read it that way. In my view, the article is explaining a mental model in which async cancellation and panics are similar or related. He's using that snippet of code because, from an async perspective, one could reasonably expect that the file or network I/O is worth awaiting but parsing is not.

But I don't think the author assumes parsing couldn't raise an exception since he states at the beginning: "If the parse function or the send function were to throw an exception, whatever data had just been read (and maybe parsed) would be lost.".

[+] arielb1|4 years ago|reply

It's more that in Rust, async cancellation will occur exactly at await points, which are generally where you do I/O.
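A rough way to see this with only the standard library, assuming a hand-rolled poll loop in place of a real runtime. `StallForever`, `NoopWaker` and `steps_before_cancel` are hypothetical names, and the "read", "parse" and "send" steps are just labels pushed into a log.

```rust
use std::cell::RefCell;
use std::future::Future;
use std::pin::Pin;
use std::rc::Rc;
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};

/// Waker that does nothing; enough to poll a future by hand (hypothetical helper).
struct NoopWaker;
impl Wake for NoopWaker {
    fn wake(self: Arc<Self>) {}
}

/// Future that never completes, standing in for slow network I/O (hypothetical).
struct StallForever;
impl Future for StallForever {
    type Output = ();
    fn poll(self: Pin<&mut Self>, _cx: &mut Context<'_>) -> Poll<()> {
        Poll::Pending
    }
}

/// Runs a task up to its first await, drops it, and returns the steps that ran.
fn steps_before_cancel() -> Vec<&'static str> {
    let log = Rc::new(RefCell::new(Vec::new()));
    let sink = Rc::clone(&log);
    let mut task = Box::pin(async move {
        sink.borrow_mut().push("read");
        sink.borrow_mut().push("parse"); // synchronous: cannot be interrupted
        StallForever.await; // the only place this task can be cancelled
        sink.borrow_mut().push("send"); // never reached once the task is dropped
    });

    let waker = Waker::from(Arc::new(NoopWaker));
    let mut cx = Context::from_waker(&waker);
    let _ = task.as_mut().poll(&mut cx); // runs until the await, then returns Pending
    drop(task); // cancel while suspended at the await

    let steps = log.borrow().clone();
    steps
}
```

The returned log contains "read" and "parse" but not "send": everything before the await ran to completion, and nothing after it ran at all.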

[+] throw10920|4 years ago|reply

> The reason is that long experience with exceptions has shown that exceptions work really well for propagating errors out, but they don’t work well for recovering from errors or handling them in a structured way.

That's why you use a condition system like Common Lisp[1] - conditions can be recovered from using restarts. The thrown condition signals the kind of error (or, even, exceptional non-error circumstance), while the defined restarts provide various error-recovery strategies, from which one can be chosen programmatically or manually.

I think that this part:

> In most programs, you have some kind of invariants that you are maintaining to ensure your data is in a valid state. It’s relatively straightforward to ensure that these invariants hold at the beginning of every operation and that they hold by the end of every operation. It’s really, really hard to ensure that those invariants hold all the time.

Can be fixed by adding immutability - instead of mutating your program's state in a low-level function, either generate a "transaction" object that is finished and applied to some other piece of state, or an exception is thrown and the whole transaction object is thrown away (or a particular loop could be restarted, or the program could be entirely restarted or quit, etc., depending on what restart you choose).
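One way to read the "transaction object" idea, sketched in Rust with made-up names (`Account`, `plan_withdrawals`, `apply`): all changes are computed on a draft copy, and the real state is replaced only if every step succeeds.

```rust
#[derive(Debug, Clone, PartialEq)]
struct Account {
    balance: i64,
}

/// Builds the new state on a copy; the original is never touched here.
fn plan_withdrawals(account: &Account, amounts: &[i64]) -> Result<Account, String> {
    let mut draft = account.clone();
    for &amount in amounts {
        if amount > draft.balance {
            // On failure the half-updated draft is simply thrown away.
            return Err(format!("cannot withdraw {amount}, only {} left", draft.balance));
        }
        draft.balance -= amount;
    }
    Ok(draft)
}

/// Commits the finished transaction in one step, or leaves the state as it was.
fn apply(account: &mut Account, amounts: &[i64]) -> Result<(), String> {
    *account = plan_withdrawals(account, amounts)?;
    Ok(())
}
```

If any withdrawal fails, `apply` returns the error and the account still holds its original balance, so the invariants only need to hold for the finished draft, not at every intermediate step.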

This:

> The problem is that exceptions make errors invisible, which means that programmers don’t think about them.

Can be fixed using smarter tooling (or checked exceptions, but I don't think anyone wants those) that ensures that you handle exceptions/conditions, unless I'm mistaken.

[1] https://en.wikipedia.org/wiki/Common_Lisp#Condition_system

[+] arielb1|4 years ago|reply

Maybe that's the subject for the next blog post, but I think the main reason cancellation causes more trouble than ordinary I/O problems is that with ordinary errors you assume that the resource that suffered the error is down, and you don't care about its precise state, while with cancellation the resource is perfectly OK and you want to continue using it.

26 comments