Something to keep in mind - linear types are on their way[1], with exactly this use case in mind. Simon Peyton Jones gave an excellent presentation on the topic[2], briefly discussing exceptions, as well as giving a mention to ResourceT and the phantom type solution in the article (described as channel-passing).
I'm not convinced the current linear types proposals actually let us solve the problem, in the presence of exceptions. I may very well be missing something, or it may be that exceptions are rare enough that leaking resources until garbage collection only when an exception occurs is fine in practice.
The thing is that, in Haskell, even when you attach a function to run during destruction, the runtime doesn't guarantee that the function will be called promptly, or even at all. Rust drops (runs the destructor and deallocates) values as soon as they go out of scope; C++ too. In Haskell you depend on the whims of the GC, which makes RAII unusable. (The Haskell approach of not guaranteeing that destructors are called does have its merits; when many C++ and Rust programs are about to end, they spend their last few cycles uselessly deallocating memory that would've immediately been freed via _exit(2).)
Therefore the RAII style wouldn't really work in Haskell. The current bracket approach is still better than RAII in Haskell.
That said, the ST-style trick of a phantom type variable is pretty well-known. Unfortunately, not many people know that the same trick can be used outside of ST as well. I feel like, as a community, we should be encouraging this style more often.
UPDATE: I wrote the original comment with the incorrect assumption that drop functions will always be called in Rust. This is wrong. Please see child comments.
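As an aside on the phantom-type trick mentioned above: it has a rough Rust analogue too. This is only an illustrative sketch (all the names here are invented): an invariant phantom lifetime "brands" each handle so the type checker stops it from escaping the scope that created it.

```rust
use std::marker::PhantomData;

// Invented names throughout. The `*mut &'id ()` makes 'id invariant,
// so two different scopes can never be unified by the compiler.
struct Scope<'id> {
    _brand: PhantomData<*mut &'id ()>,
}

struct Handle<'id> {
    value: u32,
    _brand: PhantomData<*mut &'id ()>,
}

impl<'id> Scope<'id> {
    fn open(&self, value: u32) -> Handle<'id> {
        Handle { value, _brand: PhantomData }
    }
}

// The rank-2 type: the closure must work for *every* 'id, so any
// Handle<'id> it creates cannot outlive the call.
fn with_scope<R>(f: impl for<'id> FnOnce(&Scope<'id>) -> R) -> R {
    f(&Scope { _brand: PhantomData })
}

fn main() {
    let n = with_scope(|scope| {
        let h = scope.open(7);
        h.value // returning `h` itself would not compile: 'id is scope-local
    });
    assert_eq!(n, 7);
}
```

This is the same idea as `runST`'s rank-2 type, just spelled with a lifetime parameter instead of a type variable.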
I don't know if it's fair to call that "the Haskell approach", per se. That destructors are not guaranteed to run, or to run predictably, is generally a property of all fast garbage collectors. In a language like Haskell, where you get lots of small allocations in contexts where it would be difficult to impossible to efficiently determine the exact moment a scope dies (or to get the programmer to do so), you absolutely want a fast GC, and one of the costs of that is that you can't afford to run code for every destroyed object.
The linked post is interesting, because I didn't realise "RAII is a much better way of managing resources than destructors" was controversial. It absolutely is: RAII is fast, predictable, and flexible. Giving it up is also one of the tradeoffs some languages make to achieve more flexibility in their design, by enabling performant automatic garbage collection that doesn't require perfect escape analysis.
> when many C++ and Rust programs are about to end, they spend the last few cycles uselessly deallocating memory that would've immediately been freed via _exit(2)
This isn't useless because memory allocation can happen during destruction/exit, e.g. to write some data to the filesystem.
Suppose you have a container with a billion objects. The container's destructor iterates over each object, doing some housekeeping that requires making a copy and then deleting the original before moving on to the next object.
That requires memory equivalent to only one additional object, because each original is destroyed right after it is copied. Stop deallocating memory during destruction/exit and the total memory required doubles, because you have all the copies but still all the originals.
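A sketch of that pattern in Rust (the housekeeping body is invented): each original is dropped right after its copy is made, so the peak overhead stays at one extra object.

```rust
// Invented stand-in for the housekeeping loop described above: each
// original is dropped right after its copy, so at any moment at most
// one extra object's worth of memory is needed.
fn process_all(items: Vec<Vec<u8>>) -> usize {
    let mut processed = 0;
    for item in items {            // `item` is moved out of the container
        let copy = item.clone();   // temporary copy for the housekeeping
        drop(item);                // original deallocated before the next step
        processed += copy.len();   // stand-in for the real housekeeping work
    }
    processed
}

fn main() {
    // Three 4-byte items: 12 bytes processed in total
    assert_eq!(process_all(vec![vec![0u8; 4]; 3]), 12);
}
```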
There are also some helpful things that happen during deallocation. For example, glibc has double-free detection, which strongly implies a potential use-after-free, but it's only detected if the second free() actually gets called.
> The thing is that, in Haskell, even when you attach a function to run during destruction, the runtime doesn't guarantee that the function will be called promptly, or even at all.
However, this is different from the bracket pattern that the article is talking about. No one in the Haskell community advocates cleaning up resources (like file descriptors, etc.) using only destructors.
> when many C++ and Rust programs are about to end, they spend the last few cycles uselessly deallocating memory that would've immediately been freed via _exit(2)
Thank god they do this. How many times have I had to manually force Linux to release sockets because a badly coded C program opened sockets, forgot to release them, and left them hanging around for ~5 minutes after the process ended. With proper RAII classes this does not happen.
> The thing is that, in Haskell, even when you attach a function to run during destruction, the runtime doesn't guarantee that the function will be called promptly, or even at all.
There's also no guarantee that Rust/C++ destructors will be called. It's certainly less of an issue than depending on the GC to call them, but if you need absolute correctness, then you shouldn't rely on destructors.
You can also have the exact opposite problem with RAII, where a resource survives the end of a transaction, because there is still a live reference to it hidden away somewhere (say, due to some debugging code holding on to it).
This is a classical liveness vs. safety dualism. "Something good will eventually happen" and "nothing bad will ever happen" are promises whose solutions are often in conflict with one another.
The general problem — to make transactional state changes and transactional control flow (i.e. expectations about these state changes) match up precisely — is not easy to solve in the general case, especially once you move on to things that are less trivial than simple resource acquisition/release matching.
Oddly, Rust's ownership system really does solve these problems, and non-lexical lifetimes should eliminate accidental scope-broadening. Unless you are doing some mega-shenanigans, e.g. a MutexGuard gets released precisely when you think it does.
Your point about this being difficult to solve in the general case is true, it's just worth pointing out Rust intends to do that hard thing anyway.
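For instance, a MutexGuard's release point in safe Rust is exactly the end of its lexical scope (or an explicit `drop`); a minimal sketch:

```rust
use std::sync::Mutex;

fn main() {
    let m = Mutex::new(0);
    {
        let mut guard = m.lock().unwrap();
        *guard += 1;
    } // `guard` is dropped here, so the lock is released at exactly this brace
    // The lock is free again: a second lock() on this thread succeeds
    // immediately instead of deadlocking.
    assert_eq!(*m.lock().unwrap(), 1);
}
```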
That's also a problem with garbage collection, by the way. GC means memory safety, but not necessarily correctness. In fact, it invites sloppiness, and a kind of sloppiness that sits at a more conceptual level, which could be harder to fix.
Keeping a debug reference at the end of a transaction (via a shared-reference type, since a non-shared RAII reference type could never get into that state) isn't a coding error, it's a design error -- the developer intentionally requested contradictory things. That is solved by using weak references if you don't want a debug tool to force an object to stay alive.
How do you handle errors at resource release? When you close a file, the final writes take place, and they can fail. What's the idiom in Rust for getting them out?
Python's "with" clause, and the way it interacts with exceptions, is the only system I've seen that gets this right for the nested case.
> How do you handle errors at resource release? When you close a file, the final writes take place, and they can fail. What's the idiom in Rust for getting them out?
That is unclear. Currently, `File::drop` ignores all errors and drops them on the floor ([unix], [windows]). This is a concern both long-standing and ongoing[0].
If you really care about reliability, you implement transactional semantics on your output storage: i.e. writes are not globally visible until an explicit, system-wide atomic commit is performed.
The destructor would instead be in charge of performing the rollback actions for an uncommitted transaction, if any. Rollback cannot fail, and indeed the system must preserve integrity even if the rollback is not performed, as there is no guarantee that the process won't be killed externally.
Of course if you do not care about data integrity, swallowing errors in close is perfectly acceptable.
Edit: in general, destructors should only be used to maintain the internal integrity of the process itself (freeing memory, closing fds, maintaining the coherency of internal data structures), not of external data or the whole system. It is fine to do external cleanup (removing temporary files, clearing committed transaction logs, unsubscribing from remote sources, releasing system-wide locks, etc.), but it should always be understood to be a best-effort job.
A reliable system needs to be able to continue in all circumstances (replaying or rolling back transactions on restart, cleaning up leftover data, heartbeating and timing out on connections and subscriptions, using lock-free algorithms or robust locks for memory shared between processes, etc.).
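A minimal sketch of that division of labor, with invented names: `Drop` performs only an infallible, best-effort rollback, and does nothing once `commit` has run.

```rust
use std::sync::atomic::{AtomicBool, Ordering};

// Invented names; the "rollback" is just a flag so the behavior is visible.
static ROLLED_BACK: AtomicBool = AtomicBool::new(false);

struct Transaction {
    committed: bool,
}

impl Transaction {
    fn new() -> Self {
        Transaction { committed: false }
    }
    fn commit(mut self) {
        // ... atomically make the writes globally visible here ...
        self.committed = true; // the Drop below now becomes a no-op
    }
}

impl Drop for Transaction {
    fn drop(&mut self) {
        if !self.committed {
            // Best-effort rollback: must be infallible, and the system must
            // stay consistent even if we never get to run it.
            ROLLED_BACK.store(true, Ordering::SeqCst);
        }
    }
}

fn main() {
    {
        let _t = Transaction::new();
    } // dropped uncommitted: the destructor rolls back
    assert!(ROLLED_BACK.load(Ordering::SeqCst));

    Transaction::new().commit(); // committed: its Drop does nothing
}
```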
Python's "with" construct is analogous to the bracket pattern in Haskell that the article is talking about. It also works in the nested case in the presence of exceptions. Furthermore, the issue that Michael has with the bracket pattern in Haskell can also happen in Python.
I don't think Rust can report a failing destructor other than by panic!ing. AFAIK the best you can do, if you want to handle errors on close, is to call `flush()` (which does return errors) before dropping the object. Of course that nullifies the benefits of RAII.
I don't know if there's an elegant way to solve this. If Rust had exceptions you could use those, but then again, in C++ it's often explicitly discouraged to throw in destructors, because you can end up in a bad situation if you throw an exception while propagating another one. How does Python's "with" handle that?
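A sketch of the flush-before-drop workaround (the helper and file name are invented): do all the fallible work explicitly and return it as a `Result`, so the implicit drop at the end has nothing meaningful left to fail at.

```rust
use std::fs::File;
use std::io::{self, Write};
use std::path::Path;

// Invented helper: every fallible step happens explicitly, so errors come
// back as a Result instead of being swallowed by File's Drop.
fn write_report(path: &Path, data: &[u8]) -> io::Result<()> {
    let mut f = File::create(path)?;
    f.write_all(data)?;
    f.sync_all()?; // surface write-back errors while we can still return them
    Ok(())
    // `f` is dropped here; whatever the underlying close reports is ignored,
    // but by now the data is already known to have reached the disk.
}

fn main() -> io::Result<()> {
    let path = std::env::temp_dir().join("raii_demo.txt");
    write_report(&path, b"hello")?;
    assert_eq!(std::fs::read(&path)?, b"hello");
    Ok(())
}
```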
I do prefer the "with"/"try-with-resources" approach because it is explicit.
With RAII in C++ there's no visual difference between dumb data objects and objects like locks that are created and held on to mainly to cause implicit side effects.
In Rust this also prevents the compiler from dropping objects early - everything must be held until the end of its scope for the 0.1% of cases where you're using RAII to manage some externally visible resource. In those cases I would like the programmer to denote "the exact lifetime of this object is important", so the reader knows where to pay attention.
Java has the try-with-resources statement, and C# has the using statement. They're alternative forms of the try statement and they're functionally equivalent to Python's with statement using contextlib.closing.
The error handling in close is platform specific, so you have to convert the file into a raw fd/handle and then pass it to the appropriate libc methods.
Interesting! Michael is one of the more prolific writers and practitioners in the Haskell space (I read just about everything he writes), so it is interesting to also read his take on Rust.
fn leak() {
    // Create a 1KiB heap-allocated vector
    let b = Box::new(vec![0u8; 1024]);
    // Turn it into a raw pointer, giving up ownership...
    let _p = Box::into_raw(b);
    // ...and return without ever calling Box::from_raw,
    // so the allocation is never freed
}
Obviously that's kind of blatant, but there are more subtle ways to leak memory. Memory leaks aren't considered unsafe, so even though they're undesirable the compiler doesn't guarantee you won't have any.
Reference cycles when using Rc<T> are a big one, but generally it's pretty hard to cause leaks by accident. I've only run into one instance of leaking memory outside of unsafe code, and that was caused by a library issue.
The same way that memory leaks are possible in Java: rather than a technical bug (you forgot to `free` some buffer), you have a semantic bug (you're holding on to a pointer to the data after you're done with it, and that keeps the data alive).
Granted, the ownership/borrowing semantics of rust make this a lot harder, but anything that uses Rc/Arc can easily fall prey to it — you can use those to create a reference cycle.
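For instance (a made-up two-node structure): a cycle through `Rc<RefCell<...>>` keeps each node's strong count above zero even after the local handles are gone, so neither destructor ever runs and the nodes leak.

```rust
use std::cell::RefCell;
use std::rc::{Rc, Weak};

// Invented two-node structure that points at itself.
struct Node {
    next: Option<Rc<RefCell<Node>>>,
}

// Builds a -> b -> a and returns only a Weak observer.
fn make_cycle() -> Weak<RefCell<Node>> {
    let a = Rc::new(RefCell::new(Node { next: None }));
    let b = Rc::new(RefCell::new(Node { next: Some(Rc::clone(&a)) }));
    a.borrow_mut().next = Some(Rc::clone(&b));
    Rc::downgrade(&a)
    // `a` and `b` go out of scope here, but each node is still kept alive
    // by the other's strong reference: neither destructor runs.
}

fn main() {
    let observer = make_cycle();
    // The nodes were never deallocated: the upgrade still succeeds.
    assert!(observer.upgrade().is_some());
}
```

Breaking the cycle by making one of the links a `Weak` is the usual fix.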
Memory leaks are not only possible but they are officially supported. The most obvious being `std::mem::forget` and `Box::leak`. Of course the user of these functions should usually ensure that drop is eventually called for all initialized data but there's no way to enforce this.
If you mean unintentional leaks, then that is a harder problem. Others have noted Arc and Rc leaks, but thread locals may (or may not) leak as well[0].
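Both officially supported escape hatches mentioned above are one-liners (illustrative values only):

```rust
// Box::leak trades an owned allocation for a usable &'static reference.
fn leaked_string() -> &'static mut String {
    Box::leak(Box::new(String::from("hi")))
}

fn main() {
    let v = vec![1u8, 2, 3];
    std::mem::forget(v); // safe: the Vec's destructor never runs, its buffer leaks

    let s = leaked_string();
    s.push('!'); // the leaked value stays usable for the rest of the program
    assert_eq!(s.as_str(), "hi!");
}
```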
Because the standard library has reference counting with no static checking to avoid cycles, and it was decided to also have safe mem::forget since it can be (mostly) emulated with the former.
It has no such static checking because it was deemed to reduce expressiveness, while not impacting memory safety.
> Rust's safety guarantees do not include a guarantee that destructors will always run. For example, a program can create a reference cycle using Rc, or call process::exit to exit without running destructors. Thus, allowing mem::forget from safe code does not fundamentally change Rust's safety guarantees.
The mechanical point of this article is pretty clear:
- it's possible to be unsafe in both Haskell and Rust when dealing with resource cleanup
- Rust does a bit of a better job in the general case, though it has its own warts (see the other comments; it's hard to deal with issues during `drop`-triggered cleanup)
I want to make a muddier meta point -- Rust is the best systems language to date (does anyone know a better one I can look at?).
- The person who wrote this article, Michael Snoyman[0], is mainly a Haskell developer; he's the lead developer behind arguably the most popular Haskell web framework, Yesod[1].
- Haskell developers generally have a higher standard for type systems, and spend a lot of time (whether they should or not) thinking about correctness due to the proactivity of the compiler.
- These are the kind of people you want trying out and enjoying your language, if only because they will create/disseminate patterns and insight that make programming safer and easier for everyone down the line -- research languages (Haskell is actually probably tied for the least "researchy" these days in the ML camp) are the Mercedes-Benzes of the programming world -- the safety features trickle down from there.
- Rust is not an ML-family language -- it's a systems language
- People who write Haskell on a daily basis are finding their way to Rust, because it has a pretty great type system
When was the last time you saw a systems language with a type system so good that people who are into type systems were working with it? When was the last time you saw a systems language that scaled comfortably and gracefully from embedded systems to web services? When did you last see a systems language with such a helpful, vibrant, excited community (TBH I don't think this can last), backed by an organization with values like Mozilla's?
You owe it to yourself to check it out. As far as I see it, Rust has two main problems:
- Learning curve for one of its main features (ownership/borrowing)
- Readability/ergonomics (sigils, etc. can make Rust hard to read)
Admittedly, I never gave D[2] a proper shake, and I've heard it's good, but compared with the safety and the emphasis on zero-cost abstractions that Rust offers me, it's a non-starter. Rust is smart so I can be dumb. C++ had its chance, and it just has too much cruft for not enough upside -- there's so much struggle required to modernize, to make decisions that Rust has had from the beginning (because it's so new). It might be the more stable choice for a corporate project a few hundred people strong today or next month, but I can't imagine a future where Rust isn't the premier backend/systems language for performance-critical (and even not-so-critical) programs within the next ~5 years.
I'll go even one step further and say that I think how much Rust forces you to think about ownership/borrowing, and about how memory is shared around your application, is important. Just as Haskell might force you to think about types more closely/methodically (and you're often better for it), Rust's brand of pain seems instructive.
> When was the last time you saw a systems language that scaled comfortably and gracefully from embedded systems to web services?
Have a look at ATS[1]; it supports many of the features that are available in Rust, and lets you build proofs about your code's behaviour. It's quite type-annotation heavy though, IIRC, but it's very efficient.
[1] https://arxiv.org/abs/1710.09756 [2] https://www.youtube.com/watch?v=t0mhvd3-60Y
AFAIK discussion has gone no further than https://github.com/rust-lang-nursery/api-guidelines/issues/6...
[unix]: https://github.com/rust-lang/rust/blob/master/src/libstd/sys...
[windows]: https://github.com/rust-lang/rust/blob/master/src/libstd/sys...
[0] https://www.reddit.com/r/rust/comments/5o8zk7/using_stdfsfil...
pjmlp|7 years ago: It is done properly in other languages as well, especially if they allow for trailing lambdas.
[0]: https://doc.rust-lang.org/std/thread/struct.LocalKey.html#pl...
https://doc.rust-lang.org/std/mem/fn.forget.html
[0]: https://www.snoyman.com/
[1]: https://www.yesodweb.com/
[2]: https://dlang.org/
[1] : http://www.ats-lang.org