Candle: Torch Replacement in Rust

365 points | osanseviero | 2 years ago | github.com

182 comments

[+] jstx1|2 years ago|reply

It's cool that this is coming from huggingface and it isn't just someone's hobby project.

Rust is in a weird place where people are trying to push it in a bunch of places where it doesn't belong naturally (web dev, data science) but then when you complain that it's difficult to use you get told that you're using the wrong language for your use case...

I really want to like Rust - I've read the Rust book, done the Rustlings exercises, read Zero to Production in Rust and did the project there, and still it's not clicking - writing Rust is a slog where you have to jump through hoops and spend a lot more time thinking about programming language abstractions than thinking about the problem you're solving.

So I'm not in a rush to rewrite anything written in Python just yet.

[+] unshavedyak|2 years ago|reply

It may just be for different people, not sure. I'm a Rust fanatic, i use it for everything, including web dev. We even have a shop full of similar Rust folks.

Personally it's my most productive language, and this is after years in Python, NodeJS, and ~5 years Go.

Which isn't to attempt to invalidate your experience. But i do think:

> writing Rust is a slog where you have to jump through hoops and spend a lot more time thinking about programming language abstractions than thinking about the problem you're solving.

Is quite a leap. A language you describe as not speaking natively is, unsurprisingly, difficult for you. That is a far cry from saying it's always that way.

You of course shouldn't be in a rush to rewrite anything unless you see the value in it. I make no claims of that value, either. Rewrite It In Rust (RIIR) is a meme, imo a destructive meme. But i do take issue with labeling any language broadly.

Most languages are pretty good when you're very fluent in them, and while they do have tradeoffs, i don't think Rust's tradeoffs are as you put them.

[+] wokwokwok|2 years ago|reply

Yeah, but the goal of this isn’t “use this instead of Python to hack up a model”

…it’s:

> And simply removing Python from production workloads. Python can really add overhead in more complex workflows and the GIL is a notorious source of headaches.

You know, those hugging face guys kindaaaaa do know what they’re talking about.

Like, really.

Python is a pain in the ass in production environments.

[+] spoiler|2 years ago|reply

> writing Rust is a slog where you have to jump through hoops and spend a lot more time thinking about programming language abstractions than thinking about the problem you're solving.

Have you tried "sticking" with it for a while? Try forcing yourself to use it[1]. I find that's the best way to learn it (and try reading more code from open source projects in Rust that you like).

I have an opposite experience writing Rust: it allows me to focus on the problem more easily, especially when the problem domain is complex and the codebase grows.

[1] a trick often used while learning human languages

[+] brigadier132|2 years ago|reply

There are two kinds of complexity: upfront complexity and hidden complexity. Rust with the borrow checker significantly shrinks the space of possible programs that can be written to solve a problem. Some people think this is bad because they can't write software in a way they find natural. I like it because, of the possible programs that Rust makes impossible to write, the vast majority are incorrect and have bugs.

I also really like that when I use libraries, I know that all the library authors are also subject to these same restrictions and I can trust their code much more than I could in other languages.

[+] zer8k|2 years ago|reply

I was with you until you mentioned Python. The use cases for the two languages are entirely orthogonal. Rust is a fantastic C++ replacement (nothing will replace C).

My problem with Rust, now on month 3 of trying to love it, is exactly what you mentioned. It's a slog. Comparing it to other languages you have to know a lot more about systems. Initially, this isn't a problem, but as it turns out it becomes a problem with Rc/Arc/Box/etc. These are really just leaky abstractions in practice with strange behavior in many cases. Sure, you can write Rust without ever needing that. But if you ever need something more specialized than BTreeMap you're going to be chest deep in a kafkaesque nightmare of lifetimes, weird garbage collection, crazy types (Option<Box<Rc<Cell...) etc. It's at this point the helpful compiler becomes a mortal enemy. A sisyphean task made worse by the compiler suggesting the wrong things at the wrong time because you're so deep in lifetime/box/[A]Rc hell that not even God himself can save you.

As long as you can mostly use the standard library and mostly avoid ever needing to do something where resources are shared, Rust isn't that bad. Once you start sharing resources your life becomes extremely painful extremely fast. In some cases this is good, requiring you to think, but in many cases it's bad due to the leaky abstractions over memory when you need to do something more. In this case, despite its major flaws, C++ wins in my opinion. I have a problem with the abstractions over pointers in Rust because they require me to STILL know how the things underlying those abstractions work.
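
For readers who have not hit this yet, here is a minimal sketch of what "sharing resources" tends to look like in single-threaded Rust: one node owned by two parents via `Rc<RefCell<...>>`. The `Node` type and the `new_node` and `total` helpers are made-up names for illustration, not anything from Candle or from the comment above.

```rust
use std::cell::RefCell;
use std::rc::Rc;

// Hypothetical tree node whose children may be shared between parents.
#[derive(Debug)]
struct Node {
    value: i32,
    children: Vec<Rc<RefCell<Node>>>,
}

// Rc gives shared ownership; RefCell moves the borrow check to run time.
fn new_node(value: i32) -> Rc<RefCell<Node>> {
    Rc::new(RefCell::new(Node { value, children: Vec::new() }))
}

// Recursively sum a node and everything below it.
fn total(node: &Rc<RefCell<Node>>) -> i32 {
    let n = node.borrow(); // panics if the node is mutably borrowed right now
    n.value + n.children.iter().map(total).sum::<i32>()
}
```

Pushing `Rc::clone(&shared)` into two parents and then writing `shared.borrow_mut().value = 5` is visible through both. The cost the comment describes is real: the aliasing rules are now enforced at run time, and the wrapper types show up in every signature.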

Rust is going through its 15 minutes of fame. Much like Ruby, Elixir, etc., you will have fans trying to shoehorn every possible case into it. It's a systems language. It is very good at what it is. It definitely does not belong in Data Science, though it could be used to make the underlying systems-level code found in Tensorflow/Torch/Pandas/etc. a little safer. But at what cost?

I'm being hard on Rust because I think in 5 years time it'll begin to replace C++ for most new systems level code. I hope they continue to revise the API.

[+] Fiahil|2 years ago|reply

Prototype in Python, but scale in Rust.

I do a lot of AI Engineering in Python, but production Data Engineering, and Feature Engineering workloads are done in parallel with Rust. Simple and Fast.

You can even make custom-made Python modules with a Rust core when NumPy matrices are not enough to crack your graph-traversing problem efficiently! (Thanks, PyO3 and Maturin.)

[+] lionkor|2 years ago|reply

This has been my experience, too. Somewhere in Rust, there is a beautiful, functional(ish), pragmatic and simple language, hiding in between all the crappy things.

So it's just like C++ ;) except it has pattern matching and const by default.

[+] brundolf|2 years ago|reply

It'll never not have a trade-off! But there are some things that can make it easier in many cases:

- Copy structs: ownership issues vanish when things can be bitwise-copied

- .clone(): don't be afraid to clone outside of hot loops! If allocations are a problem, look at using Rc instead of Box where applicable

- Pure functions/methods: you don't need mutable references if you aren't mutating things

- Macros: reduce boilerplate

- Purpose-built traits: you can hang methods off of different structs, including std or builtin structs, that can make certain flows really ergonomic

In general: Rust asks you to care about all the little details at a language level, but it also empowers you in different ways to build high-level abstractions on top of those details so that you often don't have to care about them. I'm finding that I have the smoothest time with it when I use traits and macros to make my own little DSL for the problem I'm trying to solve, and then write my business logic in that.
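
A minimal sketch of three of the points above (a Copy struct, a pure function, and a purpose-built trait on a builtin type), using only the standard library. `Point`, `midpoint` and the `Mean` trait are hypothetical names chosen for illustration.

```rust
// A small Copy type: passing it by value leaves the caller's copy usable.
#[derive(Debug, Clone, Copy, PartialEq)]
struct Point {
    x: f64,
    y: f64,
}

// Pure function: takes values and mutates nothing, so no `&mut` is involved.
fn midpoint(a: Point, b: Point) -> Point {
    Point {
        x: (a.x + b.x) / 2.0,
        y: (a.y + b.y) / 2.0,
    }
}

// Hypothetical purpose-built trait hung off a builtin type (slices of f64).
trait Mean {
    fn mean(&self) -> Option<f64>;
}

impl Mean for [f64] {
    // Returns None for an empty slice instead of dividing by zero.
    fn mean(&self) -> Option<f64> {
        if self.is_empty() {
            None
        } else {
            Some(self.iter().sum::<f64>() / self.len() as f64)
        }
    }
}
```

With the trait in scope, `vec![1.0_f64, 2.0, 3.0].mean()` reads like a builtin method, and `midpoint(a, b)` leaves both `a` and `b` available afterwards because `Point` is `Copy`.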

And then: many libraries offer you their own abstractions that give you the same sort of benefit. Concurrency normally involves a bunch of ownership and locking headaches, but something like rayon makes it breezy by giving you an abstraction that handles those details for you. Async web route handlers with shared state from scratch would normally be hard, but axum and Rocket give you abstractions that handle the gross stuff for you. One of my favorite crates I recently discovered is just called "memoize", and it just gives you a macro you can slap on any old function to memoize its results! (global mutable state!) https://crates.io/crates/memoize. This would be a mess to implement yourself, but the abstraction makes it breezy.
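
To give a feel for the "global mutable state" such a macro hides, here is a hand-rolled sketch using only the standard library. It is an assumption about the general shape of the problem, not a description of how the memoize crate is implemented; `slow_square`, `memoized_square`, `CACHE` and `CALLS` are made-up names.

```rust
use std::collections::HashMap;
use std::sync::atomic::{AtomicUsize, Ordering};
use std::sync::{Mutex, OnceLock};

// Process-wide cache, created lazily on first use.
static CACHE: OnceLock<Mutex<HashMap<u64, u64>>> = OnceLock::new();
// Counts real computations so the caching effect can be observed.
static CALLS: AtomicUsize = AtomicUsize::new(0);

// Stand-in for an expensive pure function (hypothetical workload).
fn slow_square(n: u64) -> u64 {
    CALLS.fetch_add(1, Ordering::SeqCst);
    n * n
}

// Looks the argument up in the cache and computes it only on a miss.
// The lock is held while computing, which is simple but serializes callers.
fn memoized_square(n: u64) -> u64 {
    let cache = CACHE.get_or_init(|| Mutex::new(HashMap::new()));
    let mut guard = cache.lock().unwrap();
    let value = *guard.entry(n).or_insert_with(|| slow_square(n));
    value
}
```

Calling `memoized_square(12)` twice runs `slow_square` once. Every memoized function needs its own static, key type and locking decision, which is the boilerplate a macro can generate.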

And then personally: the main reason I'd use a Rust library like the OP over a Python library is the tooling/dependency management. Running a Rust project with dependencies Just Works and that's extremely valuable to me. But, it'll always be weighed against the costs of using a lower-level language.

[+] gtani|2 years ago|reply

I'd invite you to keep up on improvements in tooling like rust-analyzer and auto-fixing clippy warnings (tho understanding internals of linters, borrow and type checkers is a full time job and then some)

https://www.infoworld.com/article/3267624/whats-new-in-the-r...

[+] dkarl|2 years ago|reply

Rust needs a prototyping/scripting/application language on top of it, with 80% of the speed and 99% of the safety, with automatic memory management (reference counting or GC) and at least one idiomatic way of doing concurrency that disallows sharing mutable state. A language that is tailor-made for building applications quickly on top of the Rust ecosystem, with a build tool that gracefully handles mixed projects if you need to build part of your application in Rust, and support for writing small shebang scripts.

Also, I'd like a pony, a jetpack, and VIP passes to the premiere of Dune: Part 2.

[+] synergy20|2 years ago|reply

A similar situation here for me. Rust might be a modern Ada for a particular use case (systems programming), but I was puzzled why Stack Overflow ranked it as 'the most loved' language for years. For each language-related post on HN, you will always see Rust lovers' posts saying "I wish it's in rust", or "there is a rust version", or "rust is even better". The reality does not match up with the enthusiasm for me though, what am I missing?

After each try with Rust, I ended up going back to c++ myself, it's just more productive in practice.

[+] say_it_as_it_is|2 years ago|reply

Python stands on the shoulders of giants writing C libraries. Those people think a lot about problems you haven't had to deal with.

[+] zengid|2 years ago|reply

I can attest to this. I really think it depends on your background and what your project is. Rust has a lot of baggage from its use in systems programming that many folks coming from C++ already know, and a lot of strange functional programming things that ML-family language folks already know, so there's just a lot of new things crammed together. Now add move-by-default semantics, and tight restrictions on pointers, and it can get really hard. The key is getting to the point where you just try to use value semantics as much as possible, and stop trying to write everything perfectly the first time (i.e. use lots of `.clone` and `.unwrap` so you can get your code written down to do the thing you're trying to do).

I am hopeful that languages like Mojo and Val are going to find cleaner ways of doing what Rust does best, and that Rust can learn from them.

[+] zackoverflow|2 years ago|reply

I don't do anything related to data science, but I feel like doing it in Rust would be nice.

You get operator overloading, so you can have ergonomic matrix operations that are typed also. Processing data on the CPU is fast, and crates like https://github.com/EmbarkStudios/rust-gpu make it very ergonomic to leverage the GPU.

I like this library for creating typed coordinate spaces for graphics programming (https://github.com/servo/euclid). I imagine something similar could be done to create refined types for matrices so you can't multiply matrices of incompatible sizes.
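
A rough sketch of that idea with const generics and operator overloading, standard library only. `Matrix` here is a toy type made up for illustration; it is not the API of euclid, Candle or any other crate.

```rust
use std::ops::Mul;

// Toy matrix whose shape (R rows, C columns) is part of its type.
#[derive(Debug, Clone, Copy, PartialEq)]
struct Matrix<const R: usize, const C: usize> {
    data: [[f64; C]; R],
}

// (R x K) * (K x C) = (R x C). The shared K forces the inner dimensions
// to agree, so a shape mismatch is a compile error, not a runtime panic.
impl<const R: usize, const K: usize, const C: usize> Mul<Matrix<K, C>> for Matrix<R, K> {
    type Output = Matrix<R, C>;

    fn mul(self, rhs: Matrix<K, C>) -> Matrix<R, C> {
        let mut out = [[0.0; C]; R];
        for i in 0..R {
            for j in 0..C {
                for k in 0..K {
                    out[i][j] += self.data[i][k] * rhs.data[k][j];
                }
            }
        }
        Matrix { data: out }
    }
}
```

With this, a `Matrix<2, 3>` times a `Matrix<3, 2>` type-checks and yields a `Matrix<2, 2>`, while a `Matrix<2, 3>` times another `Matrix<2, 3>` is rejected by the compiler. The trade-off is that shapes must be known at compile time, which does not fit models whose dimensions come from a loaded file.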

[+] yieldcrv|2 years ago|reply

The best case I can see is that it saves small increments of time that add up.

Like if your build server is backed up, maybe the Rust version of your language's compiler and interpreter will make your build server not be backed up because it does everything faster.

But for everyone else it's not that useful to be using the Rust version of everything.

It's just another one of those things where you see lucrative job offers requiring Rust, so you make up reasons to have impactful-ish experiences in Rust.

[+] ramesh31|2 years ago|reply

>So I'm not in a rush to rewrite anything written in Python just yet.

I think it's becoming clear that Python is to ML what JS is to the Web. We got stuck with it through a series of historical coincidences, and no one is super happy about it. But it mostly does the job and has evolved to fit the bill.

I'm not a huge fan but it's definitely here to stay. Hopefully Rust can take off but I don't see any other viable replacement.

[+] pjmlp|2 years ago|reply

Yeah, Rust's sweet spot is systems programming and scenarios where no form of automatic memory management is tolerated, regardless of the reasons. For everything else there are enough AOT-compiled languages with some form of automatic memory management, and features for low-level coding if really needed.

[+] switchbak|2 years ago|reply

I'm wondering, as Mojo matures, if it'll naturally take over these roles. That's what its core mission is, and it's being developed by the very best in the biz.

I like Rust, but Mojo seems like it ought to be a much easier sell for projects that are very invested on the Python side.

[+] balaji1|2 years ago|reply

I can understand why Rust feels like jumping thru hoops. In a way, working with Rust feels like working with Ruby/RoR. Are there similarities? I understand Rust is just as powerful and way safer and faster than Ruby.

[+] John23832|2 years ago|reply

I never understood why people say that rust isn’t appropriate for web dev. I think Axum is slick.

Typically the people I hear saying that are super Spring indoctrinated.

[+] progrus|2 years ago|reply

I absolutely love the slog of writing Rust… but I would not inflict it on my coworkers for most business-type projects.

[+] echelon|2 years ago|reply

> writing Rust is a slog where you have to jump through hoops and spend a lot more time thinking about programming language abstractions than thinking about the problem you're solving.

It's like this until it isn't. Eventually this falls completely away - I promise.

It's a lot like being a 100% backend engineer trying to develop in React for the first time. You just have to get used to it.

Rust is not difficult, it just takes some soaking in.

[+] alphanullmeric|2 years ago|reply

here comes the slew of rust people claiming you’re using it wrong and that anything another language does easier or more concisely is an antipattern

[+] brrrrrm|2 years ago|reply

I've come to the personal conclusion that the issue with compiled languages and machine learning is that the API surface is so big and the operations themselves are very self-contained.

I find that I am not typically writing complex "language-level" logic that would be expressible in any type of language and would benefit from the performance of compilation. I am guessing APIs and quickly testing ideas in totally unrelated colab notebooks. My productivity goes up when I can quickly answer questions like "how does this transpose impact the shape of the output" or "how do I find the mean of this tensor along only these dimensions."

Compiled languages, and especially Rust, really require you to go in with a plan. I find most of the machine learning code I write is inspired by rapidly playing with ideas.

[+] microtonal|2 years ago|reply

I am so happy about them releasing this. A few years ago I wrote a multi-task syntax annotator in Rust using Laurent Mazare's excellent Torch binding (tch-rs, it looks like he is also working on Candle):

https://github.com/tensordot/syntaxdot

However, the deployment story was always quite difficult. The PyTorch C++ API is not stable, so a particular version of tch-rs will only work with a particular PyTorch version. So, anyone wanting to use SyntaxDot always had to get exactly the right version of libtorch (and set some environment variables) to build the project.

The idea of making an abstraction over Torch and Rust ndarray (similar to Burn) crossed my mind several times, but there is only so much that I could do as a solo developer. So Candle would be a godsend if I was still working on this project.

Seeing Candle makes me want to port curated-transformers to Candle for fun:

https://github.com/explosion/curated-transformers

[+] tomrod|2 years ago|reply

Oh man. I feel like this is a really big deal. I've been wondering what might move data scientists to Rust. Outside working through rustlings and trying new projects in Rust, I wonder if having this available will be it.

[+] polyrand|2 years ago|reply

Interesting. The repo was owned by Laurent Mazare[0] a few days ago, I'm wondering if HuggingFace just hired them to work on candle full time?

[0]: Also maintainer of https://github.com/LaurentMazare/tch-rs

[+] ianpurton|2 years ago|reply

I think the first Rust ML framework to gain massive traction will be the one that supports quantisation out of the box.

At the moment there's no Rust ML library that does what llama.cpp or ggml can do.

[+] srush|2 years ago|reply

Nowhere near as neat as candle or ggml, but just released a 4-bit rust llama2 implementation with simd. Runs pretty fast.

https://github.com/srush/llama2.rs/

[+] FL33TW00D|2 years ago|reply

They already support quantization out of the box!

https://github.com/huggingface/candle/pull/314/files

[+] GaggiX|2 years ago|reply

I would love to see a single binary application with bleeding edge neural networks in it without all the complexity and bloat added when using Python in distributed software programs.

I guess you could have achieved this before, but this library should help a lot.

[+] pc2g4d|2 years ago|reply

What's the multithreading story? tch-rs's (and libtorch's) lack of real threading support was causing problems for my async code - at least, I haven't found a way to work around it yet, given my level of rust knowledge.

Tangentially: I keep thinking an ML framework could be done based on Rust macros, so the automatic differentiation is done entirely at compile time. `let grad = autodiff!(x + y);` (The `autodiff` crate does this at runtime IIUC). Then the compiler's optimizations can apply to the static computation graph.

I always thought that was the vision for TF in the early days, but clearly it moved in a more dynamic direction under PyTorch's influence.
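
The macro-based, compile-time version described above is not sketched here. As a point of comparison, this is a different and simpler technique: forward-mode automatic differentiation with dual numbers and operator overloading, evaluated at run time. `Dual` and `f` are made-up names for illustration; the code uses only the standard library.

```rust
use std::ops::{Add, Mul};

// A dual number carries a value together with its derivative.
#[derive(Debug, Clone, Copy, PartialEq)]
struct Dual {
    value: f64,
    deriv: f64,
}

impl Dual {
    // The variable we differentiate with respect to: dx/dx = 1.
    fn variable(value: f64) -> Self {
        Dual { value, deriv: 1.0 }
    }

    // A constant: its derivative is 0.
    fn constant(value: f64) -> Self {
        Dual { value, deriv: 0.0 }
    }
}

impl Add for Dual {
    type Output = Dual;

    // Sum rule: (u + v)' = u' + v'
    fn add(self, rhs: Dual) -> Dual {
        Dual {
            value: self.value + rhs.value,
            deriv: self.deriv + rhs.deriv,
        }
    }
}

impl Mul for Dual {
    type Output = Dual;

    // Product rule: (u * v)' = u' * v + u * v'
    fn mul(self, rhs: Dual) -> Dual {
        Dual {
            value: self.value * rhs.value,
            deriv: self.deriv * rhs.value + self.value * rhs.deriv,
        }
    }
}

// Example function f(x) = x^2 + 3x, written with ordinary operators.
fn f(x: Dual) -> Dual {
    x * x + Dual::constant(3.0) * x
}
```

Evaluating `f(Dual::variable(2.0))` gives the value 10 and the derivative 7 in one pass. Because everything is a plain `Copy` struct with small operator impls, the compiler is free to inline the arithmetic, though that is still not the same as building a static computation graph at compile time.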

[+] sva_|2 years ago|reply

"Torch replacement"? Title should be "Candle: Minimalist ML framework for Rust" if anything.

[+] kvark|2 years ago|reply

Exciting tech, looking forward to seeing it running on wgpu! There is an issue about WebGPU, although I don’t notice references to the native version of it.

https://github.com/huggingface/candle/issues/344

[+] antimora|2 years ago|reply

Burn (a deep learning framework in Rust) already has a WGPU (WebGPU) backend. Check it out: https://github.com/burn-rs/burn. It was released recently.

[+] FL33TW00D|2 years ago|reply

ML on WebGPU is coming, don't worry!

[+] milliams|2 years ago|reply

I have been occasionally checking https://www.arewelearningyet.com for where these problems are being solved. It doesn't look like Candle is listed there yet.

[+] nuccy|2 years ago|reply

A logically connected name for a Torch-like project in Rust could have been Thermite Torch. You can make a thermite torch given that rust is iron oxide :)

[+] rdedev|2 years ago|reply

Love the shout-out to dfdx. Haven't tried that crate yet, but type-checked matrix shapes sound pretty good.

[+] timokoesters|2 years ago|reply

This looks like a great project!

I have a question: Why is it necessary to specify the device for every Tensor? Wouldn't it be possible to set the device once and then all allocations are made to that device?

[+] jiangplus|2 years ago|reply

Anyone remember that Torch was first glued with Lua before PyTorch?

[+] GaggiX|2 years ago|reply

Hopefully the library will eventually encode tensor shapes in the type system; that would be great for preventing many errors.

[+] skdotdan|2 years ago|reply

Looks great! Rust is the wrong language for developing ML models, but it has a clear potential for being the best one for deploying them.

[+] bastardoperator|2 years ago|reply

In the video game Rust, you spawn with a Torch. They also just did a torch replacement in game. Maybe it's time for a break...

[+] delijati|2 years ago|reply

How big are the executables? I would like to move our inference code from TensorFlow to an executable.

[+] SushiHippie|2 years ago|reply

Could this also run quantized models like llama.cpp?