I rewrote a Clojure tool in Rust

169 points| praveenperera | 5 years ago |timofreiberg.github.io | reply

76 comments

[+] chrisulloa|5 years ago|reply
While I definitely agree Rust is a much faster language than Clojure, I would be interested to see benchmarks on your code that show just how much faster your Rust code was on the same data.

I also noticed that you mentioned avoiding lazy sequences is not idiomatic in Clojure. I disagree with this since using transducers is still idiomatic. I wonder if you could've noticed some speed improvements moving your filters/maps to transducers. Though I doubt this would get you to Rust speeds anyway, it might just be fast enough.

[+] j-pb|5 years ago|reply
I just moved a medium-sized codebase from Clojure transducers to JS, and after having used Clojure for 7+ years, and done so professionally, I don't wanna go back, ever. The JS solution is shorter, faster, and easier to understand. I'm thankful for the insights into reality and programming Clojure has provided, but highly optimised Clojure is neither idiomatic nor pretty; you end up with eductions everywhere. Combine that with really bad debuggability from all those nested, inside-out transducer calls (the stack traces have also gotten worse over the years, I don't know why) and a splintered ecosystem (lein, boot, clj-tools), and I'd pick Rust and Deno/JS any day for a greenfield project over Clojure. Sadly.
[+] bjoli|5 years ago|reply
I doubt it will bring much. If properly implemented, there is nothing that makes generator-like laziness slower than transducers, and since it is pretty central to clojure I doubt you will see much speed gain by using transducers.

In scheme, the srfi-158 based generators are slower than my own transducer SRFI (srfi-171) only in schemes where set! incurs a boxing penalty and where the eagerness of transducers means mutable state can be avoided.

Now, I know very little clojure, but I doubt they would leave such a relatively trivial optimization on the table. A step in a transducer is just a procedure call, which is the same for trivial generator-based laziness.

[+] dgb23|5 years ago|reply
Transducers are definitely idiomatic. They are more general over "similar things to transform in steps" (including sequences, messages and so on), so you can apply them to collections ("I have the whole data in advance") or channels ("I get the data piece by piece") and so on.
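
A loose Rust analogue of that "define the steps once, apply them to any source" idea, as a sketch only: Rust iterator adapters are not transducers, and the names `even_squares`, `from_collection` and `from_channel` are made up for illustration. The same pipeline is applied first to a collection that is fully available up front, then to items arriving over a channel.

```rust
use std::sync::mpsc;

/// A reusable pipeline: keep even numbers, then square them.
/// It is written against any iterator, so it does not care where items come from.
fn even_squares<I>(source: I) -> impl Iterator<Item = u64>
where
    I: Iterator<Item = u64>,
{
    source.filter(|n| n % 2 == 0).map(|n| n * n)
}

/// Applies the pipeline to data that is fully available in advance.
fn from_collection(data: Vec<u64>) -> Vec<u64> {
    even_squares(data.into_iter()).collect()
}

/// Applies the same pipeline to items received one at a time over a channel.
fn from_channel(data: Vec<u64>) -> Vec<u64> {
    let (tx, rx) = mpsc::channel();
    for n in data {
        tx.send(n).expect("receiver is still alive");
    }
    // Dropping the sender closes the channel so the receiving iterator ends.
    drop(tx);
    even_squares(rx.into_iter()).collect()
}
```

Both helpers should give the same result for the same input, since the steps are shared and only the source differs.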

Another idiomatic way to improve performance is to use transients[0]. From the outside your function is still a function, but on the inside it's cheating by updating in place instead of using persistent data structures. See the frequencies function for a simple example[1].
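
To make the "pure on the outside, mutating on the inside" shape concrete, here is a rough sketch in Rust. It is not Clojure's `frequencies` implementation and uses no transients; the name is borrowed and the signature is an assumption for illustration. The mutation is confined to a local map, so callers only ever see a value being returned.

```rust
use std::collections::HashMap;
use std::hash::Hash;

/// Counts how often each item occurs in `items`.
/// Illustrative sketch: the only mutation happens on the local `counts` map,
/// so from the caller's side this behaves like an ordinary pure function.
fn frequencies<T: Eq + Hash + Clone>(items: &[T]) -> HashMap<T, usize> {
    let mut counts = HashMap::new();
    for item in items {
        // In-place update of the local map; nothing outside can observe it.
        *counts.entry(item.clone()).or_insert(0) += 1;
    }
    counts
}
```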

Clojure and Rust are both very expressive languages and even though they both can be considered niche, they have _massive_ reach: Clojure taps into the JVM and JS ecosystems, Rust can also compile to WASM or be integrated with the JVM via JNI.

The big difference between the two, and why I think they complement each other nicely, is that Clojure is optimized for development, and does its best at runtime, but Rust is optimized for runtime, and tries its best at development. (The article makes a similar point.) In other words: they both achieve their secondary goal well, but resolve trade-offs by adhering to their primary goal in the vast majority of cases.

[0] https://clojure.org/reference/transients

[1] https://github.com/clojure/clojure/blob/clojure-1.10.1/src/c...

[+] systems|5 years ago|reply
I like how in the end the system was replaced by a database solution.

Every language needs easy access to a queryable database; many problems are a lot simpler when solved declaratively against a queryable database.

The relational model is functional, and is a very good solution to a wide range of problems.

I think the SQLite engine should be integrated in the standard library of every language, and either use SQL, or the language can provide a native SQL alternative in the original language itself, or we can create a new standard language (because yes, SQL can be improved upon).

I think Chris Date's D language could be a place to start investigating an SQL alternative, or a language that can be more easily emulated in other languages.

[+] wwweston|5 years ago|reply
What would you say the advantages of Date's D over SQL are?
[+] didibus|5 years ago|reply
A bit sad there was no profiling done, or at least the article doesn't mention it. Maybe optimizing the Clojure wouldn't have been that hard; it could have been that only a few places needed tweaking. In any case, Rust is obviously targeting high performance in a way Clojure isn't. Rust is faster than Java, and Clojure can only ever match Java in performance, not exceed it. But still, it's not clear whether the author tried to optimize the Clojure version or not.
[+] fulafel|5 years ago|reply
> Rust is faster than Java, and Clojure can only ever match Java in performance, not exceed it.

The Rust vs Java question translates to the age-old C++ vs Java argument, where the counterpoint is that Java can be faster because the JVM has no significant disadvantage in code generation, but JIT and GC can be faster than AOT and malloc. Then there are many back-and-forth arguments and nobody changes their mind.

In another sense, ease of use and HLL properties of languages can in practice give performance advantages. Given the same amount of time, the programmer of a more expressive high-level language might have more time to iterate and to do algorithm work that ends up having much bigger effects than the relatively small differences in compiler code generation.

(The word performance of course also has meanings other than code execution speed...)

[+] dgb23|5 years ago|reply
> Rust is faster than ...

Not if you use the wrong constructs, copy stuff around in the heap, or use ref counting everywhere in longer-running processes.

I'm not nitpicking here: in Rust you can get really fast, but it's on you to make that happen.

For example, persistent data structures (used in Clojure) are really fast, and for some operations and cases even close to optimal.

Performance is hard, and I very much agree with your question here. What has been measured, and what are the results?

[+] burnthrow|5 years ago|reply
> Clojure can only ever match Java in performance

You're technically correct, but the typical Java program making heavy use of threads has inefficiencies (and incorrectness) that would be avoided with Clojure's higher-level async APIs. In the same way, it's easier to write idiomatic, performant C than the "faster" ASM.

[+] raspasov|5 years ago|reply
Not very clear what the diff tool is attempting to do.

Just looking at the Clojure code, I feel there are better approaches in Clojure to achieve the same or better results.

More clarity would be helpful.

Also, transducers are a big performance win for long sequences of values.

You cannot even begin to guess where your performance problems are until you use something like YourKit (https://www.yourkit.com) which is an excellent tool. With very little effort and a few type hints you can sometimes more than double your Clojure performance.

[+] ithrow|5 years ago|reply
The article makes a good case for how Rust can look as clean as other high-level programming languages when writing high-level business logic.
[+] dgb23|5 years ago|reply
I agree that Rust has some really great concepts. The example code makes heavy use of traits for example, which are very ergonomic and provide a dynamic feel.

The context is a rewrite AKA runtime optimization. So the result is already understood. A great use-case for Rust is top-down implementation.

Also the code doesn't show any of the more painful cases. From the article:

> There are some inefficiencies visible here, and they're probably the most important spots for performance improvements. But they're still there as fixing them was too hard and/or time-consuming for me.

Resolving these "inefficiencies" is where Rust really shines, because it _can_ resolve them and does so in an internally consistent way on top of that. But at the same time, this is where you really _slow down_ in development and need to think about the more complex and intricate concepts such as lifetimes and borrow semantics.

[+] brabel|5 years ago|reply
As someone who has been writing a lot of Rust, I think that's only apparent... there are lots of things you can't do in Rust that you can in a GC-based language. The restrictions Rust imposes on you (mostly from the borrow checker, especially when you have mutable values) make it much harder to write code... if you make it easier by cloning happily, you might end up with worse performance than in the GC-based language. If you don't, I guarantee your code won't look pretty with all those lifetime annotations.
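
A tiny sketch of that trade-off, using two hypothetical helpers (`longer_owned` and `longer_borrowed` are not from the article): the cloning version has the simpler signature but allocates on every call, while the borrowing version avoids the allocation at the cost of an explicit lifetime.

```rust
/// Clone-happy version: easy to write, but allocates a new String on every call.
fn longer_owned(a: &str, b: &str) -> String {
    if a.len() >= b.len() {
        a.to_owned()
    } else {
        b.to_owned()
    }
}

/// Borrowing version: no allocation, but the signature needs an explicit
/// lifetime that ties the returned reference to both inputs.
fn longer_borrowed<'a>(a: &'a str, b: &'a str) -> &'a str {
    if a.len() >= b.len() {
        a
    } else {
        b
    }
}
```

With one function the annotation is harmless; the complaint above is about what happens once such lifetimes have to thread through structs and several layers of calls.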
[+] omn1|5 years ago|reply
Great post! I like articles where the author has practical experience in two languages and compares them. Helps me make better decisions for future projects.
[+] burnthrow|5 years ago|reply
> After switching to Rust, I had to implement more complicated logic that resolved dependencies between the rules I was diffing. This became complicated enough that even with a static type system I could only barely understand it. I doubt I would have been able to finish it in Clojure.

I have no idea what the author's trying to say here.

[+] mattrepl|5 years ago|reply
I believe they are saying the static type system and type checker provided by Rust helped them write complicated code.
[+] SomeHacker44|5 years ago|reply
I personally was quite astonished (not in a good way) by the use of #_"comments" instead of ;[;[;[;]]] comments.
[+] tw25497050|5 years ago|reply
Sometimes the architecture/algorithm matters, and sometimes the architecture/algorithm needs to align with the language. Absent seeing the broader code base [1], I'm inclined to think that the author's larger design led to these expensive functions existing as they did [2].

Pure speculation on my part, but if one has a lot of experience with imperative, mutable languages, one might design a system that ends up being not so great when written in a functional, immutable language. If so, then seeing improvements when directly porting to an imperative, mutable language might be not so surprising.

Tangent: Regarding the power and importance of code structure, I highly recommend watching "Solving Problems the Clojure Way" by Rafal Dittwald at Clojure North 2019 [3].

[1] I didn't see a link, but if it's available, I'd love to take a look.

[2] The `rule-field-diff` function, for example, seems to be burdened with some odd choices, e.g., taking in two "rules" as arguments (which seem to be collections of rules keyed by field), then using two hard-coded "operations" (also keyed by field), and yielding a map whose values are sequences by field (I think). Off the top of my head I don't see why this fn needs to work across multiple fields in the first place (i.e., any field-specific "loop" should be in a surrounding context). Ditto for `diff-rules-by-keys`.

[3] https://youtu.be/vK1DazRK_a0?t=461

[+] blunte|5 years ago|reply
That Rafal Dittwald video is excellent. It gives a small but illustrative comparison of procedural, OOP, and finally functional style... and it uses JavaScript (thereby making it accessible to non-lispers).

Most developers should watch it.

[+] Narishma|5 years ago|reply
The code snippets in the article are unreadable due to low contrast.
[+] kimi|5 years ago|reply
Not sure about Rust, but on the Clojure side, the code does not seem pretty.
[+] qz2|5 years ago|reply
This is the most HN article I have ever seen.
[+] kristoft|5 years ago|reply
Nice post, very well written! Just not sure why you'd compare a non-GC, low-level, statically typed language with a VM-based, GC'd, dynamic language.
[+] nindalf|5 years ago|reply
It’s a fair comparison because many developers, including the author, have to choose between languages, some of which are garbage-collected while others are not.