C/C++ vs. Rust Performance

[+] loeg|5 years ago|reply

Fact-check:

> FreeBSD has been supporting C++ modules for a while.

No, we don't. The link provided is to a user forum where someone asks about it, and the first response is "I don't believe that this is a good idea." I don't know why the author would just make up this false statement, but IMO it really hurts their credibility.

[+] dllthomas|5 years ago|reply

The first response starts "I don't believe that this is a good idea", then makes some suggestions as to how to make it work anyway. There is plenty of room, at this point in consideration, for the thread to have reached a conclusion very different than that implied by the first sentence of the first comment.

I read the whole thing. It doesn't.

[+] krizhanovsky|5 years ago|reply

But did you? My FreeBSD knowledge is quite outdated, I believe the last version I worked with is 7, but I believe at some point, it was possible to compile C++ kernel modules for FreeBSD. This time I can not find the proof, but plus to that post I also found links like

https://lists.freebsd.org/pipermail/freebsd-hackers/2009-Feb... https://lists.freebsd.org/pipermail/freebsd-hackers/2006-Jul...

The last one mentions: "There's actually a fair amount of experience with people doing C++ in FreeBSD kernels. People have been doing things with it for about 8 years now."

[+] simias|5 years ago|reply

>The Rust's generics and macros are much weaker than provided by C++ templates coupled with C macros. Although, this is also not so crucial.

That's a very strange way to put it IMO, I'm not sure what is meant by "weaker" here. They are stricter and as such can require more work to use effectively, but weaker is a weird way to put it.

[+] Karliss|5 years ago|reply

C++ template provides features that rust generics don't. I am not a Rust developer so correct me of some of these have been recently added. Rust generics being stricter would be something like them allowing to use only methods provided by trait, but I would call all of following design limitations making rust generics weaker.

* passing integer as template argument - from what I understand some of rust standard library functions can't handle arrays with more than 64 or 32 elements due to this or something similar. And there is active work for adding const generics to rust.

* variadic templates - there is a RFC for this

* passing template as template argument

* template specialization - C++ is somewhat moving away from this

[+] steveklabnik|5 years ago|reply

I think that the way they mean is that there are less restrictions, which means you can do more with them, which makes them more powerful, in a sense.

Words are hard.

[+] wtetzner|5 years ago|reply

I would say Rust macros are considerably stronger, given it supports procedural macros.

[+] ashtonkem|5 years ago|reply

This is part of why I think we should begin to move away from “weaker” and “stronger” as adjectives for programming; these terms add a value judgement that is not properly reflective of the more nuanced trade offs involved.

[+] ordu|5 years ago|reply

> A real high-level system programming language must be compatible with C.

I cannot agree with this. After I learned rust and tried to mix it with C code I believe that Stroustrup made a mistake by making C++ compatible with C on a syntax level. Strictly defined FFI boundary is a good thing. It liberates. It allow to track precisely what boundary APIs are, it makes boundary APIs the separate documented thing.

> An opposite example to use Rust for an Nginx module is CloudFlare's Quiche, an Nginx extension to support QUIC and HTTP/3 protocols. While it's definitely possible to use Rust for such kind of tasks, the guys, besides the FFI code for C/C++ bindings, still had to write some C code to patch Nginx.

I took a look at the patch, and it is not clear to me, that the patch was needed because rust wouldn't allow to do it the other way. It seems that cloudflare split their code into rust and C for some other reason, because there is a LOT of C code, it doesn't seem to me as a glue code at all. I believe it does more than just glue rust module and nginx. I'm not sure what the reason, but I could guess that the idea was to add possibility to use different implementations of QUIC and HTTP/3 in nginx, not just this particular rust implementation. So they extended nginx for this, and then implemented rust module.

[+] steveklabnik|5 years ago|reply

This was pretty heavily discussed on /r/programming a few days ago https://www.reddit.com/r/programming/comments/jjtobp/fast_pr...

[+] jeff-davis|5 years ago|reply

That thread brings up per-container memory allocators, and mentions Vec. What's going on there?

I have been trying to find a good way to deal with hybrid memory/disk structures. One example is external sort, but there are a whole class of such structures used for databases (Hash Join, Hash Aggregation, etc.). The idea is to constrain the memory footprint of a structure and efficiently use the disk.

One challenge is simlly knowing how much memory is being used. Another is supporting multiple heaps/arenas/whatever and allocating in the right one. And a third is safely clearing parts of the structure that have been written to disk for later processing (I guess calling the destructors if necessary?).

Is there any work in this area?

[+] AlchemistCamp|5 years ago|reply

Thank you!

[+] vvanders|5 years ago|reply

I've written my share of systems level C++ in performance constrained environments professionally (gamedev, embedded systems).

Never touched goto, alloca and branching usually stayed out of hot code through DOD approaches. Not really sure I agree with the conclusion.

Alternatively they didn't talk at all about aliasing/restrict which is one really neat area of Rust because &mut guarantees one of the key aliasing constraints.

[+] creata|5 years ago|reply

Isn't it still unable to actually make those aliasing optimizations because LLVM doesn't compile them correctly?

https://github.com/rust-lang/rust/issues/54878

[+] klyrs|5 years ago|reply

I've used goto a handful of times over my 20 year professional history with c and c++. A few were used for simplifying cleanup code (actually this is quite common in c, less so in c++); a few were used in weird state machines and were accompanied by a page of documentation.

[+] speedgoose|5 years ago|reply

> There are a lot of poor programs misusing goto, so they just removed the operator: good for juniors, but too limited for professionals.

It's a bit condescending. I think the article would have been better without that kind of comment.

[+] FreeFull|5 years ago|reply

I also suspect that goto is missing not because of potential misuse, but because it's actually difficult to implement given all of Rust's other semantics.

[+] tornato7|5 years ago|reply

Especially because a "professional" programmer should be able to get along just fine without goto. Goto is bad practice in many languages where it does exist, too, like Go.

[+] krizhanovsky|5 years ago|reply

I didn't mean anything condescending here actually. The point is that when people starting programming, it's good to put them into a restricted environment, so that they get used to use the right tools and in the right way. For example, it's quite easy to misuse goto in normal programs and make the programs unreadable. However, if a developer is already perfect with writing good code, but (s)he needs to do something special, like the state machine in the article, the restriction just makes the developer's life harder.

[+] dehrmann|5 years ago|reply

The only times I've really wanted goto were around exception handling. Java (not sure if it started there or was borrowed) actually has a construct that can do this without the issues with goto:

    block: {
        ...
        if (err){ break block; }
        ...
        if (err){ break block; }
        ....
    }

[+] virgilp|5 years ago|reply

Yeah, I caught that too; it's more than a bit condescending and is also completely false. Complexity makes all people trip up, not just "juniors".

[+] identity0|5 years ago|reply

> Secondly, the killing feature of C++ is that it is C. If you don't want to use exceptions or RTTI, then you can just switch the features off. Most of C programs can be just compiled with a C++ compiler with very small changes or without any changes at all.

If your C code happens to be compileable with a C++ compiler, then either your program is very small or you are writing really shitty C.

[+] MaxBarraclough|5 years ago|reply

Why's that? C has a few features that C++ lacks, but it's possible to get by fine without using any of them.

The Lua interpreter is written in a subset of C so that it also compiles fine as C++. Or, if you prefer, in a subset of C++ so that it compiles fine as C.

https://www.lua.org/pil/24.1.html

[+] dathinab|5 years ago|reply

Idk about "shitty C" but indeed C++ being a super set of C is a common misconception, followed by the misconception that it's nearly a superset of C. ;=)

Simplest example is that a `void func()` decl in C and C++ have different semantic meanings.

[+] attractivechaos|5 years ago|reply

> If your C code happens to be compileable with a C++ compiler, then either your program is very small or you are writing really shitty C.

It is easy to write C code that is compatible with C++ if you have compatibility in mind from the beginning. Most of time: 1) cast malloc() – you can define macros for less code; 2) avoid C++ keywords; 3) don't use VLA which is not recommended anyway; 4) add __STDC_LIMIT_MACROS for uint64_t etc. Use -Wc++-compat also helps. It is harder to modify existing C code for the compatibility with C++.

[+] krizhanovsky|5 years ago|reply

Well, we didn't mention this in the article, but this is very practical actually. About 10 years ago we had a request significant reworking an open source DNS server and the main problem with the project was that almost whole logic was placed in about 10 functions and most of the functions exceeded 1000 LoC. Each code update involved plenty of headache with dynamic memory freeing. We spent about 1-2 days to compile it with C++ and replace the most crucial pointers with smart pointers, that we didn't have to care about memory freeing. That time we had to make the job done ASAP, surely we should have spend time in proper code refactoring.

Surely you need to modify C code to compile it with C++ and in the article we made an example of the modification. But the point is that the changes are pretty straightforward and small.

[+] pengaru|5 years ago|reply

> If your C code happens to be compileable with a C++ compiler, then either your program is very small or you are writing really shitty C.

And yet the standard practice for games written in C attempting to use Valve's C++ API for Steamworks integration is to switch to using a C++ compiler.

[+] GirkovArpa|5 years ago|reply

I pasted the contents of this(1) ~100 line C file into this(2) C++ file with no problems.

[1] https://github.com/nospaceships/raw-socket-sniffer/blob/mast...

[2] https://github.com/GirkovArpa/raw-socket-sniffer/blob/master...

[+] simias|5 years ago|reply

I see what you mean but I think the point the article is making here is true: you can effectively strip most of the features of C++ and end up with something with a footprint and runtime cost almost identical to C. You can effectively port almost any C program to C++ by making purely syntactical changes that won't have any impact on the execution.

[+] ncmncm|5 years ago|reply

Almost everything the article says about the prospects for using C++ in an OS kernel is wrong.

It is hard to understand how someone could know so many basic facts about the language and still be so thoroughly confused.

None of the reasons cited for not using C++ in kernel code is valid. None of the things claimed to be impossible are.

Example: RTTI. Really, RTTI is hardly ever used in good code. (I used in once in 10 years.) Despite its low use-value, nothing interferes with using it in a kernel.

Name mangling is absolutely no problem; neither would using `extern "C"`, if you wanted to.

Operator new is wholly compatible with kernel allocators.

The Standard Library does have read-write locks, although it is usually foolish to use such a lock in real code. It is trivial to wrap any synchronization primitive you like with a zero-overhead abstraction. Kernels tend to have their own anyway.

Exceptions in kernel threads would work identically the same as in user-level threads.

Static ctor sections could be run by kernel startup code as easily as they are run in regular programs. (Probably one would not bother running static dtor sections.)

I could go on and on, but enough.

This article joins others packed to the brim with out-and-out falsehoods. When you need to rally so many falsehoods to make your case, you end up making the opposite case--except where your audience is easily fooled.

[+] krizhanovsky|5 years ago|reply

Please read the discussion in https://www.reddit.com/r/Cplusplus/comments/jjtn5v/fast_prog... . In short, everything is doable, but before starting your project, which you're paid for, you have to grow quite a large C++ infrastructure, which double the native Linux kernel API in many ways.

> Operator new is wholly compatible with kernel allocators.

Please read the referenced thread. The things aren't so simple.

[+] andy_threos_io|5 years ago|reply

The article brings up some good points, but there are several misconception. You can't compare normal user program execution with operating system kernel code execution. Total different beasts. There are reasons, that most operating system kernels are written in C and assembly. And also there is a reason why kernel code seldom use any FPU or SIMD instructions beside saving and restoring the FPU and SIMD context on context switch for user threads. ( The size of the SIMD registers in byte x86_64 SSE (256) x86_64 AVX (512) x86_64 AVX-512 (2048))

Ex. You don't want your interrupt server code use any FPU/SIMD instructions, as you don't want to save and restore the register file for the FPU/SIMD registers for the kernel code. It's just takes to much time. And there are other architectural execution penalty on some CPUs when using large SIMD instruction codes.

We measured simple operating system functions, like page clear with SIMD, and it's just does not worth it even if we only use SIMD instructions in that function and save/restore only the necessary SIMD registers.

Also heavy inter operation between low level assembly and higher level system code ( C ) are essential in kernel. You have to be able to handle the same structures in assembly and C without any misalignment and misaddressing. ( we use special macros to write down kernel structures, that are used in both assembly and C, all of the assembly files are C preprocessed)

And there are other several issues with OS kernel codes (handling special registers, changing address spaces (virtual), invalidating/flushing caches (specially on ARM), managing any kind of exception (hard and soft, page fault, invalid instructions etc. ), real-time handling, kernel context handling if any, etc.).

[+] timClicks|5 years ago|reply

Rust would be compelling even if it didn't offer comparable performance to C. Its safety guarantees mean that much easier for "mediocre" programmers to feel like they can contribute meaningfully to large codebases without introducing massive memory leaks/unsound behavior/...

[+] pizza234|5 years ago|reply

> It's also worth mentioning that the memory garbage collection (GC) in Java leads to high tail latencies and it's hard or even impossible to do anything with the problem

This is amusing, because a few days ago, there was an article¹ reporting that the C4 garbage collector solved this problem.

¹=https://news.ycombinator.com/item?id=24895395

[+] ncmncm|5 years ago|reply

People can report anything.

Solutions to pervasive, debilitating GC problems are as perennial as the Spring rain. Every year, we get a new crop, but the problems are still there the year after.

[+] vips7L|5 years ago|reply

Hopefully ZGC will solve this in HotSpot so no one has to rely on a custom jvm.

[+] zamalek|5 years ago|reply

> Rust doesn't provide stack allocations at all.

What? Rust allocates on the stack by default.

[+] virgilp|5 years ago|reply

> Thus, in this single case when a Rust implementation is more or less faster than C, the performance difference is not about better compiler, but about more efficient structure of the program, which allows a compiler to optimize code better.

This is strange takeaway.... I would say that's almost the only thing that matters. They compared one Rust compiler to 3 C++ compilers and picked the best result? Who does that in practice - who compiles their codebase with 3 different compilers and picks the most efficient one for each object file? Also, the compiler can often improve, where as the language itself is much more difficult to improve - the fact that the language lets you to write a more efficient (and more readable!) structure is crucial.

Also, they seem to misunderstand the point of Rust entirely. The main point of Rust is "safety" (w/o sacrificing performance, yes - but safety was the primary design goal). And for a good reason! Systems programming is more about safe systems than it is about fast systems - fast buggy system programs are useless. The authors decry the loss of goto saying it's "good for juniors but too limiting for professionals" - as if professionals aren't humans too! I'm sorry to say that, but whenever I saw that attitude before - it was with programmers that greatly overestimated their skill level.

[+] jrimbault|5 years ago|reply

Strong upvote for :

> The authors decry the loss of goto saying it's "good for juniors but too limiting for professionals" - as if professionals aren't humans too! I'm sorry to say that, but whenever I saw that attitude before - it was with programmers that greatly overestimated their skill level.

The "I know how to use goto [replace with any feature] everyone else is just stupid" reasoning upsets me so much.

[+] krizhanovsky|5 years ago|reply

While I addressed safety in a separate section in the article, I wouldn't argue about that: it seems Rust designers made the perfect work in safety. However, C++ is moving in this directly, bu there is the "gap" as it was described in the cited talk from the CppCon 2020.

However, there article is about FAST programming languages. Which means, and I stated this explicitly at the beginning of the article, that the main factor for the article is the speed of the generated code.

This is why I compared the single Rust implementation with 3 C/C++ implementations. The question was: whether Rust does something unreachable for C/C++? And the answer is "NOT". Also please keep in mind that all the benchmark programs, using the same algorithms, are still coded in bit different ways. And the differences impact performance significantly. I analyzed two programs, in Rust and C, to show the differences.

[+] lr1970|5 years ago|reply

Too bad they did not try Nim [1]. IMHO, Nim is better than Rust for C/C++ expats.

[1] https://nim-lang.org/

[+] ezekiel68|5 years ago|reply

Thinly veiled slashvertizement by software services company stirs up conflict among programmer communities to raise awareness.

News at 11.

[+] matklad|5 years ago|reply

This is an exceptionally great article, which indeed discusses strong Rust limitations.

I have one question: the parser in the article uses computed goto, which, as far as I know, is not standard C. Are there many cases where even the standard goto yields significantly better performance?

[+] Subsentient|5 years ago|reply

Just occurred to me, since Rust has LLVM IR asm directives, can't you write a portable goto in an inline function that jumps directly to pointers? Publish that crate to crates.io? Problem solved.

[+] legulere|5 years ago|reply

Exception epilogues also make optimization harder/impossible for compilers if I remember correctly. Rust suffers from this too though because panics use the same mechanism.

[+] ph2082|5 years ago|reply

> Not only Rust is immature, but it seems the language designers intentionally limited the language.

I am learning rust but what criteria any language must meet to call itself mature?

[+] dhanna|5 years ago|reply

> Golang can not be considered for high performance programming also due to the GC.

People repeat this like Go isn’t designed for use in resource constrained environments.

[+] bfrog|5 years ago|reply

I don't see Go running cortex-m level devices, atmega level devices. So no, it's really not.

[+] kupopuffs|5 years ago|reply

Performance requirements have a huuuuuuuuuge range and Go can't really be considered for extreme cases. Like the latest, hottest, prettiest video gamez in VR HDR 8K resolution

[+] Daishiman|5 years ago|reply

"resource constrained" is a very subjective interpretation. OP is talking about SIMD intrinsics and compiler micro optimizations. Go is definitely out.

150 comments