top | item 45389744

Why we develop EloqDB mainly in C++

58 points| the_precipitate | 5 months ago |eloqdata.com

90 comments

f311a|5 months ago

One thing I like about Rust is that it prevents you from doing stupid things at the compiler level. I write a little bit of C/C++ and Rust. If you don't do C++ on a daily basis, you will silently introduce problems in the code that are very hard to spot. You just need to have a very good mental model of how to write good C++. It requires constant practice.

For Rust, you just have to fight the compiler. This is especially useful when you have people on your team with some experience who also want to contribute, but you don't want to constantly point them in the right direction.

I actually have no idea how big teams work on large C++ codebases. Usually, you need to have a good idea of how the whole thing works. You can change one part of the code, and it will introduce bugs in the whole project because of how the memory is handled. Isolated changes are hard. And historically, a lot of C++ codebases lack good test coverage.

dataflow|5 months ago

> If you don't do C++ on a daily basis, you will silently introduce problems

Even if you do, you still will. Just less often.

> I actually have no idea how big teams work on large C++ codebases... You can change one part of the code, and it will introduce bugs in the whole project because of how the memory is handled

Part of it is lots of tests, sanitizers, assertions, etc.

Part of it is keeping things modular and avoiding spooky action at a distance to the extent possible.

Part of it is unavoidable, and that's why people are moving to safer languages.

jmull|5 months ago

> I actually have no idea how big teams work on large C++ codebases.

They choose a memory management strategy and stick to it. Of course, the problem, relative to something like rust, is the compiler doesn't enforce it. You can use linting tools and/or reviews.

> Usually, you need to have a good idea of how the whole thing works. You can change one part of the code, and it will introduce bugs in the whole project

That's not a problem with C++ specifically. That's a problem with organization. It's probably best known as the "Big Ball of Mud" architecture[1]. Rust has no particular defense against it, nor do other languages that I am familiar with. If you don't see it as much with rust it's only because it takes time to develop. (counter-intuitively, it's an impressively successful architecture -- so many long-lived projects use it).

[1] http://www.laputan.org/mud/

wavemode|5 months ago

> I actually have no idea how big teams work on large C++ codebases

Well yeah, you don't. Most people who comment on these sorts of threads don't, which obviously colors their bias in favor of the solution they do understand.

on_the_train|5 months ago

> You can change one part of the code, and it will introduce bugs in the whole project because of how the memory is handled

Why would it do that?

cmrdporcupine|5 months ago

Honestly after 6 months to a year of constant Rust development you don't even fight with the compiler anymore. Instead it's mostly just your friend.

There are still logical holes in the borrow checker, but they're mostly irrelevant.

pie_flavor|5 months ago

This lists all the reasons to use Rust, then handwaves about some nonsense, then declares victory for C++. The maturity argument made a lot more sense when Rust was 3 years old, rather than 10; the libraries argument is plain silly because library management is horrible in C++ and all the listed libraries are by comparison essentially one-click to use in Rust; the entire article also seems AI-generated.

cmrdporcupine|5 months ago

From my experience no language bitrots and degrades worse than a C++ codebase.

The language standard has changed so much, the tooling, trendy libraries and the established conventions... It takes a herculean effort to keep a given source tree up to date.

Dive into a C++ repo started even 10-15 years ago and it can be a revolting experience, let alone one from back in the 90s.

And then from company to company conventions and expectations just vary dramatically.

When I was at Google we had a large committee of very smart people who applied monorepository-wide modernizations, introduced amazing tooling and analysis tools, and imposed a very strict style guide that kept people fairly disciplined. But that was a herculean effort which most other organizations can't afford.

Rust has all sorts of problems (including specific ones for DB internals or OS development). But what's amazing when I read these articles is they don't actually seem to mention those specific problems that I've encountered in my last 3 years of professional Rust work. Instead they read like rationalizations by people who have a certain hammer they've gotten really skilled at using, and don't want to give it up.

That's fine if it keeps your organization productive, but I see no reason to publish about it?

If I were to make a list of gripes about Rust for this kind of work it would primarily emphasize the continued lack of acceptance/conclusion of the allocator-api (or competing) proposals, and the rather chaotic and unprofessional (and potentially insecure) nature of the way Cargo project dependencies explode into a hard-to-reason-about mess.

But the list they make? io_uring, mimalloc, and performance oriented networking are... not problems to use in Rust, not complicated at all. I assume the same (or better) for Zig.

groovy2shoes|5 months ago

> the rather chaotic and unprofessional (and potentially insecure) nature of the way Cargo project dependencies explode into a hard-to-reason-about mess.

this is one of my biggest gripes, too. that alone has been enough to cause me to avoid Rust for projects where it would otherwise be a good fit. you can pull in "one" dependency and find yourself downloading hundreds of gigabytes of zillions of tiny dependencies, sometimes the same one at multiple versions. it's by no means a problem exclusive to Rust, but that's no excuse.

it's been a while, but my other major gripe was the way so many crates would require the nightly. the rust devs have done a good job maintaining backward compatibility between stable releases, but afaik there isn't any guarantee regarding the nightly. keeping up with the nightly is infeasible when each compiler release and all your dependencies need to be vetted by your security team.

i also long found myself disappointed by the lack of a real specification, but that one is relatively minor. less of a frustration.

cjfd|5 months ago

Good for them. I like C++. It is a language that supports both being close to the computer and abstraction. I studied Rust a bit but it seems that their rules exclude some perfectly good software designs. If two classes need to work together as equals, so that class A has a reference to B and class B has a reference to A, this is not really possible. Especially if both A and B have multiple instances that are stored in containers. This is common with the bridge design pattern.
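A minimal sketch of the design described above, written in C++: two types that point at each other, with many instances held in containers. The names `A`, `B`, `Registry`, and `add_pair` are made up for illustration. `std::deque` is an assumption chosen here because appending to it does not invalidate references to existing elements; nothing in the language checks that the peers outlive each other, so that stays the programmer's job.

```cpp
#include <cassert>
#include <deque>

struct B;  // forward declaration so A can point at B

struct A {
    explicit A(int id) : id(id) {}
    int id;
    B* peer = nullptr;  // non-owning pointer to the partner object
};

struct B {
    explicit B(int id) : id(id) {}
    int id;
    A* peer = nullptr;  // non-owning pointer back to the partner
};

// Hypothetical owner of all instances. Copying a Registry would leave the
// copied peers pointing into the original, so treat it as non-copyable in
// real code.
struct Registry {
    std::deque<A> as;
    std::deque<B> bs;

    // Creates one A and one B that refer to each other.
    A& add_pair(int a_id, int b_id) {
        as.emplace_back(a_id);
        bs.emplace_back(b_id);
        A& a = as.back();
        B& b = bs.back();
        a.peer = &b;
        b.peer = &a;
        return a;
    }
};
```

With `std::vector` in place of `std::deque`, the stored pointers could dangle after a reallocation, which is the kind of mistake the borrow checker is designed to reject.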

jeffbee|5 months ago

I don't know if the stuff about the JVM is even true. I grant that Redpanda is written in C++, but it isn't clear that its performance advantages over Kafka are due to that rather than to the fact that Kafka was implemented in a performance-oblivious way by people who did not know anything about software efficiency. This doesn't reflect on the JVM. You can write a high performing system in Java and the modern JDK is a state-of-the-art toolchain that provides features that many C++ projects struggle with.

whizzter|5 months ago

By far the biggest issue for Java is that they still haven't gotten their act together on value-types (project Valhalla). Going back in time, the one thing someone should have told the designers was not to release the erasing generics in Java 1.5 and to go for something like what C# did with value type structs.

Not really for "purity" issues, but rather due to the fact that memory speeds and main memory latency patterns that started to emerge as problems in the early 00s only got worse over time and having the erasing generics kind of cemented the memory access patterns.

The Java team has done some truly amazing things in terms of GC research, but much of it is needed simply because the Java and JVM memory model (while "simple") is very allocation-heavy compared to C# that went for value types very early.

Take a peek at the QuestDB source code (Java) for heavy data-manipulation tasks: to avoid GC costs it's not really written in an idiomatic Java style (strongly reminiscent of the way some people coded for JavaME back in the early 00s). A C# port would not be entirely idiomatic either, but far more so than the existing code.

jurschreuder|5 months ago

I would also choose C++

The language is improving and improving. Some years ago it was way too difficult for the speed gain to be worth it.

But it's become more and more easy to write. Many of the safety arguments Rust has are still technically true, but 90% less true than 6 years ago.

The C++ community is also really friendly and open minded.

It's hard to explain but C++ also has this nice relaxing feel when writing it. Like doing a puzzle. Maybe the cognitive load is very evenly spread? Or the header hpp model forces you to think first in data models and interfaces? I have no idea; it's mainly a feeling.

jhoechtl|5 months ago

I guess the Rust workforce is tiny, opinionated and mentally demanding.

sebstefan|5 months ago

I might be biased but I found that the people who came to interview for the Rust roles at my company were noticeably better (or at least better at interviewing) than the applicants for the Java roles. More knowledgeable on the theory, struggled less on the hard things, more up to date on their tech watch.

systems|5 months ago

I think that complexity cannot be eliminated, but it can be hidden and distributed using the right abstraction

that being said C++ being a big language adds complexity (stemming from the language itself, i.e. stemming from the tool)

So you can use a complex tool, to make a complex task simple, or a simple tool and keep the task more complex, requiring more steps etc..

But C++ is a complex tool that, while it takes some complexity away from the task, I think adds enough complexity of its own to outweigh what it removes

We need better languages, C++ is not it

brabel|5 months ago

> that being said C++ being a big language

Rust has become fairly big now, no? Is there some objective metric that can show Rust is a "smaller" language (I bet it is, but I don't think it's by a lot)?

raincole|5 months ago

> memory unsafeness, can be significantly mitigated when developing with a certain modern subset of the C++ language.

Right.

> Most existing and popular databases are developed in C/C++, providing a wealth of resources and innovations we could leverage.

Right.

But two rights can make one wrong. How are you enforcing 'good part of C++' when you're interoperating with others' code?

kccqzy|5 months ago

Encapsulation.

This is our code. That is their code. Depend only on the interface of their code and not the implementation. You can look at their code for curiosity but don't depend on the implementation of their code in our code.

Then you don't care what subset of the language their code is written in.
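As a rough sketch of that boundary in C++: the `their` namespace below is a stand-in for someone else's C-style library and is defined here only so the example compiles; `KeyStore` and every function name are hypothetical. The wrapper is the one place that touches raw handles, and it uses only the library's functions, never its internals.

```cpp
#include <cassert>
#include <memory>
#include <string>

// "Their" code: a made-up C-style API with manual open/close.
namespace their {
struct Store {
    std::string last_key;
};
inline Store* store_open() { return new Store(); }
inline void store_put(Store* s, const char* key) { s->last_key = key; }
inline const char* store_last_key(const Store* s) { return s->last_key.c_str(); }
inline void store_close(Store* s) { delete s; }
}  // namespace their

// "Our" code: an RAII wrapper that depends only on the functions above.
class KeyStore {
public:
    // unique_ptr with a custom deleter calls store_close exactly once.
    KeyStore() : handle_(their::store_open(), &their::store_close) {}

    void put(const std::string& key) {
        their::store_put(handle_.get(), key.c_str());
    }

    std::string last_key() const {
        return their::store_last_key(handle_.get());
    }

private:
    std::unique_ptr<their::Store, void (*)(their::Store*)> handle_;
};
```

Callers of `KeyStore` never see a raw `Store*`, so whichever subset of the language the library uses internally does not leak into the calling code. This does not protect against bugs inside the library itself.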

mempko|5 months ago

The reasons to use C++ over rust in this article are not good. The reason I would have picked is C++ does a better job matching mental models (partly because of its flexibility) and it's easier to say what you mean. Value semantics by default also make it easier to write functional style code.
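To illustrate the value-semantics point, here is a small C++ sketch. `Scores` and `with_bonus` are hypothetical names invented for this example. Passing and returning by value means the function works on its own copy, so the caller's object is left untouched, which is what makes the functional style straightforward.

```cpp
#include <cassert>
#include <vector>

// Hypothetical value type; copying it copies the whole vector.
struct Scores {
    std::vector<int> values;
};

// Takes its argument by value and returns a new value.
// The caller's object is never modified.
Scores with_bonus(Scores s, int bonus) {
    for (int& v : s.values) {
        v += bonus;
    }
    return s;
}
```

A caller writing `Scores boosted = with_bonus(original, 10);` ends up with two independent objects. The cost is a copy on the way in, which a caller can avoid with `std::move` when the original is no longer needed.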

pharrington|5 months ago

Am I reading too much into this, or did Mr. "EloqData Core Team" choose C++ because that's what they're comfortable with, but don't want to say it?

MangoToupe|5 months ago

> Over the past several years, the EloqData team has worked tirelessly to develop this software, ensuring it meets the highest standards of performance and scalability. One key detail we’d like to share is that the majority of EloqKV’s codebase was written in C++.

This is a very interesting approach to marketing a new database

saghm|5 months ago

It sounds like the rationale is that existing database technology is already written in it, and they want to re-use some of it. That's reasonable, but I do think that it only makes sense with the assumption that the flaws in C++ aren't large enough to be worth using something else, at which point there isn't really much need to justify using C++ at all. If someone is concerned about the flaws in C++, the benefit of relying on existing C++ libraries isn't going to seem as compelling to them for the exact same reasons they don't want to use C++ for their own code.

At the end of the day, the choice seems to be a bit circular; if you don't have concerns about C++, you'll find plenty of reasons to use it, and the arguments against it aren't going to be compelling. If you have concerns about it, the reasons to use it won't be compelling, and you'll likely agree with the arguments against it. I have to imagine that whether someone agrees with this choice will be entirely consistent with their existing opinions of C++; it doesn't seem like there are any new arguments left to make on this topic, so debates on the topic will inevitably rehash existing arguments (regardless of which side they come from) and only appeal to the people who already have formed their opinions based on finding those arguments compelling to begin with.

wavemode|5 months ago

I don't think it's quite as circular as you're making it sound. If someone has a prior constraint of needing to move quickly (which is common in startups) it can make sense to choose any arbitrary technology, if it allows them to do so. I don't think someone developing a new game in C++ necessarily has no concerns about C++, that's just the language that all the console SDKs use. I don't think someone doing data science in Python necessarily likes Python, that's just the language that most models and libraries use (and that person probably has a deadline to publish a paper!)

Another factor to consider is that, if one is indeed trying to reuse code from existing databases (regardless of the reason for doing so), code from projects like SQLite and FoundationDB is simply far less likely to contain serious bugs than any newer Rust-based option. There are way more mistakes one can make when writing a database than just memory safety mistakes, and the mistakes tend to be extremely subtle. Code having been run in production for long periods of time under significant amounts of load is basically a fundamental prerequisite for it to make any sense to trust the data of your users to it.

zenethian|5 months ago

Unfortunate that this blog can't be read on mobile without considerable pain. Seems fitting, though, I guess.

rapsey|5 months ago

These articles can always be summed up as "because we wanted to". Rarely is there some real reason.

dijit|5 months ago

As Richard Restak postulates in his book “The Naked Brain”[0]: the limbic system provides a gut feeling (usually from comfort) and we rationalise our way backwards from that without being able to really pinpoint “why”; usually the “why” is secondary and only added as justification for the feeling post-decision.

[0]: https://www.amazon.com/Naked-Brain-Emerging-Neurosociety-Cha...

jmull|5 months ago

They do list specific reasons in the article...

One, C/C++ interop is a priority since they will interoperate with a large variety of C/C++ APIs (sounds like one of the main points of their project is to integrate things that are largely implemented in C/C++).

Two, they say their aim is "building a lasting system that will support decades of continued improvements." You want confidence that 99.9% of the code you write today remains just as good 20, 30, 50 years from now. I don't think rust is quite there yet (or maybe it is but hasn't yet proven it).

Klonoar|5 months ago

The "I drive an H-pattern manual transmission" of programming language discussions.

meindnoch|5 months ago

[deleted]

noworriesnate|5 months ago

If Rust was like Communism, there would be dozens of examples of massive codebases that ported to Rust and ended up with massive infighting, culminating in 20% of the developers being assassinated by the moderation team.

TimorousBestie|5 months ago

> In particular, the most harsh arguments against using C++, i.e. memory unsafeness, can be significantly mitigated when developing with a certain modern subset of the C++ language.

So the quest for the one true “modern subset” of C++ continues.

How do developers continue believing in this after a decade of the standards committee proving over and over again that they’re not interested in this and won’t contribute toward it?

WCSTombs|5 months ago

The article isn't claiming there is a "one true modern subset" of C++ that they use. It's merely pointing out that you can significantly mitigate the main criticisms of C++ by making certain sacrifices, which is pretty much true.

There are good reasons the standards committee doesn't make those sacrifices on your behalf, because ultimately there are tradeoffs there that the programmer is supposed to understand and have control over. However, there is an argument to be had about what the default "safety setting" should be and whether C++ makes a good choice. IMO that's actually the main difference between safety in Rust and C++, since you can make Rust just as unsafe as C++ if you want, only you need to explicitly mark your code as unsafe.
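One concrete instance of the "default safety setting" point, sketched in C++ with hypothetical helper names (`unchecked_get`, `get_or`): the shortest spelling, `v[i]`, is unchecked by default, and the checked form, `v.at(i)`, is something you opt into. Rust flips that default: plain indexing is bounds-checked and `get_unchecked` requires an `unsafe` block.

```cpp
#include <cassert>
#include <cstddef>
#include <stdexcept>
#include <vector>

// Unchecked access: operator[] does no bounds check by default, so the
// caller must guarantee i < v.size(). Anything else is undefined behavior.
int unchecked_get(const std::vector<int>& v, std::size_t i) {
    return v[i];
}

// Checked access: at() throws std::out_of_range on a bad index.
// Here the exception is turned into a caller-supplied fallback value.
int get_or(const std::vector<int>& v, std::size_t i, int fallback) {
    try {
        return v.at(i);
    } catch (const std::out_of_range&) {
        return fallback;
    }
}
```

Both functions return the same result for a valid index; they differ only in what happens when the index is out of range.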

Also, I believe the C++ standards committee does care about this, which is why Profiles [1] are being considered.

[1] https://github.com/BjarneStroustrup/profiles

rapsey|5 months ago

Because they are grasping for reasons not to learn Rust.

coolThingsFirst|5 months ago

Mitigate having bad legs by walking on your hands

guywithahat|5 months ago

As far as I can tell, modern C++20/23 is as safe (if not safer) than rust. So much of rust compares itself to C++99, whereas modern C++ doesn't use exceptions, has smart pointers (RAII), improved casting and array management, and has an extensive suite of checking tools and flags. The conversations I've seen at my company for using rust tend to be "well it would be fun to do something different", which just aren't very compelling to me. I worry Rust is going to end up like Haskell in 5 or so years.

jcranmer|5 months ago

> As far as I can tell, modern C++20/23 is as safe (if not safer) than rust.

It is not. Rust will, for example, prevent the following memory-safety issue from compiling:

    std::vector<T> meow;
    T &x = meow[i];
    meow.push_back(...); // Oops, x is now dangling, maybe!
    x.a = ...;

(This sort of pattern is responsible for nearly 100% of the C++ memory safety issues I know I've committed in the past several years.)
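For anyone who wants to poke at this, a self-contained C++ sketch follows. `Item`, `push_back_reallocates_when_full`, and `append_then_update` are hypothetical names, not from the comment above. The first function shows the trigger: once the vector is full, `push_back` has to reallocate, and a reallocation invalidates every outstanding reference. The second shows one common workaround, which is to carry an index across the call instead of a reference.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical element type standing in for T in the snippet above.
struct Item {
    int a = 0;
};

// Returns true if appending to a full vector grew its capacity.
// A capacity change means the storage was reallocated, which is what
// leaves an earlier `T& x = meow[i]` dangling.
bool push_back_reallocates_when_full() {
    std::vector<Item> meow(4);
    while (meow.size() < meow.capacity()) {
        meow.push_back(Item{});  // fill up to the current capacity
    }
    const std::size_t old_capacity = meow.capacity();
    meow.push_back(Item{});  // size now exceeds old_capacity
    return meow.capacity() > old_capacity;
}

// Workaround: remember the index, mutate the vector, then index again.
int append_then_update(std::vector<Item>& meow, std::size_t i, int value) {
    meow.push_back(Item{});  // may reallocate; the index i is unaffected
    meow[i].a = value;       // fresh lookup after the mutation
    return meow[i].a;
}
```

The index version still relies on the caller passing an `i` that is in range; it removes the dangling reference, not every possible mistake.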

wffurr|5 months ago

C++ is getting safer, but it has a long way to go to match Rust's safety guarantees. Google is doing a lot with spatial safety with hardened libc++, bounds checks for C-style arrays, and safe buffers; but temporal safety is a lot harder without more information in the source code.

Running sanitizers and such is quite expensive too. It burns a lot of cycles to run msan, asan, tsan, valgrind, etc.

Whereas catching these bugs at compile time saves everyone a lot of time and money.

TBH I don't find the reasons in the article particularly compelling. Rust has a lot of industry backing now and is pretty clearly the way forward for systems programming. Writing Rust wrappers over the various libraries they use is largely a one-and-done issue, and they can publish them to Cargo and share the load of keeping them updated. If ISO or various governments get their act together with a real software liability regime or cyber security requirements, companies with big legacy C++ code bases will be in a tough spot. Second best time to start writing safe code in your project is now.

tcoff91|5 months ago

The sanitizers and static analysis tools are not as good as the borrow checker for preventing data races.

ancwrd1|5 months ago

You can easily introduce memory-related issues in "modern C++" and the compiler won't say a word even with pedantic checks.