It may seem daunting, but I think many people make it more complex and difficult than it needs to be.
I have rolled out two transactional databases of my own. In both cases I had to provide very specific properties, and I could not find an existing product that met all the requirements. For example, one of them was for an embedded device that was very restricted in memory: all operations needed to run with hard bounds on time and memory, and the data was stored on a flash chip without wear levelling, which required the database itself to manage writes to prolong the chip's life.
The key is to notice how your database system is going to be different from others and what properties are not essential.
Also, making a general-purpose DBMS tends to be much more complex than making a more niche solution where you know a bit more about what the uses are going to be and what kinds of loads you can expect.
Creating a custom engine for a given application can be a very simple task, because you can cross out the requirements you don't care about and only need it to work well for the loads that this particular application can generate.
Also, it is unlikely you are going to beat the fierce competition in the general-purpose "and a kitchen sink" database management system market, but it is much easier to find a niche that is underserved and create a usable, competitive product with relatively little effort. That's how SQLite started.
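To make the flash example concrete, here is a minimal sketch of one way a niche engine can trade generality for simplicity. It is not the design described above; the `RingLog` class, its fields, and the simulated "sectors" are all hypothetical. It writes records round-robin over a fixed number of sectors, so memory is bounded, every write touches exactly one sector, and erase cycles are spread evenly.

```python
from dataclasses import dataclass, field


@dataclass
class RingLog:
    """Hypothetical fixed-size record log over simulated flash sectors."""
    sector_count: int                       # fixed up front: bounded memory
    sectors: list = field(init=False)       # simulated flash contents
    erase_counts: list = field(init=False)  # per-sector wear, for inspection
    head: int = 0                           # next sector to write
    seq: int = 0                            # increasing record sequence number

    def __post_init__(self):
        self.sectors = [None] * self.sector_count
        self.erase_counts = [0] * self.sector_count

    def put(self, value):
        """Constant-time write: reuse exactly one sector, then advance."""
        if self.sectors[self.head] is not None:
            self.erase_counts[self.head] += 1  # overwriting requires an erase
        self.seq += 1
        self.sectors[self.head] = (self.seq, value)
        self.head = (self.head + 1) % self.sector_count

    def latest(self):
        """Bounded scan (at most sector_count steps) for the newest record."""
        newest = None
        for record in self.sectors:
            if record is not None and (newest is None or record[0] > newest[0]):
                newest = record
        return None if newest is None else newest[1]


log = RingLog(sector_count=4)
for i in range(10):
    log.put(i)
print(log.latest(), log.erase_counts)
```

Because writes rotate through the sectors, no sector is erased more than one time more than any other. A general-purpose engine cannot assume this access pattern, which is the kind of requirement a niche engine gets to cross out.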
Today, building a database from scratch is extremely difficult, for several reasons:
1. it takes a long time no matter what;
2. there are so many successful (open-source) databases;
3. hiring top engineers is so expensive;
4. you won't get enough attention unless your system is drastically better than existing ones.
An interesting observation is that very few databases have been built since 2020 - almost all the newly built databases were developed on top of existing databases (PostgreSQL, ClickHouse, etc).
I started building RisingWave (https://github.com/risingwavelabs/risingwave) in early 2021. The only reason we built the system from scratch was that none of the existing systems could address the problem we are solving - distributed SQL stream processing at cloud scale. We tried Flink but gave up, as it's too heavy and its architecture was not designed for the cloud environment.
If you want to build a database from scratch, or are simply interested in databases, let's talk.
You're obviously the expert here, but I was surprised that you found it notable that very few databases have been released in the last three years. That seems like a very short timeframe. Per Wikipedia, ClickHouse started as an experimental project in 2009 and was first released in 2016.
If you're interested in the idea of databases built from scratch since the time this post was written in 2017 (based on GitHub contributions info), here are a few:
By my estimate, a new database engine is born every week - mostly key-value and document databases. Only a small subset of them survive their first year. According to a guess by Stonebraker, a DBMS takes around 7 years to become mature enough for general applications.
I am building a new immutable cryptographically verified database using IPLD data structures and prolly trees. This allows changes made anywhere to be transparently synced, and for operations to be commuted amongst untrusted peers, for instance allowing for shared index maintenance.
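As a rough illustration of why prolly trees help with syncing, here is a simplified single-level sketch. It is not Fireproof's implementation; `chunk`, `root_hash`, and `boundary_bits` are hypothetical names, and a real prolly tree applies the same idea recursively over several levels. The assumption is that chunk boundaries are derived from a hash of the content itself, so two peers holding the same keys end up with the same chunks and the same root hash regardless of insertion order.

```python
import hashlib


def chunk(keys, boundary_bits=2):
    """Split sorted keys into chunks; a key closes a chunk when the low
    bits of its hash are all zero (a content-defined boundary)."""
    mask = (1 << boundary_bits) - 1
    chunks, current = [], []
    for key in sorted(keys):
        current.append(key)
        digest = hashlib.sha256(key.encode()).digest()
        if digest[-1] & mask == 0:
            chunks.append(current)
            current = []
    if current:
        chunks.append(current)
    return chunks


def root_hash(keys):
    """Hash each chunk, then hash the list of chunk hashes."""
    leaf_hashes = [
        hashlib.sha256("\x00".join(c).encode()).hexdigest() for c in chunk(keys)
    ]
    return hashlib.sha256("".join(leaf_hashes).encode()).hexdigest()


# Two peers that received the same keys in different orders agree on the root.
peer_a = ["apple", "banana", "cherry", "date"]
peer_b = ["date", "cherry", "apple", "banana"]
print(root_hash(peer_a) == root_hash(peer_b))
```

Since the structure depends only on the data, peers can compare hashes to find which chunks differ without trusting each other's edit history.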
KùzuDB[1] is an in-process graph database that was built from scratch and also came out of academia. We are from the Data Systems Group at the University of Waterloo; the project started in Sep 2020, and a small team is actively working on it now.
These two posts[2,3] explain where we are from and where we're going, if anyone is interested.
I started rqlite[1] in 2014[2], FWIW. While I didn't build the storage engine, or the consensus system, I've built the entire "management" part of the RDBMS from scratch. I'm almost 10 years at it, and there is still plenty to do.
DuckDB came out of academia too and is not based on Postgres either (highly relevant, and notably absent from the author's list of academic DBs at the end of the article).
Correct me if I’m wrong, but I don’t think RedPanda is a database. I see it as a streaming data solution, and its novelty is debatable as well, since it’s basically Kafka.
For unsuspecting readers: this article talks about the feasibility of building a new practical DBMS. The database is the most critical piece of software for businesses, so it has already been thoroughly explored and researched. It's very difficult to find a better solution for existing problems. One should either invent a new paradigm or tackle unsolved problems to justify the cost of development.
Technology-wise, writing a toy DBMS is not difficult. Even undergraduates can do it.
onetimeuse92304 | 2 years ago
yingjunwu | 2 years ago
iudqnolq | 2 years ago
stakhanov | 2 years ago
Did the DBMS ever come into existence? (If so: link, please). If not: Why should we be interested in this announcement in 2023?
paddw | 2 years ago
Not sure what OP's intention with this was.
whoevercares | 2 years ago
dang | 2 years ago
Building a Database System in Academia - https://news.ycombinator.com/item?id=13931752 - March 2017 (15 comments)
pcthrowaway | 2 years ago
Not a good look for people browsing at work
apavlo | 2 years ago
eatonphil | 2 years ago
- Materialize: 2017
- DuckDB: 2018
- RedPanda: 2019
- TigerBeetle: 2020
zX41ZdbW | 2 years ago
jchrisa | 2 years ago
https://use-fireproof.com/docs/architecture
It's also the easiest way to write React apps. Here are some ChatGPT expert builders that I've trained to use the CSS framework of your choice with Fireproof: https://use-fireproof.com/docs/chatgpt-quick-start/#react-ex...
guodong | 2 years ago
[1]: https://github.com/kuzudb/kuzu
[2]: https://kuzudb.com/blog/meet-kuzu.html
[3]: https://kuzudb.com/blog/what-every-gdbms-should-do-and-visio...
otoolep | 2 years ago
[1] https://www.rqlite.io
[2] https://www.philipotoole.com/9-years-of-open-source-database...
tlarkworthy | 2 years ago
https://duckdb.org/pdf/SIGMOD2019-demo-duckdb.pdf
EDIT: oh the article is old
zachmu | 2 years ago
Yes, we have commit history from 2015, but that's from an earlier DB project (noms) that we forked and built on top of.
frankdejonge | 2 years ago
whoevercares | 2 years ago
esjeon | 2 years ago