Show HN: Kùzu: An Embeddable GDBMS like DuckDB/SQLite from UWaterloo

42 points | guodong | 3 years ago | kuzudb.com

Hello HN!

Today, we are pleased to publicly release Kùzu: a new embeddable graph database management system under a permissive license. The blog post at the link above gives an overview of the system and our goals and vision.

The system is in its early stages but please try it out and give us your feedback, tell us your feature requests, and please report bugs!

12 comments

omatkafa|3 years ago

This looks cool, and thanks for the MIT licence. I like this kind of easy `pip install ...` installation and will test its performance on very large graphs. Will keep an eye on it.

xkcd99|3 years ago

When you say scalability, how much are we talking about here? Can it handle datasets of around 50 GB or 100 GB?

Also, going through the docs, it seems you currently support only CSV data ingestion, right? Any plans to support JSON, Parquet, or other formats?

Another thing: do you have support for any built-in graph algorithms (like path finding, shortest path, centrality)?

guodong|3 years ago

Thanks for your comment.

For scalability, we can scale to several hundred GB, and we routinely test on LDBC up to 300 GB. Our goal is to support efficient querying over data at TB scale.

Right now, we only support CSV import. We are currently working on integrating Arrow, and aim to support more data formats through it. Hopefully that will bring us support for Parquet, JSON, etc.

Built-in graph algorithms are coming along, but step by step. We are focusing on shortest path queries for now.

As always, any suggestions and discussions on these are welcome.

smartyboi|3 years ago

Different wrapper for the same underlying techniques. Might be good to write some papers, but in my opinion, very unlikely to have any impact in practice.

guodong|3 years ago

Thanks for your comment. We don't want to use Kùzu only as a basis for our research; instead, we are focused on implementing Kùzu seriously and supporting actual users. So expect a few, but not many, papers in the coming years.

Also, I'm not sure what techniques you had in mind, but our position is that graph DBMSs should be built on relational principles and state-of-the-art analytics data management techniques (e.g., that's why Kùzu is a columnar system). But we have many new techniques (e.g., factorization, new join algorithms, new storage designs) that are all optimized for graph data with a lot of many-to-many connections between nodes/entities. These techniques are optimized for finding patterns over such data. We wrote about prototype implementations of these techniques in many previous research papers, and now we are focusing on implementing them very seriously in Kùzu.
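To give a rough feel for why factorization helps with many-to-many connections, here is a minimal Python sketch. It is only an illustration of the general idea, not Kùzu's implementation: the toy graph, the `in_edges`/`out_edges` dictionaries, and all function names are hypothetical. A 2-hop pattern through a middle node can be kept as two separate lists (an implicit Cartesian product) instead of being flattened into one tuple per combination.

```python
from itertools import product

# Hypothetical toy graph (not Kùzu data structures): edges into and out of node "b".
in_edges = {"b": ["a1", "a2", "a3"]}   # nodes with an edge pointing to b
out_edges = {"b": ["c1", "c2"]}        # nodes that b points to


def flat_two_hop(mid):
    """Flat result: materialize every (src, mid, dst) tuple of the 2-hop pattern."""
    return [(s, mid, d) for s, d in product(in_edges[mid], out_edges[mid])]


def factorized_two_hop(mid):
    """Factorized result: keep both sides as lists, i.e. an unexpanded Cartesian product."""
    return (in_edges[mid], mid, out_edges[mid])


def count_factorized(fact):
    """Count matches without expanding: |sources| * |destinations|."""
    srcs, _, dsts = fact
    return len(srcs) * len(dsts)


def expand(fact):
    """Expand a factorized result into flat tuples, only when a consumer needs them."""
    srcs, mid, dsts = fact
    return [(s, mid, d) for s, d in product(srcs, dsts)]


flat = flat_two_hop("b")
fact = factorized_two_hop("b")
flat_values = sum(len(t) for t in flat)          # values stored in the flat form
fact_values = len(fact[0]) + 1 + len(fact[2])    # values stored in the factorized form
```

With m incoming and n outgoing edges, the flat form stores m * n tuples while the factorized form stores m + n + 1 values, and aggregates such as counts can be computed without ever expanding the product.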

Hope this clarifies things a bit. You are welcome to share more of your opinions in more detail.