RethinkDB Expands Beyond SSDs, Launches Its Speedy Database To The Public

samstokes | 15 years ago

This looks like an interesting product for cache-like use cases.

Have you considered including Redis in your benchmarks? From the FAQ it sounds like there are some use cases where both Redis and RethinkDB would be suitable. The main functional differences I can see are:

* RethinkDB supports datasets larger than RAM, whereas Redis doesn't really. This could be big, for some use cases.

* Redis supports lists, sets, priority queues etc, whereas RethinkDB values can only be strings for now (if I understand correctly?).

* With Redis, writes are not immediately durable by default (though they can be configured that way at a performance cost), whereas with RethinkDB they are? (In particular I'd like to see RethinkDB benchmarked against Redis with 'appendonly yes' and 'appendfsync always'.)

* Redis has built-in master-slave replication, whereas RethinkDB does not yet.

* Redis is open-source; RethinkDB you have to pay for updates. (Will updates eventually trickle down to the free plan, or is 1.0 all she wrote for free users?)
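To make the durability trade-off in the third bullet concrete, here is a minimal sketch of an append-only log that either forces every write to disk or leaves flushing to the OS. It is an illustration only, not how Redis or RethinkDB implement persistence; `append_record` and its `fsync_each_write` flag are hypothetical names. Syncing on every write is roughly what 'appendfsync always' asks for, and that per-write sync is where the performance cost comes from.

```python
import os


def append_record(path: str, record: bytes, fsync_each_write: bool) -> None:
    """Append one record to a log file (hypothetical helper, for illustration).

    With fsync_each_write=True the call returns only after the OS has been
    asked to push the data to stable storage. With False, the data may sit
    in the OS page cache for a while and can be lost on a crash.
    """
    with open(path, "ab") as log:
        log.write(record + b"\n")
        log.flush()  # move data from Python's buffer to the OS
        if fsync_each_write:
            os.fsync(log.fileno())  # ask the OS to persist it to disk
```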

One more question, based on your hint that "support for more protocols is coming". Salvatore has said he considers Redis to be a protocol first and a database second (http://antirez.com/post/redis-manifesto.html). Have you considered implementing the Redis protocol in RethinkDB? That would actually be awesome, to be able to switch between Redis and RethinkDB as requirements changed.
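For readers who haven't looked at it, the Redis request format is small enough to sketch in a few lines, which is part of why it works as a "protocol first" design. The snippet below is a minimal illustration of encoding a command as an array of bulk strings, as described in the Redis protocol documentation; `encode_command` is a hypothetical helper name, and nothing here reflects RethinkDB's code.

```python
def encode_command(*args: str) -> bytes:
    """Encode a command as a Redis request: an array of bulk strings.

    Layout: *<number of args>\r\n, then for each argument
    $<byte length>\r\n<bytes>\r\n.
    """
    parts = [b"*%d\r\n" % len(args)]
    for arg in args:
        data = arg.encode("utf-8")
        parts.append(b"$%d\r\n%s\r\n" % (len(data), data))
    return b"".join(parts)


# Example: the bytes a client would send for SET key value
request = encode_command("SET", "key", "value")
```

Any server that parses this framing and answers in the matching reply format could, in principle, sit behind an existing Redis client library.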

coffeemug | 15 years ago

> Have you considered including Redis in your benchmarks?

Yes. The trouble is that doing performance benchmarks correctly is incredibly time-consuming. Most of the time people aren't actually measuring what they think they're measuring, so most benchmarks end up comparing apples to oranges. We spend a lot of time making sure that our benchmarks measure the right things and expose the right data, at the expense of doing more benchmarks. Naturally, this requires learning a great deal about the products we're benchmarking against. We're working to automate much of this process and to put an organizational infrastructure in place to add more (hopefully almost all) competitors, but it will take some time.

> Redis supports lists, sets, priority queues etc, whereas RethinkDB values can only be strings for now (if I understand correctly?).

Correct. We think of RethinkDB as a database first, and a protocol second. We've built really good, largely protocol-independent technology that lets us execute most protocols with very high performance. We think that part is hard, while protocols are easy (in the sense that most people can implement a known protocol, but few people can build a system that makes them all run fast in the same product). We'll be adding more protocols shortly, and Redis is definitely on the list (along with Cassandra, Hadoop, and beyond).

> With Redis, writes are not immediately durable by default (though they can be configured that way at a performance cost), whereas with RethinkDB they are?

This is fully configurable. See here: http://rethinkdb.com/docs/#durability

> Redis has built-in master-slave replication, whereas RethinkDB does not yet.

Correct. RethinkDB 1.1 (currently in QA) has full support for master-slave replication, automatic failover, and range queries (via the memcached rget extension).

> Redis is open-source; RethinkDB you have to pay for updates. (Will updates eventually trickle down to the free plan, or is 1.0 all she wrote for free users?)

Our strategy is to set up the pricing structure in such a way that companies that have the money and the demand will have to pay for the product, and customers that don't have the money (companies + individuals) will be able to use it for free. Getting the details of the pricing structure right is difficult - we're working it out now, so I can't share the details yet.

Meai | 15 years ago

I have little experience with databases, so please forgive me: how useful is a database with no horizontal scaling support? Is your target market small businesses, using it as a caching tier? I imagine that everyone dreaming of a popular website eventually needs to scale beyond a single server. I think the very strong selling points of Cassandra and MongoDB are:

1. sharding (distributing database load over multiple machines)

2. replication (simultaneously running backup databases over multiple machines)

Am I correct that I would have to make my own [nightly] backups of my database?

coffeemug | 15 years ago

At the moment our target customers do horizontal scaling in the application layer, and we built exactly what they need (they wouldn't use the horizontal scaling features).

V1.1 of RethinkDB (currently in QA) will include replication. V1.2 will include full-blown clustering support (currently in development).
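For anyone unfamiliar with what horizontal scaling "in the application layer" looks like, here is a minimal sketch, assuming the application holds a fixed list of independent database servers and picks one per key by hashing. The server names and the `pick_shard` helper are hypothetical and not taken from RethinkDB or its customers.

```python
import hashlib
from typing import Sequence


def pick_shard(key: str, servers: Sequence[str]) -> str:
    """Map a key to one server by hashing the key (hypothetical helper).

    Every application process that uses the same server list sends a
    given key to the same server, so the servers never need to know
    about each other.
    """
    if not servers:
        raise ValueError("at least one server is required")
    digest = hashlib.md5(key.encode("utf-8")).digest()
    index = int.from_bytes(digest[:8], "big") % len(servers)
    return servers[index]


# Hypothetical server list; each entry is an independent database node.
SERVERS = ["db1.example.com:11211", "db2.example.com:11211", "db3.example.com:11211"]
target = pick_shard("user:42", SERVERS)
```

The catch with plain modulo hashing is that adding or removing a server remaps most keys; consistent hashing is the usual way to limit that.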

gruseom | 15 years ago

Best of luck, guys. I like super-ambitious projects that make things go fast!

gleb | 15 years ago

Congrats on the launch!
