danielmewes | 10 years ago
We're constantly improving performance and a lot has happened within the past year. I think that at this point RethinkDB is as good a database for analytics as many of the other general-purpose databases when it comes to features and performance.
From what I can tell, there are still two main limitations that apply in some, but not all, scenarios:
* Grouping big results without an associated aggregation requires the full result to fit into RAM. I believe this was the limitation that you ran into a year ago, which led to RAM exhaustion. This limitation is still there ( https://github.com/rethinkdb/rethinkdb/issues/2719 in our issue tracker). However, we're shipping a new command `fold` with the upcoming 2.3 release of RethinkDB, which can be used in the vast majority of cases to perform streaming grouped operations (in conjunction with a matching index). See https://github.com/rethinkdb/rethinkdb/issues/3736 for details.
* Scanning data sets that don't fit into memory on rotational disks is still inefficient. Most SQL databases deploy sophisticated optimizations to structure their disk layout in order to minimize the effects of high seek times. RethinkDB's disk layout is built with a stronger focus on SSDs. This limitation hence doesn't apply if the data is stored on SSDs.
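To illustrate the idea behind streaming grouped operations over an index, here is a rough sketch in plain Python. It is not RethinkDB's `fold` API, only a conceptual stand-in: the function `stream_grouped`, the field names, and the sample rows are all made up for illustration. The assumption is that rows arrive already sorted by the grouping key (which is what reading in index order would give you), so each group can be emitted as soon as the key changes and only one group sits in memory at a time.

```python
from itertools import groupby
from operator import itemgetter


def stream_grouped(rows, key):
    """Yield (group_key, rows_in_group) from rows already sorted by `key`.

    Hypothetical helper: only the current group is materialized,
    never the full result set.
    """
    for group_key, group_rows in groupby(rows, key=itemgetter(key)):
        yield group_key, list(group_rows)


# Made-up data, assumed to arrive in index order on "user".
rows = iter([
    {"user": "a", "n": 1},
    {"user": "a", "n": 2},
    {"user": "b", "n": 3},
])

result = [(k, [r["n"] for r in g]) for k, g in stream_grouped(rows, "user")]
print(result)
```

If the input is not sorted by the grouping key, the same key shows up in several separate groups, which is why the matching index matters.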
sandstrom|10 years ago
Rotational disks are on their way out. For example, Samsung recently introduced a 15TB SSD[1], able to compete even with the largest rotational disks.
[1] https://news.samsung.com/global/samsung-now-introducing-worl...
lobster_johnson|10 years ago