RethinkDB 1.13: pull data via HTTP, push data via changefeeds

[+] sync|11 years ago|reply

Huge release this time around. Congrats guys & gals!

`r.args` isn't very flashy but is a huge step forward for ReQL.

Removing protobuf makes deploying to certain platforms (e.g. heroku) so much easier.

Promises support for the Javascript driver brings it to the modern age.

And of course changefeeds and `r.http`!

If you've been hearing about RethinkDB and thinking about trying it out, now's a great time to spin the wheels.

[+] coffeemug|11 years ago|reply

slava @ rethink here. I forgot to add `r.args` (http://rethinkdb.com/api/javascript/#args) to the blog post! People have been requesting that feature for a few releases now, and it should make lots of code much less painful.
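
For readers who have not used it, a rough analogy in plain Python: `r.args` splices an array into a command's argument list, much as `*` unpacking does for an ordinary function call. The `get_all` below is a hypothetical stand-in written for this sketch, not the driver's `get_all`, and the `users` table name in the comment is an assumption.

```python
# Plain-Python analogy only: this get_all is a hypothetical stand-in,
# not the RethinkDB driver's get_all.
def get_all(*keys):
    """Return the keys this variadic command would look up."""
    return list(keys)

ids = ["alice", "bob"]

# Passing the list directly sends one argument: the whole list as a single key.
one_key = get_all(ids)

# Splicing spreads the list into separate arguments, which is the role
# r.args plays inside a ReQL query, e.g.
#   r.table('users').get_all(r.args(ids))   # 'users' is an assumed table name
two_keys = get_all(*ids)
```

The practical difference is that `*` needs the list on the client, whereas `r.args` is part of the query itself.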

FYI the upcoming 1.14 release is scheduled to include a distributed file system and geospatial indexing (probably the most requested RethinkDB features of all time).

If you have any questions about Rethink, I'm here all day to answer questions. (We can also grab lunch or coffee if you're in the Bay Area).

[+] mglukhovsky|11 years ago|reply

Mike @ RethinkDB here. We're co-hosting a RethinkDB + Firebase meetup on July 1st in San Francisco on building realtime apps. We'll be doing more in-depth demos of the new 1.13 features, and showing how you can use them to build realtime apps faster.

RSVP to the meetup here: http://www.meetup.com/RethinkDB-Bay-Area-Meetup-Group/events...

We'd love to see you come hang out with the RethinkDB and Firebase teams. Even better, give a lightning talk on how you're using RethinkDB! Christina is finalizing the schedule; if you'd like to do a lightning talk, shoot her an email ([email protected]).

[+] piotrkaminski|11 years ago|reply

I must admit this makes me confused -- do RethinkDB and Firebase integrate in some non-obvious way? Both appear to be JSON databases with real-time change feeds so it would seem they'd be competitors if anything... Any chance you can clarify the relationship? Thanks.

[+] troyk|11 years ago|reply

I really hope I get to use Rethink in anger before I die. Last time, I had to move on because we are multi-tenant SaaS and it seemed the compound indexes just were not up to that task, does this release fix any of those issues? (I seem to recall I was having to use a between query and not being able to sort, but it was a while ago and I quickly got back to work with old reliable (postgresql))

[+] coffeemug|11 years ago|reply

I think you're talking about https://github.com/rethinkdb/rethinkdb/issues/1227. It hasn't been fixed yet, and is a big deal for many multitenancy users.

I'll bump it up in the roadmap -- sorry it's taking so long to get this fixed. It turns out that everyone only needs 5% of RethinkDB features, but it's always a different 5%. That makes product roadmaps really hard. I think in this sense databases are a bit like word processors :)

[+] dkhenry|11 years ago|reply

It is tough to keep up with all these changes. I mean Rethink was pretty feature complete for me like four or five releases ago.

I don't know how I feel about the http command but I am digging the other changes. Keep up the good work.

[+] neumino|11 years ago|reply

@dkhenry -- Thanks for all your work on the Java driver!

[+] vishy1618|11 years ago|reply

Absolutely amazing release! A quarter back we had a small chat with @coffeemug, to see if RethinkDB would be a good fit for our product (geospatial indexing, ad-hoc queries, realtime updates, ...). He said then that it wasn't yet, and promised that it would be in a year's time. I see now what he means, Rethink is already looking like a very strong contender!

[+] transientbug|11 years ago|reply

Damn, rethink won't stop getting better and better! I'm looking forward to getting to use the changefeeds, as I've currently got a redis list that I use to get notified of changes to rethinkdb, which is fine but just another cog in the system to debug.

Seriously guys, thanks for the great product and awesome database experience (PyRethinkORM author here).

[+] coffeemug|11 years ago|reply

Thanks for PyRethinkORM! I haven't used it yet, but really looking forward to taking it for a spin with Django in a few weeks. We'd love to feature it in our docs when we get through a few more immediate issues.

[+] imslavko|11 years ago|reply

Built-in notifications on writes are really cool! This is something Meteor had to implement on the application layer because MongoDB never supported it natively (and the oplog consumed by Meteor didn't really have enough context). RethinkDB gives you both oldValue and newValue which is super cool!

[+] coffeemug|11 years ago|reply

You can also query on them, which is really cool IMO! E.g. give me every document where the score has increased:

  r.table('games')  \
   .changes()       \
   .filter(r.row['old_val']['score'] < r.row['new_val']['score']) \
   .run(conn)

Or, give me every document for user X:

  r.table('games')  \
   .changes()       \
   .filter(r.row['new_val']['user_id'] == X) \
   .run(conn)

So you can write really cool apps out of the box without having to filter things on the client.

(BTW, we have a lot more planned for this)
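
For contrast, here is a minimal sketch of the client-side filtering that the first query above makes unnecessary. It is plain Python over an iterable of change documents and does not touch the driver. The sample data is invented, and the handling of a missing `old_val` or `new_val` (as on inserts and deletes) is an assumption about the feed's shape.

```python
def score_increases(changes):
    """Yield changes whose score went up, skipping inserts and deletes."""
    for change in changes:
        old, new = change.get("old_val"), change.get("new_val")
        # Assumption: an insert has no old value and a delete has no new value.
        if old is None or new is None:
            continue
        if old["score"] < new["score"]:
            yield change

# Invented sample documents shaped like changefeed output.
sample = [
    {"old_val": {"id": 1, "score": 10}, "new_val": {"id": 1, "score": 15}},
    {"old_val": {"id": 2, "score": 7}, "new_val": {"id": 2, "score": 3}},
    {"old_val": None, "new_val": {"id": 3, "score": 1}},
]
increased = list(score_increases(sample))
```

With the filter inside the query, only the first of these three changes would ever be sent to the client.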

[+] davidbanham|11 years ago|reply

Change feeds look great. r.http seems like a frippery, though. Not a great signal to see things like that getting added when there are much bigger fish yet to fry.

[+] coffeemug|11 years ago|reply

The r.http command was controversial, even internally. I had to call in a lot of favors to convince people to get it in, but here is why I was convinced it's a good idea:

  - It fits! It just seems to work magically well with the rest of
    ReQL on so many levels! JSON fits, streams fit, lazy evaluation
    fits, even batch prefetching fits! Everything works so
    wonderfully and provides such a great interactive experience,
    it almost would be silly not to add!
  - It makes a use case that's really important to me
    personally (adhoc analytics) an order of magnitude easier. Over
    the past few months I ran into a few other people that also do
    a lot of interactive adhoc data analysis with Rethink, which
    redoubled my resolve to add r.http.
  - It makes example datasets *much* more elegant. Reading
    RethinkDB docs? Just call this command to get the dataset in so
    you can play around with a command. (We haven't updated the
    examples yet, but we will). That makes the first experience
    with the product better, and makes the learning experience much
    more pleasant.

So I hear what you're saying, but I respectfully disagree. If you think of it as a signal, think of it as a signal of deep passion for the product and our users! Ornamentation for the sake of ornamentation isn't what we do.

(On an unrelated note, I really love your use of 'frippery' -- I'm a bit of a word nerd, and this is a really neat word!)

[+] hrjet|11 years ago|reply

I am not a database expert hence this question.

RethinkDB looks better than MongoDB in many respects. But how does it compare to Postgresql (latest versions with document support)?

If I were to start on a new project with modest scalability requirements, what should I choose and why?

[+] barosl|11 years ago|reply

I think the biggest difference is the built-in scalability. It has a beautiful Web admin and CLI admin that ease scalability work, including sharding and adding a node to the cluster.

PostgreSQL also has built-in streaming replication, which is useful and works nicely. But the scenario it covers is scaling reads, not writes. To scale writes, you should shard the data manually.

There are efforts to bring write scalability to PostgreSQL. One of them I've been keeping my eye on recently is Postgres-XL, which uses statement-based replication that focuses on writability and availability. It is an immature product that started recently, though. Also, some easy-to-administer features like automatic failover and configuration propagation (adding/removing nodes) are not objectives of the project. You would have to adopt another solution for those, or write it yourself.

However, in exchange for the scalability, you lose ACID in RethinkDB, and the impact can be huge for some use cases. Personally, I think it is worth losing to gain scalability easily. There is also two-phase commit, which at least makes it possible to imitate transactions, though it isn't elegant at all.

If your application is read-intensive, I think starting with PostgreSQL and adopting Postgres-XL or pgpool later is a good strategy. But if you need to shard the data to scale writes, or you want an all-in-one solution that helps a lot with administrative work, I think RethinkDB is highly recommended.

[+] coffeemug|11 years ago|reply

Hi - Slava @ Rethink here.

Postgres is an amazing product, and incredibly stable by virtue of being very mature. You can't go wrong with it and will almost certainly be just fine.

However, you can't quite compare Postgres document support with RethinkDB. You'd really need to play with the product to understand the difference, but think of it this way. You could sort of treat Java as a dynamic language by casting everything to object, and it would work just fine, but it is night and day compared to how it would work in Python. The difference between the experience of using Postgres and RethinkDB for JSON data is just as vast. You'd really have to try both before you can viscerally understand how amazing a dedicated JSON environment can be.

RethinkDB is behind Postgres on raw performance and we're still shaking out scalability quirks (see http://rethinkdb.com/stability/), but if you take the long view (a year or so), these will be worked out in time. I'd encourage you to play with the environment -- it's easy and fun -- and see if you like it. Feel free to shoot me an email ([email protected]) and I'll be happy to help you out if you have questions.

[+] orkj|11 years ago|reply

Great stuff as always. I have had a RethinkDB instance in production for 13 months now (since 1.4 - developed it while on 1.2) and looking forward to upgrading and playing with the new features in other projects.

Thanks again for your awesome work, guys!

[+] kclay|11 years ago|reply

Great release, time to get my Scala driver updated. Who wants to help get the `changes` API working with iteratees?

[+] dkhenry|11 years ago|reply

Yeah I think that change is going to be another fun one to implement.

[+] Seich|11 years ago|reply

Congratulations on the release guys. The update to the Javascript driver is huge and will make development a lot less painful. I get to remove a lot of redundant code thanks to that.

I am a huge fan, keep up the outstanding work!

[+] maxpert|11 years ago|reply

Great job! Totally loved it. Thought it would be a good idea to have channels instead of tables for pub/sub mechanism.

[+] coffeemug|11 years ago|reply

We debated this for a while and decided not to do channels for now.

Feeds are great because you can use them to integrate with other pieces of the infrastructure like RabbitMQ or ElasticSearch, or write reactive apps where clients instantaneously react to changes in other clients. Incidentally, you can use them to easily get pubsub, but it wasn't the original intention.

There are much better pub/sub services out there, so we decided to stick to having a really good feed API and avoid implementing channels for the time being.

[+] selvakn|11 years ago|reply

Is the changefeeds feature comparable with changes api of couchdb?

[+] coffeemug|11 years ago|reply

Unless I'm mistaken RethinkDB changefeeds are more powerful than the couch API. You can perform filters on changefeeds, do joins, transformations, etc., while the Couch API gives you a feed that has to be fully processed on the client.

[+] doug1001|11 years ago|reply

Am I correct that the biggest hurdle to a Python 3 driver has been the lack of Python 3 support in Google's protocol buffer library? If so, then it seems that the 1.13 release, which removes the protobuf dependency, will substantially accelerate development of the Python 3 driver.

[+] coffeemug|11 years ago|reply

That's right! In fact, there is a pull request to make the driver Python 3 compatible that's being reviewed right now (plus some additional testing changes, etc.) We should be able to get Python 3 support in pretty soon.

[+] Goranek|11 years ago|reply

Is using http out of the db really useful? Can someone give me a simple case when this should be used?

[+] coffeemug|11 years ago|reply

It's extremely convenient for interactive use. See this tutorial -- http://rethinkdb.com/docs/external-api-access/.

You can grab JSON data out of APIs in seconds, filter and manipulate it, and enrich it with more APIs. I've been using `r.http` for some time (since an internal beta) and I now find it invaluable for ad-hoc analysis.
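
For a sense of what `r.http` folds into a single query, here is a sketch of the same fetch-then-filter step done by hand in plain Python. It uses only the standard library. The helper names `fetch_json` and `top_by`, the `stars` field, and the sample records are all invented for illustration, and no real API is called.

```python
import json
from urllib.request import urlopen

def fetch_json(url):
    """Fetch a URL and decode its body as JSON (the by-hand fetch step)."""
    with urlopen(url) as response:
        return json.load(response)

def top_by(items, field, n=3):
    """Return the n items with the largest value in `field`, skipping items without it."""
    present = [item for item in items if field in item]
    return sorted(present, key=lambda item: item[field], reverse=True)[:n]

# Inline stand-in data so the sketch runs offline; real use would be
# top_by(fetch_json(some_api_url), "stars") with a URL and field of your choosing.
repos = [
    {"name": "a", "stars": 5},
    {"name": "b", "stars": 42},
    {"name": "c"},
    {"name": "d", "stars": 17},
]
top_two = top_by(repos, "stars", n=2)
```

Done this way, the fetching, decoding, and sorting all live in client code; the appeal described above is that the same steps become ordinary query terms.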

[+] miralabs|11 years ago|reply

Anyone using RethinkDB in production? What's the experience like?

[+] samstave|11 years ago|reply

No offense, but I found the voice of the engineer difficult to listen to seriously, found the video too humorous.

[+] coffeemug|11 years ago|reply

Michael can be an acquired taste, like fine wine or good scotch. If you don't push past that first sip, you'll never know the deep, tantalizing world known to us connoisseurs as @mlucy. (But if you do, please let it be a matter of public record that we discovered him first)

[+] da02|11 years ago|reply

Still, his voice seems better than many other software developers. But, I know what you mean: I've had problems listening to some UK developer videos. (I'm US.)

On closer inspection of the video: I'm jealous of any company that lets their employees walk around the office with their shoes off. Not only are they super-smart, they're cool too.

(Sorry for this shallow, superficial observation. I'm still learning about databases and programming.)

[+] jeff_5nines|11 years ago|reply

What, are you two years old? I understood him perfectly. Why even make a comment like that anyway?
