top | item 7515995

Manhattan: Real-time, multi-tenant distributed database for Twitter scale

96 points | jaboutboul | 12 years ago | blog.twitter.com | reply

42 comments

[+] ffk|12 years ago|reply
From what I understand, Manhattan is based on the ideas from ElephantDB. Unfortunately, development on ElephantDB has pretty much stopped, despite the fact that a book on big data by Nathan Marz is being written that depends on it. http://www.manning.com/marz/

Summingbird (bear with me, I'll tie this in) is also Twitter's answer for writing code once and seeing it run on a variety of execution platforms such as Hadoop, Storm, Spark, Akka, etc. Not all of these have been built out, but the platform was designed to be a generic framework to support write once, execute everywhere.

Summingbird is written to support Manhattan's model as well. The high-level idea is to use versioning to determine whether a request is precomputed (batch), computed (realtime), or a hybrid (precomputed + computed). These are expressed as monoids, with basic functionality present in Algebird. One way to bring this model to the open-source world would be to implement Storehaus bindings for ElephantDB and to resurrect ElephantDB, or to build a similar service providing storage like Manhattan's.
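
The batch/realtime/hybrid merge described here can be sketched with a toy commutative monoid (integer counts). The view names and lookup function below are purely illustrative, not Summingbird's or Manhattan's actual API:

```python
def monoid_plus(a, b):
    """Associative, commutative combine; the identity is 0 for counts."""
    return a + b

# Precomputed (batch) view: counts up to some batch version.
batch_view = {"tweet:123": 1000, "tweet:456": 40}

# Realtime deltas accumulated since that batch version.
realtime_view = {"tweet:123": 7, "tweet:789": 3}

def hybrid_read(key):
    """Serve a key by merging the batch and realtime partial results."""
    parts = [v[key] for v in (batch_view, realtime_view) if key in v]
    if not parts:
        return None
    result = parts[0]
    for p in parts[1:]:
        result = monoid_plus(result, p)
    return result

print(hybrid_read("tweet:123"))  # 1007 (batch 1000 + realtime 7)
print(hybrid_read("tweet:789"))  # 3 (realtime only)
```

Because the combine is associative and commutative, the batch and realtime halves can be computed independently and merged at read time, which is exactly what makes the hybrid mode workable.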

Overall, very early yet promising work in the open source community.

[edit: book is not about elephantdb, but is a critical component. modified wording. Also added link]

[+] lenn0x|12 years ago|reply
When Manhattan first started, this was our first use case: supporting batch (something we talk about in our blog post). It was similar to ElephantDB; the biggest difference was that we wrote our own storage engine (SeaDB) instead of using BDBJE. We built the Hadoop pipeline service so developers only needed to supply us with sequence files, and we did the work for them by watching for new output.
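
The "supply sequence files, we watch and import" pattern can be sketched roughly as below. The zero-padded version-directory naming and the swap check are hypothetical, not Manhattan's actual layout:

```python
import os

def latest_version(drop_dir):
    """Return the newest versioned batch drop (e.g. 'v00002'), or None.

    Assumes producers write complete output into zero-padded version
    directories, so lexicographic order matches creation order.
    """
    versions = sorted(
        d for d in os.listdir(drop_dir)
        if d.startswith("v") and os.path.isdir(os.path.join(drop_dir, d))
    )
    return versions[-1] if versions else None

def should_swap(current_version, drop_dir):
    """A watcher loop would call this to decide whether to import a new
    batch version and atomically switch serving over to it."""
    newest = latest_version(drop_dir)
    return newest is not None and newest != current_version
```

The key property is that each batch drop is immutable, so the serving layer can cut over to a new version atomically without coordinating in-place updates.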

Over time, Manhattan evolved into a fully fledged read/write database that supports batch + read/write in the same system. Batch is great for some use cases, but sometimes the cost is too high when you factor in how much processing power, storage, etc. that model needs. We still support both, of course; we want developers to have the freedom to increase their productivity.

[+] teodimoff|12 years ago|reply
I think it's not an exaggeration at all. Twitter literally pulls data from thousands of servers, but that misses the point of Manhattan. As some of us know, Twitter services scale dynamically according to the load they are serving, and some engineers at Twitter decided to put a whole truckload of steroids behind that idea.

Here is the trick: container agents (storage services) are made "clients" of the Manhattan database (the core). They are Mesos processes that scale dynamically with the needs of the service (i.e. 1 container = 10,000 reads/writes per second, 2 containers = 20,000 reads/writes per second, and so on), which allows reads per second and writes per second to scale dynamically. The core handles finding the actual machines that have the data, replicating it, and so on. There might be realtime storage-service containers that need fast data access, the batch importer and timeseries services they mention, and so on. This requires a lot of gymnastics but offers a lot of nice features. The Manhattan database acts as a virtual layer over thousands of machines, and the storage services allow for customized data manipulation. Cool... huh?

Given the importance and scale (multi-DC) this operates at, I think it's almost impossible to open source. But who knows; miracles happen now and then.
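
The linear scaling in that example (1 container = 10,000 reads/writes per second) reduces to a simple capacity calculation; the per-container rate is taken from the comment and the function name is illustrative:

```python
import math

# Each storage-service container handles a fixed request rate
# (10,000 reads/writes per second in the example above).
PER_CONTAINER_QPS = 10_000

def containers_needed(target_qps):
    """Smallest container count that covers the target rate (min 1)."""
    return max(1, math.ceil(target_qps / PER_CONTAINER_QPS))

print(containers_needed(20_000))  # 2
print(containers_needed(25_000))  # 3, since 2 containers cap out at 20,000
```
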
[+] teodimoff|12 years ago|reply
Or at least I think something along those lines is happening.
[+] iLoch|12 years ago|reply
Interesting.. The database sounds almost too good to be true. I wonder if they'll open source this. They've done so in the past with projects like Storm, so I'm hopeful.
[+] gopalv|12 years ago|reply
Consistency, that's the usual performance killer.

They have got the Hadoop watcher right, as well as the BTree/SSTable update mechanisms - heavy import, light update is a use case that is not usually catered to.

The rest of it feels a lot like memcache (membase/zbase implementations) when reading up on its architecture - plus SSDs, which are always going to beat the pants off anything with disks.

Looks good overall, though this is probably not for you if you want to do exact counters or match up different counters (i.e. +1, +1, +1 for a counter funnel).

If you don't need updates to be consistent - say you imported Hadoop data without modification - this starts to look like a really good model.

[+] fizx|12 years ago|reply
When I was using it, it felt too good to be true! :) It's got a lot of moving parts and internal integrations, however, so it's got to be a ton of work to make open. Still hopeful though.
[+] RcouF1uZ4gsC|12 years ago|reply
Can someone enlighten me as to why 6,000 tweets a second is something to make a big deal about? At 140 characters per message that comes out to 840,000 bytes/s, less than 1 megabyte per second. In 2014, is a service that can handle 1 megabyte/s impressive?
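
The arithmetic behind that estimate, assuming one byte per character of the old 140-character limit:

```python
tweets_per_second = 6000
bytes_per_tweet = 140  # old character limit, assuming 1 byte/char

raw_rate = tweets_per_second * bytes_per_tweet
print(raw_rate)  # 840000 bytes/s, i.e. under 1 MB/s of raw tweet text
```

Of course this counts only the message bodies; metadata, indexes, and fan-out multiply the real write volume considerably, which is what the replies below get at.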
[+] jandrewrogers|12 years ago|reply
A single tweet is closer to 2,500 bytes on average in most of the data feeds. However, your point still stands: this is a "human-scale" data model that is relatively small and has a low data rate. It is pretty easy to engineer systems that can keep up with this even if you are not an expert in designing real-time database systems.

By comparison, many complex machine-generated data sources (e.g. real-time entity tracking) that are sometimes fused with the Twitter firehose operate at millions of complex records every second (often tens of gigabytes per second) that need to be processed, indexed, and analyzed in real time. You can't handle that kind of data model with something like Twitter's current architecture, because the several-orders-of-magnitude difference in velocity and volume exposes the limitations of the database platform designs people typically use.

[+] druiid|12 years ago|reply
I think the issue is less the number of tweets per second (not very impressive, as you suggest) and more the requirement that those same 6,000 tweets fan out, in a timely manner, to all of the various feeds/locations where they need to show up. If it were a single page with those 6,000 tweets/sec, it would, as you say, be entirely unimpressive.
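
A back-of-envelope sketch of why fan-out dominates: each tweet is written to every follower's timeline, so write volume scales with the follower count, not the tweet count. The 208-follower average used here is an assumption for illustration only:

```python
tweets_per_second = 6000
avg_followers = 208  # hypothetical average, for illustration

# Each tweet becomes one timeline write per follower.
timeline_writes_per_second = tweets_per_second * avg_followers
print(timeline_writes_per_second)  # 1248000 writes/s from 6000 tweets/s
```

So a "modest" 6,000 tweets/s turns into over a million timeline writes per second before counting indexes, notifications, or replication.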
[+] njharman|12 years ago|reply
It's not. Figuring out which of the 240 million accounts they should be sent to is.
[+] rhizome|12 years ago|reply
The word "handle" is doing a lot of work there. It's a collation engine.
[+] lenn0x|12 years ago|reply
We store more than just tweets :)
[+] swah|12 years ago|reply
I don't even look at databases before Aphyr verifies that they keep their promises...
[+] coolsunglasses|12 years ago|reply
You'd be better served by using his Jepsen series as an opportunity to learn how to evaluate and break databases yourself instead of waiting to be spoonfed.

A $DATABASE_COMPANY was recently looking to hire somebody to be a dedicated database breaker. The Jepsen series was mentioned in the job post.

Worshipping celebrity is unhealthy in what should be an engineering profession and is a sign of a deeply embedded pop culture.

Don't be lazy.

[+] swang|12 years ago|reply
The article calls Gizzard "strongly consistent", but Gizzard's GitHub page says "eventually consistent"[1]. What?

[1] https://github.com/twitter/gizzard

[+] lenn0x|12 years ago|reply
Yes, I think the article got that wrong. You are right that Gizzard is eventually consistent. The storage engine for Gizzard is MySQL, so if you want to do some node-local indexing, you at least have MySQL's strong consistency, but you won't get it across keys in the overall system.
[+] dfcarney|12 years ago|reply
"Real-time" is a bit of a misnomer as far as databases are concerned, especially if you're talking about a system that defaults to eventual consistency. I think they would have been better off saying "high-availability".
[+] SamReidHughes|12 years ago|reply
High availability isn't the same thing as having consistently low latencies. Some databases have spiky latency graphs without it being an availability issue.
[+] feelstupid|12 years ago|reply
Does anyone else find the opening statement a little misleading? Yes, they may originally come from more than one place, but they are sent from Twitter to the app of your choosing via JSON or similar. Sure, there will be more than one request for the icon sprite and user avatars, but all from Twitter.

"When you open the Twitter app on your smartphone and all those tweets, links, icons, photos, and videos materialize in front of you, they’re not coming from one place. They’re coming from thousands of places."

[+] Nacraile|12 years ago|reply
It sounds like you're misinterpreting. The way I'm reading it, by "thousands of places" they mean "thousands of machines". Which may be an exaggeration, but it should be trivially obvious that data is sharded and must be collated for display.
[+] hoodoof|12 years ago|reply
So this is an internal system, right? What is the point of telling the world about it if it's not available for anyone to look at or use? Perhaps a recruiting exercise?
[+] dfcarney|12 years ago|reply
Google, for one, has been doing this kind of thing for years. Dremel (http://research.google.com/pubs/pub36632.html), for instance, was the subject of a paper they published after they'd been developing and using it internally for years. It's never been open sourced, but there's now Apache Drill (http://incubator.apache.org/drill/) which is based on the same paper. In short, it's part marketing, part recruiting, and part ego. I'm hopeful that Twitter will at least publish a research paper that digs into some of the nuances of Manhattan and compares/contrasts against other technologies.