Migrating Messenger storage to optimize performance

104 points | tagx | 7 years ago | code.facebook.com | reply

33 comments

[+] oerpli|7 years ago|reply
They say that they brought some new search features to mobile, though I would prefer if they fixed their search on desktop (or in general) first, as it only returns results from recent messages.

Until sometime in 2016 or 2017 it was possible to search for specific things from any date (e.g. 2011); now even messages less than three years old return nothing.

[+] spyspy|7 years ago|reply
I don't see the ability to search within the app. Have they not rolled it out yet?
[+] fao_|7 years ago|reply
One solution is to download an archive and grep the JSON returned
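A minimal sketch of that approach in Python, for cases where plain grep is awkward (escaped characters, nested fields). It assumes only that the downloaded archive contains `.json` files somewhere under a directory; it does not rely on any particular export schema, and `search_archive` is a made-up helper name.

```python
import json
from pathlib import Path


def iter_strings(node):
    """Yield every string value found anywhere inside a parsed JSON value."""
    if isinstance(node, str):
        yield node
    elif isinstance(node, dict):
        for value in node.values():
            yield from iter_strings(value)
    elif isinstance(node, list):
        for item in node:
            yield from iter_strings(item)


def search_archive(root, needle):
    """Return (file path, matching string) pairs, matched case-insensitively."""
    needle = needle.lower()
    hits = []
    for path in sorted(Path(root).rglob("*.json")):
        try:
            data = json.loads(path.read_text(encoding="utf-8"))
        except (OSError, ValueError):
            continue  # skip unreadable or malformed files
        for text in iter_strings(data):
            if needle in text.lower():
                hits.append((str(path), text))
    return hits
```

Something like `search_archive("path/to/unzipped/archive", "concert tickets")` would then list every string field containing the phrase, whatever the nesting.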
[+] dingo_bat|7 years ago|reply
Whatever you may think of social networking, Facebook goes to great pains to ensure reliability. They moved all the data from spinning disks to flash while also migrating to a new database, with no downtime at all.

I'm a bit unclear as to how this process resulted in 90% decreased storage use.
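A rough back-of-the-envelope check, using only the figures the article states (a 90 percent total reduction and a replication factor cut from six to three). It assumes the 90 percent is measured across all replicas and that the savings multiply; the function name is made up for illustration.

```python
def per_replica_remaining(total_reduction, old_replicas, new_replicas):
    """Fraction of the original per-replica size left after schema and compression savings.

    Assumes total footprint = per-replica size * replication factor.
    """
    total_remaining = 1.0 - total_reduction
    replication_remaining = new_replicas / old_replicas
    return total_remaining / replication_remaining


remaining = per_replica_remaining(0.90, old_replicas=6, new_replicas=3)
print(f"{remaining:.0%}")  # prints 20%
```

Under those assumptions, halving the replica count accounts for half the footprint, and the simplified schema plus compression would need to shrink each remaining copy to about a fifth of its former size.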

[+] martincmartin|7 years ago|reply
"The simplified data schema directly reduced the size of data on disk. We saved additional space in MyRocks by applying Zstandard, a state-of-the-art lossless data compression algorithm developed by Facebook. We were able to reduce the replication factor from six to three, thanks to differences between the HBase and MyRocks architectures. In total, we reduced storage consumption by 90 percent without data loss, thereby making it practical to use flash storage."
[+] nemothekid|7 years ago|reply
Facebook has been using RocksDB more and more (now Messenger, but also Instagram’s flavor of Cassandra). I wonder if Google has a similar penetration of LevelDB (and we just don’t hear about it because Google’s infrastructure work isn’t open source).
[+] rakoo|7 years ago|reply
There seems to be a pattern with Google: they have internal infrastructure that every ex-Googler seems to miss after leaving the company, because it's so good that it felt like being 10 years in the future.

As far as I understand, Google doesn't have a bunch of tools that merely work together; they have one huge system with different bits that _live_ together, so much so that separating and open-sourcing them is cool but won't give you the same thing as being on the inside:

- They use Blaze, a build system that integrates directly with the object store. They open-sourced Bazel as a kind of equivalent, but the build system won't shine unless you have an integrated object store and an integrated vcs client
- They have open-sourced Kubernetes, a successor of what they were using internally for cloud management
- They have open-sourced LevelDB, a successor of the fundamental brick they are using for BigTable

So in a way LevelDB isn't used as-is inside Google, but its spirit is in use at a fundamental level by pretty much everyone

[+] asfasgasg|7 years ago|reply
No, it does not. Generally speaking, Google "converges" a layer up (BigTable, Spanner, Colossus). There is no layer like LevelDB that is common to all of these.
[+] Zaheer|7 years ago|reply
During the dual-write phase, what happens if one request succeeds while the other doesn't?
[+] tagx|7 years ago|reply
Iris retries the failed request
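For readers unfamiliar with the pattern, here is a generic dual-write-with-retry sketch in Python. It is not Iris's actual mechanism, which the thread does not describe; `write_with_retry`, `dual_write`, and the store callables are all hypothetical.

```python
import time


def write_with_retry(write, message, attempts=3, delay=0.0):
    """Call write(message) until it succeeds or attempts run out; return success."""
    for attempt in range(1, attempts + 1):
        try:
            write(message)
            return True
        except Exception:
            # Assumption: any exception is transient and safe to retry.
            if attempt < attempts and delay:
                time.sleep(delay)
    return False


def dual_write(old_store, new_store, message):
    """Write the same message to both stores, retrying each side independently."""
    return {
        "old": write_with_retry(old_store, message),
        "new": write_with_retry(new_store, message),
    }
```

Retrying like this is only safe if each store's write is idempotent, so that a request which succeeded but appeared to fail does not produce a duplicate message. A side that still fails after all attempts shows up as `False` in the result and would need separate reconciliation.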
[+] bigato|7 years ago|reply
What about giving me an easy option to export my messages while you are at it?
[+] mtgx|7 years ago|reply
Unfortunately, the feature to export your messages is only available to Facebook's data partners.
[+] tomnipotent|7 years ago|reply
TL;DR - Moved from HBase to MyRocks engine on MySQL sitting on top of NVMe storage via their Lightning Server [1], which is a JBOF (just a bunch of flash) setup using x16 PCIe.

[1] https://code.facebook.com/posts/989638804458007/introducing-...

[+] wenc|7 years ago|reply
Anyone know why Facebook moved off HBase? (article doesn't address this)

I get that MyRocks is truly amazing, but I'm wondering what issues they were facing with HBase. I heard HBase was picked over Cassandra (developed at FB) because it was strongly consistent vs Cassandra's eventual consistency.

[+] midom|7 years ago|reply
+ app server / schema rework
[+] godelmachine|7 years ago|reply
Have they published a paper on this on Facebook Research?
[+] philip1209|7 years ago|reply
Great, now they can serve video ads in Messenger more easily.
[+] aylmao|7 years ago|reply
This has nothing to do with video though.