garrettf's comments

garrettf | 4 years ago | on: Lessons learned from sharding Postgres at Notion

Oh yes—this footnote is not knocking non-relational DBs, in fact we have deployed DynamoDB in production for other use cases today.

We were on a tight timeline due to impending TXID wraparound. Switching database technologies would have required us to rewrite our queries, reexamine all of our indexes, and then validate that queries were both correct and performant on the new database. Even if we had the time to derisk those concerns, we'd be moving our most critical data from a system with scaling thresholds and failure modes we understand to a system we are less familiar with. Generally, we were already familiar with the performance characteristics of Postgres, and could leverage a decent amount of monitoring and tooling we built atop it.

There is nothing inherent about non-relational DBs that make them unsuitable for our workload. If we were designing our data storage architecture from scratch, we'd consider databases that are optimized for our heaviest queries (vs. the flexibility afforded by Postgres today). A large number of those are simple key-value lookups, and a plain key-value store like DynamoDB is great for that. We're considering these alternatives going forward, especially as we optimize specific user workloads.

Re: Cloud Spanner: we didn't consider a cross-cloud migration due to the same time constraints. It still sounds like a wonderful product; we just weren't ready for it at the time.

garrettf | 4 years ago | on: Lessons learned from sharding Postgres at Notion

Plenty! But not significant compared to page content in terms of storage or load. That unsharded data is stored in separate unsharded databases today, but each table could be sharded later using a different shard key (often user id).

Luckily we did not have many join queries prior to sharding. The few cross-database joins were trivial to implement as separate queries in application logic.
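For a concrete picture, here is a minimal sketch of one of those app-level joins: plain Python, with in-memory lists standing in for the two databases. The table and field names ("spaces", "users", "owner_id") are hypothetical, not our actual schema; in reality each step would be a query against a different database.

```python
def app_level_join(spaces, users, space_ids):
    """Emulate `SELECT ... FROM spaces JOIN users ...` as two separate
    queries stitched together in application logic."""
    # Query 1: fetch the requested spaces (would hit the sharded DB).
    wanted = [s for s in spaces if s["id"] in space_ids]
    # Query 2: fetch just the owning users (would hit the unsharded DB).
    owner_ids = {s["owner_id"] for s in wanted}
    email_by_id = {u["id"]: u["email"] for u in users if u["id"] in owner_ids}
    # "Join" the two result sets in application code.
    return [{**s, "owner_email": email_by_id.get(s["owner_id"])} for s in wanted]
```

Because these joins were few and each touched a small, static table, doing the second lookup in application code cost us very little.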

Our main concern here was referential consistency: if you need to write to multiple databases at once, what happens if one of your writes fails? This was not a problem in practice for us since (1) our unsharded data is relatively static and (2) there are very few bidirectional pointers between data across databases.

Long term there are more interesting problems to solve when distributing unsharded data globally. However, given that our unsharded data is less dynamic and consistency is less critical, we have many levers we can pull here.

garrettf | 4 years ago | on: Lessons learned from sharding Postgres at Notion

> Can you share details on the routing?

All in the application layer! All of our server code runs from the same repo, and every Postgres query gets routed through the same module. This meant it was relatively easy to add a required "shard key" argument to all of our existing queries, and then within our Postgres module consult an in-app mapping between shard key range and DB+schema.

Plumbing that shard key argument through the application was more difficult, but luckily possible due to the hierarchical nature[0] of our data model.
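A minimal sketch of that routing module, assuming a sorted in-app table of range boundaries; the host names, schema names, and range values below are illustrative, not our real topology:

```python
import bisect

# Each entry: (exclusive upper bound of shard-key range, database, schema).
# Sorted by upper bound so we can binary-search on the shard key.
SHARD_MAP = [
    (0x4000, "pg-shard-01", "schema_0"),
    (0x8000, "pg-shard-01", "schema_1"),
    (0xC000, "pg-shard-02", "schema_2"),
    (0x10000, "pg-shard-02", "schema_3"),
]
_BOUNDS = [upper for upper, _, _ in SHARD_MAP]

def route(shard_key: int):
    """Return the (database, schema) that owns this shard key."""
    i = bisect.bisect_right(_BOUNDS, shard_key)
    if i == len(SHARD_MAP):
        raise ValueError(f"shard key {shard_key} out of range")
    _, db, schema = SHARD_MAP[i]
    return db, schema

def run_query(shard_key: int, sql: str):
    """Every query must pass a shard key; the module picks the target.
    In the real module this would acquire a connection to `db` and
    qualify table names with `schema`; here we just return the target."""
    db, schema = route(shard_key)
    return db, schema, sql
```

Since the mapping lives in application code, re-sharding means shipping a new mapping rather than reconfiguring a proxy layer.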

> Did you consider using several databases on the same Postgres host

If I recall correctly, you cannot use a single client connection to connect to multiple databases on the same host, so this could have ballooned our connection counts across the application. This isn't something we explored too deeply, though; I'd love to hear about potential benefits of splitting tables in this way.

[0] https://www.notion.so/blog/data-model-behind-notion

garrettf | 4 years ago | on: Lessons learned from sharding Postgres at Notion

Good question, that was an option. The main motivating factor here was that vacuums were beginning to take dangerously long: O(weeks) to complete, independent of the load on the database. While migrating spaces in segments would have reduced the number of records future vacuums need to scan, we were already running against the clock to complete one vacuum prior to TXID wraparound[0]. To kick off replication for specific spaces we would have needed to write our shard key to all data owned by those spaces. That would further contribute to TXID growth, and was not something we were comfortable doing.
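To make that clock concrete: Postgres transaction IDs are 32-bit counters, and autovacuum must freeze old tuples before any table's oldest unfrozen XID falls roughly 2^31 transactions behind the current one, or the database stops accepting writes. A back-of-the-envelope sketch of the runway calculation (the numbers are illustrative, not our actual figures):

```python
# Usable XID headroom before Postgres forces a shutdown to prevent
# wraparound: half of the 32-bit XID space.
WRAPAROUND_LIMIT = 2**31

def days_until_wraparound(oldest_xid_age: int, xids_per_day: int) -> float:
    """Days of runway left if no vacuum completes, given the current
    age of the oldest unfrozen XID and the daily XID burn rate."""
    remaining = WRAPAROUND_LIMIT - oldest_xid_age
    return remaining / xids_per_day

# Example: an oldest-XID age of 1.5B and ~20M XIDs consumed per day
# leaves roughly a month of runway, which is uncomfortably tight when
# a single vacuum takes O(weeks).
```

When the remaining runway is on the same order as the vacuum duration itself, any extra write amplification (like backfilling a shard key) eats directly into that margin.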

At the end of the day, this is something we could have explored in more depth, but we were ultimately comfortable with the risk tradeoff of migrating all users at once vs. the consequences of depending on the monolith for longer, largely thanks to the effort we put into validating our migration strategy.

[0] https://blog.sentry.io/2015/07/23/transaction-id-wraparound-...

garrettf | 4 years ago | on: Lessons learned from sharding Postgres at Notion

Howdy all, author here! Sharding our Postgres monolith was a huge undertaking for a small team. We got a big boost from folks who joined in the weeks before the switch-over date. If you're interested in thinking about problems like this, I'd love to chat. We're also hiring across all roles; check my profile for details.

I’m happy to answer questions about the project here, feel free to reply below.
