tytso | 3 years ago
[1] https://blog.google/products/google-cloud/dynamic-hybrid-smr...
[2] https://www.t10.org/pipermail/t10/2018-September/018566.html
On the production kernel team, colleagues of mine worked on some really cool and new shit: ghOSt, which delegates scheduling decisions to userspace in a highly efficient manner[3]. It was published in SOSP 2021/SIGOPS [4][5], so peer reviewers thought it was a pretty big deal. I wasn't involved in it, but I'm in awe of this cool new work that my peers in the prodkernel team created, all of which was not only described in detail in peer-reviewed papers, but also published as Open Source.
[3] https://research.google/pubs/pub50833/
[4] https://www.youtube.com/watch?v=j4ABe4dsbIY
[5] https://dl.acm.org/doi/10.1145/3477132.3483542
We have some really top-notch engineers in our production kernel team, and I'm very proud to be part of an organization that has this kind of talent.
vl|3 years ago
For example:
RePD is just at the wrong level altogether. It should have been at the CFS/chunk level, and would thus have benefited other teams as well.
The BigStore stack is beyond bizarre. For years there were no object-level SLOs (not sure if there are now), which meant that sometimes your object disappeared and BigStore SREs were "la-la-la, we are fully within SLO for your project". Or you would delete something and your quota would not come back, and they would say "oh, a Flume job got stuck in this cell, for a week...".
Not a single cloud (or internal, for that matter) customer asked for a "block device"; they all just want to store files. Which means that cloud posix/nfs/smb should have been worked on from day 1 (of cloud), and we all know how it went.
tytso|3 years ago
As far as "proper stack refactoring" is concerned, again, the key is to make a business case for why that work is necessary. Tech debt can be a good reason, but doing massive refactoring just because it _could_ help other teams requires much more justification than "it could be beneficial". Google has plenty of storage solutions which work across multiple datacenters / GCE zones, including Google Cloud Storage, Cloud Spanner and Cloud Bigtable. These solutions or their equivalents were available and used internally by teams long before they were available as public offerings for Cloud customers. So "we could have done it a different way because it might benefit other teams" is an extraordinary claim which requires extraordinary evidence. Speaking as someone who has worked in storage infrastructure for over a decade, I don't see the calcification you refer to, and there are good reasons why things are done the way they are which go far beyond the current org chart. There has been a huge amount of innovative work done in the storage infrastructure teams.
I will say that the posix/nfs/smb way of doing things is not necessarily the best way to provide the lowest possible storage TCO. It may be the most convenient way if you need to lift and shift enterprise workloads into the cloud, sure. But if you are writing software from scratch, or if you are an internal Google product team which is using internal storage solutions such as Colossus, BigTable, Spanner, etc., it is much cheaper, especially if you are writing software that must be highly scalable, to use these technologies as opposed to posix/nfs/smb. All cloud providers, Google Cloud included, will provide multiple storage solutions to meet the customer where they are at. But would I recommend that a greenfield application start by relying on NFS or SMB today? Hell, no! There are much better 21st century technologies that are available today. Why start a new project by tying yourself to such legacy systems with all of their attendant limitations and costs?