tytso | 3 years ago

Oh, you can certainly do big projects. My project[1] spanned 3 departments, involved dozens of engineers, and required that we work with multiple hard drive vendors (our first two partners for Hybrid SMR were Seagate and WDC) on an entirely new type of HDD, as well as with the T10/T13 standards committees so we could standardize the commands we needed to send to these HDDs[2]. So this was all a huge amount of "new shit" that was not only new to Google, it was new to the HDD industry. You just have to have a really strong business case that shows how you can save Google a large amount of money.

[1] https://blog.google/products/google-cloud/dynamic-hybrid-smr...

[2] https://www.t10.org/pipermail/t10/2018-September/018566.html

On the production kernel team, colleagues of mine worked on some really cool and new shit: ghOSt, which delegates scheduling decisions to userspace in a highly efficient manner[3]. It was published in SOSP 2021/SIGOPS[4][5], so peer reviewers thought it was a pretty big deal. I wasn't involved in it, but I'm in awe of the cool new work that my peers on the prodkernel team created, all of which was not only described in detail in peer-reviewed papers, but also published as Open Source.

[3] https://research.google/pubs/pub50833/

[4] https://www.youtube.com/watch?v=j4ABe4dsbIY

[5] https://dl.acm.org/doi/10.1145/3477132.3483542
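The core idea, the kernel forwarding task events to a userspace agent that replies with scheduling decisions, can be illustrated with a toy model. Everything below is a hypothetical sketch for intuition; it bears no resemblance to the real ghOSt API, and the names are made up:

```python
# Toy model of userspace scheduling: a "kernel" publishes task wakeups,
# and a userspace agent decides which task runs next. Hypothetical names;
# this is NOT the actual ghOSt interface.
from collections import deque

class UserspaceAgent:
    """Hypothetical agent implementing a trivial FIFO policy in userspace."""
    def __init__(self):
        self.runnable = deque()

    def on_task_wakeup(self, tid: int) -> None:
        # Kernel notifies the agent that a task became runnable.
        self.runnable.append(tid)

    def pick_next(self):
        # Agent's scheduling decision, sent back to the kernel.
        return self.runnable.popleft() if self.runnable else None

agent = UserspaceAgent()
for tid in (101, 102, 103):
    agent.on_task_wakeup(tid)
order = [agent.pick_next() for _ in range(3)]
print(order)  # FIFO order: [101, 102, 103]
```

The point of the real system is that the policy (here, a trivial FIFO) lives in an ordinary userspace process, so it can be changed, tested, and deployed without rebuilding the kernel.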

We have some really top-notch engineers on our production kernel team, and I'm very proud to be part of an organization that has this kind of talent.

vl | 3 years ago

I'm not saying storage didn't do big projects. I'm saying that over time it got calcified, and instead of doing proper stack refactoring and delivering features beneficial for customers, it sadly continued to chug along within team boundaries.

For example:

RePD sits at just the wrong level. It should have been built at the CFS/chunk level, which would have benefited other teams as well.

The BigStore stack is beyond bizarre. For years there were no object-level SLOs (not sure if there are now), which meant that sometimes your object disappeared and the BigStore SREs were all "la-la-la, we are fully within SLO for your project". Or you would delete something, your quota would not come back, and they would say "oh, the Flume job got stuck in this cell, for a week...".

Not a single cloud (or internal, for that matter) customer asked for a "block device"; they all just want to store files. Which means that cloud posix/nfs/smb should have been worked on from day 1 (of cloud), and we all know how that went.

tytso | 3 years ago

No one asked for a "block device"? Um, that's table stakes, because every single OS in the world needs to be able to boot, and that requires a block device. Every single cloud system provides a block device because if it weren't there, customers wouldn't be able to use their VMs, and you can be sure they would be asking for it. Every single cloud system has also provided from day one something like AWS S3 or GCE's GCS so users can store files. So I'm pretty sure you don't know what you are talking about.

As far as "proper stack refactoring" is concerned, again, the key is to make a business case for why that work is necessary. Tech debt can be a good reason, but doing massive refactoring just because it _could_ help other teams requires much more justification than "it could be beneficial". Google has plenty of storage solutions which work across multiple datacenters / GCE zones, including Google Cloud Storage, Cloud Spanner, and Cloud Bigtable. These solutions or their equivalents were available and used internally by teams long before they were available as public offerings for Cloud customers. So "we could have done it a different way because it might benefit other teams" is an extraordinary claim which requires extraordinary evidence. Speaking as someone who has worked in storage infrastructure for over a decade, I don't see the calcification you refer to, and there are good reasons why things are done the way they are which go far beyond the current org chart. There has been a huge amount of innovative work done in the storage infrastructure teams.

I will say that the posix/nfs/smb way of doing things is not necessarily the best way to achieve the lowest possible storage TCO. It may be the most convenient way if you need to lift and shift enterprise workloads into the cloud, sure. But if you are writing software from scratch, or if you are an internal Google product team using internal storage solutions such as Colossus, BigTable, Spanner, etc., it is much cheaper, especially if you are writing software that must be highly scalable, to use these technologies as opposed to posix/nfs/smb. All cloud providers, Google Cloud included, will provide multiple storage solutions to meet customers where they are. But would I recommend that a greenfield application start by relying on NFS or SMB today? Hell, no! There are much better 21st-century technologies available today. Why start a new project by tying yourself to such legacy systems, with all of their attendant limitations and costs?
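For what it's worth, the interface difference is easy to see in miniature. A POSIX file supports seek, partial rewrite, and rename, all of which are expensive to provide coherently across machines; an object store narrows the contract to whole-object put/get. Here's a toy sketch using a hypothetical in-memory class, not any real client API (GCS, S3, etc.):

```python
# Toy illustration of the object-store contract: whole-object put/get,
# no seek, no partial update, no rename. ObjectStore is a hypothetical
# in-memory stand-in, not a real storage client.

class ObjectStore:
    """Minimal put/get interface; the narrow contract is what lets
    real object stores scale out across datacenters."""
    def __init__(self):
        self._objects = {}

    def put(self, key: str, data: bytes) -> None:
        self._objects[key] = data   # a put replaces the entire object

    def get(self, key: str) -> bytes:
        return self._objects[key]

store = ObjectStore()
store.put("logs/2022-08-01", b"event-a\nevent-b\n")
data = store.get("logs/2022-08-01")
```

Because there is no in-place mutation, replication and caching become much simpler than for a POSIX file system, which is a big part of why greenfield scalable software tends to target this kind of interface.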