What does "Reliability of 99.9999999999% (twelve 9s)" even mean? Obviously you have to exclude large classes of user-visible failures (network outage, account over quota) to achieve that. I don't think they're claiming less than 0.00000000001% chance of a zombie apocalypse/Mad Max/ex Machina/asteroid impact end-of-times situation. So just what failures are counted?
For comparison, public telephony systems aimed for five 9s. That was usually expressed as "20 minutes downtime over 40 years, combined hardware and software budget, for outages affecting more than 32 users." One software crash requiring human intervention would count for more than 20 minutes, so you were allowed <1 of these in 40 years system lifetime.
That's a claim of the durability of the data, or the odds that a chunk of data will be lost in a year.
They calculate this from the odds that a single node fails, multiplying those odds out across all replicas. This covers the most easily quantifiable failure mode.
Obviously the real odds are somewhat higher when you consider that a rogue admin, malicious actor, or buggy code could delete multiple instances of replicated data at once. There's no way to estimate these odds though, and really they don't matter - they're big enough events that they could spell the end of Dropbox if they happened.
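To make that replication arithmetic concrete, here's a minimal sketch. The failure probability and replica count below are made-up illustrative numbers, not Dropbox's actual figures: if each replica of a chunk is lost independently within a year with probability p, then losing all r replicas has probability p^r.

```go
package main

import (
	"fmt"
	"math"
)

// chunkLossProb models the naive independence assumption: a chunk is
// lost only if all r of its replicas fail within the year. Correlated
// failures (rogue admin, buggy deletion code) break this model.
func chunkLossProb(p float64, r int) float64 {
	return math.Pow(p, float64(r))
}

func main() {
	// Assumed, illustrative numbers -- not Dropbox's actual figures.
	p := 1e-4 // annual probability of losing one replica of a chunk
	r := 3    // number of independent replicas

	pLoss := chunkLossProb(p, r)
	fmt.Printf("P(loss) = %.0e, durability = %.12f\n", pLoss, 1-pLoss)
}
```

With these made-up inputs the result is twelve 9s of durability, which shows how much heavy lifting the independence assumption does in such claims.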
The video linked by kylequest below [1] speaks about durability, not reliability:
"Create a system that provides annual data DURABILITY of 99.9999999999%. Create a system with availability of over 99.99%"
>For comparison, public telephony systems aimed for five 9s. That was usually expressed as "20 minutes downtime over 40 years, combined hardware and software budget, for outages affecting more than 32 users." One software crash requiring human intervention would count for more than 20 minutes, so you were allowed <1 of these in 40 years system lifetime.
And all of that is totally bogus (the "aim", not your information), as no public telephony system (certainly not in my country) ever had anything close to that.
A few hours of downtime a few times a year is much more like it, although it has been getting better over time.
The PSTN and similar systems do target five-9s, but fortunately that only requires keeping it to ~20 minutes downtime over 4 years. ~20 minutes over 40 years would be six-9s.
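For anyone who wants to check that arithmetic, the downtime budget implied by an availability target a over y years is just (1 - a) × y × minutes-per-year:

```go
package main

import "fmt"

// budgetMinutes returns the downtime allowed by availability target a
// (e.g. 0.99999 for five 9s) over the given number of years.
func budgetMinutes(a, years float64) float64 {
	const minutesPerYear = 365.25 * 24 * 60
	return (1 - a) * years * minutesPerYear
}

func main() {
	fmt.Printf("five 9s over  4 years: %.0f minutes\n", budgetMinutes(0.99999, 4))
	fmt.Printf("six 9s  over 40 years: %.0f minutes\n", budgetMinutes(0.999999, 40))
}
```

Both work out to about 21 minutes, matching the ~20-minute figures above.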
I feel like they mean resiliency, not reliability. I could see twelve 9s of resiliency with them factoring it based on x amount of data stored for y days. There's zero chance they could claim that level of reliability, for the reasons you mentioned among others.
> What does "Reliability of 99.9999999999% (twelve 9s)" even mean?
Sort of worst case: what it could mean is that every hour, the system reboots and this process takes 10^-12 of an hour, which doesn't seem like much, but you'd have to restart your client as well, which may take longer and is annoying, and you could lose data. So basically, the system would be useless :)
Hmm, I'm not sure what this means. Is it saying Go is a productive language, or just that you'll master Go quickly and reach peak Go productivity quickly?
Both, really. Go's so small you can be quite proficient with it in months. While it's not as productive as, say, Python (wrt how quickly you can get your code up and running), it's much quicker than other languages (and nicer to use in the long run).
I think it's only slightly ambiguous. It doesn't seem to be making a claim about mastering go or peak go productivity. It just says it's easy to be productive (not maximally productive, just "productive"). Which seems to be borne out by the huge amount of new code written in go just in the past few years.
Go is a very productive language in several senses, in my eyes. It is a language you can master quickly, and it is very productive in day-to-day work. There, it is important to consider overall productivity: there are certainly languages that let you implement something a little more quickly, but you also have to consider the effort of maintaining and further extending a program. For long-term maintenance, Go especially shines. It is easy to come back to some software and pick up modifying it again, and with properly set up packages, Go software tends to be maintainable and extensible.
Oh, I thought Dropbox was using Rust instead of Go for a lot of things, but maybe they ended up using both. I can see why they'd want to move to either Rust or Go, since from what I understand they used to be mostly Python for everything.
Last I checked, server-side Rust usage at Dropbox is reserved for the very bottom of the stack, for the bits that are performance-sensitive enough that the alternative would have demanded they be written in C++. Apparently there's a significant amount of Rust in Dropbox's Windows client as well, though I don't know the story there...
I have an off-topic question: This is the second company (after GitLab) I see with an “about” subdomain. Is this a new trend of using “about.x.com” for the marketing website and “x.com” for the web app? Is there a blog post or discussion about this?
At GitLab we first used www. for the marketing site and the apex domain for the app, but many people assumed they would have the same content. That is why we introduced about. Cool to see we might have started a trend.
Browsers are thankfully highlighting the https-verified part of the URL (hostname) relative to the rest, so that a "paypal.com.fake.com" phishing attack is easier to spot. It was just a matter of time before UX people would put that highlighting to creative use. I like it.
On the technological side, I guess separate hostnames might make some ops things a little easier. But that alone can hardly be the reason. Easier and good looks can. Also: In a large scale outage, an about.x.com that is not running on your main cloud provider could be valuable for status updates, because far more people would know about about.x.com than about some status.x.com you might have if your "about" content was on the main hostname.
Anyone know where in the talk it mentions "Debugging tools (mostly!) work well"?
I'm skipping back and forth through it, but the talk isn't in the same order as this article, which makes it very difficult without watching the whole talk from start to end.
Asking because debugging is a pain point I've been having with Go for a few months, so am surprised to see it described as mostly working well. I'd like to get my debugging experiences to at least that level of "(mostly) working well". :D
Sad that Rust advocates are not learning. This sort of comment is what drives people away from Rust. Stop ramming your stuff down other people's throats. Go build that exclusively Rust based Dropbox clone that outperforms Dropbox and show how well Rust performs in that situation.
Rust has trade-offs just like Go has trade-offs. Being honest about the deficiencies of one's chosen platform is a good thing; it helps to keep you sharp and to avoid problems associated with those deficiencies.
Besides having an over-zealous community that posts off-topic comments all over threads that have nothing to do with Rust, Rust has deficiencies too.
Note also that Dropbox is already using Rust in some places.
Yeah, Go has the worst possible model for concurrency there is - shared memory multithreading. Hopefully more and more companies will realize how bad this model really is and start looking into languages with decent concurrency models, like Erlang and Elixir or at least stick to event loops.
I think that is, in part, because it's a summary of a talk. So it's written a bit awkwardly and probably misses some of the more interesting in-depth details from the actual talk. It does seem on the surface, though, like a good text summary, having not watched the actual talk.
On the flip side, I have been frustrated, particularly at linux.conf.au last year, at how "not deep" many of the talks were. Having done quite a few talks over many years at similar conferences, it's actually quite hard to nail something technical and be entertaining at the same time. Someone who nails that quite consistently is Aaron Patterson (from the Ruby/Rails world). Watch some of his talks on YouTube. I aspire to produce more content on a level similar to his: combining a good, entertaining presentation with actually educating an audience about the technical, non-obvious details of something they probably didn't know, and something that was relevant in a practical project he took on. Working on it, not there yet...
tlb | 8 years ago
agrajag | 8 years ago
sllabres | 8 years ago
[1] https://youtu.be/5doOcaMXx08?t=220
coldtea | 8 years ago
tc | 8 years ago
pebers | 8 years ago
FWIW Amazon make a similar claim of 11 9s for data durability on S3: https://aws.amazon.com/s3/faqs/
oconnor663 | 8 years ago
tw04 | 8 years ago
amelius | 8 years ago
kylequest | 8 years ago
didibus | 8 years ago
barsonme | 8 years ago
SwellJoe | 8 years ago
_ph_ | 8 years ago
0xCMP | 8 years ago
Cool that they use Go a lot.
kibwen | 8 years ago
nsm | 8 years ago
(I work at Dropbox)
didibus | 8 years ago
0xFFC | 8 years ago
mostafah | 8 years ago
sytse | 8 years ago
usrusr | 8 years ago
greenhouse_gas | 8 years ago
justinclift | 8 years ago
ctrlrsf | 8 years ago
apta | 8 years ago
> Data races are the hardest type of bug to debug, spot, fix, etc.
Exactly what Rust aims at preventing. Sad to see that the industry is not learning.
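For concreteness, here's a hedged sketch of the kind of bug being discussed. In Go, the unsynchronized version of this counter compiles without complaint and only the runtime race detector (`go run -race`) catches it, whereas Rust rejects the equivalent unsynchronized sharing at compile time.

```go
package main

import (
	"fmt"
	"sync"
)

// count increments a shared counter from n goroutines. The mutex makes
// it correct; delete the Lock/Unlock pair and it becomes a data race
// that the compiler never flags -- only `go run -race` reports it.
func count(n int) int {
	var (
		wg sync.WaitGroup
		mu sync.Mutex
	)
	c := 0
	for i := 0; i < n; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			mu.Lock()
			c++ // unsynchronized, this read-modify-write loses updates
			mu.Unlock()
		}()
	}
	wg.Wait()
	return c
}

func main() {
	fmt.Println(count(1000)) // 1000 with the lock; unpredictable without it
}
```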
jacquesm | 8 years ago
didibus | 8 years ago
general_pizza | 8 years ago
Perceptes | 8 years ago
zzzcpan | 8 years ago
falcolas | 8 years ago
That said, Go is the best of the C-style concurrency breed, having typed message passing and green threads built into the language.
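A minimal illustration of what "typed message passing and green threads" means in practice (the names here are made up for the example): goroutines communicating over a typed channel instead of touching shared state directly.

```go
package main

import "fmt"

// square reads ints from jobs and sends their squares to results,
// closing results once the input channel is exhausted.
func square(jobs <-chan int, results chan<- int) {
	for n := range jobs {
		results <- n * n
	}
	close(results)
}

func main() {
	jobs := make(chan int)
	results := make(chan int)

	go square(jobs, results) // a goroutine ("green thread") owns the work

	go func() {
		for i := 1; i <= 3; i++ {
			jobs <- i
		}
		close(jobs)
	}()

	for r := range results {
		fmt.Println(r) // 1, 4, 9 in order
	}
}
```

The channel types (`<-chan int`, `chan<- int`) are checked at compile time, which is the "typed" part of the claim.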
twic | 8 years ago
http://catb.org/esr/intercal/ick.htm#Multithreading-using-CO...
innocentoldguy | 8 years ago
dan-compton | 8 years ago
lathiat | 8 years ago
TRManderson | 8 years ago
Cut them some slack.
nolanpro | 8 years ago
Y'all too young to remember OG Gopher: https://en.wikipedia.org/wiki/Gopher_(protocol)