
Highly Available Block Storage

262 points | dineshp2 | 9 years ago | digitalocean.com

140 comments

[+] wiremine|9 years ago|reply
Spun one up and ran some quick numbers on a 100GB volume:

root@ubuntu-1gb-nyc1-01:~# time dd if=/dev/disk/by-id/scsi-0DO_Volume_volume-nyc1-01 of=test.dat bs=1024 count=10000000
10000000+0 records in
10000000+0 records out
10240000000 bytes (10 GB) copied, 58.0655 s, 176 MB/s

real 0m58.248s
user 0m2.608s
sys 0m41.604s
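For what it's worth, the reported rate checks out against the byte count and wall time; a quick sanity check using only the numbers printed above:

```shell
# Recompute dd's throughput: 10240000000 bytes over 58.0655 s,
# in dd's decimal megabytes (10^6 bytes).
awk 'BEGIN { printf "%.0f MB/s\n", 10240000000 / 58.0655 / 1000000 }'
# prints "176 MB/s", matching dd's own summary line
```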

Some quick observations:

* Easy to add one when creating a droplet; by default they offer preset sizes of 100GB, 250GB, 500GB, 1000GB, and 1.95TB, and it's also really easy to specify a custom size.

* You can resize in any increment; it took about 4 seconds to go from 100GB to 110GB with no downtime. You obviously need to resize/manage the filesystem on the mounted volume yourself.

* [Edit 1] Deleting the droplet does NOT destroy the volume. Worth keeping in mind when you spin them up/down.

* [Edit 2] Remounting an existing volume to a new droplet was quick and painless.

[+] olavgg|9 years ago|reply
Thank you for telling us these details.

I can't help but note that you benchmarked the drive with a block size of 1024. The minimum block size of a modern SSD is 4096; anything lower just causes unnecessary load. Also, if you want to benchmark a drive or block storage, I highly recommend using fio.
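As a starting point for that fio suggestion, here is a minimal job file. The parameters are illustrative, the device path is the DO volume from the benchmark above, and readonly avoids clobbering any data on it:

```ini
; Random-read test at the SSD's native 4k block size, using direct I/O
; to bypass the page cache. Run with: fio do-volume.fio
[do-volume-randread]
filename=/dev/disk/by-id/scsi-0DO_Volume_volume-nyc1-01
rw=randread
bs=4k
direct=1
ioengine=libaio
iodepth=32
runtime=60
time_based=1
readonly=1
```

fio reports IOPS and latency percentiles, which matter far more for network block storage than the sequential MB/s figure dd gives you.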

[+] Thaxll|9 years ago|reply
Don't use dd for those tests; it's really bad, especially on VMs.
[+] bjacobel|9 years ago|reply
Reminder that just a few weeks ago DigitalOcean rolled over on one of their customers and took down 38,000 websites after receiving a claim of infringement from the NRA against a parody site hosted on surge.sh:

http://motherboard.vice.com/read/nra-complaint-takes-down-38...

[+] corobo|9 years ago|reply
Reminder that you have to act on abuse notifications sharpish. You're providing a service; it's on you if you ignore abuse notices.

"We received notice on behalf of a trademark holder that a customer of DigitalOcean was hosting infringing content on our network. DigitalOcean immediately notified our customer of the infringement, and the customer was given a five day period to resolve the issue. The infringing content was not removed within the specified period even though several notifications were issued. Per DigitalOcean’s terms of service, a final reminder was issued to our customer and, when no action was taken, access to the content was disabled. The infringing content was subsequently removed by the customer and all services were restored in less than two hours."

[+] 15155|9 years ago|reply
I haven't forgotten about their little "we don't automatically zero-out SSDs, it's a feature, the customer is wrong" issue.

I also haven't forgotten about their private blog censorship debacle.

DO is not a serious player and should not be trusted.

[+] treehau5|9 years ago|reply
Just a reminder that most businesses in this position would do the exact same thing; the problem is with the policy and its enforcement.

Get out and vote for the right people this November.

[+] mark212|9 years ago|reply
exactly why I moved off of Digital Ocean a couple of years ago, after a similar incident.
[+] happyslobro|9 years ago|reply
DMCA takedown: the new orbital ion cannon
[+] dastbe|9 years ago|reply
Don't be confused: the article makes the mistake of comparing DO's new block storage service with other companies' object stores. EBS is the competitor to this, not S3. Same for GCE persistent disks and Azure drives.

Unfortunately this means the pricing comparison is just wrong.

[+] mwcampbell|9 years ago|reply
I think this might be a mistake. Ever since Joyent's commentary on one of the big Amazon EBS failures in 2011 [1] [2] [3], I've been suspicious of all network-attached block storage. Then again, I haven't heard of any big EBS failures recently; I wonder what changed.

[1]: https://www.joyent.com/blog/on-cascading-failures-and-amazon...

[2]: https://www.joyent.com/blog/magical-block-store-when-abstrac...

[3]: https://www.joyent.com/blog/network-storage-in-the-cloud-del...

[+] boulos|9 years ago|reply
Network block storage isn't inherently broken; the initial EBS implementation was frankly just unreliable.

We've not had anything like those dark days with Persistent Disk. It's still true that putting your storage across the network exposes you to networking failures taking out your storage, but the gain in durability and maintainability pays for it (in our case, live migration would just be crazy with local spinning disks; we tried it, and it didn't work).

Disclaimer: I work on GCE, and we want your business ;)

[+] Mister_Snuggles|9 years ago|reply
This is EXACTLY the thing that I need for one of my droplets! I love how there is nothing "special" about it - it's just a disk that you can attach to a droplet. I'm sure that under the hood there's some kind of magic going on, but it looks like it's nicely abstracted away. This is what I hoped block storage would turn out to be - here's a block device, use it like one.

As soon as this rolls out to the region I've got that droplet in, I'm going to pull the trigger on it. I might even spend the effort to migrate my droplet to a supported region just to get this.

[+] mikeash|9 years ago|reply
Same here. My $5/month droplet is sufficient for all my needs, except it's a bit limiting on storage. $2/month for an extra 20GB, doubling my storage, would be great. I don't want to migrate anything and I don't care about more CPU or RAM, I just want more space!
[+] 3pt14159|9 years ago|reply
I have been asking for non-SSD on DO for a long time now. My heart jumped when I saw the HN title, only to be dashed on the rocks.

What are us data nerds supposed to do? We want to take 10 terabytes, run a batch process on it, keep the 20TB, then continue with about 5GB of working data until the next month's terabyte comes in, then batch through the 21TB. Right now the price slider doesn't even go up to 21TB, and clicking the "need more storage" button doesn't go anywhere, but I'm assuming it would be $2100/month, which is more than 3x as expensive as vanilla S3.
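The 3x figure is roughly right under the pricing in this thread: $0.10/GB/month for block storage versus S3 standard at roughly $0.03/GB/month (the S3 rate here is an assumption for the era, and ignores S3's request and transfer charges):

```shell
# 21 TB = 21000 GB (decimal), both priced per GB per month
awk 'BEGIN {
  do_cost = 21000 * 0.10   # DO block storage at $0.10/GB
  s3_cost = 21000 * 0.03   # assumed S3 standard rate of $0.03/GB
  printf "DO: $%.0f  S3: $%.0f  ratio: %.1fx\n", do_cost, s3_cost, do_cost / s3_cost
}'
# prints "DO: $2100  S3: $630  ratio: 3.3x"
```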

[+] Veratyr|9 years ago|reply
You're looking at the wrong market.

At Hetzner, you can rent a dedicated server with 2x3TB drives for about $25/month (around $0.005/GB; to scale to multiple servers, use Ceph), and a few larger machines for under $100. At OVH you can buy object storage for around 1c/GB and rent a few dedicated servers for less than $100/month, or use their cloud.

If you go to the _really_ low end, time4vps will give you a 1TB VPS for 2EUR/month, if you pay for 2 years (and they give you 4x the storage as bandwidth).

I don't work for any of these companies but I have services with all of them.

[+] boulos|9 years ago|reply
Not being snarky, but at that scale why would you still be using Digital Ocean? Are you processing this dataset with just a single droplet?

Are you going to keep each month's 10 TiB forever? Something like Nearline is going to be a much better fit even if you process the data and have to pay the $0.01/GB retrieval fee.

Disclaimer: I work on compute engine (so of course I want your money), but I'm honestly curious.

[+] KaiserPro|9 years ago|reply
For the price of 16TB plus, real steel becomes very appealing.

You've got to remember that block storage is >> faster than S3/HTTP. It also has significantly lower latency.

Also if you want fast processing you need to start looking at shared filesystems/NFS

[+] raiyu|9 years ago|reply
You can actually create multiple volumes and then stripe them together so you should be able to get above 16TB with a bit of work.

The reason we focus on SSD exclusively for most of our services is that there are obvious performance benefits and as we look into the future the prevalence of spinning disk will continue to decrease over time.

That being said, if we are only providing SSD there are going to be customer use cases where it's not going to make sense, but that's also part of looking into the future.

So you should be able to create a larger striped volume to run through your batch processing and then depending on how you plan to access that data afterwards you can ship that off to an object store if it's going to be accessed infrequently.
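For reference, the striping raiyu describes can be done with plain LVM once the volumes are attached. A rough sketch, with hypothetical device paths (check /dev/disk/by-id/ for your actual volumes); this must run as root and destroys any existing data on the volumes:

```shell
# Two attached DO volumes, pooled and striped into one logical volume.
pvcreate /dev/sda /dev/sdb                       # mark both as LVM physical volumes
vgcreate vg_data /dev/sda /dev/sdb               # pool them into one volume group
lvcreate -n lv_data -l 100%FREE -i 2 vg_data     # -i 2 stripes writes across both PVs
mkfs.ext4 /dev/vg_data/lv_data                   # format and mount the striped volume
mount /dev/vg_data/lv_data /mnt/data
```

Striping trades redundancy for capacity and throughput: losing either underlying volume loses the whole logical volume, so it fits the batch-then-archive workflow described above better than long-term storage.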

[+] jclulow|9 years ago|reply
If you have a lot of data that you want to ingest and store, without keeping it in an active instance, you might want to take a look at Joyent Manta[1]. We allow you to run compute jobs directly on the storage servers themselves -- basically anything you can do in a UNIX environment with a script you can run against an object in the store, from "grep" up to "python" and beyond. If you were to select "single copy" storage, the public service pricing[2] might be more interesting for you; if you grow too big or want to move onto your own hardware, the stack is open source[3] and/or we can sell you support!

Disclaimer: I work at Joyent.

[1]: https://www.joyent.com/manta

[2]: https://www.joyent.com/pricing/manta

[3]: https://github.com/joyent/manta

[+] mattbee|9 years ago|reply
At Bytemark we've done network storage as standard since 2012, so we can do 100GB of higher-capacity storage for £2/month, and all discs are expandable / migratable online: bigv.io/prices
[+] mikeash|9 years ago|reply
"All SSD all the time" is kind of DO's thing. There are lots of other VPS providers with round storage if that's what you're after.
[+] andybak|9 years ago|reply
This helps me with a nicer deployment setup. I was always keen on 'rebuild from scratch' rather than 'update stuff and hope you're idempotent and have captured all changes' but transitory data was always the problem. Now I can start building a new updated droplet and the only downtime will be that needed to detach and reattach the block storage containing the db etc.

Anyone see a flaw in this? (I know there are other ways to achieve similar benefits - my files could be on S3 and the database could be a separate droplet etc but these introduced various drawbacks and added complexity)

[+] Jedd|9 years ago|reply

  > Anyone see a flaw in this?
Perhaps not a flaw, but some issues with your setup are implied.

If you're rebuilding from scratch because you're not sure that you can update things, then you're probably in need of a configuration management tool (I'm a big fan of Saltstack[1], mostly because I don't like Ruby or DSLs, but there are lots of options out there[2]).

If you're worried you're going to lose transitory data, it sounds like you don't have a trusted and tested backup/archival/recovery process in place. So having it stored on a single EBS / DO BS / etc means you're still exposed. If you're rebuilding and rolling data over, in this scenario, I'd be copying, rather than relocating, any precarious data repositories.

[1] https://saltstack.com/ [2] https://en.wikipedia.org/wiki/Comparison_of_open-source_conf...

[+] skrowl|9 years ago|reply
I like their straightforward pricing. $0.10 USD per GB per month. No IOPS limits.

That said, how do you prevent a rogue droplet from going crazy and hogging up all of the SSD I/O?

[+] mdasen|9 years ago|reply
Google doesn't charge for IOPS either, but they do explain how things are limited. For example, speed in MB/s is limited by the instance size so that you don't use the entire network link for your connection to the disk. IOPS are limited by the size of the disk. If you're sharing the disk with other people, you're going to be limited in your IOPS. For example, if you get 10GB of SSD storage, you'll get 300 random IOPS, but another user with 50GB of storage will get 1,500 IOPS because they're renting a larger portion of the disk.
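The numbers in that example imply a simple linear rule, roughly 30 random IOPS per GB of SSD persistent disk; treat the constant as approximate and era-specific, but it matches both figures given:

```shell
# IOPS scale linearly with provisioned size: ~30 random IOPS per GB.
for gb in 10 50 100; do
  awk -v g="$gb" 'BEGIN { printf "%d GB -> %d IOPS\n", g, g * 30 }'
done
# 10 GB -> 300, 50 GB -> 1500, 100 GB -> 3000
```

So renting a bigger disk than you need for capacity alone can be a legitimate way to buy performance.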

If DigitalOcean isn't doing any limiting, they'll have to in order to prevent the situation you're describing. You don't have to charge for it to prevent bad neighbors.

https://cloud.google.com/compute/docs/disks/performance

[+] zbjornson|9 years ago|reply
Unless I missed it, all I saw was "no need for complicated formulas to determine the overall cost for transactions or IOPs limit," which I do not read as "no IOPS limit." I was looking but could not find any word on performance.
[+] marcstreeter|9 years ago|reply
There are limits - they are marketing their simplicity by avoiding complicated pricing schemes. This is typical DO, and I see that as a good thing especially if you're just dipping your feet into the concept. The past year or two that we've had block storage at Codero we've been marketing the ability to set your own IOPS with even our smallest blocks (1000/2000/3000 iops). That said I'd be interested to know exactly what performance they offer because they are definitely cheaper.
[+] askmike|9 years ago|reply
They might limit IO speed (based on usage of previous 24h for example). Like how most cheaper AWS EC2 instances get their CPUs throttled after you're running CPU heavy stuff.
[+] koolba|9 years ago|reply
This has been a much requested feature and I'm sure it will be very popular. I'm still reminded of this quote though:

"He was a bold man that first ran a production database on a brand new block storage service!"

[+] johnwheeler|9 years ago|reply
I love how DO focuses on what matters the most: Inexpensive VMs and scalable block storage.

If I had to pick two, those would be them!

[+] brianwawok|9 years ago|reply
Well, except being 4 years late to the block storage game? Seems they have an uphill fight. AWS and GCE match them on low-end droplet price and offer much more. No local SSD, but not sure what % of apps really need local SSD. DO effectively forces you to pay for local SSD for all of your servers.
[+] aibottle|9 years ago|reply
Thank god! Highly Available Block Storage. From Digital Ocean. Great! Now I can finally store all of the 300MB/s streaming in on my server. Oh wait. I cannot, because DO cancelled the service again. Bummer.
[+] cgag|9 years ago|reply
Sweet. This makes digitalocean much more appealing as a potential substrate for a kubernetes cluster.
[+] pstadler|9 years ago|reply
I just migrated my Kubernetes/Rancher stack from NYC2 to NYC1 in order to use block storage. Eager to see whether this plays well with GlusterFS.
[+] Mister_Snuggles|9 years ago|reply
I can't wait for this to roll out to more regions.

This is EXACTLY the thing I need for some stuff I'm working on!

[+] scurvy|9 years ago|reply
What's the backend? Ceph?
[+] marcstreeter|9 years ago|reply
Ceph is not block storage -- Ceph is object storage
[+] mrmondo|9 years ago|reply
How is it taking huge cloud providers so long to catch up with things we do self hosted every day? It obviously has to be well engineered, yet it's relatively simple. Woefully poor performance too.
[+] simos|9 years ago|reply
Some early benchmarks about the new block storage, https://simos.info/blog/trying-out-lxd-containers-on-ubuntu-...

I did not get good speeds and I am wondering why that may be...

[+] zbjornson|9 years ago|reply
> The immediate benefits are that the latency is much lower with the new block storage

I think you might have misread the units: locally attached is 50105 us (50 ms) vs 546 ms for block storage.

The throughput numbers are on par with AWS and GCP block storage. This seems reasonable aside from the high latency.

[+] drtse4|9 years ago|reply
A bit pricey for the long term, but great if you just need to add some disk space to your VM and don't need the other improvements more expensive VMs give you.

I use DO mostly to compile stuff on Linux when I don't have access to a physical server, and storage size is always a problem.

[+] marcstreeter|9 years ago|reply
The purpose of block storage in this instance isn't giving your VM/droplet more space. It's separation: any data on that device can be attached to another VM/droplet. It would probably be more cost effective just to upgrade the VM/droplet if space were the only concern. That's at least how we've marketed the same feature for the past year or two via Codero's portal. Not to say I don't like how DO has entered the space: keeping it simple.