item 19508243

New Amazon S3 Storage Class – Glacier Deep Archive

85 points | nnx | 7 years ago | aws.amazon.com | reply

42 comments

[+] Twirrim|7 years ago|reply
It's a little irritating this is only available via the S3 API rather than something I can just move my archive to. I guess maybe it's time to upload to S3 and finally retire my Glacier archive after ~5 years.

Reading between the lines, I wonder if they're really trying to just deprecate the Glacier API, and move to Glacier being a different storage tier in S3. Which is probably what it should have been in the first place. AWS will likely never actually retire the Glacier API (much like SimpleDB has never actually been retired), it'll just hang on, not receiving new features.

[+] sly010|7 years ago|reply
> [...] move to Glacier being a different storage tier in S3

I am almost certain they are playing catch-up here, given that GCP has had this feature forever (the "coldline" storage class). Last time I checked it was more expensive, though.

[+] donavanm|7 years ago|reply
Don't forget to access your archive over the S3 BitTorrent endpoint.
[+] nerdponx|7 years ago|reply
Call me silly, but this seems like a great option for backing up nonessential personal data like a movie collection. It's even cheaper than Backblaze B2, and the limitations (180 day minimum storage, 12 hour retrieval time) don't seem like bad tradeoffs.
[+] guitarbill|7 years ago|reply
Just the thing I was looking for to back up my NAS offsite. Although I'm not sure how to test the backup cost-effectively, so I'm treating it as separate from my 3-2-1 strategy for now.
[+] MHordecki|7 years ago|reply
Bandwidth is between 5x and 9x more expensive than B2's, though.
[+] chucky_z|7 years ago|reply
For backing up something personal, why don't you just use actual Backblaze?
[+] planetjones|7 years ago|reply
1 USD per month per TB. Finally a storage tier that allows me to back up my media collection for what’s almost loose change.
[+] Elect2|7 years ago|reply
But every time you download your 1 TB of media, you pay roughly $100 in transfer and request fees. At that cost you could nearly buy a 1 TB SSD.
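A back-of-the-envelope version of that claim, using rates assumed from AWS's published pricing at the time (exact numbers vary by region and retrieval tier):

```python
# Assumed rates for illustration; check the current AWS price sheet.
EGRESS_PER_GB = 0.09       # standard AWS data-transfer-out
RETRIEVAL_PER_GB = 0.0025  # Deep Archive bulk retrieval

gb = 1 * 1024  # 1 TB of media
cost = gb * (EGRESS_PER_GB + RETRIEVAL_PER_GB)
print(f"Downloading 1 TB costs roughly ${cost:.2f}")
```

That's with the cheapest (bulk) retrieval tier; faster retrieval tiers push the total past $100.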
[+] CaliforniaKarl|7 years ago|reply
Excluding API call charges, retrieval fees, and transfer out charges:

If my math is correct, for N. California, 100 TB in Deep Archive is $204.80 per month, vs. $512 per month in regular Glacier.

For other non-Gov US regions, 100 TB in Deep Archive is $101.376 per month, vs. $409.60 per month in regular Glacier.

[Edit: Addl. pricing info]
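The per-GB-month rates those totals imply can be back-solved in a few lines (the rates below are my assumptions, reverse-engineered from the figures above, not official pricing):

```python
GB = 100 * 1024  # 100 TB expressed in GB

# Assumed per-GB-month rates implied by the parent's totals.
rates = {
    "N. California": {"deep_archive": 0.002, "glacier": 0.005},
    "other US": {"deep_archive": 0.00099, "glacier": 0.004},
}

monthly = {
    region: {tier: GB * rate for tier, rate in tiers.items()}
    for region, tiers in rates.items()
}

for region, m in monthly.items():
    print(f"{region}: Deep Archive ${m['deep_archive']:.2f}/mo "
          f"vs Glacier ${m['glacier']:.2f}/mo")
```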

[+] CaliforniaKarl|7 years ago|reply
This pricing is finally at the point where I can start considering this for some of the research labs I support.
[+] patrickg_zill|7 years ago|reply
If you have to retrieve all of it, what is the cost?
[+] teraflop|7 years ago|reply
It's about half that cost in the us-east regions.
[+] azinman2|7 years ago|reply
Off topic, but did anyone else press play to listen to Polly’s TTS of this article? They seem to have added “inhales” to ostensibly make it more natural, but the pauses and inhales (let alone the quality of the speech) are so off the mark it just sounds really odd. If your voice sounds robotic already, a fake breath only makes it worse.
[+] jjeaff|7 years ago|reply
Ya, weird. The breathing isn't really integrated with the speech. It sounds like someone on a breathing machine, like Christopher Reeve when he would pause as his machine took a breath for him.
[+] ttul|7 years ago|reply
I imagine Polly suffers from halitosis.
[+] DoctorPenguin|7 years ago|reply
Isn't Glacier the product that seems very cheap if you only look at the storage cost, but is extremely expensive when retrieving the data? I remember reading an article by someone who had to pay something like 2000 USD to restore his data, which wasn't even in the terabytes. Although I might be mistaken here.
[+] dfrage|7 years ago|reply
Amazon says the original retrieval pricing matched their cost, but was hard to grok and easy to run up a huge bill if you didn't use it carefully and extract data patiently.

They realized that was a mistake and significantly streamlined the pricing, and with this Deep Archive product it doesn't look like they're even supporting the original, somewhat opaque Glacier API, just S3.

If you're patient and can wait 48 hours for data, bulk retrieval is cheap at $0.0025/GB and $0.025 per 1,000 requests. The standard AWS $0.09/GB egress is the really big cost, but with enough data you can mitigate that with a Snowball. That's not a big issue if you're recovering from a catastrophe that destroyed all your local backups, so it looks great as insurance for those of us with modest time-to-recovery requirements.
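Using the rates quoted above, a rough cost model for a bulk restore might look like this (the average object size and the Snowball scenario are my assumptions, not part of the quoted pricing, and the Snowball device fee itself is excluded):

```python
def bulk_restore_cost(total_gb, avg_object_gb=1.0, egress=True):
    """Rough Deep Archive bulk-restore cost, using rates quoted in the thread."""
    retrieval = total_gb * 0.0025                        # $0.0025/GB bulk retrieval
    requests = (total_gb / avg_object_gb) / 1000 * 0.025  # $0.025 per 1,000 requests
    transfer = total_gb * 0.09 if egress else 0.0         # $0.09/GB internet egress
    return retrieval + requests + transfer

# Restoring 10 TB of 1 GB objects over the internet:
internet = bulk_restore_cost(10 * 1024)
# Same restore shipped back on a Snowball (no per-GB egress; device fee separate):
snowball = bulk_restore_cost(10 * 1024, egress=False)
print(f"internet: ${internet:.2f}, snowball: ${snowball:.2f}")
```

The point stands out clearly: at this scale the retrieval itself is pocket change, and egress dominates everything.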

[+] amaccuish|7 years ago|reply
Is this the service that is rumoured to be using bluray disc libraries?
[+] allen37|7 years ago|reply
Perhaps this is the case, but I remember digging deep one night and seeing a picture of a StorageTek tape system on what I thought was an AWS page. I can't remember the URL. Oracle is trying to compete in this space, so maybe I'm misremembering.
[+] Twirrim|7 years ago|reply
I'm struggling to see the value proposition there. I would imagine you'd get far higher data density per rack using even plain old hard disk drives than you would with Blu-ray disc libraries.

The only way to get close would be some insanely complicated automated jukebox/storage mechanism, because otherwise you'd essentially be relying on piling up discs in a rack.

Just rough back-of-napkin figures: a Blu-ray disc is 4.75" in diameter and 0.05" thick. A 42U rack is roughly 78" x 42" x 24".

Assuming you just pack them all in, in great big tall piles: you could get 5 x 8 stacks, each 1560 discs deep.

5 * 8 = 40 stacks. 40 x 1560 = 62,400 discs in a rack. At 150 GB a disc, that's a total of 9,360,000 GB (9.36 PB).

Of course you probably need to cut that in half at the very least, to provide some kind of mechanism for extracting and removing the discs, and for safe storage. I'd consider that generous, but it's a good figure to work from.

And of course any jukebox for doing the writing/retrieval is effectively wasted space. The number of discs you could have in transit there isn't likely to be enough to be worth bothering with, so at the very least halve it again. That also assumes one jukebox per storage rack.

So... 9.36 / 2 / 2 = 2.34 petabytes average rack density.

Current Backblaze Pods come in at 480 TB per 4U: https://www.backblaze.com/blog/open-source-data-storage-serv... The rack I used for disc storage is a standard 42U. I'll assume you'd lose 6U to top-of-rack switches, power, etc., so 9 servers per 42U. 9 * 480 TB = 4.32 petabytes.

So even just with Backblaze Pods you'd get almost double the data density, and it would all be online, with near-instantaneous retrievals. Plus you'd be dealing with tried-and-true technology that is likely to be far more reliable and have far fewer unknowns, comprising easy-to-replace storage media, versus a relatively new technology with less certain supply lines and specialised devices attached to it.

edit: I see below that Blu-ray archive discs are reaching 150 GB per side, so 300 GB total. That takes the rack data density to roughly 4.68 petabytes, just slightly over the hard disks, but my final point still stands: choosing a newer technology over well-known hard disks and supply chains would be crazy unless it offered a strong advantage, and I don't see 0.36 PB per rack as that significant.
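For anyone who wants to poke at the napkin math, here's the whole comparison in one place (every figure is an assumption from this thread, not a measured number; it uses the 300 GB double-sided disc from the edit):

```python
import math

# Assumed dimensions (inches) and capacities, all from the thread.
DISC_DIAM = 4.75
DISC_GB = 300  # 150 GB per side, double-sided archive disc
RACK_H, RACK_D, RACK_W = 78, 42, 24

stacks = math.floor(RACK_W / DISC_DIAM) * math.floor(RACK_D / DISC_DIAM)  # 5 * 8
discs_per_stack = RACK_H * 100 // 5  # 0.05" per disc -> 1560 in a 78" pile
discs = stacks * discs_per_stack     # 62,400 discs
raw_pb = discs * DISC_GB / 1e6       # PB if every cubic inch held discs
usable_pb = raw_pb / 2 / 2           # halve for access, halve for the jukebox

pods_pb = 9 * 480 / 1000             # nine 4U pods at 480 TB in a 42U rack
print(f"Blu-ray: {usable_pb:.2f} PB vs Pods: {pods_pb:.2f} PB per rack")
```

Even under these generous assumptions the disc library only barely edges out commodity hard drives, which is the crux of the argument above.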