top | item 15272004

How to Move 8PB from NYC to SJC?

5 points| throwbigdata | 8 years ago | reply

I have been asked to copy 8PB from a customer's datacenter in NYC to our datacenter in SJC. I expect to start in 2 weeks and I want to be done 30 days later.

The NYC DC is currently unknown (!). I should learn this next week. The data is a collection of files ranging in size from 100k to 2MB, with an average size of 500k.
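For scale, here is a rough sketch of what those numbers imply. It assumes decimal units (1PB = 10**15 bytes, 500k = 500,000 bytes) and a flat 30-day window, neither of which is stated in the post; the function names are made up for illustration.

```python
# Rough sizing for the job as stated: 8PB, ~500k average file, 30-day window.
# Assumption: decimal units (1PB = 10**15 bytes, 500k = 500,000 bytes).
TOTAL_BYTES = 8 * 10**15
AVG_FILE_BYTES = 500 * 10**3
WINDOW_SECONDS = 30 * 24 * 3600


def required_gbps(total_bytes: int, seconds: int) -> float:
    """Sustained rate in gigabits per second needed to move total_bytes in seconds."""
    return total_bytes * 8 / seconds / 1e9


def file_count(total_bytes: int, avg_file_bytes: int) -> int:
    """Approximate number of files, given an average file size."""
    return total_bytes // avg_file_bytes


files = file_count(TOTAL_BYTES, AVG_FILE_BYTES)
print(f"files: {files:,}")
print(f"sustained rate needed: {required_gbps(TOTAL_BYTES, WINDOW_SECONDS):.1f} Gbps")
print(f"files per second needed: {files / WINDOW_SECONDS:,.0f}")
```

Under those assumptions that is about 16 billion files and roughly 24.7 Gbps sustained for the whole 30 days, so per-file overhead probably matters as much as raw bandwidth.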

What's the best way to move it?

Options:

Option 1 - Ship my storage servers out there. 24-core controller with 128GB RAM, a SAS card with 2x HGST 60-disk 10TB JBODs and a 40gig NIC. We use FreeNAS. I could ship it out, copy the data and ship it back. I could ship either just the disks or the whole storage system.

Option 2 - AWS Snowball. I figure I can get 3gbps based on testing so the copy would take 30 days.

Option 3 - AWS Snowmobile - overkill for this?

Option 4 - find 1 or more temp 10g or 100g circuits

Option 5 - run it through AWS Direct Connect, assuming I can get this at the east coast DC. I have high-bandwidth AWS DX in the SJC facility.

Option 6 - something else?
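A quick way to compare the options above is to work out how long 8PB takes at each link speed, and how many of the Option 1 storage systems would be needed to hold it. This is only a sketch: it assumes decimal units, a fully saturated link with no protocol or small-file overhead, and it reads "2xHGST 60 disk 10tb JBOD" as 2 enclosures of 60 disks at 10TB each, counted as raw capacity before any RAID or ZFS redundancy. The function names are hypothetical.

```python
# Transfer time per link speed, and raw capacity check for Option 1.
# Assumptions: decimal units, 100% link utilisation, no redundancy overhead.
TOTAL_BYTES = 8 * 10**15
SECONDS_PER_DAY = 86400


def days_to_transfer(total_bytes: int, gbps: float) -> float:
    """Days needed to move total_bytes over a link running flat out at gbps."""
    return total_bytes * 8 / (gbps * 1e9) / SECONDS_PER_DAY


def systems_needed(total_bytes: int, jbods: int, disks_per_jbod: int, disk_bytes: int) -> int:
    """Number of storage systems needed to hold total_bytes, by raw capacity only."""
    raw_per_system = jbods * disks_per_jbod * disk_bytes
    return -(-total_bytes // raw_per_system)  # integer ceiling division


for gbps in (3, 10, 40, 100):
    print(f"{gbps:>3} Gbps: {days_to_transfer(TOTAL_BYTES, gbps):6.1f} days")

print("storage systems needed (raw):", systems_needed(TOTAL_BYTES, 2, 60, 10 * 10**12))
```

On these assumptions a single 3 Gbps stream takes about 247 days, so a 30-day copy at that per-device rate would need roughly eight streams or devices running in parallel. One system as described holds 1.2PB raw, so 8PB needs at least seven of them.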

Your thoughts, suggestions, recommendations and experience are appreciated!

11 comments

[+] jwilliams|8 years ago|reply
If you already own the servers and can do without them, then that'll be a fast option for sure. That said, that's quite a bit to insure on the shipping (that's a quarter to half a million dollars of disks alone?). Hate to think of the damage if one of those servers were dropped. Plus you're shipping twice, so that's double the cost.

You can probably do it via tape for circa $50k. Probably a lot less if you buy in bulk. It also depends on the kind of compression you can get on that 8PB.

If you can handle tape (500-1,000 tapes is a lot to do manually), I'd consider shipping the tape. Then try selling the tape at the destination (surely there is a decent market in SJC).
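The 500-1,000 tape range can be sanity-checked with a short calculation. The sketch below assumes decimal units and uses 6TB and 12TB native cartridge capacities (the LTO-7 and LTO-8 figures) purely as illustrative inputs; the comment does not say which tape generation is meant, and the 2.5:1 compression ratio is likewise an assumption that depends entirely on the data.

```python
import math

# Tape count estimate. Cartridge sizes and compression ratio are assumptions.
TOTAL_BYTES = 8 * 10**15
TB = 10**12


def tapes_needed(total_bytes: int, tape_native_bytes: int, compression_ratio: float = 1.0) -> int:
    """Cartridges needed to hold total_bytes at a given native size and compression ratio."""
    effective_bytes = tape_native_bytes * compression_ratio
    return math.ceil(total_bytes / effective_bytes)


print("6TB native, no compression: ", tapes_needed(TOTAL_BYTES, 6 * TB))
print("6TB native, 2.5:1:          ", tapes_needed(TOTAL_BYTES, 6 * TB, 2.5))
print("12TB native, no compression:", tapes_needed(TOTAL_BYTES, 12 * TB))
```

Working backwards, 500 to 1,000 tapes for 8PB implies 8TB to 16TB stored per cartridge, which is why the compression ratio matters so much to the estimate.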

[+] thiago_fm|8 years ago|reply
I would do the same, perhaps find some companies that do this and get quotes from them.
[+] abra_kadabra|8 years ago|reply
I agree with the rest of the comments that the absolute fastest way is to ship the disks and then copy the data. That said, if you can't or would rather not ship anything because of the risk of disks getting ruined, then the best thing to do is to find a high-bandwidth internet connection and use something like IBM Aspera (https://www-03.ibm.com/software/products/en/high-speed-file-...)

Aspera is used widely in the movie industry to move TB-size movies around (~2.5TB). The key to Aspera is that it assumes that packet order doesn't matter, so it will max out your pipes. It also doesn't have to immediately report back about which packets were dropped, and is therefore pretty efficient at getting the data across quickly.
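Some background on why tools in this category exist: a single TCP stream can never go faster than its window size divided by the round-trip time, which bites hard on a coast-to-coast path. The sketch below shows that general bound only; it says nothing about how Aspera works internally. The 70 ms round-trip time is an assumed figure for NYC to SJC, and the function names are made up for illustration.

```python
# Upper bound on one TCP stream: throughput <= window / RTT.
# Assumption: ~70 ms round trip between NYC and SJC (not from the thread).
RTT_SECONDS = 0.070


def max_single_stream_mbps(window_bytes: int, rtt_seconds: float) -> float:
    """Best-case megabits per second for one TCP stream with the given window."""
    return window_bytes * 8 / rtt_seconds / 1e6


def window_bytes_to_fill(gbps: float, rtt_seconds: float) -> float:
    """Bandwidth-delay product: bytes that must be in flight to keep the link full."""
    return gbps * 1e9 * rtt_seconds / 8


print(f"64KiB window: {max_single_stream_mbps(64 * 1024, RTT_SECONDS):.2f} Mbps")
print(f"window to fill 10 Gbps: {window_bytes_to_fill(10, RTT_SECONDS) / 1e6:.1f} MB")
```

With an unscaled 64KiB window that is under 8 Mbps per stream, and filling a 10 Gbps circuit takes about 87.5MB in flight, so over plain TCP you would need large tuned windows, many parallel streams, or both.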

[+] brudgers|8 years ago|reply
Never underestimate the bandwidth of a station wagon full of tapes hurtling down the highway. -- Andrew S. Tanenbaum
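The quote can be put in numbers for this case. This is a sketch with made-up logistics: the 5-day transit and the copy times at each end are assumptions, not figures from the thread, and units are decimal.

```python
# Effective bandwidth of physically shipping the data.
# Assumptions: transit and copy durations below are illustrative only.
TOTAL_BYTES = 8 * 10**15
SECONDS_PER_DAY = 86400


def effective_gbps(total_bytes: int, transit_days: float, copy_days: float = 0.0) -> float:
    """Average gigabits per second over the whole trip, including any copy time."""
    total_seconds = (transit_days + copy_days) * SECONDS_PER_DAY
    return total_bytes * 8 / total_seconds / 1e9


print(f"transit only, 5 days:     {effective_gbps(TOTAL_BYTES, 5):.1f} Gbps")
print(f"5 days + 20 days copying: {effective_gbps(TOTAL_BYTES, 5, 20):.1f} Gbps")
```

The transit itself works out to about 148 Gbps, but once the time to fill and drain the media is counted the average drops sharply, so the copy speed at each end is usually the real limit.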
[+] twobyfour|8 years ago|reply
This. Don't bother shipping the servers. Ship the disks. Copy them. Ship them back. Easy peasy.
[+] bradknowles|8 years ago|reply
Came here to say this, only to discover that someone beat me to the punch.

Cool!

[+] jtchang|8 years ago|reply
Shipping storage servers is by far the fastest way. You can encrypt data if needed and have options to either fly the thing out or ship it ground.
[+] jtchang|8 years ago|reply
This sounds like a fun project.