Ex Aspera Dev here. I did the encryption and early parallel work.
There is a lot of good science behind fasp. An advantage it has over IETF protocols is that both ends trust one another. Another advantage, until recently, was out-of-order delivery.
The protocol totally ignores drops, for flow control. Instead, it measures change in transit time. The receiver knows it, the sender needs to know it, but the useful lifetime of the measurement is less than the transit time. This should make an engineer think "control theory!", and it did. So, the receiver reports a stream of transit-time samples back to the sender, which feeds them into a predictor, which controls transmission rate. Simple, in principle, but the wide Internet is full of surprises.
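The loop described above can be sketched as a toy proportional controller on queuing delay. Everything below is illustrative (the class name, the gain, the 5 ms target); it is not Aspera's actual predictor, just the shape of the idea.

```python
# Toy delay-based rate controller: rising transit time means queues are
# building, so back off; transit time near the observed floor means the
# path is idle, so speed up. Not Aspera's algorithm, just the idea.

class DelayRateController:
    def __init__(self, rate_bps, min_rate=1e5, max_rate=1e10,
                 target_queue_s=0.005, gain=0.5):
        self.rate = rate_bps
        self.min_rate = min_rate
        self.max_rate = max_rate
        self.base_rtt = float("inf")   # estimated propagation delay floor
        self.target = target_queue_s   # queuing delay we aim to maintain
        self.gain = gain

    def on_sample(self, transit_s):
        """Feed one receiver-reported transit-time sample; return new rate."""
        self.base_rtt = min(self.base_rtt, transit_s)
        queuing = transit_s - self.base_rtt
        # Proportional control toward the target queuing delay,
        # with the per-sample adjustment clamped to +/- 50%.
        error = (self.target - queuing) / self.target
        self.rate *= 1.0 + self.gain * max(-0.5, min(0.5, error))
        self.rate = max(self.min_rate, min(self.max_rate, self.rate))
        return self.rate
```

Feeding it samples with low queuing delay ramps the rate up; a sample showing delay growth pulls it back down before any packet is dropped.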
If you think this wouldn't be able to go a thousand times faster than TCP, you have never tried moving a file to China or India over TCP. :-) (Customers used to report 5% drop rates.) Drops and high RTT are devastating to traditional TCP throughput on high-packet-rate routes; read about "slow-start" sometime, and do the math. Problem is that for untrusted peers, drops are the only trustworthy signal of congestion. Recent improvements where routers tag packets to say "I was really, really tempted to drop this!" help some.
Torrents get the out-of-order delivery and the lower sensitivity to drops, but their blocks are too big. Others commented that opening lots of connections gets around some TCP bottlenecks, but that helps only when the drop rate isn't too high (i.e. not to India or China).
I'm curious how much the observed speed of Aspera on the Internet might rely on "nicer" protocols (TCP) backing off as Aspera causes congestion^H^H maximises link utilisation.
Do you think the protocol might cause problems if it were more widely used?
Rate control sounds nearly identical to TCP Vegas then, no?
But out-of-order delivery and (I'm guessing) out-of-order retransmits appear to be unique to fasp. What did you mean about these being unique "until recently"? [edit - nevermind, saw below you were referring to SACK]
PS. I remember looking at fasp ~10 years ago and it looked like fantastic tech. Someone took things that many people only talked about and put them into a product that really worked well. Cutting-edge stuff. A job like that - a dream for many :)
This is something bittorrent could get better at. It would be cool to update the bittorrent spec to use e.g. a resnet to predict endpoint and path saturation from moment to moment.
It's kind of amazing really that the near future will probably be about removing massive amounts of code & design from all sorts of incredibly carefully engineered systems like the linux kernel in favor of a pile of linear algebra that can just figure things out. resistance is futile.
Do you have any anecdotal experience with how this sort of approach affects very short, bursty stop-and-start transmissions (notwithstanding the fact that you obviously can't go faster than link latency)?
> The protocol totally ignores drops, for flow control. Instead, it measures change in transit time. The receiver knows, the sender needs to know, but the useful lifetime of the measurement is less than the transit time. This should make an engineer think "control theory!" (and did). So, the receiver reports a stream of transit time samples back to the sender, which feeds them into a predictor, which controls transmission rate. Simple, in principle, but the wide Internet is full of surprises.
Would you be willing to talk more about this, for someone not as intimately-familiar with the details?
(Raptor codes (a type of FEC) can essentially transfer over UDP at the underlying line rate, even if packet drop is high. In other words, on a 1 Gbit/s link with 5% packet loss, you will be able to send over 900 Mbit/s without needing any TCP features. This also helps high-latency scenarios, since you don't need to test/back off how much data to send... you just send as much Raptor-encoded data as you want down the pipe, and as long as you send enough recovery packets with it, you will be able to reconstruct the original flawlessly.)
You don't need 5% packet loss on a high-latency 1Gbps link to trash performance. Even 0.1% packet loss is enough to cut throughput massively on long fat networks (LFNs).
Erasure coding helps, but it demands much more CPU than simply having a very wide window for retransmits. For bulk data transfer of files, retransmitting any part of a 1GB window is trivial.
Second this. When we needed to deal with somewhat similar issues (quickly transfer multi-gigabyte archives over a massively unreliable link), we ended up with FDT; it can be easily scripted, and transfers data between two hosts (without using a network of torrent peers) at the hosts' link speed. Used it to transfer builds between the US and Russia, and Russia and China.
The 'proof' that aspera is 200x faster shows an ETA of 2:15 compared to 00:09. First of all, this is 15x faster, not 200x.
I tried downloading the file using axel -n 32, and it took 1:28, which is only 9x slower. Curious though, when it started out with all threads running it was transferring data at 29594.2KB/s and hit 50% after only a few seconds, but by the end with only a few threads left running it was only doing 5625.8KB/s.
It looks like the performance varies a lot, possibly due to multiple interfaces or wan links being used, or issues with their server.
Using axel -n 64 over http I was able to fetch it in 50s, which is only 5x slower, and most of that time was spent from 98 to 100% as the last few connections finally finished.
A smarter client that more aggressively re-fetched chunks being downloaded over slow connections would likely match the same performance.
Not to take away from the post's plea. Apart from bits vs bytes, it should be noted the remotes are also on different machines with potentially varying network throughput.
Why not use torrents over wireguard? From an engineering perspective I'd personally be hard pressed to come up with a more optimal solution for private big data transfer.
If Aspera is as good as the parent suggests it's probably because they are operating a bittorrent-like network of geo-distributed peers. It would be very cheap to emulate that using cloud computing providers, especially if you are discarding the data after transfer.
I benchmarked Aspera against alternatives for getting data from on-prem storage into the cloud (google cloud storage) a year or two ago on a 10Gbps link.
Aspera was faster on a single stream, but if you have a lot of files to move around (you usually do) you can just multiplex a bunch of tcp streams to get the same throughput.
So like in the example on their page, if you have 20 wget's hitting their ftp and there are no other bottlenecks, the throughput will be similar to using Aspera - and you have a greater variety of free tooling for those tcp-based protocols...
Sending on lots of TCP streams can get you to 80% line utilization if the drop rate is good and low. That's often good enough. On some lines, with the "wrong" number of TCP connections, the rate will oscillate, and you will be lucky to get 50%. Tuning that is a black art some people enjoy.
Even if you have only the one large file, can't you segment it? That's what I do for my consumer-level use case anyway. Using a nonsegmented ftp client to download a file I get like 2MB/s, and when it's segmented I max out my download rate at 20MB/s.
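A minimal sketch of that segmented approach, assuming a server that honors HTTP Range requests (the function names and the segment count are made up for illustration):

```python
# Segmented download sketch: fetch byte ranges in parallel, reassemble.
from concurrent.futures import ThreadPoolExecutor
from urllib.request import Request, urlopen

def split_ranges(total_size, segments):
    """Split [0, total_size) into contiguous inclusive byte ranges."""
    step = -(-total_size // segments)  # ceiling division
    return [(start, min(start + step, total_size) - 1)
            for start in range(0, total_size, step)]

def fetch_range(url, start, end):
    req = Request(url, headers={"Range": f"bytes={start}-{end}"})
    with urlopen(req) as resp:
        return resp.read()

def parallel_download(url, total_size, segments=8):
    """Fetch all segments concurrently and join them in order."""
    with ThreadPoolExecutor(max_workers=segments) as pool:
        parts = pool.map(lambda r: fetch_range(url, *r),
                         split_ranges(total_size, segments))
    return b"".join(parts)
```

Each range request rides its own TCP connection, so one slow or loss-throttled stream no longer caps the whole transfer, which is essentially what axel and other segmented downloaders do.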
Honestly speaking, a good way to solve this is not trivial. You would need to:
1. Measure the existing performance on specific data sets.
2. Understand how many of the bottlenecks come from the network latency vs. I/O limits vs. CPU bottlenecks (if using compression).
3. See if any domain-specific compression is needed.
4. Document the typical use cases. Understand why the current solution sucks (e.g. requires redundant user actions). Write down user interaction scenarios. Design the UI to be as efficient as possible for those scenarios.
This is a non-trivial amount of work that would require a lot of back-and-forth interaction and on-the-go requirement changes and I don't think it's entirely honest to ask someone to do this work for free in the name of cancer research (after all, you are not donating most of your paycheck to charities, are you?). If the existing solution by IBM sucks, how about making a Request For Proposal [0] and seeing if smaller software vendors could offer something better given that you are actually willing to pay for the work?
P.S. A student/hobbyist can probably whip up some sort of a parallel TCP-like thing with a large window for free, but you would get the same performance by just cranking up the TCP window size via sysctl and using multiple HTTP threads (htcat was suggested earlier in the comments).
Way back when, I had a problem on my network that was capping FTP/SFTP speeds to about a fifth of my download speed. Torrents worked fine. To fix my problem I cobbled together an extremely hacky solution which consisted of opening several listening sockets, ssh-ing to the remote machine, and net-catting the file by chunks in several parallel processes. It actually worked. The code is here https://github.com/jlegeny/blazing-speed more for archiving purposes than to be used by anybody for anything.
FTR, they say: "As you can see, the ascp client performs over 200x faster than FTP, and can download the whole file in 9 seconds. It’s pretty magical." But the example shows MB/s vs Mb/s. It's not over 200x faster, but about 17x faster.
Question at the meta-level: what software/platforms are producing, analyzing, modifying & consuming this data?
Is it possible some legacy systems that produce and consume these massive files on-site would more sensibly run in the cloud, directly & selectively accessing the data chunks they need over fast backbone connections?
Also, is there any room for an rsync type approach, sending compressed deltas rather than naively sending huge files that may be redundant?
Not to say disintermediating a BigCO expensive patented vendor-locked-in MLPOS (Market Leading Piece Of Shit) doesn't sound exciting -- it does. I'm just curious and a bit skeptical that it is always necessary to mass-copy all this data over and over.
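The rsync-style idea raised above, in miniature, as a hypothetical fixed-block sketch (real rsync uses a rolling checksum so it can match blocks at any offset; this version only catches in-place changes, and all names here are made up):

```python
# Delta transfer sketch: the receiver advertises hashes of the blocks it
# already has; the sender transmits only the blocks that differ.
import hashlib

BLOCK = 4096

def block_hashes(data):
    return [hashlib.sha256(data[i:i + BLOCK]).digest()
            for i in range(0, len(data), BLOCK)]

def make_delta(new, old_hashes):
    """Return (block index, block bytes) pairs the receiver is missing."""
    delta = []
    for n, i in enumerate(range(0, len(new), BLOCK)):
        blk = new[i:i + BLOCK]
        if n >= len(old_hashes) or hashlib.sha256(blk).digest() != old_hashes[n]:
            delta.append((n, blk))
    return delta

def apply_delta(old, delta, new_len):
    """Rebuild the new file from the old file plus the delta."""
    blocks = [old[i:i + BLOCK] for i in range(0, len(old), BLOCK)]
    for n, blk in delta:
        while len(blocks) <= n:
            blocks.append(b"")
        blocks[n] = blk
    return b"".join(blocks)[:new_len]
```

For data that is mostly redundant between transfers, only the changed blocks cross the wire, which can beat any raw-throughput trick by orders of magnitude.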
Aspera is the standard in broadcast video for transferring video files. Typically you wouldn't want just a small piece, you would want the entire hour-long, 3 terabyte video.
Also, you would still need to get the data into the cloud in the first place.
The systems that produce massive DNA sequence files are machines reading the sequences of actual physical DNA molecules. The cloud can't sequence DNA, as far as I know.
I feel the author’s pain - Aspera is a clunky 90s styled Ruby on Rails app that is about the most cloud unfriendly piece of software I’ve encountered, I’d love an alternative.
I honestly don’t care about their proprietary UDP protocol, it’s nothing special, just another way to copy bits onto a wire. Dime a dozen.
The true value of Aspera is they provide an integrated browser plug-in that lets the technically challenged reliably upload large files. If the transfer is interrupted or either side changes addresses it deals with it gracefully. It’s also a bridge to AWS S3.
I spent some time looking for a replacement and while there are numerous download managers that facilitate people downloading large files, I couldn’t find any upload managers with the same level of integration and polish.
About the closest thing I could find is Cyberduck, but it’s not integrated with the web browser, not as easy for technically challenged people to use, and there is no support (community support exists but seems really hit and miss). However it does make good use of Amazon’s multipart upload API and will happily fill whatever wire it’s connected to.
Torrent software has largely the same pros and cons Cyberduck does.
I have seen this software before. I suppose the speed boost results from using multiple parallel connections, meaning it could make better use of aggregate links and multipath networks? Is there anything else to it?
Used a few of their products at a previous company. That's the gist of the protocol. It's UDP-based and the software has intelligence to use bandwidth as efficiently as possible between links. We had a few servers sending to many endpoints concurrently, and I'd regularly see the systems cap out their 10gbit connections.
License costs are pretty brutal and it does need a decent amount of CPU to make the most of it.
It does what it says and it’s expensive. There’s a couple of commercial packages like this - two that come to mind are vcinity and Signiant. A solution can be made but every time I look at the market you find very few customers with a real need to move terabytes that can’t solve their problem with a snowball-style solution. Those that do usually also have the economics to justify a commercial package.
I know about a couple of these. Some use very specific compression based on external knowledge—like that this is all office xml documents, with freedom to study the set of the last year’s documents. Some work by eliciting pessimal behavior from TCP flows sharing the same link—they’re really bad neighbors!
Some use a family of different approaches to cope with different performance domains. Some use intermediate relays to cut the bandwidth-delay product by reducing delay.
But the big key is this: standard tools aren’t even trying to be good at this. They’re aimed at making an Internet work, not at optimizing any particular flow.
bbcp doesn't work so well for some reason. Where I used to work we had a senior dev that claimed that he could just repurpose bbcp for this. Then he gave up and wrote something custom in c++. Then he rewrote it in go, but used (among other things) non-multithreadsafe primitives, then when asked to make it encrypted, he used epoll with tls (which is apparently not a thing in go? I don't know). I told him he should just use DTLS or hell even a one-time-pad encrypted UDP stream with backpressure management, but he didn't listen to me. Then he rewrote it in C++ again and then went back to the go version. When I left the company, it still wasn't working.
Also, this senior dev never wrote unit tests.
On the other hand, maybe bbcp will work, and the senior dev just didn't know what he was doing. There was another senior dev who, seeing what was coming down the pike, left his job, and on his way out he was like, "yeah you can do it with bbcp, just you gotta tweak your tcp congestion rules on all the hops (which we could do, but is probably not an option for OP)"
I worked with Aspera and a couple of its commercial alternatives, and I used UDT, Tsunami, GridFTP, and syncthing[1] as a poor man's alternative to Aspera.
For real transfers of big data sets (> 500TB), Aspera will deliver what it says over WAN (over the Atlantic).
If you have better connections and not so much data syncthing will probably work if you can have someone manually take care of all exceptions.[2]
If you know your data and you can build quite a lot of stuff yourself you can get almost the same speed as Aspera with UDT.
There is also a Go implementation of UDT used by kcptun[3], but I haven't tried that.
2 I don't intend to disrespect syncthing here, but when you're dealing with TB scale 24/7, things are never really up all the time: the network is unreliable, the sender host's filesystem is corrupt, and the receiver host's filesystem is full or broken or ...
> Aspera is owned by IBM - does their client spy on what else is running on my system and report it back to their headquarters as business intelligence? If asked, would they share this information with the government? Without the source code, it’s impossible to tell!
I mean, with some reverse engineering, mentioned later in the post, it’s really not.
My group is about to start using Aspera as well. I too wish there was a good alternative I could suggest to my bosses. And I'm still skeptical of their claims... but I haven't gotten to test it over an actual bad link yet (over a good link it does nothing, which a simple test will prove).
Globus (https://www.globus.org) is being run out of UChicago, and I know they have at least one large company using them. Feel free to reach out to the Globus team, or shoot me an email!
Aspera never did in-flight compression when I was there. In-flight compression got even less interesting around 2010 when the long fiber links got faster than customers' (shared) storage systems. Users have good reasons to compress at rest, before starting a transfer.
bogomipz | 6 years ago:
Can you elaborate on this? Which recent development are you referring to? Thanks.
phonon | 6 years ago (follow-up links to the Raptor comment above):
https://par.nsf.gov/servlets/purl/10066600
http://www1.icsi.berkeley.edu/~pooja/HowUseRaptorQ.pdf
https://a1f9fb7d-b120-4c73-98e3-5cdb4ec8a2ab.filesusr.com/ug...
https://youtu.be/bYPbat-FFTo
https://github.com/mk-fg/python-libraptorq
KaiserPro | 6 years ago:
I worked in VFX, so I've been using aspera since before it was owned by IBM.
It depends on what line speed you have, but for a gig link we made a simple protocol using parallel TCP streams.
Basically, it chunked up the large file into configurable sized chunks, and assigned a chunk to each stream.
Another stream passed the metadata.
This has all the advantage of TCP, with less of the drawbacks of a custom UDP protocol.
For transferring files from London to SF we were getting 800 Mbit/s over a 1 gig link.
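A sketch of that kind of scheme; the framing (an 8-byte chunk index plus a 4-byte length per connection) and all names are made up for illustration:

```python
# Chunked parallel-TCP transfer sketch: split the file into chunks, send
# each chunk on its own TCP connection with a tiny header, and reassemble
# the chunks by index on the receiving side.
import socket
import struct

HEADER = struct.Struct("!QI")  # chunk index, chunk length

def chunk(data, size):
    return [data[i:i + size] for i in range(0, len(data), size)]

def send_chunk(host, port, index, payload):
    """Send one chunk on its own connection."""
    with socket.create_connection((host, port)) as s:
        s.sendall(HEADER.pack(index, len(payload)) + payload)

def recv_exact(conn, n):
    buf = b""
    while len(buf) < n:
        part = conn.recv(n - len(buf))
        if not part:
            raise ConnectionError("peer closed early")
        buf += part
    return buf

def receive(server, n_chunks):
    """Accept n_chunks connections and reassemble the file by index."""
    parts = {}
    while len(parts) < n_chunks:
        conn, _ = server.accept()
        with conn:
            index, length = HEADER.unpack(recv_exact(conn, HEADER.size))
            parts[index] = recv_exact(conn, length)
    return b"".join(parts[i] for i in sorted(parts))
```

Each connection gets its own congestion window, so one stream stalling on a drop doesn't hold back the others, which is how the scheme keeps the aggregate near line rate with standard TCP.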
[0] https://en.wikipedia.org/wiki/Request_for_proposal (footnote from the Request For Proposal comment above)
aamargulies | 6 years ago:
http://pcbunn.cithep.caltech.edu/bbcp/using_bbcp.htm
Footnotes from the UDT/syncthing comment above:
[1] https://github.com/syncthing/syncthing
[3] https://github.com/xtaci/kcptun
fanf2 | 6 years ago:
But it looks like GridFTP and Globus are basically dead https://opensciencegrid.org/technology/policy/gridftp-gsi-mi...
Tsunami sounds good but maybe needs some updating? I haven’t looked closely...
KaiserPro | 6 years ago:
Or anything with a >0.1% packet loss.
sgt101 | 6 years ago:
But, I think Aspera uses some specialized compression for genomics data, and genomics compression is not easy.
My suggestion would be to talk to the telco that provides your WAN. If you are using a direct internet connection: don't.