
Common problems with large file uploads

72 points | ananddass | 13 years ago | blog.filepicker.io

34 comments

[+] zimbatm|13 years ago|reply
Given the title I was expecting the article to provide a solution.

From personal experience, the bigger the file, the more likely you are to experience a connection cut in the middle of the upload. That is why the most important thing is to support resumable uploads.

At the moment there is no clear consensus on how to handle that. Amazon S3 has one protocol[1], Google uses two revisions of a different protocol, one on YouTube[2], another on Google Cloud Storage[3]. Both work by first creating a session that you refer to when uploading the chunks. There is also the Nginx upload module[4] that delegates the session ID to the client for some reason.
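
A minimal sketch of the "create a session, then upload chunks" idea, assuming the server can report how many bytes it already holds for a session. It is not the API of S3, YouTube, Google Cloud Storage, or the Nginx module; `plan_chunks` and its parameters are hypothetical names, and only the `Content-Range` header format (`bytes start-end/total`) is standard HTTP.

```python
from typing import Iterator, Tuple


def plan_chunks(
    total_size: int, chunk_size: int, resume_from: int = 0
) -> Iterator[Tuple[int, int, str]]:
    """Yield (start, end_inclusive, content_range) for each chunk still to send.

    resume_from is the byte offset the server says it already has
    (an assumption: how you learn that offset depends on the service).
    """
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= resume_from <= total_size:
        raise ValueError("resume_from must be within the file")
    start = resume_from
    while start < total_size:
        # The last chunk may be shorter than chunk_size.
        end = min(start + chunk_size, total_size) - 1
        yield start, end, f"bytes {start}-{end}/{total_size}"
        start = end + 1
```

After a dropped connection the client would ask the server for its current offset and call `plan_chunks` again with that value as `resume_from`, so only the missing tail is re-sent.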

And there is no browser client available to my knowledge.

That's all I know folks

[1]: http://docs.amazonwebservices.com/AmazonS3/latest/API/mpUplo...
[2]: https://developers.google.com/youtube/2.0/developers_guide_p...
[3]: https://developers.google.com/storage/docs/developer-guide
[4]: http://www.grid.net.ru/nginx/resumable_uploads.en.html

[+] otoburb|13 years ago|reply
I miss the ZMODEM protocol. Resumable file transfers over 56kbps were the bomb. Made me feel whole again (pun partially intended) back in '89.
[+] ars|13 years ago|reply
For the HTTP/2.0 discussion that was here earlier:

A way to continue an interrupted file upload.

Because POST variables are sent in order, if you put the file first and the other variables after, the server never sees them if the file was interrupted. So when I code a form I always put the hidden ones first so at least I can give a useful error message (since I know what the user was trying to do).

It would be better to decouple them and upload the files and the rest of the variables separately.
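
A rough sketch of the "small fields first, file last" ordering, written as a hand-built multipart/form-data body so the order is explicit. `build_multipart` and the field names in the usage note are made up for illustration, and the sketch assumes names and values need no escaping.

```python
import uuid
from typing import Dict, Optional, Tuple


def build_multipart(
    fields: Dict[str, str],
    file_field: str,
    filename: str,
    file_bytes: bytes,
    boundary: Optional[str] = None,
) -> Tuple[bytes, str]:
    """Return (body, content_type) with every plain field placed before the file."""
    boundary = boundary or uuid.uuid4().hex
    parts = []
    # Plain fields go first, so they arrive even if the upload dies mid-file.
    for name, value in fields.items():
        parts.append(
            f'--{boundary}\r\nContent-Disposition: form-data; name="{name}"'
            f"\r\n\r\n{value}\r\n".encode("utf-8")
        )
    # The (potentially huge) file part goes last.
    parts.append(
        f'--{boundary}\r\nContent-Disposition: form-data; name="{file_field}"; '
        f'filename="{filename}"\r\nContent-Type: application/octet-stream'
        f"\r\n\r\n".encode("utf-8")
    )
    parts.append(file_bytes)
    parts.append(f"\r\n--{boundary}--\r\n".encode("utf-8"))
    return b"".join(parts), f"multipart/form-data; boundary={boundary}"
```

For example, `build_multipart({"action": "upload_avatar"}, "file", "a.bin", data)` puts `action` ahead of the file bytes, so a body truncated halfway through the file still carries it.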

[+] tagx|13 years ago|reply
I'd really like to be able to use Dropbox as a magic upload handler for any file I upload on my local HD, not just those in my Dropbox folder. They handle the logic of getting all my files into the cloud. Why can't I point a website to my Dropbox and say here, this is handling the file upload?
[+] girasquid|13 years ago|reply
Dropbox has an API that will (theoretically) let you do this, but there hasn't been a ton of people jumping up and implementing it yet. It'll be cool when it shows up.
[+] meanguy|13 years ago|reply
Dropbox only supports files up to 150MB via their API. I inquired about bumping the limit on a per-app basis; no response.
[+] ChrisNorstrom|13 years ago|reply
8gb+ files? I found a way but you have to use a JAVA FTP Applet. I tested these two here: http://jupload.sourceforge.net/ and http://www.jfileupload.com/

Dragged and dropped an 8gb+ file and left it on for 5 hours. Worked perfectly. No time outs, no errors, and I'm on a shared hosting account at 1and1.

My problem with them is that it wasn't possible to hide the FTP username and password, they were always in javascript files. I whined, I complained, I bitched, and there was nothing they could do about it. :( So you basically had to password protect the whole directory with .htaccess and be very careful with whom you shared the credentials.

If you don't want people to download and install software just stick with JAVA FTP Applets.

[+] Ralith|13 years ago|reply
What exactly did you expect them to do about it? For a client-side tool to establish a plain FTP connection, it needs to possess authentication credentials.
[+] mbreese|13 years ago|reply
You could always just hard-code the username/password into the applet and recompile. That shouldn't be too hard...

Or, if you control the FTP server, you could dynamically add and remove random virtual users/passwords to the FTP server (hopefully virtual users). Then when the client javascript gets the username/password, it could only be used once.
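
A small sketch of the one-time credential idea, assuming you control the server and can plug a check like this into its authentication step. `OneTimeCredentialStore` is a hypothetical in-memory helper, not part of any FTP server, and wiring it into a real server's virtual-user mechanism is not shown.

```python
import secrets
import time
from typing import Dict, Tuple


class OneTimeCredentialStore:
    """Issue random username/password pairs that can be redeemed at most once."""

    def __init__(self, ttl_seconds: float = 300.0) -> None:
        self._ttl = ttl_seconds
        self._pending: Dict[str, Tuple[str, float]] = {}

    def issue(self) -> Tuple[str, str]:
        username = "u_" + secrets.token_hex(8)
        password = secrets.token_urlsafe(24)
        self._pending[username] = (password, time.monotonic() + self._ttl)
        return username, password

    def redeem(self, username: str, password: str) -> bool:
        # pop() removes the entry on the first attempt, right or wrong,
        # so a leaked pair cannot be reused or brute-forced.
        entry = self._pending.pop(username, None)
        if entry is None:
            return False
        expected, expires_at = entry
        if time.monotonic() > expires_at:
            return False
        return secrets.compare_digest(expected.encode(), password.encode())
```

The page would call `issue()` when it renders the uploader, and the server would call `redeem()` on login. The credentials are still visible to the client, but they are useless after one session or after the TTL runs out.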

[+] stellar678|13 years ago|reply
It's been a long time since I've been on shared hosting, but I thought they usually offered some kind of anonymous upload-only FTP directory. Couldn't your users upload to that and then your application can read from that directory?
[+] rajbot|13 years ago|reply
I've been dealing with browser-based large file uploads, which means dealing with lots of browser-specific issues.

Fortunately, things are getting better, especially for the webkit-based browsers. Firefox still has some issues, and I check https://bugzilla.mozilla.org/show_bug.cgi?id=678648 pretty regularly. Just today this bug, which was filed in 2003, changed from Status = NEW to Status = ASSIGNED.

Today is a good day.

[+] liyanchang|13 years ago|reply
First, I'm impressed that someone was uploading 2gb files back in 2003...

Agreed. Good to see that firefox is going to be able to do more than 2gb soon.

[+] t4nkd|13 years ago|reply
I've experienced this issue before when establishing a publisher backend for a D2D pc game business. It seems to be basically impossible without a Java applet of some kind, and even then it's wonky at best and just 'fails' at worst. The real fix for the issue seemed to be simply providing an FTP connection and letting people connect through the native client of their choosing.

That really seems to be the key to this problem: develop a simple native app capable of FTP uploads that makes it easy for users to deliver files to your app within the context of their use. Most browsers are capable of opening native applications via a unique protocol, so you could easily enrich the process by having the native app be a part of (or try to blend seamlessly with) major browsers.

[+] jasomill|13 years ago|reply
As plenty of file transfer protocols, clients, and servers support resumable transfers (FTP, SFTP, rsync, proprietary browser-based tools, etc., or even basic HTTP if you arrange for the file to be pulled rather than pushed and your "client's server" has byte-range support), perhaps this should be titled "why you shouldn't use a single HTTP POST request from a browser to upload a large file". The general reason seems to be "because this is not a use case this feature is commonly designed for and tested against."
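
To illustrate the pull-with-byte-ranges variant: a minimal sketch of how the pulling side could resume, assuming it keeps the partial file on disk and the other end honors `Range` requests. The function names are hypothetical; the `Range: bytes=N-` header and the 206 Partial Content status are standard HTTP.

```python
import os
from typing import Dict


def range_header(bytes_already_saved: int) -> Dict[str, str]:
    """Headers asking for everything from the given offset to the end."""
    if bytes_already_saved < 0:
        raise ValueError("offset cannot be negative")
    if bytes_already_saved == 0:
        return {}  # nothing saved yet, request the whole file
    return {"Range": f"bytes={bytes_already_saved}-"}


def resume_headers(partial_path: str) -> Dict[str, str]:
    """Derive the resume offset from the size of the partial file on disk."""
    try:
        saved = os.path.getsize(partial_path)
    except FileNotFoundError:
        saved = 0
    return range_header(saved)


def can_append(status_code: int) -> bool:
    """206 means the range was honored; a plain 200 means start over."""
    return status_code == 206
```

If the response is 200 rather than 206, the server ignored the range and sent the whole file, so the partial copy should be overwritten rather than appended to.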
[+] abemassry|13 years ago|reply
I ran into this problem with https://truefriender.com/. The solution I used was to switch from Apache to nginx: nginx streams the file to disk and then I can handle it with PHP. I still have the 2GB problem, but I've tested Perl and it can go past that limit; now I just have to implement it.
[+] liyanchang|13 years ago|reply
Being on Heroku, I've been bitten many times by the 30-second timeout. No luxury of changing it, let alone moving to nginx.
[+] severin|13 years ago|reply
Hi everyone. We developed a solution just for that! Please feel free to look at http://forgetbox.com and give us feedback.

Our users send 130GB files, directly from Gmail...

[+] frytaz|13 years ago|reply
split them into rar/zip files with checksums on client side then upload...
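
A minimal sketch of the split-and-checksum approach, using raw byte slices and SHA-256 instead of rar/zip archives (that substitution is an assumption made to keep the example short). `split_with_checksums` and `reassemble` are illustrative names, and a real client would read from the file in pieces rather than hold it all in memory.

```python
import hashlib
from typing import List, Tuple


def split_with_checksums(data: bytes, part_size: int) -> List[Tuple[bytes, str]]:
    """Cut data into parts and pair each part with its SHA-256 hex digest."""
    if part_size <= 0:
        raise ValueError("part_size must be positive")
    return [
        (data[i : i + part_size], hashlib.sha256(data[i : i + part_size]).hexdigest())
        for i in range(0, len(data), part_size)
    ]


def reassemble(parts: List[Tuple[bytes, str]]) -> bytes:
    """Verify every part against its digest, then join them in order."""
    for index, (chunk, digest) in enumerate(parts):
        if hashlib.sha256(chunk).hexdigest() != digest:
            # Only this part needs to be uploaded again.
            raise ValueError(f"part {index} failed its checksum")
    return b"".join(chunk for chunk, _ in parts)
```

The point of per-part checksums is that a corrupted or interrupted part can be retried on its own instead of restarting the whole upload.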