
Uploading a large >5GB file to S3 errors out #1945

Closed
jaequery opened this issue Nov 15, 2019 · 20 comments · Fixed by #4372
Assignees: goto-bus-stop
Labels: AWS S3 (Plugin that handles uploads to Amazon AWS S3), Bug, Stale (Old issues that haven't had activity recently)

Comments

@jaequery

Help! I posted this same topic on the Companion forum and got zero response, so I'm trying my luck here (https://community.transloadit.com/t/is-there-a-file-size-limit/15115/4).

I'm using the AwsS3Multipart plugin like this:

const AwsS3Multipart = Uppy.AwsS3Multipart
const uppy = Uppy.Core({ debug: true, autoProceed: false, allowMultipleUploads: false })
  .use(Uppy.Dashboard, {
    trigger: '#uppyModalOpener',
    closeAfterFinish: true,
  })
  .use(AwsS3Multipart, {
    limit: 0,
    companionUrl: 'http://localhost:3020/'
  })
// 'complete' fires once all uploads have finished; the result holds the successful/failed file lists
uppy.on('complete', (result) => {
  console.log(`${result.successful.length} files uploaded`)
  console.log(result)
})

This is the error from Uppy after it finished uploading a large file (15GB):

POST http://localhost:3020/s3/multipart/huqp0uHAb443J_o_PrARUP8ixOjvzAG4P_XPFQDFvMWd74XOzxXXP6.Ko6S4uFoeWLsSi4Z86naoftGNzQ9wHO6zr4Xf.gdmo7FWPXiczwR5z0VqqU7RVhH28aXWyBoH/complete?key=ghostbuster_test2.mov net::ERR_ABORTED 413 (Payload Too Large)
(index):1 Access to fetch at 'http://localhost:3020/s3/multipart/huqp0uHAb443J_o_PrARUP8ixOjvzAG4P_XPFQDFvMWd74XOzxXXP6.Ko6S4uFoeWLsSi4Z86naoftGNzQ9wHO6zr4Xf.gdmo7FWPXiczwR5z0VqqU7RVhH28aXWyBoH/complete?key=ghostbuster_test2.mov' from origin 'http://localhost:3001' has been blocked by CORS policy: No 'Access-Control-Allow-Origin' header is present on the requested resource. If an opaque response serves your needs, set the request's mode to 'no-cors' to fetch the resource with CORS disabled.
getTimeStamp.js:3 [Uppy] [19:19:35] Failed because: Could not post http://localhost:3020/s3/multipart/huqp0uHAb443J_o_PrARUP8ixOjvzAG4P_XPFQDFvMWd74XOzxXXP6.Ko6S4uFoeWLsSi4Z86naoftGNzQ9wHO6zr4Xf.gdmo7FWPXiczwR5z0VqqU7RVhH28aXWyBoH/complete?key=ghostbuster_test2.mov. TypeError: Failed to fetch 

It works fine when I tested uploading a 5GB file.
Has anyone ever tested uploading something greater than 5GB?
I'm wondering if this is a Node/Express issue or something else.

Please let me know if anyone has tried it, thanks.

@jaequery
Author

I've gotten this response back from the Express server. I can't reproduce it reliably, but I think it gives some clue:

expected: 336824,
length: 336824,
limit: 102400,
type: 'entity.too.large',
expose: true,
statusCode: 413,
status: 413

Looks like it hit some sort of limit at 102400; I wonder if that's Express's JSON body size limit?

Does Uppy send a JSON request to Companion after the upload has finished? And can that JSON request be too large for Express's default configuration to handle?

@jaequery
Author

jaequery commented Nov 15, 2019

Now I get this error midway through uploading a 15GB file:

PUT https://x-ott.s3-accelerate.amazonaws.com/x.mov?X-Amz-Algorithm=AWS4-HMAC-SHA256&X-Amz-Credential=%2F20191115%2Fus-west-2%2Fs3%2Faws4_request&X-Amz-Date=20191115T191449Z&X-Amz-Expires=300&X-Amz-Signature=19b773f5ebd3a20sdfsd&X-Amz-SignedHeaders=host&partNumber=816&uploadId=_6FcH097JtHWUtnatSwOfeVjdVMQh7tObso_DTNVrkR6ZzHJEqGrp.esfsefsef.8RjOH28gkuDvG 502 (Bad Gateway)
(index):1 Access to XMLHttpRequest at 'https://x-ott.s3-accelerate.amazonaws.com/x.mov?X-Amz-Algorithm=AWS4-HMAC-SHA256&X-Amz-Credential=se%2F20191115%2Fus-west-2%2Fs3%2Faws4_request&X-Amz-Date=20191115T191449Z&X-Amz-Expires=300&X-Amz-Signature=ddesf&X-Amz-SignedHeaders=host&partNumber=816&uploadId=efsfa.jg3ok0wdQgBQWCzsCm__c20Vzo3tXHj5MFODJmxPPau6UHAndmqA6dddDE8W.8RjOH28gkuDvG' from origin 'http://localhost:3001' has been blocked by CORS policy: No 'Access-Control-Allow-Origin' header is present on the requested resource.
getTimeStamp.js:3 [Uppy] [11:15:23] Failed because: Unknown error 
error @ getTimeStamp.js:3
n.log @ index.js:2
n._showOrLogErrorAndThrow @ index.js:2
(anonymous) @ index.js:2

@jaequery
Author

jaequery commented Nov 15, 2019

This is after I increased the bodyParser.json limit:

app.use(bodyParser.json({ limit: '50GB' }))

I was able to upload a 10GB file fine with this.

But 15GB is giving me the above error.
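
As a hedged aside on why 102400 keeps showing up: that is body-parser's default '100kb' JSON limit. Instead of a blanket 50GB limit on every route, the larger limit can be scoped to just the multipart endpoints. A minimal sketch, assuming Companion is mounted on the same Express app under its default /s3/multipart path:

const express = require('express')
const bodyParser = require('body-parser')

const app = express()

// The /s3/multipart/.../complete body is only part numbers and ETags,
// so a few MB covers even very large files.
app.use('/s3/multipart', bodyParser.json({ limit: '5mb' }))

// Everything else keeps body-parser's conservative 100kb default.
app.use(bodyParser.json())

body-parser skips requests whose body has already been parsed, so the scoped limit wins for the multipart routes.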

@jaequery
Author

jaequery commented Nov 16, 2019

Upon further testing, the problem is that I'm just randomly getting 502 Bad Gateway responses from the AWS endpoint.
Sometimes at 50% of the upload, sometimes at 30%, sometimes at 80%.

PUT https://x-ott.s3-accelerate.amazonaws.com/x.zip?X-Amz-Algorithm=AWS4-HMAC-SHA256&X-Amz-Credential=x%2F20191115%2Fus-west-2%2Fs3%2Faws4_request&X-Amz-Date=20191115T223852Z&X-Amz-Expires=300&X-Amz-Signature=x&X-Amz-SignedHeaders=host&partNumber=1704&uploadId=x.KfVNgxq6CHyEwuxoQO02qW1Sy9drtD0tXb.uDh1vn1atxh67k_YFmUe90.x. 502 (Bad Gateway)

Anyone got any ideas?

@goto-bus-stop
Contributor

Thanks for documenting your investigation! I don't really have a clue why it would do that off the top of my head. Does it only happen on the accelerated endpoint, or also without it?

goto-bus-stop self-assigned this Nov 18, 2019
goto-bus-stop added the AWS S3 label and removed the Triage label Nov 18, 2019
@goto-bus-stop
Contributor

Actually, this might be causing the S3 requests to fail: you've set limit: 0, which lets Uppy send the signing requests to Companion for all of the chunks at once. Those signing requests return a URL that's valid for a few minutes. As soon as the signing results come in, Uppy starts sending requests to S3. All these requests clog up the browser's internal queue, so some of the requests will actually be sent much later, probably after the signed URL expires.

Not setting a limit has very bad failure modes like this, and we'll configure one by default in a future version. Most likely, if you set limit: 5 or something, you won't see the requests to S3 fail any more.

The JSON issue is interesting too 🤔 we do send a big JSON object to the /complete endpoint, it contains the ETags from all the chunks that were uploaded. I just tried uploading a 2GB file and the JSON is already 25KB. Would be worth looking into avoiding that somehow...
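
For a sense of scale, that /complete request body is shaped roughly like this (an approximation of the JSON, not the exact wire format), with one entry per uploaded part:

{
  "parts": [
    { "PartNumber": 1, "ETag": "\"9b2cf535f27731c974343645a3985328\"" },
    { "PartNumber": 2, "ETag": "\"d41d8cd98f00b204e9800998ecf8427e\"" },
    ...
  ]
}

At roughly 60 bytes per entry (an assumed figure), a couple of thousand parts is enough to push the JSON past body-parser's default 100kb limit.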

@jaequery
Author

Okay thanks, let me try setting the limit to 5 and see if that fixes it.

@amanstack

Any update on this? Is it resolved? I am having trouble uploading files larger than 8 GB.

@goto-bus-stop
Contributor

@amangijoe Have you set a limit option, like below?

uppy.use(AwsS3Multipart, {
  /* ... your existing options ... */
  limit: 5
})

@jaequery
Author

Yes, setting limit somehow fixed it for me.

@goto-bus-stop
Contributor

Great, thanks for the confirmation! We'll be setting a default limit of 6 or 10 or something in the future so we don't run into these issues anymore.

@amanstack

amanstack commented Feb 11, 2020

It's still not working. Here is what I understand the problem to be.

Here is the error I get when my upload completes:

7edfe24c-6e40-4764-931d-16f7f9af0684 { PayloadTooLargeError: request entity too large
    at readStream (/tmp/node_modules/raw-body/index.js:155:17)
    at getRawBody (/tmp/node_modules/raw-body/index.js:108:12)
    at read (/tmp/node_modules/body-parser/lib/read.js:77:3)
    at jsonParser (/tmp/node_modules/body-parser/lib/types/json.js:135:5)
    at Layer.handle [as handle_request] (/tmp/node_modules/express/lib/router/layer.js:95:5)
    at trim_prefix (/tmp/node_modules/express/lib/router/index.js:317:13)
    at /tmp/node_modules/express/lib/router/index.js:284:7
    at Function.process_params (/tmp/node_modules/express/lib/router/index.js:335:12)
    at next (/tmp/node_modules/express/lib/router/index.js:275:10)
    at middleware (/tmp/node_modules/express-prom-bundle/src/index.js:153:5)
  message: 'request entity too large',
  expected: 151561,
  length: 151561,
  limit: 102400,
  type: 'entity.too.large' }

I think the completeMultipartUpload request (documented here) is the reason we are getting the error above: the parts JSON array sent in that request is too large. @jaequery is right, this should be fixed in Companion.

Maybe if we increase the chunk size, the parts array will be smaller.

Tell me if I am going wrong somewhere, thanks

@amanstack

In MultipartUploader.js

const chunkSize = Math.max(Math.ceil(this.file.size / 10000), 5 * MB);

will always result in a 5 MB chunk size until we try to upload a file greater than ~50GB. Because of that, the final /complete request errors out with PayloadTooLarge at around 8GB, or about 2261 5MB chunks.

I changed the value from 10000 to 500 and was able to upload files up to 30 GB with ease.

@goto-bus-stop Do you want me to generate a pull request that allows you to configure the chunk size and also decides an optimal chunk size depending on the file size?
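
A quick back-of-the-envelope check of that claim, as a sketch (the 60-byte per-entry size is an assumption, and estimateCompleteBody is just an illustrative helper, not part of Uppy):

const MB = 1024 * 1024

function estimateCompleteBody (fileSize) {
  // Same formula as MultipartUploader.js: at least 5 MB per chunk, at most ~10000 parts
  const chunkSize = Math.max(Math.ceil(fileSize / 10000), 5 * MB)
  const parts = Math.ceil(fileSize / chunkSize)
  const approxEntryBytes = 60 // assumed size of one { PartNumber, ETag } entry in the /complete JSON
  return { parts, approxBodyKB: Math.round((parts * approxEntryBytes) / 1024) }
}

console.log(estimateCompleteBody(8 * 1024 * MB))  // ~1639 parts, ~96 KB: right at body-parser's 100 KB default
console.log(estimateCompleteBody(15 * 1024 * MB)) // ~3072 parts, ~180 KB: rejected with 413 unless the limit is raised

Larger chunks (whether from a smaller divisor, as above, or a configurable chunk size) mean fewer parts and a smaller /complete body, which is why that change helps.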

@goto-bus-stop
Contributor

goto-bus-stop commented Mar 9, 2020

@amanstack sorry I didn't see your comment! If you'd still like to create a PR for that, that would be super appreciated!

@amanstack

@goto-bus-stop sure, I'll create a PR right away. Thanks.

@stephenbickle

Looks like I'm having the exact same issue. We are able to upload multipart files up to ~5GB, but above 8GB we get the exact same error. I have gone in and updated MultipartUploader.js as @amanstack suggested, but it doesn't seem to make any difference. Was anybody able to fix this or figure out why it's doing this?

@baba43

baba43 commented Dec 17, 2020

We are facing the same problem (uploading files up to 8GB) and I'm not sure why.

I assumed the upload goes directly to S3, so why is the default configuration causing errors when uploading files?

Is my Companion server (Express) still involved in uploading?

@stale

stale bot commented Dec 18, 2021

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

stale bot added the Stale label Dec 18, 2021
stale bot closed this as completed Jan 1, 2022
@vymao

vymao commented Mar 11, 2022

Has this been fixed?


mifi reopened this Mar 21, 2023
mifi added a commit that referenced this issue Mar 23, 2023
mifi added a commit that referenced this issue Mar 28, 2023
…r s3 (#4372)

* refine body parsers

to make it possible to adjust them individually
also improves security and speed
because we don't have to parse all body types (and all sizes) for all endpoints

* increase body size for s3 complete endpoint

fixes #1945
HeavenFox pushed a commit to docsend/uppy that referenced this issue Jun 27, 2023

9 participants