S3 multipart only showing files in bucket after all upload #4326
Comments
What version of |
@aduh95 I am using Also, I did some further tests - files start showing in S3 in batches (e.g. after first X are uploaded, they all show up), not necessarily when they all upload (so I am seeing nothing till (let's say) 6 files upload - then they suddenly all appear, then nothing new till the next batch of 6 files uploads, etc.) |
I think this is because our priority queue works with files, but a file can consist of many requests, and we prioritise the upload requests higher than the finish ones. I don't think this is a problem. Other than noticing it, do you actually have a problem with this? |
@Murderlon Sorry, I'm not sure we are talking about the same thing. The problem I have is that even after Uppy reports that a file is uploaded, it only shows up in S3 after a batch of files has been uploaded, so the first uploaded file will only show up after the next 5 have also finished uploading. My issue with this is that I want to monitor the upload process on the backend and kick off some processes once a file finishes uploading, and this introduces noise into that process. Nothing serious, but I think if this can be fixed easily it should be. |
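For context on the backend side, here is a minimal sketch of how such monitoring might look. It assumes (this is not stated in the thread) that the bucket is configured to send S3 event notifications and that the backend receives the standard notification payload. The function name `completedMultipartObjects` and the trimmed-down interfaces are hypothetical. The point it illustrates is that a multipart object only produces an `ObjectCreated:CompleteMultipartUpload` event once the finish request has been processed, so a delayed finish request delays the backend too.

```typescript
// Trimmed-down shape of an S3 event notification; only the fields used here.
interface S3EventRecord {
  eventName: string;
  s3: { bucket: { name: string }; object: { key: string } };
}
interface S3Event {
  Records: S3EventRecord[];
}

// Hypothetical helper: return the objects that became visible because a
// multipart upload was completed. Uploading parts alone emits no such event.
function completedMultipartObjects(
  event: S3Event
): Array<{ bucket: string; key: string }> {
  return event.Records
    .filter((r) => r.eventName === 'ObjectCreated:CompleteMultipartUpload')
    .map((r) => ({
      bucket: r.s3.bucket.name,
      // Keys arrive URL-encoded in notifications, with spaces sent as "+".
      key: decodeURIComponent(r.s3.object.key.replace(/\+/g, ' ')),
    }));
}

// Example payload (bucket and key names are made up for illustration).
const sampleEvent: S3Event = {
  Records: [
    {
      eventName: 'ObjectCreated:Put',
      s3: { bucket: { name: 'uploads' }, object: { key: 'small.txt' } },
    },
    {
      eventName: 'ObjectCreated:CompleteMultipartUpload',
      s3: {
        bucket: { name: 'uploads' },
        object: { key: 'videos/my+file%281%29.mp4' },
      },
    },
  ],
};

const completed = completedMultipartObjects(sampleEvent);
console.log(completed);
```

With a handler like this, processing for a file can only start when its finish request reaches S3, regardless of when its last byte was uploaded.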
With "finish" request, are you referring to |
I see! I think we are talking about the same thing. Imagine a queue of two files: one file can have all of its bytes uploaded, but since we prioritise upload requests higher, the other file's upload requests will come to the front of the queue, meaning the first file's finish request (which is a separate request, required by the S3 spec) will come later. I agree that there could be some improvement there. What do you think @aduh95? |
Okay, good! Just to make my complaint a bit more concrete: the sooner the file shows up in S3, the sooner I can do some processing on it. If I have to wait for all files to upload, the whole workflow completes more slowly, especially if I have capacity constraints on the backend processing job, as the files now all queue up at once. I think from your perspective giving priority to finish requests shouldn't be an issue: it's a request with almost no payload, so unless someone is uploading thousands of files it shouldn't really slow down the uploads. |
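To make the queueing effect discussed above concrete, here is a toy simulation. It is not Uppy's actual queue; `simulate`, `Priorities`, and the one-request-per-tick model are all assumptions made for illustration. Each file needs a number of part uploads followed by one finish ("complete") request, and the file only becomes visible in S3 once that finish request has run. The sketch compares the two priority orders.

```typescript
type Kind = 'part' | 'complete';
interface QueuedRequest { file: string; kind: Kind; priority: number; seq: number }
interface Priorities { part: number; complete: number }

// Hypothetical model: every request takes one tick, and at most `limit`
// requests run per tick. Returns "file@tick" in the order files become visible.
function simulate(
  files: Array<{ name: string; parts: number }>,
  priorities: Priorities,
  limit: number
): string[] {
  let seq = 0;
  const pending: QueuedRequest[] = [];
  const partsLeft: Record<string, number> = {};
  for (const f of files) {
    partsLeft[f.name] = f.parts;
    for (let i = 0; i < f.parts; i++) {
      pending.push({ file: f.name, kind: 'part', priority: priorities.part, seq: seq++ });
    }
  }

  const visible: string[] = [];
  let tick = 0;
  while (pending.length > 0) {
    tick++;
    // Highest priority first; ties are served in insertion (FIFO) order.
    pending.sort((a, b) => b.priority - a.priority || a.seq - b.seq);
    for (const req of pending.splice(0, limit)) {
      if (req.kind === 'complete') {
        visible.push(req.file + '@' + tick); // object now exists in the bucket
        continue;
      }
      partsLeft[req.file] -= 1;
      if (partsLeft[req.file] === 0) {
        // All bytes are uploaded; the finish request still has to be queued.
        pending.push({ file: req.file, kind: 'complete', priority: priorities.complete, seq: seq++ });
      }
    }
  }
  return visible;
}

const files = [
  { name: 'a', parts: 2 },
  { name: 'b', parts: 2 },
  { name: 'c', parts: 2 },
];
const uploadsFirst = simulate(files, { part: 1, complete: 0 }, 1);
const completesFirst = simulate(files, { part: 0, complete: 1 }, 1);
console.log(uploadsFirst);   // [ 'a@7', 'b@8', 'c@9' ]
console.log(completesFirst); // [ 'a@3', 'b@6', 'c@9' ]
```

In this model the last file becomes visible at the same tick either way, but with finish requests prioritised the earlier files appear as soon as their own parts are done instead of in a burst at the end.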
I just wanted to chime in and try to give this ticket a nudge 🙂 This can be a pretty unfortunate bug to run into under the right conditions. The impact on upload speed is really felt for larger batches, since uploads that are basically done can end up taking significantly longer depending on the total size of the remaining batch. This also extends to any uploads that are started with existing uploads in flight, which can feel very counterintuitive. For example: if a user adds a new upload before an existing upload would end, that existing upload is now waiting for the new upload to finish. This scenario can feel pretty frustrating, haha. Is there anything that I can do to help move this ticket forward? I tried messing with the priorities defined in packages/@uppy/aws-s3-multipart/src/index.js#L54-L78 |
@nmbrgts could you share the changes you made? Was it a matter of assigning some priority for the |
Initial checklist
Link to runnable example
No response
Steps to reproduce
When I upload multiple files using S3 multipart upload, they only show up in S3 after they have all uploaded, even if Uppy reports that individual files have already uploaded.
Expected behavior
I would want to see the uploaded files appear in S3 as they finish uploading.
Actual behavior
Files only appear in S3 after they have all finished uploading.