Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Uploading from Google Drive hangs after a while #3098

Closed
mifi opened this issue Aug 12, 2021 · 4 comments
Closed

Uploading from Google Drive hangs after a while #3098

mifi opened this issue Aug 12, 2021 · 4 comments
Labels
Bug Companion The auth server (for Instagram, GDrive, etc) and upload proxy (for S3) Remote Provider Remote providers: Google Drive, Instagram, Dropbox, etc

Comments

@mifi
Copy link
Contributor

mifi commented Aug 12, 2021

A person tried to upload a 100GB file many times, but observed that it hangs at 19% (presumably before 40 minutes) with no error in the UI and no error in the companion logs.

I tried to upload a 100GB file from my own google drive, and it seems to progress fine, although I did not try to finish it due to the fact that it will take hours/days to test.

Because companion first downloads the whole file before uploading, but it emits an "Illusive Progress", so if it hung at 19% (less than 50%), it means that it hung while downloading the file from drive. So either something went wrong during downloading on the server (companion), or the client got disconnected from the server after some time. But it's hard to say because no error logged.

I'm thinking it could be due to:

  • a firewall cutting the client's connection after a while
  • a bug in the companion code or dependencies

I tried to reproduce a server connection cut in companion while uploading and I tried running abort() on the requestStream:

...and when doing that I see that companion stops printing the uploader.illusive.progress events, but it does not emit any error and the UI still shows as uploading.

When I call requestStream.emit('error', new Error('test')), I see it successfully logging the error in companion and also reporting an error in the UI, so it seems to successfully propagate server errors to the UI. This leads me to believe that there is something wrong with request or purest (uses request). request is already deprecated so it is no longer supported on newer versions of Node, so it should be replaced. But I see that purest is very tightly integrated into the codebase in all the providers and also the supporting code.

Also if I'm uploading and then cut the browser's connection with companion, while uploading, it also doesn't report any error. I don't know if that's a feature, but if contact with the server is lost, it could indeed produce the issue that the person had where it gets stuck at 19%.

So I cannot really find the cause of this issue. And rewriting companion to use something else than request a big feat and needs a lot of complicated testing because there are so many providers. We could rewrite just the google drive provider, but it's no guarantee it will fix it.

If someone with a multi-gigabit internet connection could test uploading a 100gb file from google drive, I can share the file with you.

@mifi mifi changed the title Uploading from Google Drive seems hangs after a while Uploading from Google Drive hangs after a while Aug 12, 2021
@Murderlon
Copy link
Member

Also if I'm uploading and then cut the browser's connection with companion, while uploading, it also doesn't report any error. I don't know if that's a feature, but if contact with the server is lost, it could indeed produce the issue that the person had where it gets stuck at 19%.

Related to #1658?

@Murderlon Murderlon added Companion The auth server (for Instagram, GDrive, etc) and upload proxy (for S3) Remote Provider Remote providers: Google Drive, Instagram, Dropbox, etc and removed Triage labels Aug 12, 2021
@mifi
Copy link
Contributor Author

mifi commented Aug 12, 2021

Looks like it could be, yes

@mifi
Copy link
Contributor Author

mifi commented Aug 29, 2021

I think I have identified the reason why uploading to transloadit/tus causes the assembly to timeout after 20/40min. As mention earlier, companion will currently completely download the file from google drive (or any other provider) before even starting the upload to TUS. Because downloading a 100gb file can take longer than 20/40 minutes, TUS doesn't get any upload progress events and neither does the assembly because the upload hasn't yet started. Then the assembly will timeout because it doesn't have any data. This assembly timeout might be what causes the progress bar to just stop in the UI without any feedback.

Workarounds that I can think of are:

  1. Rewrite companion to download/upload at the same time (e.g. support Node.js streams). Rewrite Companion providers to use streams to allow simultaneous upload/download without saving to disk #3159
    Issues with this are:
  • Some of the upload destinations (s3 uploader or multipart) may not support streams and need a file to upload (can still save as a file for those). (they all support streams)
  • We need to support backpressure in our implementation. The current provider API does not support this kind of mechanism (simple onData callback):
    onData(this._error(null, { ...resp, body: jsonResp }))
  • So we would need to rewrite the Provider API to a stream based API as well as rewrite and test the 8 different providers and properly handle backpressure.
  • Alternatively, maybe rewrite onData to return a promise. Then the sender could await the promise before calling it again. The receiver would resolve the promise when the stream is ready for more data.
  • Possibly alternatively just use a dumb passthrough stream with no backpressure support. Then if data is read faster than it can be written (or upload hangs), incoming data will be buffered in memory in Node.js and can cause companion to buffer 100gb or more in memory.
  • Possibly implement our own custom passthrough stream implementation that will intelligently buffer the data it receives in a file. Then when the reading end tries to read, it will read from the file at the desired position. If the file has not yet been written up until the requested read position, wait until the file's size is big enough and then read from it (I will try to find an existing node module that does this already).
  1. Periodically send "fake" progress events (e.g. 0 bytes) or some kind of keep-alive to the assembly from the frontend (uppy). However I couldn't find any API for this.

Updating companion to use streams gives more advantages like speed and not needing to store huge files on disk, but it requires a bigger rewrite.

mifi added a commit that referenced this issue Sep 2, 2021
This allows for upload to start almost immediately without having to first download the file.
And it allows for uploading bigger files, because transloadit assembly will not timeout,
as it will get upload progress events all the time.
No longer need for illusive progress.
Also fix eslint warnings and simplify logic

Still TODO: TUS pause/resume has a bug:
tus/tus-js-client#275
mifi added a commit that referenced this issue Sep 3, 2021
This removes the need for disk space as data will be buffered in memory and backpressure will be respected
#3098 (comment)
All providers "download" methods will now return a { stream } which can be consumed by uploader.

Also:
- Remove capacitor (no longer needed)
- Change Provider/SearchProvider API to async (Breaking change for custom companion providers)
- Fix the case with unknown length streams (zoom / google drive). Need to be downloaded first
- rewrite controllers deauth-callback, thumbnail, list, logout to async
- getURLMeta: make sure size is never NaN (NaN gets converted to null in JSON.stringify when sent to client but not when used in backend)
- fix purest mock (it wasn't returning statusCode on('response'))
- add missing http mock for "request" for THUMBNAIL_URL and http://url.myendpoint.com/file (these request errors were never caught by tests previously)
- "upload functions with tus protocol" test: move filename checking to new test where size is null. Fix broken expects
- fix some lint
mifi added a commit that referenced this issue Nov 1, 2021
…ad/download without saving to disk (#3159)

* rewrite to async/await

* Only fetch size (HEAD) if needed #3034

* Update packages/@uppy/companion/src/server/controllers/url.js

Co-authored-by: Antoine du Hamel <duhamelantoine1995@gmail.com>

* Change HEAD to GET in getURLMeta

and abort request immediately upon response headers received
#3034 (comment)

* fix lint

* fix lint

* cut off length of file names

or else we get
"MetadataTooLarge: Your metadata headers exceed the maximum allowed metadata size" in tus / S3

* try to fix flaky test

* remove iife and cleanup code a bit

* fix lint by reordering code

* rename Uploader to MultipartUploader

* Rewrite Uploader to use fs-capacitor #3098

This allows for upload to start almost immediately without having to first download the file.
And it allows for uploading bigger files, because transloadit assembly will not timeout,
as it will get upload progress events all the time.
No longer need for illusive progress.
Also fix eslint warnings and simplify logic

Still TODO: TUS pause/resume has a bug:
tus/tus-js-client#275

* add comment in dev Dashboard and pull out variable

* fix a bug where remote xhr upload would ignore progress events in the UI

* fix bug where s3 multipart cancel wasn't working

* fix also cancel for xhr

* Rewrite providers to use streams

This removes the need for disk space as data will be buffered in memory and backpressure will be respected
#3098 (comment)
All providers "download" methods will now return a { stream } which can be consumed by uploader.

Also:
- Remove capacitor (no longer needed)
- Change Provider/SearchProvider API to async (Breaking change for custom companion providers)
- Fix the case with unknown length streams (zoom / google drive). Need to be downloaded first
- rewrite controllers deauth-callback, thumbnail, list, logout to async
- getURLMeta: make sure size is never NaN (NaN gets converted to null in JSON.stringify when sent to client but not when used in backend)
- fix purest mock (it wasn't returning statusCode on('response'))
- add missing http mock for "request" for THUMBNAIL_URL and http://url.myendpoint.com/file (these request errors were never caught by tests previously)
- "upload functions with tus protocol" test: move filename checking to new test where size is null. Fix broken expects
- fix some lint

* Implement streamingUpload flag

COMPANION_STREAMING_UPLOAD
Default to false due to backward compatibility
If set to true, will start to upload files at the same time as dowlnoading them, by piping the streams

- Also implement progress for downloading too
- and fix progress duplication logic
- fix test that assumed file was fully downloaded after first progress event

* rearrange validation logic

* add COMPANION_STREAMING_UPLOAD to env.test.sh too

* implement maxFileSize option in companion

for both unknown length and known length downloads

* fix bug

* fix memory leak when non 200 status

streams were being kept

* fix lint

* Add backward-compatibility for companion providers

Implement a new static field "version" on providers, which when not set to 2,
will cause a compatibility layer to be added for supporting old callback style provider api

also fix some eslint and rename some vars

* document new provider API

* remove static as it doesn't work on node 10

* try to fix build issue

* degrade to node 14 in github actions

due to hitting this error: nodejs/node#40030
https://github.com/transloadit/uppy/pull/3159/checks?check_run_id=3544858518

* pull out duplicated logic into reusable function

* fix lint

* make methods private

* re-add unsplash download_location request

got lost in merge

* add try/catch

as suggested #3159 (comment)

* Only set default chunkSize if needed

for being more compliant with previous behavior when streamingUpload = false

* Improve flaky test

Trying to fix this error:

FAIL packages/@uppy/utils/src/delay.test.js
  ● delay › should reject when signal is aborted

    expect(received).toBeLessThan(expected)

    Expected: < 70
    Received:   107

      32 |     const time = Date.now() - start
      33 |     expect(time).toBeGreaterThanOrEqual(30)
    > 34 |     expect(time).toBeLessThan(70)
         |                  ^
      35 |   })
      36 | })
      37 |

      at Object.<anonymous> (packages/@uppy/utils/src/delay.test.js:34:18)

https://github.com/transloadit/uppy/runs/3984613454?check_suite_focus=true

* Apply suggestions from code review

Co-authored-by: Antoine du Hamel <duhamelantoine1995@gmail.com>

* fix review feedback & lint

* Apply suggestions from code review

Co-authored-by: Merlijn Vos <merlijn@soverin.net>

* remove unneeded ts-ignore

* Update packages/@uppy/companion/src/server/controllers/url.js

Co-authored-by: Antoine du Hamel <duhamelantoine1995@gmail.com>

* Update packages/@uppy/companion/src/server/Uploader.js

Co-authored-by: Antoine du Hamel <duhamelantoine1995@gmail.com>

* reduce nesting

* fix lint

* optimize promisify

#3159 (comment)

* Update packages/@uppy/companion/test/__tests__/uploader.js

Co-authored-by: Antoine du Hamel <duhamelantoine1995@gmail.com>

Co-authored-by: Antoine du Hamel <duhamelantoine1995@gmail.com>
Co-authored-by: Merlijn Vos <merlijn@soverin.net>
@mifi
Copy link
Contributor Author

mifi commented Nov 1, 2021

The underlying issue causing progress events to not be emitted is now fixed in #3159, so if running companion with the option streamingUpload: true or env COMPANION_STREAMING_UPLOAD=true it should no longer be a problem.

@mifi mifi closed this as completed Nov 1, 2021
HeavenFox pushed a commit to docsend/uppy that referenced this issue Jun 27, 2023
…ad/download without saving to disk (transloadit#3159)

* rewrite to async/await

* Only fetch size (HEAD) if needed transloadit#3034

* Update packages/@uppy/companion/src/server/controllers/url.js

Co-authored-by: Antoine du Hamel <duhamelantoine1995@gmail.com>

* Change HEAD to GET in getURLMeta

and abort request immediately upon response headers received
transloadit#3034 (comment)

* fix lint

* fix lint

* cut off length of file names

or else we get
"MetadataTooLarge: Your metadata headers exceed the maximum allowed metadata size" in tus / S3

* try to fix flaky test

* remove iife and cleanup code a bit

* fix lint by reordering code

* rename Uploader to MultipartUploader

* Rewrite Uploader to use fs-capacitor transloadit#3098

This allows for upload to start almost immediately without having to first download the file.
And it allows for uploading bigger files, because transloadit assembly will not timeout,
as it will get upload progress events all the time.
No longer need for illusive progress.
Also fix eslint warnings and simplify logic

Still TODO: TUS pause/resume has a bug:
tus/tus-js-client#275

* add comment in dev Dashboard and pull out variable

* fix a bug where remote xhr upload would ignore progress events in the UI

* fix bug where s3 multipart cancel wasn't working

* fix also cancel for xhr

* Rewrite providers to use streams

This removes the need for disk space as data will be buffered in memory and backpressure will be respected
transloadit#3098 (comment)
All providers "download" methods will now return a { stream } which can be consumed by uploader.

Also:
- Remove capacitor (no longer needed)
- Change Provider/SearchProvider API to async (Breaking change for custom companion providers)
- Fix the case with unknown length streams (zoom / google drive). Need to be downloaded first
- rewrite controllers deauth-callback, thumbnail, list, logout to async
- getURLMeta: make sure size is never NaN (NaN gets converted to null in JSON.stringify when sent to client but not when used in backend)
- fix purest mock (it wasn't returning statusCode on('response'))
- add missing http mock for "request" for THUMBNAIL_URL and http://url.myendpoint.com/file (these request errors were never caught by tests previously)
- "upload functions with tus protocol" test: move filename checking to new test where size is null. Fix broken expects
- fix some lint

* Implement streamingUpload flag

COMPANION_STREAMING_UPLOAD
Default to false due to backward compatibility
If set to true, will start to upload files at the same time as dowlnoading them, by piping the streams

- Also implement progress for downloading too
- and fix progress duplication logic
- fix test that assumed file was fully downloaded after first progress event

* rearrange validation logic

* add COMPANION_STREAMING_UPLOAD to env.test.sh too

* implement maxFileSize option in companion

for both unknown length and known length downloads

* fix bug

* fix memory leak when non 200 status

streams were being kept

* fix lint

* Add backward-compatibility for companion providers

Implement a new static field "version" on providers, which when not set to 2,
will cause a compatibility layer to be added for supporting old callback style provider api

also fix some eslint and rename some vars

* document new provider API

* remove static as it doesn't work on node 10

* try to fix build issue

* degrade to node 14 in github actions

due to hitting this error: nodejs/node#40030
https://github.com/transloadit/uppy/pull/3159/checks?check_run_id=3544858518

* pull out duplicated logic into reusable function

* fix lint

* make methods private

* re-add unsplash download_location request

got lost in merge

* add try/catch

as suggested transloadit#3159 (comment)

* Only set default chunkSize if needed

for being more compliant with previous behavior when streamingUpload = false

* Improve flaky test

Trying to fix this error:

FAIL packages/@uppy/utils/src/delay.test.js
  ● delay › should reject when signal is aborted

    expect(received).toBeLessThan(expected)

    Expected: < 70
    Received:   107

      32 |     const time = Date.now() - start
      33 |     expect(time).toBeGreaterThanOrEqual(30)
    > 34 |     expect(time).toBeLessThan(70)
         |                  ^
      35 |   })
      36 | })
      37 |

      at Object.<anonymous> (packages/@uppy/utils/src/delay.test.js:34:18)

https://github.com/transloadit/uppy/runs/3984613454?check_suite_focus=true

* Apply suggestions from code review

Co-authored-by: Antoine du Hamel <duhamelantoine1995@gmail.com>

* fix review feedback & lint

* Apply suggestions from code review

Co-authored-by: Merlijn Vos <merlijn@soverin.net>

* remove unneeded ts-ignore

* Update packages/@uppy/companion/src/server/controllers/url.js

Co-authored-by: Antoine du Hamel <duhamelantoine1995@gmail.com>

* Update packages/@uppy/companion/src/server/Uploader.js

Co-authored-by: Antoine du Hamel <duhamelantoine1995@gmail.com>

* reduce nesting

* fix lint

* optimize promisify

transloadit#3159 (comment)

* Update packages/@uppy/companion/test/__tests__/uploader.js

Co-authored-by: Antoine du Hamel <duhamelantoine1995@gmail.com>

Co-authored-by: Antoine du Hamel <duhamelantoine1995@gmail.com>
Co-authored-by: Merlijn Vos <merlijn@soverin.net>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Bug Companion The auth server (for Instagram, GDrive, etc) and upload proxy (for S3) Remote Provider Remote providers: Google Drive, Instagram, Dropbox, etc
Projects
None yet
Development

No branches or pull requests

2 participants