source-s3: get object with context #344
Merged
+2
−2
Description:
Uses the passed-in context in source-s3's `(*s3Store).Read`, so that object retrieval respects context timeouts. Most notably, schema discovery specifies a timeout; if `Read` ignores it, a single extremely large file can take an excessive amount of time to process.
Workflow steps:
(How does one use this feature, and how has it changed)
Documentation links affected:
(list any documentation links that you created, or existing ones that you've identified as needing updates, along with a brief description)
Notes for reviewers:
Along these same lines, the `List` method doesn't use the provided context. Updating `List` to use a context would be a bit more complicated, so I created a separate issue for that work: #345. It isn't causing any immediate problems, but it would be good to take care of at some point.

Also, while investigating discovery from relatively large files, I found our parsing and possibly inference processes surprisingly slow. If parsing and inference were much faster, this particular context deadline would be less important, although still useful to have. I still have some details to work out, and have created a placeholder issue for this potential optimization: estuary/flow#681