[Data] Always launch one task for read_sql
#48923
Merged
14 commits
cb830b0 Initial commit (bveeramani)
1d9dd2e Merge branch 'master' of https://github.com/ray-project/ray (bveeramani)
c01297b Initial commit (bveeramani)
1b4630c Fix typo (bveeramani)
bed2558 Merge branch 'master' of https://github.com/ray-project/ray (bveeramani)
c40b06d Merge branch 'master' of https://github.com/ray-project/ray (bveeramani)
4731063 Address review comments (bveeramani)
d80b12a Fix test (bveeramani)
a709fe4 Fix test (bveeramani)
f5f10f5 Fix test (bveeramani)
9593bc6 Fix bug (bveeramani)
49a8d71 Merge branch 'master' of https://github.com/ray-project/ray (bveeramani)
7d8aae2 Merge branch 'master' into fix-read-sql (bveeramani)
1f247b3 Fix typo (bveeramani)
Maybe just raise an error. A warning is implicit.
To make sure I understand: if the DB table is huge (1B rows or more), will this be a single-threaded ingest?
Yeah. Many DBAPI implementations don't support multithreading.
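For context, PEP 249 (the Python DB-API specification) defines a module-level `threadsafety` attribute that drivers use to declare how much sharing across threads they tolerate. The snippet below is a minimal sketch of inspecting it, using the standard-library `sqlite3` driver as a stand-in; `describe_threadsafety` and `THREADSAFETY_LEVELS` are hypothetical names for illustration and are not part of Ray or this PR.

```python
import sqlite3

# Meanings of the PEP 249 module-level `threadsafety` attribute.
THREADSAFETY_LEVELS = {
    0: "threads may not share the module",
    1: "threads may share the module, but not connections",
    2: "threads may share the module and connections",
    3: "threads may share the module, connections, and cursors",
}


def describe_threadsafety(dbapi_module):
    """Return (level, meaning) for a DB-API module (hypothetical helper)."""
    level = getattr(dbapi_module, "threadsafety", None)
    return level, THREADSAFETY_LEVELS.get(level, "not declared by this driver")


level, meaning = describe_threadsafety(sqlite3)
print(f"sqlite3.threadsafety = {level}: {meaning}")
```

A driver that reports a low level here cannot safely share one connection between concurrent readers, which is one reason a single reader is the conservative default.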
If I understand this right, with just one task for databases we may end up with very slow ingest and also OOM kills, while for files we can run parallel ingests in a scaled-out fashion.
That's right.
What do we do as an alternative that's both scalable and correct? Many `OFFSET` implementations require scanning the entire database. So, `OFFSET` and `LIMIT` often perform the same as or worse than a single task that reads the entire database.
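To make the trade-off concrete, here is a minimal sketch comparing a single full read with a `LIMIT`/`OFFSET`-sharded read. It uses an in-memory `sqlite3` database; the `movie` table and the helpers `read_all` and `read_shard` are hypothetical and are not Ray's implementation. It assumes the query has no `LIMIT` of its own and ends with a deterministic `ORDER BY`, since otherwise shards can overlap or miss rows.

```python
import sqlite3


def read_all(conn, query):
    """Read the whole result set with one cursor (the single-task approach)."""
    return conn.execute(query).fetchall()


def read_shard(conn, query, shard_index, rows_per_shard):
    """Read one shard by appending LIMIT/OFFSET to the query.

    Assumes `query` ends with a deterministic ORDER BY and has no LIMIT.
    The database still has to produce and skip the first
    `shard_index * rows_per_shard` rows before returning anything.
    """
    sharded_query = f"{query} LIMIT ? OFFSET ?"
    offset = shard_index * rows_per_shard
    return conn.execute(sharded_query, (rows_per_shard, offset)).fetchall()


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE movie (id INTEGER PRIMARY KEY, title TEXT)")
conn.executemany(
    "INSERT INTO movie VALUES (?, ?)",
    [(i, f"title-{i}") for i in range(10)],
)

query = "SELECT id, title FROM movie ORDER BY id"

# One read of everything.
single = read_all(conn, query)

# Four shard reads of up to three rows each, concatenated in shard order.
num_shards, rows_per_shard = 4, 3
sharded = [
    row
    for shard_index in range(num_shards)
    for row in read_shard(conn, query, shard_index, rows_per_shard)
]
```

Both approaches return the same rows here, but each shard query repeats the work of skipping earlier rows, so on engines that implement `OFFSET` by scanning, the sharded total can exceed the cost of one full read.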