[Datasets] Support select_columns to select subset of columns #27667
Comments
In addition, as discussed, lazy-first execution plus indexing on column names would be a great UX boost too, but it's not urgent.
Just FYI, one more user request came in at https://ray-distributed.slack.com/archives/C02PHB3SQHH/p1664802732112309 and this should be prioritized.
Anyone who has bandwidth in the near future can feel free to grab this task.
took a quick look, @c21 for
This should now be even easier to implement with the Block API select(): https://sourcegraph.com/github.com/ray-project/ray@master/-/blob/python/ray/data/block.py?L279
ah so something like |
Description
Datasets have add_column() and drop_columns() to add or drop columns. But this is not flexible enough when a user wants to select a subset of existing columns. We can provide a new select_columns API to do it, and also deprecate the existing add/drop_column API.
Use case
Improves UX when a user manipulates a subset of columns.
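To make the request concrete, here is a minimal sketch of the intended semantics in plain Python. It is not the Ray Datasets implementation: rows are modeled as plain dicts, and both helper functions (select_columns and drop_columns below) are hypothetical stand-ins written only for this illustration. The behavior on a missing column (raising KeyError) is also an assumption, not something decided in this issue.

```python
from typing import Any, Dict, Iterable, List

Row = Dict[str, Any]


def select_columns(rows: Iterable[Row], cols: List[str]) -> List[Row]:
    """Hypothetical stand-in for the proposed API: keep only `cols`."""
    selected = []
    for row in rows:
        # Assumption: asking for a column that does not exist is an error.
        missing = [c for c in cols if c not in row]
        if missing:
            raise KeyError(f"columns not found: {missing}")
        selected.append({c: row[c] for c in cols})
    return selected


def drop_columns(rows: Iterable[Row], cols: List[str]) -> List[Row]:
    """Hypothetical stand-in for the existing drop-style API."""
    to_drop = set(cols)
    return [{k: v for k, v in row.items() if k not in to_drop} for row in rows]


rows = [
    {"id": 1, "name": "a", "score": 0.5, "debug": "x"},
    {"id": 2, "name": "b", "score": 0.9, "debug": "y"},
]

# With a select-style API the caller names the columns to keep ...
selected = select_columns(rows, ["id", "score"])

# ... whereas with a drop-style API the caller must list everything else.
dropped = drop_columns(rows, ["name", "debug"])
```

Both calls produce the same rows here, but the select-style call stays correct if more columns are later added to the input, which is the flexibility this issue is asking for.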