Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Datasets] Use read stage name for naming Data-read tasks on Ray Dashboard #34341

Merged
merged 5 commits into from
Apr 13, 2023

Conversation

scottjlee
Copy link
Contributor

Why are these changes needed?

Previously, the name of read tasks uses the generic DoRead name:
screen_shot_2023-04-11_at_4 08 22_pm

This PR updates the naming so that we use the underlying read stage name, if available from the input LazyBlockList, as the resulting MapOperator; otherwise, we fall back to the existing DoRead name. For example, when running this code:

import ray

ctx = ray.data.DataContext.get_current()
ctx.new_execution_backend = True

ds = ray.data.read_parquet("example://iris.parquet")
data = ds.take(5)
import time
time.sleep(20)

This generates the task with the name ReadParquet:
Screenshot at Apr 12 14-52-33

Related issue number

Checks

  • I've signed off every commit(by using the -s flag, i.e., git commit -s) in this PR.
  • I've run scripts/format.sh to lint the changes in this PR.
  • I've included any doc changes needed for https://docs.ray.io/en/master/.
    • I've added any new APIs to the API Reference. For example, if I added a
      method in Tune, I've added it in doc/source/tune/api/ under the
      corresponding .rst file.
  • I've made sure the tests are passing. Note that there might be a few flaky tests, see the recent failures at https://flakey-tests.ray.io/
  • Testing Strategy
    • Unit tests
    • Release tests
    • This PR is not tested :(

Signed-off-by: Scott Lee <sjl@anyscale.com>
@scottjlee scottjlee marked this pull request as ready for review April 13, 2023 00:00
Copy link
Contributor

@ericl ericl left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there a way we can test this?

Copy link
Contributor

@c21 c21 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yeah let's also add a unit test if it's not too hard.

@pcmoritz
Copy link
Contributor

Very nice!

Scott Lee added 2 commits April 13, 2023 10:20
Signed-off-by: Scott Lee <sjl@anyscale.com>
Signed-off-by: Scott Lee <sjl@anyscale.com>
python/ray/data/tests/test_execution_optimizer.py Outdated Show resolved Hide resolved
python/ray/data/tests/test_execution_optimizer.py Outdated Show resolved Hide resolved
Scott Lee added 2 commits April 13, 2023 11:04
Signed-off-by: Scott Lee <sjl@anyscale.com>
Signed-off-by: Scott Lee <sjl@anyscale.com>
@scottjlee scottjlee requested a review from c21 April 13, 2023 20:48
@c21
Copy link
Contributor

c21 commented Apr 13, 2023

Documentation failure is not related to this PR. Merging to master.

@c21 c21 merged commit 3ea9f72 into ray-project:master Apr 13, 2023
vitsai pushed a commit to vitsai/ray that referenced this pull request Apr 17, 2023
…board (ray-project#34341)

This PR updates the naming so that we use the underlying read stage name, if available from the input `LazyBlockList`, as the resulting `MapOperator`; otherwise, we fall back to the existing `DoRead` name.

Signed-off-by: Scott Lee <sjl@anyscale.com>
elliottower pushed a commit to elliottower/ray that referenced this pull request Apr 22, 2023
…board (ray-project#34341)

This PR updates the naming so that we use the underlying read stage name, if available from the input `LazyBlockList`, as the resulting `MapOperator`; otherwise, we fall back to the existing `DoRead` name.

Signed-off-by: Scott Lee <sjl@anyscale.com>
Signed-off-by: elliottower <elliot@elliottower.com>
ProjectsByJackHe pushed a commit to ProjectsByJackHe/ray that referenced this pull request May 4, 2023
…board (ray-project#34341)

This PR updates the naming so that we use the underlying read stage name, if available from the input `LazyBlockList`, as the resulting `MapOperator`; otherwise, we fall back to the existing `DoRead` name.

Signed-off-by: Scott Lee <sjl@anyscale.com>
Signed-off-by: Jack He <jackhe2345@gmail.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants