[Data] Correctly refresh elapsed time in ProgressBar output #46974

Merged
merged 6 commits into from
Aug 7, 2024

scottjlee
Contributor

@scottjlee scottjlee commented Aug 5, 2024

Why are these changes needed?

When Ray Data runs a map operator with a slow UDF, the progress bar's "elapsed time" field is updated only when a task finishes. If each task is slow, the elapsed time stops advancing between task completions and the job appears to be stuck, even though it may still be running normally.

This PR fixes how the progress bar is refreshed, so that the elapsed time ticks up as expected. The following example code shows the change in behavior:

import time
import ray

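def f(batch):
    # Simulate a slow UDF: each call blocks for 10 seconds.
    time.sleep(10)
    return batch

# 50 rows in 50 blocks, so each map task takes about 10 seconds.
ds = ray.data.range(50, override_num_blocks=50).map(f)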
ds.materialize()
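
To make the before/after behavior concrete without running Ray, here is a minimal sketch of the two refresh strategies. `ToyBar` and `simulate` are hypothetical names made up for illustration and are not Ray's `ProgressBar`. The sketch assumes one scheduling-loop step per simulated second and one task finishing every 10 seconds.

```python
class ToyBar:
    """Hypothetical stand-in for a progress bar; not Ray's ProgressBar."""

    def __init__(self, clock):
        self._clock = clock
        self._start = clock()
        self.completed = 0
        self.rendered = []  # elapsed seconds shown at each redraw

    def update(self, n=1):
        # Completing a task advances the counter and redraws the bar.
        self.completed += n
        self.refresh()

    def refresh(self):
        # Redraw only; this is the point where elapsed time is recomputed.
        self.rendered.append(self._clock() - self._start)


def simulate(total_seconds, task_seconds, refresh_every_step):
    now = [0]  # fake clock, in whole seconds
    bar = ToyBar(clock=lambda: now[0])
    for _ in range(total_seconds):
        now[0] += 1  # assumption: one scheduling-loop step per second
        if now[0] % task_seconds == 0:
            bar.update(1)  # a task finished on this step
        elif refresh_every_step:
            bar.refresh()  # no task finished, but redraw anyway
    return bar.rendered


before = simulate(30, 10, refresh_every_step=False)
after = simulate(30, 10, refresh_every_step=True)
print(before)      # [10, 20, 30]
print(len(after))  # 30
```

Without the per-step refresh, the displayed elapsed time jumps in 10-second increments. With it, every simulated second is rendered.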

Related issue number

Closes #44689

Checks

  • I've signed off every commit (by using the -s flag, i.e., git commit -s) in this PR.
  • I've run scripts/format.sh to lint the changes in this PR.
  • I've included any doc changes needed for https://docs.ray.io/en/master/.
    • I've added any new APIs to the API Reference. For example, if I added a
      method in Tune, I've added it in doc/source/tune/api/ under the
      corresponding .rst file.
  • I've made sure the tests are passing. Note that there might be a few flaky tests, see the recent failures at https://flakey-tests.ray.io/
  • Testing Strategy
    • Unit tests
    • Release tests
    • This PR is not tested :(

Scott Lee and others added 3 commits August 5, 2024 16:35
Signed-off-by: Scott Lee <sjl@anyscale.com>
Signed-off-by: Scott Lee <sjl@anyscale.com>
Contributor

@omatthew98 omatthew98 left a comment

nice

Scott Lee added 2 commits August 6, 2024 10:47
Signed-off-by: Scott Lee <sjl@anyscale.com>
Member

@bveeramani bveeramani left a comment

Nice

@@ -320,6 +320,9 @@ def _scheduling_loop_step(self, topology: Topology) -> bool:
        # Update the progress bar to reflect scheduling decisions.
        for op_state in topology.values():
            op_state.refresh_progress_bar(self._resource_manager)
        # Refresh the global progress bar to update elapsed time progress.
        if self._global_info and self._global_info._bar:
            self._global_info._bar.refresh()
Member

Should we add a refresh method to ProgressBar? My impression is that accessing the private ProgressBar._bar attribute here breaks an abstraction barrier.

Contributor Author

yeah good point, that will help clean up the logic as well. added it
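
For readers following the thread, here is a minimal sketch of the shape this suggestion points at: the wrapper exposes a public `refresh()` and the caller no longer touches `_bar`. `FakeBar`, `ProgressBarSketch`, and `scheduling_step` are hypothetical names for illustration, and the code is not taken from the merged commit. It assumes the wrapped object (for example, a tqdm bar) has a `refresh()` method and may be `None` when progress bars are disabled.

```python
from typing import Optional


class FakeBar:
    """Hypothetical stand-in for the wrapped bar object (e.g. a tqdm bar)."""

    def __init__(self) -> None:
        self.refresh_calls = 0

    def refresh(self) -> None:
        self.refresh_calls += 1


class ProgressBarSketch:
    """Illustrative wrapper; not Ray's actual ProgressBar class."""

    def __init__(self, bar: Optional[FakeBar]) -> None:
        # Assumption: _bar is None when progress bars are disabled.
        self._bar = bar

    def refresh(self) -> None:
        # Public entry point, so callers never reach into _bar directly.
        if self._bar is not None:
            self._bar.refresh()


def scheduling_step(global_bar: Optional[ProgressBarSketch]) -> None:
    # The caller only checks that a wrapper exists; the None check on the
    # inner bar lives inside the wrapper.
    if global_bar is not None:
        global_bar.refresh()


enabled = ProgressBarSketch(FakeBar())
scheduling_step(enabled)

disabled = ProgressBarSketch(None)
scheduling_step(disabled)  # no-op, does not raise
scheduling_step(None)      # no-op, does not raise
```

Keeping the `None` check inside the wrapper is what simplifies the call site: the condition in the scheduling loop reduces to a single existence check.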

Signed-off-by: Scott Lee <sjl@anyscale.com>
@scottjlee scottjlee added the go add ONLY when ready to merge, run all tests label Aug 7, 2024
@bveeramani bveeramani merged commit 5cd9a15 into ray-project:master Aug 7, 2024
6 checks passed
dev-goyal pushed a commit to dev-goyal/ray that referenced this pull request Aug 8, 2024
…oject#46974)

When using Ray Data with a slow UDF for map operators, the progress
bar's "elapsed time" item is only updated each time a task finishes. If
the task is slow, this means that the elapsed time is left "hanging" and
appears to be stuck, when this is not necessarily the case.

This PR fixes the refreshing of the progress bar, so that the elapsed
time ticks up as expected. For the following example code, we can see
the change in behavior in the ticking of the progress bar:
```
import time
import ray

def f(batch):
    time.sleep(10)
    return batch

ds = ray.data.range(50, override_num_blocks=50).map(f)
ds.materialize()
```

- Before (ticks every 10 seconds):

https://github.com/user-attachments/assets/d46f0d6f-dacc-4148-a12c-001fcf44f008

- After (ticks every 1 second):

https://github.com/user-attachments/assets/88807215-9e81-434f-b186-b5d597bc3c59

---------

Signed-off-by: Scott Lee <sjl@anyscale.com>
Signed-off-by: Dev <dev.goyal@hinge.co>
Labels
go add ONLY when ready to merge, run all tests
Development

Successfully merging this pull request may close these issues.

[Data] Progress bar doesn't refresh if slow UDF
3 participants