Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Data] Progress Bar: Sort sample in "rows" and remove the duplicate Sort sample. #47106

Merged
merged 12 commits into from
Aug 15, 2024

Conversation

Bye-legumes
Copy link
Contributor

Why are these changes needed?

Currently, the sort sample is not in rows and there is a duplicate sort sample progress bar.
image

With this modification, Sort sample will be also in rows and the additional progress bar will be removed.
image

In fact there should only one sort sample progress bar which is created at

SortTaskSpec.SORT_SAMPLE_SUB_PROGRESS_BAR_NAME,
while the one created in

sub_progress_bar_names=[
                SortTaskSpec.SORT_SAMPLE_SUB_PROGRESS_BAR_NAME,
                ExchangeTaskSpec.MAP_SUB_PROGRESS_BAR_NAME,
                ExchangeTaskSpec.REDUCE_SUB_PROGRESS_BAR_NAME,
            ],

should be deleted.

Related issue number

Checks

  • [√] I've signed off every commit(by using the -s flag, i.e., git commit -s) in this PR.
  • [√] I've run scripts/format.sh to lint the changes in this PR.
  • [ √ I've included any doc changes needed for https://docs.ray.io/en/master/.
    • I've added any new APIs to the API Reference. For example, if I added a
      method in Tune, I've added it in doc/source/tune/api/ under the
      corresponding .rst file.
  • I've made sure the tests are passing. Note that there might be a few flaky tests, see the recent failures at https://flakey-tests.ray.io/
  • Testing Strategy
    • [√] Unit tests
    • [√] Release tests
    • This PR is not tested :(

Signed-off-by: zhilong <zhilong.chen@mail.mcgill.ca>
Signed-off-by: zhilong <zhilong.chen@mail.mcgill.ca>
@Bye-legumes
Copy link
Contributor Author

@scottjlee With this fix, all progress bar (expect two when read the file) are replaced with rows. And the sort sample problem is also fixed.

Signed-off-by: zhilong <zhilong.chen@mail.mcgill.ca>
Signed-off-by: zhilong <zhilong.chen@mail.mcgill.ca>
Comment on lines 173 to 179
if hasattr(result, "num_rows"):
num_rows = result.num_rows
elif hasattr(result, "__len__"):
# For output is DataFrame,i.e. sort_sample
num_rows = len(result)
else:
num_rows = 1
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

could you consolidate this logic and the one in block_until_complete() as a static method in ProgressBar or a utility method in the file?

@@ -125,7 +125,6 @@ def __init__(
"Sort",
input_op,
sub_progress_bar_names=[
SortTaskSpec.SORT_SAMPLE_SUB_PROGRESS_BAR_NAME,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for looking into this. I think we actually want to keep this bar here, so that it is initialized as a sub-progress bar for the AbstractAllToAll op, and we should remove the one that is created in SortTaskSpec.sample_boundaries(). We want the "Sort Sample" to be a sub-bar of the overall Sort bar, like this:
357460721-30aa9fc3-8e96-473e-a794-da4fc023093a

In terms of the concrete change, we can update sample_boundaries() to reference the sub-progress bar that is part of the operator. This will also require us to either:

(1) change sample_boundaries from a static method to class method, and use the object's sub-progress bar
(2) add a sub_progress_bar parameter to sample_boundaries(), and at the callers of this method, pass the SortTaskSpec instance's progress bar (e.g. here)

Signed-off-by: zhilong <zhilong.chen@mail.mcgill.ca>
Signed-off-by: zhilong <zhilong.chen@mail.mcgill.ca>
Signed-off-by: zhilong <zhilong.chen@mail.mcgill.ca>
Copy link
Contributor

@omatthew98 omatthew98 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Changes lgtm, can you upload a new image of what the progress bars look like with these changes + whatever script you used for the example for posterity?

@Bye-legumes
Copy link
Contributor Author

Changes lgtm, can you upload a new image of what the progress bars look like with these changes + whatever script you used for the example for posterity?

Sure!, let me make it better and I will @ you and scott when I finished!

@Bye-legumes Bye-legumes changed the title [Data] Progress Bar: Sort sample in "rows" and remove the duplicate Sort sample. [WIP][Data] Progress Bar: Sort sample in "rows" and remove the duplicate Sort sample. Aug 14, 2024
Signed-off-by: zhilong <zhilong.chen@mail.mcgill.ca>
Signed-off-by: zhilong <zhilong.chen@mail.mcgill.ca>
Signed-off-by: zhilong <zhilong.chen@mail.mcgill.ca>
@Bye-legumes
Copy link
Contributor Author

Bye-legumes commented Aug 14, 2024

Hi, @scottjlee and @omatthew98 here is what look like now.
During exctuion
image
After
image

TOOD: There is one missing step that is: I tried different solutions but the sort sample bar doesn't update although it finished in fetch_until_complete. I double checked the _bar.total and progress in ray/python/ray/data/_internal/progress_bar.py can be update correctly, but it doesn't show.

But at least we have the sort sample inside the main now.

FYI
sample_bar.close() need to be deleted as otherwise, since the sort sample finished early, the sort sample bar will always above the main bar, and looks like this
image

@Bye-legumes Bye-legumes changed the title [WIP][Data] Progress Bar: Sort sample in "rows" and remove the duplicate Sort sample. [Data] Progress Bar: Sort sample in "rows" and remove the duplicate Sort sample. Aug 14, 2024
@scottjlee
Copy link
Contributor

@Bye-legumes Nice, the updated bars look great!

I tried different solutions but the sort sample bar doesn't update although it finished in fetch_until_complete.

For this part, which update are you referring to? In the "after" screenshot you shared, it looks like the Sort Sample bar updated progress to 100%, and shows numerator == denominator?

Also another small question/request. It looks like in the initially created progress bar for Sort Sample (in the "before execution" screenshot), it shows up as its own bar, and not a sub-bar of Sort, and it also shows the index of the bar (Sort Sample 2). This is later corrected by the time execution completes, in the "after" screenshot. Could you take a look at if the bar is being initialized incorrectly?

@Bye-legumes
Copy link
Contributor Author

@Bye-legumes Nice, the updated bars look great!

I tried different solutions but the sort sample bar doesn't update although it finished in fetch_until_complete.

For this part, which update are you referring to? In the "after" screenshot you shared, it looks like the Sort Sample bar updated progress to 100%, and shows numerator == denominator?

Also another small question/request. It looks like in the initially created progress bar for Sort Sample (in the "before execution" screenshot), it shows up as its own bar, and not a sub-bar of Sort, and it also shows the index of the bar (Sort Sample 2). This is later corrected by the time execution completes, in the "after" screenshot. Could you take a look at if the bar is being initialized incorrectly?

  1. During the excution, the sort sample is always 0/1 and cannot be update.
  2. Yeah, that is the question that I left a TODO there. I tried to solve it. But in fact, I passed the sub progress bar to the and I checked the object number. The sub progress bar is passed to the sort sample with fetch_until_finished. But it will keeps as that before finished. Also when the bar starts before any updating. They will all has the number, but it becomes "*- {name} " after update (or set description later.)
    image

Signed-off-by: zhilong <zhilong.chen@mail.mcgill.ca>
@scottjlee
Copy link
Contributor

I think what's going on with the 0/1 in Sort bar is that:

  • When we initialize the sub-progress bar in initialize_sub_progress_bar(), the total is set to 1.
  • After the Sort Sample finishes, we get the row count from the resulting output block, which updates the numerator and denominator accordingly.

So, I think we can initialize the sub progress bar (or call .update() with the updated total) using the n_samples from here.

Regarding the indices in the most recent screenshot you posted looks good actually, since it shows the operators in order with the indices. The description is initialized with <op_name> <position> in tqdm_ray here, then later updated in here. I think it's OK to leave this behavior as is for now. We can decide later if the indices appearing/disappearing is confusing behavior.

@Bye-legumes
Copy link
Contributor Author

I think what's going on with the 0/1 in Sort bar is that:

  • When we initialize the sub-progress bar in initialize_sub_progress_bar(), the total is set to 1.
  • After the Sort Sample finishes, we get the row count from the resulting output block, which updates the numerator and denominator accordingly.

So, I think we can initialize the sub progress bar (or call .update() with the updated total) using the n_samples from here.

Regarding the indices in the most recent screenshot you posted looks good actually, since it shows the operators in order with the indices. The description is initialized with <op_name> <position> in tqdm_ray here, then later updated in here. I think it's OK to leave this behavior as is for now. We can decide later if the indices appearing/disappearing is confusing behavior.

OK. Maybe I can try to solve it later. I just fix any where it calls sort sample,i.e. in aggragate, so it's also a sub_progress bar now
image

sample_bar = ProgressBar(
SortTaskSpec.SORT_SAMPLE_SUB_PROGRESS_BAR_NAME,
len(sample_results),
unit="block",
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

hmm, if we are setting the unit as block here, how come the screenshot in #47106 (comment) shows the unit as row for the Sort Sample bars?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

oh i see, this is only for the case where the bar is not passed from calling sample_boundaries. understood

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i think if we use row as the unit here, it should still be supported right? since you implemented extract_num_rows() which will get the num_rows from blocks?

Signed-off-by: zhilong <zhilong.chen@mail.mcgill.ca>
Copy link
Contributor

@scottjlee scottjlee left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the fix!

@scottjlee scottjlee added the go add ONLY when ready to merge, run all tests label Aug 15, 2024
@scottjlee scottjlee merged commit 2962b1b into ray-project:master Aug 15, 2024
6 checks passed
simonsays1980 pushed a commit to simonsays1980/ray that referenced this pull request Aug 16, 2024
…ort sample. (ray-project#47106)

<!-- Thank you for your contribution! Please review
https://github.com/ray-project/ray/blob/master/CONTRIBUTING.rst before
opening a pull request. -->

<!-- Please add a reviewer to the assignee section when you create a PR.
If you don't have the access to it, we will shortly find a reviewer and
assign them to your PR. -->

## Why are these changes needed?

Currently, the sort sample is not in rows and there is a duplicate sort
sample progress bar.

![image](https://github.com/user-attachments/assets/30aa9fc3-8e96-473e-a794-da4fc023093a)

With this modification, Sort sample will be also in rows and the
additional progress bar will be removed.

![image](https://github.com/user-attachments/assets/f0a3e5b6-3f84-4993-9f03-36a350aa47b0)

In fact there should only one sort sample progress bar which is created
at
https://github.com/ray-project/ray/blob/e066289b374464f1e2692382fdea871eb34e3156/python/ray/data/_internal/planner/exchange/sort_task_spec.py#L166
while the one created in
```
sub_progress_bar_names=[
                SortTaskSpec.SORT_SAMPLE_SUB_PROGRESS_BAR_NAME,
                ExchangeTaskSpec.MAP_SUB_PROGRESS_BAR_NAME,
                ExchangeTaskSpec.REDUCE_SUB_PROGRESS_BAR_NAME,
            ],
``` 
should be deleted.
## Related issue number

<!-- For example: "Closes ray-project#1234" -->

## Checks

- [√] I've signed off every commit(by using the -s flag, i.e., `git
commit -s`) in this PR.
- [√] I've run `scripts/format.sh` to lint the changes in this PR.
- [ √ I've included any doc changes needed for
https://docs.ray.io/en/master/.
- [ ] I've added any new APIs to the API Reference. For example, if I
added a
method in Tune, I've added it in `doc/source/tune/api/` under the
           corresponding `.rst` file.
- [ ] I've made sure the tests are passing. Note that there might be a
few flaky tests, see the recent failures at https://flakey-tests.ray.io/
- Testing Strategy
   - [√] Unit tests
   - [√] Release tests
   - [ ] This PR is not tested :(

---------

Signed-off-by: zhilong <zhilong.chen@mail.mcgill.ca>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
go add ONLY when ready to merge, run all tests
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants