
[Templates] Unify the batch inference template with an existing Data example #36401

Merged · 16 commits · Jun 15, 2023

Conversation

justinvyu (Contributor) commented Jun 14, 2023

This PR de-duplicates the batch inference template by making it the same as the existing PyTorch GPU batch inference example. A copy is still needed because relative references in the docs don't generate correctly when the notebook code is pulled in directly.

This PR also fixes some typos in the Data example and changes some code so that no warnings show up when running through the example (increasing the model and dataset size so the batch size is reasonable with 4 workers, and using the `weights` kwarg when initializing the ResNet model).
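As a rough sketch of that last change (the exact model and variable names here are assumptions, not taken from this PR's diff), torchvision's `weights` enum replaces the deprecated `pretrained=True` argument, which is what emits the warning:

```python
# Sketch only: the concrete model below is an assumption, not the PR's diff.
from torchvision.models import resnet152, ResNet152_Weights

# Passing the weights enum avoids the deprecation warning emitted by the
# older `pretrained=True` keyword argument.
weights = ResNet152_Weights.DEFAULT
model = resnet152(weights=weights)

# The same enum also provides the matching preprocessing transform.
preprocess = weights.transforms()
```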

Notes

GPU utilization after the change (no warnings about reducing the batch size): [screenshot]

Related issue number

Checks

  • I've signed off every commit (by using the -s flag, i.e., git commit -s) in this PR.
  • I've run scripts/format.sh to lint the changes in this PR.
  • I've included any doc changes needed for https://docs.ray.io/en/master/.
    • I've added any new APIs to the API Reference. For example, if I added a
      method in Tune, I've added it in doc/source/tune/api/ under the
      corresponding .rst file.
  • I've made sure the tests are passing. Note that there might be a few flaky tests; see the recent failures at https://flakey-tests.ray.io/
  • Testing Strategy
    • Unit tests
    • Release tests
    • This PR is not tested :(

justinvyu (Contributor, Author) commented Jun 14, 2023

Template running as release tests: https://buildkite.com/ray-project/release-tests-pr/builds/42209

amogkam (Contributor) left a comment


Overall LGTM, but can we remove the explicit materialize call? It's not necessary and it prevents streaming to the writes.
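For illustration, a minimal sketch of what this suggestion looks like (the dataset path and the predict function are placeholders, not code from the template): writing directly instead of materializing first lets Ray Data stream prediction batches into the Parquet write.

```python
import numpy as np
import ray

def predict_batch(batch: dict) -> dict:
    # Placeholder "model": the real example runs a torch model on the images.
    batch["prediction"] = np.zeros(len(batch["image"]), dtype=np.int64)
    return batch

ds = ray.data.read_images("s3://example-bucket/images")  # placeholder path
predictions = ds.map_batches(predict_batch, batch_size=256)

# With an explicit materialize(), all predictions are computed and held in the
# object store before any output is written:
#   predictions = predictions.materialize()
#   predictions.write_parquet("/tmp/predictions")
#
# Writing directly lets execution stream batches through to the write:
predictions.write_parquet("/tmp/predictions")
```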

justinvyu (Contributor, Author) replied

@amogkam Done. I originally added the materialize call because running take_batch and write_parquet both seemed to trigger full dataset execution. However, that's not actually the case: take_batch only runs the prediction on a small amount of data, and write_parquet finishes the execution on the rest of the data.

So, the predictions are only computed 1x, rather than 2x as I originally thought. Is that correct?

amogkam (Contributor) commented Jun 14, 2023

Yep, that's right. It will only run 1x, barring a few extra samples.
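To illustrate the behavior discussed here (continuing the placeholder sketch above, so names and paths are assumptions): take_batch only executes enough of the pipeline to return a small preview, and the later write_parquet runs the remainder, so the predictions are computed roughly once overall.

```python
# `predictions` is the lazy Dataset of model outputs from the sketch above.
preview = predictions.take_batch(batch_size=3)  # executes only a few rows
print(preview)

# Finishes executing the rest of the dataset and writes the results.
predictions.write_parquet("/tmp/predictions")
```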

@justinvyu added the tests-ok label (The tagger certifies test failures are unrelated and assumes personal liability.) on Jun 15, 2023
@matthewdeng merged commit 2e37a2a into ray-project:master on Jun 15, 2023
justinvyu added a commit that referenced this pull request Jun 15, 2023
…example (#36401)

Signed-off-by: Justin Yu <justinvyu@anyscale.com>
justinvyu added a commit to justinvyu/ray that referenced this pull request Jun 23, 2023
…example (ray-project#36401)

Signed-off-by: Justin Yu <justinvyu@anyscale.com>
arvind-chandra pushed a commit to lmco/ray that referenced this pull request Aug 31, 2023
…example (ray-project#36401)

Signed-off-by: Justin Yu <justinvyu@anyscale.com>
Signed-off-by: e428265 <arvind.chandramouli@lmco.com>