
[Templates] Unify the batch inference template with an existing Data example #36401

Merged · 16 commits · Jun 15, 2023

Conversation

justinvyu (Contributor) commented Jun 14, 2023

This PR de-duplicates the batch inference template by making it the same as the existing PyTorch GPU batch inference example. A copy is still needed because relative references in the docs don't generate correctly when the notebook code is pulled in directly.

This PR also fixes some typos in the Data example and changes some code so that no warnings show up when running through the example (increasing the model and dataset size so the batch size is reasonable with 4 workers, and using the `weights` kwarg when initializing the ResNet model).
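As a rough sketch of that last change (the exact model and variable names here are assumptions, not taken from this PR's diff), torchvision's `weights` enum replaces the deprecated `pretrained=True` argument, which is what emits the warning:

```python
# Sketch only: the concrete model below is an assumption, not the PR's diff.
from torchvision.models import resnet152, ResNet152_Weights

# Passing the weights enum avoids the deprecation warning emitted by the
# older `pretrained=True` keyword argument.
weights = ResNet152_Weights.DEFAULT
model = resnet152(weights=weights)

# The same enum also provides the matching preprocessing transform.
preprocess = weights.transforms()
```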

Notes

GPU utilization after the change (no warnings about reducing the batch size): [screenshot]

Related issue number

Checks

  • I've signed off every commit (by using the -s flag, i.e., git commit -s) in this PR.
  • I've run scripts/format.sh to lint the changes in this PR.
  • I've included any doc changes needed for https://docs.ray.io/en/master/.
    • I've added any new APIs to the API Reference. For example, if I added a
      method in Tune, I've added it in doc/source/tune/api/ under the
      corresponding .rst file.
  • I've made sure the tests are passing. Note that there might be a few flaky tests; see the recent failures at https://flakey-tests.ray.io/
  • Testing Strategy
    • Unit tests
    • Release tests
    • This PR is not tested :(

justinvyu (Contributor, Author) commented Jun 14, 2023

Template running as release tests: https://buildkite.com/ray-project/release-tests-pr/builds/42209

amogkam (Contributor) left a comment


Overall LGTM, but can we remove the explicit materialize call? It's not necessary and it prevents streaming to the writes.
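For illustration, a minimal sketch of what this suggestion looks like (the dataset path and the predict function are placeholders, not code from the template): writing directly instead of materializing first lets Ray Data stream prediction batches into the Parquet write.

```python
import numpy as np
import ray

def predict_batch(batch: dict) -> dict:
    # Placeholder "model": the real example runs a torch model on the images.
    batch["prediction"] = np.zeros(len(batch["image"]), dtype=np.int64)
    return batch

ds = ray.data.read_images("s3://example-bucket/images")  # placeholder path
predictions = ds.map_batches(predict_batch, batch_size=256)

# With an explicit materialize(), all predictions are computed and held in the
# object store before any output is written:
#   predictions = predictions.materialize()
#   predictions.write_parquet("/tmp/predictions")
#
# Writing directly lets execution stream batches through to the write:
predictions.write_parquet("/tmp/predictions")
```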

justinvyu (Contributor, Author) replied

@amogkam Done. I originally added the materialize call because running take_batch and write_parquet both seemed to trigger full dataset execution. However, that's not actually the case: take_batch only runs the prediction on a small amount of data, and write_parquet finishes the execution on the rest of the data.

So, the predictions are only computed 1x, rather than 2x as I originally thought. Is that correct?

amogkam (Contributor) commented Jun 14, 2023

Yep, that's right. It will only run 1x, barring a few extra samples.
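To illustrate the behavior discussed here (continuing the placeholder sketch above, so names and paths are assumptions): take_batch only executes enough of the pipeline to return a small preview, and the later write_parquet runs the remainder, so the predictions are computed roughly once overall.

```python
# `predictions` is the lazy Dataset of model outputs from the sketch above.
preview = predictions.take_batch(batch_size=3)  # executes only a few rows
print(preview)

# Finishes executing the rest of the dataset and writes the results.
predictions.write_parquet("/tmp/predictions")
```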

@justinvyu added the tests-ok label (The tagger certifies test failures are unrelated and assumes personal liability.) on Jun 15, 2023
@matthewdeng merged commit 2e37a2a into ray-project:master on Jun 15, 2023
justinvyu added a commit that referenced this pull request Jun 15, 2023
…example (#36401)

Signed-off-by: Justin Yu <justinvyu@anyscale.com>
justinvyu added a commit to justinvyu/ray that referenced this pull request Jun 23, 2023
…example (ray-project#36401)

Signed-off-by: Justin Yu <justinvyu@anyscale.com>
arvind-chandra pushed a commit to lmco/ray that referenced this pull request Aug 31, 2023
…example (ray-project#36401)

Signed-off-by: Justin Yu <justinvyu@anyscale.com>
Signed-off-by: e428265 <arvind.chandramouli@lmco.com>