[RLlib] CQL change hparams and data reading strategy #27451

Merged
merged 3 commits into from
Aug 5, 2022

Conversation

Member

@avnishn avnishn commented Aug 3, 2022

Signed-off-by: avnish avnish@anyscale.com


Tweaking CQL the same way I tweaked pretty much every other offline algorithm so far.
Added the dataset reader for faster data reading.

Checks

  • I've signed off every commit(by using the -s flag, i.e., git commit -s) in this PR.
  • I've run scripts/format.sh to lint the changes in this PR.
  • I've included any doc changes needed for https://docs.ray.io/en/master/.
  • I've made sure the tests are passing. Note that there might be a few flaky tests, see the recent failures at https://flakey-tests.ray.io/
  • Testing Strategy
    • Unit tests
    • Release tests
    • This PR is not tested :(

avnishn added 2 commits August 3, 2022 12:57
Signed-off-by: avnish <avnish@anyscale.com>
Signed-off-by: avnish <avnish@anyscale.com>
Member Author

avnishn commented Aug 3, 2022

https://buildkite.com/ray-project/release-tests-pr/builds/12452
merge pending this test passing

@avnishn avnishn requested a review from kouroshHakha August 3, 2022 20:24
Signed-off-by: Avnish <avnishnarayan@gmail.com>
Member Author

avnishn commented Aug 5, 2022

buildkite/ray-builders-pr — Build #41560 failed (3 hours, 2 minutes, 32 seconds)
buildkite/ray-builders-pr/octopus-brain-tune-tests-and-examples-using-rllib — Failed
buildkite/ray-builders-pr/python-medium-k-z — Failed
buildkite/ray-builders-pr/windows-build-and-test — Failed

The failing tests are unrelated to this PR.
Release tests are passing:
[image: release test results]

This is ready to merge

num_gpus: 1
metrics_smoothing_episodes: 5
min_time_s_per_iteration: 30
Contributor

Was this necessary? This is offline RL, so a larger number of iterations should have worked equally well here. If it is necessary, something does not make sense; if it wasn't necessary, we should remove this hparam and instead increase timesteps_total to be consistent with other learning tests.

Member Author

Wait, this parameter controls the logging frequency. I increased it so that we don't run too many unnecessary evaluations.

Contributor

It's not just that; it also sets the training time spent per iteration. You keep running the same training_step() function until that timing requirement is met. In other words, you keep taking gradient updates until that time is reached.
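To make the behavior described above concrete, here is a minimal sketch of a time-gated iteration loop. It is an illustration only, not RLlib's actual implementation: `run_iteration` and its `clock` argument are hypothetical names introduced here, and `training_step` stands in for the algorithm's own `training_step()` method.

```python
import time


def run_iteration(training_step, min_time_s_per_iteration, clock=time.monotonic):
    """Call training_step() until the minimum wall-clock time has elapsed.

    Hypothetical helper for illustration; not RLlib's real iteration logic.
    """
    start = clock()
    results = []
    while True:
        # Each call is assumed to perform one round of gradient updates.
        results.append(training_step())
        # Stop only once the per-iteration time budget has been used up.
        if clock() - start >= min_time_s_per_iteration:
            break
    return results
```

Under this sketch, the number of `training_step()` calls per reported iteration depends on how fast the machine is, so two runs with the same setting can take different numbers of gradient updates. A budget expressed in timesteps or iterations does not have that dependence.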

Contributor

We chatted offline and my concern was more about reproducibility. These nits are not merge blockers. So please merge if it's the only thing holding this back.

Member Author

Will open a separate PR that addresses this issue across all release tests and environments.

@richardliaw richardliaw merged commit 6a31b61 into ray-project:master Aug 5, 2022
Stefan-1313 pushed a commit to Stefan-1313/ray_mod that referenced this pull request Aug 18, 2022
Signed-off-by: Stefan van der Kleij <s.vanderkleij@viroteq.com>
3 participants