-
Notifications
You must be signed in to change notification settings - Fork 6k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[RLlib] Introduce IMPALA off_policyness test with GPU #31485
[RLlib] Introduce IMPALA off_policyness test with GPU #31485
Conversation
Signed-off-by: Artur Niederfahrenhorst <artur@anyscale.com>
Signed-off-by: Artur Niederfahrenhorst <artur@anyscale.com>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this is great man!! separate testing of difference things into different tests!!
just 1 quick question.
@@ -71,7 +71,7 @@ def __init__( | |||
""" | |||
self.capacity = capacity | |||
self.replay_ratio = replay_ratio | |||
self.replay_proportion = None | |||
self.replay_proportion = 1 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
why do you flip this flag here? it looks important.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Uh good point. That's by accident. I wondered why it was None.
I'll change it back and ping you after tests are green again.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
suggested changes from jun
Signed-off-by: Artur Niederfahrenhorst <attaismyname@googlemail.com>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
awesome!
Signed-off-by: Artur Niederfahrenhorst <artur@anyscale.com>
Signed-off-by: Artur Niederfahrenhorst artur@anyscale.com
Why are these changes needed?
IMPALA runs with a GPU learner "by original design". A compilation test without GPU is fine but something like off_policyness depends on learning/sampling speed and we should test it for our standard impala settings (gpu=1, num_rollout_workers=2) - with a GPU and without competing over resources. Only then will we be able to get meaningful measurements of it's off-policyness.
This PR also tries to deflake IMPALA tests.