[rllib] Port DDPG to the build_tf_policy pattern #5242
This reverts commit 5f64551.
Test FAILed.
fd50692 to 10be568
Test PASSed.
What do these changes do?
This ports DDPG to the policy builder pattern. This is the last major algorithm that needed to be ported.
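For readers unfamiliar with the builder pattern, here is a minimal, self-contained sketch of the idea: a policy class is assembled from plain functions instead of being written as a subclass by hand. Everything in it (`build_policy`, `td_error_loss`, `ToyDDPGPolicy`, the config keys) is hypothetical and for illustration only. It is not RLlib's `build_tf_policy`, whose actual signature and hooks are not reproduced here, and it is not the DDPG loss from this PR.

```python
def build_policy(name, loss_fn, get_default_config=None, before_init=None):
    """Toy builder (hypothetical): returns a new policy class made from functions."""

    class Policy:
        def __init__(self, config=None):
            # Merge user overrides on top of the algorithm's defaults.
            defaults = get_default_config() if get_default_config else {}
            self.config = {**defaults, **(config or {})}
            # Optional hook that runs once the config is available.
            if before_init:
                before_init(self)

        def loss(self, batch):
            # Delegate to the plain function supplied to the builder.
            return loss_fn(self, batch)

    Policy.__name__ = name
    Policy.__qualname__ = name
    return Policy


def td_error_loss(policy, batch):
    """Mean squared one-step TD error over a batch (illustrative only)."""
    gamma = policy.config["gamma"]
    errors = [
        (r + gamma * q_next - q) ** 2
        for r, q, q_next in zip(batch["rewards"], batch["q"], batch["q_next"])
    ]
    return sum(errors) / len(errors)


# Assemble a policy class from the pieces above.
ToyDDPGPolicy = build_policy(
    name="ToyDDPGPolicy",
    loss_fn=td_error_loss,
    get_default_config=lambda: {"gamma": 0.99},
)

policy = ToyDDPGPolicy({"gamma": 0.5})
batch = {"rewards": [1.0, 0.0], "q": [1.0, 1.0], "q_next": [2.0, 2.0]}
loss = policy.loss(batch)
```

The appeal of this style is that each algorithm reduces to a handful of small functions plus one builder call, so shared plumbing lives in a single place.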
Pendulum performance seems to be on par. @joneswong, could you check whether parameter noise exploration still works as expected? There were a lot of changes around its handling in that code.
fyi @qxcv @gehring
Related issue number
Closes #4822
Closes #4788
Linter
I've run scripts/format.sh to lint the changes in this PR.