Bug: Issue with TD3 for multi-dimensional action spaces #624
Comments
I vaguely remember it should be
na in this case is an Integer, which is 1, so that gets the same result. I will try the inner comma though.
No luck with the inner comma. I did confirm ns is an Int64, so I assume na would/should be as well. So using 1 should work just the same as assigning 1 to na and then using na.
Thanks for the feedback, I'll take a look into it later tonight.
Hi @tyleringebrand, I think the bug comes from the following line: ReinforcementLearning.jl/src/ReinforcementLearningZoo/src/algorithms/policy_gradient/td3.jl, line 162 in 0a8b9a6
I should fix it soon. Thanks again for reporting it.
@all-contributors please add @tyleringebrand for bug
I've put up a pull request to add @tyleringebrand! 🎉
Thanks @findmyway! I tested it on my custom MDP and everything works as expected. For anyone who runs into this before the patch is released (I assume version v0.11), you can get the bug fix using "add ReinforcementLearningZoo#findmyway-patch-7" in the package manager.
I recently tried to use TD3 for a custom MDP I wrote, and ran into an error when I tried to make it work for MDPs with more than one action dimension. I am able to reproduce the same bug with this code:
Note that this code is taken directly from https://juliareinforcementlearning.org/docs/experiments/experiments/Policy%20Gradient/JuliaRL_TD3_Pendulum/#JuliaRL_TD3_Pendulum
except for one change in the trajectory object. See the comments in the code.
The error I get is:
LoadError: DimensionMismatch("mismatch in dimension 2 (expected 64 got 1)")
The error is raised about 40 layers deep in the stack trace, during Zygote automatic differentiation:
My package versions look like:
I am using Julia 1.6.2.
Any help on how to make TD3 work for multidimensional action spaces? Is this a bug or user error?
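For readers unfamiliar with this class of error, here is a minimal sketch of how a message of this form can arise. It is not the TD3 code from the package. The array sizes are hypothetical, and reading 64 as the batch size is an assumption based on the error message. Concatenating along dimension 1 requires every other dimension to agree, so an array whose second dimension is 1 instead of the batch size fails:

```julia
# Hypothetical shapes: 3 state features and a batch of 64 samples
# (64 is assumed to be the batch size, judging by the error message).
states = rand(Float32, 3, 64)

# An array whose second dimension is 1 rather than the batch size.
actions = rand(Float32, 2, 1)

try
    # Concatenating along dimension 1 requires dimension 2 to match.
    cat(states, actions; dims=1)
catch err
    # The generic `cat` reports the disagreement as a DimensionMismatch.
    @assert err isa DimensionMismatch
    println(err)
end

# With a matching batch dimension the concatenation succeeds.
actions_ok = rand(Float32, 2, 64)
@assert size(cat(states, actions_ok; dims=1)) == (5, 64)
```

If this is indeed what happens inside the update step, checking the size of each array that is concatenated or broadcast against the batch should narrow down which one still has a hard-coded dimension of 1.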