-
Notifications
You must be signed in to change notification settings - Fork 6k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[RLlib] DatasetReader action normalization #27356
[RLlib] DatasetReader action normalization #27356
Conversation
39583cf
to
317f0e4
Compare
Signed-off-by: Charles Sun <charlesjsun@gmail.com>
317f0e4
to
d43d279
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this looks good to me. I think one thing is that if there isn't a test for this already, we should make the test you added apply not just to offline datasets but also to other regular environments.
If you think im wrong lmk :)
@@ -8,7 +8,7 @@ | |||
import ray.data | |||
from ray.rllib.offline.input_reader import InputReader | |||
from ray.rllib.offline.io_context import IOContext | |||
from ray.rllib.offline.json_reader import from_json_data | |||
from ray.rllib.offline.json_reader import from_json_data, postprocess_actions |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nice
This PR doesn't apply in the online case because this just normalizes actions from dataset (i.e. DatasetReader). Online action normalization is handled elsewhere (somewhere in env runner). |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The only other thing I would test if I were you is that if you set any other flag combination of actions_in_input_normalized and normalize_actions that the actions won't be normalized by dataset if they aren't supposed to be
Signed-off-by: Charles Sun <charlesjsun@gmail.com>
Addressed! Added all combinations of actions_in_input_normalized and normalize_actions |
Signed-off-by: Stefan van der Kleij <s.vanderkleij@viroteq.com>
Why are these changes needed?
DatasetReader didn't normalize actions when
actions_in_input_normalized
was False, which meant CRR was using unnormalized actions during training.Added normalization to DatasetReader (same as JsonReader)
Related issue number
Closes #27335
Checks
git commit -s
) in this PR.scripts/format.sh
to lint the changes in this PR.