Homogenise and debug RL solvers and implement action masking in RLlib #308

Merged 12 commits on Jan 23, 2024

@fteicht fteicht commented Jan 19, 2024

This PR both improves the RL experience in scikit-decide and corrects some bugs in the RL solvers:

  • Implements action masking in the RLlib solver for domains that implement _get_applicable_actions_from()
  • Implements bi-directional conversion between numpy arrays and user-defined classes in the scikit-decide spaces deriving from gym.Space
  • Allows DictSpace and TupleSpace to use subspaces that optionally derive from GymSpace, and recursively unwraps and wraps elements belonging to those subspaces
  • Implements a SetSpace space that derives from GymSpace and provides efficient lookup in the contains method
  • Makes the hub domains Maze, SimpleGridWorld and Mastermind solvable by RLlib in addition to StableBaselines
  • Fixes a bug in scikit-decide's gym domain where the reset method did not unwrap the initial observation
  • Fixes a bug in scikit-decide's StableBaselines solver where the _sample_action method did not unwrap the current observation
  • Improves coverage of the RL solvers in the unit tests
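
For readers unfamiliar with action masking (first item above), here is a minimal, library-free sketch of the general idea: the domain reports which actions are applicable in the current state, and the policy assigns zero probability to every other action. The function names `applicable_mask` and `masked_action_probabilities` are hypothetical and are not taken from this PR, scikit-decide, or RLlib; the actual solver code may work quite differently.

```python
import math


def applicable_mask(n_actions, applicable):
    """Build a 0/1 mask from the indices of the applicable actions (hypothetical helper)."""
    mask = [0] * n_actions
    for index in applicable:
        mask[index] = 1
    return mask


def masked_action_probabilities(logits, mask):
    """Softmax over the logits, restricted to the actions whose mask entry is 1."""
    if len(logits) != len(mask):
        raise ValueError("logits and mask must have the same length")
    allowed = [logit for logit, keep in zip(logits, mask) if keep]
    if not allowed:
        raise ValueError("at least one action must be applicable")
    peak = max(allowed)  # subtract the largest allowed logit for numerical stability
    weights = [
        math.exp(logit - peak) if keep else 0.0
        for logit, keep in zip(logits, mask)
    ]
    total = sum(weights)
    return [weight / total for weight in weights]


# Four actions, of which only actions 0 and 2 are applicable in this state.
mask = applicable_mask(4, {0, 2})
probs = masked_action_probabilities([1.0, 5.0, 1.0, -2.0], mask)
```

In this sketch, action 1 has the highest raw score but can never be sampled, because its mask entry is 0.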

@fteicht fteicht added bug Something isn't working enhancement New feature or request python Pull requests that update Python code labels Jan 19, 2024
@fteicht fteicht requested a review from neo-alex January 19, 2024 10:31
@fteicht fteicht self-assigned this Jan 19, 2024
@neo-alex neo-alex left a comment

Great job, thank you!

@fteicht fteicht merged commit 7b40743 into airbus:master Jan 23, 2024
42 checks passed
@fteicht fteicht deleted the rayrllib-restricted-actions branch January 23, 2024 16:17
fteicht added a commit that referenced this pull request Jan 26, 2024
This commit improves the support of unified planning in scikit-decide, notably making it possible to solve UP domains with the RLlib solver. It relies on:

- the recent handling of state-based applicable actions, which are produced by unified planning domains, in the RLlib solver (PR #308);

- the automatic translation between unified planning states and actions and numpy arrays that can be handled by RLlib.
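
As a rough illustration of what a two-way translation between domain objects and array-like encodings involves, the sketch below maps a fixed collection of hashable objects to one-hot vectors and back. It is only an assumption-laden toy: `Move` and `FiniteCodec` are hypothetical names, plain Python lists stand in for numpy arrays, and none of this reflects how scikit-decide or unified planning actually perform the conversion.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Move:
    """Hypothetical user-defined action class (frozen, hence hashable)."""
    dx: int
    dy: int


class FiniteCodec:
    """Maps a fixed collection of hashable elements to one-hot vectors and back."""

    def __init__(self, elements):
        self._elements = list(elements)
        # The dictionary gives constant-time membership tests and index lookup.
        self._index = {element: i for i, element in enumerate(self._elements)}
        if len(self._index) != len(self._elements):
            raise ValueError("elements must be unique")

    def contains(self, element):
        return element in self._index

    def encode(self, element):
        vector = [0] * len(self._elements)
        vector[self._index[element]] = 1
        return vector

    def decode(self, vector):
        # Position of the largest entry identifies the encoded element.
        return self._elements[vector.index(max(vector))]


codec = FiniteCodec([Move(0, 1), Move(1, 0), Move(0, -1)])
encoded = codec.encode(Move(1, 0))
decoded = codec.decode(encoded)
```

The point of the round trip is that a learner only ever sees the numeric encoding, while the domain keeps working with its own classes.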