Fix ray_rllib solver after the upgrade to ray 2.7.0 #284
Conversation
b099f6f to 7df3abc
Thanks, please consider the few suggested minor changes and then LGTM.
def test_as_rllib_env_with_autocast_from_singleagent_to_multiagents():
    ENV_NAME = "CartPole-v1"

    domainupcasted = GymDomain(gym.make(ENV_NAME))
As in the previous PR, I suggest renaming it upcast_domain
Of course, I'll change it right away.
    assert RayRLlib.check_domain(domain)

    # solver factory
    config_factory = lambda: PPO.get_default_config().resources(
A config_factory seems overkill here (why not just config?)
Because it crashes later when reusing it with the loaded solver (line 89). When loading, we reinitialize the underlying algorithm, and it refuses to use the previous config, which has been "frozen" during training. So we need a fresh new config at that second step, which is the one testing save/load.
And to be sure of having the same config values as before, I used this factory (perhaps a bit overkill, but I avoid writing the same values twice when I can, since sooner or later it introduces bugs when they get edited in only one place...).
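For reference, here is a minimal sketch of the factory pattern being discussed, assuming ray[rllib] >= 2.7's PPO; the variable names are illustrative only:

```python
from ray.rllib.algorithms.ppo import PPO

# Zero-argument factory returning a fresh, unfrozen AlgorithmConfig each call.
# The config used for training gets frozen once the algorithm is built, so the
# solver re-created at load time needs a brand new config with the same values.
config_factory = lambda: PPO.get_default_config()

train_config = config_factory()   # config for the solver that trains and saves
reload_config = config_factory()  # fresh copy for the solver that loads back
```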
I added a comment in the code to explain it.
Oh OK, and then I suppose that the config being frozen is an issue (even if we don't modify it?)...
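For context, a minimal illustration of the freezing behavior under discussion, assuming ray >= 2.7's AlgorithmConfig API (the lr value is illustrative):

```python
from ray.rllib.algorithms.ppo import PPOConfig

config = PPOConfig()
config.freeze()  # RLlib freezes the config when the algorithm is built from it
# config.training(lr=1e-4)  # mutating a frozen config is rejected by RLlib
```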
This solver class wraps ray.rllib algorithms. Since we upgraded to gymnasium, the ray dependency also moved from 1.10 to >=2.7.0 and the rllib API changed a bit.

Main changes/fixes (some of them illustrated in the sketch after this list):
- ray.rllib.agents becomes ray.rllib.algorithms, Trainer -> Algorithm
- policy_mapping_fn: now maps agent_id + episode + worker -> policy_id
- algo.compute_action() -> compute_single_action()
- MultiAgentEnv now follows the gym 0.26/gymnasium convention => possibility to use the MultiAgentEnvCompatibility rllib wrapper. Consistently with the work done in skdecide.hub.domain.gym, we choose here to keep the old class AsRLlibMultiAgentEnv with the gym 0.21 API and wrap it. More precisely:
  - the old env is renamed AsLegacyRLlibMultiAgentEnv and derives from AsLegacyGymV21Env in order to share some methods and define some missing ones (such as render(), seed(), close())
  - the new AsRLlibMultiAgentEnv derives from MultiAgentEnvCompatibility, which itself derives from MultiAgentEnv
  - MultiAgentEnv needs a get_agent_ids() (or at least an _agent_ids attribute)
  - MultiAgentEnv needs to define observation_space and action_space attributes, preferably as gym.spaces.Dict in order to allow implicit implementation of the new methods action_space_sample() and observation_space_sample()
  - the done output of step() must be a dictionary with a key per agent plus an "__all__" key
- algo.load() takes directly the directory path (which is overwritten at each store)

Other fixes done here:
- __init__() of RayRLlib was using a mutable object as the default value for policy_config, which can lead to unforeseen bugs
- the main script is moved as a test into tests/, together with a test dedicated to AsRLlibMultiAgentEnv
- fix of the "__all__" value of the done dictionary, as MultiAgent domains are supposed to return a dictionary agent_id -> boolean instead of a single boolean
- update of examples/rllib_solver.py
- also test a single agent domain to check that the autocast works properly with rllib
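A minimal sketch of some of the API points above, assuming ray[rllib] >= 2.7 and gymnasium; the agent id "agent_0", the example spaces and the "policy_0" policy name are illustrative placeholders, not names from this PR:

```python
import gymnasium as gym
from ray.rllib.algorithms.ppo import PPOConfig


# policy_mapping_fn now maps agent_id + episode + worker -> policy_id:
def policy_mapping_fn(agent_id, episode, worker, **kwargs):
    return "policy_0"


# MultiAgentEnv spaces, preferably declared as gym.spaces.Dict keyed by agent id,
# so that observation_space_sample()/action_space_sample() come for free:
observation_space = gym.spaces.Dict(
    {"agent_0": gym.spaces.Box(low=-1.0, high=1.0, shape=(4,))}
)
action_space = gym.spaces.Dict({"agent_0": gym.spaces.Discrete(2)})

# The done output of step() is now a dict with one key per agent plus "__all__":
done = {"agent_0": False, "__all__": False}

# Trainer has become Algorithm, configured through AlgorithmConfig objects:
config = PPOConfig().multi_agent(
    policies={"policy_0"},
    policy_mapping_fn=policy_mapping_fn,
)
# Single actions are then computed with algo.compute_single_action(obs, ...)
# instead of the old algo.compute_action(obs).
```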
…class definition. We faced serialization issues with the ray.rllib solver and python 3.10. This seemed to be due to the NewType definitions inside the Domain class: NewType assumes that the resulting type belongs to the module and not to the class, which causes pickle issues. See for instance https://stackoverflow.com/a/4677063 for a similar issue.
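As a minimal illustration of the pattern behind this fix (class and type names here are hypothetical, not the actual scikit-decide definitions):

```python
from typing import NewType

# Before the fix (problematic): the NewType is created inside the class body,
# but NewType assumes the resulting type lives at module level, so pickle may
# fail to resolve it by name later (the issue seen with python 3.10).
class DomainBefore:
    T_observation = NewType("T_observation", object)

# After the fix: the NewType is defined at module level, where pickle can
# resolve it, and the class simply refers to it.
T_observation = NewType("T_observation", object)

class DomainAfter:
    pass  # annotations/methods refer to the module-level T_observation
```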
The tests are hanging forever on macOS 11 in GitHub Actions with the following message looping:

> (autoscaler +5h31m36s) Warning: The following resource request cannot be scheduled right now: {'CPU': 1.0}. This is likely due to all cluster resources being claimed by actors. Consider creating fewer actors or adding more nodes to this Ray cluster.

We limit the CPU quota allowed per worker to avoid this.
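A hedged sketch of how such a per-worker CPU cap can be expressed with the ray >= 2.7 config API (the 0.5 value is illustrative, not necessarily the quota used in CI):

```python
from ray.rllib.algorithms.ppo import PPOConfig

# Cap the CPU resources requested by each rollout worker so that the small CI
# runner can schedule all actors instead of waiting forever on {'CPU': 1.0}.
config = PPOConfig().resources(num_cpus_per_worker=0.5)
```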