[RLlib; docs] Change links and references in code and docs to "Farama foundation's gymnasium" (from "OpenAI gym"). (#32061)
avnishn authored Jan 31, 2023
1 parent 61c411f commit f2b6a6b
Showing 13 changed files with 26 additions and 25 deletions.
2 changes: 1 addition & 1 deletion doc/source/rllib/core-concepts.rst
@@ -119,7 +119,7 @@ Policies
`Policies <rllib-concepts.html#policies>`__ are a core concept in RLlib. In a nutshell, policies are
Python classes that define how an agent acts in an environment.
`Rollout workers <rllib-concepts.html#policy-evaluation>`__ query the policy to determine agent actions.
-In a `gym <rllib-env.html#openai-gym>`__ environment, there is a single agent and policy.
+In a `Farama-Foundation Gymnasium <rllib-env.html#gymnasium>`__ environment, there is a single agent and policy.
In `vector envs <rllib-env.html#vectorized>`__, policy inference is for multiple agents at once,
and in `multi-agent <rllib-env.html#multi-agent-and-hierarchical>`__, there may be multiple policies,
each controlling one or more agents:
8 changes: 4 additions & 4 deletions doc/source/rllib/index.rst
@@ -77,7 +77,7 @@ As a last step, we `evaluate` the trained Algorithm:
:start-after: __rllib-in-60s-begin__
:end-before: __rllib-in-60s-end__

-Note that you can use any OpenAI gym environment as `env`.
+Note that you can use any Farama-Foundation Gymnasium environment as `env`.
In `rollouts` you can, for instance, specify the number of parallel workers to collect samples from the environment.
The `framework` config lets you choose between "tf2", "tf" and "torch" for execution.
You can also tweak RLlib's default `model` config, and set up a separate config for `evaluation`.
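
As a rough illustration of how these options fit together, here is a minimal sketch using the Ray 2.x `AlgorithmConfig` API (the env id `CartPole-v1`, the worker counts, and the iteration count are illustrative choices, not values prescribed by the docs):

    from ray.rllib.algorithms.ppo import PPOConfig

    config = (
        PPOConfig()
        .environment(env="CartPole-v1")        # any registered Gymnasium env id can be used as `env`
        .rollouts(num_rollout_workers=2)       # parallel workers collecting samples
        .framework("torch")                    # "tf2", "tf", or "torch"
        .evaluation(evaluation_num_workers=1)  # separate config for evaluation
    )

    algo = config.build()
    for _ in range(3):
        print(algo.train())                    # one training iteration per call
    algo.evaluate()
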
@@ -159,7 +159,7 @@ click on the dropdowns below:
:animate: fade-in-slide-down

* `RLlib Environments Overview <rllib-env.html>`__
-* `OpenAI Gym <rllib-env.html#openai-gym>`__
+* `Farama-Foundation Gymnasium <rllib-env.html#gymnasium>`__
* `Vectorized <rllib-env.html#vectorized>`__
* `Multi-Agent and Hierarchical <rllib-env.html#multi-agent-and-hierarchical>`__
* `External Agents and Applications <rllib-env.html#external-agents-and-applications>`__
@@ -200,7 +200,7 @@ Feature Overview

**RLlib Environments**
^^^
-Get started with environments supported by RLlib, such as OpenAI Gym, Petting Zoo,
+Get started with environments supported by RLlib, such as Farama Foundation's Gymnasium, Petting Zoo,
and many custom formats for vectorized and multi-agent environments.
+++
.. link-button:: rllib-environments-doc
@@ -220,7 +220,7 @@ Customizing RLlib

RLlib provides simple APIs to customize all aspects of your training- and experimental workflows.
For example, you may code your own `environments <rllib-env.html#configuring-environments>`__
-in python using openAI's gym or DeepMind's OpenSpiel, provide custom
+in python using Farama-Foundation's gymnasium or DeepMind's OpenSpiel, provide custom
`TensorFlow/Keras- <rllib-models.html#tensorflow-models>`__ or
`Torch models <rllib-models.html#torch-models>`_, write your own
`policy- and loss definitions <rllib-concepts.html#policies>`__, or define
6 changes: 3 additions & 3 deletions doc/source/rllib/rllib-env.rst
@@ -7,7 +7,7 @@
Environments
============

-RLlib works with several different types of environments, including `OpenAI Gym <https://www.gymlibrary.dev/>`__, user-defined, multi-agent, and also batched environments.
+RLlib works with several different types of environments, including `Farama-Foundation Gymnasium <https://gymnasium.farama.org/>`__, user-defined, multi-agent, and also batched environments.

.. tip::

@@ -88,10 +88,10 @@ This can be useful if you want to train over an ensemble of different environments

When using logging in an environment, the logging configuration needs to be done inside the environment, which runs inside Ray workers. Any configurations outside the environment, e.g., before starting Ray, will be ignored.

-OpenAI Gym
+Gymnasium
----------

-RLlib uses Gym as its environment interface for single-agent training. For more information on how to implement a custom Gym environment, see the `gym.Env class definition <https://github.com/openai/gym/blob/master/gym/core.py>`__. You may find the `SimpleCorridor <https://github.com/ray-project/ray/blob/master/rllib/examples/custom_env.py>`__ example useful as a reference.
+RLlib uses Gymnasium as its environment interface for single-agent training. For more information on how to implement a custom Gymnasium environment, see the `gymnasium.Env class definition <https://github.com/Farama-Foundation/Gymnasium/blob/main/gymnasium/core.py>`__. You may find the `SimpleCorridor <https://github.com/ray-project/ray/blob/master/rllib/examples/custom_env.py>`__ example useful as a reference.
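
For illustration, a minimal corridor-style environment that follows Gymnasium's `reset`/`step` API could look like the sketch below (the class name, reward values, and corridor length are illustrative and not the actual SimpleCorridor code):

    import gymnasium as gym
    import numpy as np
    from gymnasium.spaces import Box, Discrete

    class CorridorSketch(gym.Env):
        """The agent starts at position 0 and must walk right to reach `end_pos`."""

        def __init__(self, config=None):
            self.end_pos = (config or {}).get("corridor_length", 10)
            self.cur_pos = 0
            self.action_space = Discrete(2)  # 0: move left, 1: move right
            self.observation_space = Box(0.0, float(self.end_pos), shape=(1,), dtype=np.float32)

        def reset(self, *, seed=None, options=None):
            super().reset(seed=seed)
            self.cur_pos = 0
            return np.array([self.cur_pos], dtype=np.float32), {}  # Gymnasium returns (obs, info).

        def step(self, action):
            if action == 0 and self.cur_pos > 0:
                self.cur_pos -= 1
            elif action == 1:
                self.cur_pos += 1
            terminated = self.cur_pos >= self.end_pos
            reward = 1.0 if terminated else -0.1
            # Gymnasium's step() returns (obs, reward, terminated, truncated, info).
            return np.array([self.cur_pos], dtype=np.float32), reward, terminated, False, {}
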

Performance
~~~~~~~~~~~
2 changes: 1 addition & 1 deletion doc/source/rllib/rllib-models.rst
@@ -152,7 +152,7 @@ Custom Preprocessors and Environment Filters
.. warning::

Custom preprocessors have been fully deprecated, since they sometimes conflict with the built-in preprocessors for handling complex observation spaces.
-Please use `wrapper classes <https://github.com/openai/gym/tree/master/gym/wrappers>`__ around your environment instead of preprocessors.
+Please use `wrapper classes <https://github.com/Farama-Foundation/Gymnasium/tree/main/gymnasium/wrappers>`__ around your environment instead of preprocessors.
Note that the built-in **default** Preprocessors described above will still be used and won't be deprecated.
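
For example, a wrapper-based replacement for a custom preprocessor could be as small as the following sketch (the `ClipReward` wrapper and its clipping range are illustrative and not part of RLlib or Gymnasium):

    import gymnasium as gym

    class ClipReward(gym.RewardWrapper):
        """Clips every reward to [-1.0, 1.0] instead of relying on a custom preprocessor."""

        def reward(self, reward):
            return max(-1.0, min(1.0, reward))

    env = ClipReward(gym.make("CartPole-v1"))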

Instead of using the deprecated custom Preprocessors, you should use ``gym.Wrappers`` to preprocess your environment's output (observations and rewards),
2 changes: 1 addition & 1 deletion doc/source/rllib/rllib-training.rst
@@ -38,7 +38,7 @@ You can train DQN with the following commands:
has a number of options you can show by running `rllib train --help`.

Note that you can choose any supported RLlib algorithm (``--algo``) and environment (``--env``).
-RLlib supports any OpenAI Gym environment, as well as a number of other environments
+RLlib supports any Farama-Foundation Gymnasium environment, as well as a number of other environments
(see :ref:`rllib-environments-doc`).
It also supports a large number of algorithms (see :ref:`rllib-algorithms-doc`) to
choose from.
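
For example, `rllib train --algo DQN --env CartPole-v1` trains DQN on the Gymnasium CartPole environment; any other supported algorithm and any registered environment can be substituted for these two flags.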
2 changes: 1 addition & 1 deletion rllib/README.rst
@@ -129,7 +129,7 @@ Quick First Experiment
from ray.rllib.algorithms.ppo import PPOConfig
-# Define your problem using python and openAI's gym API:
+# Define your problem using python and Farama-Foundation's gymnasium API:
class ParrotEnv(gym.Env):
"""Environment in which an agent must learn to repeat the seen observations.
7 changes: 4 additions & 3 deletions rllib/algorithms/algorithm_config.py
@@ -1119,9 +1119,10 @@ def environment(
env: The environment specifier. This can either be a tune-registered env,
via `tune.register_env([name], lambda env_ctx: [env object])`,
or a string specifier of an RLlib supported type. In the latter case,
-RLlib will try to interpret the specifier as either an openAI gym env,
-a PyBullet env, a ViZDoomGym env, or a fully qualified classpath to an
-Env class, e.g. "ray.rllib.examples.env.random_env.RandomEnv".
+RLlib will try to interpret the specifier as either a Farama-Foundation
+gymnasium env, a PyBullet env, a ViZDoomGym env, or a fully qualified
+classpath to an Env class, e.g.
+"ray.rllib.examples.env.random_env.RandomEnv".
env_config: Arguments dict passed to the env creator as an EnvContext
object (which is a dict plus the properties: num_rollout_workers,
worker_index, vector_index, and remote).
8 changes: 4 additions & 4 deletions rllib/common.py
@@ -120,8 +120,8 @@ def get_help(key: str) -> str:


train_help = dict(
-env="The environment specifier to use. This could be an openAI gym "
-"specifier (e.g. `CartPole-v1`) or a full class-path (e.g. "
+env="The environment specifier to use. This could be a Farama-Foundation "
+"Gymnasium specifier (e.g. `CartPole-v1`) or a full class-path (e.g. "
"`ray.rllib.examples.env.simple_corridor.SimpleCorridor`).",
config_file="Use the algorithm configuration from this file.",
filetype="The file type of the config file. Defaults to 'yaml' and can also be "
@@ -160,8 +160,8 @@ def get_help(key: str) -> str:
algo="The algorithm or model to train. This may refer to the name of a built-in "
"Algorithm (e.g. RLlib's `DQN` or `PPO`), or a user-defined trainable "
"function or class registered in the Tune registry.",
-env="The environment specifier to use. This could be an openAI gym "
-"specifier (e.g. `CartPole-v1`) or a full class-path (e.g. "
+env="The environment specifier to use. This could be a Farama-Foundation Gymnasium"
+" specifier (e.g. `CartPole-v1`) or a full class-path (e.g. "
"`ray.rllib.examples.env.simple_corridor.SimpleCorridor`).",
local_mode="Run Ray in local mode for easier debugging.",
render="Render the environment while evaluating. Off by default",
2 changes: 1 addition & 1 deletion rllib/evaluation/rollout_worker.py
@@ -145,7 +145,7 @@ def _update_env_seed_if_necessary(
computed_seed: int = worker_idx * max_num_envs_per_workers + vector_idx + seed

# Gymnasium.env.
-# This will silently fail for most OpenAI gyms
+# This will silently fail for most Farama-foundation gymnasium environments.
# (they do nothing and return None per default)
if not hasattr(env, "reset"):
if log_once("env_has_no_reset_method"):
2 changes: 1 addition & 1 deletion rllib/examples/documentation/rllib_on_ray_readme.py
@@ -3,7 +3,7 @@
from ray.rllib.algorithms.ppo import PPOConfig


-# Define your problem using python and openAI's gym API:
+# Define your problem using python and Farama-Foundation's gymnasium API:
class SimpleCorridor(gym.Env):
"""Corridor in which an agent must learn to move right to reach the exit.
2 changes: 1 addition & 1 deletion rllib/examples/documentation/rllib_on_rllib_readme.py
@@ -2,7 +2,7 @@
from ray.rllib.algorithms.ppo import PPOConfig


-# Define your problem using python and openAI's gym API:
+# Define your problem using python and Farama-Foundation's gymnasium API:
class ParrotEnv(gym.Env):
"""Environment in which an agent must learn to repeat the seen observations.
4 changes: 2 additions & 2 deletions rllib/examples/env/cliff_walking_wall_env.py
@@ -8,8 +8,8 @@


class CliffWalkingWallEnv(gym.Env):
-"""Modified version of the CliffWalking environment from OpenAI Gym
-with walls instead of a cliff.
+"""Modified version of the CliffWalking environment from Farama-Foundation's
+Gymnasium with walls instead of a cliff.
### Description
The board is a 4x12 matrix, with (using NumPy matrix indexing):
4 changes: 2 additions & 2 deletions rllib/examples/env/stateless_pendulum.py
@@ -7,8 +7,8 @@
class StatelessPendulum(PendulumEnv):
"""Partially observable variant of the Pendulum gym environment.
-https://github.com/openai/gym/blob/master/gym/envs/classic_control/
-pendulum.py
+https://github.com/Farama-Foundation/Gymnasium/blob/main/gymnasium/envs/
+classic_control/pendulum.py
We delete the angular velocity component of the state, so that it
can only be solved by a memory enhanced model (policy).
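
A rough sketch of how such a partially observable variant can be built on top of Gymnasium's `PendulumEnv` (this assumes `PendulumEnv` assembles its `[cos(theta), sin(theta), theta_dot]` observation in a `_get_obs` helper; it is not the actual RLlib implementation):

    import numpy as np
    from gymnasium.envs.classic_control.pendulum import PendulumEnv
    from gymnasium.spaces import Box

    class StatelessPendulumSketch(PendulumEnv):
        """Pendulum without the angular-velocity component in its observations."""

        def __init__(self, **kwargs):
            super().__init__(**kwargs)
            # Only [cos(theta), sin(theta)] remains observable.
            self.observation_space = Box(low=-1.0, high=1.0, shape=(2,), dtype=np.float32)

        def _get_obs(self):
            # Drop theta_dot from the parent's three-component observation.
            return super()._get_obs()[:2]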
