[RLlib; docs] Change links and references in code and docs to "Farama foundation's gymnasium" (from "OpenAI gym"). (#32061)
avnishn authored Jan 31, 2023
1 parent 61c411f commit f2b6a6b
Showing 13 changed files with 26 additions and 25 deletions.
2 changes: 1 addition & 1 deletion doc/source/rllib/core-concepts.rst
@@ -119,7 +119,7 @@ Policies
`Policies <rllib-concepts.html#policies>`__ are a core concept in RLlib. In a nutshell, policies are
Python classes that define how an agent acts in an environment.
`Rollout workers <rllib-concepts.html#policy-evaluation>`__ query the policy to determine agent actions.
-In a `gym <rllib-env.html#openai-gym>`__ environment, there is a single agent and policy.
+In a `Farama-Foundation Gymnasium <rllib-env.html#gymnasium>`__ environment, there is a single agent and policy.
In `vector envs <rllib-env.html#vectorized>`__, policy inference is for multiple agents at once,
and in `multi-agent <rllib-env.html#multi-agent-and-hierarchical>`__, there may be multiple policies,
each controlling one or more agents:
8 changes: 4 additions & 4 deletions doc/source/rllib/index.rst
@@ -77,7 +77,7 @@ As a last step, we `evaluate` the trained Algorithm:
:start-after: __rllib-in-60s-begin__
:end-before: __rllib-in-60s-end__

-Note that you can use any OpenAI gym environment as `env`.
+Note that you can use any Farama-Foundation Gymnasium environment as `env`.
In `rollouts` you can, for instance, specify the number of parallel workers to collect samples from the environment.
The `framework` config lets you choose between "tf2", "tf" and "torch" for execution.
You can also tweak RLlib's default `model` config, and set up a separate config for `evaluation`.
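
As a rough illustration of how these options fit together, here is a minimal sketch using the Ray 2.x `AlgorithmConfig` API (the env id `CartPole-v1`, the worker counts, and the iteration count are illustrative choices, not values prescribed by the docs):

    from ray.rllib.algorithms.ppo import PPOConfig

    config = (
        PPOConfig()
        .environment(env="CartPole-v1")        # any registered Gymnasium env id can be used as `env`
        .rollouts(num_rollout_workers=2)       # parallel workers collecting samples
        .framework("torch")                    # "tf2", "tf", or "torch"
        .evaluation(evaluation_num_workers=1)  # separate config for evaluation
    )

    algo = config.build()
    for _ in range(3):
        print(algo.train())                    # one training iteration per call
    algo.evaluate()
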
@@ -159,7 +159,7 @@ click on the dropdowns below:
:animate: fade-in-slide-down

* `RLlib Environments Overview <rllib-env.html>`__
-* `OpenAI Gym <rllib-env.html#openai-gym>`__
+* `Farama-Foundation Gymnasium <rllib-env.html#gymnasium>`__
* `Vectorized <rllib-env.html#vectorized>`__
* `Multi-Agent and Hierarchical <rllib-env.html#multi-agent-and-hierarchical>`__
* `External Agents and Applications <rllib-env.html#external-agents-and-applications>`__
@@ -200,7 +200,7 @@ Feature Overview

**RLlib Environments**
^^^
-Get started with environments supported by RLlib, such as OpenAI Gym, Petting Zoo,
+Get started with environments supported by RLlib, such as Farama Foundation's Gymnasium, Petting Zoo,
and many custom formats for vectorized and multi-agent environments.
+++
.. link-button:: rllib-environments-doc
@@ -220,7 +220,7 @@ Customizing RLlib

RLlib provides simple APIs to customize all aspects of your training- and experimental workflows.
For example, you may code your own `environments <rllib-env.html#configuring-environments>`__
-in python using openAI's gym or DeepMind's OpenSpiel, provide custom
+in python using Farama-Foundation's gymnasium or DeepMind's OpenSpiel, provide custom
`TensorFlow/Keras- <rllib-models.html#tensorflow-models>`__ or
`Torch models <rllib-models.html#torch-models>`_, write your own
`policy- and loss definitions <rllib-concepts.html#policies>`__, or define
6 changes: 3 additions & 3 deletions doc/source/rllib/rllib-env.rst
@@ -7,7 +7,7 @@
Environments
============

-RLlib works with several different types of environments, including `OpenAI Gym <https://www.gymlibrary.dev/>`__, user-defined, multi-agent, and also batched environments.
+RLlib works with several different types of environments, including `Farama-Foundation Gymnasium <https://gymnasium.farama.org/>`__, user-defined, multi-agent, and also batched environments.

.. tip::

@@ -88,10 +88,10 @@ This can be useful if you want to train over an ensemble of different environments

When using logging in an environment, the logging configuration needs to be done inside the environment, which runs inside Ray workers. Any configurations outside the environment, e.g., before starting Ray, will be ignored.

-OpenAI Gym
+Gymnasium
----------

-RLlib uses Gym as its environment interface for single-agent training. For more information on how to implement a custom Gym environment, see the `gym.Env class definition <https://github.com/openai/gym/blob/master/gym/core.py>`__. You may find the `SimpleCorridor <https://github.com/ray-project/ray/blob/master/rllib/examples/custom_env.py>`__ example useful as a reference.
+RLlib uses Gymnasium as its environment interface for single-agent training. For more information on how to implement a custom Gymnasium environment, see the `gymnasium.Env class definition <https://github.com/Farama-Foundation/Gymnasium/blob/main/gymnasium/core.py>`__. You may find the `SimpleCorridor <https://github.com/ray-project/ray/blob/master/rllib/examples/custom_env.py>`__ example useful as a reference.
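
For illustration, a minimal corridor-style environment that follows Gymnasium's `reset`/`step` API could look like the sketch below (the class name, reward values, and corridor length are illustrative and not the actual SimpleCorridor code):

    import gymnasium as gym
    import numpy as np
    from gymnasium.spaces import Box, Discrete

    class CorridorSketch(gym.Env):
        """The agent starts at position 0 and must walk right to reach `end_pos`."""

        def __init__(self, config=None):
            self.end_pos = (config or {}).get("corridor_length", 10)
            self.cur_pos = 0
            self.action_space = Discrete(2)  # 0: move left, 1: move right
            self.observation_space = Box(0.0, float(self.end_pos), shape=(1,), dtype=np.float32)

        def reset(self, *, seed=None, options=None):
            super().reset(seed=seed)
            self.cur_pos = 0
            return np.array([self.cur_pos], dtype=np.float32), {}  # Gymnasium returns (obs, info).

        def step(self, action):
            if action == 0 and self.cur_pos > 0:
                self.cur_pos -= 1
            elif action == 1:
                self.cur_pos += 1
            terminated = self.cur_pos >= self.end_pos
            reward = 1.0 if terminated else -0.1
            # Gymnasium's step() returns (obs, reward, terminated, truncated, info).
            return np.array([self.cur_pos], dtype=np.float32), reward, terminated, False, {}
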

Performance
~~~~~~~~~~~
2 changes: 1 addition & 1 deletion doc/source/rllib/rllib-models.rst
@@ -152,7 +152,7 @@ Custom Preprocessors and Environment Filters
.. warning::

Custom preprocessors have been fully deprecated, since they sometimes conflict with the built-in preprocessors for handling complex observation spaces.
-Please use `wrapper classes <https://github.com/openai/gym/tree/master/gym/wrappers>`__ around your environment instead of preprocessors.
+Please use `wrapper classes <https://github.com/Farama-Foundation/Gymnasium/tree/main/gymnasium/wrappers>`__ around your environment instead of preprocessors.
Note that the built-in **default** Preprocessors described above will still be used and won't be deprecated.
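
For example, a wrapper-based replacement for a custom preprocessor could be as small as the following sketch (the `ClipReward` wrapper and its clipping range are illustrative and not part of RLlib or Gymnasium):

    import gymnasium as gym

    class ClipReward(gym.RewardWrapper):
        """Clips every reward to [-1.0, 1.0] instead of relying on a custom preprocessor."""

        def reward(self, reward):
            return max(-1.0, min(1.0, reward))

    env = ClipReward(gym.make("CartPole-v1"))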

Instead of using the deprecated custom Preprocessors, you should use ``gym.Wrappers`` to preprocess your environment's output (observations and rewards),
2 changes: 1 addition & 1 deletion doc/source/rllib/rllib-training.rst
@@ -38,7 +38,7 @@ You can train DQN with the following commands:
has a number of options you can show by running `rllib train --help`.

Note that you can choose any supported RLlib algorithm (``--algo``) and environment (``--env``).
-RLlib supports any OpenAI Gym environment, as well as a number of other environments
+RLlib supports any Farama-Foundation Gymnasium environment, as well as a number of other environments
(see :ref:`rllib-environments-doc`).
It also supports a large number of algorithms (see :ref:`rllib-algorithms-doc`) to
choose from.
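
For example, `rllib train --algo DQN --env CartPole-v1` trains DQN on the Gymnasium CartPole environment; any other supported algorithm and any registered environment can be substituted for these two flags.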
2 changes: 1 addition & 1 deletion rllib/README.rst
@@ -129,7 +129,7 @@ Quick First Experiment
from ray.rllib.algorithms.ppo import PPOConfig
-# Define your problem using python and openAI's gym API:
+# Define your problem using python and Farama-Foundation's gymnasium API:
class ParrotEnv(gym.Env):
"""Environment in which an agent must learn to repeat the seen observations.
7 changes: 4 additions & 3 deletions rllib/algorithms/algorithm_config.py
@@ -1119,9 +1119,10 @@ def environment(
env: The environment specifier. This can either be a tune-registered env,
via `tune.register_env([name], lambda env_ctx: [env object])`,
or a string specifier of an RLlib supported type. In the latter case,
-RLlib will try to interpret the specifier as either an openAI gym env,
-a PyBullet env, a ViZDoomGym env, or a fully qualified classpath to an
-Env class, e.g. "ray.rllib.examples.env.random_env.RandomEnv".
+RLlib will try to interpret the specifier as either a Farama-Foundation
+gymnasium env, a PyBullet env, a ViZDoomGym env, or a fully qualified
+classpath to an Env class, e.g.
+"ray.rllib.examples.env.random_env.RandomEnv".
env_config: Arguments dict passed to the env creator as an EnvContext
object (which is a dict plus the properties: num_rollout_workers,
worker_index, vector_index, and remote).
8 changes: 4 additions & 4 deletions rllib/common.py
@@ -120,8 +120,8 @@ def get_help(key: str) -> str:


train_help = dict(
-env="The environment specifier to use. This could be an openAI gym "
-"specifier (e.g. `CartPole-v1`) or a full class-path (e.g. "
+env="The environment specifier to use. This could be a Farama-Foundation "
+"Gymnasium specifier (e.g. `CartPole-v1`) or a full class-path (e.g. "
"`ray.rllib.examples.env.simple_corridor.SimpleCorridor`).",
config_file="Use the algorithm configuration from this file.",
filetype="The file type of the config file. Defaults to 'yaml' and can also be "
@@ -160,8 +160,8 @@ def get_help(key: str) -> str:
algo="The algorithm or model to train. This may refer to the name of a built-in "
"Algorithm (e.g. RLlib's `DQN` or `PPO`), or a user-defined trainable "
"function or class registered in the Tune registry.",
-env="The environment specifier to use. This could be an openAI gym "
-"specifier (e.g. `CartPole-v1`) or a full class-path (e.g. "
+env="The environment specifier to use. This could be a Farama-Foundation Gymnasium"
+" specifier (e.g. `CartPole-v1`) or a full class-path (e.g. "
"`ray.rllib.examples.env.simple_corridor.SimpleCorridor`).",
local_mode="Run Ray in local mode for easier debugging.",
render="Render the environment while evaluating. Off by default",
2 changes: 1 addition & 1 deletion rllib/evaluation/rollout_worker.py
@@ -145,7 +145,7 @@ def _update_env_seed_if_necessary(
computed_seed: int = worker_idx * max_num_envs_per_workers + vector_idx + seed

# Gymnasium.env.
-# This will silently fail for most OpenAI gyms
+# This will silently fail for most Farama-foundation gymnasium environments.
# (they do nothing and return None per default)
if not hasattr(env, "reset"):
if log_once("env_has_no_reset_method"):
2 changes: 1 addition & 1 deletion rllib/examples/documentation/rllib_on_ray_readme.py
@@ -3,7 +3,7 @@
from ray.rllib.algorithms.ppo import PPOConfig


-# Define your problem using python and openAI's gym API:
+# Define your problem using python and Farama-Foundation's gymnasium API:
class SimpleCorridor(gym.Env):
"""Corridor in which an agent must learn to move right to reach the exit.
2 changes: 1 addition & 1 deletion rllib/examples/documentation/rllib_on_rllib_readme.py
@@ -2,7 +2,7 @@
from ray.rllib.algorithms.ppo import PPOConfig


-# Define your problem using python and openAI's gym API:
+# Define your problem using python and Farama-Foundation's gymnasium API:
class ParrotEnv(gym.Env):
"""Environment in which an agent must learn to repeat the seen observations.
4 changes: 2 additions & 2 deletions rllib/examples/env/cliff_walking_wall_env.py
@@ -8,8 +8,8 @@


class CliffWalkingWallEnv(gym.Env):
-"""Modified version of the CliffWalking environment from OpenAI Gym
-with walls instead of a cliff.
+"""Modified version of the CliffWalking environment from Farama-Foundation's
+Gymnasium with walls instead of a cliff.
### Description
The board is a 4x12 matrix, with (using NumPy matrix indexing):
4 changes: 2 additions & 2 deletions rllib/examples/env/stateless_pendulum.py
@@ -7,8 +7,8 @@
class StatelessPendulum(PendulumEnv):
"""Partially observable variant of the Pendulum gym environment.
-https://github.com/openai/gym/blob/master/gym/envs/classic_control/
-pendulum.py
+https://github.com/Farama-Foundation/Gymnasium/blob/main/gymnasium/envs/
+classic_control/pendulum.py
We delete the angular velocity component of the state, so that it
can only be solved by a memory enhanced model (policy).
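
A rough sketch of how such a partially observable variant can be built on top of Gymnasium's `PendulumEnv` (this assumes `PendulumEnv` assembles its `[cos(theta), sin(theta), theta_dot]` observation in a `_get_obs` helper; it is not the actual RLlib implementation):

    import numpy as np
    from gymnasium.envs.classic_control.pendulum import PendulumEnv
    from gymnasium.spaces import Box

    class StatelessPendulumSketch(PendulumEnv):
        """Pendulum without the angular-velocity component in its observations."""

        def __init__(self, **kwargs):
            super().__init__(**kwargs)
            # Only [cos(theta), sin(theta)] remains observable.
            self.observation_space = Box(low=-1.0, high=1.0, shape=(2,), dtype=np.float32)

        def _get_obs(self):
            # Drop theta_dot from the parent's three-component observation.
            return super()._get_obs()[:2]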
