(v2.2.5) - Multi-Objective Reward compatibility in Sinergym (#303)
* Updated Sinergym version from 2.2.4 to 2.2.5

* Updated reward terms dictionary

* Improved environment layer info dict construction

* Added MultiObjectiveReward in wrappers.py

* Enhanced Sinergym logger: using the info dict for reward information in order to be more general

* Fixed LoggerWrapper to match the logger update

* Added timestep and time_elapsed (both 0) to simulator info in reset

* Added fixture with multiobjective wrapper

* Added tests for multiobjective wrapper

* Fixed simulator test for the new info dict dimension

* Fixed pytype errors

* Update documentation modules for API reference documentation

* Documentation: Updated wrapper section

* Update wrapper notebook example

* Documentation: Updated reward section and added issue reference in MultiObjective wrapper section
AlejandroCN7 authored Mar 10, 2023
1 parent 50ce3f7 commit f33f130
Showing 18 changed files with 398 additions and 245 deletions.
@@ -1,4 +1,4 @@
sinergym.utils.wrappers.LoggerWrapper
=====================================

.. currentmodule:: sinergym.utils.wrappers
@@ -0,0 +1,42 @@
sinergym.utils.wrappers.MultiObjectiveReward
============================================

.. currentmodule:: sinergym.utils.wrappers

.. autoclass:: MultiObjectiveReward
:members:
:undoc-members:


.. automethod:: __init__


.. rubric:: Methods

.. autosummary::

~MultiObjectiveReward.__init__
~MultiObjectiveReward.class_name
~MultiObjectiveReward.close
~MultiObjectiveReward.render
~MultiObjectiveReward.reset
~MultiObjectiveReward.step





.. rubric:: Attributes

.. autosummary::

~MultiObjectiveReward.action_space
~MultiObjectiveReward.metadata
~MultiObjectiveReward.np_random
~MultiObjectiveReward.observation_space
~MultiObjectiveReward.render_mode
~MultiObjectiveReward.reward_range
~MultiObjectiveReward.spec
~MultiObjectiveReward.unwrapped


@@ -1,4 +1,4 @@
sinergym.utils.wrappers.MultiObsWrapper
=======================================

.. currentmodule:: sinergym.utils.wrappers
@@ -1,4 +1,4 @@
sinergym.utils.wrappers.NormalizeObservation
============================================

.. currentmodule:: sinergym.utils.wrappers
@@ -1,4 +1,4 @@
sinergym.utils.wrappers.OfficeGridStorageSmoothingActionConstraintsWrapper
==========================================================================

.. currentmodule:: sinergym.utils.wrappers
1 change: 1 addition & 0 deletions docs/source/pages/modules/sinergym.utils.wrappers.rst
@@ -20,6 +20,7 @@
:template: custom-class-template.rst

LoggerWrapper
MultiObjectiveReward
MultiObsWrapper
NormalizeObservation
OfficeGridStorageSmoothingActionConstraintsWrapper
3 changes: 3 additions & 0 deletions docs/source/pages/rewards.rst
@@ -76,6 +76,9 @@ But you can change this configuration using ``gym.make()`` as follows:
'range_comfort_summer': (23.0, 26.0),
'energy_weight': 0.1})
.. note:: By default, the reward class will return the reward value and the terms used in its calculation.
   These terms will be added automatically to the environment's info dict.

.. warning:: When specifying a different reward with `gym.make` than the
default environment ID, it is very important to set the `reward_kwargs`
that are required and therefore do not have a default value.
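The note about reward terms can be illustrated with a minimal sketch. This is not *Sinergym*'s actual implementation: the ``LinearReward`` class, the term names and the penalty values below are hypothetical, chosen only to show the pattern of a reward callable returning both the scalar reward and the terms used to compute it, which the environment then merges into the step's info dict.

```python
# Minimal sketch (NOT Sinergym's real reward class): a reward callable
# returns the scalar reward plus a dict with the terms it was built from.

class LinearReward:
    def __init__(self, energy_weight=0.5):
        self.energy_weight = energy_weight

    def __call__(self, energy_penalty, comfort_penalty):
        # Weighted sum of the two (negative) penalty terms.
        reward = (self.energy_weight * energy_penalty
                  + (1.0 - self.energy_weight) * comfort_penalty)
        terms = {
            'reward_energy': self.energy_weight * energy_penalty,
            'reward_comfort': (1.0 - self.energy_weight) * comfort_penalty,
        }
        return reward, terms

reward_fn = LinearReward(energy_weight=0.1)
reward, terms = reward_fn(energy_penalty=-2.0, comfort_penalty=-1.0)
info = {'timestep': 1}
info.update(terms)  # the environment adds the terms to info automatically
```

With ``energy_weight=0.1`` the comfort term dominates the scalar reward, which is exactly why exposing the individual terms in the info dict is useful for analysis.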
52 changes: 14 additions & 38 deletions docs/source/pages/wrappers.rst
@@ -4,49 +4,25 @@ Wrappers

*Sinergym* provides several **wrappers** to add functionality to the environment that it does not
have by default. Currently, we have developed a **normalization wrapper**,
a **multi-observation wrapper**, a **multi-objective wrapper** and a **logger wrapper**. The code can be found in
`sinergym/sinergym/utils/wrappers.py <https://github.com/ugr-sail/sinergym/blob/main/sinergym/utils/wrappers.py>`__.
You can implement your own wrappers by inheriting from *gym.Wrapper* or one of its variants.

A usage example of these wrappers could be the following:

.. code:: python

    import gymnasium as gym
    import numpy as np

    import sinergym
    from sinergym.utils.wrappers import LoggerWrapper, NormalizeObservation

    env = gym.make('Eplus-5Zone-hot-continuous-v1')
    env = NormalizeObservation(env)
    env = LoggerWrapper(env)
    ...
    for i in range(1):
        obs, info = env.reset()
        rewards = []
        terminated = False
        current_month = 0
        while not terminated:
            a = env.action_space.sample()
            obs, reward, terminated, truncated, info = env.step(a)
            rewards.append(reward)
            if info['month'] != current_month:  # display results every month
                current_month = info['month']
                print('Reward: ', sum(rewards), info)
        print(
            'Episode ',
            i,
            'Mean reward: ',
            np.mean(rewards),
            'Cumulative reward: ',
            sum(rewards))
    env.close()
- **NormalizeObservation**: It is used to transform the observations received from the simulator into values between 0 and 1.

- **LoggerWrapper**: Wrapper for logging all interactions between agent and environment. The logger class can be selected
  in the constructor if another type of logging is required. For more information about the *Sinergym* logger, visit :ref:`Logger`.

- **MultiObjectiveReward**: The environment step will return a reward vector (with the elements selected in the wrapper
  constructor, one per objective) instead of a traditional scalar value. See `#301 <https://github.com/ugr-sail/sinergym/issues/301>`__.

- **MultiObsWrapper**: Stacks the observations received in a history queue (its size is customizable).

.. note:: For examples about how to use these wrappers, visit :ref:`Wrappers example`.
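The multi-objective behavior described above can be sketched in a few lines. This sketch is independent of *gymnasium* and *Sinergym* (the ``DummyEnv`` class and the reward-term names are hypothetical): the wrapper intercepts ``step()`` and replaces the scalar reward with a vector built from the selected reward terms found in the info dict.

```python
# Minimal sketch of the MultiObjectiveReward idea, NOT the real wrapper:
# the scalar reward is replaced by one value per selected objective.

class DummyEnv:
    def step(self, action):
        obs, reward = [0.0], -1.1
        info = {'reward_energy': -0.2, 'reward_comfort': -0.9}
        return obs, reward, False, False, info

class MultiObjectiveRewardSketch:
    def __init__(self, env, reward_terms):
        self.env = env
        self.reward_terms = reward_terms

    def step(self, action):
        obs, _, terminated, truncated, info = self.env.step(action)
        # Vector reward: one element per selected objective term.
        vector_reward = [info[term] for term in self.reward_terms]
        return obs, vector_reward, terminated, truncated, info

env = MultiObjectiveRewardSketch(
    DummyEnv(), reward_terms=['reward_energy', 'reward_comfort'])
obs, reward, terminated, truncated, info = env.step(action=None)
# reward is now a list with one entry per objective, not a scalar
```

A multi-objective agent can then trade off the vector components itself instead of relying on a fixed scalarization baked into the environment.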

.. warning:: The order of the wrappers is really important if you are going to use several at the same time.
   The correct order is **Normalization - Logger - Multi-Objective - MultiObs**, and subsets of it (for example, *Normalization* - *MultiObs* is valid).

.. warning:: If you add observation variables beyond the default ones to the environment, you have
   to update the **value range dictionary** in `sinergym/sinergym/utils/constants.py <https://github.com/ugr-sail/sinergym/blob/main/sinergym/utils/constants.py>`__
   so that normalization can be applied correctly. Otherwise, you will encounter bug `#249 <https://github.com/ugr-sail/sinergym/issues/249>`__.
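The warning about the value range dictionary boils down to a min-max scaling driven by per-variable ``(min, max)`` pairs. A minimal sketch, assuming hypothetical variable names and ranges (the real dictionary lives in ``sinergym/utils/constants.py``):

```python
# Sketch of the min-max normalization that the value range dictionary
# enables: each variable is scaled to [0, 1] using its (min, max) range.
# Variable names and ranges below are hypothetical examples.

RANGES = {
    'outdoor_temperature': (-20.0, 45.0),
    'zone_air_temperature': (10.0, 40.0),
}

def normalize(name, value):
    low, high = RANGES[name]
    # Clip first so out-of-range readings still map into [0, 1].
    value = max(low, min(high, value))
    return (value - low) / (high - low)

print(normalize('zone_air_temperature', 25.0))  # 0.5
```

A variable missing from the dictionary would raise a ``KeyError`` here, which mirrors why unregistered observation variables break normalization in practice.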
