
Commit

Minor docs improvements
thomaswolgast committed Jan 28, 2025
1 parent 44fb851 commit 67615b5
Showing 6 changed files with 61 additions and 52 deletions.
8 changes: 5 additions & 3 deletions docs/source/advanced_features.rst
@@ -14,15 +14,17 @@ Multi-Stage OPF
The multi-stage OPF problem is an OPF that is performed over multiple time
steps, including constraint satisfaction over multiple time steps, for example,
storage state-of-charge or ramping constraints.
The multi-stage OPF can be implemented by overwriting the :meth:`step` method, as
shown in the
The multi-stage OPF can be implemented by overwriting the :meth:`step` method,
which can be done by inheriting from the :class:`MultiStageOpfEnv` class,
as shown in the
`multi-stage OPF example <https://github.com/Digitalized-Energy-Systems/opfgym/blob/development/opfgym/examples/multi_stage.py>`_.
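
For illustration, below is a rough sketch of the step-override idea. Apart from the :meth:`step` signature, everything here is an assumption: the import path of :class:`OpfEnv`, the attribute names, and the termination logic are invented for this sketch, and the linked example together with the :class:`MultiStageOpfEnv` class remains the authoritative reference.

.. code-block:: python

    from opfgym import OpfEnv  # assumed import path

    class SketchMultiStageEnv(OpfEnv):
        """Hypothetical sketch: keep the episode running over a full horizon."""

        def step(self, action):
            obs, reward, terminated, truncated, info = super().step(action)
            # Instead of terminating after a single OPF, move on to the next
            # time step so that inter-temporal constraints (storage
            # state-of-charge, ramping) span the whole horizon.
            # `current_step` and `horizon` are invented names for this sketch.
            self.current_step += 1
            terminated = self.current_step >= self.horizon
            return obs, reward, terminated, truncated, info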

Security-Constrained OPF
------------------------
The security-constrained OPF problem is an OPF where all constraints are also
considered for the N-1 case with line outages. It can be implemented by adding
a loop to the :meth:`calculate_violations` method, as shown in the
a loop to the :meth:`calculate_violations` method, which can be easily done by
inheriting from the :class:`SecurityConstrainedOpfEnv` class, as shown in the
`security-constrained OPF example <https://github.com/Digitalized-Energy-Systems/opfgym/blob/development/opfgym/examples/security_constrained.py>`_.
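
As a library-agnostic illustration of the N-1 idea, the sketch below uses plain pandapower rather than the OPF-Gym API: it re-runs a power flow once per single line outage and collects the resulting line overloads. The function name and the 100 % loading limit are chosen for this example only.

.. code-block:: python

    import pandapower as pp
    import pandapower.networks as pn

    def n_minus_1_line_overloads(net, max_loading=100.0):
        """Return {outaged line index: overloaded line indices} for all N-1 cases."""
        overloads = {}
        for outage in net.line.index:
            net.line.at[outage, 'in_service'] = False
            try:
                pp.runpp(net)
                overloaded = net.res_line.index[
                    net.res_line.loading_percent > max_loading].tolist()
            except Exception:
                # Power flow did not converge for this outage; treat as a violation.
                overloaded = ['no convergence']
            finally:
                net.line.at[outage, 'in_service'] = True
            if overloaded:
                overloads[outage] = overloaded
        return overloads

    print(n_minus_1_line_overloads(pn.case14()))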

Mixed Continuous and Discrete Actions
6 changes: 6 additions & 0 deletions docs/source/conf.py
@@ -35,3 +35,9 @@

html_theme = 'furo'
html_static_path = ['_static']

html_theme_options = {
"source_repository": "https://github.com/Digitalized-Energy-Systems/opfgym",
"source_branch": "development",
"source_directory": "docs/source/",
}
82 changes: 41 additions & 41 deletions docs/source/environment_design.rst
@@ -18,65 +18,65 @@ Most environment design options are described in detail in

TODO: Work in progress, more information will follow.

Reward function
---------------
.. Reward function
.. ---------------
The reward function represents the goal of the agent. In the case of the RL-OPF,
the goal is to minimize the objective function while satisfying all constraints,
which can be represented by penalties.
.. The reward function represents the goal of the agent. In the case of the RL-OPF,
.. the goal is to minimize the objective function while satisfying all constraints,
.. which can be represented by penalties.
Three different standard reward functions to combine the objective function and
the constraint violations are available:
.. Three different standard reward functions to combine the objective function and
.. the constraint violations are available:
Summation reward
^^^^^^^^^^^^^^^^
In the summation reward, we simply add the penalties :math:`p_i(x)`
for constraint violations in the current state :math:`x`
to the negative objective function value :math:`f(x)`:
.. Summation reward
.. ^^^^^^^^^^^^^^^^
.. In the summation reward, we simply add the penalties :math:`p_i(x)`
.. for constraint violations in the current state :math:`x`
.. to the negative objective function value :math:`f(x)`:
:math:`r = -f(x) - \sum_{i} p_i(x)`
.. :math:`r = -f(x) - \sum_{i} p_i(x)`
Replacement reward
^^^^^^^^^^^^^^^^^^
In the replacement reward, we only provide either the objective function value
as a learning feedback or the penalty:
.. Replacement reward
.. ^^^^^^^^^^^^^^^^^^
.. In the replacement reward, we only provide either the objective function value
.. as a learning feedback or the penalty:
If valid: :math:`r = -f(x) + C`
.. If valid: :math:`r = -f(x) + C`
Else: :math:`r = -\sum_{i} p_i(x)`
.. Else: :math:`r = -\sum_{i} p_i(x)`
Additionally, we need a constant :math:`C` to ensure that the valid reward is
always better than the invalid one.
.. Additionally, we need a constant :math:`C` to ensure that the valid reward is
.. always better than the invalid one.
Parameterized reward
^^^^^^^^^^^^^^^^^^^^
This reward combines the previous two and allows for all possible combinations:
.. Parameterized reward
.. ^^^^^^^^^^^^^^^^^^^^
.. This reward combines the previous two and allows for all possible combinations:
If valid: :math:`r = -f(x) + C_{valid}`
.. If valid: :math:`r = -f(x) + C_{valid}`
Else: :math:`r = w * -f(x) - \sum_{i} p_i(x) - C_{invalid}`
.. Else: :math:`r = w * -f(x) - \sum_{i} p_i(x) - C_{invalid}`
Note that if the objective weight :math:`w` is set to zero, it is equivalent to
the replacement reward. If it is set to one and both constants
:math:`C` are set to zero, it is equivalent to the summation reward.
.. Note that if the objective weight :math:`w` is set to zero, it is equivalent to
.. the replacement reward. If it is set to one and both constants
.. :math:`C` are set to zero, it is equivalent to the summation reward.
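
For illustration, the three reward variants above can be written as one parameterized function. This is a sketch of the formulas, not OPF-Gym's internal implementation, and the function and parameter names are chosen for this example only.

.. code-block:: python

    def reward(objective, penalties, objective_weight=0.5, c_valid=1.0, c_invalid=1.0):
        """Parameterized reward with f(x) = objective and p_i(x) = penalties."""
        valid = not any(penalties)  # assumption: zero penalties mean no violation
        if valid:
            return -objective + c_valid
        return objective_weight * -objective - sum(penalties) - c_invalid

    # Special cases mentioned above:
    # summation reward:   objective_weight=1.0, c_valid=0.0, c_invalid=0.0
    # replacement reward: objective_weight=0.0, c_invalid=0.0, c_valid=C
    print(reward(objective=2.0, penalties=[0.5, 0.0], objective_weight=1.0,
                 c_valid=0.0, c_invalid=0.0))  # summation: -2.0 - 0.5 = -2.5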
Observation space
-----------------
.. Observation space
.. -----------------
TODO: Work in progress, more information will follow.
.. TODO: Work in progress, more information will follow.
Action space
------------
.. Action space
.. ------------
TODO: Work in progress, more information will follow.
.. TODO: Work in progress, more information will follow.
Episode definition
------------------
.. Episode definition
.. ------------------
TODO: Work in progress, more information will follow.
.. TODO: Work in progress, more information will follow.
Training and test data
----------------------
.. Training and test data
.. ----------------------
TODO: Work in progress, more information will follow.
.. TODO: Work in progress, more information will follow.
7 changes: 4 additions & 3 deletions docs/source/getting_started.rst
@@ -76,7 +76,7 @@ design options:
env = QMarket(**kwargs)
# Interact with the environment in the usual way (see above)
# env.reset()
obs, info = env.reset()
# ...
For more information on environment design and why it is important, see
@@ -152,15 +152,16 @@ More details can be found in :ref:`Create Custom Environments`.
return net, profiles
# Note that by inheriting from `OpfEnv`, all env design options are available
# Note that by inheriting from `OpfEnv`, all standard env design options are available
kwargs = {
# Add current line load to the observation space
'add_res_obs': ['line_loading'],
# ...
}
# Load the custom environment
env = CustomEnv(**kwargs)
# Interact with the environment in the usual way (see above)
# env.reset()
obs, info = env.reset()
# ...
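
For completeness, "the usual way" of interacting presumably follows the standard Gymnasium loop, matching the ``obs, info = env.reset()`` call above; the random action below is only a placeholder for a trained RL policy.

.. code-block:: python

    # Continues the snippet above, i.e. env = CustomEnv(**kwargs) already exists.
    obs, info = env.reset()
    terminated = truncated = False
    while not (terminated or truncated):
        action = env.action_space.sample()  # placeholder for a trained policy
        obs, reward, terminated, truncated, info = env.step(action)
    print('Final reward:', reward)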
3 changes: 2 additions & 1 deletion docs/source/index.rst
@@ -3,6 +3,7 @@
You can adapt this file completely to your liking, but it should at least
contain the root `toctree` directive.
OPF-Gym
=======

@@ -24,7 +25,7 @@ benchmarks power grids and time-series data by default.
All pandapower OPF variants can be represented as an RL environment by
*OPF-Gym*. Additionally, advanced OPF problems like multi-stage OPF,
security-constrained OPF, mixed continuous and discrete actions, stochastic OPF,
etc. are easily possible with *OPF-Gym*.
etc. are possible as well.

Contact thomas.wolgast@uol.de for questions, feedback, and collaboration.

7 changes: 3 additions & 4 deletions docs/source/supervised_learning.rst
@@ -1,5 +1,5 @@
Support for Supervised Learning
==========
===============================

While the focus of *OPF-Gym* is on reinforcement learning and its environments,
it also enables comparability with other machine learning approaches like
@@ -30,6 +30,5 @@ with the pandapower conventional OPF to generate ground-truth labels. That is
the case for all provided :ref:`Benchmarks`. However, it might not be the case
for custom environments, especially when implementing advanced OPF concepts
like multi-stage OPF or stochastic OPF. These are not solvable with the
pandapower OPF. In that case, you have to overwrite the
:py:meth:`env.run_optimal_power_flow` method of your custom environment and
provide your own OPF solver.
pandapower OPF. In that case, you also have to provide your own OPF solver to
your custom environment, as described in :ref:`Create Custom Environments`.
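
A hypothetical sketch of that hook, building on the ``CustomEnv`` from the getting-started page: the method name ``run_optimal_power_flow`` is taken from the previous wording of this section, and its signature, the ``self.net`` attribute, and the return convention are assumptions. The external solver is stubbed here by pandapower's conventional OPF.

.. code-block:: python

    import pandapower as pp

    class LabeledCustomEnv(CustomEnv):  # CustomEnv as defined in Create Custom Environments
        def run_optimal_power_flow(self, **kwargs):
            # Call your own solver here (e.g. a multi-stage or stochastic
            # program) and write the resulting setpoints back into self.net so
            # they can serve as ground-truth labels. As a placeholder, fall
            # back to the conventional pandapower OPF:
            pp.runopp(self.net, **kwargs)
            return True  # assumed convention: report whether a solution was found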
