
[Bug]: Action Repetition Issue #436

Closed · 3 tasks done
Mulcek04 opened this issue Sep 3, 2024 · 5 comments · Fixed by #443
Labels: BackEnd (Simulator and communication interface), bug (Something isn't working)

Comments

@Mulcek04 commented Sep 3, 2024

Bug 🐛

I am currently using Sinergym to simulate a typical European residential radiator heating system. The system consists of a condensing boiler with a heating valve and a radiant-convective baseboard (see the OpenStudio screenshot below).
[Screenshot: heating system in OpenStudio]
I've developed a Soft Actor-Critic (SAC) agent using Stable Baselines3 (SB3), based on the Sinergym example file "drl.ipynb". The agent outputs a single action between 10 and 70 degrees to adjust the supply hot-water setpoint. The reward combines energy consumption and temperature violations.
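For reference, the agent follows the standard SB3 setup from that notebook (a minimal sketch; the hyperparameters here are placeholders, not my exact configuration):

from stable_baselines3 import SAC

# `env` is the wrapped Sinergym environment built in the "To Reproduce" section below.
model = SAC('MlpPolicy', env, learning_rate=3e-4, verbose=1)
model.learn(total_timesteps=100_000)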

However, after reviewing the log files generated during training, I noticed an issue. In some episodes, the agent's action is repeated over two consecutive timesteps, and from then on an action taken at time t is only executed at time t+2. For example, as shown in the attached image, the action [44.685577] from row 409 is repeated in the following two timesteps (rows 410 and 411).
[Screenshot: logger CSV rows 409–411 showing the repeated action]
merged_data-5-BUG.csv
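A quick way to spot the repeated executions in the logger CSV (a sketch; Temp_supply_water is the agent's commanded action and SP_T_supply the executed setpoint, adjust the names if your headers differ):

import pandas as pd

df = pd.read_csv('merged_data-5-BUG.csv')
# Rows where the executed setpoint repeats the previous value even though the
# commanded action changed, i.e. the repetition described above.
repeated = df[(df['SP_T_supply'] == df['SP_T_supply'].shift(1)) &
              (df['Temp_supply_water'] != df['Temp_supply_water'].shift(1))]
print(repeated.index.tolist())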

Moreover, this error occurs randomly: in a 30-episode SAC training run, it appeared in episodes 5, 6, 7, 22, and 27. Even though the SAC agent still trains and converges, I am concerned that this “misalignment” may lead to suboptimal policies.
Can anyone help or provide some insights?

To Reproduce

Here I attach my epJSON file and some code. If you need me to provide anything else, please let me know.
Residential_heating_system.zip

# Assumed imports: gymnasium/numpy plus the Sinergym wrappers used below
# (located in sinergym.utils.wrappers in Sinergym 3.x). The custom wrappers
# (ConvertDeltaTempWrapper, Previous_nStep_ObservationWrapper), the MyReward_24h
# reward and the hyperparameters (experiment_name, timesteps_per_hour,
# lambda_energy, lambda_temperature, energy_weight) are defined elsewhere in my code.
import gymnasium as gym
import numpy as np
from sinergym.utils.wrappers import (NormalizeObservation, NormalizeAction,
                                     LoggerWrapper, CSVLogger,
                                     ReduceObservationWrapper)

environment = 'Eplus-20240716_rbc-hot-continuous-v1'
new_variables={
    'T_amb': ('Site Outdoor Air DryBulb Temperature', 'Environment'),
    'DNI':('Site Direct Solar Radiation Rate per Area', 'Environment'),
    'T_diff_upper': ('Zone Air Temperature', 'ROOMS'),
    'T_diff_lower': ('Zone Air Temperature', 'ROOMS'),
    
    'T_rooms': ('Zone Air Temperature', 'ROOMS'),
    'Baseboard_T_inlet': ('Baseboard Water Inlet Temperature', 'ZONE HVAC BASEBOARD RAD CONV WATER'),
    'Baseboard_T_outlet': ('Baseboard Water Outlet Temperature', 'ZONE HVAC BASEBOARD RAD CONV WATER'),
    'SP_T_supply':('System Node Setpoint Temperature', 'Hot Water Loop Supply Outlet Node'),
    'Pump_mass_flow':('Pump Mass Flow Rate', 'CONST SPD PUMP'),
    'Boiler_mass_flow':('Boiler Mass Flow Rate', '90.1-2019 BOILER'),
    'PLR': ('Boiler Part Load Ratio', '90.1-2019 BOILER'),
    'E_boiler[W]': ('Boiler Heating Rate', '90.1-2019 BOILER'),
    'E_boiler[J]':('Boiler Heating Energy', '90.1-2019 BOILER'),
    'E_baseboard[W]':('Baseboard Total Heating Rate', 'ZONE HVAC BASEBOARD RAD CONV WATER'),
    'E_baseboard[J]':('Baseboard Total Heating Energy', 'ZONE HVAC BASEBOARD RAD CONV WATER'),
}
new_meters = {}

from sinergym.utils.rewards import MyReward_24h
reward_kwargs={
    'temperature_variables': ['T_rooms'], 
    'energy_variables': ['E_boiler[W]'],
    'range_comfort_winter': (19, 21),
    'range_comfort_summer': (24, 26),
    'SP_int': 20.0,
    'lambda_energy': lambda_energy, 
    'lambda_temperature': lambda_temperature,       
    'energy_weight': energy_weight 
    }

new_actuators = {'T_water_outlet': ('Schedule:Year','Schedule Value', 'HOT WATER TEMPERATURE LOOP')}
new_action_space = gym.spaces.Box(
    low= np.array([10.], dtype=np.float32), 
    high=np.array([70.], dtype=np.float32),
    shape=(len(new_actuators),),
    dtype=np.float32
    )
extra_conf={
    'timesteps_per_hour': timesteps_per_hour,
    'runperiod':(1,12,2018, 31,12,2018)
    }

env= gym.make(environment,
            env_name=experiment_name,
            reward=MyReward_24h,
            reward_kwargs = reward_kwargs,
            weather_files= 'Juvara_1819_solarModify.epw',
            variables= new_variables,
            meters = new_meters,
            actuators= new_actuators,
            action_space= new_action_space,
            config_params= extra_conf)

env = ConvertDeltaTempWrapper(env, comfort_range = reward_kwargs['range_comfort_winter'])
env = Previous_nStep_ObservationWrapper(env, previous_variables=['T_diff_upper', 'T_diff_lower', 'T_amb'], n= 4)
env = NormalizeObservation(env)
env = NormalizeAction(env, normalize_range=(-1., 1.))
env = LoggerWrapper(env)
env = CSVLogger(env)

obs_reduction=[
  'month', 'day_of_month',
  'PLR', 'Baseboard_T_inlet', 'Baseboard_T_outlet', 'SP_T_supply', 'Pump_mass_flow', 'Boiler_mass_flow',
  'E_boiler[W]', 'E_boiler[J]', 'E_baseboard[W]', 'E_baseboard[J]', 'T_rooms', 
  ]
env = ReduceObservationWrapper(env, obs_reduction=obs_reduction)
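A minimal rollout sketch (the standard Gymnasium interaction loop, not my exact training script) that exercises the wrapped environment and produces the logger CSVs:

obs, info = env.reset()
terminated = truncated = False
while not (terminated or truncated):
    action = env.action_space.sample()  # or model.predict(obs)[0] with the SAC agent
    obs, reward, terminated, truncated, info = env.step(action)
env.close()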

Expected behavior

In addition to the previous error CSV file, I have also uploaded a bug-free logger CSV file. This file records the data from the 8th episode of training, where the actions (Temp_supply_water) provided by the agent at time t are correctly executed at time t+1 (SP_T_supply) without any repeated executions.
merged_data-8.csv
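A small check confirming this alignment (a sketch with the same assumed column names as above):

import pandas as pd

df = pd.read_csv('merged_data-8.csv')
# The action commanded at time t should equal the executed setpoint at t+1
# (small tolerance to avoid float-equality issues).
aligned = (df['SP_T_supply'].shift(-1) - df['Temp_supply_water']).abs() < 1e-4
print(f"{aligned.iloc[:-1].mean():.1%} of timesteps correctly aligned")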

System Info

Describe the characteristic of your environment:

  • Describe how Sinergym was installed: docker
  • Sinergym Version: e.g. 3.5.2


Checklist

  • I have checked that there is no similar issue in the repo (required)
  • I have read the documentation (required)
  • I have provided a minimal working example to reproduce the bug (required)

📝 Please don't forget to include more labels besides bug if necessary.

Mulcek04 added the bug (Something isn't working) label on Sep 3, 2024
Mulcek04 closed this as completed on Sep 6, 2024
@AlejandroCN7 (Member) commented:

Hello @Mulcek04,

Sorry, but I'm currently very busy with maintenance tasks for the tool. There are a few issues similar to the one you're talking about—have you been able to find a solution? I believe it might be due to the implementation of the EnergyPlus API itself. My plan was to update it to the latest version (#430) and then continue investigating to see if I can identify the cause.

In any case, if you've found any information or something that could be useful, I'd really appreciate it :)

AlejandroCN7 reopened this on Sep 6, 2024
AlejandroCN7 added the BackEnd (Simulator and communication interface) label on Sep 6, 2024
@Mulcek04 (Author) commented Sep 6, 2024

Hi @AlejandroCN7,
Following issue #416, I found a potential workaround for the alignment problem: add a time.sleep() call to the _process_action(self, state_argument: int) method in eplus.py, as follows:

class EnergyPlus(object):

    def _process_action(self, state_argument: int) -> None:
        """EnergyPlus callback that sets output actuator value(s) from the last received action.

        Args:
            state_argument (int): EnergyPlus API state
        """
        # If simulation is complete or not initialized --> do nothing
        if self.simulation_complete:
            return
        # Check that the system is ready (only executed if it is not)
        self._init_system(self.energyplus_state)
        if not self.system_ready:
            return
        # If there is no value in the action queue --> do nothing
        if self.act_queue.empty():
            return
        # Get the next action from the queue
        next_action = self.act_queue.get()
        # self.logger.debug('ACTION get from queue: {}'.format(next_action))

        # Here is the modification: wait while the action queue is still empty
        # (requires `import time` at the top of eplus.py)
        while self.act_queue.empty():
            time.sleep(0.01)

        # Set the action values obtained in the actuator handlers
        for i, (act_name, act_handle) in enumerate(self.actuator_handlers.items()):
            self.exchange.set_actuator_value(
                state=state_argument,
                actuator_handle=act_handle,
                actuator_value=next_action[i]
            )

I'm still a beginner in programming; I hope this helps you or anyone else facing a similar problem.

@AlejandroCN7 (Member) commented:

Hi @Mulcek04

Thank you so much for your help! I'll run some tests with the solution you suggested, and if it works, I'll create a patch for the tool.

I'll keep this issue open until I release that version. Thanks again!

@AlejandroCN7 (Member) commented:

Hi @Mulcek04!

Starting from version 3.5.9 of Sinergym, the issue with the delay in the effect of actions should be resolved. The sleep you added to the code was correct because it prevented an extra skip when processing the actions, and it helped me identify the problem.

However, I’ve implemented a more stable solution. You can check the commits related to the simulator in the mentioned PR (#443). Essentially, the EnergyPlus execution thread is interrupted when the action is sent. Then, the reset collects the observation and waits to process the action with step(), ensuring no cycles are lost in the process. In earlier versions (starting from v3.0.0), I couldn't control the order in which actions and observations were processed in the same step, but now I can process the observation first 😄.
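Conceptually, the new flow is a blocking handshake between the EnergyPlus thread and the Gym side. A simplified sketch of the pattern (placeholder helpers, not the actual implementation in the PR):

import queue

obs_queue = queue.Queue(maxsize=1)   # simulator thread -> controller
act_queue = queue.Queue(maxsize=1)   # controller -> simulator thread

def energyplus_callback(state):
    # Runs inside the EnergyPlus thread every timestep: publish the observation
    # first, then block until the matching action has been sent.
    obs_queue.put(read_sensors(state))
    action = act_queue.get()              # blocks, so no timestep can be skipped
    write_actuators(state, action)

def reset():
    # The first observation is produced by the callback before any action is sent,
    # which is why the logger CSVs have one more observation row than action rows.
    return obs_queue.get()

def step(action):
    # Send the action, then wait for the observation that already reflects it.
    act_queue.put(action)
    return obs_queue.get()

def read_sensors(state):
    # Placeholder standing in for the EnergyPlus API variable/meter reads.
    return []

def write_actuators(state, action):
    # Placeholder standing in for the EnergyPlus API actuator writes.
    pass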

If you check the CSV files generated with the Logger, you'll notice that the observations have one more row than the actions (due to the reset). For any given row with the same index, you'll see the observation and info, the action taken in that state, and the reward obtained from that action. So, the action takes effect as soon as it’s sent and is reflected in the observation of the next step. I’ve run tests, and everything seems to be working well, but if I missed anything, I imagine more issues will come up.

I’d like to thank everyone who commented on the incident and helped resolve it; your support has been invaluable ❤️

@Mulcek04 (Author) commented:

Hi @AlejandroCN7 ,
Thank you for the update and solution. I appreciate the detailed explanation and I’m glad you were able to address this issue.
Thanks again for your hard work and for keeping the community informed. If I run into any further issues, I'll be sure to reach out.
