Ability to add buses to the grid via the Change Table #352

danielolsen · 2020-12-04T04:51:10Z

Purpose

Adding buses via the change table will allow us greater flexibility in constructing scenarios, e.g. modeling 'hybrid' power plants which feature a generator and a storage device behind a single point-of-interconnection with the rest of the grid.
Closes #341.

What the code is doing

In change_table.py:

- Adding the new ChangeTable.add_bus method. This is modeled after the dcline/plant additions: it is passed a list of dicts, validates each one, and eventually adds them to the ct dict, translating from zone_name to zone_id as necessary.
- Modifying the ChangeTable.{add_storage_capacity, add_plant, _add_line} functions so that the check on where new assets are added accepts the newly-added buses as well as the existing grid buses.
- Refactoring the zero-length line check to directly compare the lat/lon of the endpoint buses, rather than passing them to the haversine function and checking if the result is zero (should be equivalent but simpler). This is unrelated to adding buses, but I noticed that it could be simpler so I decided to refactor it.
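
To illustrate the zero-length check, here is a minimal sketch, assuming each bus is a plain dict with "lat" and "lon" keys (the actual code reads these from the bus data frame). The names is_zero_length and haversine_miles are hypothetical and only show why the two checks agree for identical coordinates:

```python
from math import asin, cos, radians, sin, sqrt


def haversine_miles(a, b):
    """Great-circle distance in miles between two (lat, lon) points in degrees."""
    lat1, lon1, lat2, lon2 = map(radians, (*a, *b))
    h = sin((lat2 - lat1) / 2) ** 2 + cos(lat1) * cos(lat2) * sin((lon2 - lon1) / 2) ** 2
    return 2 * 3958.8 * asin(sqrt(h))


def is_zero_length(from_bus, to_bus):
    """Hypothetical helper: True if both endpoints share the same coordinates."""
    return (from_bus["lat"], from_bus["lon"]) == (to_bus["lat"], to_bus["lon"])
```

Comparing the coordinates directly avoids the trigonometry entirely when all we need to know is whether the two endpoints coincide.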

In transform_grid.py:

Adding the TransformGrid._add_bus method to interpret the "new_bus" keys in a ChangeTable, and running this as a part of TransformGrid.get_grid().
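
As a rough sketch of the idea (not the actual TransformGrid code), new buses can be numbered from the largest existing bus id plus one, with the interconnect looked up from the zone. This assumes a plain dict as a stand-in for the bus data frame; add_buses, the zone-to-interconnect mapping, and the bus id below are hypothetical:

```python
def add_buses(bus, new_buses, zone2interconnect):
    """Return a copy of ``bus`` with the new buses appended.

    bus: dict mapping bus id -> dict of attributes (stand-in for a data frame).
    new_buses: list of dicts, each with at least a "zone_id" key.
    zone2interconnect: dict mapping zone id -> interconnect name (illustrative).
    """
    result = {bus_id: dict(attrs) for bus_id, attrs in bus.items()}
    next_id = max(result) + 1  # new ids continue after the largest existing id
    for offset, new_bus in enumerate(new_buses):
        entry = dict(new_bus)
        entry["interconnect"] = zone2interconnect[entry["zone_id"]]
        result[next_id + offset] = entry
    return result


existing = {3001001: {"lat": 29.7, "lon": -95.4, "zone_id": 308, "interconnect": "Texas"}}
updated = add_buses(existing, [{"lat": 30, "lon": -95, "zone_id": 308}], {308: "Texas"})
```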

Testing

New unit tests:

In test_change_table.py:
- Checking that well-specified bus additions are parsed and added properly.
- Checking that several bad new bus inputs raise the appropriate errors.
- Checking that we can add lines, plants, and storage devices at newly-added buses.
In test_transform_grid.py
- Checking that when we add buses, they're given the right numbers, and that the interconnect is correctly added based on the zone.

Time estimate

An hour. Most of the code is in a few big, pretty straightforward chunks, but there are also some smaller changes scattered around, and given the importance of the TransformGrid object, we want to make sure we're not introducing any regressions or new edge cases.

powersimdata/input/change_table.py

rouille · 2020-12-04T05:56:29Z

powersimdata/input/change_table.py

+                    if l not in new_bus.keys():
+                        raise ValueError(f"Each new bus needs {l} info")
+                    if not isinstance(new_bus[l], (int, float)):
+                        raise ValueError(f"{l} must be numeric (int/float)")


Should it be a TypeError?

I never know about these ones. The input is of the right type, but the values within it are the wrong type. I have no idea what is 'right'.

Thinking a little bit more about it, I would say that the first exception should be a KeyError since, after all, this is the error raised if you try to access a non-existing key in a dictionary. I still think that the second one should be a TypeError. Let's see what @BainanXia and @jon-hagg think about it.

I would vote for either TypeError or ValueError. KeyError seems like an implementation detail, which the caller of the function should not need to be aware of. We check the inputs so that we don't end up throwing the KeyError, and can raise something more meaningful to the calling function/user.

@rouille I agree with you: the first one should be a KeyError (Python raises KeyError itself by default), given that the code is trying to find an expected key in a dict but fails, and the second one should be a TypeError, given that it is thrown after a failed isinstance check.
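
For reference, one of the options discussed above can be sketched as follows: keep ValueError for a missing key, so callers never see a raw KeyError, and raise TypeError when a value fails the isinstance check. This is only an illustration of the trade-off, not what the PR settled on, and check_coordinates is a hypothetical name:

```python
def check_coordinates(new_bus):
    """Validate the mandatory lat/lon entries of a new bus dict."""
    for key in ("lat", "lon"):
        if key not in new_bus:
            # The dict itself is the right type, but its content is incomplete.
            raise ValueError(f"Each new bus needs {key} info")
        if not isinstance(new_bus[key], (int, float)):
            # The value is present but has the wrong type.
            raise TypeError(f"{key} must be numeric (int/float)")
```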

That is what we're doing in:

for l in {"lat", "lon"}:
    if l not in new_bus.keys():
        raise ValueError(f"Each new bus needs {l} info")

We don't explicitly say that {"lat", "lon"} is the mandatory set, but that's what it is. We could make the code more explicit about that.

zone_id or zone_name is also a mandatory key. Anyway, it is not very important: checking that the dictionary is complete and raising an error if it is not is good.

We are checking that one and only one of these are specified via

if {"zone_id", "zone_name"} <= set(new_bus.keys()):
    raise ValueError("Cannot specify both 'zone_id' and 'zone_name'")
if {"zone_id", "zone_name"} & set(new_bus.keys()) == set():
    raise ValueError("Must specify either 'zone_id' or 'zone_name'")

I guess we could make that simpler/clearer, e.g.

if len({"zone_id", "zone_name"} & set(new_bus.keys())) != 1:
    raise ValueError("Must specify either 'zone_id' or 'zone_name' (but not both)")
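
A small self-contained sketch of this single-condition check, wrapped in a hypothetical check_zone_keys function, shows that it rejects both the "neither key" and the "both keys" cases:

```python
def check_zone_keys(new_bus):
    """Require exactly one of 'zone_id' / 'zone_name' in a new bus dict."""
    # The intersection has 0 elements if neither key is given and 2 if both are.
    if len({"zone_id", "zone_name"} & set(new_bus)) != 1:
        raise ValueError("Must specify either 'zone_id' or 'zone_name' (but not both)")


def is_accepted(new_bus):
    """Return True if the check passes, False if it raises ValueError."""
    try:
        check_zone_keys(new_bus)
    except ValueError:
        return False
    return True
```

The trade-off is that the two failure modes now share one error message instead of each having its own.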

@danielolsen, feel free to raise whatever exception you want. It is a detail and we should move on. Don't forget to update the docstring accordingly.

It is documented.

rouille · 2020-12-04T05:57:14Z

powersimdata/input/change_table.py

+                    del new_bus["zone_name"]
+                if "Pd" in new_bus:
+                    if not isinstance(new_bus["Pd"], (int, float)):
+                        raise ValueError("Pd must be numeric (int/float)")


Should it be a TypeError?

rouille · 2020-12-04T05:57:48Z

powersimdata/input/change_table.py

+                    new_bus["Pd"] = defaults["Pd"]
+                if "baseKV" in new_bus:
+                    if not isinstance(new_bus["baseKV"], (int, float)):
+                        raise ValueError("baseKV must be numeric (int/float)")


Should it be a TypeError?

rouille · 2020-12-04T06:10:08Z

Should we update the bus2sub data frame in the TransformGrid class?

danielolsen · 2020-12-04T17:04:37Z

Should we update the bus2sub data frame in the TransformGrid class?

My first instinct: I don't want to.

My second instinct: I think you are right. We don't use this in many places, but because of how we are interpreting the grid.mat files, I think we need Grid.bus and Grid.bus2sub to be the same length:

PowerSimData/powersimdata/input/scenario_grid.py

Line 96 in 40009a7

self.bus2sub, _ = frame("bus2sub", mpc.bus2sub, mpc.busid)

We'll want to make sure to do some end-to-end testing to make sure that this not only builds a Grid properly, but that it can be loaded successfully as a ScenarioGrid.

BainanXia · 2020-12-04T17:14:01Z

Agree. I was about to point out that the modified grid matters not only when building it but also when loading it. We would like to have consistent dataframes everywhere for a specific scenario. You were faster than me.

danielolsen · 2020-12-04T17:27:16Z

I added a new test which detects this failure. We will need to modify Grid.bus2sub as well as Grid.sub. I plan to use the new lat/lon to look up the appropriate substation if it exists, or add a new one to bus2sub and sub if it does not.
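
The planned lookup can be sketched roughly as follows, assuming plain dicts as stand-ins for the sub and bus2sub data frames. The function name and data layout are hypothetical and only meant to show the "reuse if present, otherwise create" logic:

```python
def assign_substation(sub, bus2sub, bus_id, lat, lon):
    """Map a new bus to a substation, creating one if none exists at (lat, lon).

    sub: dict mapping substation id -> (lat, lon). Mutated in place.
    bus2sub: dict mapping bus id -> substation id. Mutated in place.
    Returns the substation id assigned to the bus.
    """
    for sub_id, coords in sub.items():
        if coords == (lat, lon):
            # A substation already sits at this location: reuse it.
            bus2sub[bus_id] = sub_id
            return sub_id
    # No match: create a new substation numbered after the largest existing id.
    new_sub_id = max(sub, default=0) + 1
    sub[new_sub_id] = (lat, lon)
    bus2sub[bus_id] = new_sub_id
    return new_sub_id
```

Either way, every bus ends up with exactly one bus2sub entry, which keeps the bus and bus2sub tables the same length.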

danielolsen · 2020-12-04T18:57:07Z

Substations are now added automatically, as needed, and this is checked in the add_bus test.

pytest is now taking about 27 seconds to get through the test suite, compared to about 11 seconds before, so either something is being done inefficiently or this new code creates some necessary new complexity. I suspect it may be due to the increased use of TransformGrid within the ChangeTable methods. We can probably simplify in ChangeTable.add_plant and ChangeTable.add_storage_capacity, since these just need to know the list of allowable bus ids, but ChangeTable._add_line looks up several specific entries in the bus dataframe, so I think the TransformGrid.get_grid() is definitely the cleaner approach.

EDIT: here's what we get with --durations=10:

3.07s call     powersimdata/input/tests/test_transform_grid.py::test_add_branch
1.91s call     powersimdata/input/tests/test_transform_grid.py::test_add_bus
1.53s call     powersimdata/input/tests/test_change_table.py::test_add_branch_argument_buses_in_different_interconnect
1.48s call     powersimdata/input/tests/test_change_table.py::test_add_branch_zero_distance_between_buses
0.82s call     powersimdata/input/tests/test_grid.py::test_drop_one_interconnect
0.79s call     powersimdata/input/tests/test_grid.py::test_that_fields_are_not_modified_when_loading_another_grid
0.77s setup    powersimdata/input/tests/test_grid.py::test_deepcopy_works
0.77s call     powersimdata/input/tests/test_grid.py::test_drop_two_interconnect
0.76s setup    powersimdata/input/tests/test_grid.py::test_grid_eq_failure_simple

danielolsen · 2020-12-04T20:28:52Z

This has been refactored to be more performant: inspired by @jon-hagg, I added a caching method in ChangeTable._get_new_bus(), which will only call TransformGrid if there is something in self.ct["new_bus"] that we haven't seen before. Otherwise, we will return a known bus dataframe, rather than re-calculating it (which necessitates several time-consuming pandas.DataFrame.append() calls).
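
The caching idea, in a minimal sketch: remember a fingerprint of the last "new_bus" entries and only recompute when it changes. BusCache and the expensive stand-in function are hypothetical; the actual logic lives in ChangeTable._get_new_bus() and calls TransformGrid:

```python
class BusCache:
    """Recompute the bus table only when the requested new buses change."""

    def __init__(self, compute):
        self._compute = compute  # expensive function (stand-in for TransformGrid)
        self._key = None
        self._value = None

    def get(self, new_bus):
        # Hashable fingerprint of the list of dicts describing the new buses.
        key = tuple(tuple(sorted(b.items())) for b in new_bus)
        if key != self._key:
            self._value = self._compute(new_bus)
            self._key = key
        return self._value


calls = []


def expensive(new_bus):
    """Stand-in for the costly grid transformation; records each invocation."""
    calls.append(len(new_bus))
    return [dict(b) for b in new_bus]


cache = BusCache(expensive)
first = cache.get([{"lat": 30, "lon": -95}])
second = cache.get([{"lat": 30, "lon": -95}])  # same input: served from cache
third = cache.get([{"lat": 30, "lon": -95}, {"lat": 31, "lon": -96}])  # new input
```

This sketch keeps only the most recent result, which is enough when the same set of new buses is queried repeatedly between additions.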

Test times are back down to 11-12 seconds on my machine.

powersimdata/input/change_table.py

danielolsen · 2020-12-07T21:47:31Z

This has been end-to-end tested on a scenario, adding a branch, a plant, and a storage device to a new bus. The scenario prepares, runs, and extracts properly, all from within PowerSimData.

On the way, I found a bug in how we prepare profiles: previously, we would only use TransformProfile if the profiles were being scaled, but not if only new plants were added. This was causing errors within REISE.jl because the case.mat file would list more generators than were in the profile. This has been fixed, see the new changes to execute.py.

The scenario setup used to test:

from powersimdata.scenario.scenario import Scenario
scenario = Scenario('')
scenario.state.set_builder(["Texas"])
scenario.state.builder.set_base_profile("demand", "ercot")
scenario.state.builder.set_base_profile("hydro", "v2")
scenario.state.builder.set_base_profile("solar", "v4.1")
scenario.state.builder.set_base_profile("wind", "v5.1")
scenario.state.builder.set_name("test", "new_bus2")
scenario.state.builder.set_time("2016-01-01 00:00:00", "2016-01-03 23:00:00", "24H")
new_bus_id = scenario.state.get_grid().bus.index.max() + 1
scenario.state.builder.change_table.add_bus(
    [{"lat": 30, "lon": -95, "zone_id": 308}]
)
scenario.state.builder.change_table.add_storage_capacity(
    bus_id={new_bus_id: 100}
)
scenario.state.builder.change_table.add_plant(
    [{"type": "wind", "bus_id": new_bus_id, "Pmax": 400}]
)
scenario.state.builder.change_table.add_branch(
    [{"from_bus_id": (new_bus_id - 1), "to_bus_id": new_bus_id, "capacity": 300}]
)
scenario.state.create_scenario()
scenario.state.prepare_simulation_input()
scenario.state.launch_simulation(threads=8)  # By default will auto-extract

Then, loading the completed scenario is successful:

>>> scenario = Scenario(1713)
Transferring ScenarioList.csv from server
100%|########################################| 234k/234k [00:00<00:00, 851kb/s]
Transferring ExecuteList.csv from server
100%|######################################| 20.7k/20.7k [00:00<00:00, 175kb/s]
SCENARIO: test | new_bus2

--> State
analyze
--> Loading grid
1713_grid.mat not found in C:\Users\DanielOlsen\ScenarioData\ on local machine
Transferring 1713_grid.mat from server
100%|#######################################| 191k/191k [00:00<00:00, 1.24Mb/s]
Loading bus
Loading plant
Loading heat_rate_curve
Loading gencost_before
Loading gencost_after
Loading branch
Loading sub
Loading bus2sub
--> Loading ct
1713_ct.pkl not found in C:\Users\DanielOlsen\ScenarioData\ on local machine
Transferring 1713_ct.pkl from server
100%|#########################################| 368/368 [00:00<00:00, 3.57kb/s]

BainanXia · 2020-12-07T22:24:56Z

Good catch! Tested on my end and it works!

powersimdata/scenario/execute.py

powersimdata/input/transform_grid.py

rouille

Thanks

…cache

danielolsen added the "new feature" label Dec 4, 2020

danielolsen requested review from rouille, BainanXia and jenhagg December 4, 2020 04:51

danielolsen self-assigned this Dec 4, 2020