Bugfix: integrate build_bus_regions into base_network #1051

Merged (5 commits) on May 6, 2024

koen-vg
Contributor

@koen-vg koen-vg commented May 3, 2024

Unfortunately, #1013 introduced a regression that can be a little hard to detect. It stems from the bad practice of using an input of the rule build_bus_regions as an output on this line: https://github.com/PyPSA/pypsa-eur/blob/master/scripts/build_bus_regions.py#L218. That's a no-no!

What happens is that snakemake can run build_bus_regions in parallel with other rules using base.nc as an input, and cryptic errors arise when one rule tries to read base.nc while build_bus_regions is writing to base.nc. An example of the kind of problem I got:

(pypsa-eur) koen-uit-desktop · pypsa-eur > ./test.sh                                                                                         
+ snakemake -call solve_elec_networks --configfile config/test/config.electricity.yaml --rerun-triggers=mtime
Set parameter Username
Academic license - for non-commercial use only - expires 2024-12-04
Config file config/config.default.yaml is extended by additional config specified via the command line.
Config file config/config.yaml is extended by additional config specified via the command line.
Assuming unrestricted shared filesystem usage.
Building DAG of jobs...
Using shell: /usr/bin/bash
Provided cores: 16
Rules claiming more threads will be scaled down.
Conda environments: ignored
Job stats:
job                         count
------------------------  -------
add_electricity                 1
add_extra_components            1
base_network                    1
build_bus_regions               1
build_line_rating               1
build_powerplants               1
build_renewable_profiles        4
cluster_network                 1
prepare_network                 1
simplify_network                1
solve_elec_networks             1
solve_network                   1
total                          15

Select jobs to execute...
Execute 1 jobs...

[Fri May  3 14:27:33 2024]
localrule base_network:
    input: data/entsoegridkit/buses.csv, data/entsoegridkit/lines.csv, data/entsoegridkit/links.csv, data/entsoegridkit/converters.csv, data/entsoegridkit/transformers.csv, data/parameter_corrections.yaml, data/links_p_nom.csv, data/links_tyndp.csv, resources/test/country_shapes.geojson, resources/test/offshore_shapes.geojson, resources/test/europe_shape.geojson
    output: resources/test/networks/base.nc
    log: logs/test/base_network.log
    jobid: 8
    benchmark: benchmarks/test/base_network
    reason: Missing output files: resources/test/networks/base.nc
    resources: tmpdir=/tmp, mem_mb=1500, mem_mib=1431

/home/koen/Dokument/UiT/Research/Modelling/pypsa-eur/.snakemake/scripts/tmpomt6arjv.base_network.py:95: SyntaxWarning: invalid escape sequence '\d'
  return df.tags.str.extract('"oid"=>"(\d+)"', expand=False)
INFO:__main__:Removing buses with voltages Index([132.0], dtype='float64')
INFO:__main__:TYNDP links outside of the covered area (skipping): Biscay Gulf, Italy-France, IFA2, Italy-Montenegro, NordLink, COBRA cable, Thames Estuary Cluster (NEMO-Link), Anglo-Scottish -1, ALEGrO, North Sea Link, HVDC SuedOstLink, HVDC Line A-North, France-Alderney-Britain, Viking DKW-GB, ElecLink, Greenconnector, Hansa PowerBridge I, NorthConnect, HVDC SuedLink, AQUIND Interconnector, HVDC Ultranet, Gridlink, NeuConnect, NordBalt, Estlink 1, Greenlink, Celtic Interconnector, GiLA, HG North Tyrrhenian Corridor, HG Adriatic Corridor, SAPEI 2, HG Ionian-Tyrrhenian Corridor, HG Ionian-Tyrrhenian Corridor 2, Germany-UK Hybrid Interconnector, NU-Link Interconnector, APOLLO-LINK, Baltic WindConnector (BWC), High-Voltage Direct Current Interconnector Project Romania-Hungary, Rhine-Main-Link, Green Aegean Interconnector
INFO:__main__:Removing 2 unconnected network components with less than 1 buses. In total 2 buses.
INFO:pypsa.io:Exported network base.nc has lines, buses, transformers, shapes, carriers
[Fri May  3 14:27:36 2024]
Finished job 8.
1 of 15 steps (7%) done
Select jobs to execute...
Execute 3 jobs...

[Fri May  3 14:27:36 2024]
localrule build_line_rating:
    input: resources/test/networks/base.nc, cutouts/be-03-2013-era5.nc
    output: resources/test/networks/line_rating.nc
    log: logs/test/build_line_rating.log
    jobid: 19
    benchmark: benchmarks/test/build_line_rating
    reason: Input files updated by another job: resources/test/networks/base.nc
    threads: 4
    resources: tmpdir=/tmp, mem_mb=4000, mem_mib=3815

[Fri May  3 14:27:36 2024]
localrule build_bus_regions:
    input: resources/test/country_shapes.geojson, resources/test/offshore_shapes.geojson, resources/test/networks/base.nc
    output: resources/test/regions_onshore.geojson, resources/test/regions_offshore.geojson
    log: logs/test/build_bus_regions.log
    jobid: 12
    reason: Missing output files: resources/test/regions_onshore.geojson, resources/test/regions_offshore.geojson; Updated input files: resources/test/country_shapes.geojson, resources/test/offshore_shapes.geojson; Input files updated by another job: resources/test/networks/base.nc
    resources: tmpdir=/tmp, mem_mb=1000, mem_mib=954

[Fri May  3 14:27:36 2024]
localrule build_powerplants:
    input: resources/test/networks/base.nc, data/custom_powerplants.csv
    output: resources/test/powerplants.csv
    log: logs/test/build_powerplants.log
    jobid: 21
    reason: Missing output files: resources/test/powerplants.csv; Input files updated by another job: resources/test/networks/base.nc
    resources: tmpdir=/tmp, mem_mb=5000, mem_mib=4769

INFO:pypsa.io:Imported network base.nc has buses, carriers, lines, shapes, transformers
/home/koen/.local/opt/mambaforge/envs/pypsa-eur/lib/python3.12/site-packages/geopandas/array.py:1470: UserWarning: CRS not set for some of the concatenation inputs. Setting output's CRS as WGS 84 (the single non-null crs provided).
  return GeometryArray(data, crs=_get_common_crs(to_concat))
/home/koen/.local/opt/mambaforge/envs/pypsa-eur/lib/python3.12/site-packages/geopandas/array.py:1470: UserWarning: CRS not set for some of the concatenation inputs. Setting output's CRS as WGS 84 (the single non-null crs provided).
  return GeometryArray(data, crs=_get_common_crs(to_concat))
INFO:pypsa.io:Exported network base.nc has buses, transformers, lines, shapes, carriers
ERROR:root:Uncaught exception
Traceback (most recent call last):
  File "/home/koen/.local/opt/mambaforge/envs/pypsa-eur/lib/python3.12/site-packages/xarray/backends/file_manager.py", line 211, in _acquire_with_cache_info
    file = self._cache[self._key]
           ~~~~~~~~~~~^^^^^^^^^^^
  File "/home/koen/.local/opt/mambaforge/envs/pypsa-eur/lib/python3.12/site-packages/xarray/backends/lru_cache.py", line 56, in __getitem__
    value = self._cache[key]
            ~~~~~~~~~~~^^^^^
KeyError: [<class 'netCDF4._netCDF4.Dataset'>, ('/home/koen/Dokument/UiT/Research/Modelling/pypsa-eur/resources/test/networks/base.nc',), 'a', (('clobber', True), ('diskless', False), ('format', 'NETCDF4'), ('persist', False)), 'e04e840c-871d-4d39-923f-fc6b6e2f8a2c']

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/home/koen/Dokument/UiT/Research/Modelling/pypsa-eur/.snakemake/scripts/tmpv09q_wty.build_bus_regions.py", line 222, in <module>
    n.export_to_netcdf(base_network)
  File "/home/koen/.local/opt/mambaforge/envs/pypsa-eur/lib/python3.12/site-packages/pypsa/io.py", line 733, in export_to_netcdf
    with ExporterNetCDF(path, compression, float32) as exporter:
  File "/home/koen/.local/opt/mambaforge/envs/pypsa-eur/lib/python3.12/site-packages/pypsa/io.py", line 72, in __exit__
    self.finish()
  File "/home/koen/.local/opt/mambaforge/envs/pypsa-eur/lib/python3.12/site-packages/pypsa/io.py", line 421, in finish
    self.ds.to_netcdf(self.path)
  File "/home/koen/.local/opt/mambaforge/envs/pypsa-eur/lib/python3.12/site-packages/xarray/core/dataset.py", line 2298, in to_netcdf
    return to_netcdf(  # type: ignore  # mypy cannot resolve the overloads:(
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/koen/.local/opt/mambaforge/envs/pypsa-eur/lib/python3.12/site-packages/xarray/backends/api.py", line 1322, in to_netcdf
    store = store_open(target, mode, format, group, **kwargs)
            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/koen/.local/opt/mambaforge/envs/pypsa-eur/lib/python3.12/site-packages/xarray/backends/netCDF4_.py", line 409, in open
    return cls(manager, group=group, mode=mode, lock=lock, autoclose=autoclose)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/koen/.local/opt/mambaforge/envs/pypsa-eur/lib/python3.12/site-packages/xarray/backends/netCDF4_.py", line 356, in __init__
    self.format = self.ds.data_model
                  ^^^^^^^
  File "/home/koen/.local/opt/mambaforge/envs/pypsa-eur/lib/python3.12/site-packages/xarray/backends/netCDF4_.py", line 418, in ds
    return self._acquire()
           ^^^^^^^^^^^^^^^
  File "/home/koen/.local/opt/mambaforge/envs/pypsa-eur/lib/python3.12/site-packages/xarray/backends/netCDF4_.py", line 412, in _acquire
    with self._manager.acquire_context(needs_lock) as root:
  File "/home/koen/.local/opt/mambaforge/envs/pypsa-eur/lib/python3.12/contextlib.py", line 137, in __enter__
    return next(self.gen)
           ^^^^^^^^^^^^^^
  File "/home/koen/.local/opt/mambaforge/envs/pypsa-eur/lib/python3.12/site-packages/xarray/backends/file_manager.py", line 199, in acquire_context
    file, cached = self._acquire_with_cache_info(needs_lock)
                   ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/koen/.local/opt/mambaforge/envs/pypsa-eur/lib/python3.12/site-packages/xarray/backends/file_manager.py", line 217, in _acquire_with_cache_info
    file = self._opener(*self._args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "src/netCDF4/_netCDF4.pyx", line 2469, in netCDF4._netCDF4.Dataset.__init__
  File "src/netCDF4/_netCDF4.pyx", line 2028, in netCDF4._netCDF4._ensure_nc_success
PermissionError: [Errno 13] Permission denied: '/home/koen/Dokument/UiT/Research/Modelling/pypsa-eur/resources/test/networks/base.nc'
INFO:pypsa.io:Imported network base.nc has buses, carriers, lines, shapes, transformers
ERROR:root:Uncaught exception
Traceback (most recent call last):
  File "/home/koen/Dokument/UiT/Research/Modelling/pypsa-eur/.snakemake/scripts/tmp4juyj8ka.build_powerplants.py", line 174, in <module>
    n = pypsa.Network(snakemake.input.base_network)
        ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/koen/.local/opt/mambaforge/envs/pypsa-eur/lib/python3.12/site-packages/pypsa/components.py", line 372, in __init__
    self.import_from_netcdf(import_name)
  File "/home/koen/.local/opt/mambaforge/envs/pypsa-eur/lib/python3.12/site-packages/pypsa/io.py", line 679, in import_from_netcdf
    with ImporterNetCDF(path=path) as importer:
         ^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/koen/.local/opt/mambaforge/envs/pypsa-eur/lib/python3.12/site-packages/pypsa/io.py", line 304, in __init__
    self.ds = xr.open_dataset(path)
              ^^^^^^^^^^^^^^^^^^^^^
  File "/home/koen/.local/opt/mambaforge/envs/pypsa-eur/lib/python3.12/site-packages/xarray/backends/api.py", line 554, in open_dataset
    engine = plugins.guess_engine(filename_or_obj)
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/koen/.local/opt/mambaforge/envs/pypsa-eur/lib/python3.12/site-packages/xarray/backends/plugins.py", line 197, in guess_engine
    raise ValueError(error_msg)
ValueError: did not find a match in any of xarray's currently installed IO backends ['netcdf4', 'scipy', 'rasterio']. Consider explicitly selecting one of the installed engines via the ``engine`` parameter, or installing additional IO dependencies, see:
https://docs.xarray.dev/en/stable/getting-started-guide/installing.html
https://docs.xarray.dev/en/stable/user-guide/io.html
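
To make the failure mode above concrete, here is a minimal sketch in plain Python (not Snakemake itself). It models the three rules from the log with shortened file names and a hypothetical `writes` field for what each script really touches on disk. The independence check only looks at direct dependencies, so it is a simplification of how a real scheduler decides what may run in parallel.

```python
from itertools import combinations

# Simplified, hypothetical description of three rules from the log above.
# "output" is what the rule declares; "writes" is what the script actually writes.
RULES = {
    "build_bus_regions": {
        "input": {"country_shapes.geojson", "offshore_shapes.geojson", "base.nc"},
        "output": {"regions_onshore.geojson", "regions_offshore.geojson"},
        "writes": {"regions_onshore.geojson", "regions_offshore.geojson", "base.nc"},
    },
    "build_powerplants": {
        "input": {"base.nc", "custom_powerplants.csv"},
        "output": {"powerplants.csv"},
        "writes": {"powerplants.csv"},
    },
    "build_line_rating": {
        "input": {"base.nc", "cutout.nc"},
        "output": {"line_rating.nc"},
        "writes": {"line_rating.nc"},
    },
}


def undeclared_writes(rules):
    """Map each rule to the files it writes without declaring them as output."""
    found = {}
    for name, rule in rules.items():
        extra = rule["writes"] - rule["output"]
        if extra:
            found[name] = extra
    return found


def possible_races(rules):
    """List (writer, reader, path) triples for rules that may run side by side.

    Two rules count as independent when neither declares an output that the
    other declares as an input (direct dependencies only).
    """
    races = []
    for a, b in combinations(sorted(rules), 2):
        ra, rb = rules[a], rules[b]
        if ra["output"] & rb["input"] or rb["output"] & ra["input"]:
            continue  # ordered by the declared dependency, so no overlap in time
        for writer, reader, w, r in ((a, b, ra, rb), (b, a, rb, ra)):
            for path in sorted(w["writes"] & r["input"]):
                races.append((writer, reader, path))
    return races


print(undeclared_writes(RULES))
print(possible_races(RULES))
```

In this toy model, build_bus_regions is flagged against both build_line_rating and build_powerplants on base.nc, which matches the three jobs that were started together in the log.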

Changes proposed in this Pull Request

The most logical solution seems to be merging the base_network and build_bus_regions rules; the alternative would be to introduce some intermediate file like base_without_shapes.nc and then base.nc or first base.nc and then base_with_shapes.nc. But that seems rather inelegant.
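
As a rough illustration of the merged approach (not the actual script in this PR), the sketch below does all the work in memory and writes every declared output exactly once. The helpers `build_network` and `build_regions`, the dictionary-based "network", and the JSON files are hypothetical stand-ins chosen so the example runs without PyPSA.

```python
import json
import tempfile
from pathlib import Path


def build_network(buses):
    """Hypothetical stand-in for the network-building step."""
    return {"buses": list(buses), "shapes": {}}


def build_regions(network):
    """Hypothetical stand-in for the region computation: one region per bus."""
    return {bus: f"region-of-{bus}" for bus in network["buses"]}


def base_network(buses, outputs):
    """Merged step: build the network, attach shapes, then write each output once."""
    network = build_network(buses)
    regions = build_regions(network)
    # Attach the shapes in memory, before base_network is ever visible on disk.
    network["shapes"] = regions
    Path(outputs["regions"]).write_text(json.dumps(regions))
    Path(outputs["base_network"]).write_text(json.dumps(network))


# Usage: both files are declared outputs of the single merged step.
with tempfile.TemporaryDirectory() as tmp:
    outputs = {
        "regions": str(Path(tmp) / "regions.json"),
        "base_network": str(Path(tmp) / "base.json"),
    }
    base_network(["bus0", "bus1"], outputs)
    written = json.loads(Path(outputs["base_network"]).read_text())
    print(written)
```

Because nothing downstream can start before the merged step has finished, no other job ever sees a half-written network file.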

Checklist

  • I tested my contribution locally and it seems to work fine.
  • Code and workflow changes are sufficiently documented.
  • Changed dependencies are added to envs/environment.yaml.
  • Changes in configuration options are added in all of config.default.yaml.
  • Changes in configuration options are also documented in doc/configtables/*.csv.
  • A release note doc/release_notes.rst is added.

koen-vg added 2 commits May 3, 2024 14:53
Fixes a problem with the `build_bus_regions` writing to base.nc
without declaring base.nc as an output.
@koen-vg
Contributor Author

koen-vg commented May 3, 2024

Ah I realise that of course there are some changes in the documentation that have to be made upon removing the build_bus_regions rule; can do this after the weekend or someone else can feel free to do that.

@FabianHofmann
Contributor

@koen-vg you are a hero :) I knew that this might be unstable, my bad. The proposed change is perfect

@koen-vg
Contributor Author

koen-vg commented May 6, 2024

Alright, it took a couple of iterations, but I would say that the documentation is now good to go. Maybe someone can double-check real quick, but from my side this PR can be merged.

(P.S. it seems quite unnecessary to have images used in the documentation spread between graphics and doc/img; may I propose moving all images (including the ones used in the README) to doc/img? I'd be happy to submit a quick PR for this, or someone else can do it; it's almost no effort. Just makes it easier to do these kinds of updates to the documentation in the future.)

@FabianHofmann
Contributor

Great, looks good to me! yes, a single source for the images would be better. would be happy about a PR

@FabianHofmann FabianHofmann merged commit 9343d5d into PyPSA:master May 6, 2024
6 checks passed
@fneum
Member

fneum commented May 6, 2024

> Great, looks good to me! yes, a single source for the images would be better. would be happy about a PR

I'm just on it.
