Filter external repos #396

Merged

Contributor

@phackstock phackstock commented Sep 19, 2024

Closes #326.

This PR adds support for filtering the definitions imported from external repositories using include and exclude filters.
Any number of filters can be defined, each combining any attributes, for example:

repositories:
  common-definitions:
    url: https://github.com/IAMconsortium/common-definitions.git/
definitions:
  variable:
    repository:
      name: common-definitions
      include:
        - name: [Primary Energy*, Final Energy*]
        - name: "Population*"
          tier: 1
      exclude:
        - name: "Final Energy|*|*"
  region:
    repository:
      name: common-definitions
      include:
        - hierarchy: [R5, R10]

For the variable section

in the example above we are including:

  1. All variables starting with Primary Energy or Final Energy
  2. All variables starting with Population and with the tier attribute equal to 1

From this list we then exclude all variables that match "Final Energy|*|*".
This means that the resulting list contains no Final Energy variables with
three or more levels.

For the region section

we are taking only R5 and R10 regions.
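The include-then-exclude logic described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the code in nomenclature/config.py: the helper names `matches` and `apply_filters`, the dictionary representation of codes, and the sample variable and region entries are all hypothetical. Name patterns are matched with the standard-library `fnmatch` module, as this description does at this point in the PR.

```python
from fnmatch import fnmatchcase


def matches(code: dict, flt: dict) -> bool:
    """Return True if `code` satisfies every attribute condition of one filter."""
    for attr, wanted in flt.items():
        value = code.get(attr)
        # A filter value may be a single item or a list of alternatives.
        options = wanted if isinstance(wanted, list) else [wanted]
        if attr == "name":
            # Names are treated as wildcard patterns ("*" matches anything).
            if not any(fnmatchcase(value, pattern) for pattern in options):
                return False
        elif value not in options:
            return False
    return True


def apply_filters(codes, include=None, exclude=None):
    """Keep codes matching any include filter, then drop any exclude match."""
    if include:
        codes = [c for c in codes if any(matches(c, f) for f in include)]
    if exclude:
        codes = [c for c in codes if not any(matches(c, f) for f in exclude)]
    return codes


# Hypothetical sample data, for illustration only.
variables = [
    {"name": "Primary Energy", "tier": 1},
    {"name": "Final Energy|Industry", "tier": 2},
    {"name": "Final Energy|Industry|Steel", "tier": 2},
    {"name": "Population", "tier": 1},
    {"name": "Population|Urban", "tier": 2},
    {"name": "Emissions|CO2", "tier": 1},
]
include = [
    {"name": ["Primary Energy*", "Final Energy*"]},
    {"name": "Population*", "tier": 1},
]
exclude = [{"name": "Final Energy|*|*"}]
selected = [c["name"] for c in apply_filters(variables, include, exclude)]

regions = [
    {"name": "World", "hierarchy": "common"},
    {"name": "Asia (R5)", "hierarchy": "R5"},
    {"name": "Africa (R10)", "hierarchy": "R10"},
]
kept_regions = apply_filters(regions, include=[{"hierarchy": ["R5", "R10"]}])
```

Within one filter all attributes must match, while a code only needs to satisfy one of the include filters to be kept; exclusion is applied afterwards to the included set.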

Changes

One of the changes I have made is that all repositories in the definitions section now need to use the name keyword.
This is a breaking change, so all workflow repos that use external repositories need to be updated.
I'd be happy to streamline it again so that giving just the name is allowed, but for code simplicity I opted against that for now.

@phackstock phackstock added the enhancement New feature or request label Sep 19, 2024
@phackstock phackstock self-assigned this Sep 19, 2024
Member

@danielhuppmann danielhuppmann left a comment

A few suggestions inline.

As with my other review, I’d strongly advise against using fnmatch.

docs/user_guide/config.rst (outdated)
tests/data/config_filter/nomenclature.yaml (outdated)
nomenclature/config.py
@phackstock phackstock force-pushed the feature/filter-external-repos branch from 0163e78 to 1bf3d5c Compare November 14, 2024 13:46
@danielhuppmann
Member

Quick clarifying question, one can only have one include and one exclude statement per repo-import, right?

@phackstock
Contributor Author

> Quick clarifying question, one can only have one include and one exclude statement per repo-import, right?

Correct. You can put multiple filters in both the include and exclude though.

@phackstock
Contributor Author

phackstock commented Nov 15, 2024

@danielhuppmann, I have now added a depth option so you can match on a given level of variable depth.
I didn't implement the "1-" option, since it would make the code a lot less readable and maintainable, and with fnmatch we already have that functionality.
Say you want to select "Final Energy" variables in the way the depth "1-" option would in pyam logic. You'd use

...
      include:
        - name: Final Energy*
          depth: [0, 1]

which I think is plenty readable.
Since this option relies on fnmatch, I chose to keep it. If you want me to remove fnmatch, I can do that; for now, however, using it made the most sense to me.
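As a rough illustration of how a name pattern and a list of depths combine, here is a minimal sketch. It assumes that depth is the number of "|" separators in the full variable name (so "Final Energy" has depth 0); the helpers `depth` and `select` and the sample names are hypothetical and not taken from the PR's code.

```python
from fnmatch import fnmatchcase


def depth(name: str) -> int:
    """Count the "|" separators: "Final Energy" is 0, "Final Energy|Industry" is 1."""
    return name.count("|")


def select(names, pattern, depths):
    """Keep names that match the wildcard pattern and have one of the given depths."""
    return [n for n in names if fnmatchcase(n, pattern) and depth(n) in depths]


# Hypothetical sample names, for illustration only.
names = [
    "Final Energy",
    "Final Energy|Industry",
    "Final Energy|Industry|Steel",
    "Primary Energy",
]
selected = select(names, "Final Energy*", [0, 1])
```

Listing the depths explicitly as [0, 1] covers the same cases as an "up to depth 1" selection without needing a dedicated range syntax.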

@phackstock
Contributor Author

@danielhuppmann, as discussed bilaterally, I changed the string pattern matching to use the pyam character escape and regex.
Should be good now.
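For readers unfamiliar with the escape-then-regex approach, the idea can be sketched with the standard library alone. This is an assumption-laden illustration in the spirit of that approach, not pyam's or nomenclature's actual code: the helpers `to_regex` and `name_matches` are hypothetical, and `re.escape` stands in for pyam's own escaping.

```python
import re


def to_regex(pattern: str) -> re.Pattern:
    """Escape regex metacharacters, then turn each "*" into ".*"."""
    # re.escape makes "|", "(", "+" and friends literal; it also turns "*"
    # into "\*", which is then converted back into the wildcard ".*".
    escaped = re.escape(pattern).replace(r"\*", ".*")
    return re.compile(escaped)


def name_matches(name: str, pattern: str) -> bool:
    """Return True if the whole name matches the wildcard pattern."""
    return to_regex(pattern).fullmatch(name) is not None


three_levels = name_matches("Final Energy|Industry|Steel", "Final Energy|*|*")
two_levels = name_matches("Final Energy|Industry", "Final Energy|*|*")
```

The escaping matters because "|" is the hierarchy separator in variable names but means alternation in a regular expression; without it, a pattern such as "Final Energy|*" would not be interpreted as intended.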

Member

@danielhuppmann danielhuppmann left a comment

Thanks, looks good, a few minor suggestions inline.

One larger question: if an imported model-mapping includes a common region that is filtered out in the region-codelist, the initialization of the region-processor will fail, right? (That's ok, just make sure to create a follow-up issue so that this can be tackled later).

docs/user_guide/config.rst (outdated)
nomenclature/config.py (outdated)
@phackstock
Contributor Author

> One larger question: if an imported model-mapping includes a common region that is filtered out in the region-codelist, the initialization of the region-processor will fail, right? (That's ok, just make sure to create a follow-up issue so that this can be tackled later).

Looking at the code, it looked fine to me, but just to be sure I added a test in 60d77ce. I import all the mappings from common-definitions and load only the world region. It fails as expected.
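The failure mode discussed here can be pictured with a small consistency check. This is a hypothetical sketch of the idea only, not the validation that nomenclature's region processor performs: the function `check_common_regions`, its error message, and the region names are all assumptions.

```python
def check_common_regions(mapping_regions, region_codelist):
    """Raise if a mapping refers to a common region missing from the codelist."""
    missing = sorted(set(mapping_regions) - set(region_codelist))
    if missing:
        raise ValueError(f"Common regions not in region codelist: {missing}")


# Hypothetical setup: the filtered codelist keeps only "World", while an
# imported mapping still refers to a region that was filtered out.
filtered_codelist = ["World"]
try:
    check_common_regions(["World", "Asia (R5)"], filtered_codelist)
    failed = False
except ValueError:
    failed = True
```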

Member

@danielhuppmann danielhuppmann left a comment

Nice, thank you!

@phackstock phackstock merged commit 701e68c into IAMconsortium:main Nov 20, 2024
11 checks passed
@phackstock phackstock deleted the feature/filter-external-repos branch November 20, 2024 15:01
Development

Successfully merging this pull request may close these issues.

Allow attribute filtering in nomenclature.yaml for importing definitions form external repo
2 participants