Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add warning level to DataValidator #431

Merged
merged 5 commits into from
Dec 13, 2024

Conversation

dc-almeida
Copy link
Collaborator

Closes #426
Adds a warning_level attribute to data validation YAML files and registers warnings in log, as specified in #426.
If present, failed validation does not raise errors for corresponding criteria.

@dc-almeida dc-almeida added the enhancement New feature or request label Nov 22, 2024
@dc-almeida dc-almeida self-assigned this Nov 22, 2024
@dc-almeida dc-almeida marked this pull request as ready for review November 22, 2024 13:52
@dc-almeida
Copy link
Collaborator Author

After discussion in weekly stand-up, will change the fail message start for the warning case, as it might be confusing to refer to a "failed" validation.

Copy link
Member

@danielhuppmann danielhuppmann left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good to me - a few suggestions inline, then good to be merged from my side.

tests/data/validation/validate_data/validate_warning.yaml Outdated Show resolved Hide resolved
nomenclature/processor/data_validator.py Show resolved Hide resolved
Copy link
Contributor

@phackstock phackstock left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As just discussed bilaterally, I'd suggest a few changes detailed inline below.

nomenclature/processor/data_validator.py Outdated Show resolved Hide resolved
nomenclature/processor/data_validator.py Outdated Show resolved Hide resolved
nomenclature/processor/data_validator.py Show resolved Hide resolved
nomenclature/processor/data_validator.py Outdated Show resolved Hide resolved
@danielhuppmann
Copy link
Member

Seems like I can't respond to the comment by @phackstock on the warning-level inline, so putting this here.

I think that the suggestion to use integers instead of strings is just confusing to non-expert users, without adding any practical benefit.

In yesterday's ScenarioCompass meeting, we discussed the need for different types of warnings in the sense of "warning for deviation from historical energy data" versus a "biodiversity concern in the future". Here, there is no clear ordering in terms of low/medium/high - instead, this could be clearly differentiated using colors.

Going one step further, it may be useful to define the allowed warning-level-names in the nomenclature.yaml project-config, not the package, so that we don't have to always make a new release when adding new levels.

@dc-almeida
Copy link
Collaborator Author

As discussed in our meeting, incorporating the following features now:

  • warning levels, defaulting to error if not specified (ordering is implicit in the Enum)
  • allowing multiple warning levels for same variable (to be refined in future)
  • printing warning level in log output df's (this addition can lead to hiding other columns, such as 'variable', to preserve text length; see test_DataValidator_apply_fails; the variable is still printed in the criteria header, though, and row numbers ultimately identify error/warning sources)

Follow-up PRs will include:

  • a way to refactor all warnings/errors into single df (ordered by warning level) for easy tracing and visualisation
  • rewriting warnings for each variable such that only the highest level is printed (when multiple)
  • include descriptions of constraints for warnings

Copy link
Member

@danielhuppmann danielhuppmann left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks, this looks good as a first step!

Copy link
Contributor

@phackstock phackstock left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good to be merged from my side.

@dc-almeida dc-almeida merged commit c16ee58 into IAMconsortium:main Dec 13, 2024
11 checks passed
@dc-almeida dc-almeida deleted the data-validator-warning branch December 17, 2024 09:30
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Add warning-feature to DataValidator
3 participants