Fix loading4mesmerx #502

Merged

yquilcaille
Collaborator

Lorenzo warned me that training with MESMER-X returns many NaN coefficients. I looked into it and found two separate issues.
Problem 1:
Loading data for MESMER-X requires ensuring that the same ensemble members are loaded for every variable. MESMER uses only temperature, so this is not an issue there. With several variables, however, each variable may have a different set of ensemble members available. The predictor and the target must therefore be loaded with the same set of members, in the same order. For some reason, that was not the case here.
Solution 1:
I implemented some code that I was using in former versions of MESMER-X.
Results 1:
It works normally now.
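
As an illustration of the member alignment described in Problem 1, here is a minimal sketch. It is not the code added in this PR. The helper `common_members`, the variable names `tas` and `txx`, and the member labels are all hypothetical. The idea is to keep only the members available for every variable and to use one fixed order for both predictor and target.

```python
def common_members(members_by_variable):
    """Return the members available for every variable, in sorted order.

    Hypothetical helper: `members_by_variable` maps a variable name to the
    list of ensemble members found on disk for that variable.
    """
    member_sets = [set(members) for members in members_by_variable.values()]
    if not member_sets:
        return []
    # Keep only members present for all variables, in one deterministic order.
    return sorted(set.intersection(*member_sets))


# Hypothetical availability: predictor and target do not share all members,
# and the lists are not in the same order.
available = {
    "tas": ["r1i1p1f1", "r2i1p1f1", "r3i1p1f1"],
    "txx": ["r3i1p1f1", "r1i1p1f1", "r4i1p1f1"],
}

members = common_members(available)

# Every variable is then loaded with the same members, in the same order.
members_to_load = {variable: list(members) for variable in available}
```

Sorting the intersection is one simple way to fix the order. Any order works as long as it is identical for the predictor and the target.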

Problem 2:
I validated the training of MESMER-X on different distributions and expressions, but I realized that I forgot to check distributions with shape parameters. The NaNs appeared because the coefficients of the shape parameter were not estimated well enough during the first guess. As a result, some points of the sample fell outside the support of the distribution, which is flagged as an error during training, so the fit did not converge.
Solution 2:
In MESMER-X, I was using the analytical expression of the support of the distribution, together with the estimates for the location and the shape and the range of the sample, to determine a range for the shape parameter. Here, I prefer not to use an analytical expression, because it depends on each distribution and I am trying to be as general as possible. The solution I chose is to use the support function that comes with the distributions of scipy.stats, and thus with the class Expression, and to tune the coefficients that are not on the location or the scale (not always a shape parameter, and sometimes more than one) so that the whole sample lies within the support.
Results 2:
No more NaNs in the coefficients. Validated on a GEV distribution, with linear evolution of location and scale, and with linear evolution of location, scale and shape. I noticed a slight speed-up of the training, but the first guess still takes too long.
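
To illustrate the idea behind Solution 2, here is a minimal sketch that uses the `support` method of `scipy.stats` distributions to check whether a sample lies within the support, and adjusts a shape first guess until it does. It is not the implementation in this PR. The helpers `sample_in_support` and `shrink_shape_until_valid`, the halving strategy, and all numerical values are assumptions chosen for illustration.

```python
import numpy as np
from scipy import stats


def sample_in_support(dist, sample, shape, loc, scale):
    """Check that every point of the sample lies strictly inside the support."""
    # `support` broadcasts over array-valued parameters, so a location that
    # varies with a predictor gives one pair of bounds per sample point.
    lower, upper = dist.support(shape, loc=loc, scale=scale)
    return bool(np.all((sample > lower) & (sample < upper)))


def shrink_shape_until_valid(dist, sample, shape, loc, scale,
                             factor=0.5, max_iter=50):
    """Shrink the shape towards zero until the sample fits in the support.

    Assumption for this sketch: moving the shape towards zero widens the
    support, which holds for the GEV but not for every distribution.
    """
    for _ in range(max_iter):
        if sample_in_support(dist, sample, shape, loc, scale):
            return shape
        shape = shape * factor
    raise RuntimeError("no valid shape found within max_iter iterations")


# Illustrative data: location evolves linearly with a predictor, scale is constant.
predictor = np.linspace(0.0, 1.0, 11)
loc = 1.0 + 2.0 * predictor
scale = 1.0
offsets = np.array([-0.5, 0.2, 1.0, 3.0, -1.0, 0.5, 0.0, 2.0, -0.2, 0.8, 1.5])
sample = loc + offsets

# Note: scipy's genextreme is bounded above for a positive shape `c`.
first_guess_shape = 0.5
tuned_shape = shrink_shape_until_valid(
    stats.genextreme, sample, first_guess_shape, loc, scale
)
```

Because the check relies only on `support`, the same test applies to any `scipy.stats` distribution without writing its support analytically. The adjustment rule, however, is specific to this sketch.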

  • Closes #xxx
  • Tests added
  • Fully documented, including CHANGELOG.rst


codecov bot commented Aug 22, 2024

Codecov Report

Attention: Patch coverage is 0% with 49 lines in your changes missing coverage. Please review.

Project coverage is 49.78%. Comparing base (c94a5f1) to head (c15e6fc).
Report is 47 commits behind head on main.

| Files with missing lines | Patch % | Lines |
|---|---|---|
| mesmer/mesmer_x/train_l_distrib_mesmerx.py | 0.00% | 31 Missing ⚠️ |
| mesmer/mesmer_x/load_cmip_mesmerx.py | 0.00% | 14 Missing ⚠️ |
| mesmer/mesmer_x/temporary_support.py | 0.00% | 4 Missing ⚠️ |

Additional details and impacted files
@@            Coverage Diff             @@
##             main     #502      +/-   ##
==========================================
- Coverage   50.32%   49.78%   -0.54%     
==========================================
  Files          50       50              
  Lines        3527     3565      +38     
==========================================
  Hits         1775     1775              
- Misses       1752     1790      +38     

| Flag | Coverage Δ |
|---|---|
| unittests | 49.78% <0.00%> (-0.54%) ⬇️ |

@veni-vidi-vici-dormivi veni-vidi-vici-dormivi merged commit a4c293a into MESMER-group:main Aug 22, 2024
9 checks passed
@yquilcaille yquilcaille deleted the fix_loading4mesmerx branch February 10, 2025 16:18
2 participants