
harmonic model: only fit orders up to local minimum and use coeffs from previous order as first guess #443

Merged

Conversation

@veni-vidi-vici-dormivi (Collaborator) commented May 7, 2024

As already mentioned in the issue above, we can use the fit from the previous order as the first guess for the next order. Moreover, we can fit orders only until we reach a local minimum, instead of fitting all orders and then selecting the one with the smallest loss.
This brings the time needed to fit the harmonic model down from 40.8 seconds to 8.9 seconds for a 250-year × 118-gridpoint dataset.
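A minimal sketch of the idea (a simplified illustration, not the actual mesmer implementation; `fit_one_order` and the number of parameters per order are placeholders):

```python
import numpy as np


def fit_orders_until_local_minimum(fit_one_order, max_order):
    # fit_one_order(order, first_guess) -> (coeffs, score); lower score is better
    # (e.g. the loss or the BIC). Assumed to wrap scipy.optimize.least_squares.
    best = None
    previous_coeffs = None

    for order in range(1, max_order + 1):
        # warm start: pad the previous order's coefficients with zeros
        n_params = 4 * order  # placeholder for the number of parameters per order
        first_guess = np.zeros(n_params)
        if previous_coeffs is not None:
            first_guess[: previous_coeffs.size] = previous_coeffs

        coeffs, score = fit_one_order(order, first_guess)

        if best is not None and score > best[2]:
            # the score increased again -> we passed a local minimum, stop fitting
            break

        best = (order, coeffs, score)
        previous_coeffs = coeffs

    return best  # (order, coeffs, score) at the local minimum
```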

codecov bot commented May 7, 2024

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 83.77%. Comparing base (9b0b76b) to head (7f991da).
Report is 84 commits behind head on main.

Additional details and impacted files
@@            Coverage Diff             @@
##             main     #443      +/-   ##
==========================================
- Coverage   87.90%   83.77%   -4.14%     
==========================================
  Files          40       44       +4     
  Lines        1745     1954     +209     
==========================================
+ Hits         1534     1637     +103     
- Misses        211      317     +106     
Flag       Coverage Δ
unittests  83.77% <100.00%> (-4.14%) ⬇️


@veni-vidi-vici-dormivi changed the title from "Harmonic model only fit orders until local minimum" to "Harmonic model only fit orders until local minimum and use coeffs from previous orders as first guess" on May 7, 2024
@veni-vidi-vici-dormivi (Collaborator, Author) commented:

We could also write a function for finding the local minimum, similar to minimize_local_discrete. I wondered whether we could reuse that one here, but I would like to preserve all of the output of fit_fourier_series_np, which minimize_local_discrete is not designed for (and which we also don't need in its original use case).

What do you think @mathause: leave it like this, factor it out into a separate minimize function, rewrite the original minimize_local_discrete function, or use the original minimize_local_discrete function and get the coeffs and predictions in a separate call?
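For reference, a hypothetical sketch of what such a separate minimize helper could look like (the name, signature, and the convention that func returns a tuple with the value to minimize first are illustrative only, not the existing minimize_local_discrete API):

```python
def minimize_local_discrete_keep_output(func, candidates, **kwargs):
    """Evaluate func for each candidate in order and stop at the first local minimum.

    func(candidate, **kwargs) must return a tuple whose first element is the
    value to minimize; the full output of the best call is returned as well.
    """
    best_candidate, best_output = None, None

    for candidate in candidates:
        output = func(candidate, **kwargs)

        if best_output is not None and output[0] > best_output[0]:
            # the value increased again -> the previous candidate was a local minimum
            break

        best_candidate, best_output = candidate, output

    return best_candidate, best_output
```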

@mathause (Member) left a comment


  • I first thought to allow first_guess=None, but as (if) this function is internal, I think it is fine.
  • The way it is currently set up (returning the loss and not the residuals), loss="cauchy" has no effect - can you confirm? (Also, cauchy does not seem to be recommended.)

@mathause (Member) commented May 8, 2024

> We could also write a function for finding the local minimum similar to minimize_local_discrete. I wondered if we could use this one here but I would like to preserve all the function output of fit_fourier_series_np and minimize_local_discrete is not made for that, and we also don't need it in its original use case.
>
> What do you think @mathause, leave it like this, factor it out to a separate minimize function, rewrite the original minimize_local_discrete function or use the original minimize_local_discrete function and get the coeffs and predictions in another call?

  • Careful - you can only abort the loop early if it is strictly monotonic - can you confirm this?
  • Yes, it would be nice if we could use minimize_local_discrete, but I don't see an easy way, so let's leave it as it is here.

@mathause (Member) commented:

(I should maybe not have reviewed this PR, but thought it would be an easy win...)

@veni-vidi-vici-dormivi (Collaborator, Author) commented:

> the way it is currently set up (returning the loss and not the residuals), loss="cauchy" has no effect - can you confirm? (also cauchy does not seem to be recommended)

Actually, no, it does have an influence... could it be that this is actually wrong? The documentation says func should return the residuals. Could it be that in this way the loss is interpreted as a residual?

@mathause (Member) commented:

I agree and think returning the residuals is the correct thing to do... However, this sentence in the docstring had me assume that returning the loss is fine as well: https://github.com/scipy/scipy/blob/7dcd8c59933524986923cde8e9126f5fc2e6b30b/scipy/optimize/_lsq/least_squares.py#L265

> It must allocate and return a 1-D array_like of shape (m,) or a scalar.

I assumed loss="cauchy" would have no influence, as np.log1p(z) should be monotonic for any scalar z (is that the right word?), i.e. $x > y \implies \ln(x) > \ln(y)$. However, the first and second derivatives also flow into that (rho[1] and rho[2]) - maybe that has an influence. Or is it the $+1$ part of np.log1p(z)?

https://github.com/scipy/scipy/blob/7dcd8c59933524986923cde8e9126f5fc2e6b30b/scipy/optimize/_lsq/least_squares.py#L190
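To illustrate (a rough sketch of what loss="cauchy" does, not scipy's code verbatim): least_squares evaluates rho on the squared (scaled) residuals, and the derivatives rho[1] and rho[2] re-weight the residuals inside the solver, which is how cauchy can change the result even though log1p itself is monotonic.

```python
import numpy as np


def cauchy_rho(z):
    # z: squared residuals (after scaling with f_scale)
    rho0 = np.log1p(z)            # used for the cost
    rho1 = 1.0 / (1.0 + z)        # first derivative: down-weights large residuals
    rho2 = -1.0 / (1.0 + z) ** 2  # second derivative: enters the Gauss-Newton step
    return rho0, rho1, rho2


z = np.array([0.01, 1.0, 100.0]) ** 2
rho0, rho1, rho2 = cauchy_rho(z)
print(rho1)  # roughly [1.0, 0.5, 1e-4]: large residuals contribute much less
```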

@veni-vidi-vici-dormivi (Collaborator, Author) commented:

> Careful - you can only abort the loop early if it is strictly monotonic - can you confirm this?

Empirically - yes; mathematically - 😬 no?

But logically, since we assume that our data follows a "pretty" underlying function, i.e. an oscillating curve with a period of 12, I think it is fair to assume that each additional order brings us closer to the actual curve and that the residuals decrease monotonically.

@mathause changed the title from "Harmonic model only fit orders until local minimum and use coeffs from previous orders as first guess" to "harmonic model: only fit orders up to local minimum and use coeffs from previous order as first guess" on May 14, 2024
@mathause (Member) left a comment


Ok let's add the two TODO comments and merge.

@veni-vidi-vici-dormivi (Collaborator, Author) commented May 15, 2024

> I assumed loss="cauchy" would have no influence as np.log1p(z) should be monotonic for any scalar z (is that the right word?). I.e. $x > y \implies \ln(x) > \ln(y)$. However, the first and second derivatives also flow into that (rho[1] and rho[2]) - maybe that has an influence. Or is it the $+1$ part of np.log1p(z)?

Hm, the first derivative is not monotonic, so maybe that's it. I am not exactly sure why it leads to different results. I made a few observations:

  • returning the loss, with and without cauchy, leads to coefficients that differ at an order of $10^{-5}$ and predictions that differ at $10^{-4}$; the returned MSE is the same for both cases (at least up to $10^{-8}$).
  • returning the residuals without cauchy (i.e. just the default, linear, loss function), the coefficients and predictions are similar to the ones obtained when returning the loss (differences of $10^{-4}$ and $10^{-3}$), also with the same MSE as the other two (1.012).
  • returning the residuals and using cauchy yields quite different coefficients (differing from the others at $10^{-3}$) and predictions (at $10^{-1}$), and a greater MSE than all the others (1.016).
  • returning the residuals and using Huber instead of cauchy gets us an MSE of 1.014.

What it does in the background when we return the residuals is apply the loss function to the residuals before computing the MSE, and cauchy dampens outliers. I'm not sure, though, what it does when func returns a scalar. Thus I would say returning the residuals and using the linear loss function is the best option.

However, I found it nice to have the MSE of the chosen parameters returned by the optimize result, which we lose when we do it like that. OptimizeResult contains the final value of the cost function, which is, however, MSE(lossfunc(resids)) and not MSE(resids). Computationally it's not so bad, I guess, to calculate the MSE twice. In that case, though, I would find it prettier to pass the residuals to calculate_bic and compute the MSE inside of it, instead of calculating the MSE in fit_to_bic between fitting the Fourier series and calculating the BIC.
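A minimal sketch of that option, assuming a generic predict callable (the helper name and signature are illustrative, not the existing code): return the residuals with the default linear loss and recompute the MSE once afterwards, so it can be handed to calculate_bic (or the residuals themselves, if the MSE is computed inside calculate_bic).

```python
import numpy as np
import scipy.optimize


def fit_with_linear_loss(predict, first_guess, target):
    # predict(coeffs) -> predicted values, e.g. the harmonic model for one order
    def residuals(coeffs):
        return predict(coeffs) - target

    # loss="linear" is the default, so no robust loss is applied to the residuals
    result = scipy.optimize.least_squares(residuals, first_guess)

    resids = residuals(result.x)
    mse = np.mean(resids**2)  # recomputed once; cheap compared to the fit itself

    return result.x, mse  # mse (or resids) can then be passed on for the BIC
```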

> Ok let's add the two TODO comments and merge.

I'll add the TODOs (including some from the discussion above), but I want to wait with merging until after we have merged the tests.

veni-vidi-vici-dormivi and others added 8 commits May 15, 2024 15:03
…SMER-group#444)

* rename i and j to gridcell and year for readability and some commentary
* Adjust bounds for lambda optimization
* update comment and change to numpy array; implement nansum in generate_harmonic model
* add tests for harmonic model
* flaking
* generalise tests
* add test for fit to bic xr, rearrange and remove order in generate fs
* pandas datetime version compliance
* Revert "pandas datetime version compliance" (reverts commit 40a1f45)
* actual pandas datetime compliance
* make coefficients a numpy array
* extend tests on generate fourier series
* write out tolerance in test_fit_to_bic
* write out max_order
* use time.dt.month for months
* fix for the nan and zero wrangling
* verbose 0.1 in xr test
* add tests for dataarray in xr func
* add test for numerical stability
* test with non constant predictors
* remove comment about close residuals
* add Note
* adjust bounds of power transformer fit
* add comments

Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Co-authored-by: Mathias Hauser <mathause@users.noreply.github.com>
veni-vidi-vici-dormivi and others added 12 commits May 23, 2024 16:50
…group#447)

* remove rosender for pt jacobian
* remove rosender import
* align equality check in yeo johnson transform with scikit learn yeo-johnson transform
* add NOTE
* deprecate mask_3D_frac_approx
* changelog
* use public func again
* ignore warnings on the module level
* forgotten rename
* use `mask_3D_frac_approx` from regionmask
* bugfix
* fix linting
* flip names again
* clean up dependencies
* add nc-time-axis
* changelog
* add plotting dependencies as extras
* update min-all-deps.yml
* rename extra to complete
* RTD: remove transitive deps from docs env
* add comment

Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
@veni-vidi-vici-dormivi veni-vidi-vici-dormivi merged commit 74e070b into MESMER-group:main May 23, 2024
9 checks passed
@veni-vidi-vici-dormivi veni-vidi-vici-dormivi deleted the hm_optimizing branch May 23, 2024 15:57
Development

Successfully merging this pull request may close these issues.

MESMER-M: use coeffs from order - 1 as first guess