
harmonic model: only fit orders up to local minimum and use coeffs from previous order as first guess #443

Merged

Conversation

@veni-vidi-vici-dormivi (Collaborator) commented May 7, 2024

As already mentioned in the issue above, we can use the fit from the previous order as the first guess for the next order. Moreover, we can fit orders only until we reach a local minimum, instead of fitting all orders and then selecting the one with the smallest loss.
This brings the time needed to fit the harmonic model down from 40.8 seconds to 8.9 seconds for a 250-year × 118-gridpoint dataset.
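A minimal sketch of the idea (a simplified illustration, not the actual mesmer implementation; `fit_one_order` and the number of parameters per order are placeholders):

```python
import numpy as np


def fit_orders_until_local_minimum(fit_one_order, max_order):
    # fit_one_order(order, first_guess) -> (coeffs, score); lower score is better
    # (e.g. the loss or the BIC). Assumed to wrap scipy.optimize.least_squares.
    best = None
    previous_coeffs = None

    for order in range(1, max_order + 1):
        # warm start: pad the previous order's coefficients with zeros
        n_params = 4 * order  # placeholder for the number of parameters per order
        first_guess = np.zeros(n_params)
        if previous_coeffs is not None:
            first_guess[: previous_coeffs.size] = previous_coeffs

        coeffs, score = fit_one_order(order, first_guess)

        if best is not None and score > best[2]:
            # the score increased again -> we passed a local minimum, stop fitting
            break

        best = (order, coeffs, score)
        previous_coeffs = coeffs

    return best  # (order, coeffs, score) at the local minimum
```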

codecov bot commented May 7, 2024

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 83.77%. Comparing base (9b0b76b) to head (7f991da).
Report is 84 commits behind head on main.

Additional details and impacted files
@@            Coverage Diff             @@
##             main     #443      +/-   ##
==========================================
- Coverage   87.90%   83.77%   -4.14%     
==========================================
  Files          40       44       +4     
  Lines        1745     1954     +209     
==========================================
+ Hits         1534     1637     +103     
- Misses        211      317     +106     
Flag       Coverage Δ
unittests  83.77% <100.00%> (-4.14%) ⬇️


@veni-vidi-vici-dormivi changed the title from "Harmonic model only fit orders until local minimum" to "Harmonic model only fit orders until local minimum and use coeffs from previous orders as first guess" on May 7, 2024
@veni-vidi-vici-dormivi (Collaborator, Author) commented:

We could also write a function for finding the local minimum, similar to minimize_local_discrete. I wondered whether we could reuse that one here, but I would like to preserve all of the output of fit_fourier_series_np, which minimize_local_discrete is not designed for (and which we also don't need in its original use case).

What do you think @mathause: leave it like this, factor it out into a separate minimize function, rewrite the original minimize_local_discrete function, or use the original minimize_local_discrete function and get the coeffs and predictions in a separate call?
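For reference, a hypothetical sketch of what such a separate minimize helper could look like (the name, signature, and the convention that func returns a tuple with the value to minimize first are illustrative only, not the existing minimize_local_discrete API):

```python
def minimize_local_discrete_keep_output(func, candidates, **kwargs):
    """Evaluate func for each candidate in order and stop at the first local minimum.

    func(candidate, **kwargs) must return a tuple whose first element is the
    value to minimize; the full output of the best call is returned as well.
    """
    best_candidate, best_output = None, None

    for candidate in candidates:
        output = func(candidate, **kwargs)

        if best_output is not None and output[0] > best_output[0]:
            # the value increased again -> the previous candidate was a local minimum
            break

        best_candidate, best_output = candidate, output

    return best_candidate, best_output
```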

@mathause (Member) left a comment


  • I first thought to allow first_guess=None, but as (if) this function is internal, I think it is fine.
  • The way it is currently set up (returning the loss and not the residuals), loss="cauchy" has no effect - can you confirm? (Also, cauchy does not seem to be recommended.)

@mathause (Member) commented May 8, 2024

> We could also write a function for finding the local minimum similar to minimize_local_discrete. I wondered if we could use this one here but I would like to preserve all the function output of fit_fourier_series_np and minimize_local_discrete is not made for that, and we also don't need it in its original use case.
>
> What do you think @mathause, leave it like this, factor it out to a separate minimize function, rewrite the original minimize_local_discrete function or use the original minimize_local_discrete function and get the coeffs and predictions in another call?

  • Careful - you can only abort the loop early if it is strictly monotonic - can you confirm this?
  • Yes, it would be nice if we could use minimize_local_discrete, but I don't see an easy way, so let's leave it as it is here.

@mathause (Member) commented:

(I should maybe not have reviewed this PR, but thought it would be an easy win...)

@veni-vidi-vici-dormivi (Collaborator, Author) commented:

> the way it is currently set up (returning the loss and not the residuals), loss="cauchy" has no effect - can you confirm? (also cauchy does not seem to be recommended)

Actually, no, it does have an influence... could it be that this is actually wrong? The documentation says func should return the residuals. Could it be that in this way the loss is interpreted as a residual?

@mathause (Member) commented:

I agree and think returning the residuals is the correct thing to do... However, this sentence in the docstring had me assume that returning the loss is fine as well: https://github.com/scipy/scipy/blob/7dcd8c59933524986923cde8e9126f5fc2e6b30b/scipy/optimize/_lsq/least_squares.py#L265

> It must allocate and return a 1-D array_like of shape (m,) or a scalar.

I assumed loss="cauchy" would have no influence, as np.log1p(z) should be monotonic for any scalar z (is that the right word?), i.e. $x > y \implies \ln(x) > \ln(y)$. However, the first and second derivatives also flow into that (rho[1] and rho[2]) - maybe that has an influence. Or is it the $+1$ part of np.log1p(z)?

https://github.com/scipy/scipy/blob/7dcd8c59933524986923cde8e9126f5fc2e6b30b/scipy/optimize/_lsq/least_squares.py#L190
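To illustrate (a rough sketch of what loss="cauchy" does, not scipy's code verbatim): least_squares evaluates rho on the squared (scaled) residuals, and the derivatives rho[1] and rho[2] re-weight the residuals inside the solver, which is how cauchy can change the result even though log1p itself is monotonic.

```python
import numpy as np


def cauchy_rho(z):
    # z: squared residuals (after scaling with f_scale)
    rho0 = np.log1p(z)            # used for the cost
    rho1 = 1.0 / (1.0 + z)        # first derivative: down-weights large residuals
    rho2 = -1.0 / (1.0 + z) ** 2  # second derivative: enters the Gauss-Newton step
    return rho0, rho1, rho2


z = np.array([0.01, 1.0, 100.0]) ** 2
rho0, rho1, rho2 = cauchy_rho(z)
print(rho1)  # roughly [1.0, 0.5, 1e-4]: large residuals contribute much less
```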

@veni-vidi-vici-dormivi (Collaborator, Author) commented:

> Careful - you can only abort the loop early if it is strictly monotonic - can you confirm this?

Empirically - yes; mathematically - 😬 no?

But logically, since we assume that our data follows a "pretty" underlying function, i.e. an oscillating curve with a period of 12, I think it is fair to assume that each additional order brings us closer to the actual curve and that the residuals decrease monotonically.

@mathause changed the title from "Harmonic model only fit orders until local minimum and use coeffs from previous orders as first guess" to "harmonic model: only fit orders up to local minimum and use coeffs from previous order as first guess" on May 14, 2024
@mathause (Member) left a comment


Ok let's add the two TODO comments and merge.

@veni-vidi-vici-dormivi (Collaborator, Author) commented May 15, 2024

> I assumed loss="cauchy" would have no influence as np.log1p(z) should be monotonic for any scalar z (is that the right word?). I.e. $x > y \implies \ln(x) > \ln(y)$. However, the first and second derivatives also flow into that (rho[1] and rho[2]) - maybe that has an influence. Or is it the $+1$ part of np.log1p(z)?

Hm, the first derivative is not monotonic, so maybe that's it. I am not exactly sure why it leads to different results. I made a few observations:

  • returning the loss, with and without cauchy, leads to coefficients that differ at an order of $10^{-5}$ and predictions that differ at $10^{-4}$; the returned MSE is the same for both cases (at least up to $10^{-8}$).
  • returning the residuals without cauchy (i.e. just the default, linear, loss function), the coefficients and predictions are similar to the ones obtained when returning the loss (differences of $10^{-4}$ and $10^{-3}$), also with the same MSE as the other two (1.012).
  • returning the residuals and using cauchy yields quite different coefficients (differing from the others at $10^{-3}$) and predictions (at $10^{-1}$), and a greater MSE than all the others (1.016).
  • returning the residuals and using Huber instead of cauchy gets us an MSE of 1.014.

What it does in the background when we return the residuals is apply the loss function to the residuals before computing the MSE, and cauchy dampens outliers. I'm not sure, though, what it does when func returns a scalar. Thus I would say returning the residuals and using the linear loss function is the best option.

However, I found it nice to have the MSE of the chosen parameters returned by the optimize result, which we lose when we do it like that. OptimizeResult contains the final value of the cost function, which is, however, MSE(lossfunc(resids)) and not MSE(resids). Computationally it's not so bad, I guess, to calculate the MSE twice. In that case, though, I would find it prettier to pass the residuals to calculate_bic and compute the MSE inside of it, instead of calculating the MSE in fit_to_bic between fitting the Fourier series and calculating the BIC.
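A minimal sketch of that option, assuming a generic predict callable (the helper name and signature are illustrative, not the existing code): return the residuals with the default linear loss and recompute the MSE once afterwards, so it can be handed to calculate_bic (or the residuals themselves, if the MSE is computed inside calculate_bic).

```python
import numpy as np
import scipy.optimize


def fit_with_linear_loss(predict, first_guess, target):
    # predict(coeffs) -> predicted values, e.g. the harmonic model for one order
    def residuals(coeffs):
        return predict(coeffs) - target

    # loss="linear" is the default, so no robust loss is applied to the residuals
    result = scipy.optimize.least_squares(residuals, first_guess)

    resids = residuals(result.x)
    mse = np.mean(resids**2)  # recomputed once; cheap compared to the fit itself

    return result.x, mse  # mse (or resids) can then be passed on for the BIC
```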

> Ok let's add the two TODO comments and merge.

I'll add the TODOs (including some from the discussion above), but I want to wait with merging until after we have merged the tests.

veni-vidi-vici-dormivi and others added 8 commits May 15, 2024 15:03
…SMER-group#444)

* rename i and j to gridcell and year for readability and some commentary
* Adjust bounds for lambda optimization
* update comment and change to numpy array; implement nansum in generate_harmonic model
* add tests for harmonic model
* flaking
* generalise tests
* add test for fit to bic xr, rearrange and remove order in generate fs
* pandas datetime version compliance
* Revert "pandas datetime version compliance" (reverts commit 40a1f45)
* actual pandas datetime compliance
* make coefficients a numpy array
* extend tests on generate fourier series
* write out tolerance in test_fit_to_bic
* write out max_order
* use time.dt.month for months
* fix for the nan and zero wrangling
* verbose 0.1 in xr test
* add tests for dataarray in xr func
* add test for numerical stability
* test with non constant predictors
* remove comment about close residuals
* add Note
* adjust bounds of power transformer fit
* add comments

Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Co-authored-by: Mathias Hauser <mathause@users.noreply.github.com>
veni-vidi-vici-dormivi and others added 12 commits May 23, 2024 16:50
…group#447)

* remove rosender for pt jacobian
* remove rosender import
* align equality check in yeo johnson transform with scikit learn yeo-johnson transform
* add NOTE
* deprecate mask_3D_frac_approx
* changelog
* use public func again
* ignore warnings on the module level
* forgotten rename
* use `mask_3D_frac_approx` from regionmask
* bugfix
* fix linting
* flip names again
* clean up dependencies
* add nc-time-axis
* changelog
* add plotting dependencies as extras
* update min-all-deps.yml
* rename extra to complete
* RTD: remove transitive deps from docs env
* add comment

Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
@veni-vidi-vici-dormivi veni-vidi-vici-dormivi merged commit 74e070b into MESMER-group:main May 23, 2024
9 checks passed
@veni-vidi-vici-dormivi veni-vidi-vici-dormivi deleted the hm_optimizing branch May 23, 2024 15:57
Development

Successfully merging this pull request may close these issues.

MESMER-M: use coeffs from order - 1 as first guess