Skip to content

Commit

Permalink
Remove linear regression parameters from harmonic model (MESMER-group…
Browse files Browse the repository at this point in the history
…#488)

* factor out linear regression harmonic model

* rename generate_fourier_series

* remove unnecessary import

* fix test

* adjust api.rst

* doc nit

* changelog

---------

Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Co-authored-by: Mathias Hauser <mathause@users.noreply.github.com>
  • Loading branch information
3 people authored Aug 8, 2024
1 parent a75a73c commit 3035928
Show file tree
Hide file tree
Showing 5 changed files with 56 additions and 52 deletions.
12 changes: 9 additions & 3 deletions CHANGELOG.rst
Original file line number Diff line number Diff line change
Expand Up @@ -60,7 +60,8 @@ that this led to some numerical changes compared to the MESMER-M publication
- move the harmonic model and power transformer functionalities to the stats module (
`#484 <https://github.com/MESMER-group/mesmer/pull/484>`_).

**Auto-Regression**
Auto-Regression
~~~~~~~~~~~~~~~

- Implement functions performing the monthly (cyclo-stationary) auto-regression and adapt these functions to
work with xarray. This includes extracting the drawing of spatially correlated innovations to a
Expand All @@ -74,7 +75,8 @@ that this led to some numerical changes compared to the MESMER-M publication
- Implement function to localize the empirical covarince matrix for each month individually to use in drawing
of spatially correlated noise in the AR process. (`#479 <https://github.com/MESMER-group/mesmer/pull/479>`_)

**Yeo-Johnson power transformer**
Yeo-Johnson power transformer
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

- Ensure the power transformer yields the correct normalization for more cases (
`#440 <https://github.com/MESMER-group/mesmer/issues/440>`_):
Expand All @@ -100,17 +102,21 @@ that this led to some numerical changes compared to the MESMER-M publication
- add tests (`#430 <https://github.com/MESMER-group/mesmer/pull/430>`_)


**Harmonic model**
Harmonic model
~~~~~~~~~~~~~~

- Performance and other optimizations:

- only fit orders up to local minimum and use coeffs from precious order as first guess (`#443 <https://github.com/MESMER-group/mesmer/pull/443>`_)
- infer the harmonic model order from the coefficients (`#434 <https://github.com/MESMER-group/mesmer/pull/434>`_)
- return residuals instead of the loss for the optimization (`#460 <https://github.com/MESMER-group/mesmer/pull/460>`_)
- remove fitting of linear regression with yearly temperature (`#415 <https://github.com/MESMER-group/mesmer/pull/415/>_` and
`#488 <https://github.com/MESMER-group/mesmer/pull/488>`_) in line with (`Nath et al. 2022 <https://doi.org/10.5194/esd-13-851-2022>`_).
- add helper function to upsample yearly data to monthly resolution (
`#418 <https://github.com/MESMER-group/mesmer/pull/418>`_, and
`#435 <https://github.com/MESMER-group/mesmer/pull/435>`_)
- de-duplicate the expression of months in their harmonic form (`#415 <https://github.com/MESMER-group/mesmer/pull/415>`_)
move creation of the month array to the deepest level (`#487 <https://github.com/MESMER-group/mesmer/pull/487>`_).
- fix indexing of harmonic model coefficients (`#415 <https://github.com/MESMER-group/mesmer/pull/415>`_)
- Refactor variable names, small code improvements, fixes and clean docstring (
`#415 <https://github.com/MESMER-group/mesmer/pull/415>`_,
Expand Down
2 changes: 1 addition & 1 deletion docs/source/api.rst
Original file line number Diff line number Diff line change
Expand Up @@ -43,7 +43,7 @@ Harmonic Model
.. autosummary::
:toctree: generated/

~stats.generate_fourier_series
~stats.predict_harmonic_model
~stats.fit_harmonic_model

Power Transformer
Expand Down
4 changes: 2 additions & 2 deletions mesmer/stats/__init__.py
Original file line number Diff line number Diff line change
Expand Up @@ -9,7 +9,7 @@
select_ar_order,
)
from mesmer.stats._gaspari_cohn import gaspari_cohn, gaspari_cohn_correlation_matrices
from mesmer.stats._harmonic_model import fit_harmonic_model, generate_fourier_series
from mesmer.stats._harmonic_model import fit_harmonic_model, predict_harmonic_model
from mesmer.stats._linear_regression import LinearRegression
from mesmer.stats._localized_covariance import (
adjust_covariance_ar1,
Expand Down Expand Up @@ -48,7 +48,7 @@
"lowess",
# harmonic model
"fit_harmonic_model",
"generate_fourier_series",
"predict_harmonic_model",
# power transformer
"lambda_function",
"get_lambdas_from_covariates",
Expand Down
22 changes: 11 additions & 11 deletions mesmer/stats/_harmonic_model.py
Original file line number Diff line number Diff line change
Expand Up @@ -39,11 +39,6 @@ def _generate_fourier_series_np(yearly_predictor, coeffs):
n_years = yearly_predictor.size // 12
months = np.tile(np.arange(1, 13), n_years)

# fix these parameters, according to paper
# we could also fit them and give an initial guess of 0 and 1 in the coeffs array as before
beta0 = 0
beta1 = 1

seasonal_cycle = np.nansum(
[
(coeffs[idx * 4] * yearly_predictor + coeffs[idx * 4 + 1])
Expand All @@ -54,11 +49,11 @@ def _generate_fourier_series_np(yearly_predictor, coeffs):
],
axis=0,
)
return beta0 + beta1 * yearly_predictor + seasonal_cycle
return seasonal_cycle


def generate_fourier_series(yearly_predictor, coeffs, time, time_dim="time"):
"""construct a Fourier Series from yearly predictors with fitted coeffs - numpy wrapper.
def predict_harmonic_model(yearly_predictor, coeffs, time, time_dim="time"):
"""construct a Fourier Series from yearly predictors with fitted coeffs.
Parameters
----------
Expand Down Expand Up @@ -91,7 +86,7 @@ def generate_fourier_series(yearly_predictor, coeffs, time, time_dim="time"):
yearly_predictor, time, time_dim
)

predictions = xr.apply_ufunc(
predictions = upsampled_y + xr.apply_ufunc(
_generate_fourier_series_np,
upsampled_y,
coeffs,
Expand Down Expand Up @@ -270,21 +265,26 @@ def fit_harmonic_model(yearly_predictor, monthly_target, max_order=6, time_dim="
yearly_predictor, monthly_target[time_dim], time_dim=time_dim
)

# subtract annual mean to have seasonal anomalies around 0
seasonal_deviations = monthly_target - yearly_predictor

n_sel, coeffs, preds = xr.apply_ufunc(
_fit_fourier_order_np,
yearly_predictor,
monthly_target,
seasonal_deviations,
input_core_dims=[[time_dim], [time_dim]],
output_core_dims=([], ["coeff"], [time_dim]),
vectorize=True,
output_dtypes=[int, float, float],
kwargs={"max_order": max_order},
)

preds = yearly_predictor + preds

data_vars = {
"n_sel": n_sel,
"coeffs": coeffs,
"predictions": preds,
"predictions": preds.transpose(time_dim, ...),
}

return xr.Dataset(data_vars)
68 changes: 33 additions & 35 deletions tests/unit/test_harmonic_model.py
Original file line number Diff line number Diff line change
Expand Up @@ -5,10 +5,11 @@
from packaging.version import Version

import mesmer
from mesmer.core.utils import _check_dataarray_form, upsample_yearly_data
from mesmer.core.utils import _check_dataarray_form
from mesmer.stats._harmonic_model import (
_fit_fourier_order_np,
_generate_fourier_series_np,
predict_harmonic_model,
)
from mesmer.testing import trend_data_1D, trend_data_2D

Expand All @@ -18,28 +19,26 @@ def test_generate_fourier_series_np():
n_years = 10
n_months = n_years * 12

yearly_predictor = np.zeros(n_months)
yearly_predictor = np.ones(n_months)
months = np.tile(np.arange(1, 13), n_years)

coeffs = np.array([0, -1, 0, -2])

# dummy yearly cycle
expected = -np.sin(2 * np.pi * (months) / 12) - 2 * np.cos(
expected = coeffs[1] * np.sin(2 * np.pi * (months) / 12) + coeffs[3] * np.cos(
2 * np.pi * (months) / 12
)
result = _generate_fourier_series_np(yearly_predictor, np.array([0, -1, 0, -2]))
np.testing.assert_equal(result, expected)

yearly_predictor = np.ones(n_months)
result = _generate_fourier_series_np(yearly_predictor, np.array([0, -1, 0, -2]))
# NOTE: yearly_predictor is added to the Fourier series
expected += 1
result = _generate_fourier_series_np(yearly_predictor, coeffs)

np.testing.assert_equal(result, expected)

result = _generate_fourier_series_np(yearly_predictor, np.array([3.14, -1, 1, -2]))
expected += 3.14 * np.sin(np.pi * months / 6) + 1 * np.cos(np.pi * months / 6)
np.testing.assert_allclose(result, expected, atol=1e-10)


def test_generate_fourier_series():
def test_predict_harmonic_model():
n_years = 10
n_lat, n_lon, n_gridcells = 2, 3, 2 * 3
freq = "AS" if Version(pd.__version__) < Version("2.2") else "YS"
Expand All @@ -53,7 +52,7 @@ def test_generate_fourier_series():

coeffs = get_2D_coefficients(order_per_cell=[1, 2, 3], n_lat=n_lat, n_lon=n_lon)

result = mesmer.stats.generate_fourier_series(
result = mesmer.stats.predict_harmonic_model(
yearly_predictor, coeffs, monthly_time, time_dim="time"
)

Expand Down Expand Up @@ -139,7 +138,9 @@ def test_fit_harmonic_model():

coefficients = get_2D_coefficients(order_per_cell=orders, n_lat=3, n_lon=2)

yearly_predictor = trend_data_2D(n_timesteps=n_ts, n_lat=3, n_lon=2)
yearly_predictor = trend_data_2D(n_timesteps=n_ts, n_lat=3, n_lon=2).transpose(
"time", "cells"
)

freq = "AS" if Version(pd.__version__) < Version("2.2") else "YS"
yearly_predictor["time"] = xr.cftime_range(
Expand All @@ -148,16 +149,9 @@ def test_fit_harmonic_model():

time = xr.cftime_range(start="2000-01-01", periods=n_ts * 12, freq="MS")
monthly_time = xr.DataArray(time, dims=["time"], coords={"time": time})
upsampled_yearly_predictor = upsample_yearly_data(yearly_predictor, monthly_time)

monthly_target = xr.apply_ufunc(
_generate_fourier_series_np,
upsampled_yearly_predictor,
coefficients,
input_core_dims=[["time"], ["coeff"]],
output_core_dims=[["time"]],
vectorize=True,
output_dtypes=[float],

monthly_target = predict_harmonic_model(
yearly_predictor, coefficients, time=monthly_time
)

# test if the model can recover the monthly target from perfect fourier series
Expand All @@ -177,22 +171,26 @@ def test_fit_harmonic_model():
# compare numerically one cell of one year
expected = np.array(
[
9.99630445,
9.98829217,
7.32212458,
2.73123514,
-2.53876124,
-7.07931947,
-9.69283667,
-9.6945128,
-7.08035255,
-2.53178204,
2.74790275,
7.34046832,
9.975936,
9.968497,
7.32234,
2.750445,
-2.520796,
-7.081546,
-9.713699,
-9.71333,
-7.077949,
-2.509761,
2.76855,
7.340076,
]
)

result_comp = result.predictions.isel(cells=0, time=slice(0, 12)).values
np.testing.assert_allclose(
result.predictions.isel(cells=0, time=slice(0, 12)).values, expected
result_comp,
expected,
atol=1e-6,
)


Expand Down

0 comments on commit 3035928

Please sign in to comment.