Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Move temporal 'regridding' from multi_model_statistics preprocessor to regrid_time preprocessor #744

Open
bouweandela opened this issue Aug 7, 2020 · 8 comments
Labels
enhancement New feature or request preprocessor Related to the preprocessor

Comments

@bouweandela
Copy link
Member

At the moment, the preprocessor multi_model_statistics will also attempt to automatically 'regrid' the time coordinate of the incoming datasets. Other coordinates, such as horizontal and vertical coordinates, need to be explicitly regridded by defining a preprocesor in the recipe. It would be good to be consistent and make the temporal regridding more visible and move the temporal regridding to the regrid_time preprocessor. If the datasets have a time coordinate, users would then need to specify the regrid_time preprocessor in the recipe as well as as the regrid and extract_levels.

This would also fix the problem described in #738, that multi_model_statistics does not work for variables that do not have a time coordinate.

@bouweandela bouweandela added enhancement New feature or request preprocessor Related to the preprocessor labels Aug 7, 2020
@Peter9192
Copy link
Contributor

It would be really useful if regrid_time could then also check the calendar (as does mmstats, also see #685). I keep running into incompatible time coordinates because the calendars differ.

@schlunma
Copy link
Contributor

It would be really useful if regrid_time could then also check the calendar (as does mmstats, also see #685). I keep running into incompatible time coordinates because the calendars differ.

Yes please!

I (and @hb326) encountered problems with iris.quickplot.plot when the time coordinates of the different cubes have different calendars. As a fix we use the following code:

def _unify_time_coord(cube):
    """Unify time coordinate of cube."""
    if not cube.coords('time', dim_coords=True):
        return
    time_coord = cube.coord('time')
    dates_points = time_coord.units.num2date(time_coord.points)
    dates_bounds = time_coord.units.num2date(time_coord.bounds)
    new_units = Unit('days since 1850-01-01 00:00:00')
    new_time_coord = iris.coords.DimCoord(
        new_units.date2num(dates_points),
        bounds=new_units.date2num(dates_bounds),
        var_name='time',
        standard_name='time',
        long_name='time',
        units=new_units,
    )
    coord_dims = cube.coord_dims('time')
    cube.remove_coord('time')
    cube.add_dim_coord(new_time_coord, coord_dims)

It think it would make sense to include this in regrid_time.

Peter9192 added a commit that referenced this issue Jan 7, 2021
The new regression test for the daily statistics of a 365-day calendar
failed because some days had time points at 00:00 and others at 12:00.

Since we're planning to move time handling to regrid time altogether,
this behaviour is consistent with regrid_time (see issue #744).
nielsdrost pushed a commit that referenced this issue Jan 14, 2021
* Fix calendar units to 'days since 1850-01-01' on a standard calendar

* More thorough check on source time frequency; align behaviour with regrid time; raise for daily data

* Simplify _get_overlap

* Align behaviour for union (full) and intersection (overlap) of time arrays

* Add function to make all cubes use the same calendar.

This function also checks the frequency of the cubes (previously in _datetime_to_int_days),
this seems to be a much more logical place.

_datetime_to_int_days now simplifies to a one-liner, and can probably be eliminated completely.

* Remove _datetime_to_int_days, as we can just use the time points

* Simplify and rename _slice_cube

* Align _assemble_overlap_data more with _assemble_full_data

* Align _assemble_full_data more with _assemble_overlap_data

* Futher align assemble full and overlap data

* Merge assemble full and overlap data.

* Further simplify

* Remove stuff about bounds and aux coords as it is not used anyway

* Remove stuff about bounds and aux coords as it is not used anyway

* Clean up tests and add tests for new functions

* Valeriu's suggestions

* fix tests

* Realize data before making time slices

* Avoid codacy 'pointless-statement' message

* Address Bouwe's comments

* Update esmvalcore/preprocessor/_multimodel.py

Co-authored-by: Bouwe Andela <b.andela@esciencecenter.nl>

* Don't change the calendar if not necessary

* Use template cube's calendar and don't slice by time

* Use num2date rather than cells() method

* Apply suggestions from code review

Co-authored-by: Bouwe Andela <b.andela@esciencecenter.nl>

* fix bug for monthly data in unify time

* add time bounds to output cube

* Add missing var_name attribute to time coordinate

* Force regrid of daily data and be more consistent with regrid time.

The new regression test for the daily statistics of a 365-day calendar
failed because some days had time points at 00:00 and others at 12:00.

Since we're planning to move time handling to regrid time altogether,
this behaviour is consistent with regrid_time (see issue #744).

* Add long name to new time coordinates

* Temporarily bypass coordinate check
The reference data contains some metadata that is redundant or inconsistent.
After updating the reference data, this can be reverted.

* Revert "Temporarily bypass coordinate check"

This reverts commit 2bd2d62.

* Update reference data

* Typo

Co-authored-by: Stef Smeets <stefsmeets@users.noreply.github.com>

Co-authored-by: Bouwe Andela <b.andela@esciencecenter.nl>
Co-authored-by: Stef Smeets <stefsmeets@users.noreply.github.com>
@zklaus
Copy link

zklaus commented May 10, 2021

I support the idea, but careful with the suggested fix above. That may be appropriate in a visualization script where you know the input, but I see two problems with it:

  • It seems to ignore the calendar on the input data. If the models differ in the calendar, say you have one "normal" Gregorian calendar, one calendar without leap years, and one 360 day calendar, it's not clear that the resulting time coordinate is correct.
  • It changes the reference point. This is probably usually not problematic, but may be problematic in special cases for things like paleo runs, spin-ups, or very long piControl runs. It certainly needs to take care of the branch_time attributes.

@bouweandela
Copy link
Member Author

It seems to ignore the calendar on the input data. If the models differ in the calendar, say you have one "normal" Gregorian calendar, one calendar without leap years, and one 360 day calendar, it's not clear that the resulting time coordinate is correct.

The idea would be to only do this for time frequencies where it does not matter, e.g. for monthly averages or lower frequencies it does not matter whether or not your calendar includes leap days, right?

@zklaus
Copy link

zklaus commented May 14, 2021

So you are saying this would be only applicable if both source and target frequency are monthly, seasonal, annual? There may still be a (small) problem of weights, but also this could be quite annoying since going from daily to monthly is probably an important use case.

@bouweandela
Copy link
Member Author

So you are saying this would be only applicable if both source and target frequency are monthly, seasonal, annual?

I think so. Or maybe it would still work for daily data if you use some sort of interpolation scheme? I think there is just no way you can meaningfully regrid data with a frequency lower than daily from a 360 day calendar to a 365 day calendar.

going from daily to monthly is probably an important use case.

At the moment we have the preprocessor function monthly_statistics for that. Maybe that could somehow be integrated with regrid_time, but you'd have to supply the kind of statistic (min, max, mean) as a function argument to regrid_time in the recipe because it depends on the variable what you need.

@schlunma
Copy link
Contributor

schlunma commented Feb 4, 2022

Moving this to v2.6 since there is not open PR yet.

@schlunma schlunma modified the milestones: v2.5.0, v2.6.0 Feb 4, 2022
@sloosvel
Copy link
Contributor

Any chance this can make it for 2.6? Or should it be moved?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request preprocessor Related to the preprocessor
Projects
None yet
Development

No branches or pull requests

5 participants