Weird issue with 'step' value from seasonal forecasts #38

Closed
matteodefelice opened this issue Dec 6, 2018 · 5 comments
Labels
duplicate This issue or pull request already exists

Comments

@matteodefelice

I have downloaded a dataset from seasonal-monthly-single-levels with starting month 1 and lead-time from 1 to 4 for several years.
I have loaded the grib file with:
d = xr.open_dataset('/path/to/grib', engine = 'cfgrib')
and in the step field I get some strange numbers:

<xarray.DataArray 'step' (step: 7)>
array([ 2678400000000000,  5097600000000000,  5184000000000000,
        7776000000000000,  7862400000000000, 10368000000000000,
       10454400000000000], dtype='timedelta64[ns]')
Coordinates:
  * step     (step) timedelta64[ns] 31 days 59 days ... 120 days 121 days
    surface  int64 ...
Attributes:
    standard_name:  forecast_period
    long_name:      time since forecast_reference_time

Apparently, the steps reflect the variable number of days in February. This is a big problem because for some step values I get NaN in some years, which makes it impossible to compute climatologies, for example.

Instead, I would expect step to have the same size as leadtime_month. Is this expected behaviour? Can you suggest a way to deal with it?

@alexamici
Contributor

@matteodefelice unfortunately this is the expected behaviour at the moment. I feel your pain.

The problem is that these GRIB files express the step in hours and, since months have different lengths, the files don't typically produce a well-formed hypercube in time and step. cfgrib builds a technically correct hypercube by filling the 'gaps' with np.nan, but the result is not easy to work with.
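
To make the gaps concrete, here is a minimal sketch in plain Python (no cfgrib involved). It assumes that each step marks the end of the averaged month, counted from a start on the first day of the month, which is consistent with the step values listed earlier in this thread. The helper `step_days` is a hypothetical name used only for illustration.

```python
from datetime import date

def step_days(year, start_month, lead):
    """Days from the forecast start to the end of the lead-th averaged month.

    Assumption: the forecast starts on the first day of start_month and the
    step points at the first day of the month that follows the averaged one.
    """
    start = date(year, start_month, 1)
    months = start_month - 1 + lead
    end = date(year + months // 12, months % 12 + 1, 1)
    return (end - start).days

years = range(1993, 2001)
leads = range(1, 5)

# Steps that actually occur for each January start.
present = {y: {step_days(y, 1, lead) for lead in leads} for y in years}

# The union over all years is what ends up as the 'step' coordinate.
all_steps = sorted(set().union(*present.values()))
print(all_steps)  # [31, 59, 60, 90, 91, 120, 121]

# Every year fills only 4 of the 7 step slots; the rest would be NaN.
for y in years:
    row = ["data" if s in present[y] else "nan" for s in all_steps]
    print(y, row)
```

Leap years (1996 and 2000 in this range) use the 60, 91 and 121 day steps, while the other years use 59, 90 and 120, so no single year fills the whole step axis.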

Changing the dimension coordinate from step to valid_time (non-trivial to do) may help, but the data would still include a lot of np.nan. On the other hand defining a real leadtime_month coordinate looks really tricky.
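
As a rough sketch of what a derived lead-month value could look like, the snippet below computes it from a start date and a step in plain Python. It assumes valid_time is simply time + step and that it falls on the first day of the month following the averaged one. The function `leadtime_month` is a hypothetical helper, not an existing cfgrib coordinate or function.

```python
from datetime import date, timedelta

def leadtime_month(reference, step_days):
    """Number of whole calendar months between reference and valid time.

    Assumption: valid time = reference + step, and it lands on the first
    day of a month, so a plain month difference is enough.
    """
    valid = reference + timedelta(days=step_days)
    return (valid.year - reference.year) * 12 + (valid.month - reference.month)

# A common year and a leap year reach the same lead month
# through different step lengths.
print(leadtime_month(date(1995, 1, 1), 59))  # 2
print(leadtime_month(date(1996, 1, 1), 60))  # 2
```

This only covers the cells that hold data. Steps that exist purely as NaN padding (for example 59 days after 1996-01-01) do not land on a month boundary and would need to be dropped first.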

Ideas and especially PRs are welcome!

@alexamici added the bug and help wanted labels on Dec 6, 2018
@matteodefelice
Author

Well, the most important thing for me is that you folks are aware of this. The good news is that if you convert with grib_to_netcdf this problem disappears because the conversion "merges" the dimensions similarly to what you get if you sum time and step dimensions.
I have the impression that with a bit of pandas and xarray wizardry we may find a workaround, for example by stacking the two temporal dimensions and aggregating in a smart way according to the month. Before going back (again!) to the NetCDF conversion I will try to find one, and I will keep you posted.
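
One possible shape for such a workaround is sketched below on synthetic data. It is not a tested recipe for real cfgrib output. It assumes that every start date has the same number of non-NaN steps and that they appear in lead order. The function `to_leadtime_month` is a hypothetical name.

```python
import numpy as np
import pandas as pd
import xarray as xr

# Synthetic stand-in for the padded hypercube: two January starts
# (a common year and a leap year) with two lead months each.
times = pd.to_datetime(["1995-01-01", "1996-01-01"])
steps = pd.to_timedelta([31, 59, 60], unit="D")
values = np.array([[1.0, 2.0, np.nan],
                   [3.0, np.nan, 4.0]])
da = xr.DataArray(values, coords={"time": times, "step": steps},
                  dims=("time", "step"))

def to_leadtime_month(da):
    """Replace the padded 'step' axis with an integer lead-month axis.

    Assumption: after dropping the all-NaN steps, each start date keeps
    the same number of steps, ordered by lead time.
    """
    pieces = []
    for t in da["time"].values:
        # Keep only the steps that hold data for this start date.
        one = da.sel(time=t).dropna("step", how="all")
        n = one.sizes["step"]
        # Relabel the remaining steps 1..n and rename the dimension.
        one = one.assign_coords(step=np.arange(1, n + 1))
        pieces.append(one.rename(step="leadtime_month"))
    return xr.concat(pieces, dim="time")

compact = to_leadtime_month(da)
climatology = compact.mean("time")  # one value per lead month, no NaN
print(compact.dims)
print(climatology.values)
```

The per-start loop trades speed for clarity; with many start dates a stacked or vectorised version would probably be preferable.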

@alexamici
Contributor

@matteodefelice what request are you using?

Last time I checked grib_to_netcdf crashed on my test request:

cds.retrieve(
    'seasonal-monthly-single-levels',
    {
        'originating_centre': 'ecmwf',
        'variable': 'maximum_2m_temperature_in_the_last_24_hours',
        'product_type': 'monthly_mean',
        'year': '2018',
        'month': ['04', '05'],
        'leadtime_month': ['1', '2'],
        'grid': ['3', '3'],
        'format': 'grib',
    },
    'out.grib',
)

@matteodefelice
Author

This is my request:

c.retrieve(
    'seasonal-monthly-single-levels',
    {
        'originating_centre':'ecmwf',
        'variable':'mean_sea_level_pressure',
        'product_type':'monthly_mean',
        'year':[
            '1993','1994','1995',
            '1996','1997','1998',
            '1999','2000'
        ],
        'month':'01',
        'leadtime_month':[
            '1','2','3'
        ],
        'format':'grib'
    },
    'download.grib')

This one works perfectly with grib_to_netcdf (grib_api version 1.26.1 installed from anaconda)

@alexamici added the duplicate label and removed the bug and help wanted labels on Sep 24, 2019
@alexamici
Contributor

Even though this issue is older than #97, I'd like to close this one as a duplicate of it, since the discussion there has added useful insights. @matteodefelice hope you don't mind.
