ValueError: unexpected encoding for scipy backend: ['chunksizes'] #1853

Closed
christianversloot opened this issue Oct 14, 2021 · 8 comments · Fixed by #1857

Comments

@christianversloot

Describe the bug
I have created a pipeline that loads MSG Native *.nat files, resamples them using a custom AreaDefinition, and then uses the NetCDF writer to create *.nc files. I used Windows for development.

To Reproduce
This is the relevant code; the problem happens internally:

# scn is a Scene that has already been loaded and resampled
file = "some_msg_name.nat"
suffix = file.split("/")[-1].replace(".nat", ".nc")
channels_save = ['IR_108', 'IR_120', 'VIS006', 'WV_062', 'IR_039', 'IR_134', 'IR_097', 'IR_087', 'VIS008', 'IR_016', 'WV_073']
scn.save_datasets(writer="cf", datasets=channels_save, filename=suffix)

Expected behavior
A *.nc file is created with my resampled satellite channels.

Actual results
On Windows, this works. On Linux, however, both inside and outside a Docker container, I get the following error:

ValueError: unexpected encoding for scipy backend: ['chunksizes']

The full stack trace suggests that this happens in xarray's scipy backend:

  File "/home/infoplaza/some-eumetsat-processing/process.py", line 287, in satpy_process
    some_scn.save_datasets(writer="cf", datasets=channels_save, filename=suffix)
  File "/home/infoplaza/.local/lib/python3.9/site-packages/satpy/scene.py", line 1089, in save_datasets
    return writer.save_datasets(dataarrays, compute=compute, **save_kwargs)
  File "/home/infoplaza/.local/lib/python3.9/site-packages/satpy/writers/cf_writer.py", line 808, in save_datasets
    res = dataset.to_netcdf(filename, engine=engine, group=group_name, mode='a', encoding=encoding,
  File "/home/infoplaza/.local/lib/python3.9/site-packages/xarray/core/dataset.py", line 1900, in to_netcdf
    return to_netcdf(
  File "/home/infoplaza/.local/lib/python3.9/site-packages/xarray/backends/api.py", line 1077, in to_netcdf
    dump_to_store(
  File "/home/infoplaza/.local/lib/python3.9/site-packages/xarray/backends/api.py", line 1124, in dump_to_store
    store.store(variables, attrs, check_encoding, writer, unlimited_dims=unlimited_dims)
  File "/home/infoplaza/.local/lib/python3.9/site-packages/xarray/backends/common.py", line 266, in store
    self.set_variables(
  File "/home/infoplaza/.local/lib/python3.9/site-packages/xarray/backends/common.py", line 304, in set_variables
    target, source = self.prepare_variable(
  File "/home/infoplaza/.local/lib/python3.9/site-packages/xarray/backends/scipy_.py", line 214, in prepare_variable
    raise ValueError(
ValueError: unexpected encoding for scipy backend: ['chunksizes']


Environment Info:

  • OS: Linux, both with and without Docker
  • Satpy Version: 0.29.0
  • PyResample Version: 1.21.0


@christianversloot
Author

Interestingly, a fix is also available:

pip install netCDF4

After installing it, the error is gone.

I am a bit hesitant to call this a bug, because it is likely an obscure error caused by omitting a dependency that Satpy needs to work properly with NetCDF4 files. Still, I am posting it here because:

  1. If that is the case, it may be worth checking dependencies more strictly in Satpy itself.
  2. This error currently cannot be found on Google, so other people may benefit from the fix.

@mraspaud
Member

mraspaud commented Oct 14, 2021

@christianversloot thanks a lot for reporting this!

It is indeed a good thing to have this shown here, at least so that other users encountering the same error can find the solution.

Regarding what Satpy should do about it: we have chosen not to put the full requirements for all the functionality in setup.py, but rather to use the "extra" requirement syntax. For the cf writer, we have https://github.com/pytroll/satpy/blob/main/setup.py#L67: 'cf': ['h5netcdf >= 0.7.3'], so h5netcdf is required if you install Satpy with pip install satpy[cf].

Of course, it might not be obvious when you install Satpy that you need exactly that extra requirement, and xarray can use multiple backends to write NetCDF, so we can end up in the kind of situation you had here if you already have scipy installed.
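For readers unfamiliar with the "extra" syntax, a hypothetical excerpt of what such a declaration looks like (the actual pin lives in Satpy's setup.py, linked above):

```python
# Hypothetical excerpt of a setup.py extras declaration; the real
# version pin is in Satpy's setup.py, linked above.
extras_require = {
    "cf": ["h5netcdf >= 0.7.3"],  # pulled in by: pip install "satpy[cf]"
}
```

Extras are opt-in: a plain pip install satpy skips them, which is exactly how a scipy-only environment like the one reported here can arise.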

So as you say, a fix could be to check that either h5netcdf or netcdf4 is installed before doing the writing.

Do you want to have a shot at a PR?
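The check described above could be sketched like this (a hypothetical helper, not the code that ended up in the linked PR), probing for a backend with importlib instead of importing it outright:

```python
import importlib.util

def require_netcdf_backend(candidates=("netCDF4", "h5netcdf")):
    """Return the first importable backend name, or raise a clear error.

    Hypothetical sketch; not Satpy's actual implementation.
    """
    for name in candidates:
        # find_spec() returns None when a top-level module cannot be located
        if importlib.util.find_spec(name) is not None:
            return name
    raise ImportError(
        "Please install one of %s to use the 'cf' writer"
        % ", ".join(candidates))
```

The writer could call require_netcdf_backend() at the start of save_datasets(), turning the cryptic scipy-backend ValueError into an actionable ImportError.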

@christianversloot
Author

Thanks for your quick response! I have reserved some time in my agenda to take a look at creating a PR on Monday.

Per https://stackoverflow.com/questions/49222824/make-an-either-or-distinction-for-install-requires-in-setup-py/49222904, my approach would be to adapt setup.py. If neither h5netcdf nor netcdf4 is installed, we can do multiple things:

  • Push both to requires
  • Push only one to requires (which one?)

Can you let me know your preference?

@djhoese
Member

djhoese commented Oct 14, 2021

My vote is to leave the setup.py as is. Build systems in Python are moving towards static, declarative definitions of metadata, including dependencies. Having something determine what is installed at install time, and only adding dependencies based on that, seems prone to confusion.

I think what @mraspaud was hoping for is, in the cf writer itself, to try to import both h5netcdf and netCDF4 and, if neither is available, produce an error about it that makes sense (probably an ImportError).

@christianversloot
Author

That makes sense. It would look like this, though I find it somewhat ugly:

has_netcdf4 = True
has_h5netcdf = True

try:
    import netCDF4  # note: the module is named netCDF4, not netcdf4
except ImportError:
    has_netcdf4 = False

try:
    import h5netcdf
except ImportError:
    has_h5netcdf = False

if not has_netcdf4 and not has_h5netcdf:
    raise ImportError("Please install either the netCDF4 or h5netcdf package to use the 'cf' writer")

If you agree, I'll add this to the cf writer.

@djhoese
Member

djhoese commented Oct 14, 2021

Slightly cleaner, perhaps, is to put the has_X = True assignments right after the import (inside the try block).

The check for the import should probably go in the __init__ method or the save_dataset method of the writer. Actually... maybe having the ImportError on import matches what we have for other writers 🤔 Yeah, maybe keep it where you have it. Otherwise, looks good to me.

@pnuu
Member

pnuu commented Oct 14, 2021

How about simplifying slightly (no additional variables):

try:
    import netCDF4
except ImportError:
    netCDF4 = None
...
if netCDF4 is None and h5netcdf is None:
    ...

@christianversloot
Author

PR opened.


4 participants