Is your feature request related to a problem? Please describe.
I'm creating a MultiScene from ABI and GLM. When the time coverage of the two is not equal, or one or more files are missing from one but not the other, the result of group_files may include one or more groups in which only one of the readers has any files assigned. When I call group_files not directly but via MultiScene.from_files, this can lead to hard-to-debug errors that occur when trying to load a dataset and then access the scene object. For example, accessing ms.scenes currently results in:
[DEBUG: 2021-06-29 12:27:11 : satpy.readers.yaml_reader] Reading ('/home/gholl/checkouts/satpy/satpy/etc/readers/abi_l1b.yaml', '/media/nas/o16091/00_MITARBEITER/HOLL/Arbeit/checkouts-perforce/dev_Accso_EBP/config/readers/abi_l1b.yaml')
[DEBUG: 2021-06-29 12:27:11 : satpy.readers.yaml_reader] Reading ('/home/gholl/checkouts/satpy/satpy/etc/readers/glm_l2.yaml',)
[DEBUG: 2021-06-29 12:27:11 : satpy.multiscene] Forcing iteration of generator-like object of Scenes
[DEBUG: 2021-06-29 12:27:11 : satpy.readers.yaml_reader] Reading ('/home/gholl/checkouts/satpy/satpy/etc/readers/abi_l1b.yaml', '/media/nas/o16091/00_MITARBEITER/HOLL/Arbeit/checkouts-perforce/dev_Accso_EBP/config/readers/abi_l1b.yaml')
[DEBUG: 2021-06-29 12:27:11 : satpy.readers.yaml_reader] Assigning to abi_l1b: ['OR_ABI-L1b-RadF-M6C14_G16_s19000010000000_e19000010005000_c20403662359590.nc']
[DEBUG: 2021-06-29 12:27:12 : satpy.readers.yaml_reader] Reading ('/home/gholl/checkouts/satpy/satpy/etc/readers/glm_l2.yaml',)
[DEBUG: 2021-06-29 12:27:12 : satpy.readers.yaml_reader] Assigning to glm_l2: ['OR_GLM-L2-GLMF-M3_G16_s19000010000000_e19000010001000_c20403662359590.nc']
[DEBUG: 2021-06-29 12:27:12 : satpy.composites.config_loader] Looking for composites config file abi.yaml
[DEBUG: 2021-06-29 12:27:12 : satpy.composites.config_loader] Looking for composites config file visir.yaml
[DEBUG: 2021-06-29 12:27:12 : satpy.composites.config_loader] Looking for composites config file glm.yaml
[DEBUG: 2021-06-29 12:27:12 : satpy.readers.glm_l2] Reading in get_dataset flash_extent_density.
[DEBUG: 2021-06-29 12:27:12 : satpy.readers.abi_l1b] Reading in get_dataset C14.
[DEBUG: 2021-06-29 12:27:12 : satpy.readers.abi_l1b] Calibrating to brightness temperatures
/data/gholl/miniconda3/envs/py39/lib/python3.9/site-packages/pyproj/crs/crs.py:1216: UserWarning: You will likely lose important projection information when converting to a PROJ string from another format. See: https://proj.org/faq.html#what-is-the-best-format-for-describing-coordinate-reference-systems
return self._crs.to_proj4(version=version)
[DEBUG: 2021-06-29 12:27:12 : satpy.writers] Enhancement configuration options: [{'name': 'btemp_threshold', 'method': <function btemp_threshold at 0x7fab658ad940>, 'kwargs': {'threshold': 242.0, 'min_in': 163.0, 'max_in': 330.0}}]
[DEBUG: 2021-06-29 12:27:12 : satpy.scene] Unloading dataset: DataID(name='flash_extent_density', resolution=2000, modifiers=())
[DEBUG: 2021-06-29 12:27:12 : satpy.scene] Unloading dataset: DataID(name='C14', wavelength=WavelengthRange(min=10.8, central=11.2, max=11.6, unit='µm'), resolution=2000, calibration=<calibration.brightness_temperature>, modifiers=())
[DEBUG: 2021-06-29 12:27:12 : satpy.scene] Unloading dataset: DataID(name='highlight_C14', resolution=2000)
[DEBUG: 2021-06-29 12:27:12 : satpy.readers.yaml_reader] Reading ('/home/gholl/checkouts/satpy/satpy/etc/readers/abi_l1b.yaml', '/media/nas/o16091/00_MITARBEITER/HOLL/Arbeit/checkouts-perforce/dev_Accso_EBP/config/readers/abi_l1b.yaml')
[DEBUG: 2021-06-29 12:27:12 : satpy.readers.yaml_reader] Reading ('/home/gholl/checkouts/satpy/satpy/etc/readers/glm_l2.yaml',)
[DEBUG: 2021-06-29 12:27:12 : satpy.readers.yaml_reader] Assigning to glm_l2: ['OR_GLM-L2-GLMF-M3_G16_s19000010001000_e19000010002000_c20403662359590.nc']
[DEBUG: 2021-06-29 12:27:12 : satpy.composites.config_loader] Looking for composites config file glm.yaml
[DEBUG: 2021-06-29 12:27:12 : satpy.composites.config_loader] Looking for composites config file visir.yaml
Traceback (most recent call last):
File "/home/gholl/checkouts/satpy/satpy/scene.py", line 1162, in _update_dependency_tree
self._dependency_tree.populate_with_keys(needed_datasets, query)
File "/home/gholl/checkouts/satpy/satpy/dependency_tree.py", line 241, in populate_with_keys
raise MissingDependencies(unknown_datasets, "Unknown datasets:")
satpy.node.MissingDependencies: Unknown datasets: {DataQuery(name='highlight_C14'): {DataQuery(name='highlight_C14')}}
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/home/gholl/checkouts/protocode/mwe/group-files-desire.py", line 34, in <module>
ms.scenes
File "/home/gholl/checkouts/satpy/satpy/multiscene.py", line 221, in scenes
self._scenes = list(self._scenes)
File "/home/gholl/checkouts/satpy/satpy/multiscene.py", line 123, in __iter__
scn = next(self._self_iter)
File "/home/gholl/checkouts/satpy/satpy/multiscene.py", line 113, in _create_cached_iter
for scn in self._scene_gen:
File "/home/gholl/checkouts/satpy/satpy/multiscene.py", line 263, in _call_scene_func
new_scn = getattr(scn, func_name)(*args, **kwargs)
File "/home/gholl/checkouts/satpy/satpy/scene.py", line 1153, in load
self._update_dependency_tree(needed_datasets, query)
File "/home/gholl/checkouts/satpy/satpy/scene.py", line 1164, in _update_dependency_tree
raise KeyError(str(err))
KeyError: "Unknown datasets: {DataQuery(name='highlight_C14'): {DataQuery(name='highlight_C14')}}"
It took me a while to understand why this was happening.
Describe the solution you'd like
I would like group_files and MultiScene.from_files to have configurable behaviour for the case where one or more groups do not have data for all readers. The current behaviour, "ignore", would be the default. Other options could be: issue a warning, raise an exception, or skip the group (thus skipping some files).
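To make the request concrete, here is a minimal sketch of the proposed behaviour, written as a stand-alone post-processing step rather than as part of Satpy. The helper name `handle_incomplete_groups`, the `missing` parameter and its values ("pass", "warn", "raise", "skip") are hypothetical and only illustrate the options listed above. It assumes that the groups are a list of dicts mapping reader name to a list of filenames, and that a reader without files shows up either as an empty list or as a missing key.

```python
import warnings


def handle_incomplete_groups(groups, readers, missing="pass"):
    """Hypothetical helper, not part of Satpy.

    groups:  list of dicts mapping reader name -> list of filenames
    readers: reader names that every group is expected to contain
    missing: "pass" (keep everything, current behaviour), "warn",
             "raise", or "skip" (drop incomplete groups)
    """
    if missing not in ("pass", "warn", "raise", "skip"):
        raise ValueError(f"Unknown value for missing: {missing!r}")
    kept = []
    for group in groups:
        # Readers with no files in this group (empty list or absent key)
        absent = [r for r in readers if not group.get(r)]
        if absent and missing != "pass":
            msg = f"Group has no files for reader(s) {absent}: {group}"
            if missing == "raise":
                raise FileNotFoundError(msg)
            if missing == "warn":
                warnings.warn(msg)
            elif missing == "skip":
                continue  # drop this group and its files
        kept.append(group)
    return kept


# Example input modelled on the debug log above: the second group has GLM only.
groups = [
    {"abi_l1b": ["OR_ABI-L1b-RadF-M6C14_G16_s19000010000000_e19000010005000_c20403662359590.nc"],
     "glm_l2": ["OR_GLM-L2-GLMF-M3_G16_s19000010000000_e19000010001000_c20403662359590.nc"]},
    {"abi_l1b": [],
     "glm_l2": ["OR_GLM-L2-GLMF-M3_G16_s19000010001000_e19000010002000_c20403662359590.nc"]},
]
complete = handle_incomplete_groups(groups, ["abi_l1b", "glm_l2"], missing="skip")
```

With "skip", only the first group survives, so no scene would be created in which highlight_C14 cannot be built. A keyword argument inside Satpy could apply the same check before the scenes are created.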
Describe any changes to existing user workflow
None; the current behaviour would remain the default.
Additional context
Alternatively, I could go through the data files myself first and check their time coverage. Considering that in my actual code I'm reading ABI directly from an S3 bucket, that would be quite involved, and a solution within Satpy is preferable.
gerritholl changed the title from "Allow group_files with multiple readers to require each group has at least one file matched to each reader" to "For group_files with multiple readers, allow user to configure behaviour if some groups have zero files for some readers" on Jun 29, 2021
Makes sense. What are you thinking, a keyword argument on group_files? And maybe the default is different for MultiScene.from_files? 🤔 Maybe they should both have the same behavior by default.
If we want to be backward compatible, both should default to the current behaviour. But considering that (1) Satpy is <1.0, (2) MultiScene is explicitly noted as experimental, (3) probably nobody is using this function in production, (4) probably almost nobody is using this function with multiple readers at all considering I added this functionality in #1269, and (5) good programmers will notice backward incompatible behaviour due to breaking unit tests (ha ha); probably we can tolerate non-backward compatible behaviour in this case.
Agreed. I think we can also assume that people who are using group_files and MultiScene.from_files with multiple readers (the 1-3 people doing that right now) probably want this behavior anyway.