-
Notifications
You must be signed in to change notification settings - Fork 303
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Make FSFile hashable again #1606
Conversation
FSFile objects had been inadvertently made unhashable by pytroll#1582 . Make FSFile objects hashable again by implementing a __hash__ method. - [x] Fixes pytroll#1604 - [x] Fixes pytroll#1605
Question: we should probably make |
Remove blank line after docstring to meet PEP 257 requirements as requested by CI.
In test_init_with_fsfile, when capturing the ValueError for "no supported files found", more precisely capture the error message Satpy reports in this situation. Also remove a spurious # in the middle of a comment line.
My hope is that with some extra work in fsspec that we won't need FSFile at all and can pass fsspec File objects (or URLs) directly to |
Codecov Report
@@ Coverage Diff @@
## master #1606 +/- ##
==========================================
+ Coverage 92.35% 92.43% +0.08%
==========================================
Files 254 254
Lines 37409 37500 +91
==========================================
+ Hits 34548 34663 +115
+ Misses 2861 2837 -24
Flags with carried forward coverage won't be shown. Click here to find out more.
Continue to review full report at Codecov.
|
Although this works with a
Gives:
|
It's hashable in fsspec master, but not in the latest release 0.8.7. |
Sounds like a bug? Edit: A bug that was fixed? |
A bug that was fixed, but not very well fixed, as the hash is equal for file system instances that aren't equal… |
DeepCode failed to analyze this pull requestSomething went wrong despite trying multiple times, sorry about that. |
could you work around the problem by using the json representation of the filesystem and hash that? |
Unfortunately not, because those are equal too!
Probably the same underlying bug causing the same hash is also causing the same JSON... |
Add a test to ensure that FSFile objects that should hash differently actually do hash differently.
The remaining failing test is caused by a problem in fsspec, see fsspec/filesystem_spec#576 and fsspec/filesystem_spec#577. |
Adapt FSFile hashing for fsspec#578, which disables JSON for CachingFileSystem because this representation is currently incorrect and thus not implemented.
Some types of some versions of fsspec implementations may return TypeError when attempted to be hashed. Capture this and use fallback option.
Add a workaround using pickle in FSFile.__hash__. This workaround is probably slow and I'm happy to hear better suggestions, but in currently released versions of fsspec as well as in fsspec master hashing of CachingFileSystem is broken.
Add more detail in the comment on the except-block for resorting to pickling for hashing.
NB, fsspec/filesystem_spec#578 is now merged, so with fsspec master this PR now works with CachedFileSystem without the terrible pickle workaround (but JSON still does not work for CachedFileSystem). |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for implementing a future-proof hashing. Just one comment.
In the fsfile init test, replace the string "dummy" by a long random string that is very unlikely to exist as a file.
FSFile objects had been inadvertently made unhashable by
#1582 . Make FSFile objects
hashable again by implementing a hash method.