Added more robust archive file corruption handling
This PR adds a check when reading archive HDF5 files: each group is read once to confirm that corruption in that group does not raise a runtime error.
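The check described above can be sketched as a standalone helper. This is a minimal illustration, not the code in gwsumm: the name `check_archive_groups`, its boolean return value, and the `opener` parameter are hypothetical and exist only so the logic can be exercised without a real corrupted file. Unlike the diff, the sketch removes the file only after it has been closed.

```python
import os
import warnings


def check_archive_groups(sourcefile, rm_source_on_fail=True, opener=None):
    """Return True if every top-level group can be opened.

    Return False if a group raised RuntimeError and the file was removed.
    `opener` is a hypothetical hook for this sketch; by default the file
    is opened with h5py.File, as in the commit.
    """
    if opener is None:
        from h5py import File as opener  # lazy import, requires h5py

    failure = None
    with opener(sourcefile, 'r') as h5file:
        for group in h5file.keys():
            try:
                # Same access as in the commit: look the group up once
                h5file.get(group, {})
            except RuntimeError as exc:
                if not rm_source_on_fail:
                    raise
                failure = (group, exc)
                break

    if failure is None:
        return True

    # Assumption of this sketch: delete only after the file is closed
    group, exc = failure
    warnings.warn(f"failed to read {sourcefile} group {group} "
                  f"[{exc}], removing...")
    os.remove(sourcefile)
    return False
```

With `rm_source_on_fail=False` the original `RuntimeError` propagates to the caller and the file is left in place, matching the behaviour of the added lines in the diff.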
Evan Goetz committed Jun 12, 2023
1 parent 7a3f4d3 commit ca96e71
Showing 1 changed file with 12 additions and 0 deletions.
12 changes: 12 additions & 0 deletions gwsumm/archive.py
@@ -200,6 +200,18 @@ def read_data_archive(sourcefile, rm_source_on_fail=True):

    with File(sourcefile, 'r') as h5file:

        # Make sure that each group is not corrupted by trying to read the data
        for group in h5file.keys():
            try:
                dataset = h5file.get(group, {})
            except RuntimeError as exc:
                if not rm_source_on_fail:
                    raise
                warnings.warn(f"failed to read {sourcefile} group {group} "
                              f"[{exc}], removing...")
                os.remove(sourcefile)
                return

        # -- channels ---------------------------

        try:
