-
Notifications
You must be signed in to change notification settings - Fork 500
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
catch failures with tiff file thumbnail generation #10509
catch failures with tiff file thumbnail generation #10509
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The code changes look reasonable - finding places where the failure wasn't getting flagged and making sure the 0 length file is deleted if there's a failure. Also some nice cleanup.
The one add that would be nice is to get an IT test with a tiff file known to fail - is that possible?
Also - I see that jenkins is failing in a new way - with a file not found during the build. Doesn't seem related to this PR.
This comment has been minimized.
This comment has been minimized.
1 similar comment
This comment has been minimized.
This comment has been minimized.
uploadLogoResponse = UtilIT.uploadDatasetLogo(datasetPersistentId, goodTiff, apiToken); | ||
uploadLogoResponse.prettyPrint(); | ||
uploadLogoResponse.then().assertThat().statusCode(FORBIDDEN.getStatusCode()); | ||
uploadLogoResponse.then().assertThat().body("message", equalTo("File is larger than maximum size: 500000.")); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
FWIW: Including the size makes the test dependent on the default max size.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
fixed
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the test
This comment has been minimized.
This comment has been minimized.
@stevenwinship - Heads up that your change causes a failure in a SearchIT test which is checking for the exact string (looks like you dropped a period in going to the bundled string). |
Does this PR also fix the following issue? Should we put it through QA as well? |
The files will now show up with a default image but the tiff files will not generate an actual thumbnail image. I agree that we should switch to another image conversion library, but I don't think it's part of this bug. I'll create another issue to address this. |
Created new feature issue for enhancing the code |
This comment has been minimized.
This comment has been minimized.
I second this. Expanding the image format coverage is out of scope and should be a separate issue. |
Fixed |
The new code is working as advertised for new uploads (the 0-size .thumb64 files are still there; but Is there an easy-enough way to fix this as an existing condition as well? I.e., I have an existing dataset with problematic TIFFs uploaded under a "before" version of Dataverse; with thumbnails failing to generate, but |
Hmm - I thought the code would delete the 0 size file. In any case, https://guides.dataverse.org/en/latest/api/native-api.html#reset-thumbnail-failure-flags are API calls to remove the fail flags. Not sure if that's enough with the changes in this PR to get the thumbs completely reprocessed or not. |
I may be missing something, but the api above resets the failure flag; it's the other way around with a preexisting case - the failure flag is false, the previewImageAvailable flag is true. Plus zero-size thumbnail files on disk. In my experiments the workaround required BOTH setting |
This comment has been minimized.
This comment has been minimized.
66f56df
to
4c78209
Compare
This comment has been minimized.
This comment has been minimized.
Appears to be working as advertised - a pre-existing "bad" thumbnail still looking broken on the first page load, fixed going forward. |
@stevenwinship OK, I have one request: Once again, this is not a problem for as long as the images are bad, thumbnails generation-wise. But it does become a problem if they become processable. If/when, for example, we upgrade the image libraries to handle the extra flavors of TIFF. Then the presence of the 0-size cache prevents the re-generation, even after the I tested this by replacing one of the existing bad tiffs on disk with a tiff known to be processable, adjusting the filesize in the db accordingly and resetting There should be an easy way to both not leave that 0-size cache behind, AND delete any existing 0-size caches when detected by the new code from the last commits - is there? Please use your judgement re: whether this is worth any extra dev. effort. If this is remotely non-trivial for whatever reason, and you feel we should instead say that it's the responsibility of the admin to manually delete these 0-size caches when processing fixes become available, I can accept that. |
@stevenwinship |
@stevenwinship
... then update the file size in the db (I think this is necessary for the whole thing to work) and reset the flag (could've used the API, but did so in the db):
... then reload the dataset page |
// ImageThumbConverter fixed the DataFile | ||
// Now we need to update dataset since this is a bad logo | ||
DatasetServiceBean datasetService = CDI.current().select(DatasetServiceBean.class).get(); | ||
datasetService.removeDatasetThumbnail(dataset); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@stevenwinship Just to follow up on the slack conversation:
datasetService.removeDatasetThumbnail(dataset);
does NOT, and is not supposed to, remove the cached thumbnails associated with a specific DataFile. We may want to come up with a better, and potentially less confusing name for the method. Something like datasetService.clearDatasetThumbnail(dataset);
. Because that's what it does - the purpose of the method is to make sure the Dataset no longer has a designated thumbnail. If that designated thumbnail was a "custom logo", it will attempt to delete the physical file (because, once we stop using it as the dataset thumbnail, there's no other use for it); but if it was one of the DataFile-level thumbnail, it will only clear the dataset-level thumbnail designation, but it normally would not have any reason to attempt to delete the caches, because normally they would still be useful elsewhere on the page.
I do believe that it is entirely safe to just leave the code above as is; I don't think you need to do anything special to delete these 0-size caches here, in the dataset-level thumbnail method; instead, those should be deleted in the datafile-level method, in ImageThumbConverter.getImageThumbnailAsInputStream(thumbnailFile.getStorageIO(),ImageThumbConverter.DEFAULT_DATASETLOGO_SIZE);
where you are checking for that size == 0
.
I hope this makes sense!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please note that I corrected an important typo in the above. When I was talking about changing the method name, I meant to say I wanted to name it clearDatasetThumbnail()
, instead of remove...
I just typed the same name twice earlier ...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
(I'll go ahead and do that)
…ro-size files are not left behind when rescaling fails; and to erase such when they are encountered for legacy files; renamed a method to better reflect what it does; and dropped TIF from the list of formats supported for custom logos. Nothing too serious. it's still a mess, don't get me wrong. but should be ready to be merged, for now. #10352
This comment has been minimized.
This comment has been minimized.
@qqmyers I offered @stevenwinship to add a couple of last minute improvements yesterday. I'm fairly confident they are not going to break anything, but for the sake of due process, if you have a moment before standup, please take a look at the commit 6d1f70a and see if anything seems off. I'm planning to merge it otherwise. The only functional changes are for removing broken/zero size caches. The rest are comments and such. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Still looks good.
@qqmyers @stevenwinship I'll take any further discussion of things like this to the refactoring issue #10515. |
📦 Pushed preview images as
🚢 See on GHCR. Use by referencing with full name as printed above, mind the registry name. |
Handles failures with TIFF file thumbnail failures by preventing broken images as thumbnails or previews
Which issue(s) this PR closes: #10352
Special notes for your reviewer:
Suggestions on how to test this: Use some of the tiff files in question from [
](IQSS/dataverse.harvard.edu#250)
When testing with existing bad tiff files you will see the broken images first since the ui gets the dataset(s) first and doesn't know the images failed. After loading the dataset the db is updated so each time the dataset is loaded going forward the images should look correct (ie default thumbnails with no preview).
Does this PR introduce a user interface change? If mockups are available, please link/include them here:
data:image/s3,"s3://crabby-images/5a7bb/5a7bb5a7eca0da76aba91883b931047fb52313bd" alt="Screenshot 2024-04-19 at 10 08 43 AM"
data:image/s3,"s3://crabby-images/a2d4e/a2d4e0d8d3e1f2cae9bc776741cab9b42b7f5b4a" alt="Screenshot 2024-04-19 at 10 09 36 AM"
Before:
After:
data:image/s3,"s3://crabby-images/ad41a/ad41aadbd3a9a8e06cabc512bfa1e9bbcb4a588c" alt="Screenshot 2024-04-19 at 10 16 55 AM"
Is there a release notes update needed for this change?: No