UNF Recalculation Endpoint #3589
We should investigate the why, of course, but to fix these it should be fairly straightforward to add a simple admin API to generate a UNF (this should not create a new version, since the UNF should have been generated before).
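To make the "fix without creating a new version" point concrete, here is a minimal Python sketch of that admin operation. All names here (`DatasetVersion`, `calculate_unf`, `fix_missing_unf`) are illustrative stand-ins, not the actual Dataverse code or UNF algorithm:

```python
# Hypothetical sketch of the "recalculate missing UNF" admin operation.
# The class/function names are assumptions for illustration only.

class DatasetVersion:
    def __init__(self, file_unfs, unf=None):
        self.file_unfs = file_unfs  # per-file UNFs of tabular files in this version
        self.unf = unf              # version-level UNF (None if it was never generated)

def calculate_unf(file_unfs):
    # Stand-in for the real UNF algorithm: deterministically combine file UNFs.
    return "UNF:6:" + "+".join(sorted(file_unfs))

def fix_missing_unf(version):
    """Recompute the UNF in place on the existing version.

    Crucially, this does NOT create a new (draft) version: the UNF
    should have been generated when this version was created.
    """
    if version.unf is None and version.file_unfs:
        version.unf = calculate_unf(version.file_unfs)
    return version
```

The key design point, as stated above, is that the fix mutates the existing version record rather than going through the normal edit path that would spawn a new version.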
We identified a few potential tasks in sprint planning today.
Also, in yesterday's meeting it was clarified that the cause of the missing UNFs was #2327 (fixed in Dataverse 4.5 via pull request #3226), so these datasets were not migrated; they were created post 4.0. It sounds like @scolapasta has a SQL query in mind to identify which datasets have been affected in an installation of Dataverse, and we should provide this query as part of this issue. I'd actually be in favor of adding an API endpoint that iterates through all datasets and checks for anomalies such as missing UNFs, but if it's more expedient for now to simply provide the query, that's a good start.
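For illustration, a query along these lines could find affected datasets. This is only a sketch: the table and column names below are assumptions about the Dataverse schema, not a verified query, and @scolapasta's actual query may differ:

```sql
-- Sketch only: table/column names are assumed, not verified against the schema.
-- Goal: published dataset versions that contain tabular files but have no UNF.
SELECT dv.id
FROM datasetversion dv
WHERE dv.unf IS NULL
  AND EXISTS (
        SELECT 1
        FROM filemetadata fm
        JOIN datafile df ON df.id = fm.datafile_id
        WHERE fm.datasetversion_id = dv.id
          AND df.ingeststatus = 'A'  -- assumed marker for successfully ingested tabular files
      );
```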
I created pull request #3605 and associated it with this issue, which I moved to Code Review at https://waffle.io/IQSS/dataverse
@kcondon here are some of the messages you may see from the new "Dataset Integrity" API I just documented in 08d50a4, assuming you mess with the data in the database a bit. 😄
OK, basic functionality works as described: the integrity check detects both cases, reporting when there is a UNF where there shouldn't be one and when there should be a UNF but isn't. The fixunf endpoint only fixes the "should be a UNF but isn't" case, as designed; for the case where there is one but shouldn't be, it just says so. I am now testing against a copy of the production db, but so far the integrity test dies after 40 minutes.
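The two anomaly cases distinguished above can be summarized in a small decision sketch. This is illustrative Python, not the actual Dataverse implementation; the function and status names are assumptions:

```python
# Hypothetical classification of a dataset version's UNF state, mirroring
# the two cases the integrity check reports (names are illustrative).

def classify_unf_state(has_tabular_files, unf):
    """Return which anomaly, if any, a dataset version exhibits."""
    if has_tabular_files and unf is None:
        return "MISSING_UNF"      # should have a UNF but doesn't -> fixunf can repair
    if not has_tabular_files and unf is not None:
        return "UNEXPECTED_UNF"   # has a UNF but shouldn't -> reported only, not fixed
    return "OK"
```

Note the asymmetry described above: only `MISSING_UNF` is safely repairable by recomputation; an unexpected UNF is merely reported, since deleting data automatically would be riskier.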
Throws a 500 error after running for nearly 40 minutes, with no output until the error:
@kcondon as of cc4158c for the check I'm returning |
Tested the new output method, still fails after around 40 minutes:

```
[2017-02-07T16:29:02.073-0500] [glassfish 4.1] [WARNING] [] [edu.harvard.iq.dataverse.dataaccess.ImageThumbConverter] [tid: _ThreadID=71 _ThreadName=http-listener-1(40)] [timeMillis: 1486502942073] [levelValue: 900] [[
[2017-02-07T16:37:20.597-0500] [glassfish 4.1] [WARNING] [] [javax.enterprise.web.core] [tid: _ThreadID=48 _ThreadName=http-listener-1(17)] [timeMillis: 1486503440597] [levelValue: 900] [[
```
Also, we discussed whether harvested datasets should be checked at all, since we don't have the ability to update them, and it's not clear that UNF info is harvested at either the dataset or the file level.
@kcondon sorry for all the bugs. See the note from @djbrooke above that says, "Once this is implemented and available in production, we'll need to identify the files/datasets that should have this recalculated". This task still needs to be done. I assume someone will write a SQL script or something.
This dataset in production has no UNF listed: https://dataverse.harvard.edu/dataset.xhtml?persistentId=doi:10.7910/DVN/26935
But it has 3 files with UNFs, including this one: https://dataverse.harvard.edu/file.xhtml?fileId=2491887&version=7.2
It's not missing for all datasets, as this recently created dataset has a UNF listed:
https://dataverse.harvard.edu/dataset.xhtml?persistentId=doi:10.7910/DVN/S2VGJ1