Mint DOIs for Datasets with Handles in dataverse.harvard.edu #4
Comments
This is how it works: we take a dataset with a handle ID. The citation always shows the DOI, regardless of whether you used the DOI or the handle to get to the page. The handle appears only in the metadata tab. Is this how we wanted it to work and look? (We didn't want the handle to appear more prominently somehow, at the top of the page, did we?) |
Yes, this is what we had decided. Looks good, seems ready for CR. |
The update job is still running. The script sleeps for a few seconds between registration calls, so that we don't flood DC with requests. |
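The throttling described above can be sketched roughly as follows. This is a minimal illustration, not the actual update script: `reregister_all`, the `register` callable, and the pause length are hypothetical names and values chosen for the example.

```python
import time
from typing import Callable, Iterable, List


def reregister_all(
    dataset_ids: Iterable[int],
    register: Callable[[int], bool],
    pause_seconds: float = 3.0,
    sleep: Callable[[float], None] = time.sleep,
) -> List[int]:
    """Call register(dataset_id) for each id, pausing between calls.

    `register` is a hypothetical callable that returns True on success.
    Returns the ids that failed, so they can be investigated later.
    """
    ids = list(dataset_ids)
    failed = []
    for position, dataset_id in enumerate(ids):
        try:
            succeeded = register(dataset_id)
        except Exception:
            # Treat any error as a failed registration and keep going.
            succeeded = False
        if not succeeded:
            failed.append(dataset_id)
        # Pause between calls (not after the last one) to avoid flooding
        # the registration service with requests.
        if position < len(ids) - 1:
            sleep(pause_seconds)
    return failed
```

Collecting the failed ids rather than stopping at the first error keeps the batch moving and leaves a list to follow up on afterwards.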
Quite a few of the datasets are failing to re-register; trying to understand why. |
Maybe this is due to some missing metadata fields that are mandatory for DOI registration? Like this dataset: |
Thanks @landreev for getting most of these registered and for the list of those that were not successful. #5559 has been created to handle all of those that could not be registered for whatever reason. Between this and #5559, we'll unblock the Make Data Count work. |
When 4.12 (contains the fix for #5559) is on prod, we can finish this up. |
Started a new batch job for the still un-converted handle-ed datasets earlier today. |
Of the 4225 datasets that still had handles, only 5 are still failing to obtain a DOI:
|
A new version of the dataset 1902.1/01957 is now published with the metadata in the Producer fields removed. @landreev, could you try again to register a DOI for this dataset? Some of the datasets are missing either a Contact Name or Contact Email. DataCite doesn't require either of these, but
Not sure if this missing Contact metadata is the culprit. 1902.1/01957 has no Contact Name. If it's able to get a DOI, then the missing Contact Name isn't the problem. |
@jggautier as you know, I'm of the opinion that we should simply make Contact Email a required field. EZID didn't require it but DataCite does. It would be a fix for IQSS/dataverse#3839 (thanks for linking to that issue above). |
Do you mean you think we should make Contact Name a required field? (Dataverse already requires a Contact Email.) How can I tell that DataCite requires Contact Name (or Contact Email)? None of the DataCite schema documentation lists those as required fields. |
@jggautier bah! Sorry, I meant Contact Name. The easiest way to exercise the bug is to simply delete the Contact Name (which is auto-populated) and try to publish the dataset. I just tried this on the demo site and I was a little surprised to see that it published just fine. Since you have a superuser account, maybe you could try this in production and "destroy" the dataset afterwards. Basically, I'm wondering if IQSS/dataverse#3839 is still a bug or not. It's hard to tell from the demo site. |
@pdurbin I published a dataset on Harvard Dataverse without Contact Name metadata. I had to delete it, but I'm sure other real datasets have been published without a Contact Name, too. To be honest, I only brought this bug up in this issue on the chance that it might somehow be related (feels like I'm grasping at straws). |
I reran the API for 1902.1/01957 and it now has a DOI. The others will take more investigation. |
I figured it out! The issue is that the name column for datavariables is null (well, ''). This query: returns 7 files, all of which belong to the 4 datasets above. |
So, todos:
Let's discuss this week either after standup or at backlog grooming. |
|
OK, the affected variables have been given names of the type "varNN" where NN is the variable order in the datatable. |
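The renaming rule described above can be illustrated with a short sketch. It assumes the names of one datatable's variables are available as a list in variable order, and that the order index starts at 0; `fill_blank_names` is a hypothetical helper, not code from the actual fix.

```python
from typing import List, Optional


def fill_blank_names(names: List[Optional[str]]) -> List[str]:
    """Give blank variables a placeholder name of the form "varNN".

    NN is the variable's position in the datatable (assumed 0-based here).
    Variables that already have a non-blank name are left unchanged.
    """
    fixed = []
    for order, name in enumerate(names):
        if name is None or name.strip() == "":
            fixed.append(f"var{order}")
        else:
            fixed.append(name)
    return fixed
```

For example, `fill_blank_names(["age", "", None, "sex"])` leaves `age` and `sex` alone and names the two blank variables after their positions.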
Finally, the last 4 handle datasets have been assigned DOIs.
@scolapasta I'm moving this straight to QA, since there's nothing to review. |
Checked dvobject table for non-dois, all clear. Also eyeballed alternativepersistentidentifier table. |
We have this endpoint:
http://guides.dataverse.org/en/latest/admin/dataverses-datasets.html#mint-new-pid-for-a-dataset
We should use it to mint DOIs for datasets in dataverse.harvard.edu with Handles in support of Make Data Count in #4821.
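As a rough sketch of what a call to that endpoint might look like from a script: the path `/api/admin/{id}/reregisterHDLToPID` and the `X-Dataverse-key` header are assumptions based on the linked admin guide and should be checked against it before use. `build_mint_request` is a hypothetical helper that only builds the request and does not send it.

```python
import urllib.request


def build_mint_request(server: str, dataset_id: int, api_token: str) -> urllib.request.Request:
    """Build (but do not send) a POST request asking for a new PID.

    Assumption: the endpoint path and header name follow the linked
    admin guide; verify both there before running against a real server.
    """
    url = f"{server.rstrip('/')}/api/admin/{dataset_id}/reregisterHDLToPID"
    return urllib.request.Request(
        url,
        method="POST",
        headers={"X-Dataverse-key": api_token},
    )
```

The request could then be sent with `urllib.request.urlopen(request)`; building it separately makes it easy to inspect or log each call in a batch job.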