stub out dataset types IQSS#10517
pdurbin committed Jul 15, 2024
1 parent 5ba74e8 commit 830ea35
Showing 13 changed files with 234 additions and 3 deletions.
3 changes: 3 additions & 0 deletions doc/release-notes/10517-datasetType.md
@@ -0,0 +1,3 @@
### Initial Support for Dataset Types (Dataset, Software, Workflow)

Datasets now have types. By default, the dataset type is "dataset", but if you turn on support for additional types, a dataset can also have a type of "software" or "workflow". For more details, see doc/sphinx-guides/source/user/dataset-types.rst and #10517. Please note that this feature is highly experimental.
82 changes: 82 additions & 0 deletions doc/sphinx-guides/source/_static/api/dataset-create-software.json
@@ -0,0 +1,82 @@
{
"datasetType": "software",
"datasetVersion": {
"license": {
"name": "CC0 1.0",
"uri": "http://creativecommons.org/publicdomain/zero/1.0"
},
"metadataBlocks": {
"citation": {
"fields": [
{
"value": "Darwin's Finches",
"typeClass": "primitive",
"multiple": false,
"typeName": "title"
},
{
"value": [
{
"authorName": {
"value": "Finch, Fiona",
"typeClass": "primitive",
"multiple": false,
"typeName": "authorName"
},
"authorAffiliation": {
"value": "Birds Inc.",
"typeClass": "primitive",
"multiple": false,
"typeName": "authorAffiliation"
}
}
],
"typeClass": "compound",
"multiple": true,
"typeName": "author"
},
{
"value": [
{ "datasetContactEmail" : {
"typeClass": "primitive",
"multiple": false,
"typeName": "datasetContactEmail",
"value" : "finch@mailinator.com"
},
"datasetContactName" : {
"typeClass": "primitive",
"multiple": false,
"typeName": "datasetContactName",
"value": "Finch, Fiona"
}
}],
"typeClass": "compound",
"multiple": true,
"typeName": "datasetContact"
},
{
"value": [ {
"dsDescriptionValue":{
"value": "Darwin's finches (also known as the Galápagos finches) are a group of about fifteen species of passerine birds.",
"multiple":false,
"typeClass": "primitive",
"typeName": "dsDescriptionValue"
}}],
"typeClass": "compound",
"multiple": true,
"typeName": "dsDescription"
},
{
"value": [
"Medicine, Health and Life Sciences"
],
"typeClass": "controlledVocabulary",
"multiple": true,
"typeName": "subject"
}
],
"displayName": "Citation Metadata"
}
}
}
}
16 changes: 16 additions & 0 deletions doc/sphinx-guides/source/_static/api/dataset-create-software.jsonld
@@ -0,0 +1,16 @@
{
"http://purl.org/dc/terms/title": "Darwin's Finches",
"http://purl.org/dc/terms/subject": "Medicine, Health and Life Sciences",
"http://purl.org/dc/terms/creator": {
"https://dataverse.org/schema/citation/authorName": "Finch, Fiona",
"https://dataverse.org/schema/citation/authorAffiliation": "Birds Inc."
},
"https://dataverse.org/schema/citation/datasetContact": {
"https://dataverse.org/schema/citation/datasetContactEmail": "finch@mailinator.com",
"https://dataverse.org/schema/citation/datasetContactName": "Finch, Fiona"
},
"https://dataverse.org/schema/citation/dsDescription": {
"https://dataverse.org/schema/citation/dsDescriptionValue": "Darwin's finches (also known as the Galápagos finches) are a group of about fifteen species of passerine birds."
},
"datasetType": "software"
}
30 changes: 30 additions & 0 deletions doc/sphinx-guides/source/user/dataset-types.rst
@@ -0,0 +1,30 @@
Dataset Types
+++++++++++++

NOTE: This separate page will be folded into individual pages and removed as the pull request is finalized.

.. contents:: |toctitle|
    :local:

Intro
=====

Datasets can have a dataset type such as "dataset", "software", or "workflow".

Enabling Dataset Types
======================

To enable dataset types, turn on the ``dataverse.feature.dataset-types`` feature flag. See also :ref:`feature-flags`.

Specifying a Dataset Type When Creating a Dataset
=================================================

Native API
----------

An example JSON file is available at :download:`dataset-create-software.json <../_static/api/dataset-create-software.json>`.

Semantic API
------------

An example JSON-LD file is available at :download:`dataset-create-software.jsonld <../_static/api/dataset-create-software.jsonld>`.
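
As a rough illustration of the Native API case above, the following sketch builds (without sending) the HTTP request that would submit such a JSON body. It assumes the native create-dataset endpoint `POST /api/dataverses/{alias}/datasets` and the `X-Dataverse-key` header. The class name, server URL, collection alias, and token are placeholders and not part of this commit.

```java
import java.net.URI;
import java.net.http.HttpRequest;

// Hypothetical helper, not part of the commit.
public class CreateTypedDatasetRequest {

    // Builds (but does not send) a POST to the create-dataset endpoint.
    public static HttpRequest build(String serverUrl, String dataverseAlias,
                                    String apiToken, String jsonBody) {
        return HttpRequest.newBuilder()
                .uri(URI.create(serverUrl + "/api/dataverses/" + dataverseAlias + "/datasets"))
                .header("X-Dataverse-key", apiToken)
                .header("Content-Type", "application/json")
                .POST(HttpRequest.BodyPublishers.ofString(jsonBody))
                .build();
    }

    public static void main(String[] args) {
        // Abbreviated body; a real request needs the full datasetVersion block.
        String body = "{\"datasetType\": \"software\", \"datasetVersion\": {}}";
        HttpRequest request = build("http://localhost:8080", "root", "xxxxxxxx", body);
        System.out.println(request.method() + " " + request.uri());
    }
}
```

Sending the request with `java.net.http.HttpClient` is left out so the sketch runs without a server. With the feature flag off, the commit's `checkDatasetType` would be expected to turn such a request into a 400 response.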
1 change: 1 addition & 0 deletions doc/sphinx-guides/source/user/index.rst
@@ -16,3 +16,4 @@ User Guide
dataset-management
tabulardataingest/index
appendix
dataset-types
1 change: 1 addition & 0 deletions docker-compose-dev.yml
@@ -17,6 +17,7 @@ services:
SKIP_DEPLOY: "${SKIP_DEPLOY}"
DATAVERSE_JSF_REFRESH_PERIOD: "1"
DATAVERSE_FEATURE_API_BEARER_AUTH: "1"
DATAVERSE_FEATURE_DATASET_TYPES: "1"
DATAVERSE_MAIL_SYSTEM_EMAIL: "dataverse@localhost"
DATAVERSE_MAIL_MTA_HOST: "smtp"
DATAVERSE_AUTH_OIDC_ENABLED: "1"
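
The environment variable added here corresponds to the ``dataverse.feature.dataset-types`` property named in the docs. A minimal sketch of that name mapping, assuming the usual MicroProfile Config convention (replace every character that is not alphanumeric or `_` with `_`, then uppercase); the class and method names are hypothetical:

```java
import java.util.Locale;

// Hypothetical helper illustrating the property-to-environment-variable mapping.
public class FeatureFlagEnvName {

    // Replaces non-alphanumeric, non-underscore characters with '_' and uppercases the result.
    public static String toEnvName(String property) {
        return property.replaceAll("[^A-Za-z0-9_]", "_").toUpperCase(Locale.ROOT);
    }

    public static void main(String[] args) {
        // prints "DATAVERSE_FEATURE_DATASET_TYPES"
        System.out.println(toEnvName("dataverse.feature.dataset-types"));
    }
}
```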
14 changes: 13 additions & 1 deletion src/main/java/edu/harvard/iq/dataverse/Dataset.java
@@ -38,6 +38,7 @@
import edu.harvard.iq.dataverse.storageuse.StorageUse;
import edu.harvard.iq.dataverse.util.StringUtil;
import edu.harvard.iq.dataverse.util.SystemConfig;
import jakarta.persistence.Transient;

/**
*
@@ -128,6 +129,9 @@ public class Dataset extends DvObjectContainer {
*/
private boolean useGenericThumbnail;

@Transient
private String datasetType;

@OneToOne(cascade = {CascadeType.MERGE, CascadeType.PERSIST})
@JoinColumn(name = "guestbook_id", unique = false, nullable = true, insertable = true, updatable = true)
private Guestbook guestbook;
@@ -736,7 +740,15 @@ public boolean isUseGenericThumbnail() {
public void setUseGenericThumbnail(boolean useGenericThumbnail) {
this.useGenericThumbnail = useGenericThumbnail;
}


public String getDatasetType() {
return datasetType;
}

public void setDatasetType(String datasetType) {
this.datasetType = datasetType;
}

public List<DatasetMetrics> getDatasetMetrics() {
return datasetMetrics;
}
8 changes: 8 additions & 0 deletions src/main/java/edu/harvard/iq/dataverse/api/Dataverses.java
@@ -23,6 +23,7 @@
import edu.harvard.iq.dataverse.engine.command.impl.*;
import edu.harvard.iq.dataverse.pidproviders.PidProvider;
import edu.harvard.iq.dataverse.pidproviders.PidUtil;
import edu.harvard.iq.dataverse.settings.FeatureFlags;
import edu.harvard.iq.dataverse.settings.JvmSettings;
import edu.harvard.iq.dataverse.settings.SettingsServiceBean;
import edu.harvard.iq.dataverse.util.BundleUtil;
@@ -240,6 +241,13 @@ public Response createDataset(@Context ContainerRequestContext crc, String jsonB
//Throw BadRequestException if metadataLanguage isn't compatible with setting
DataverseUtil.checkMetadataLangauge(ds, owner, settingsService.getBaseMetadataLanguageMap(null, true));

try {
logger.info("about to call checkDatasetType...");
DataverseUtil.checkDatasetType(ds, FeatureFlags.DATASET_TYPES.enabled());
} catch (BadRequestException ex) {
return badRequest(ex.getLocalizedMessage());
}

// clean possible version metadata
DatasetVersion version = ds.getVersions().get(0);

Original file line number Diff line number Diff line change
@@ -7,6 +7,7 @@
import edu.harvard.iq.dataverse.authorization.users.AuthenticatedUser;
import edu.harvard.iq.dataverse.authorization.users.User;
import edu.harvard.iq.dataverse.engine.command.DataverseRequest;
import edu.harvard.iq.dataverse.settings.FeatureFlags;
import edu.harvard.iq.dataverse.util.BundleUtil;
import edu.harvard.iq.dataverse.util.json.JsonLDTerm;

@@ -122,4 +123,17 @@ public static void checkMetadataLangauge(Dataset ds, Dataverse owner, Map<String
}
}

public static void checkDatasetType(Dataset ds, boolean enabled) {
logger.info("called checkDatasetType...");
String datasetType = ds.getDatasetType();
logger.info("datasetType: " + datasetType);
if (datasetType != null) {
if (!enabled) {
throw new BadRequestException("The dataset types feature is not enabled but a type was sent: " + datasetType);
}
// TODO: check for valid types.
logger.info("The dataset type sent was: " + datasetType);
}
}

}
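
The `TODO: check for valid types` above could be filled in along these lines. This is only a sketch under stated assumptions: the allowed set is taken from the three types named in the docs ("dataset", "software", "workflow"), the class name is hypothetical, and `IllegalArgumentException` stands in for `BadRequestException` so the example runs with the JDK alone.

```java
import java.util.Set;

// Hypothetical stand-alone version of the check, not the commit's code.
public class DatasetTypeCheck {

    // Assumption: only the three types named in the documentation are valid.
    static final Set<String> KNOWN_TYPES = Set.of("dataset", "software", "workflow");

    public static void check(String datasetType, boolean enabled) {
        if (datasetType == null) {
            return; // no type was sent, so there is nothing to validate
        }
        if (!enabled) {
            throw new IllegalArgumentException(
                    "The dataset types feature is not enabled but a type was sent: " + datasetType);
        }
        if (!KNOWN_TYPES.contains(datasetType)) {
            throw new IllegalArgumentException("Unknown dataset type: " + datasetType);
        }
    }

    public static void main(String[] args) {
        check("software", true); // accepted
        try {
            check("junk", true);
        } catch (IllegalArgumentException ex) {
            System.out.println(ex.getMessage()); // prints "Unknown dataset type: junk"
        }
    }
}
```

This also covers the "junk" case flagged as a TODO in `DatasetTypesIT`.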
Original file line number Diff line number Diff line change
@@ -91,6 +91,11 @@ public enum FeatureFlags {
* @since Dataverse 6.3
*/
DISABLE_RETURN_TO_AUTHOR_REASON("disable-return-to-author-reason"),
/**
* With this flag enabled, datasets can be created with various dataset
* types such as "dataset", "software", or "workflow".
*/
DATASET_TYPES("dataset-types"),
;

final String flag;
Original file line number Diff line number Diff line change
@@ -328,7 +328,12 @@ public Dataset parseDataset(JsonObject obj) throws JsonParseException {
}else {
throw new JsonParseException("Specified metadatalanguage not allowed.");
}

String datasetType = obj.getString("datasetType",null);
logger.info("datasetType: " + datasetType);
if (datasetType != null) {
dataset.setDatasetType(datasetType);
}

DatasetVersion dsv = new DatasetVersion();
dsv.setDataset(dataset);
dsv = parseDatasetVersion(obj.getJsonObject("datasetVersion"), dsv);
54 changes: 54 additions & 0 deletions src/test/java/edu/harvard/iq/dataverse/api/DatasetTypesIT.java
@@ -0,0 +1,54 @@
package edu.harvard.iq.dataverse.api;

import io.restassured.RestAssured;
import io.restassured.path.json.JsonPath;
import io.restassured.response.Response;
import static jakarta.ws.rs.core.Response.Status.CREATED;
import static jakarta.ws.rs.core.Response.Status.OK;
import org.junit.jupiter.api.BeforeAll;
import org.junit.jupiter.api.Disabled;
import org.junit.jupiter.api.Test;

public class DatasetTypesIT {

@BeforeAll
public static void setUpClass() {
RestAssured.baseURI = UtilIT.getRestAssuredBaseUri();
}

@Test
public void testCreateSoftwareDatasetNative() {
Response createUser = UtilIT.createRandomUser();
createUser.then().assertThat().statusCode(OK.getStatusCode());
String username = UtilIT.getUsernameFromResponse(createUser);
String apiToken = UtilIT.getApiTokenFromResponse(createUser);

Response createDataverse = UtilIT.createRandomDataverse(apiToken);
createDataverse.then().assertThat().statusCode(CREATED.getStatusCode());
String dataverseAlias = UtilIT.getAliasFromResponse(createDataverse);
Integer dataverseId = UtilIT.getDataverseIdFromResponse(createDataverse);

// String datasetJsonPath = "doc/sphinx-guides/source/_static/api/dataset-create-software.json";
String jsonIn = UtilIT.getDatasetJson("doc/sphinx-guides/source/_static/api/dataset-create-software.json");
// System.out.println("native: " + datasetJsonPath);

Response createSoftware = UtilIT.createDataset(dataverseAlias, jsonIn, apiToken);
createSoftware.prettyPrint();
createSoftware.then().assertThat()
.statusCode(CREATED.getStatusCode());

//TODO: try sending "junk" instead of "software".

Integer datasetId = UtilIT.getDatasetIdFromResponse(createSoftware);
String datasetPid = JsonPath.from(createSoftware.getBody().asString()).getString("data.persistentId");

}

@Disabled
@Test
public void testCreateSoftwareDatasetSemantic() {
String jsonIn = "doc/sphinx-guides/source/_static/api/dataset-create-software.jsonld";
System.out.println("semantic: " + jsonIn);
}

}
2 changes: 1 addition & 1 deletion tests/integration-tests.txt
@@ -1 +1 @@
DataversesIT,DatasetsIT,SwordIT,AdminIT,BuiltinUsersIT,UsersIT,UtilIT,ConfirmEmailIT,FileMetadataIT,FilesIT,SearchIT,InReviewWorkflowIT,HarvestingServerIT,HarvestingClientsIT,MoveIT,MakeDataCountApiIT,FileTypeDetectionIT,EditDDIIT,ExternalToolsIT,AccessIT,DuplicateFilesIT,DownloadFilesIT,LinkIT,DeleteUsersIT,DeactivateUsersIT,AuxiliaryFilesIT,InvalidCharactersIT,LicensesIT,NotificationsIT,BagIT,MetadataBlocksIT,NetcdfIT,SignpostingIT,FitsIT,LogoutIT,DataRetrieverApiIT,ProvIT,S3AccessIT,OpenApiIT,InfoIT
DataversesIT,DatasetsIT,SwordIT,AdminIT,BuiltinUsersIT,UsersIT,UtilIT,ConfirmEmailIT,FileMetadataIT,FilesIT,SearchIT,InReviewWorkflowIT,HarvestingServerIT,HarvestingClientsIT,MoveIT,MakeDataCountApiIT,FileTypeDetectionIT,EditDDIIT,ExternalToolsIT,AccessIT,DuplicateFilesIT,DownloadFilesIT,LinkIT,DeleteUsersIT,DeactivateUsersIT,AuxiliaryFilesIT,InvalidCharactersIT,LicensesIT,NotificationsIT,BagIT,MetadataBlocksIT,NetcdfIT,SignpostingIT,FitsIT,LogoutIT,DataRetrieverApiIT,ProvIT,S3AccessIT,OpenApiIT,InfoIT,DatasetTypesIT
