This document specifies the SciCat PublishedData profile. A profile imposes further constraints on top of the RO-Crate specification in order to enable reliable programmatic processing of the crates.
An RO-Crate that conforms to this profile must include the `@type` `Dataset` on the root data entity (this is already a constraint of the RO-Crate specification). Moreover, each of the entities in `hasPart` must have the `@type` `scicat:PublishedData`. The rest of the document describes the properties of `scicat:PublishedData`.
All properties carry the prefix `scicat:` (as shown in the example), but the prefix is omitted from the tables below for brevity.
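
The two type constraints above can be checked programmatically. The following is a minimal sketch, not a reference validator: the function names `types_of` and `check_types` and the entity identifier `#published-1` are hypothetical. It relies only on the standard RO-Crate layout, in which the metadata descriptor `ro-crate-metadata.json` points to the root data entity through `about`.

```python
PUBLISHED_DATA_TYPE = "scicat:PublishedData"


def types_of(entity):
    """Return the entity's @type as a list (JSON-LD allows a string or a list)."""
    value = entity.get("@type", [])
    return [value] if isinstance(value, str) else list(value)


def check_types(crate):
    """Return a list of problems; an empty list means both type constraints hold."""
    entities = {e["@id"]: e for e in crate.get("@graph", []) if "@id" in e}

    # The metadata descriptor references the root data entity through "about".
    descriptor = entities.get("ro-crate-metadata.json", {})
    root_id = descriptor.get("about", {}).get("@id", "./")
    root = entities.get(root_id)
    if root is None:
        return [f"root data entity {root_id!r} not found"]

    problems = []
    if "Dataset" not in types_of(root):
        problems.append("root data entity is not typed as Dataset")

    parts = root.get("hasPart", [])
    if isinstance(parts, dict):  # a single reference may appear without a list
        parts = [parts]
    for ref in parts:
        part = entities.get(ref.get("@id"))
        if part is None:
            problems.append(f"hasPart entry {ref.get('@id')!r} is not in the graph")
        elif PUBLISHED_DATA_TYPE not in types_of(part):
            problems.append(f"{part['@id']} is not typed as {PUBLISHED_DATA_TYPE}")
    return problems


# Hypothetical, heavily trimmed crate used only to exercise the check.
example = {
    "@graph": [
        {"@id": "ro-crate-metadata.json", "@type": "CreativeWork", "about": {"@id": "./"}},
        {"@id": "./", "@type": "Dataset", "hasPart": [{"@id": "#published-1"}]},
        {"@id": "#published-1", "@type": "scicat:PublishedData", "scicat:title": "Example"},
    ]
}
print(check_types(example))  # []
```

In practice the `crate` dictionary would come from `json.load` on the crate's `ro-crate-metadata.json` file.
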
| Property | Expected value range | Definition | Equivalent Schema.org property | Equivalent Datacite property |
|---|---|---|---|---|
| `doi` | `string` | Digital Object Identifier for the dataset. | identifier | identifier |
| `creator` | `List<string>` | List of creators of the dataset. | creator > Person > name | creator#creatorName |
| `publisher` | `string` | Organization or entity publishing the dataset. | Organization > publisher | publisher |
| `publicationYear` | `number` | Year the dataset was published. | CreativeWork > datePublished | publicationYear |
| `title` | `string` | Title of the dataset. | name | title |
| `abstract` | `string` | Abstract summarizing the dataset. | abstract | description |
| `resourceType` | `Enum["raw", "derived"]` | Type of the dataset (e.g., raw/derived). | additionalType | resourceType |
| `pidArray` | `List<string>` | Array of one or more persistent identifiers which make up the published data. | identifier | relatedIdentifiers |
| `registeredTime` | `timestamp` | Timestamp when the DOI was registered. | CreativeWork > sdDatePublished | dates#date#submitted |
| `status` | `string` | Status in the publication workflow. | status | N/A |
| `createdAt` | `timestamp` | Date the dataset was created (system-generated). | CreativeWork > dateCreated | dates#date#created |
| `updatedAt` | `timestamp` | Date the dataset was last updated (system-generated). | dateModified | dates#date#updated |
| `dataDescription` | `string` | Link to a description of how to reuse the dataset. | ?? url | ?? description |

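
The "Expected value range" column of the table above lends itself to a simple programmatic check. The sketch below is illustrative only: the helper names and `check_value_ranges` are hypothetical, and it assumes that `timestamp` values are serialized as ISO 8601 strings, which this profile does not state. Only properties that are present on the entity are checked.

```python
from datetime import datetime


def is_string(v):
    return isinstance(v, str)


def is_number(v):
    # bool is a subclass of int in Python, so exclude it explicitly.
    return isinstance(v, (int, float)) and not isinstance(v, bool)


def is_string_list(v):
    return isinstance(v, list) and all(isinstance(x, str) for x in v)


def is_timestamp(v):
    # Assumption: timestamps are ISO 8601 strings (not stated by the profile).
    if not isinstance(v, str):
        return False
    try:
        datetime.fromisoformat(v.replace("Z", "+00:00"))
        return True
    except ValueError:
        return False


def is_resource_type(v):
    return v in ("raw", "derived")


# Property name (without prefix) -> predicate for its expected value range.
VALUE_RANGES = {
    "doi": is_string,
    "creator": is_string_list,
    "publisher": is_string,
    "publicationYear": is_number,
    "title": is_string,
    "abstract": is_string,
    "resourceType": is_resource_type,
    "pidArray": is_string_list,
    "registeredTime": is_timestamp,
    "status": is_string,
    "createdAt": is_timestamp,
    "updatedAt": is_timestamp,
    "dataDescription": is_string,
}


def check_value_ranges(entity, prefix="scicat:"):
    """Return the prefixed names of present properties with out-of-range values."""
    return [
        prefix + name
        for name, ok in VALUE_RANGES.items()
        if prefix + name in entity and not ok(entity[prefix + name])
    ]


# Hypothetical entity; publicationYear is deliberately given as a string.
entity = {
    "@id": "#published-1",
    "@type": "scicat:PublishedData",
    "scicat:title": "Example dataset",
    "scicat:creator": ["A. Author"],
    "scicat:publicationYear": "2024",
    "scicat:resourceType": "raw",
    "scicat:registeredTime": "2024-01-15T10:30:00Z",
}
print(check_value_ranges(entity))  # ['scicat:publicationYear']
```
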
| Property | Expected value range | Definition | Equivalent Schema.org property | Equivalent Datacite property |
|---|---|---|---|---|
| `affiliation` | `string` | Affiliations of the dataset creators. | Person > affiliation | creator#affiliation |
| `url` | `string` | Landing page URL for the dataset's DOI. | CreativeWork > url | relatedIdentifier, relatedIdentifierType=URL |
| `numberOfFiles` | `number` | Number of files included in the dataset. | CreativeWork > size | sizes |
| `sizeOfArchive` | `number` | Size of the dataset archive. | contentSize | sizes |
| `authors` | `List<string>` | List of contributors/authors of the dataset. | CreativeWork > contributor | contributors |
| `scicatUser` | `string` | Username of the person initiating publication in the system. | Person > accountName | ?? contributor, contributorType=relatedPerson |
| `thumbnail` | `string` | Thumbnail image for the dataset (base64 encoded, < 16 MB). | image | relatedItem, relatedItemType=Image |
| `relatedPublications` | `List<string>` | List of URLs pointing to related publications. | isBasedOn | relatedIdentifiers |
| `downloadLink` | `string` | URL for downloading the dataset. | contentUrl | relatedIdentifier, resourceTypeGeneral=Dataset |

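
The `thumbnail` constraint (base64 encoded, < 16 MB) can be verified before a crate is written. This is a rough sketch under several assumptions that the profile does not spell out: the limit is taken to apply to the decoded image bytes, 1 MB is taken as 1024 × 1024 bytes, and an optional data-URI header in front of the payload is tolerated. The function name `thumbnail_ok` is hypothetical.

```python
import base64
import binascii

# Assumption: 1 MB = 1024 * 1024 bytes, and the limit applies to decoded bytes.
MAX_THUMBNAIL_BYTES = 16 * 1024 * 1024


def thumbnail_ok(value):
    """Return True if value is valid base64 that decodes to fewer than 16 MB."""
    if not isinstance(value, str):
        return False
    # Assumption: a header such as "data:image/png;base64," may precede the payload.
    if value.startswith("data:") and "," in value:
        value = value.split(",", 1)[1]
    try:
        raw = base64.b64decode(value, validate=True)
    except (binascii.Error, ValueError):
        return False
    return len(raw) < MAX_THUMBNAIL_BYTES


# Placeholder bytes stand in for real image data.
encoded = base64.b64encode(b"placeholder image bytes").decode("ascii")
print(thumbnail_ok(encoded))         # True
print(thumbnail_ok("not base64!!"))  # False
```
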
An example of a conformant RO-Crate is provided in `ro-crate-metadata.json`.