+ [https://guides.library.ubc.ca](https://guides.library.ubc.ca){target="_blank"}
**Q: Are there any conditions regarding the usage of data published in DesignSafe?**
**A**: Yes, users that download and reuse data agree to the Data Usage conditions published here: These conditions outline the responsibilities of and expectations for data usage including aspects of data licensing, citation, privacy and confidentiality, and data quality.
diff --git a/user-guide/docs/curating/policies.md b/user-guide/docs/curating/policies.md
index 990e73bd..66c20e05 100644
--- a/user-guide/docs/curating/policies.md
+++ b/user-guide/docs/curating/policies.md
@@ -61,11 +61,11 @@ Users that deposit data that does not correspond to the accepted types will be a
#### Data Size { #size }
-Researchers in the natural hazards community generate very large datasets during large-scale experiments, simulations, and field research projects. At the moment the DDR does not pose limitations on the amount of data to be published, but we do recommend to be selective and publish data that is relevant to a project completeness and is adequately described for reuse. Our Data Curation Best Practices include recommendations to achieve a quality data publication. We are observing trends in relation to sizes and subsequent data reuse of these products, which will inform if and how we will implement data size publication limit policies.
+Researchers in the natural hazards community generate very large datasets during large-scale experiments, simulations, and field research projects. At the moment the DDR does not pose limitations on the amount of data to be published, but we do recommend to be selective and publish data that is relevant to a project completeness and is adequately described for reuse. Our Data Curation Best Practices include recommendations to achieve a quality data publication. We are observing trends in relation to sizes and subsequent data reuse of these products, which will inform if and how we will implement data size publication limit policies.
#### Data Formats { #formats }
-We do not pose file format restrictions. The natural hazards research community utilizes diverse research methods to generate and record data in both open and proprietary formats, and there is continual update of equipment used in the field. We do encourage our users to convert to open formats when possible. The DDR follows the Library of Congress Recommended Format Statement and has guidance in place to convert proprietary formats to open formats for long term preservation; see our Accepted and Recommended Data Formats for more information. However, conversion can present challenges; Matlab, for example, allows saving complex data structures, yet not all of the files stored can be converted to a csv or a text file without losing some clarity and simplicity for handling and reusing the data. In addition, some proprietary formats such as jpeg, and excel have been considered standards for research and teaching for the last two decades. In attention to these reasons, we allow users to publish the data in both proprietary and open formats. Through our Fedora repository we keep file format identification information of all the datasets stored in DDR.
+We do not pose file format restrictions. The natural hazards research community utilizes diverse research methods to generate and record data in both open and proprietary formats, and there is continual update of equipment used in the field. We do encourage our users to convert to open formats when possible. The DDR follows the Library of Congress Recommended Format Statement and has guidance in place to convert proprietary formats to open formats for long term preservation; see our [Accepted and Recommended Data Formats](/user-guide/curating/bestpractices/#acceptedfileformats) for more information. However, conversion can present challenges; Matlab, for example, allows saving complex data structures, yet not all of the files stored can be converted to a csv or a text file without losing some clarity and simplicity for handling and reusing the data. In addition, some proprietary formats such as jpeg, and excel have been considered standards for research and teaching for the last two decades. In attention to these reasons, we allow users to publish the data in both proprietary and open formats. Through our Fedora repository we keep file format identification information of all the datasets stored in DDR.
### Data Curation
Data curation involves the organization, description, representation, permanent publication, and preservation of datasets in compliance with community best practices and FAIR data principles. In the DDR, data curation is a joint responsibility between the researchers that generate data and the DDR team. Researchers understand better the logic and functions of the datasets they create, and our team's role is to help them make these datasets FAIR-compliant.
@@ -74,13 +74,13 @@ Our goal is to enable researchers to curate their data from the beginning of a r
#### Data Management Plan { #management }
-For natural hazards researchers submitting proposals to the NSF using any of the NHERI network facilities/resources, or alternative facilities/resources, we developed Data Management guidelines that explain how to use the DDR for data curation and publication. See Data Management Plan at: https://www.designsafe-ci.org/rw/user-guides/ and https://converge.colorado.edu/data/data-management
+For natural hazards researchers submitting proposals to the NSF using any of the NHERI network facilities/resources, or alternative facilities/resources, we developed Data Management guidelines that explain how to use the DDR for data curation and publication. See Data Management Plan at: Data Management Plan and https://converge.colorado.edu/data/data-management
#### Data Models { #models }
-To facilitate data curation of the diverse and large datasets generated in the fields associated with natural hazards, we worked with experts in natural hazards research to develop five data models that encompass the following types of datasets: experimental, simulation, field research, hybrid simulation, and other data products (See: 10.3390/publications7030051; 10.2218/ijdc.v13i1.661) as well as lists of specialized vocabulary. Based on the Core Scientific Metadata Model, these data models were designed considering the community's research practices and workflows, the need for documenting these processes (provenance), and using terms common to the field. The models highlight the structure and components of natural hazards research projects across time, tests, geographical locations, provenance, and instrumentation. Researchers in our community have presented on the design, implementation and use of these models broadly.
+To facilitate data curation of the diverse and large datasets generated in the fields associated with natural hazards, we worked with experts in natural hazards research to develop [five data models](/user-guide/curating/guides/) that encompass the following types of datasets: experimental, simulation, field research, hybrid simulation, and other data products (See: 10.3390/publications7030051; 10.2218/ijdc.v13i1.661) as well as lists of specialized vocabulary. Based on the Core Scientific Metadata Model, these data models were designed considering the community's research practices and workflows, the need for documenting these processes (provenance), and using terms common to the field. The models highlight the structure and components of natural hazards research projects across time, tests, geographical locations, provenance, and instrumentation. Researchers in our community have presented on the design, implementation and use of these models broadly.
-In the DDR web interface the data models are implemented as interactive functions with instructions that guide the researchers through the curation and publication tasks. As researchers move through the tion pipelines, the interactive features reinforce data and metadata completeness and thus the quality of the publication. The process will not move forward if requirements for metadata are not in place (See Metadata in Best Practices), or if key files are missing.
+In the DDR web interface the data models are implemented as interactive functions with instructions that guide the researchers through the curation and publication tasks. As researchers move through the tion pipelines, the interactive features reinforce data and metadata completeness and thus the quality of the publication. The process will not move forward if requirements for metadata are not in place (See [Metadata in Best Practices](/user-guide/curating/bestpractices/#metadata)), or if key files are missing.
#### Metadata { #metadata }
@@ -88,7 +88,7 @@ Up to date, there is no standard metadata schema to describe natural hazards dat
Embedded in the DDR data models are categories and terms as metadata elements that experts in the NHERI network contributed and deemed important for data explainability and reuse. Categories reflect the structure and components of the research dataset, and the terms describe these components. The structure and components of the published datasets are represented on the dataset landing pages and through the Data Diagram presented for each dataset.
-Due to variations in their research methods, researchers may not need all the categories and terms available to describe and represent their datasets. However, we have identified a core set of metadata that allows proper data representation, explainability, and citation. These sets of core metadata are shown for each data model in our Metadata Requirements in Best Practices.
+Due to variations in their research methods, researchers may not need all the categories and terms available to describe and represent their datasets. However, we have identified a core set of metadata that allows proper data representation, explainability, and citation. These sets of core metadata are shown for each data model in our Metadata Requirements in Best Practices.
To further describe datasets, the curation interface offers the possibility to add both predefined and custom file tags. Predefined file tags are specialized terms provided by the natural hazard community; their use is optional, but highly recommended. The lists of tags are evolving for each data model, continuing to be expanded, updated, and corrected as we gather feedback and observe how researchers use them in their publications.
@@ -104,7 +104,7 @@ Metadata quality: Metadata is fundamental to data explainability and reuse. To s
Data content quality: Different groups in the NHERI network have developed benchmarks and guidelines for data quality assurance, including StEER, CONVERGE and RAPID. In turn, each NHERI Experimental Facility has methods and criteria in place for ensuring and assessing data quality during and after experiments are conducted. Most of the data curated and published along NHERI guidelines in the DDR are related to peer-reviewed research projects and papers, speaking to the relevance and standards of their design and outputs. Still, the community acknowledges that for very large datasets the opportunity for detailed quality assessment emerges after publication, as data are analyzed and turned into knowledge. Because work in many projects continues after publication, both for the data producers and reusers, the community has the opportunity to version datasets.
-Data completeness and representation: We understand data completeness as the presence of all relevant files that enable reproducibility, understandability, and therefore reuse. This may include readme files, data dictionaries and data reports, as well as data files. The DDR complies with data completeness by recommending and requesting users to include required data to fullfill the data model required categories indispensable for a publication understandibility and reuse. During the publication process the system verifies that those categories have data assigned to them.The Data Diagram on the landing page reflects which relevant data categories are present in each publication. A similar process happens for metadata during the publication pipeline; metadata is automatically vetted against the research community’s Metadata Requirements before moving on to receive a DOI for persistent identification.
+Data completeness and representation: We understand data completeness as the presence of all relevant files that enable reproducibility, understandability, and therefore reuse. This may include readme files, data dictionaries and data reports, as well as data files. The DDR complies with data completeness by recommending and requesting users to include required data to fullfill the data model required categories indispensable for a publication understandibility and reuse. During the publication process the system verifies that those categories have data assigned to them.The Data Diagram on the landing page reflects which relevant data categories are present in each publication. A similar process happens for metadata during the publication pipeline; metadata is automatically vetted against the research community's Metadata Requirements before moving on to receive a DOI for persistent identification.
We also support citation best practices for datasets reused in our publications. When users reuse data from other sources in their data projects, they have the opportunity to include them in the metadata through the Related Works and Referenced Data fields.
@@ -116,9 +116,9 @@ We believe that researchers are best prepared to tell the story of their project
Interactive pipelines: The DDR interface is designed to facilitate large scale data curation and publication through interactive step-by-step capabilities aided by onboarding instructions. This includes the possibility to categorize and tag multiple files in relation to the data models, and to establish relations between categories via diagrams that are intuitive to data producers and easy to understand for data consumers. Onboarding instructions including vocabulary definitions, suggestions for best practices, availability of controlled terms, and automated quality control checks are in place.
-One-on-one support: We hold virtual office hours twice a week during which a curator is available for real-time consulting with individuals and teams. Other virtual consulting times can be scheduled on demand. Users can also submit Help tickets, which are answered within 24 hours, as well as send emails to the curators. Users also communicate with curatorial staff via the DesignSafe Slack channel. The curatorial staff includes a natural hazards engineer, a data librarian, and a USEX specialist. Furthermore, developers are on call to assist when needed.
+One-on-one support: We hold virtual office hours twice a week during which a curator is available for real-time consulting with individuals and teams. Other virtual consulting times can be scheduled on demand. Users can also submit Help tickets, which are answered within 24 hours, as well as send emails to the curators. Users also communicate with curatorial staff via the DesignSafe Slack channel. The curatorial staff includes a natural hazards engineer, a data librarian, and a USEX specialist. Furthermore, developers are on call to assist when needed.
-Guidance on Best Practices: Curatorial staff prepares guides and video tutorials, including special training materials for Undergraduate Research Experience students and for Graduate Students working at Experimental Facilities.
+Guidance on Best Practices: Curatorial staff prepares guides and video tutorials, including special training materials for Undergraduate Research Experience students and for Graduate Students working at Experimental Facilities.
Webinars by Researchers: Various researchers in our community contribute to our curation and publication efforts by conducting webinars in which they relay their data curation and publication experiences. Some examples are webinars on curation and publication of hybrid simulations, field research and social sciences datasets.
@@ -132,11 +132,11 @@ Publishing protected data in the DDR involves complying with the requirements, n
Unless approved by an IRB, most forms of protected data cannot be published in DesignSafe. No direct identifiers and only up to three indirect identifiers are allowed in published datasets. However, data containing PII can be published in the DDR with proper consent from the subject(s) and documentation of that consent in the project's IRB paperwork. In all publications involving human subjects, researchers should include and publish their IRB documentation showing the agreement.
-If as a consequence of data de-identification the data looses meaning, it is possible to publish a description of the data, the corresponding IRB documents, the data instruents if applicable, and obtain a DOI and a citation for the dataset. In this case, the dataset will show as with Restricted Access. In addition, authors should include information of how to reach them in order to gain access or discuss more information about the dataset. The responsibility to maintain the protected dataset in compliance with the IRB comitements and for the long term will lie on the authors, and they can use TACC's Protected Data Services if they need to. For more information on how to manage this case see our Protected Data Best Practices.
+If as a consequence of data de-identification the data looses meaning, it is possible to publish a description of the data, the corresponding IRB documents, the data instruents if applicable, and obtain a DOI and a citation for the dataset. In this case, the dataset will show as with Restricted Access. In addition, authors should include information of how to reach them in order to gain access or discuss more information about the dataset. The responsibility to maintain the protected dataset in compliance with the IRB comitements and for the long term will lie on the authors, and they can use TACC's Protected Data Services if they need to. For more information on how to manage this case see our [Protected Data Best Practices](/user-guide/curating/bestpractices/#protecteddata).
It is the user’s responsibility to adhere to these policies and the procedures and standards of their IRB or other equivalent institution, and DesignSafe will not be held liable for any violations of these terms regarding improper publication of protected data. User uploads that we are notified of that violate this policy may be removed from the DDR with or without notice, and the user may be asked to suspend their use of the DDR and other DesignSafe resources. We may also contact the user’s IRB and/or other respective institution with any cases of violation, which could incur in an active audit (See 24) of the research project, so users should review their institution’s policies regarding publishing with protected data before using DesignSafe and DDR.
-For any data not subject to IRB oversight but may still contain PII, such as Google Earth images containing images of people not studied in the scope of the research project, we recommend blocking out or blurring any information that could be considered PII before publishing the data in the DDR. We still invite any researchers that are interested in seeing the raw data to contact the PI of the research project to try and attain that. See our Protected Data Best Practices for information on how to manage protected data in DDR.
+For any data not subject to IRB oversight but may still contain PII, such as Google Earth images containing images of people not studied in the scope of the research project, we recommend blocking out or blurring any information that could be considered PII before publishing the data in the DDR. We still invite any researchers that are interested in seeing the raw data to contact the PI of the research project to try and attain that. See our [Protected Data Best Practices](/user-guide/curating/bestpractices/#protecteddata) for information on how to manage protected data in DDR.
#### Subsequent Publishing { #publishing }
@@ -144,7 +144,7 @@ Attending to the needs expressed by the community, we enable the possibility to
#### Timely Data Publication { #timely }
-Although no firm deadline requirements are specified for data publishing, as an NSF-funded platform we expect researchers to publish in a timely manner, so we provide recommended timelines for publishing different types of research data in our Timely Data Publication Best Practices.
+Although no firm deadline requirements are specified for data publishing, as an NSF-funded platform we expect researchers to publish in a timely manner, so we provide recommended timelines for publishing different types of research data in our [Timely Data Publication Best Practices](/user-guide/curating/bestpractices/#data-publication).
#### Peer Review { #peerreview }
@@ -152,11 +152,11 @@ Users that need to submit their data for revision prior to publishing and assign
#### Public Accessibility Delay { #accessiblity }
-Many researchers request a DOI for their data before it is made publicly available to include in papers submitted to journals for review. In order to assign a DOI in the DDR, the data has to be curated and ready to be published. Once the DOI is in place, we provide services to researchers with such commitments to delay the public accessibility of their data publication in the DDR, i.e. to make the user’s data publication, via their assigned DOI, not web indexable through DataCote and or not publicly available in DDR's data browser until the corresponding paper is published in a journal, or for up to one year after the data is deposited. The logic behind this policy is that once a DOI has been assigned, it will inevitably be published, so this delay can be used to provide reviewers access to a data publication before it is broadly distributed. Note that data should be fully curated, and that while not broadly it will be eventually indexed by search engines. Users that need to amend/correct their publications will be able to do so via version control. See our Data Delay Best Practices for more information on obtaining a public accessibility delay.
+Many researchers request a DOI for their data before it is made publicly available to include in papers submitted to journals for review. In order to assign a DOI in the DDR, the data has to be curated and ready to be published. Once the DOI is in place, we provide services to researchers with such commitments to delay the public accessibility of their data publication in the DDR, i.e. to make the user's data publication, via their assigned DOI, not web indexable through DataCote and or not publicly available in DDR's data browser until the corresponding paper is published in a journal, or for up to one year after the data is deposited. The logic behind this policy is that once a DOI has been assigned, it will inevitably be published, so this delay can be used to provide reviewers access to a data publication before it is broadly distributed. Note that data should be fully curated, and that while not broadly it will be eventually indexed by search engines. Users that need to amend/correct their publications will be able to do so via version control. See our [Data Delay Best Practices](/user-guide/curating/bestpractices/#accessibilitydelay) for more information on obtaining a public accessibility delay.
#### Data Licenses { #licenses }
-DDR provides users with 5 licensing options to accommodate the variety of research outputs generated and how researchers in this community want to be attributed. The following licenses were selected after discussions within our community. In general, DDR users are keen about sharing their data openly but expect attribution. In addition to data, our community issues reports, survey instruments, presentations, learning materials, and code. The licenses are: Creative Commons Attribution (CC-BY 4.0), Creative Commons Public Domain Dedication (CC-0 1.0), Open Data Commons Attribution (ODC-BY 1.0), Open Data Commons Public Domain Dedication (ODC-PPDL 1.0), and GNU General Public License (GNU-GPL 3). During the publication process users have the option of selecting one license per publication with a DOI. More specifications of these license options and the works they can be applied to can be found in Licensing Best Practices.
+DDR provides users with 5 licensing options to accommodate the variety of research outputs generated and how researchers in this community want to be attributed. The following licenses were selected after discussions within our community. In general, DDR users are keen about sharing their data openly but expect attribution. In addition to data, our community issues reports, survey instruments, presentations, learning materials, and code. The licenses are: Creative Commons Attribution (CC-BY 4.0), Creative Commons Public Domain Dedication (CC-0 1.0), Open Data Commons Attribution (ODC-BY 1.0), Open Data Commons Public Domain Dedication (ODC-PPDL 1.0), and GNU General Public License (GNU-GPL 3). During the publication process users have the option of selecting one license per publication with a DOI. More specifications of these license options and the works they can be applied to can be found in [Licensing Best Practices](/user-guide/curating/bestpractices/#data-publication).
DDR also requires that users reusing data from others in their projects do so in compliance with the terms of the data original license.
@@ -172,40 +172,26 @@ For users publishing data in DDR, we enable referencing works and or data reused
The expectations of DDR and the responsibilities of users in relation to the application and compliance with data citation are included in the DesignSafe Terms of Use, the Data Usage Agreement, and the Data Publication Agreement. As clearly stated in those documents, in the event that we note or are notified that citation policies and best practices are not followed, we will notify the user of the infringement and may cancel their DesignSafe account.
-However, given that it is not feasible to know with certainty if users comply with data citation, our approach is to educate our community by reinforcing citation in a positive way. For this we implement outreach strategies to stimulate data citation. Through diverse documentation, FAQs webinars, and via emails, we regularly train our users on data citation best practices. And, by tracking and publishing information about the impact and science contributions of the works they publish citing the data that they use, we demonstrate the value of data reuse and further stimulate publishing and citing data.
+However, given that it is not feasible to know with certainty if users comply with data citation, our approach is to educate our community by reinforcing citation in a positive way. For this we implement outreach strategies to stimulate data citation. Through diverse [documentation](/user-guide/curating/guides/), [FAQs](/user-guide/curating/faq/), Q&As, webinars, and via emails, we regularly train our users on data citation best practices. And, by tracking and publishing information about the impact and science contributions of the works they publish citing the data that they use, we demonstrate the value of data reuse and further stimulate publishing and citing data.
#### Data Publication Agreement { #agreement }
This agreement is read and has to be accepted by the user prior to publishing a dataset.
-This submission represents my original work and meets the policies and requirements established by the DesignSafe Policies and Best Practices. I grant the Data Depot Repository (DDR) all required permissions and licenses to make the work I publish in the DDR available for archiving and continued access. These permissions include allowing DesignSafe to:
+This submission represents my original work and meets the policies and requirements established by the DesignSafe [Policies](/user-guide/curating/policies/) and [Best Practices](/user-guide/curating/bestpractices/). I grant the Data Depot Repository (DDR) all required permissions and licenses to make the work I publish in the DDR available for archiving and continued access. These permissions include allowing DesignSafe to:
-
- -
- Disseminate the content in a variety of distribution formats according to the DDR Policies and Best Practices.
-
- -
- Promote and advertise the content publicly in DesignSafe.
-
- -
- Store, translate, copy, or re-format files in any way to ensure its future preservation and accessibility,
-
- -
- Improve usability and/or protect respondent confidentiality.
-
- -
- Exchange and or incorporate metadata or documentation in the content into public access catalogues.
-
- -
- Transfer data, metadata with respective DOI to other institution for long-term accessibility if needed for continuos access.
-
-
+1. Disseminate the content in a variety of distribution formats according to the DDR [Policies](/user-guide/curating/policies/) and [Best Practices](/user-guide/curating/bestpractices/).
+2. Promote and advertise the content publicly in DesignSafe.
+3. Store, translate, copy, or re-format files in any way to ensure its future preservation and accessibility,
+4. Improve usability and/or protect respondent confidentiality.
+5. Exchange and or incorporate metadata or documentation in the content into public access catalogues.
+6. Transfer data, metadata with respective DOI to other institution for long-term accessibility if needed for continuos access.
I understand the type of license I choose to distribute my data, and I guarantee that I am entitled to grant the rights contained in them. I agree that when this submission is made public with a unique digital object identifier (DOI), this will result in a publication that cannot be changed. If the dataset requires revision, a new version of the data publication will be published under the same DOI.
I warrant that I am lawfully entitled and have full authority to license the content submitted, as described in this agreement. None of the above supersedes any prior contractual obligations with third parties that require any information to be kept confidential.
-If applicable, I warrant that I am following the IRB agreements in place for my research and following Protected Data Best Practices.
+If applicable, I warrant that I am following the IRB agreements in place for my research and following [Protected Data Best Practices](/user-guide/curating/bestpractices/#protecteddata).
I understand that the DDR does not approve data publications before they are posted; therefore, I am solely responsible for the submission, publication, and all possible confidentiality/privacy issues that may arise from the publication.
@@ -258,11 +244,11 @@ Documentation of versions requires including the name of the file/s changed, rem
The Fedora repository manages all amends and versions so there is a record of all changes. Version number is passed to DataCite as metadata.
-More information about the reasons for amends and versioning are in Publication Best Practices.
+More information about the reasons for amends and versioning are in [Publication Best Practices](/user-guide/curating/bestpractices/#data-publication).
#### Leave Data Feedback { #feedback }
-Users can click a “Leave Feedback” button on the projects’ landing pages to provide comments on any publication. This feedback is forwarded to the curation team for any needed actions, including contacting the authors. In addition, it is possible for users to message the authors directly as their contact information is available via the authors field in the publication landing pages. We encourage users to provide constructive feedback and suggest themes they may want to discuss about the publication in our Leave Data Feedback Best Practices
+Users can click a “Leave Feedback” button on the projects’ landing pages to provide comments on any publication. This feedback is forwarded to the curation team for any needed actions, including contacting the authors. In addition, it is possible for users to message the authors directly as their contact information is available via the authors field in the publication landing pages. We encourage users to provide constructive feedback and suggest themes they may want to discuss about the publication in our [Leave Data Feedback Best Practices](/user-guide/curating/bestpractices/#feedback).
#### Data Impact { #impact }
@@ -292,27 +278,27 @@ We started counting since May 17, 2021. We update the reports on a monthly basis
Data Vignettes
-Since 2020 we conduct Data Reuse Vignettes. For this, we identify published papers and interview researchers that have reused data published in DDR. In this context, reuse means that researchers are using data published by others for purposes different than those intended by the data creators. During the interviews we use a semi-structured questionnaire to discuss the academic relevance of the research, the ease of access to the data in DDR, and the understandability of the data publication in relation to metadata and documentation clarity and completeness. We feature the data stories on the DesignSafe website and use the feedback to make changes and to design new reuse strategies. The methodology used in this project was presented at the International Qualitative and Quantitative Methods in Libraries 2020 International Conference . See Perspectives on Data Reuse from the Field of Natural Hazards Engineering.
+Since 2020 we conduct Data Reuse Vignettes. For this, we identify published papers and interview researchers that have reused data published in DDR. In this context, reuse means that researchers are using data published by others for purposes different than those intended by the data creators. During the interviews we use a semi-structured questionnaire to discuss the academic relevance of the research, the ease of access to the data in DDR, and the understandability of the data publication in relation to metadata and documentation clarity and completeness. We feature the data stories on the DesignSafe website and use the feedback to make changes and to design new reuse strategies. The methodology used in this project was presented at the International Qualitative and Quantitative Methods in Libraries 2020 International Conference. See Perspectives on Data Reuse from the Field of Natural Hazards Engineering.
Data Awards
In 2021 we launched the first Data Publishing Award to encourage excellence in data publication and to stimulate reuse. Data publications are nominated by our user community based on contribution to scientific advancement and curation
### Data Preservation
-Data preservation encompasses diverse activities carried out by all the stakeholders involved in the lifecycle of data, from data management planning to data curation, publication and long-term archiving. Once data is submitted to the Data Depot Repository (DDR,) we have functionalities and Guidance in place to address the long-term preservation of the submitted data.
+Data preservation encompasses diverse activities carried out by all the stakeholders involved in the lifecycle of data, from data management planning to data curation, publication and long-term archiving. Once data is submitted to the Data Depot Repository (DDR) we have functionalities and [Guidance](/user-guide/curating/policies/#data-preservation) in place to address the long-term preservation of the submitted data.
The DDR has been operational since 2016 and is currently supported by the NSF from October 1st, 2020 through September 30, 2025. During this award period, the DDR will continue to preserve the natural hazards research data published since its inception, as well as supporting preservation of and access to legacy data and the accompanying metadata from the Network for Earthquake Engineering Simulation (NEES), a NHERI predecessor, dating from 2005. The legacy data comprising 33 TB, 5.1 million files,2 and their metadata was transferred to DesignSafe in 2016 as part of the conditions of the original grant. See NEES data here.
-Data in the (DDR) is preserved according to state-of-the art digital library standards and best practices. DesignSafe is implemented within the reliable, secure, and scalable storage infrastructure at the Texas Advanced Computing Center (TACC), with 20 years of experience and innovation in High Performance Computing. TACC is currently over 20 years old, and TACC and its predecessors have operated a digital data archive continuously since 1986 – currently implemented in the Corral Data Management system and the Ranch tape archive system, with capacity of approximately half an exabyte. Corral and Ranch hold the data for DesignSafe and hundreds of other research data collections. For details about the digital preservation architecture and procedures for DDR go to Data Preservation Best Practices.
+Data in the (DDR) is preserved according to state-of-the art digital library standards and best practices. DesignSafe is implemented within the reliable, secure, and scalable storage infrastructure at the Texas Advanced Computing Center (TACC), with 20 years of experience and innovation in High Performance Computing. TACC is currently over 20 years old, and TACC and its predecessors have operated a digital data archive continuously since 1986 – currently implemented in the Corral Data Management system and the Ranch tape archive system, with capacity of approximately half an exabyte. Corral and Ranch hold the data for DesignSafe and hundreds of other research data collections. For details about the digital preservation architecture and procedures for DDR go to [Data Preservation Best Practices](/user-guide/curating/bestpractices/#data-preservation).
-Within TACC’s storage infrastructure a Fedora repository, considered a standard for digital libraries, manages the preservation of the published data. Through its functionalities, Fedora assures the authenticity and integrity of the digital objects, manages versioning, identifies file formats, records preservation events as metadata, maintains RDF metadata in accordance to standard schemas, conducts audits, and maintains the relationships between data and metadata for each published research project and its corresponding datasets. Each published dataset in DesignSafe has a Digital Object Identifier, whose maintenance we understand as a firm commitment to data persistence.
+Within TACC’s storage infrastructure a Fedora repository, considered a standard for digital libraries, manages the preservation of the published data. Through its functionalities, Fedora assures the authenticity and integrity of the digital objects, manages versioning, identifies file formats, records preservation events as metadata, maintains RDF metadata in accordance to standard schemas, conducts audits, and maintains the relationships between data and metadata for each published research project and its corresponding datasets. Each published dataset in DesignSafe has a Digital Object Identifier, whose maintenance we understand as a firm commitment to data persistence.
-While at the moment DDR is committed to preserve data in the format in which it is submitted, we procure the necessary authorizations from users to conduct further preservation actions as well as to transfer the data to other organizations if applicable. These permissions are granted through our Data Publication Agreement, which authors acknowledge and have the choice to agree to at the end of the publication workflow and prior to receiving a DOI for their dataset.
+While at the moment DDR is committed to preserve data in the format in which it is submitted, we procure the necessary authorizations from users to conduct further preservation actions as well as to transfer the data to other organizations if applicable. These permissions are granted through our [Data Publication Agreement](/user-guide/curating/policies/#agreement), which authors acknowledge and have the choice to agree to at the end of the publication workflow and prior to receiving a DOI for their dataset.
Data sustainability is a continuous effort that DDR accomplishes along with the rest of the NHERI partners. In the natural hazards space, data is central to new advances, which is evidenced by the data reuse record of our community and the following initiatives: