From 7925f3e2b53d910e07623801a3493041f829eb0d Mon Sep 17 00:00:00 2001 From: mesteva Date: Mon, 9 Sep 2024 14:05:16 -0500 Subject: [PATCH] Update policies.md kept updating policies. --- user-guide/docs/curating/policies.md | 11 +++++++---- 1 file changed, 7 insertions(+), 4 deletions(-) diff --git a/user-guide/docs/curating/policies.md b/user-guide/docs/curating/policies.md index 09e0027c..63776541 100644 --- a/user-guide/docs/curating/policies.md +++ b/user-guide/docs/curating/policies.md @@ -8,7 +8,7 @@ We accept engineering and social and behavioural sciences datasets derived from #### Data Size -Given the nature of research in the natural hazards community, which involves large-scale experiments, simulations, and field research projects, we currently do not impose restrictions on the size of the datasets that can be published. Largest published datasets in DDR are ~4 TB. This approach recognizes the necessity of comprehensive data collection and the importance of making this data available for future research and analysis. We do recommend researchers to be selective and publish data that is relevant to the dataset completeness and to research reproducibility and it is adequately described for reuse. The [Curation and Publication Best Practices](https://www.designsafe-ci.org/rw/user-guides/curating-publishing-projects/best-practices/data-curation/)include recommendations to achieve a quality dataset publications. We remain open to revisiting this policy as we observe changes in data usage patterns and technological advancements. Any future modifications will be communicated clearly to the community. +Given the nature of research in natural hazards, which involves large-scale experiments, simulations, and field research projects, we currently do not impose restrictions on the size of the datasets that can be published. The largest datasets published in the DDR are ~5 TB.
This approach recognizes the necessity of comprehensive data collection and the importance of making this data available for future research and analysis. However, we recommend that researchers be selective and publish data that is relevant to research reproducibility. Importantly, the dataset should be adequately organized and described so that other researchers interested in reusing the data can find what they need. Thus, publishing large datasets implies significant curation work. The [Curation and Publication Best Practices](https://www.designsafe-ci.org/rw/user-guides/curating-publishing-projects/best-practices/data-curation/) include recommendations to achieve quality dataset publications. #### File Formats @@ -32,9 +32,9 @@ Reviews datasets pre and post-publication and suggests changes and improvements. The DDR team worked with NHERI experts to develop data models to curate the datasets generated in the natural hazards field. Based on the [Core Scientific Metadata Model], the models represent the structure and provenance of: experimental, simulation, field research/interdisciplinary, and hybrid simulation datasets. A data model type "other" was also developed for datasets that do not correspond to the research methods mentioned above and for other products such as posters, presentations, reports, check sheets, benchmarks, etc. In the DDR interface users select one of these models as project type at the beginning of their interactive curation process. Implemented as interactive curation pipelines in the DDR curation interface, the models allow users to organize their datasets in relation to research method and natural hazard type. This allows for a uniform curation experience and representation of published datasets.
-(To facilitate data curation of the diverse and large datasets generated in the fields associated with natural hazards, we worked with experts in natural hazards research to develop five data models that encompass the following types of datasets: experimental, simulation, field research, hybrid simulation, and other data products (See: 10.3390/publications7030051; 10.2218/ijdc.v13i1.661) as well as lists of specialized vocabulary. Based on the Core Scientific Metadata Model, these data models were designed considering the community's research practices and workflows, the need for documenting these processes (provenance), and using terms common to the field. The models highlight the structure and components of natural hazards research projects across time, tests, geographical locations, provenance, and instrumentation. Researchers in our community have presented on the design, implementation and use of these models broadly. +To facilitate data curation of the diverse and large datasets generated in the fields associated with natural hazards, we worked with experts in natural hazards research to develop five data models that encompass the following types of datasets: experimental, simulation, field research, hybrid simulation, and other data products (see: 10.3390/publications7030051; 10.2218/ijdc.v13i1.661), as well as lists of specialized vocabulary. Based on the Core Scientific Metadata Model, these data models were designed considering the community's research practices and workflows, the need for documenting these processes (provenance), and the use of terms common to the field. The models highlight the structure and components of natural hazards research projects across time, tests, geographical locations, provenance, and instrumentation. Researchers in our community have presented broadly on the design, implementation, and use of these models.
-In the DDR web interface the data models are implemented as interactive functions with instructions that guide the researchers through the curation and publication tasks. As researchers move through the tion pipelines, the interactive features reinforce data and metadata completeness and thus the quality of the publication. The process will not move forward if requirements for metadata are not in place (See Metadata in Best Practices), or if key files are missing. ) +In the DDR web interface, the data models are implemented as interactive functions with instructions that guide researchers through the curation and publication tasks. As researchers move through the curation pipelines, the interactive features reinforce data and metadata completeness and thus the quality of the publication. The process will not move forward if requirements for metadata are not in place (see Metadata in Best Practices), or if key files are missing. #### Metadata @@ -230,7 +230,10 @@ More information about the reasons for amends and versioning are in Leave Data Feedback Best Practices -#### Data Impact { #impact } +#### Tombstone +A tombstone is a landing page describing a dataset that has been removed from public access in a data repository. Creation and maintenance of the tombstone is the responsibility of the repository. In the DDR, curators review published datasets regularly. If data creators deposit data that does not meet the accepted data types, or if their dataset does not comply with the DDR's Policies and Best Practices, they will be alerted after publication to improve their dataset presentation via amends or versioning. Users have 30 days from notification before tombstoning, so they have enough time to implement improvements, and the curator will work with them. In the DDR, the data will still be available to the creator in My Project.
+ +#### Data Impact We understand data impact as a strategy that includes complementary efforts at the crossroads of data discoverability, usage metrics, and scholarly communications.