changes made for compatibility with ACM/LaTeX
HM Rando committed May 1, 2021
1 parent 778ea22 commit 16a295d
Showing 1 changed file with 13 additions and 13 deletions.
26 changes: 13 additions & 13 deletions content/60.methods.md
@@ -44,7 +44,7 @@ Additional curation by CoronaCentral [@doi:10.1101/2020.12.21.423860] has produc
Thus, any effort to synthesize, summarize, and contextualize COVID-19 research will face a vast corpus of potentially relevant material.

**Change over time in the number of publications in the CORD-19 dataset.**
-As of {{cord19_date_pretty}}, there are {{cord19_total_pubs}} in the CORD-19 dataset.
+As of {{cord19_date_pretty}}, there were {{cord19_total_pubs}} articles in the CORD-19 dataset.
The first release, on March 16, 2020, contained 28,000 manuscripts on topics relevant to SARS-CoV-2 and related coronaviruses [@arxiv:2004.10706].
Since then, these articles have continued to proliferate (left), with both traditionally published and preprint manuscripts in the corpus (right).
At present, it contains {{cord19_total_preprints}} preprints from _arXiv_, _bioRxiv_, and _medRxiv_.
@@ -53,14 +53,14 @@ While not all of the manuscripts are focused explicitly on SARS-CoV-2 or COVID-1

With information being produced rapidly through both traditional publishing venues and preprint servers, some papers that are published face scrutiny after their initial release.
Concerns have been raised that the number of COVID-19 papers being retracted may be higher, or potentially much higher, than is typical, although a thorough investigation of this question will not be possible until more time has elapsed [@doi:10.1080/08989621.2020.1782203; @doi:10.1080/08989621.2020.1793675].
-Other papers are updated with corrections or expressions of concern [@doi:10.1080/08989621.2020.1793675; @url:https://retractionwatch.com/retracted-coronavirus-covid-19-papers].
+Other papers are updated with corrections or expressions of concern [@doi:10.1080/08989621.2020.1793675;url:https://retractionwatch.com/retracted-coronavirus-covid-19-papers].
These include both preprints and papers published in more traditional venues [@url:https://retractionwatch.com/retracted-coronavirus-covid-19-papers; @url:https://asapbio.org/preprints-and-covid-19].
Preprints provide a venue for scientists to release findings rapidly but have both the advantage and disadvantage of making research available before it has undergone the peer-review process.
However, some traditional publishing venues have also fast-tracked COVID-19 through peer review, leading to questions about whether this research is being held to the usual standards for publication [@doi:10.1111/bioe.12772].
Therefore, monitoring the COVID-19 literature requires not only digesting the high volume of information released but also critically evaluating it and/or monitoring for subsequent adjustments.

Because of the fast-moving nature of the topic, many efforts to summarize and synthesize the COVID-19 literature have been undertaken.
-These efforts include newsletters [@url:https://depts.washington.edu/pandemicalliance/covid-19-literature-report/latest-reports/; @doi:10.1080/10872981.2020.1770562], web portals (such as [@url:https://outbreaksci.prereview.org; @doi:10.1126/science.abc7839] or the now-defunct http://covidpreprints.com/, which was described in [@url:https://asapbio.org/preprints-and-covid-19]), comments on preprint servers [@doi:10.1038/s41577-020-0319-0; @url:https://disqus.com/by/sinaiimmunologyreviewproject/], and even a journal [@url:https://rapidreviewscovid19.mitpress.mit.edu/].
+These efforts include newsletters [@url:https://depts.washington.edu/pandemicalliance/covid-19-literature-report/latest-reports/; @doi:10.1080/10872981.2020.1770562], web portals (such as [@url:https://outbreaksci.prereview.org; @doi:10.1126/science.abc7839] or the now-defunct http://covidpreprints.com/, which was described in [@url:https://asapbio.org/preprints-and-covid-19]), comments on preprint servers [@doi:10.1038/s41577-020-0319-0] (see https://disqus.com/by/sinaiimmunologyreviewproject), and even a journal [@url:https://rapidreviewscovid19.mitpress.mit.edu/].
However, the explosive rate of publication presents challenges for such efforts, many of which are no longer publishing summaries.
Similarly, many literature reviews have been written on the available COVID-19 literature [@doi:10.1016/j.molmed.2020.02.008; @doi:10.1016/j.immuni.2020.05.002; @doi:10.1126/scitranslmed.abc1931; @doi:10.1016/j.immuni.2020.05.002; @doi:10.1001/jama.2020.6019; @doi:10.1038/d41591-020-00026-w; @doi:10.1001/jama.2020.12839].
However, static reviews quickly become outdated as new research is released or existing research is retracted or superseded; one example is a review of topics in COVID-19 research including vaccine development [@doi:10.1001/jama.2020.12839].
@@ -101,7 +101,7 @@ These contributions were not strictly defined and could range from minor correct
Each pull request was reviewed and approved by at least one other contributor before being merged into the main branch.
We sought to tag potential reviewers based on the introductions they had contributed in order to encourage participation.
Emphasizing the use of issues and pull requests was designed to encourage authors with and without git experience to discuss papers and provide feedback (both formal and informal) on proposed text additions or changes.
-We also used gitter [@url:https://gitter.im] to promote informal questions and sharing of information among collaborators.
+We also used gitter (@url:https://gitter.im) to promote informal questions and sharing of information among collaborators.

#### Utilization and Expansion of Manubot

@@ -130,9 +130,9 @@ The workflow file is available from <https://github.com/greenelab/covid19-review
The Python package versions are available in <https://github.com/greenelab/covid19-review/blob/external-resources/environment.yml>.<!-- To Do: These files are archived with [Zenodo?](...). -->

Another issue that emerged was the need for a standardized way to cite clinical trials.
-Clinical trials that are registered with clinicaltrials.gov [@url:https://clinicaltrials.gov] receive a unique clinical trial identifier, or "NCT ID."
+Clinical trials that are registered with https://clinicaltrials.gov receive a unique clinical trial identifier, or "NCT ID."
Because clinical trials are registered long before results are available in manuscript form, it was important to this project to be able to refer to the clinical trial identifiers associated with a large number of relevant trials (Figure @fig:ebm-trials).
-Manubot uses the Zotero translation server [@url:https://www.zotero.org; @url:https://github.com/zotero/translation-server] to extract metadata for some types of citations.
+Manubot uses the Zotero translation server (https://www.zotero.org and https://github.com/zotero/translation-server) to extract metadata for some types of citations.
However, Zotero did not support clinical trial identifiers and could not extract relevant metadata from the clinical trial's URL.
In order to enable Manubot to pull metadata associated with clinical trials based on their identifiers, we added Zotero support for these identifiers.
Other researchers identified the same need [@url:https://forums.zotero.org/discussion/74933/import-from-clinical-trials-registry; @url:https://forums.zotero.org/discussion/77721/add-reference-from-clinical-trials-org].
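
Recognizing a trial identifier in text, the first step before any metadata lookup, can be sketched in a few lines of Python. This is an illustration only, not the Zotero translator or Manubot code: the function name `find_nct_ids` and the sample identifiers are hypothetical, and the only assumption is that an NCT ID consists of the letters "NCT" followed by eight digits.

```python
import re

# Assumption: an NCT ID is "NCT" followed by exactly eight digits.
NCT_PATTERN = re.compile(r"\bNCT\d{8}\b", re.IGNORECASE)


def find_nct_ids(text):
    """Return unique NCT IDs found in text, upper-cased, in order of first appearance."""
    found = []
    for match in NCT_PATTERN.finditer(text):
        nct_id = match.group(0).upper()
        if nct_id not in found:
            found.append(nct_id)
    return found


# The identifiers below are made up for illustration.
sample = "Trial NCT01234567 and nct07654321 were registered; NCT01234567 reported first."
print(find_nct_ids(sample))
```

Normalizing to upper case and removing duplicates means that each trial maps to a single citation key, however it was typed in the manuscript.
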
@@ -150,11 +150,11 @@ This addition was invaluable given the nature of the project, where we were diss
The badges also allow readers to ascertain a rough evaluation of the reliability of cited sources at a glance.

Because most collaborators were writing and editing text through the GitHub website rather than in a local text editor, we also needed to add spell-checking functionalities to Manubot.
-We integrated an existing Pandoc [@url:https://pandoc.org/] spell-check extension with AppVeyor CI to automatically post spelling errors as comments in a GitHub pull request.
+We integrated an existing Pandoc (https://pandoc.org/) spell-check extension with AppVeyor CI to automatically post spelling errors as comments in a GitHub pull request.
The comment reported both unique misspelled words and all locations in which those spelling errors were detected.
Project maintainers created and updated a custom dictionary to ignore over 1,500 scientific and technical terms that are not common English words.
Spell-checking also helped standardize the writing style across dozens of authors by detecting British English spelling.
-The actual spell-checking was implemented using GNU Aspell [@url:http://aspell.net/] and the Pandoc spellcheck filter [@url:https://github.com/pandoc/lua-filters/tree/master/spellcheck].
+The actual spell-checking was implemented using GNU Aspell (http://aspell.net/) and the Pandoc spellcheck filter [@url:https://github.com/pandoc/lua-filters/tree/master/spellcheck].
The filter enables checking only the manuscript text, ignoring URLs and formatting text.
<!-- To Do: Acknowledge David Nicholson for the suggestion and edit to report locations of spelling errors -->
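
The reporting step described above (unique misspelled words, every location where each occurs, and a custom dictionary of accepted terms) might look roughly like the following Python sketch. It is a simplified stand-in, not GNU Aspell or the Pandoc Lua filter: the function `report_misspellings`, the tiny `known_words` set, and the sample lines are all hypothetical.

```python
import re
from collections import defaultdict


def report_misspellings(lines, known_words, custom_dictionary):
    """Map each unrecognized word to the 1-based line numbers where it appears."""
    # Words are accepted if they are in the base word list or the custom dictionary.
    accepted = {w.lower() for w in known_words} | {w.lower() for w in custom_dictionary}
    locations = defaultdict(list)
    for line_number, line in enumerate(lines, start=1):
        # A word starts with a letter and may contain digits or hyphens (e.g. "ACE2").
        for word in re.findall(r"[A-Za-z][A-Za-z0-9-]*", line):
            if word.lower() not in accepted:
                locations[word].append(line_number)
    return dict(locations)


# Hypothetical inputs: a real checker would use a full dictionary instead.
known_words = {"the", "virus", "binds", "receptor", "is", "expressed",
               "widely", "a", "second", "was", "proposed"}
custom_dictionary = {"ACE2"}
lines = [
    "The virus binds the ACE2 receptor.",
    "The recptor is expressed widely.",
    "A second recptor was proposed.",
]
print(report_misspellings(lines, known_words, custom_dictionary))
```

Grouping locations under each unique word keeps the pull-request comment short even when the same error recurs many times.
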

@@ -199,8 +199,8 @@ The remaining four manuscripts are in preparation.
These manuscripts cover a wide range of topics including the fundamental biology of SARS-CoV-2 (pathogenesis [@individual-pathogenesis] and evolution [@individual-evolution]), biomedical advances in responding to the virus and COVID-19 (pharmaceutical therapeutics [@individual-pharmaceuticals], nutraceutical therapeutics [@individual-nutraceuticals], vaccines [@individual-vaccines], and diagnostic technologies [@individual-diagnostics]), and biological and social factors influencing disease transmission and outcomes [@individual-inequality].
To date, 51 authors are associated with the consortium (Figure @fig:projectstats).<!--To Do: break down by undergrad, grad/med student, post-doc, jr faculty?-->
Efforts to integrate with existing projects providing support for undergraduate students during COVID-19 were also successful.
-Appendix A contains summaries written by the students, post-docs, and faculty of the Immunology Institute at the Mount Sinai School of Medicine [@url:https://github.com/ismms-himc/covid-19_sinai_reviews; @doi:10.1038/s41577-020-0319-0].
-Additionally, two of the consortium authors were undergraduate students recruited through the American Physician Scientist Association's Virtual Summer Research Program [@url:https://www.physicianscientists.org/page/summer-research-pilot-program].
+We collaborated with the Immunology Institute at the Mount Sinai School of Medicine to incorporate summaries written by their students, post-docs, and faculty [@url:https://github.com/ismms-himc/covid-19_sinai_reviews; @doi:10.1038/s41577-020-0319-0].
+Additionally, two of the consortium authors were undergraduate students recruited through the American Physician Scientist Association's Virtual Summer Research Program.
Thus, the consortium is expected to be successful in providing a venue for researchers across all career stages to continue investigating and publishing at a time when many biomedical researchers were unable to access their laboratory facilities.

#### Using Manubot to Investigate COVID-19
@@ -209,7 +209,7 @@ Data were integrated into the manuscripts from several sources.
Data about worldwide cases and deaths from the COVID-19 Data Repository by the Center for Systems Science and Engineering at Johns Hopkins University [@https://github.com/CSSEGISandData/COVID-19/tree/master/csse_covid_19_data/csse_covid_19_time_series] were read using a Python script.
Similarly, the clinical trials statistics and figure were generated based on data from the University of Oxford Evidence-Based Medicine Data Lab's COVID-19 TrialsTracker [@doi:10.5281/zenodo.3732709].
The evolution of this figure over time is shown in Figure @fig:ebm-trials.
-Information about vaccine distribution was extracted from Our World In Data [@url:https://github.com/owid/covid-19-data].
+Information about vaccine distribution was extracted from Our World In Data (https://github.com/owid/covid-19-data).
Figure 1 (above) also uses this approach to dynamically integrate data directly from the CORD-19 dataset [@arxiv:2004.10706].<!--To Do: Flow chart of data integration? We could have a summary figure showing all of the external data sources that are integrated into the manuscript. We have icons for MSSM reviews, JHU data, etc. at the top. That flows into a GitHub repo, which also takes input from all the contributors. Then the output of the repo is the manuscript and other CI artifacts.-->
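
As a rough illustration of this kind of data integration, the Python sketch below derives a summary statistic from a cumulative time-series table, which could then fill a placeholder of the `{{...}}` kind used earlier in the file. It is not the project's actual script. The table layout (four metadata columns followed by one cumulative-count column per date) is an assumption modeled on the JHU time-series files, and the rows, the function `latest_global_total`, and the keys in `stats` are made up.

```python
import csv
import io

# Made-up rows in an assumed layout: four metadata columns, then one column per date.
SAMPLE_CSV = """Province/State,Country/Region,Lat,Long,1/22/20,1/23/20,1/24/20
,Exampleland,0.0,0.0,1,3,7
North,Samplestan,1.0,1.0,0,2,5
South,Samplestan,2.0,2.0,4,4,6
"""


def latest_global_total(csv_text):
    """Return (latest date label, sum of the latest cumulative counts over all rows)."""
    reader = csv.reader(io.StringIO(csv_text))
    header = next(reader)
    latest_date = header[-1]  # the last column holds the most recent date
    total = sum(int(row[-1]) for row in reader if row)
    return latest_date, total


date, total = latest_global_total(SAMPLE_CSV)
# Hypothetical variable names that a manuscript template could reference.
stats = {"latest_date": date, "total_cases": f"{total:,}"}
print(stats)
```

Running a script like this under CI each day, and writing `stats` to a file that the build reads, is one way the reported numbers could stay current without manual edits.
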

![
@@ -230,7 +230,7 @@ Using CI, Manubot now checks that the manuscript was built correctly, runs spell

### DISCUSSION

-The current project was managed through GitHub [@url:https://github.com/greenelab/covid19-review] using Manubot [@doi:10.1371/journal.pcbi.1007128] to continuously generate a version of the manuscript online [@url:https://greenelab.github.io/covid19-review].
+The current project was managed through GitHub (https://github.com/greenelab/covid19-review) using Manubot [@doi:10.1371/journal.pcbi.1007128] to continuously generate a version of the manuscript online [@url:https://greenelab.github.io/covid19-review].
The Manubot framework facilitated a massive collaborative review on an urgent topic.
This project demonstrates that Manubot can be applied to projects where not all collaborators have expertise or even experience working with version control pipelines.
Through the development of cyberinfrastructure both for training novice users to interact with GitHub and to simplify the workflows to allow them to receive many of the benefits of What You See Is What You Get platforms such as Google Docs, we were able to adapt a powerful open publishing tool to harness the domain expertise of a large group of non-technical users and to respond to the flood of COVID-19 publications.
@@ -263,7 +263,7 @@ Common responses included terms like contribute (44 uses), biology (42 uses), da
With the worldwide scientific community uniting during 2020 and 2021 to investigate SARS-CoV-2 and COVID-19 from a wide range of perspectives, findings from many disciplines are relevant on a rapid timescale to a broad scientific audience.
As many other efforts have described, the publishing rate of formal manuscripts and preprints about COVID-19 has been unprecedented [@doi:10.1053/j.ackd.2020.08.003], and efforts to review the body of COVID-19 literature are faced with an ever-expanding corpus to evaluate.
In the case of the seven manuscripts produced by the COVID-19 Review Consortium, Manubot will allow for continuous updating of the manuscripts as the pandemic enters its second year and the landscape shifts with the emergence of promising therapeutics and vaccines [@individual-pharmaceuticals; @individual-vaccines].
-These manuscripts pull data from XX data sources, allowing for information and visualizations to be updated daily using CI. <!--To Do: Need to update the "XX" sources.-->
+These manuscripts pull data from four data sources, allowing for information and visualizations to be updated daily using CI.
This computational approach allows for some of the updating process to be off-loaded so that domain experts can focus on the broader implications of new information as it emerges.
As a result, centralizing, summarizing, and critiquing data and literature broadly relevant to COVID-19 can help to expedite the interdisciplinary scientific process that is currently happening at an advanced pace.
The efforts of the COVID-19 Review Consortium illustrate the value of including open source tools, including those focused on open publishing, in these efforts.