Updates to text for ACM-BCB submission #947

Merged
merged 70 commits into from
May 11, 2021

Conversation

rando2
Contributor

@rando2 rando2 commented Apr 28, 2021

I'm currently doing a pass through the manuscript to make some updates. I know others might do this, so figured I should do it section by section. Here is a second pass through on the intro.

Description of the proposed additions or changes

Related issues

Suggested reviewers (optional)

Checklist

  • Text is formatted so that each sentence is on its own line.
  • Pre-prints cited in this pull request have a GitHub issue opened so that they can be reviewed.

@rando2 rando2 added the Methods Strategies for review label Apr 28, 2021
@AppVeyorBot

AppVeyor build 1.0.4011 for commit afcf53c is now complete.

Found 18 potential spelling error(s). Preview:
content/09.evolution.md:83:nonsynonymous
content/09.evolution.md:139:LVNA
content/23.vaccines-app.md:15:IgGs
content/23.vaccines-app.md:387:IgGs
content/60.methods.md:8:CCS
content/60.methods.md:24:CoronaCentral
content/60.methods.md:55:MOOPs
content/60.methods.md:75:ECRs
content/60.methods.md:91:Manubot's
content/60.methods.md:121:scite
content/60.methods.md:121:Scite
content/60.methods.md:157:scite
content/60.methods.md...
The rendered manuscript from this build is temporarily available for download at:

@AppVeyorBot

AppVeyor build 1.0.4015 for commit a9a1a06 is now complete.

Found 18 potential spelling error(s). Preview:
content/09.evolution.md:83:nonsynonymous
content/09.evolution.md:139:LVNA
content/23.vaccines-app.md:15:IgGs
content/23.vaccines-app.md:387:IgGs
content/60.methods.md:8:CCS
content/60.methods.md:24:CoronaCentral
content/60.methods.md:55:MOOPs
content/60.methods.md:75:ECRs
content/60.methods.md:91:Manubot's
content/60.methods.md:121:scite
content/60.methods.md:121:Scite
content/60.methods.md:157:scite
content/60.methods.md...
The rendered manuscript from this build is temporarily available for download at:

@rando2 rando2 marked this pull request as draft April 29, 2021 12:31
@AppVeyorBot

AppVeyor build 1.0.4020 for commit 9d7ab7a failed.

@rando2
Copy link
Contributor Author

rando2 commented Apr 29, 2021

I think this integration with #946 isn't working due to the json not yet being available post-merge. I'm going to hold off debugging in case it fixes itself.

@rando2 rando2 changed the title from "Updates to Introduction for methods section" to "Updates to text for ACM-BCB submission" Apr 29, 2021
@AppVeyorBot

AppVeyor build 1.0.4021 for commit 3b8859d failed.

@AppVeyorBot

AppVeyor build 1.0.4022 for commit d7f1072 failed.

@AppVeyorBot

AppVeyor build 1.0.4023 for commit 309e1aa failed.

@AppVeyorBot

AppVeyor build 1.0.4026 for commit 5453432 failed.

Collaborator

@agitter agitter left a comment

I left some initial comments. However, given the timeline I think you should address what you can/want to and then merge when you're ready. We should try to have a full draft merged for others to edit or comment on.

While some of this information has been disseminated by traditional publishing mechanisms, in other cases, it is made public through preprint servers or even press releases.
With information being produced rapidly through both traditional publishing venues and preprint servers, some papers that are published face scrutiny after their initial release.
Concerns have been raised that the number of COVID-19 papers being retracted may be higher, or potentially much higher, than is typical, although a thorough investigation of this question will not be possible until more time has elapsed [@doi:10.1080/08989621.2020.1782203; @doi:10.1080/08989621.2020.1793675].
Other papers are updated with corrections or expressions of concern [@doi:10.1080/08989621.2020.1793675; @url:https://retractionwatch.com/retracted-coronavirus-covid-19-papers].
Collaborator

URL citations to a website will probably not convert well when we translate the .json reference to .bib. We could add manual references to try to override the reference type in the bib file. Or we could add the URLs in the text instead of citing them.
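
To make the manual-override option concrete, here is a minimal sketch, assuming the reference list is exported as CSL JSON and that webpage entries should become BibTeX `@misc` records. The function name `csl_webpage_to_bibtex`, the citation id, and the sample title are hypothetical, and this is not the conversion path that Manubot or Pandoc actually uses.

```python
def csl_webpage_to_bibtex(item):
    """Render one CSL JSON item of type 'webpage' as a BibTeX @misc entry."""
    if item.get("type") != "webpage":
        raise ValueError("expected a CSL item of type 'webpage'")
    fields = {
        "title": item.get("title", ""),
        "howpublished": "\\url{" + item.get("URL", "") + "}",
    }
    # CSL JSON stores dates as {"date-parts": [[year, month, day]]}.
    date_parts = item.get("accessed", {}).get("date-parts", [])
    if date_parts and date_parts[0]:
        fields["note"] = "Accessed " + "-".join(str(p) for p in date_parts[0])
    body = ",\n".join(f"  {key} = {{{value}}}" for key, value in fields.items())
    return f"@misc{{{item['id']},\n{body}\n}}"


# Hypothetical record: the id, title, and access date are made up for illustration.
example = {
    "id": "retractionwatch-covid",
    "type": "webpage",
    "title": "Example web page title",
    "URL": "https://retractionwatch.com/retracted-coronavirus-covid-19-papers",
    "accessed": {"date-parts": [[2021, 4, 29]]},
}
entry = csl_webpage_to_bibtex(example)
print(entry)
```

An entry produced this way could be pasted into the .bib file by hand, which keeps the URL visible in the reference list regardless of how the automatic conversion treats the original item type.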

Contributor Author

Oh yikes, I had a few URLs in this one! I'm happy to just manually fix things if that's the easiest way to do it.

Collaborator

I'm not sure what the best approach is. Check the latest PDF build in #943 to see how these will render in the reference list. Considering our time crunch, this may be low priority. Let's take care of the other content first.

An additional four manuscripts are in preparation.
All seven will be submitted to _mSystems_ as part of a special issue that is providing support for evolving reviews, so that they will continue to be updated as more information becomes available.
These manuscripts cover a wide range of topics including the fundamental biology of SARS-CoV-2 (pathogenesis [@arxiv:2102.01521] and evolution [@individual-evolution]), biomedical advances in responding to the virus and COVID-19 (pharmaceutical therapeutics [@arxiv:2103.02723], nutraceutical therapeutics [@arxiv:2102.02250], vaccines [@individual-vaccines], and diagnostic technologies [@individual-diagnostics]), and biological and social factors influencing variable effects on patients [@individual-inequality].<!--To Do: add these additional individual citations-->
To date, [number] authors are associated with the consortium.<!--To Do: break down by undergrad, grad/med student, post-doc, jr faculty?-->
Collaborator

This could be a place to cross-reference growth in the number of authors if we plot that.

#### Integration of COVID-19 Resources with Manubot

Data were integrated into the manuscripts from a number of sources.
The workflow file is available from <https://github.com/greenelab/covid19-review/blob/master/.github/workflows/update-external-resources.yaml> and the scripts are available from <https://github.com/greenelab/covid19-review/tree/external-resources>.
Collaborator

I would stick our code availability in METHODS.

Contributor Author

Excellent, I'll move it up in my next commit!

The workflow file is available from <https://github.com/greenelab/covid19-review/blob/master/.github/workflows/update-external-resources.yaml> and the scripts are available from <https://github.com/greenelab/covid19-review/tree/external-resources>.
The Python package versions are available in <https://github.com/greenelab/covid19-review/blob/external-resources/environment.yml>.
<!-- To Do: These files are archived with [Zenodo?](...). -->
<!--To Do: Discussion of the absurd number of references?-->
Collaborator

Hopefully we can plot this! 🤞

@AppVeyorBot

AppVeyor build 1.0.4031 for commit 1c57ddd failed.

The plugin uses the [Scite](https://scite.ai/) service to display a badge below any citation with a DOI.
The badge contains a set of icons and numbers that indicate how many times that source has been mentioned, supported, or disputed, and whether there have been any important editorial notices, such as retractions or corrections.
Using this, we were able to quickly identify references that needed to be checked again since the time they had been added.
This was invaluable given the nature of the project, where we were disseminating rapidly evolving information of great consequence from over a thousand different sources.
The badges also allow readers to roughly evaluate the reliability of cited sources at a glance.

Because in this implementation of Manubot, most collaborators were writing and editing text through the GitHub website rather than in a local text editor, we also needed to add spell-checking functionalities to Manubot.
<!--To Do: Describe-->
<!--To Do: Describe -- perhaps @agitter can lead this?-->
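
As an illustration of what the spell-checking output in this thread contains, the sketch below tallies flagged words per file. It assumes each report line has the form `path:line:word`, as in the AppVeyorBot previews above; the function name `count_flagged_words` is hypothetical and is not part of Manubot.

```python
from collections import Counter


def count_flagged_words(report_lines):
    """Count flagged words per file from 'path:line:word' report lines."""
    per_file = Counter()
    for line in report_lines:
        parts = line.strip().split(":", 2)
        # Skip truncated or malformed entries such as 'content/60.methods.md...'.
        if len(parts) != 3 or not parts[1].isdigit():
            continue
        path = parts[0]
        per_file[path] += 1
    return per_file


report = [
    "content/09.evolution.md:83:nonsynonymous",
    "content/09.evolution.md:139:LVNA",
    "content/23.vaccines-app.md:15:IgGs",
    "content/60.methods.md:91:Manubot's",
    "content/60.methods.md...",
]
counts = count_flagged_words(report)
print(counts.most_common())
```

A per-file tally like this makes it easier to see which sections would benefit most from additions to a custom dictionary.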
Contributor Author

@agitter this is an area where I am super not knowledgeable, in case you're looking for a place to fill in text!

Collaborator

I can write about this.


Manubot provides the advantage of allowing a manuscript to be rendered in several formats that serve different purposes, and the current project extended these options.
Docx is a necessary format for a biological collaboration where authors are typically not working in LaTeX.
<!--To Do: continue expanding to describe docx updates and possibly LaTeX updates-->
<!-- To Do: Also, if we end up automating anything to do with the Manubot-to-LaTeX or bibtex workflow while trying to submit this paper, we should describe that too, as well as the updates for the Microsoft Word crowd to generating the complete review manuscript along with the individual manuscripts reviewing specific topics. <!-- To Do: reference individual manuscripts -->
Contributor Author

@mprobson if you want to add some info about LaTeX, this would be the place! I'll work on expanding this paragraph as well about the docx stuff (as @agitter mentioned above)

Collaborator

I'm writing something for this

Contributor Author

@rando2 rando2 left a comment

Thank you @agitter!

@rando2 rando2 marked this pull request as ready for review April 29, 2021 23:44
Using CI, Manubot now checks that the manuscript was built correctly, runs spellchecking, and cross-references the manuscripts cited in this review, as summarized in Appendix A and discussed in the project's issues and pull requests.
The new features required for the COVID-19 project are now included in Manubot's rootstock.
Manubot and Zotero now support citing clinical trial identifiers such as `clinicaltrials:NCT04292899` [@clinicaltrials:NCT04292899], and the scite integration and spell-checking functionalities have been integrated into the current release of Manubot <!--To Do: Link-->.
<!--To Do: have the individual builds been pushed up to rootstock?-->
Contributor Author

A question for you @agitter!

Collaborator

No, the individual manuscript modifications to build.sh are custom and only live in this repository. I feel like this is a pretty niche use case so we won't move it into rootstock unless someone requests it.

@AppVeyorBot

AppVeyor build 1.0.4036 for commit b9e31ec is now complete.

Found 23 potential spelling error(s). Preview:
content/09.evolution.md:83:nonsynonymous
content/09.evolution.md:139:LVNA
content/23.vaccines-app.md:15:IgGs
content/23.vaccines-app.md:387:IgGs
content/60.methods.md:8:CCS
content/60.methods.md:24:CoronaCentral
content/60.methods.md:76:ECRs
content/60.methods.md:91:Manubot's
content/60.methods.md:125:scite
content/60.methods.md:125:Scite
content/60.methods.md:135:docx
content/60.methods.md:136:docx
content/60.methods.md:...
The rendered manuscript from this build is temporarily available for download at:

@AppVeyorBot

AppVeyor build 1.0.4045 Build execution time has reached the maximum allowed time for your plan (60 minutes).

@@ -2,8 +2,24 @@

### ABSTRACT
Contributor Author

Does ACM have a word limit on abstracts? I couldn't find it. This is 363 words which seems too long, but if they don't say otherwise...

Collaborator

@agitter agitter Apr 30, 2021

The sample manuscript doesn't state a word limit either

\begin{abstract}
A clear and well-documented \LaTeX\ document is presented as an
article formatted for publication by ACM in a conference proceedings
or journal publication. Based on the ``acmart'' document class, this
article presents and explains many of the common variations, as well
as many of the formatting elements an author may use in the
preparation of the documentation of their work.
\end{abstract}

This development was necessary because the manuscript grew so large that it needed to be split into seven separate papers for submission.
Similarly, we expanded the range of possible export formats to include LaTeX.
Because LaTeX is used for manuscript submission in many fields, automating the process of converting markdown to a submission-friendly format is desirable.
<!-- To Do: Also, if we end up automating anything to do with the Manubot-to-LaTeX or bibtex workflow while trying to submit this paper, we should describe that too, as well as the updates for the Microsoft Word crowd to generating the complete review manuscript along with the individual manuscripts reviewing specific topics.

@AppVeyorBot

AppVeyor build 1.0.4049 for commit 5023e7f is now complete....

Found 9 potential spelling error(s). Preview:
content/09.evolution.md:83:nonsynonymous
content/09.evolution.md:139:LVNA
content/23.vaccines-app.md:15:IgGs
content/23.vaccines-app.md:387:IgGs
content/60.methods.md:16:Manubot's
content/60.methods.md:106:Manubot's
content/60.methods.md:205:clinial
content/60.methods.md:213:Manubot's
content/60.methods.md:230:Manubot's
The rendered manuscript from this build is temporarily available for download at:

Collaborator

@rdvelazquez rdvelazquez left a comment

This looks great! Fantastic job summarizing the project.

A few random comments:

  • The tense shifted around quite a bit. It would be too much to try to keep the tense consistent across the whole thing, and even within some paragraphs a tense shift seemed necessary, but if in the course of finalizing you see places to keep tense more consistent, it may help readability
  • More mention of the potential downsides/limitations of this approach could actually strengthen the argument (address them ourselves rather than have the reader think of them); I'm thinking specifically of questions people may have around things like "what is the canonical version of the paper?" or "is it always kept up to date; how do I know if it's out of date?"

Manubot uses Zotero [@url:https://www.zotero.org] to extract metadata for some types of citations, but Zotero did not support clinical trial identifiers.
In order to enable Manubot to pull metadata associated with clinical trials based on their identifiers, we needed to develop Zotero support for these identifiers.
The need for this feature to be enabled in Zotero had also been identified by other researchers [@url:https://forums.zotero.org/discussion/74933/import-from-clinical-trials-registry; @url:https://forums.zotero.org/discussion/77721/add-reference-from-clinical-trials-org].
This update was achieved through the implementation of a query of clinicaltrials.gov to retrieve XML metadata associated with each identifier using JavaScript [@url:https://github.com/zotero/translators/pull/2153].
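
The metadata-extraction step can be pictured with a minimal sketch. It assumes a registry record shaped like the sample below: the element names (`clinical_study`, `id_info/nct_id`, `brief_title`) and the sample title are assumptions for illustration, no network request is made, and the actual translator is the JavaScript implementation in the cited Zotero pull request rather than this Python code.

```python
import xml.etree.ElementTree as ET

# Hypothetical record; the structure and title are assumed for illustration.
SAMPLE_XML = """
<clinical_study>
  <id_info><nct_id>NCT04292899</nct_id></id_info>
  <brief_title>Example trial title</brief_title>
</clinical_study>
"""


def extract_trial_metadata(xml_text):
    """Pull the identifier and title out of one trial record."""
    root = ET.fromstring(xml_text.strip())
    return {
        "id": root.findtext("id_info/nct_id"),
        "title": root.findtext("brief_title"),
    }


metadata = extract_trial_metadata(SAMPLE_XML)
print(metadata)
```

Fields that are absent from a record come back as `None` from `findtext`, so a caller can decide whether to skip the citation or fall back to the bare identifier.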
Collaborator

This looks good. Thanks for summarizing that!

Collaborator

@rdvelazquez I pushed a little more detail to this paragraph. If you have time, could you please double-check it again?

@AppVeyorBot

AppVeyor build 1.0.4057 for commit 8e14947 is now complete.

Found 22 potential spelling error(s). Preview:
content/09.evolution.md:83:nonsynonymous
content/09.evolution.md:139:LVNA
content/23.vaccines-app.md:15:IgGs
content/23.vaccines-app.md:387:IgGs
content/60.methods.md:16:Manubot's
content/60.methods.md:106:Manubot's
content/60.methods.md:166:projectstats
content/60.methods.md:168:projectstats
content/60.methods.md:171:projectstats
content/60.methods.md:172:projectstats
content/60.methods.md:187:projectstats
content/60.met...
The rendered manuscript from this build is temporarily available for download at:

@AppVeyorBot

AppVeyor build 1.0.4062 for commit f446681 is now complete.

Found 21 potential spelling error(s). Preview:
content/09.evolution.md:83:nonsynonymous
content/09.evolution.md:139:LVNA
content/23.vaccines-app.md:15:IgGs
content/23.vaccines-app.md:387:IgGs
content/60.methods.md:16:Manubot's
content/60.methods.md:107:Manubot's
content/60.methods.md:167:projectstats
content/60.methods.md:169:projectstats
content/60.methods.md:172:projectstats
content/60.methods.md:173:projectstats
content/60.methods.md:187:projectstats
content/60.met...
The rendered manuscript from this build is temporarily available for download at:

Member

@cgreene cgreene left a comment

I have some suggestions and comments but nothing breaking. My biggest takeaway though for this venue is: "why is this an ACM-BCB paper?" What should a computational biologist or bioinformaticist take away from this? Is it the power of manubot, the opportunity for the science of science, something about education? I think a summary (or intro paragraph) that helps a reviewer understand why this is a fit for the venue (I think it's a story that is a combo of the above items) by explicitly stating it might go a long way to helping get reviewers excited.

The first release, on March 16, 2020, contained 28,000 manuscripts on topics relevant to SARS-CoV-2 and related coronaviruses [@arxiv:2004.10706].
Since then, these articles have continued to proliferate (left), with both traditionally published and preprint manuscripts in the corpus (right).
At present, it contains {{cord19_total_preprints}} preprints from _arXiv_, _bioRxiv_, and _medRxiv_.
While not all of the manuscripts are focused explicitly on SARS-CoV-2 or COVID-19, this corpus is likely to contain all or most manuscripts relevant to writing a literature review, which requires assessing both emerging and prior research.
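
The `{{cord19_total_preprints}}` placeholder in the text above is filled in at build time. The following is a minimal stand-in for that templating step, not Manubot's implementation: the function name `fill_placeholders` is hypothetical and the count is a made-up value used only for illustration.

```python
import re

# Matches {{name}} with optional whitespace inside the braces.
PLACEHOLDER = re.compile(r"\{\{\s*(\w+)\s*\}\}")


def fill_placeholders(text, variables):
    """Replace {{name}} placeholders with values; leave unknown names untouched."""
    def substitute(match):
        name = match.group(1)
        return str(variables[name]) if name in variables else match.group(0)
    return PLACEHOLDER.sub(substitute, text)


sentence = "At present, it contains {{cord19_total_preprints}} preprints."
# 12345 is a made-up count, not a statistic from CORD-19.
filled = fill_placeholders(sentence, {"cord19_total_preprints": 12345})
print(filled)
```

Leaving unknown placeholders in place, rather than replacing them with an empty string, makes a missing statistic easy to spot in the rendered output.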
Member

In the future, one could imagine examining the fraction of papers that we cite that are in CORD.

Contributor Author

Yes! I think this would be a cool analysis!

A new plugin was added to Manubot to support "smart citations" in the HTML build of manuscripts.
The plugin uses the [Scite](https://scite.ai/) [@doi:10.1101/2021.03.15.435418] service to display a badge below any citation with a DOI.
The badge contains a set of icons and numbers that indicate how many times that source has been mentioned, supported, or disputed and whether there have been any important editorial notices, such as retractions or corrections.
Using this, we were able to identify references that needed to be reevaluated.
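
Because badges attach only to citations with a DOI, it can help to list which citations in a passage qualify. This is a minimal sketch, assuming citations are written in bracketed groups such as `[@doi:...; @url:...]` as in the manuscript text quoted in this thread; the function name `doi_citations` is hypothetical and the code does not call the Scite service.

```python
import re

# A citation group is a bracketed, semicolon-separated list starting with '@'.
CITATION_GROUP = re.compile(r"\[(@[^\]]+)\]")


def doi_citations(markdown_text):
    """Return DOIs cited with the '@doi:' prefix, in order, without duplicates."""
    dois = []
    for group in CITATION_GROUP.findall(markdown_text):
        for key in group.split(";"):
            key = key.strip()
            if key.startswith("@doi:"):
                doi = key[len("@doi:"):]
                if doi not in dois:
                    dois.append(doi)
    return dois


text = (
    "Concerns have been raised [@doi:10.1080/08989621.2020.1782203; "
    "@doi:10.1080/08989621.2020.1793675]. "
    "Other papers are updated [@doi:10.1080/08989621.2020.1793675; "
    "@url:https://retractionwatch.com/retracted-coronavirus-covid-19-papers]."
)
found = doi_citations(text)
print(found)
```

Citations that use other prefixes, such as `@url:` or `@arxiv:`, are skipped here, which mirrors the point that only DOI-bearing references receive a badge.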
Member

In the ideal world, we'd cite a retracted paper here so that folks could see the rendering but that won't work with ACM's template. What about citing a retracted paper and then putting a link to the instance in the manubot html? Maybe this is too self referential.

Contributor Author

I think we could put a screengrab if we have room? But we have a lot of figs so I'm not sure if we'll have room!

Similarly, the clinical trials statistics and figure were generated based on data from the University of Oxford Evidence-Based Medicine Data Lab's COVID-19 TrialsTracker [@doi:10.5281/zenodo.3732709].
The evolution of this figure over time is shown in Figure @fig:ebm-trials.
Information about vaccine distribution was extracted from Our World In Data [@url:https://github.com/owid/covid-19-data].
Figure 1 (above) also uses this approach to dynamically integrate data directly from the CORD-19 dataset [@arxiv:2004.10706].<!--To Do: Flow chart of data integration? We could have a summary figure showing all of the external data sources that are integrated into the manuscript. We have icons for MSSM reviews, JHU data, etc. at the top. That flows into a GitHub repo, which also takes input from all the contributors. Then the output of the repo is the manuscript and other CI artifacts.-->
Member

Time probably doesn't exist, but I do like this figure idea. It should definitely be in a presentation if accepted at the conference.

@rando2 rando2 mentioned this pull request Apr 30, 2021
2 tasks
Contributor Author

@rando2 rando2 left a comment

Here are some responses to @cgreene feedback

@AppVeyorBot

AppVeyor build 1.0.4104 for commit 16a295d is now complete.

Found 29 potential spelling error(s). Preview:
content/00.front-matter.md:4:Pandoc
content/00.front-matter.md:14:appveyor
content/09.evolution.md:83:nonsynonymous
content/09.evolution.md:139:LVNA
content/22.vaccines.md:286:immunogen
content/22.vaccines.md:287:metastable
content/22.vaccines.md:288:immunogens
content/23.vaccines-app.md:15:IgGs
content/23.vaccines-app.md:387:IgGs
content/60.methods.md:16:Manubot's
content/60.methods.md:108:Manubot's
content/60.methods.md...
The rendered manuscript from this build is temporarily available for download at:

Collaborator

@cbrueffer cbrueffer left a comment

Some minor improvements; it's really a nice read and made me realize how much I underappreciated the technical improvements going on in the background. Great work everyone involved!

The multifaceted nature of COVID-19 demands a multidisciplinary approach, but the urgency of the crisis combined with the need for social distancing measures presents unique challenges to collaborative science.
We sought to apply a massive online open publishing approach to this problem using Manubot.
Through GitHub, collaborators contributed summaries and critiques of literature via issue templates and contributed literature summaries as pull requests.
Manubot rendered the manuscript content into pdf, HTML, and docx outputs, and an up-to-date version was always available online.
Collaborator

Suggested change
- Manubot rendered the manuscript content into pdf, HTML, and docx outputs, and an up-to-date version was always available online.
+ Manubot rendered the manuscript content into PDF, HTML, and DOCX outputs, and an up-to-date version was always available online.

We adapted Manubot's figure generation workflow to retrieve up-to-date data from online sources nightly.
Additionally, we integrated scite, a tool for checking the status of references, including retractions, into the HTML build to simplify the process of monitoring changes to publications after their release.

Through this effort, we organized over 50 scientists from a range of backgrounds who evaluated over 1000 sources and authored seven literature reviews.
Collaborator

Suggested change
- Through this effort, we organized over 50 scientists from a range of backgrounds who evaluated over 1000 sources and authored seven literature reviews.
+ Through this effort, we organized over 50 scientists from a range of backgrounds who evaluated over 1,000 sources and authored seven literature reviews.

However, any static review is likely to quickly become dated as new research is released or existing research is retracted or superseded, and the explosive rate of publication made localized efforts to curate new publications increasingly difficult.
Additionally, the complex nature of COVID-19 means that significant advantages can be gained from examining the virus and disease in a multidisciplinary context.
The downsides of "excessive publication" have been recognized for over forty years, as it was raised as a major concern about the move towards electronic, rather than print, publishing at the turn of the millennium [@doi:10/d3bmnv].
The contents of the COVID-19 Open Research Dataset (CORD-19) [@arxiv:2004.10706], which was developed in part to assist in efforts to train machine learning algorithms on COVID-19-related text, illustrates the volume of publication relevant to understanding this virus (Figure @fig:cord19-growth).
Collaborator

Suggested change
- The contents of the COVID-19 Open Research Dataset (CORD-19) [@arxiv:2004.10706], which was developed in part to assist in efforts to train machine learning algorithms on COVID-19-related text, illustrates the volume of publication relevant to understanding this virus (Figure @fig:cord19-growth).
+ The contents of the COVID-19 Open Research Dataset (CORD-19) [@arxiv:2004.10706], which was developed in part to assist in efforts to train machine learning algorithms on COVID-19-related text, illustrates the volume of publications relevant to understanding this virus (Figure @fig:cord19-growth).

Collaborator

Perhaps "scholarly literature" instead of publications, since it includes preprints? At least to me, perhaps falsely, publication suggests peer review.

Concerns have been raised that the number of COVID-19 papers being retracted may be higher, or potentially much higher, than is typical, although a thorough investigation of this question will not be possible until more time has elapsed [@doi:10.1080/08989621.2020.1782203; @doi:10.1080/08989621.2020.1793675].
Other papers are updated with corrections or expressions of concern [@doi:10.1080/08989621.2020.1793675; @url:https://retractionwatch.com/retracted-coronavirus-covid-19-papers].
These include both preprints and papers published in more traditional venues [@url:https://retractionwatch.com/retracted-coronavirus-covid-19-papers; @url:https://asapbio.org/preprints-and-covid-19].
Preprints provide a venue for scientists to release findings rapidly but have both the advantage and disadvantage of making research available before it has undergone the peer-review process.

Suggested change
Preprints provide a venue for scientists to release findings rapidly but have both the advantage and disadvantage of making research available before it has undergone the peer-review process.
Preprints provide a venue for scientists to release findings rapidly but have both the advantage and disadvantage of making research available before it has undergone the peer review process.

Manubot is an ideal platform for analyzing COVID-19 literature because it facilitates the automatic integration of new data through CI.
However, the Manubot workflow can appear intimidating to contributors who are not well-versed in git [@doi:10.5334/kula.63].
The synthesis and discussion of the emerging literature by biomedical scientists and clinicians is imperative to a robust interpretation of COVID-19 research, but in biology, such efforts often rely on What You See Is What You Get tools such as Google Docs, despite the significant limitations of these platforms in the face of excessive publication.
Therefore, we recognized that the problem of synthesizing the COVID-19 literature lent itself well to the Manubot platform, but that the potential technical expertise required to work with Manubot present a significant technical barrier to domain experts.

Suggested change
Therefore, we recognized that the problem of synthesizing the COVID-19 literature lent itself well to the Manubot platform, but that the potential technical expertise required to work with Manubot present a significant technical barrier to domain experts.
Therefore, we recognized that the problem of synthesizing the COVID-19 literature lent itself well to the Manubot platform, but that the potential technical expertise required to work with Manubot presents a significant technical barrier to domain experts.


With the worldwide scientific community uniting during 2020 to investigate SARS-CoV-2 and COVID-19 from a wide range of perspectives, findings from many disciplines are relevant on a rapid timescale to a broad scientific audience.
While Manubot manuscripts are written in markdown, they can be rendered in several formats that provide different advantages.
For example, beyond building just a PDF, Manubot also renders the manuscript in HTML, docx, and now, LaTeX.

Suggested change
For example, beyond building just a PDF, Manubot also renders the manuscript in HTML, docx, and now, LaTeX.
For example, beyond building just a PDF, Manubot also renders the manuscript in HTML, DOCX, and now, LaTeX.

The HTML manuscript format offers several advantages over a static PDF to harmonize available resources that we were able to apply to specific problems of COVID-19.
The integration of scite into the HTML build makes references more manageable by visually representing whether their results are contested or whether they have been corrected or retracted.

Suggested change
The integration of scite into the HTML build makes references more manageable by visually representing whether their results are contested or whether they have been corrected or retracted.
The integration of Scite into the HTML build makes references more manageable by visually representing whether their results are contested or whether they have been corrected or retracted.

Cross-referencing different pieces of the manuscript, such as cited preprints with reviews stored in an appendix, is another dynamic option presented by HTML.
Additionally, because of the heavy emphasis on Word processing in biology, Manubot's ability to generate docx outputs was expanded to allow users to generate docx files containing only a section of the manuscript.

Suggested change
Additionally, because of the heavy emphasis on Word processing in biology, Manubot's ability to generate docx outputs was expanded to allow users to generate docx files containing only a section of the manuscript.
Additionally, because of the heavy emphasis on Word processing in biology, Manubot's ability to generate DOCX outputs was expanded to allow users to generate DOCX files containing only a section of the manuscript.

In our case, where the full project is nearly 100,000 words, this allows individual pieces to be shared widely.
Finally, the addition of LaTeX output is useful for researchers from computational fields who submit papers in tex format and removes the step of reformatting markdown prior to submission.

Suggested change
Finally, the addition of LaTeX output is useful for researchers from computational fields who submit papers in tex format and removes the step of reformatting markdown prior to submission.
Finally, the addition of LaTeX output is useful for researchers from computational fields who submit papers in TeX format and removes the step of reformatting Markdown prior to submission.

@@ -917,6 +923,7 @@ dimorphism
disaggregate
dl
docosahexaenoic
docx

Suggested change
docx
DOCX


agitter commented May 5, 2021

Thanks for these suggestions @cbrueffer. We can accept all of these proposed changes, but I want to make a plan with @rando2 about the order of operations. We submitted the first version of this manuscript to the ACM-BCB conference last Friday to meet their deadline.

@rando2 what do you think of the following order for updates to this methods manuscript:

  • merge this pull request
  • use the Overleaf diff (I have the source files) to reapply tex version text changes to the markdown version (ignoring formatting, figures, etc.)
  • tag that version as v1
  • open a new pull request to apply @cbrueffer's suggestions
  • continue editing to address feedback we didn't get to before the deadline, working toward a preprint posting in the near future

@agitter agitter merged commit 33b6023 into greenelab:master May 11, 2021
agitter added a commit that referenced this pull request May 26, 2021
Original review #947 (review)

Co-authored-by: Christian Brueffer <christian.brueffer@med.lu.se>
@rando2 rando2 deleted the methods branch July 28, 2021 17:16
@agitter agitter mentioned this pull request Aug 27, 2021
28 tasks
Labels
Methods Strategies for review