Prepare methods for ACM - BCB #939

Merged
merged 8 commits into from
Apr 27, 2021

Conversation

Contributor

@rando2 rando2 commented Apr 22, 2021

Description of the proposed additions or changes

For reasons involving a job that might be posted soon, we are hoping to pivot the methods section to ACM - BCB. Unfortunately the deadline is 4/30! (Hopefully it will be pushed back though!)

Here is my very rough draft outline of the methods section as we could potentially present it for this conference (= Computer Science version of a journal for those used to bio publishing!)

@mprobson has agreed to take a look and give feedback on how to structure this type of content for an ACM journal. Based on this feedback, I'll start hitting the To Dos!

Related issues

Suggested reviewers (optional)

Checklist

  • Text is formatted so that each sentence is on its own line.
  • Pre-prints cited in this pull request have a GitHub issue opened so that they can be reviewed.

@rando2 rando2 requested a review from mprobson April 22, 2021 15:41
@AppVeyorBot

AppVeyor build 1.0.3976 for commit 421d256 is now complete....

Found 8 potential spelling error(s). Preview:
content/09.evolution.md:83:nonsynonymous
content/09.evolution.md:139:LVNA
content/23.vaccines-app.md:15:IgGs
content/23.vaccines-app.md:387:IgGs
content/60.methods.md:8:CCS
content/60.methods.md:45:backend
content/60.methods.md:48:autoupdated
content/60.methods.md:60:ECR

The rendered manuscript from this build is temporarily available for download at:

@rando2 rando2 added the Methods Strategies for review label Apr 23, 2021
Collaborator

@agitter agitter left a comment

That's great to have a potential venue for this manuscript. I'm adding a few quick thoughts now and will work on a full review soon.

Are you targeting the 10 page paper or 6 page short paper? https://acm-bcb.org/2021/index.php?page=call_for_papers

These include both preprints and papers that are published in more traditional venues.
The large number of retractions may also be influenced by the fact that the time from submission to peer review for papers related to COVID-19 is very low.

The rate of this proliferation also presents challenges to efforts to summarize and synthesize existing literature, which are necessary given the volume.
Collaborator

This is an important point to emphasize here and maybe elsewhere. There have been related efforts to review COVID-19 preprints (which we should search for and cite). We can explain why our project goes beyond reviewing preprints and the advantages of summarizing and synthesizing a massive amount of preprints, news stories, and journal publications.

Contributor Author

@rando2 rando2 Apr 23, 2021

That sounds great! I can definitely add this.
Edited: should be addressed in a824ec6


#### Collaborative Writing and Manuscript Generation
Few researchers in biological and medical fields are trained in version control tools such as git <!--To Do: Find some sort of reference for this, software carpentry?-->
Collaborator

A nice article but it isn't supporting what you say here https://journals.plos.org/ploscompbiol/article?id=10.1371/journal.pcbi.1004668

Maybe something citing it would be more appropriate?

Contributor Author

That is one of my favorite ways of finding papers :) I'm digging through them now!

Collaborator

@mprobson mprobson left a comment

Overall outline looks good! My only concern is how we'll integrate this with their preferred LaTeX style file.

Collaborator

@agitter agitter left a comment

@rando2 what you have so far looks great to me. My one major suggestion is that we think about how to add a Results section. See the comments below for some initial thoughts.

I'm tagging @rdvelazquez to see whether he has time in the next week to help describe his technical contributions or analyze the repository data with the GitHub API.


### Abstract
### ABSTRACT
Collaborator

Infodemic is an appropriate word for the title but not the most common term. Let's define it in the abstract as we motivate the problem.


#### Description of the Problem

<!--To Do: Plot the number of pubs in CORD-19 over time?-->
Collaborator

A plot like this - growth of preprints and/or published manuscripts - would be nice. It could be the first part of Results instead of Methods.

Additionally, the broader topic of COVID-19 intersects with a wide range of fields, including virology, immunology, medicine, pharmacology, evolutionary biology, public health, and more.
Therefore, any effort to comprehensively document and evaluate this body of literature would require insight from scientists across a number of fields.
Furthermore, during the initial phase of the COVID-19 pandemic in spring and summer 2020, and much longer in some parts of the world, many biological scientists were unable to access their research spaces.
As a result, early career researchers (ECR) and students were likely to lose out on valuable time for conducting experiments. <!--To Do: look at equity analyses of the effects of the pandemic to see if there is any data on this yet?-->
Collaborator

Manubot provided a way for all contributors, including ECRs, to join a massive collaborative project but also demonstrate their individual contributions to the larger work.

Contributor Author

Moved to conclusions!


#### Data Analysis and Visualization
#### Applying Manubot to COVID-19
Collaborator

We should emphasize that we were not just applying Manubot as-is. The demands of this review and its scale required software, infrastructure, and workflow improvements. For this type of conference, we should emphasize the methods development and contributions.

I now noticed you already have this explicit section below. Perhaps we want to still rephrase this header to stress that the figure automation was all a new development as well.

Contributor Author

@rando2 rando2 Apr 26, 2021

Attempted in 889a954

The combination of Manubot and GitHub Actions also made it possible to dynamically update information such as statistics and visualizations in the manuscript.
Data about worldwide cases and deaths from the COVID-19 Data Repository by the Center for Systems Science and Engineering at Johns Hopkins University [@https://github.com/CSSEGISandData/COVID-19/tree/master/csse_covid_19_data/csse_covid_19_time_series] were read using a Python script. <!-- TODO: replace figure reference with manuscript reference or reference figure conditionally based on template variable [to generate Figure @fig:csse-deaths.] -->
When scientific writers added text that was current only as of a given date, publicly available data sources were identified whenever possible to allow the information to be pulled directly into the manuscript in order to keep it up-to-date.
Collaborator

Quoted examples can help clarify what we mean by this.
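
As one illustration of the kind of auto-updated statistic described in the quoted lines, here is a minimal sketch, not the project's actual script. It assumes the wide layout of the CSSE time series files (four metadata columns followed by one cumulative-count column per date) and uses a small inline sample with made-up place names and numbers instead of downloading the real file. The function names `global_totals` and `latest_statistic` are hypothetical.

```python
import csv
import io

# Tiny inline sample in the wide layout assumed for the CSSE time series
# files: four metadata columns, then one cumulative-count column per date.
# Place names and numbers are invented for illustration only.
SAMPLE_CSV = """Province/State,Country/Region,Lat,Long,1/22/20,1/23/20,1/24/20
,Aland,0.0,0.0,0,1,3
North,Borduria,1.0,1.0,2,2,5
South,Borduria,2.0,2.0,1,4,4
"""


def global_totals(csv_text):
    """Sum the cumulative counts across all rows for each date column."""
    reader = csv.reader(io.StringIO(csv_text))
    header = next(reader)
    dates = header[4:]  # everything after the four metadata columns
    totals = [0] * len(dates)
    for row in reader:
        for i, value in enumerate(row[4:]):
            totals[i] += int(value)
    return dict(zip(dates, totals))


def latest_statistic(csv_text):
    """Return (date, total) for the most recent date column."""
    totals = global_totals(csv_text)
    last_date = list(totals)[-1]  # dicts preserve column order
    return last_date, totals[last_date]


date, total = latest_statistic(SAMPLE_CSV)
# A sentence like this could be regenerated on every build.
print(f"As of {date}, {total:,} cumulative deaths had been reported.")
```

In a continuous integration setting, the sample string would presumably be replaced by the contents of the downloaded file, so the sentence changes whenever the upstream data does.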

Data about worldwide cases and deaths from the COVID-19 Data Repository by the Center for Systems Science and Engineering at Johns Hopkins University [@https://github.com/CSSEGISandData/COVID-19/tree/master/csse_covid_19_data/csse_covid_19_time_series] were read using a Python script. <!-- TODO: replace figure reference with manuscript reference or reference figure conditionally based on template variable [to generate Figure @fig:csse-deaths.] -->
When scientific writers added text that was current only as of a given date, publicly available data sources were identified whenever possible to allow the information to be pulled directly into the manuscript in order to keep it up-to-date.

Data was pulled from a number of sources.
Collaborator

If we had more time, this would be a good candidate for a figure illustrating the workflow. It is pretty confusing how the auto-updates and figure versioning work. I could help sketch this but don't have good graphic design skills.

Contributor Author

Adding a note that it would be nice to get this! I can help (and can do OK visualizations) but if @vincerubinetti ends up interested, maybe he would have some advice!


### Article Selection and Evaluation
Additionally, the fast-moving nature of the infodemic has led to a number of retractions and corrections of COVID-19 literature.
<!--To Do: Describe integration with scite -- unless this should go in the section below? Ask Vince -->
Collaborator

I think we should take credit for any general Manubot improvements (like scite integration and spellchecking) in this paper and invite the relevant Manubot contributors as authors on this paper.

Contributor Author

scite is @vincerubinetti and you did the spell-checking, right? I reached out to Vince to see what he thinks!

Collaborator

Yes, I added most of the spell checking. Daniel helped improve it when we migrated it from the prototype in this repo to the rootstock version.


Reviewers described whether a causal claim could be made.
They described whether any side effects or interactions with other drugs were identified, as well as any subgroup findings.
<!--To Do: Add a graph/text describing number of unique contributors & commits over time-->
Collaborator

For this type of conference, we should include a Results section in addition to the Methods. I haven't checked the template, but I assume that will be expected.

Our major claim is that Manubot successfully facilitated a massive collaborative review on an urgent topic. Our Results section can provide a qualitative and quantitative assessment to support that claim. I noted above that the growth of the literature could be one initial result to demonstrate the problem (and perhaps the growth of clinical trials, treatments, etc. as well). Other quantitative results could include:

  • unique contributors and commits over time, as noted in the comment
  • growth of the manuscript over time (we could try regenerating this figure summarizing deep review growth)
  • growth of paper issues over time
  • distribution of review contents, that is, something showing the number of comments per pull request that was merged
  • growth of the reference list over time and/or number of clinical trials cited (because we describe that as a necessary enhancement)
  • example of how the auto-updated figures changed between an early version and most recent version to demonstrate the importance of having current data visualized

Is there anything else along these lines we could include in a Results section? Most of my suggestions are plots showing something getting bigger over time, which is okay but not the most exciting data.

We could have a summary figure showing all of the external data sources that are integrated into the manuscript. We have icons for MSSM reviews, JHU data, etc. at the top. That flows into a GitHub repo, which also takes input from all the contributors. Then the output of the repo is the manuscript and other CI artifacts.
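
For the "unique contributors and commits over time" idea in the list above, a rough sketch of the bookkeeping might look like the following. It is only an illustration under assumptions: the input is taken to be chronological text in a `date|author` format, which could plausibly be produced with `git log --reverse --date=short --format="%ad|%an"`, and the sample log, author names, and the function name `growth_by_month` are all hypothetical.

```python
# Hypothetical sample of chronological "date|author" lines, e.g. as might be
# produced by: git log --reverse --date=short --format="%ad|%an"
SAMPLE_LOG = """2020-03-20|alice
2020-03-25|bob
2020-04-02|alice
2020-04-15|carol
2020-05-01|bob
"""


def growth_by_month(log_text):
    """Map each month to (cumulative commits, cumulative unique authors).

    Assumes the lines are ordered from oldest to newest commit.
    """
    months = {}
    seen_authors = set()
    commits = 0
    for line in log_text.splitlines():
        if not line.strip():
            continue  # skip blank lines
        date, author = line.split("|", 1)
        commits += 1
        seen_authors.add(author)
        # Later commits in the same month overwrite the running totals,
        # so each month ends up with its end-of-month values.
        months[date[:7]] = (commits, len(seen_authors))
    return months


for month, (n_commits, n_authors) in growth_by_month(SAMPLE_LOG).items():
    print(month, n_commits, n_authors)
```

The resulting per-month pairs could be handed to any plotting library to draw the two growth curves.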

Contributor Author

These are awesome ideas!!! I really like the idea of comparing the early figures to the most recent figures. My only other idea was something like a word cloud of #17 to show the diversity of interests? I don't have a lot of experience with that type of visualization, but assume it wouldn't be hard to do in Python. It would be cool to do something by degree obtained/pursued (MD vs PhD) or field, but I'm not sure how easy it would be to extract that data. Honestly, I could probably collect data on MD/PhD, career stage, and discipline just by manually tabulating people's responses to #17, which wouldn't be bad!

@RLordan
Collaborator

RLordan commented Apr 26, 2021

Good luck folks, don't forget to mention the use of Gitter. It was pretty useful for inexperienced contributors to collaborate and ask questions more informally.

Co-authored-by: Anthony Gitter <agitter@users.noreply.github.com>
@AppVeyorBot

AppVeyor build 1.0.3995

Contributor Author

@rando2 rando2 left a comment

These are a couple of quick fixes for some of @agitter's comments. I'm still working on addressing the ones that will take a little more reorganizing!


### Abstract
### ABSTRACT

Contributor Author

Suggested change
<!--To do: add definition of "infodemic"-->

@AppVeyorBot

AppVeyor build 1.0.3996 for commit a824ec6 is now complete.

Found 8 potential spelling error(s). Preview:
content/09.evolution.md:83:nonsynonymous
content/09.evolution.md:139:LVNA
content/23.vaccines-app.md:15:IgGs
content/23.vaccines-app.md:387:IgGs
content/60.methods.md:8:CCS
content/60.methods.md:52:backend
content/60.methods.md:54:autoupdated
content/60.methods.md:64:ECR...
The rendered manuscript from this build is temporarily available for download at:

Contributor Author

@rando2 rando2 left a comment

This is incredibly helpful feedback @agitter! I'm about to push a new commit that might cause some of the comments to disappear from the files tab, but I have responded to them here.

My suspicion is we are looking at a 6-page paper (@mprobson also suggested this might be better received as a short paper), but we'll have to see what it looks like when we convert to their format!

I will start on some of the data analysis/figure generation tomorrow!

For examples of each template, please see Appendices B-D.
Another option was to contribute or edit text using GitHub's pull request system.
Each pull request was reviewed and approved by at least one other author.
Manubot also provides a functionality to create a bibliography using digital object identifiers (DOIs), website URLs, or other identifiers such as PubMed identifiers and arXiv IDs.
Contributor Author

Moved below!


#### Collaborative Writing and Manuscript Generation
Few researchers in biological and medical fields are trained in version control tools such as git <!--To Do: Find some sort of reference for this, software carpentry?-->
The current project was managed through GitHub [@url:https://github.com/greenelab/covid19-review] using Manubot [@doi:10.1371/journal.pcbi.1007128] to continuously generate a version of the manuscript online [@url:https://greenelab.github.io/covid19-review].
Contributor Author

Moved to discussion
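
The citation keys in the sentence quoted above (`@url:` and `@doi:`) can also be counted programmatically, which is one possible way to track growth of the reference list. The sketch below is an assumption-laden illustration rather than Manubot's own parser: the regular expression only covers the four identifier types named in this thread (DOIs, URLs, PubMed identifiers, arXiv IDs), the `pmid` and `arxiv` prefixes are assumed spellings, and `extract_citations` is a hypothetical helper.

```python
import re
from collections import Counter

# Simplified pattern: a known prefix, a colon, then everything up to
# whitespace, a closing bracket, or a semicolon. Not Manubot's real parser.
CITATION_PATTERN = re.compile(r"@(doi|url|pmid|arxiv):([^\s\];]+)")


def extract_citations(markdown_text):
    """Return a list of (prefix, identifier) pairs found in the text."""
    return CITATION_PATTERN.findall(markdown_text)


# The manuscript sentence quoted in this review thread.
SAMPLE = (
    "The current project was managed through GitHub "
    "[@url:https://github.com/greenelab/covid19-review] using Manubot "
    "[@doi:10.1371/journal.pcbi.1007128] to continuously generate a version "
    "of the manuscript online [@url:https://greenelab.github.io/covid19-review]."
)

citations = extract_citations(SAMPLE)
counts = Counter(prefix for prefix, _ in citations)
print(counts)
```

Running the same function over each historical version of the content files would give a citation count per version.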



@rando2 rando2 changed the title Prepare methods for ACM - BRB Prepare methods for ACM - BCB Apr 26, 2021
@AppVeyorBot

AppVeyor build 1.0.3997 for commit 889a954 is now complete.

Found 15 potential spelling error(s). Preview:
content/09.evolution.md:83:nonsynonymous
content/09.evolution.md:139:LVNA
content/23.vaccines-app.md:15:IgGs
content/23.vaccines-app.md:387:IgGs
content/60.methods.md:8:CCS
content/60.methods.md:64:ECRs
content/60.methods.md:80:Manubot's
content/60.methods.md:110:scite
content/60.methods.md:142:scite
content/60.methods.md:159:ECR
content/60.methods.md:160:ECRs
content/60.methods.md:164:docx
content/60.methods.md:166:scite...
The rendered manuscript from this build is temporarily available for download at:

Collaborator

@agitter agitter left a comment

Thanks for all the changes. Let's merge this so we can start splitting up individual sections. Please give me direction on where I can help most, whether that is resolving TODOs or something else.

Daniel's notebooks at https://github.com/greenelab/meta-review/tree/master/analyses/deep-review-contrib give some examples of how to do contributor and content analysis for the results.

@rando2
Contributor Author

rando2 commented Apr 27, 2021

Thank you @agitter! I put this on gitter, but if you want to look at either methods for functionalities that you added (@vincerubinetti is going to send us a paragraph on scite) or exporting in LaTeX, I can work on the analysis/visualization this morning!

@rando2 rando2 merged commit 05acaf5 into greenelab:master Apr 27, 2021
@rando2 rando2 deleted the methods branch April 27, 2021 11:59
@agitter agitter mentioned this pull request Aug 27, 2021
28 tasks
Labels
Methods Strategies for review
5 participants