Prepare methods for ACM - BCB #939
AppVeyor build 1.0.3976 for commit 421d256 is now complete....
Found 8 potential spelling error(s). Preview:
- content/09.evolution.md:83:nonsynonymous
- content/09.evolution.md:139:LVNA
- content/23.vaccines-app.md:15:IgGs
- content/23.vaccines-app.md:387:IgGs
- content/60.methods.md:8:CCS
- content/60.methods.md:45:backend
- content/60.methods.md:48:autoupdated
- content/60.methods.md:60:ECR
The rendered manuscript from this build is temporarily available for download at:
That's great to have a potential venue for this manuscript. I'm adding a few quick thoughts now and will work on a full review soon.
Are you targeting the 10 page paper or 6 page short paper? https://acm-bcb.org/2021/index.php?page=call_for_papers
These include both preprints and papers that are published in more traditional venues.
The large number of retractions may also be influenced by the fact that the time from submission to peer review for papers related to COVID-19 is very low.

The rate of this proliferation also presents challenges to efforts to summarize and synthesize existing literature, which are necessary given the volume.
This is an important point to emphasize here and maybe elsewhere. There have been related efforts to review COVID-19 preprints (which we should search for and cite). We can explain why our project goes beyond reviewing preprints and the advantages of summarizing and synthesizing a massive number of preprints, news stories, and journal publications.
That sounds great! I can definitely add this.
Edited: should be addressed in a824ec6
content/60.methods.md
Outdated
#### Collaborative Writing and Manuscript Generation
Few researchers in biological and medical fields are trained in version control tools such as git <!--To Do: Find some sort of reference for this, software carpentry?-->
A nice article but it isn't supporting what you say here https://journals.plos.org/ploscompbiol/article?id=10.1371/journal.pcbi.1004668
Maybe something citing it would be more appropriate?
That is one of my favorite ways of finding papers :) I'm digging through them now!
Overall outline looks good! My only concern is how we'll integrate this with their preferred LaTeX style file.
@rando2 what you have so far looks great to me. My one major suggestion is that we think about how to add a Results section. See the comments below for some initial thoughts.
I'm tagging @rdvelazquez to see whether he has time in the next week to help describe his technical contributions or analyze the repository data with the GitHub API.
### Abstract
### ABSTRACT
Infodemic is an appropriate word for the title but not the most common term. Let's define it in the abstract as we motivate the problem.
content/60.methods.md
Outdated
#### Description of the Problem

<!--To Do: Plot the number of pubs in CORD-19 over time?-->
A plot like this - growth of preprints and/or published manuscripts - would be nice. It could be the first part of Results instead of Methods.
content/60.methods.md
Outdated
Additionally, the broader topic of COVID-19 intersects with a wide range of fields, including virology, immunology, medicine, pharmacology, evolutionary biology, public health, and more.
Therefore, any effort to comprehensively document and evaluate this body of literature would require insight from scientists across a number of fields.
Furthermore, during the initial phase of the COVID-19 pandemic during spring and summer 2020, and much longer in some part of the world, many biological scientists were unable to access their research spaces.
As a result, early career researchers (ECR) and students were likely to lose out on valuable time for conducting experiments. <!--To Do: look at equity analyses of the effects on the pandemic to see if there is any data on this yet?-->
Manubot provided a way for all contributors, including ECRs, to join a massive collaborative project while also demonstrating their individual contributions to the larger work.
Moved to conclusions!
content/60.methods.md
Outdated
#### Data Analysis and Visualization
#### Applying Manubot to COVID-19
We should emphasize that we were not just applying Manubot as-is. The demands of this review and its scale required software, infrastructure, and workflow improvements. For this type of conference, we should emphasize the methods development and contributions.
I now noticed you already have this explicit section below. Perhaps we want to still rephrase this header to stress that the figure automation was all a new development as well.
Attempted in 889a954
content/60.methods.md
Outdated
The combination of Manubot and GitHub Actions also made it possible to dynamically update information such as statistics and visualizations in the manuscript.
Data about worldwide cases and deaths from the COVID-19 Data Repository by the Center for Systems Science and Engineering at Johns Hopkins University [@https://github.com/CSSEGISandData/COVID-19/tree/master/csse_covid_19_data/csse_covid_19_time_series] were read using a Python script. <!-- TODO: replace figure reference with manuscript reference or reference figure conditionally based on template variable [to generate Figure @fig:csse-deaths.] -->
When scientific writers added text that was current only as of a given date, publicly available data sources were identified whenever possible to allow the information to pulled directly into the manuscript in order to keep it up-to-date.
Quoted examples can help clarify what we mean by this.
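As one possible illustration of the step quoted above (reading the case and death counts with a Python script and pulling a current value into the text), here is a minimal sketch. It assumes the wide layout of the CSSE time series files (four metadata columns followed by one cumulative-count column per date). The sample rows, place names, and the `global_totals` helper are hypothetical and are not the project's actual script.

```python
import csv
import io

# Hypothetical excerpt in the assumed wide layout: four metadata columns,
# then one column of cumulative counts per date. All values are invented.
SAMPLE = """Province/State,Country/Region,Lat,Long,1/22/20,1/23/20,1/24/20
,Aland,0.0,0.0,0,1,3
North,Borduria,0.0,0.0,2,2,5
South,Borduria,0.0,0.0,1,4,4
"""


def global_totals(csv_text):
    """Sum the cumulative counts across all rows for each date column."""
    reader = csv.reader(io.StringIO(csv_text))
    header = next(reader)
    dates = header[4:]  # skip the four metadata columns
    totals = [0] * len(dates)
    for row in reader:
        for i, value in enumerate(row[4:]):
            totals[i] += int(value)
    return dict(zip(dates, totals))


totals = global_totals(SAMPLE)
# The last date column holds the most recent cumulative total.
latest_date, latest_total = list(totals.items())[-1]
sentence = f"As of {latest_date}, {latest_total:,} deaths had been reported worldwide."
print(sentence)
```

A sentence generated this way could serve as a quoted example of text that stays current whenever the build reruns against fresh data.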
Data about worldwide cases and deaths from the COVID-19 Data Repository by the Center for Systems Science and Engineering at Johns Hopkins University [@https://github.com/CSSEGISandData/COVID-19/tree/master/csse_covid_19_data/csse_covid_19_time_series] were read using a Python script. <!-- TODO: replace figure reference with manuscript reference or reference figure conditionally based on template variable [to generate Figure @fig:csse-deaths.] -->
When scientific writers added text that was current only as of a given date, publicly available data sources were identified whenever possible to allow the information to pulled directly into the manuscript in order to keep it up-to-date.

Data was pulled from a number of sources.
If we had more time, this would be a good candidate for a figure illustrating the workflow. It is pretty confusing how the auto-updates and figure versioning work. I could help sketch this but don't have good graphic design skills.
Adding a note that it would be nice to get this! I can help (and can do OK visualizations) but if @vincerubinetti ends up interested, maybe he would have some advice!
content/60.methods.md
Outdated
### Article Selection and Evaluation
Additionally, the fast-moving nature of the infodemic has led to a number of retractions and corrections of COVID-19 literature.
<!--To Do: Describe integration with scite -- unless this should go in the section below? Ask Vince -->
I think we should take credit for any general Manubot improvements (like scite integration and spellchecking) in this paper and invite the relevant Manubot contributors as authors on this paper.
scite is @vincerubinetti and you did the spell-checking, right? I reached out to Vince to see what he thinks!
Yes, I added most of the spell checking. Daniel helped improve it when we migrated it from the prototype in this repo to the rootstock version.
Reviewers described whether a causal claim could be made.
They described whether any side effects or interactions with other drugs were identified, as well as any subgroup findings.
<!--To Do: Add a graph/text describing number of unique contributors & commits over time-->
For this type of conference, we should include a Results section in addition to the Methods. I haven't checked the template, but I assume that will be expected.
Our major claim is that Manubot successfully facilitated a massive collaborative review on an urgent topic. Our Results section can provide a qualitative and quantitative assessment to support that claim. I noted above that the growth of the literature could be one initial result to demonstrate the problem (and perhaps the growth of clinical trials, treatments, etc. as well). Other quantitative results could include:
- unique contributors and commits over time, as noted in the comment
- growth of the manuscript over time (we could try regenerating this figure summarizing deep review growth)
- growth of paper issues over time
- distribution of review contents, that is, something showing the number of comments per pull request that was merged
- growth of the reference list over time and/or number of clinical trials cited (because we describe that as a necessary enhancement)
- example of how the auto-updated figures changed between an early version and most recent version to demonstrate the importance of having current data visualized
Is there anything else along these lines we could include in a Results section? Most of my suggestions are plots showing something getting bigger over time, which is okay but not the most exciting data.
We could have a summary figure showing all of the external data sources that are integrated into the manuscript. We have icons for MSSM reviews, JHU data, etc. at the top. That flows into a GitHub repo, which also takes input from all the contributors. Then the output of the repo is the manuscript and other CI artifacts.
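To make the first suggestion concrete (unique contributors and commits over time), here is a minimal sketch of the tabulation that would sit behind such a plot. It assumes commit records are already available as (date, author) pairs, for example exported from the git history or the GitHub API. The records, author names, and the `cumulative_growth` helper are hypothetical.

```python
from datetime import date

# Hypothetical commit records as (commit date, author) pairs; invented data.
COMMITS = [
    (date(2020, 3, 20), "alice"),
    (date(2020, 3, 20), "bob"),
    (date(2020, 3, 27), "alice"),
    (date(2020, 4, 2), "carol"),
    (date(2020, 4, 2), "bob"),
]


def cumulative_growth(commits):
    """Return one (day, cumulative commits, cumulative unique authors) row per day."""
    seen_authors = set()
    n_commits = 0
    rows = []
    for day, author in sorted(commits):
        n_commits += 1
        seen_authors.add(author)
        row = (day, n_commits, len(seen_authors))
        if rows and rows[-1][0] == day:
            rows[-1] = row  # same day: keep only the end-of-day totals
        else:
            rows.append(row)
    return rows


growth = cumulative_growth(COMMITS)
for day, n_commits, n_authors in growth:
    print(day.isoformat(), n_commits, n_authors)
```

The resulting rows can be passed directly to any plotting library as two lines over a shared date axis.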
These are awesome ideas!!! I really like the idea of comparing the early figures to the most recent figures. My only other idea was something like a word cloud of #17 to show the diversity of interests? I don't have a lot of experience with that type of visualization, but assume it wouldn't be hard to do in Python. It would be cool to do something by degree obtained/pursued (MD vs PhD) or field, but I'm not sure how easy it would be to extract that data. Honestly, I could probably collect data on MD/PhD, career stage, and discipline just by manually tabulating people's responses to #17, which wouldn't be bad!
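A word cloud is essentially a plot of term frequencies, so the counting step can be sketched with the standard library alone. The responses below are hypothetical stand-ins for the text of the replies to #17, and the stopword list and the `term_counts` helper are assumptions for illustration.

```python
import re
from collections import Counter

# Hypothetical stand-ins for contributor introductions; invented text.
RESPONSES = [
    "PhD student in immunology interested in vaccines",
    "MD resident, interested in clinical trials and vaccines",
    "Postdoc in evolutionary biology and virology",
]

# A deliberately tiny stopword list, assumed for this sketch.
STOPWORDS = {"in", "and", "the", "of", "a"}


def term_counts(texts):
    """Count lowercase alphabetic terms across all texts, skipping stopwords."""
    counts = Counter()
    for text in texts:
        words = re.findall(r"[a-z]+", text.lower())
        counts.update(word for word in words if word not in STOPWORDS)
    return counts


counts = term_counts(RESPONSES)
print(counts.most_common(5))
```

The same counter could also tally degree terms such as "md" and "phd", which would partly automate the manual tabulation described above.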
Good luck folks, don't forget to mention the use of Gitter. It was pretty useful for inexperienced contributors to collaborate and ask questions more informally.
Co-authored-by: Anthony Gitter <agitter@users.noreply.github.com>
AppVeyor build 1.0.3995
These are a couple of quick fixes for some of @agitter's comments. I'm still working on addressing the ones that will take a little more reorganizing!
### Abstract
### ABSTRACT
<!--To do: add definition of "infodemic"-->
AppVeyor build 1.0.3996 for commit a824ec6 is now complete.
Found 8 potential spelling error(s). Preview:
- content/09.evolution.md:83:nonsynonymous
- content/09.evolution.md:139:LVNA
- content/23.vaccines-app.md:15:IgGs
- content/23.vaccines-app.md:387:IgGs
- content/60.methods.md:8:CCS
- content/60.methods.md:52:backend
- content/60.methods.md:54:autoupdated
- content/60.methods.md:64:ECR...
This is incredibly helpful feedback @agitter! I'm about to push a new commit that might cause some of the comments to disappear from the files tab, but I have responded to them here.
My suspicion is we are looking at a 6-page paper (@mprobson also suggested this might be better received as a short paper), but we'll have to see what it looks like when we convert to their format!
I will start on some of the data analysis/figure generation tomorrow!
For examples of each template, please see Appendices B-D.
Another option was to contribute or edit text using GitHub's pull request system.
Each pull request was reviewed and approved by at least one other author.
Manubot also provides a functionality to create a bibliography using digital object identifiers (DOIs), website URLs, or other identifiers such as PubMed identifiers and arXiv IDs.
Moved below!
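For readers unfamiliar with identifier-based citations, the sketch below shows how keys such as `@doi:...` and `@url:...` (the forms that appear in the manuscript text under review) can be pulled out of a sentence and grouped by identifier type. The regular expression and the `group_citekeys` helper are assumptions made for illustration and are not Manubot's own parser.

```python
import re
from collections import defaultdict

# Example sentence using the citation style seen in the manuscript source.
TEXT = (
    "The project was managed through GitHub "
    "[@url:https://github.com/greenelab/covid19-review] using Manubot "
    "[@doi:10.1371/journal.pcbi.1007128]."
)

# Assumed pattern: '@', a lowercase prefix, ':', then the identifier,
# which runs until whitespace, ']' or ';'.
CITEKEY = re.compile(r"@([a-z]+):([^\s\];]+)")


def group_citekeys(text):
    """Group cited identifiers by their prefix (for example 'doi' or 'url')."""
    groups = defaultdict(list)
    for prefix, identifier in CITEKEY.findall(text):
        groups[prefix].append(identifier)
    return dict(groups)


keys = group_citekeys(TEXT)
print(keys)
```

Grouping keys this way also gives a quick count of how many references of each type a draft contains.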
content/60.methods.md
Outdated
#### Collaborative Writing and Manuscript Generation
Few researchers in biological and medical fields are trained in version control tools such as git <!--To Do: Find some sort of reference for this, software carpentry?-->
The current project was managed through GitHub [@url:https://github.com/greenelab/covid19-review] using Manubot [@doi:10.1371/journal.pcbi.1007128] to continuously generate a version of the manuscript online [@url:https://greenelab.github.io/covid19-review].
Moved to discussion
AppVeyor build 1.0.3997 for commit 889a954 is now complete.
Found 15 potential spelling error(s). Preview:
- content/09.evolution.md:83:nonsynonymous
- content/09.evolution.md:139:LVNA
- content/23.vaccines-app.md:15:IgGs
- content/23.vaccines-app.md:387:IgGs
- content/60.methods.md:8:CCS
- content/60.methods.md:64:ECRs
- content/60.methods.md:80:Manubot's
- content/60.methods.md:110:scite
- content/60.methods.md:142:scite
- content/60.methods.md:159:ECR
- content/60.methods.md:160:ECRs
- content/60.methods.md:164:docx
- content/60.methods.md:166:scite...
Thanks for all the changes. Let's merge this so we can start splitting up individual sections. Please give me direction on where I can help most, whether that is resolving TODOs or something else.
Daniel's notebooks at https://github.com/greenelab/meta-review/tree/master/analyses/deep-review-contrib give some examples of how to do contributor and content analysis for the results.
Thank you @agitter! I put this on Gitter, but if you want to look at either the methods for functionalities that you added (@vincerubinetti is going to send us a paragraph on scite) or exporting in LaTeX, I can work on the analysis/visualization this morning!
[ci skip] This build is based on 05acaf5. This commit was created by the following CI build and job: https://github.com/greenelab/covid19-review/commit/05acaf59cd3a3701df8ce0ad113773e154774d82/checks https://github.com/greenelab/covid19-review/runs/788980132
Description of the proposed additions or changes
For reasons involving a job that might be posted soon, we are hoping to pivot the methods section to ACM - BCB. Unfortunately the deadline is 4/30! (Hopefully it will be pushed back though!)
Here is my very rough draft outline of the methods section as we could potentially present it for this conference (= Computer Science version of a journal for those used to bio publishing!)
@mprobson has agreed to take a look and give feedback on how to structure this type of content for an ACM journal. Based on this feedback, I'll start hitting the To Dos!