Lancet exploration notebook series: Part 1 TCGA-PBTA comparison variations #557
This looks pretty good, and I only have a few places where I think some DRYing would be useful.
My biggest comment is that the graphs here don't look bad? The TCGA data does seem to have higher TMB? Or am I looking at something wrong? If not, what is the difference here from previous analyses?
I am specifically looking at https://github.com/AlexsLemonade/OpenPBTA-analysis/blob/3b3fd1c3b715da590179aa40a8c13c7316c0db18/analyses/snv-callers/lancet-wxs-tests/plots/tcga-vs-pbta-plots/tmb-cdf-Consensus.png, which looks as I would have expected, vs plots that were produced previously.
…out recent issues
if (is_tcga) {
  df <- df %>%
    # Shorten the Tumor_Sample_Barcode so it matches
    dplyr::mutate(Tumor_Sample_Barcode = substr(Tumor_Sample_Barcode, 0, 12))
}
I think you can just do this always, and save an argument. Taking the substr of a 12 character barcode will just return the barcode.
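To illustrate the point about `substr`, here is a minimal sketch in base R. The barcodes and the helper name `shorten_barcode` are hypothetical and used only for illustration; they are not taken from the PR's data or code. R treats a start position of 0 the same as 1, so the sketch uses 1 for clarity.

```r
# Hypothetical barcodes, for illustration only
full_barcode <- "TCGA-02-0001-01A-01D-1490-08"  # longer than 12 characters
short_barcode <- "TCGA-02-0001"                 # already 12 characters

# Hypothetical helper: keep at most the first 12 characters of a barcode
shorten_barcode <- function(barcode) {
  substr(barcode, 1, 12)
}

# A long barcode is trimmed to its first 12 characters
shorten_barcode(full_barcode)

# A barcode that is already 12 characters (or shorter) comes back unchanged
identical(shorten_barcode(short_barcode), short_barcode)
```

Because the call is a no-op on strings of 12 characters or fewer, it could be applied unconditionally, which is what would make the `is_tcga` argument unnecessary. This assumes no non-TCGA barcode longer than 12 characters needs to be kept whole.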
You have taken suggestions to my satisfaction, especially given the fact that this analysis is out of date and has been replaced by more current analyses. Given all that, I am happy to approve this PR in its current form. 🤓
Purpose/implementation Section
What scientific question is your analysis addressing?
This first notebook of three explores how different callers lead to different PBTA-TCGA comparisons.
There are three questions, each with an accompanying plot:
What was your approach?
Respectively:
What GitHub issue does your pull request address?
#548
Directions for reviewers. Tell potential reviewers what kind of feedback you are soliciting.
Two headlining notes before you get into the nitty gritty:
Which areas should receive a particularly close look?
Some parts of this analysis could be DRYed up, but it will also be retired as soon as we get it confirmed. Let me know which parts of this code clearly need work and which we can leave as is.
Results
What is your summary of the results?
Here's the rendered notebook:
https://cansavvy.github.io/openpbta-notebook-concept/snv-callers/explore-tcga-pbta.nb.html
The results of this notebook were added to this PDF report:
TCGAvsPBTAconsensus.pdf
Reproducibility Checklist
Documentation Checklist
This analysis is not permanent and has not been added to the main READMEs
README and it is up to date.
analyses/README.md and the entry is up to date.