Lancet exploration notebook series: Part 1 TCGA-PBTA comparison variations #557

cansavvy · 2020-02-25T17:16:26Z

Purpose/implementation Section

What scientific question is your analysis addressing?

This first notebook of three explores how different callers lead to different PBTA-TCGA comparisons.

There are three questions, each with an accompanying plot:

Is the read depth different between PBTA and TCGA?
Do the TMB comparison results change if we calculate TMB with each caller by itself?
How much do the TCGA and PBTA target WXS regions overlap?

What was your approach?

Respectively:

Plot the read densities for TCGA and PBTA
Recreate the TMB comparison CDF plot for each caller
Create Venn diagrams of the overlaps of the target regions
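
As a rough illustration of the second step, here is a minimal sketch of calculating TMB per caller and drawing its empirical CDF with base R. The data frame, its column names (`dataset`, `caller`, `n_mutations`), the caller labels, and the 30 Mb region size are all hypothetical toy values chosen for the example; they are not taken from the notebook, which may compute and plot TMB differently.

```r
# Toy per-sample mutation counts; every value and column name here is hypothetical
counts <- data.frame(
  sample      = paste0("S", 1:8),
  dataset     = rep(c("PBTA", "TCGA"), each = 4),
  caller      = rep(c("lancet", "mutect2"), times = 4),
  n_mutations = c(12, 15, 30, 28, 90, 110, 150, 140)
)

# Assumed size of the surveyed target region, in megabases
region_size_mb <- 30

# TMB = mutations per megabase of surveyed sequence
counts$tmb <- counts$n_mutations / region_size_mb

# One empirical CDF per caller/dataset combination
# (list names take the form "caller.dataset", e.g. "lancet.PBTA")
tmb_cdfs <- lapply(
  split(counts$tmb, list(counts$caller, counts$dataset)),
  ecdf
)

# Overlay the two datasets for a single caller on a shared axis
plot(tmb_cdfs[["lancet.PBTA"]], xlim = range(counts$tmb),
     main = "TMB CDF (lancet only)", xlab = "TMB (mutations / Mb)")
lines(tmb_cdfs[["lancet.TCGA"]], col = "red")
```

Repeating the last two calls for each caller gives one CDF plot per caller, which is the comparison the second question asks about.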

What GitHub issue does your pull request address?

#548

Directions for reviewers. Tell potential reviewers what kind of feedback you are soliciting.

Two headlining notes before you get into the nitty gritty:

This PR is huge. Let me know if you want me to split it up.
Note that this analysis is gonna be retired almost as soon as we get it in (and it's already outdated given the BED file changes that are still in progress: "Updated analysis: bedtools creation of bed files for TMB" #564 and "Bed files intersection fixes" #566).

Which areas should receive a particularly close look?

There are some parts of this analysis that could be DRY'ed up, but it also will be retired as soon as we get it confirmed. Let me know what parts of this code clearly need work and which we will leave as is.

Results

What is your summary of the results?

Here's the rendered notebook:
https://cansavvy.github.io/openpbta-notebook-concept/snv-callers/explore-tcga-pbta.nb.html

The results of this notebook were added to this PDF report:

TCGAvsPBTAconsensus.pdf

Reproducibility Checklist

The dependencies required to run the code in this pull request have been added to the project Dockerfile.
This analysis has been added to continuous integration.

Documentation Checklist

This analysis is not permanent and has not been added to the main READMEs

This analysis module has a README and it is up to date.
This analysis is recorded in the table in analyses/README.md and the entry is up to date.
The analytical code is documented and contains comments.


jashapiro

This looks pretty good, and I only have a few places where I think some DRYing would be useful.

My biggest comment is that the graphs here don't look bad? The TCGA data does seem to have higher TMB? Or am I looking at something wrong? If not, what is the difference here from previous analyses?

I am specifically looking at https://github.com/AlexsLemonade/OpenPBTA-analysis/blob/3b3fd1c3b715da590179aa40a8c13c7316c0db18/analyses/snv-callers/lancet-wxs-tests/plots/tcga-vs-pbta-plots/tmb-cdf-Consensus.png, which looks as I would have expected, vs plots that were produced previously.


jashapiro · 2020-03-05T15:34:25Z

analyses/snv-callers/lancet-wxs-tests/explore-tcga-pbta.Rmd

+  if (is_tcga) {
+    df <- df %>% 
+      # Shorten the Tumor_Sample_Barcode so it matches
+      dplyr::mutate(Tumor_Sample_Barcode = substr(Tumor_Sample_Barcode, 0, 12))
+  } 


I think you can just do this always and save an argument. Taking the substr of a 12-character barcode will just return the barcode.
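
A quick sketch of why the unconditional call is safe, using two made-up barcodes (the values below are illustrative, not taken from the data):

```r
# Hypothetical barcodes: one already 12 characters, one full-length TCGA-style
short_barcode <- "TCGA-02-0001"
long_barcode  <- "TCGA-02-0001-01A-01D-0182-01"

# R treats a start position of 0 the same as 1, so this keeps characters 1-12
substr(short_barcode, 0, 12)  # unchanged, it is already 12 characters
substr(long_barcode, 0, 12)   # truncated to the first 12 characters

# substr() is vectorized, so the same call can run on every sample
# without checking which dataset a row came from
barcodes <- c(short_barcode, long_barcode)
trimmed <- substr(barcodes, 0, 12)
identical(trimmed[1], trimmed[2])
```

Both barcodes end up as the same 12-character string, so the `is_tcga` argument and the `if` block are not needed for this step.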

jashapiro

You have taken suggestions to my satisfaction, especially given the fact that this analysis is out of date and has been replaced by more current analyses. Given all that, I am happy to approve this PR in its current form. 🤓

cansavvy added 6 commits February 25, 2020 11:51

Add exploration notebook

b84db10

Add to CircleCI

315d8d3

update gitiugnore to include a file for this analysis

50bd252

Update notebooks

869171b

Add only the one ref file

6f8f56b

Fix circleCI file path

6b4c1ff

cansavvy commented Feb 25, 2020

.circleci/config.yml (outdated)

Update .circleci/config.yml file path

e3b32f6

cansavvy added the "work in progress" label (used for non-draft pull requests that are not yet ready for review) Feb 25, 2020

cansavvy added 3 commits February 25, 2020 15:45

Rearrange order of CircleCI tests

63bed5f

Add better documentation

86a0a92

Fix date

b58941a

jaclyn-taroni mentioned this pull request Feb 26, 2020

TCGA vs PBTA exploratory analysis #548

Closed

5 tasks

cansavvy added 7 commits February 27, 2020 18:27

Merge branch 'master' into lancet-tests-1

9b3c9be

Merge branch 'master' into lancet-tests-1

2b46d95

Merge remote-tracking branch 'upstream/master' into lancet-tests-1

b13e7bf

Update the notebook and plots with redone data

3ccd0c6

Merge remote-tracking branch 'upstream/master' into lancet-tests-1

4e08f3e

Merge branch 'master' into lancet-tests-1

62828b7

fix file reference

90fcee9

cansavvy removed the "work in progress" label Mar 3, 2020

Merge branch 'master' into lancet-tests-1

3b3fd1c

jashapiro reviewed Mar 4, 2020


cansavvy added 7 commits March 4, 2020 16:46

Merge remote-tracking branch 'upstream/master' into lancet-tests-1

1afb5d4

Incorporate some @jashapiro suggestions and add more documentation about recent issues

0bd853c

Merge remote-tracking branch 'origin/lancet-tests-1' into lancet-tests-1

7e0fb4e

Fix plot rendering

a504ca3

Merge branch 'master' into lancet-tests-1

f5b0cb7

Undo accidental bolding

7c63ca9

Push changes of function

da85445

push refreshed notebook

1a2f956

jashapiro reviewed Mar 5, 2020


jashapiro approved these changes Mar 5, 2020


cansavvy added 2 commits March 5, 2020 11:32

change to 0, 12 substr for all samples and refresh notebook

7e2f36e

Merge branch 'master' into lancet-tests-1

584d7fa

cansavvy mentioned this pull request Mar 6, 2020

Functionalize CDF TMB Plot #612

Merged

jaclyn-taroni added 2 commits March 6, 2020 11:24

Merge branch 'master' into lancet-tests-1

8536e84

Merge branch 'master' into lancet-tests-1

17d4557

jaclyn-taroni merged commit 787309a into AlexsLemonade:master Mar 6, 2020

cansavvy deleted the lancet-tests-1 branch March 25, 2020 20:00
