This repository has been archived by the owner on Jun 21, 2023. It is now read-only.
PR 1 of n: Molecular Subtyping - HGG (Defining Lesions) #352
Merged: jaclyn-taroni merged 21 commits into AlexsLemonade:master from cbethell:hgg-molecular-subtyping-data-prep on Jan 4, 2020
Changes from 17 commits
Commits
e39ca95 Add `01-HGG-molecular-subtyping-data-prep.Rmd` (cbethell)
ecc924f Fix command in `.circleci` (cbethell)
ed68af3 Merge branch 'master' into hgg-molecular-subtyping-data-prep (cbethell)
02ba79e Minor `lintr` format changes (cbethell)
254a279 Merge branch 'master' into hgg-molecular-subtyping-data-prep (cbethell)
c865960 Log2 transform expression data (cbethell)
2981f7c Merge branch 'master' into hgg-molecular-subtyping-data-prep (cbethell)
5ec08c4 Use `controlfreec` cn data (cbethell)
fda9e31 Merge branch 'master' of https://github.com/cbethell/OpenPBTA-analysi… (cbethell)
e2befa3 Merge branch 'master' into hgg-molecular-subtyping-data-prep (jaclyn-taroni)
cd9bd8e Merge branch 'master' into hgg-molecular-subtyping-data-prep (cbethell)
f0b7e4c Merge branch 'hgg-molecular-subtyping-data-prep' of https://github.co… (cbethell)
a1a132e Create a column better distinguishing specific HGG mutations (cbethell)
3a5adb4 Merge branch 'master' of https://github.com/cbethell/OpenPBTA-analysi… (cbethell)
85b4b13 Change `01` nb to look only at HGG defining lesions (cbethell)
2885f1f Edit analysis in `.circleci` to reflect nb name change (cbethell)
900f803 Remove unused lines of code (cbethell)
8bdd4c6 Update code to reflect V12 change (cbethell)
e5e4cf2 Merge branch 'master' into hgg-molecular-subtyping-data-prep (jaclyn-taroni)
a10cded Address @jharenza comments (jaclyn-taroni)
caf52e3 Add to modules at a glance table (jaclyn-taroni)
169 changes: 169 additions & 0 deletions
analyses/molecular-subtyping-HGG/01-HGG-molecular-subtyping-defining-lesions.Rmd
@@ -0,0 +1,169 @@
---
title: "High-Grade Glioma Molecular Subtyping - Defining Lesions"
output:
  html_notebook:
    toc: TRUE
    toc_float: TRUE
author: Chante Bethell for ALSF CCDL
date: 2019
---

This notebook looks at the defining lesions for all samples for the issue of
molecular subtyping high-grade glioma samples in the OpenPBTA dataset.

# Usage

This notebook is intended to be run via the command line from the top directory
of the repository as follows:

`Rscript -e "rmarkdown::render('analyses/molecular-subtyping-HGG/01-HGG-molecular-subtyping-defining-lesions.Rmd', clean = TRUE)"`

# Set Up

```{r}
# Get `magrittr` pipe
`%>%` <- dplyr::`%>%`
```

## Directories and Files

```{r}
# Detect the ".git" folder -- this will be in the project root directory.
# Use this as the root directory to ensure proper sourcing of functions no
# matter where this is called from
root_dir <- rprojroot::find_root(rprojroot::has_dir(".git"))

# File path to results directory
results_dir <-
  file.path(root_dir, "analyses", "molecular-subtyping-HGG", "results")

if (!dir.exists(results_dir)) {
  dir.create(results_dir)
}

# Read in metadata
metadata <-
  readr::read_tsv(file.path(root_dir, "data", "pbta-histologies.tsv"))

# Select wanted columns in metadata for merging and assign to a new object
select_metadata <- metadata %>%
  dplyr::select(sample_id,
                Kids_First_Participant_ID,
                Kids_First_Biospecimen_ID,
                disease_type_new)

# Read in snv consensus mutation data
snv_df <-
  data.table::fread(file.path(root_dir,
                              "data",
                              "pbta-snv-consensus-mutation.maf.tsv.gz"))
```

## Custom Function

```{r}
# Custom datatable function
# Function code adapted from: https://github.com/AlexsLemonade/OpenPBTA-analysis/blob/49acc98f5ffd86853fc70f220623311e13e3ca9f/analyses/collapse-rnaseq/02-analyze-drops.Rmd#L23
viewDataTable <- function(data) {
  DT::datatable(
    data,
    rownames = FALSE,
    filter = "bottom",
    class = "cell-border stripe",
    options = list(
      pageLength = 5,
      searchHighlight = TRUE,
      scrollX = TRUE,
      dom = "tpi",
      initComplete = htmlwidgets::JS(
        "function(settings, json) {",
        "$(this.api().table().header()).css({'background-color':
        '#004467', 'color': '#fff'});",
        "}"
      )
    )
  )
}
```

# Prepare Data

## SNV consensus mutation data - defining lesions

```{r}
# Filter the snv consensus mutation data for the target lesions
snv_lesions_df <- snv_df %>%
  dplyr::select(Tumor_Sample_Barcode, Hugo_Symbol, HGVSp_Short) %>%
  dplyr::mutate(
    H3F3A.K28M = dplyr::case_when(Hugo_Symbol == "H3F3A" &
                                    HGVSp_Short == "p.K28M" ~ "Yes",
                                  TRUE ~ "No"),
    HIST1H3B.K28M = dplyr::case_when(
      Hugo_Symbol == "HIST1H3B" & HGVSp_Short == "p.K28M" ~ "Yes",
      TRUE ~ "No"
    ),
    H3F3A.G35R = dplyr::case_when(Hugo_Symbol == "H3F3A" &
                                    HGVSp_Short == "p.G35R" ~ "Yes",
                                  TRUE ~ "No"),
    H3F3A.G35V = dplyr::case_when(Hugo_Symbol == "H3F3A" &
                                    HGVSp_Short == "p.G35V" ~ "Yes",
                                  TRUE ~ "No")
  )

# Join the selected variables from the metadata with the snv consensus mutation
# and defining lesions data.frame
snv_lesions_df <- snv_lesions_df %>%
  dplyr::left_join(select_metadata,
                   by = c("Tumor_Sample_Barcode" = "Kids_First_Biospecimen_ID")) %>%
  dplyr::select(
    Kids_First_Participant_ID,
    sample_id,
    Kids_First_Biospecimen_ID = Tumor_Sample_Barcode,
    dplyr::everything(),
    -HGVSp_Short,
    -Hugo_Symbol
  ) %>%
  dplyr::distinct() %>%
  dplyr::mutate(
    disease_type_reclassified = dplyr::case_when(
      H3F3A.K28M == "Yes" |
        HIST1H3B.K28M == "Yes" |
        H3F3A.G35R == "Yes" |
        H3F3A.G35V == "Yes" ~ "High-grade glioma",
      TRUE ~ as.character(disease_type_new)
    )
  )

# Display `snv_lesions_df`
snv_lesions_df
```

## Save final table of results

```{r}
# Save final data.frame to file
readr::write_tsv(snv_lesions_df,
                 file.path(results_dir, "HGG_defining_lesions.tsv"))
```

## Inconsistencies in disease classification

```{r}
# Isolate the samples that should be reclassified as HGG
hgg_samples <- snv_lesions_df %>%
  dplyr::filter(
    disease_type_reclassified == "High-grade glioma" &
      disease_type_new != "High-grade glioma;astrocytoma (WHO grade III/IV)"

With the v12 data I think this should be
Suggested change

  )

# Display the reclassified samples
viewDataTable(hgg_samples)
```

# Session Info

```{r}
# Print the session information
sessionInfo()
```

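As a rough illustration of the flag-then-`distinct()` pattern in the notebook above, the sketch below runs the same steps on a toy table. The barcodes and mutations are made up for illustration and are not OpenPBTA data. Because the flag is computed per mutation row, a biospecimen that carries the lesion alongside other mutations keeps both a "Yes" row and a "No" row after `dplyr::distinct()`.

```r
# Toy illustration (made-up barcodes and mutations, not OpenPBTA data) of the
# flag-then-distinct pattern used in the notebook above.
`%>%` <- dplyr::`%>%`

toy_maf <- data.frame(
  Tumor_Sample_Barcode = c("BS_1", "BS_1", "BS_2"),
  Hugo_Symbol = c("H3F3A", "TP53", "TP53"),
  HGVSp_Short = c("p.K28M", "p.R273C", "p.R175H"),
  stringsAsFactors = FALSE
)

toy_lesions <- toy_maf %>%
  # Flag each mutation row individually, as the notebook does
  dplyr::mutate(
    H3F3A.K28M = dplyr::case_when(
      Hugo_Symbol == "H3F3A" & HGVSp_Short == "p.K28M" ~ "Yes",
      TRUE ~ "No"
    )
  ) %>%
  # Drop the per-mutation columns and collapse identical rows
  dplyr::select(-Hugo_Symbol, -HGVSp_Short) %>%
  dplyr::distinct()

toy_lesions
```

In this toy case `BS_1` appears twice in the result, once per distinct flag value, so any per-sample summary built on such a table would need a further collapsing step.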
6,113 changes: 6,113 additions & 0 deletions
analyses/molecular-subtyping-HGG/01-HGG-molecular-subtyping-defining-lesions.nb.html
Large diffs are not rendered by default.
Can you add an additional column here for the `molecular_subtype` - something like:
`HGG, H3 K28 mutant` or `High-grade glioma, H3 K28 mutant`
`HGG, H3 G35 mutant` or `High-grade glioma, H3 G35 mutant`
I've made that update in a10cded. Note that this table is not the final table from this module, but an interim product #352 (review).
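
One way the requested `molecular_subtype` column could be derived from the Yes/No lesion columns is sketched below. This is a minimal illustration on toy data, not the change made in a10cded: the sample IDs are hypothetical, the label strings are taken from the reviewer's examples, and leaving samples without a defining lesion as `NA` is an assumption.

```r
# Hypothetical sketch: derive a `molecular_subtype` label from the Yes/No
# lesion columns. The toy data are illustrative assumptions, not module output.
`%>%` <- dplyr::`%>%`

lesions_df <- data.frame(
  sample_id = c("S1", "S2", "S3", "S4"),
  H3F3A.K28M = c("Yes", "No", "No", "No"),
  HIST1H3B.K28M = c("No", "Yes", "No", "No"),
  H3F3A.G35R = c("No", "No", "Yes", "No"),
  H3F3A.G35V = c("No", "No", "No", "No"),
  stringsAsFactors = FALSE
)

subtyped_df <- lesions_df %>%
  dplyr::mutate(
    molecular_subtype = dplyr::case_when(
      # Either K28M lesion maps to the K28 mutant label
      H3F3A.K28M == "Yes" | HIST1H3B.K28M == "Yes" ~ "HGG, H3 K28 mutant",
      # Either G35 lesion maps to the G35 mutant label
      H3F3A.G35R == "Yes" | H3F3A.G35V == "Yes" ~ "HGG, H3 G35 mutant",
      # Assumption: no defining lesion leaves the subtype unassigned
      TRUE ~ NA_character_
    )
  )

subtyped_df
```

`dplyr::case_when()` evaluates its conditions in order, so a sample flagged for both a K28 and a G35 lesion would receive the K28 label under this sketch.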