forked from AlexsLemonade/OpenPBTA-analysis
Commit
Add CRANIO, ADAM subtyping notebook per AlexsLemonade#994
1 parent
8431b1e
commit 272d918
Showing 3 changed files with 3,327 additions and 0 deletions.
119 changes: 119 additions & 0 deletions
analyses/molecular-subtyping-pathology/clinical-subtyping-craniopharyngioma.Rmd
@@ -0,0 +1,119 @@
---
title: "Recoding adamantinomatous craniopharyngiomas"
output:
  html_notebook:
    toc: true
    toc_float: true
author: JN Taroni for ALSF CCDL (code)
date: 2021
---

_Background adapted from [#994](https://github.com/AlexsLemonade/OpenPBTA-analysis/issues/994)_

Some craniopharyngioma samples may have been annotated as "To be classified" in `molecular-subtyping-CRANIO` because they lack canonical mutations.
However, the _CTNNB1_ (β-catenin) SNV is not present in all adamantinomatous craniopharyngioma samples ([ref](https://doi.org/10.1093/jnen/nlw116)):

> In our cohort of [adamantinomatous craniopharyngiomas] specimens from 117 patients we found _CTNNB1_ mutations in 89 cases (76.1%).

Some samples are described as `Adamantinomatous` in their pathology reports, so we can update the `harmonized_diagnosis` and `molecular_subtype` information accordingly.

## Set up

### Libraries

```{r}
library(tidyverse)
```

### Input

```{r}
data_dir <- file.path("..", "..", "data")
histologies_file <- file.path(data_dir, "pbta-histologies.tsv")
```

### Output

```{r}
results_dir <- "results"
if (!dir.exists(results_dir)) {
  dir.create(results_dir)
}
output_file <- file.path(results_dir, "cranio_adam_subtypes.tsv")
```

## Read in data

```{r}
histologies_df <- read_tsv(histologies_file)
```

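`read_tsv()` guesses column types from a limited number of rows (controlled by `guess_max`), so a label column whose inspected values are all missing may not be read as character. The following is a minimal sketch, not part of the original notebook, of pinning one column's type explicitly. The temporary file and its two rows are made up for illustration and stand in for `pbta-histologies.tsv`.

```r
library(readr)

# Hypothetical toy file; the contents are invented for illustration
toy_histologies_file <- tempfile(fileext = ".tsv")
write_lines(
  c("sample_id\tmolecular_subtype",
    "S1\tNA",
    "S2\tCRANIO, To be classified"),
  toy_histologies_file
)

# Guess every column except molecular_subtype, which is forced to character
toy_read_df <- read_tsv(
  toy_histologies_file,
  col_types = cols(
    .default = col_guess(),
    molecular_subtype = col_character()
  )
)

toy_read_df
```

Whether this is needed for the real histologies file depends on the installed readr version and on where the missing values fall, so treat it as an optional safeguard.
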
### Samples to be reclassified

The instructions on [#994](https://github.com/AlexsLemonade/OpenPBTA-analysis/issues/994) are to reclassify using specific sample identifiers, because the reclassification is based on pathology report review.

Here are the same filtering steps that are performed on the issue itself; we'll save the result in a new data frame.

```{r}
acp_df <- histologies_df %>%
  # Same logic as on the issue!
  filter(pathology_diagnosis == "Craniopharyngioma",
         molecular_subtype == "CRANIO, To be classified",
         str_detect(str_to_lower(pathology_free_text_diagnosis),
                    "adamantinomatous")) %>%
  select(sample_id,
         pathology_free_text_diagnosis,
         molecular_subtype) %>%
  distinct()
acp_df
```

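To make the filter's behavior concrete, here is a minimal sketch on a made-up data frame (the `toy_df` rows and sample IDs are hypothetical, not OpenPBTA data). It illustrates two points: lowercasing the free text before `str_detect()` makes the match case-insensitive, and rows with a missing free-text diagnosis are dropped because `filter()` discards rows where a condition evaluates to `NA`.

```r
library(dplyr)
library(stringr)

# Hypothetical toy data; these are not real OpenPBTA samples
toy_df <- tibble(
  sample_id = c("S1", "S2", "S3", "S4"),
  pathology_diagnosis = "Craniopharyngioma",
  molecular_subtype = c("CRANIO, To be classified",
                        "CRANIO, To be classified",
                        "CRANIO, ADAM",
                        "CRANIO, To be classified"),
  pathology_free_text_diagnosis = c("Adamantinomatous craniopharyngioma",
                                    "craniopharyngioma, papillary",
                                    "adamantinomatous craniopharyngioma",
                                    NA)
)

toy_acp_df <- toy_df %>%
  filter(pathology_diagnosis == "Craniopharyngioma",
         molecular_subtype == "CRANIO, To be classified",
         # str_detect() returns NA for the missing free text in S4,
         # so filter() drops that row
         str_detect(str_to_lower(pathology_free_text_diagnosis),
                    "adamantinomatous"))

# S1 is kept despite the capital "A"; S2 lacks the term;
# S3 already has a subtype; S4 has no free text
toy_acp_df$sample_id
```
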
We can pull the `sample_id` values out and use them in our next steps.

```{r}
sample_ids_reclassification <- acp_df %>%
  pull(sample_id)
```

## Recode adamantinomatous craniopharyngiomas

We can use the following table from [#994](https://github.com/AlexsLemonade/OpenPBTA-analysis/issues/994) to guide how we recode the labels for these samples.

| broad_histology | short_histology | harmonized_diagnosis | molecular_subtype |
|-------------------------|-------------------|------------------------------------|-------------------|
| Tumors of sellar region | Craniopharyngioma | Adamantinomatous craniopharyngioma | CRANIO, ADAM |

```{r}
cranio_adam_df <- histologies_df %>%
  filter(sample_id %in% sample_ids_reclassification) %>%
  # Filter to relevant ID and disease type label columns
  select(Kids_First_Biospecimen_ID,
         Kids_First_Participant_ID,
         sample_id,
         broad_histology,
         short_histology,
         harmonized_diagnosis,
         molecular_subtype) %>%
  # Code the values that are in the table above
  mutate(
    broad_histology = "Tumors of sellar region",
    short_histology = "Craniopharyngioma",
    harmonized_diagnosis = "Adamantinomatous craniopharyngioma",
    molecular_subtype = "CRANIO, ADAM"
  )
```

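One consequence of selecting by `sample_id` is that every biospecimen row sharing a selected `sample_id` is recoded, not only the rows that matched the free-text filter. The sketch below shows this on a made-up data frame; the `BS_*` and `S*` identifiers are hypothetical, and only two of the four label columns are recoded to keep the example short.

```r
library(dplyr)

# Hypothetical identifiers for illustration only
toy_histologies_df <- tibble(
  Kids_First_Biospecimen_ID = c("BS_A", "BS_B", "BS_C"),
  sample_id = c("S1", "S1", "S2"),
  harmonized_diagnosis = "Craniopharyngioma",
  molecular_subtype = "CRANIO, To be classified"
)
toy_sample_ids <- c("S1")

toy_recoded_df <- toy_histologies_df %>%
  # Keeps both biospecimens that share sample_id S1
  filter(sample_id %in% toy_sample_ids) %>%
  # Overwrite the labels with constant values, as in the table above
  mutate(
    harmonized_diagnosis = "Adamantinomatous craniopharyngioma",
    molecular_subtype = "CRANIO, ADAM"
  )

toy_recoded_df
```

Comparing `nrow(cranio_adam_df)` with `length(sample_ids_reclassification)` in the notebook would be one way to see how many biospecimens each sample identifier contributes.
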
Write to file!

```{r}
write_tsv(cranio_adam_df, output_file)
```

## Session Info

```{r}
sessionInfo()
```