This repository has been archived by the owner on Jun 21, 2023. It is now read-only.

Mol-subtyping using pathology_free_text_diagnosis reported subtype (Meningioma) update (#1047)

* add path_free_text subtypes to mol subtype integrate

* rerun bindrows

* rerun harm_dx meningioma

* rerun to add tumor_descriptor

* Update analyses/molecular-subtyping-pathology/pathology_free_text-subtyping-meningioma.Rmd

Co-authored-by: Jo Lynne Rokita <jharenza@gmail.com>

* adding order to output

* Update run-subtyping-aggregation.sh

* Update run-subtyping-aggregation.sh

* Update pathology_free_text-subtyping-meningioma.Rmd

Co-authored-by: Jo Lynne Rokita <jharenza@gmail.com>
kgaonkar6 and jharenza authored May 3, 2021
1 parent fbdb9a0 commit b847973
Showing 7 changed files with 2,583 additions and 992 deletions.
@@ -49,11 +49,11 @@ base_histology <- read_tsv(file.path(data_dir,"pbta-histologies-base.tsv"),

### Read molecular-subtyping-pathology results

Reading molecular_subtype, integrated_diagnosis, short_histology, broad_histology and Notes from `compiled_molecular_subtypes_with_clinical_pathology_feedback.tsv`
Reading molecular_subtype, integrated_diagnosis, short_histology, broad_histology and Notes from `compiled_molecular_subtypes_with_clinical_pathology_feedback_and_report_info.tsv`

```{r}
compiled_subtyping<-read_tsv(file.path("..", "molecular-subtyping-pathology", "results", "compiled_molecular_subtypes_with_clinical_pathology_feedback.tsv"))
compiled_subtyping<-read_tsv(file.path("..", "molecular-subtyping-pathology", "results", "compiled_molecular_subtypes_with_clinical_pathology_feedback_and_report_info.tsv"))
```

@@ -208,5 +208,6 @@ histology %>%
-broad_histology.subtyped,
- short_histology.base,
-short_histology.subtyped) %>%
arrange(Kids_First_Biospecimen_ID) %>%
write_tsv("results/pbta-histologies.tsv")
```
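
The added `arrange(Kids_First_Biospecimen_ID)` sorts the rows before they are written. One plausible benefit, assumed here rather than stated in the commit, is that the output file no longer depends on the order in which rows happen to arrive. A minimal sketch with made-up data (the `rows_a`, `rows_b`, and `value` names are hypothetical):

```r
library(dplyr)
library(readr)

# Toy rows in two different input orders; values are illustrative only
rows_a <- tibble::tibble(
  Kids_First_Biospecimen_ID = c("BS_2", "BS_1"),
  value = c("b", "a")
)
rows_b <- rows_a[c(2, 1), ]

file_a <- tempfile(fileext = ".tsv")
file_b <- tempfile(fileext = ".tsv")

# Sorting on the ID column before writing makes both files line-for-line equal
rows_a %>% arrange(Kids_First_Biospecimen_ID) %>% write_tsv(file_a)
rows_b %>% arrange(Kids_First_Biospecimen_ID) %>% write_tsv(file_b)

same_output <- identical(read_lines(file_a), read_lines(file_b))
```

With a stable row order, later reruns should produce diffs that reflect changed values rather than reshuffled rows.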


1,832 changes: 916 additions & 916 deletions analyses/molecular-subtyping-integrate/results/pbta-histologies.tsv

Large diffs are not rendered by default.

@@ -20,13 +20,17 @@ We will use this notebook to do so; see [#993](https://github.com/AlexsLemonade/
```{r}
# Pipes
library(magrittr)
library(tidyverse)
```

### Input

```{r}
data_dir <- file.path("..", "..", "data")
histologies_file <- file.path(data_dir, "pbta-histologies.tsv")
results_dir <- file.path("results")
histologies_file <- file.path(data_dir, "pbta-histologies-base.tsv")
compiled_mol_subtypes_pathology_clinical_file <- file.path(results_dir,
"compiled_molecular_subtypes_with_clinical_pathology_feedback.tsv")
```

### Output
@@ -37,13 +41,14 @@ if (!dir.exists(results_dir)) {
dir.create(results_dir)
}
output_file <- file.path(results_dir, "meningioma_subtypes.tsv")
output_file <- file.path(results_dir, "compiled_molecular_subtypes_with_clinical_pathology_feedback_and_report_info.tsv")
```

## Read in data

```{r}
histologies_df <- readr::read_tsv(histologies_file, guess_max = 10000)
compiled_mol_subtypes_pathology_clinical_df <- read_tsv(compiled_mol_subtypes_pathology_clinical_file, guess_max = 10000)
```

### Display `pathology_free_text_diagnosis` values
@@ -77,11 +82,10 @@ meningioma_df <- histologies_df %>%
dplyr::select(Kids_First_Biospecimen_ID,
Kids_First_Participant_ID,
sample_id,
pathology_diagnosis,
tumor_descriptor,
pathology_free_text_diagnosis,
broad_histology,
short_histology,
harmonized_diagnosis) %>%
short_histology) %>%
# To smooth the way for string detection for the pathology free text, we
# add a column where all of the text is converted to lowercase
dplyr::mutate(pathology_free_text_dx_lower =
@@ -100,21 +104,29 @@ meningioma_df <- histologies_df %>%
# This fallback applies when none of the conditions above are met
# (i.e., none of the strings are detected)
TRUE ~ "Meningioma"
)
),
Notes = "Updated via OpenPBTA subtyping from pathology_free_text_diagnosis"
) %>%
# Drop the column we added for convenience of string detection
dplyr::select(-pathology_free_text_dx_lower)
# and to format to match compiled_mol_subtypes_pathology_clinical_df
dplyr::select(-pathology_free_text_dx_lower,
-pathology_free_text_diagnosis)
```
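
The string-detection step in this chunk can be sketched on toy data as follows. The search strings, the subtype labels, and the `toy_subtype` and `free_text_lower` column names are hypothetical; the conditions actually used in the notebook are not visible in this diff.

```r
library(dplyr)
library(stringr)

# Hypothetical free-text values, not taken from the PBTA histologies file
toy_df <- tibble::tibble(
  sample_id = c("S1", "S2", "S3"),
  pathology_free_text_diagnosis = c("Atypical Meningioma",
                                    "clear cell meningioma",
                                    "Meningioma")
)

toy_subtyped <- toy_df %>%
  mutate(
    # Lowercase copy so that matching does not depend on capitalization
    free_text_lower = str_to_lower(pathology_free_text_diagnosis),
    toy_subtype = case_when(
      str_detect(free_text_lower, "atypical") ~ "Atypical meningioma",
      str_detect(free_text_lower, "clear cell") ~ "Clear cell meningioma",
      # Fallback when none of the strings above are detected
      TRUE ~ "Meningioma"
    )
  ) %>%
  # Drop the helper column once the labels are assigned
  select(-free_text_lower)
```

`case_when()` evaluates its conditions in order and uses the first match, so more specific strings would need to be listed before more general ones.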

Write to file!
Since Meningioma was not subtyped as part of OpenPBTA molecular subtyping, we can append the meningioma results to `compiled_mol_subtypes_pathology_clinical_df`.

```{r}
readr::write_tsv(meningioma_df, output_file)
meningioma_df %>%
bind_rows(compiled_mol_subtypes_pathology_clinical_df) %>%
arrange(Kids_First_Biospecimen_ID) %>%
readr::write_tsv(output_file)
```
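
As a rough illustration of the append step, `bind_rows()` matches columns by name and fills any column missing from one input with `NA`. The toy tables below are hypothetical and much smaller than the real ones; only the `Kids_First_Biospecimen_ID`, `tumor_descriptor`, `molecular_subtype`, and `Notes` column names come from the diff.

```r
library(dplyr)

# Toy stand-in for meningioma_df (values are made up)
meningioma_toy <- tibble::tibble(
  Kids_First_Biospecimen_ID = c("BS_C", "BS_A"),
  tumor_descriptor = c("Initial CNS Tumor", "Progressive"),
  Notes = "Updated via OpenPBTA subtyping from pathology_free_text_diagnosis"
)

# Toy stand-in for the compiled subtypes table (values are made up)
compiled_toy <- tibble::tibble(
  Kids_First_Biospecimen_ID = "BS_B",
  molecular_subtype = "toy_subtype",
  Notes = NA_character_
)

combined_toy <- meningioma_toy %>%
  # Columns absent from one table are filled with NA in its rows
  bind_rows(compiled_toy) %>%
  arrange(Kids_First_Biospecimen_ID)
```

In this sketch the meningioma rows end up with `NA` in `molecular_subtype`, and the compiled row has `NA` in `tumor_descriptor`, which is why the select step formats `meningioma_df` to match the compiled table before appending.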

## Session Info

```{r}
sessionInfo()
```
