This repository has been archived by the owner on Jun 21, 2023. It is now read-only.

PBTA Histologies: Fusion filtering base (3 of N) #865

Merged
merged 14 commits into from
Jan 6, 2021
Changes from 5 commits
24 changes: 12 additions & 12 deletions analyses/collapse-rnaseq/02-analyze-drops.nb.html

Large diffs are not rendered by default.

Binary file not shown.
Empty file modified analyses/collapse-rnaseq/run-collapse-rnaseq.sh
100644 → 100755
Empty file.
13 changes: 13 additions & 0 deletions analyses/fusion_filtering/04-project-specific-filtering.Rmd
@@ -39,6 +39,14 @@ params:
    label: "results folder for *tsv files"
    value: results
    input: string
  base_run:
    label: "1/0 to run with base histology"
    value: 0
    input: integer
  base_histology:
    label: "Base histology file"
    value: data/pbta-histologies-base.tsv
    input: file
---


@@ -88,8 +96,13 @@ fusion_calls<-QCGeneFiltered_filtFusion %>% mutate(FusionName=rm_between(.data$F
group<-params$group

# get histology file
if ( params$base_run ==0 ){
Collaborator
Looks like the only difference here is whether params$base_histology or params$histology is used, so I would reduce the if statement to setting a histology_file variable and then pass that variable to the code chunk, rather than repeating the chunk twice.

  clinical<-read.delim(file.path(root_dir, params$histology), stringsAsFactors = FALSE)
  clinical<-clinical[,c("Kids_First_Biospecimen_ID","Kids_First_Participant_ID",group)]
} else {
  clinical<-read.delim(file.path(root_dir, params$base_histology), stringsAsFactors = FALSE)
  clinical<-clinical[,c("Kids_First_Biospecimen_ID","Kids_First_Participant_ID",group)]
}
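
The reviewer's suggestion above can be sketched as follows. This is a minimal illustration, not the code that was merged: `root_dir`, `params`, and `group` stand in for objects the notebook already defines, and the two small histology files are placeholders written to a temporary directory so the snippet runs on its own.

```r
# Hypothetical stand-ins for values the notebook already defines.
root_dir <- tempdir()
params <- list(
  base_run = 0,
  histology = "pbta-histologies.tsv",
  base_histology = "pbta-histologies-base.tsv"
)
group <- "broad_histology"

# Placeholder histology tables (not real data) so the sketch is self-contained.
toy <- data.frame(
  Kids_First_Biospecimen_ID = c("BS_placeholder_1", "BS_placeholder_2"),
  Kids_First_Participant_ID = c("PT_placeholder_1", "PT_placeholder_2"),
  broad_histology = c("placeholder A", "placeholder B"),
  extra_column = c("x", "y"),
  stringsAsFactors = FALSE
)
for (f in c(params$histology, params$base_histology)) {
  write.table(toy, file.path(root_dir, f),
              sep = "\t", quote = FALSE, row.names = FALSE)
}

# Choose the file once, then read and subset in a single code path.
histology_file <- if (params$base_run == 0) params$histology else params$base_histology
clinical <- read.delim(file.path(root_dir, histology_file), stringsAsFactors = FALSE)
clinical <- clinical[, c("Kids_First_Biospecimen_ID", "Kids_First_Participant_ID", group)]
```

With the choice isolated in `histology_file`, a later change to the read or subset step only has to be made in one place.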

# Least number of callers
numCaller<-params$numCaller
23 changes: 13 additions & 10 deletions analyses/fusion_filtering/04-project-specific-filtering.nb.html

Large diffs are not rendered by default.

@@ -19,6 +19,14 @@ params:
    label: "results folder for pbta-fusion-putative-oncogenic.tsv files"
    value: results
    input: string
  base_run:
    label: "1/0 to run with base histology"
    value: 0
    input: integer
  base_histology:
    label: "Base histology file"
    value: data/pbta-histologies-base.tsv
    input: file

---

@@ -69,8 +77,14 @@ fusion_calls<-read_tsv(file.path(root_dir,params$dataPutativeFusion))
outputfolder<-params$outputfolder

#### get histology file
if ( params$base_run ==0 ){
Collaborator
Same comment here!

  clinical<-read_tsv(file.path(root_dir, params$histology), guess_max = 10000) %>%
    dplyr::select(Kids_First_Biospecimen_ID, Kids_First_Participant_ID, broad_histology)
} else {
  clinical<-read_tsv(file.path(root_dir, params$base_histology), guess_max = 10000) %>%
    dplyr::select(Kids_First_Biospecimen_ID, Kids_First_Participant_ID, broad_histology)
}
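
The same simplification applies to this readr/dplyr version. The sketch below assumes the `readr` and `dplyr` packages are installed; `root_dir` and `params` are hypothetical stand-ins for the notebook's own objects, and the two files hold placeholder rows (not real data) labelled so it is easy to see which file was read.

```r
library(readr)
library(dplyr)

# Hypothetical stand-ins for values the notebook already defines.
root_dir <- tempdir()
params <- list(
  base_run = 1,
  histology = "pbta-histologies.tsv",
  base_histology = "pbta-histologies-base.tsv"
)

# Write one placeholder file per option, tagged by its source.
make_toy <- function(tag) {
  data.frame(
    Kids_First_Biospecimen_ID = "BS_placeholder",
    Kids_First_Participant_ID = "PT_placeholder",
    broad_histology = tag,
    extra_column = "x",
    stringsAsFactors = FALSE
  )
}
write_tsv(make_toy("placeholder-full"), file.path(root_dir, params$histology))
write_tsv(make_toy("placeholder-base"), file.path(root_dir, params$base_histology))

# Pick the file once; the read-and-select pipeline then appears only once.
histology_file <- if (params$base_run == 0) params$histology else params$base_histology
clinical <- read_tsv(file.path(root_dir, histology_file), guess_max = 10000) %>%
  dplyr::select(Kids_First_Biospecimen_ID, Kids_First_Participant_ID, broad_histology)
```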


# add broad_histology to fusion
fusion_calls<-fusion_calls %>%

Large diffs are not rendered by default.

11 changes: 10 additions & 1 deletion analyses/fusion_filtering/README.md
@@ -48,7 +48,16 @@ The code to generate genelistreference.txt and fusionreference.txt is available
* pbta-fusion-recurrently-fused-genes-bysample.tsv

### Run script
`bash run_fusion_merged.sh`
Use `BASE_SUBTYPING=1` to run this module with `pbta-histologies-base.tsv` from the data folder when running the molecular-subtyping modules for a release:
```sh
BASE_SUBTYPING=1 bash run_fusion_merged.sh
```

By default, the module uses `pbta-histologies.tsv` from the data folder:
```sh
bash run_fusion_merged.sh
```
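
The contents of `run_fusion_merged.sh` are not shown in this diff, so how the flag reaches the notebooks is not confirmed here. As a rough sketch only, assuming the flag is visible to R as an environment variable named `BASE_SUBTYPING`, it could be converted into the integer `base_run` parameter like this (`render_params` is a hypothetical name):

```r
# Simulate launching with `BASE_SUBTYPING=1` (assumption: the flag is an env var).
Sys.setenv(BASE_SUBTYPING = "1")

# Fall back to 0 when the variable is unset, matching the documented default.
base_run <- as.integer(Sys.getenv("BASE_SUBTYPING", unset = "0"))

# Hypothetical parameter list shaped like the notebooks' YAML `params`.
render_params <- list(
  base_run = base_run,
  base_histology = "data/pbta-histologies-base.tsv"
)
```

A list like this is the kind of value that `rmarkdown::render()` accepts through its `params` argument.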


#### Order of scripts in analysis
`01-fusion-standardization.R` : Standardizes fusion calls from STARFusion and Arriba
27,882 changes: 14,147 additions & 13,735 deletions analyses/fusion_filtering/results/FilteredFusion.tsv

Large diffs are not rendered by default.

4,026 changes: 2,102 additions & 1,924 deletions analyses/fusion_filtering/results/pbta-fusion-putative-oncogenic.tsv

Large diffs are not rendered by default.

@@ -1,6 +1,5 @@
FusionName broad_histology count
KIAA1549--BRAF Low-grade astrocytic tumor 109
KIAA1549--BRAF Low-grade astrocytic tumor 114
Collaborator
I'm assuming these result changes are expected, but I'm bringing them to attention so we can check.

Collaborator Author
Yup, these are expected, since some samples were updated to Low-grade. More info is in Jo Lynne's comment here: #865 (comment)

C11orf95--RELA Ependymal tumor 25
EWSR1--FLI1 Mesenchymal non-meningothelial tumor 7
KIAA1549--BRAF Neuronal and mixed neuronal-glial tumor 6
EWSR1--FLI1 Mesenchymal non-meningothelial tumor 5
REV3L--FYN Diffuse astrocytic and oligodendroglial tumor 5