update molecular subtyping pathology #854
Conversation
Oh yes, BS_EE73VE7V (1 of the Progressive Disease Post-Mortem samples for PT_KTRJ8TFY) only had the histone variant called in VarDict, so it was lost in consensus calling and the sample got subtyped as H3 wildtype. Ticket https://github.com/d3b-center/bixu-tracker/issues/463 shows the IGV view of the variant. Should we add that to the clinical/pathology module as H3 mutant along with the other updates?
-update MB, EPN, HGAT
I think we can use case_when() calls here for assigning subtypes since we have multiple options. I added some suggestions for changing ifelse calls to case_when()
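As a rough illustration of this suggestion, the sketch below contrasts nested ifelse() calls with a single case_when() call. The data frame `toy_df`, its columns `h3_status` and `braf_status`, and the subtype labels are made up for the example; they are not the module's actual columns or subtyping rules.

```r
library(dplyr)

# Hypothetical input: not the module's real columns or values
toy_df <- tibble(
  sample_id = c("S1", "S2", "S3"),
  h3_status = c("K28M", "wildtype", NA),
  braf_status = c("wildtype", "V600E", "wildtype")
)

# Nested ifelse(): each extra option adds another level of nesting,
# and NA has to be guarded explicitly
nested <- toy_df %>%
  mutate(subtype = ifelse(
    !is.na(h3_status) & h3_status == "K28M", "H3 K28 mutant",
    ifelse(!is.na(braf_status) & braf_status == "V600E",
           "BRAF V600E", "To be classified")
  ))

# case_when(): conditions are checked top to bottom and the first TRUE wins;
# a condition that evaluates to NA is skipped
flat <- toy_df %>%
  mutate(subtype = case_when(
    h3_status == "K28M" ~ "H3 K28 mutant",
    braf_status == "V600E" ~ "BRAF V600E",
    TRUE ~ "To be classified"
  ))
```

Both versions assign the same labels on this toy input; the case_when() form stays flat as more options are added.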
analyses/molecular-subtyping-pathology/01-compile-subtyping-results.Rmd
…sults.Rmd change ifelse() to case_when() Co-authored-by: Krutika Gaonkar <34580719+kgaonkar6@users.noreply.github.com>
update Notes to be in sync with new subtypes
update comment about adding int dx/broad/short hist here
For now, since we do not have clinical sequencing to confirm this, let's leave this out of the clinical module. I expect this will be captured when #819 from @migbro goes in, in a few weeks.
update and remove duplicates from final file
@kgaonkar6 I updated the code based on your comments, updated Notes, and added the changes due to pathology feedback in the 03 script. I then get 1487 rows in the final file, but only 1481 biospecimens. It seems like 6 more have duplicates, but when I check the final file for duplicates using
I get the table below (which does not show me any duplicate BS_IDs). Maybe I am overlooking something in the code.
Thanks @jharenza! From your comment above I also looked into 03 now and it seems the following updates will remove the duplicated rows.
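library(dplyr)
# Annotation columns to check, one at a time
annotation_cols <- unlist(stringr::str_split("integrated_diagnosis,broad_histology,short_histology,molecular_subtype", ","))
# Count rows per (biospecimen ID, annotation value) pair in compiled_df
# and keep only the pairs that occur on more than one row
get_count <- function(x) {
  compiled_df %>%
    dplyr::group_by(Kids_First_Biospecimen_ID, !!as.name(x)) %>%
    dplyr::tally() %>%
    dplyr::filter(n > 1)
}
lapply(annotation_cols, get_count)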
The above check returned no rows, so we now have only 1 value per bs_id in compiled_df, which is expected.
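One caveat about this kind of check: grouping by both the biospecimen ID and the annotation value flags rows where the same ID/value pair is repeated, but a biospecimen carrying two different values forms two groups of size one and would pass. A complementary check, sketched here on a made-up `toy_compiled_df` (hypothetical IDs and data, not the module's output; `get_conflicts` is an illustrative helper name), counts distinct values per biospecimen instead.

```r
library(dplyr)

# Hypothetical data: BS_2 has two different molecular_subtype values
toy_compiled_df <- tibble(
  Kids_First_Biospecimen_ID = c("BS_1", "BS_2", "BS_2", "BS_3"),
  molecular_subtype = c("ST_EPN_RELA", "EPN, H3 K28", "EPN, To be classified", NA)
)

# Count distinct values of one annotation column per biospecimen and
# keep only the biospecimens that have more than one
get_conflicts <- function(df, col) {
  df %>%
    group_by(Kids_First_Biospecimen_ID) %>%
    summarise(n_values = n_distinct(.data[[col]]), .groups = "drop") %>%
    filter(n_values > 1)
}

conflicts <- get_conflicts(toy_compiled_df, "molecular_subtype")
```

Running both checks together would cover exact repeated rows as well as conflicting annotations for the same biospecimen.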
Additionally, EPN molecular_subtype values sometimes deviate from the format of values like "HGG, wildtype" / "HGG, K28M". Do we want to update that (in the EPN update PR)?
# A tibble: 5 x 1
molecular_subtype
<chr>
1 EPN, To be classified
2 ST_EPN_RELA
3 ST_EPN_YAP1
4 EPN, H3 K28
5 PT_EPN_A
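If these labels were to be harmonized, one possible approach is a lookup passed to dplyr::recode(). This is only a sketch: the target labels below (for example "EPN, ST RELA") are placeholders chosen to mirror the "HGG, ..." pattern, and the actual names would be decided in the EPN PR.

```r
library(dplyr)

# The five values from the table above
epn_df <- tibble(
  molecular_subtype = c("EPN, To be classified", "ST_EPN_RELA",
                        "ST_EPN_YAP1", "EPN, H3 K28", "PT_EPN_A")
)

# Placeholder mapping from current labels to "EPN, <subtype>" style labels
epn_lookup <- c(
  "ST_EPN_RELA" = "EPN, ST RELA",
  "ST_EPN_YAP1" = "EPN, ST YAP1",
  "PT_EPN_A" = "EPN, PT A"
)

# recode() leaves values that are not in the lookup unchanged
epn_recoded <- epn_df %>%
  mutate(molecular_subtype = recode(molecular_subtype, !!!epn_lookup))
```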
analyses/molecular-subtyping-pathology/03-incorporate-pathology-feedback.Rmd
Sure, sounds like we can make that update.
…y-feedback.Rmd add updates from code review Co-authored-by: Krutika Gaonkar <34580719+kgaonkar6@users.noreply.github.com>
…y-feedback.Rmd fix typo
close chunk and rerun subtyping
@kgaonkar6 I applied the changes, fixed the typo and one missed chunk closure, and reran, so I requested your review once more. Looks like all of the duplicates are now fixed.
Looks good! Does all the required tasks in #854
Purpose/implementation Section
What scientific question is your analysis addressing?
Update molecular subtyping pathology module. NOTE: #851, #853, and #852 should be merged and run first.
What was your approach?
What GitHub issue does your pull request address?
#719, #784, #667
Directions for reviewers. Tell potential reviewers what kind of feedback you are soliciting.
Which areas should receive a particularly close look?
integrated_diagnosis as NA if there is no subtype
PT_AQWDQW27 seems to be missing from the EPN subtyping module and was noted as a sample which is a rare meningioma but was subtyped as EPN. Will create a ticket to assess the EPN module.
Is there anything that you want to discuss further?
Note: PT_KTRJ8TFY still has two different subtypes for samples which are Progressive Disease Post-Mortem - @kgaonkar6 will you check that sample to be sure it should end up this way from clinical feedback? This was one sample which did not have a clinical sequencing report, so this may be expected.
Note: PT_6Q0NPVP3 also has duplicate subtypes, but this will be taken care of once #851 is merged.
Is the analysis in a mature enough form that the resulting figure(s) and/or table(s) are ready for review?
yes
Results
What types of results are included (e.g., table, figure)?
tables
What is your summary of the results?
1497 rows in the final table, but this will change as the modules ahead of this one are merged
Reproducibility Checklist
Documentation Checklist
README and it is up to date.
analyses/README.md and the entry is up to date.