-
Notifications
You must be signed in to change notification settings - Fork 83
PBTA Histologies v19 Part 3 of N: Update fusion filtering #1041
PBTA Histologies v19 Part 3 of N: Update fusion filtering #1041
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There seem to be a lot of lines removed here in the recurrent lists, more than I would have expected. I initially thought this might be due to sample changes in the independent samples analysis, but then I would have expected some additions, not just removals.
FilteredFusion.tsv
only seems to remove BS_JXF8A2A6
as expected, but other files have more samples removed, which I was not expecting.
Is it possible that there was a mismatch between the data files & sample list for some part of this module, making it so the changes in #1040 might not be fully reflected here?
One other note: pbta-fusion-putative-oncogenic.tsv
has more changes than I would have expected. Spot checking it looks like most of this is ordering changes, but I didn't do a careful analysis.
Thanks for the review! I was not able to figure out the issue before while filing the PR but I think I figured out: This code needs to be change cohort name from Just wanted to also mention that all in #1028 I actually run the modules related to each subsequent module by running the https://github.com/kgaonkar6/OpenPBTA-analysis/blob/v19-release/scripts/run-for-subtyping.sh so the changes are propogated. But added each module by themselves in |
Updating the code to use PNOC instead of PNOC003 fixed those unwanted changes. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This looks good now, but I'm still a bit confused by the large number of changes in pbta-fusion-putative-oncogenic.tsv
As I said, I don't think it is a problem; it seems like there might be just an ordering change there.
If you want to fix it, it looks like just adding an arrange(Sample, FusionName)
at line 211 of
05-QC_putative_onco_fusion_dustribution.Rmd
would do the trick to prevent this from coming up again in the future, and make it easier to distinguish any changes that do occur after this point. (You might want to fix the typo in the file name while you are at it. 😉 )
@kgaonkar6 Thanks for the updates in efe4de5... you will want to make sure to delete the old files as well, so GH recognizes it as a move rather than new files. |
Ah yes removed those old files! All these commits for this module and I never saw that file name typo 😅 |
Purpose/implementation Section
What scientific question is your analysis addressing?
fusion_filtering
results/pbta-fusion-putative-oncogenic.tsv
(included in data download)results/pbta-fusion-recurrent-fusion-byhistology.tsv
(included in data download)results/pbta-fusion-recurrent-fusion-bysample.tsv
(included in data download)What was your approach?
Just re-run as part of the scripts/run-for-subtyping.sh in kgaonkar6:v19-release in #1028
Then create a new branch for module specific review:
git checkout v19-release analyses/fusion_filtering/
What GitHub issue does your pull request address?
#867
Directions for reviewers. Tell potential reviewers what kind of feedback you are soliciting.
Which areas should receive a particularly close look?
Not fully sure why the samples are missing from
pbta-fusion-recurrent-fusion-bysample.tsv
Is there anything that you want to discuss further?
Is the analysis in a mature enough form that the resulting figure(s) and/or table(s) are ready for review?
Results
What types of results are included (e.g., table, figure)?
What is your summary of the results?
Reproducibility Checklist
Documentation Checklist
README
and it is up to date.analyses/README.md
and the entry is up to date.