Fix v15 breaking changes #574
Comments
Because of the extent of these breaking changes, I believe it's prudent to handle each module in a separate pull request as stated above. Unfortunately, that means that CI will fail for a bunch of these fixes until the last one goes in. So here is the general procedure I think we should follow:
This procedure has a significant weakness in that there may be changes introduced in any one fix that will cause CI to fail unexpectedly once the final fix goes in and this issue is closed. Once this issue gets closed, #569 should be updated such that it is in sync with
So we can keep track of progress on these, I took @jaclyn-taroni's list above and made it into a checklist. We can claim items and then check things off as we fix them. I'll start by claiming this first item. I'll put the PR number next to it too when I get it filed.
v15 breaking changes TODO list
I just realized that the
Okay. Well I just started working on it now, I'll see if that's it.
@jaclyn-taroni you were right. It is fine if the first script is run. #587
Okay 👍 - would love to see those organized such that the step that was failing was immediately after the step that it depends on (perhaps after v15 is out). I think that would have increased the chances that I would have noticed it immediately.
I was about to just make this change when I had the branch open, but I didn't know if there were particular sequential orders to some of the other tests and didn't want to throw another possible wrench in our testing here. But yeah, we may even want to have a bash script that calls both and make it one CircleCI test.
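As a rough sketch of the "one bash script that calls both" idea, the wrapper below runs each step in order and stops at the first failure, so the dependent step never runs without the step it depends on. The function name `run_in_order` and the script names in the usage note are hypothetical, not files from this repository.

```bash
# Run each command string in order and stop at the first failure,
# so a dependent step never runs without the step it depends on.
run_in_order() {
  local step
  for step in "$@"; do
    echo "Running: ${step}"
    if ! sh -c "${step}"; then
      echo "Step failed: ${step}" >&2
      return 1
    fi
  done
}
```

A single CI test could then call something like `run_in_order "Rscript first-step.R" "Rscript second-step.R"`, with the two placeholder names swapped for the real scripts.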
@jaclyn-taroni In regards to
I don't know enough about this analysis module to make an informed decision on this. Do we want to retire it though?
@jashapiro - what do you think, time to retire
I think it can be deprecated. Will do that now.
We did it, everyone! Changes incorporated to master in #569
To quote the release notes being added in #569, we're changing the names of well-enough-used columns in the clinical file:
I know this change to `pbta-histologies.tsv` will break a number of things. The purpose of this issue is to track what will need to be changed as a result. Not only will the column names need to be updated, but we will also need to rerun any notebooks, change documentation, etc.
Anticipated issues
Here I'll list what I know needs to change in modules that are not deprecated.
Some of the modeling steps of `gene-set-enrichment-analysis` use `disease_type_new`:
OpenPBTA-analysis/analyses/gene-set-enrichment-analysis/02-model-gsea.Rmd
Line 123 in 286ff25
Luckily the `gsva_anova_tukey` function is already flexible!
OpenPBTA-analysis/analyses/gene-set-enrichment-analysis/util/hallmark_models.R
Line 35 in 286ff25
The first step of `interaction-plots` uses the `disease_type_new` column to generate lists of samples:
OpenPBTA-analysis/analyses/interaction-plots/scripts/01-disease-specimen-lists.R
Line 97 in 286ff25
Documentation associated with that option will also need to change.
We filter out ATRT and MB samples in `molecular-subtyping-embryonal` using `disease_type_old`
OpenPBTA-analysis/analyses/molecular-subtyping-embryonal/01-samples-to-subset.Rmd
Line 128 in 286ff25
and check `disease_type_new` as well:
OpenPBTA-analysis/analyses/molecular-subtyping-embryonal/01-samples-to-subset.Rmd
Line 136 in 286ff25
Also in this module, we use both of the `disease_type` columns quite a bit when subtyping and generating the final tables, starting around
OpenPBTA-analysis/analyses/molecular-subtyping-embryonal/04-table-prep.Rmd
Line 325 in 286ff25
The README for this module needs to change as well, along with this documentation:
OpenPBTA-analysis/analyses/molecular-subtyping-embryonal/02-generate-subset-files.R
Line 6 in 286ff25
The subset files step of `molecular-subtyping-EPN` uses `disease_type_new`:
OpenPBTA-analysis/analyses/molecular-subtyping-EPN/00-subset-for-EPN.R
Line 55 in 286ff25
In `molecular-subtyping-HGG`, we use `disease_type_new` quite a bit for classification based on defining lesions. Here's just one example from the first notebook:
OpenPBTA-analysis/analyses/molecular-subtyping-HGG/01-HGG-molecular-subtyping-defining-lesions.Rmd
Line 59 in 286ff25
`disease_type_new` is one of the "layers" associated with all of the plotting in `sample-distribution-analysis`:
https://github.com/AlexsLemonade/OpenPBTA-analysis/blob/286ff25022930024bb9812e3cfad5410a2cf49c8/analyses/sample-distribution-analysis/01-filter-across-types.R
https://github.com/AlexsLemonade/OpenPBTA-analysis/blob/286ff25022930024bb9812e3cfad5410a2cf49c8/analyses/sample-distribution-analysis/02-multilayer-plots.R
It is also used in the tables generated in `03-tumor-descriptor-and-assay-count`:
OpenPBTA-analysis/analyses/sample-distribution-analysis/03-tumor-descriptor-and-assay-count.Rmd
Line 206 in 286ff25
`selection-strategy-comparison` includes consideration of `disease_type_new`:
OpenPBTA-analysis/analyses/selection-strategy-comparison/01-selection-strategies.rmd
Line 169 in 286ff25
We may want to just deprecate this analysis at this point rather than try to maintain it?
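To help catch uses of the old column names that are not listed above, a recursive search over the analysis directory could be used. This is only a sketch: the helper name `find_column_uses` is made up for illustration, and it assumes a `grep` that supports `-r` (GNU and BSD `grep` both do).

```bash
# Print every file under a directory that mentions any of the given
# column names (fixed-string match), one path per line, de-duplicated.
find_column_uses() {
  local dir="$1" col
  shift
  for col in "$@"; do
    grep -rlF -- "$col" "$dir"
  done | sort -u
}
```

For example, `find_column_uses analyses disease_type_new disease_type_old` run from the repository root would list the files that still reference either column.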
Issues that have arisen as part of #576
`molecular-subtyping-chordoma` fails with the following:
That's from this chunk:
OpenPBTA-analysis/analyses/molecular-subtyping-chordoma/01-Subtype-chordoma.Rmd
Line 166 in 286ff25
I suspect what is actually happening is that there are no chordoma samples in the expression data used in CI and this step
OpenPBTA-analysis/analyses/molecular-subtyping-chordoma/01-Subtype-chordoma.Rmd
Line 147 in 286ff25
We may want to take an approach that is similar to other subtyping modules and have the first step be a script that generates files that consist only of chordoma samples that are committed to the repository.
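A minimal sketch of what such a subsetting step could look like, assuming the input is a tab-separated file with a header row. The module itself is written in R, so this shell version only illustrates the idea; `subset_tsv_by_column`, the column name, and the value shown in the usage note are all hypothetical.

```bash
# Keep the header plus the rows where the named column equals the value.
# Reads a tab-separated file on stdin and writes the subset to stdout.
subset_tsv_by_column() {
  awk -F '\t' -v col="$1" -v val="$2" '
    NR == 1 {
      # Locate the requested column in the header row.
      for (i = 1; i <= NF; i++) if ($i == col) idx = i
      if (!idx) { print "column not found: " col > "/dev/stderr"; exit 1 }
      print
      next
    }
    $idx == val
  '
}
```

Usage would be along the lines of `subset_tsv_by_column some_histology_column Chordoma < pbta-histologies.tsv > chordoma-subset.tsv`, where the column name and output path are placeholders.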
The `Add Shatterseek` step of `sv-analysis`, which is `Rscript analyses/sv-analysis/02-shatterseek.R`, fails with:
`analyses/sv-analysis/02-shatterseek.R` uses an independent specimen file, which is included in its entirety in CI, to read in files:
OpenPBTA-analysis/analyses/sv-analysis/02-shatterseek.R
Line 48 in 286ff25
The step that would have generated `scratch/sv-vcf/BS_K07KNTFY_withoutYandM.tsv` comes prior to this one in CI
OpenPBTA-analysis/.circleci/config.yml
Line 142 in 286ff25
So it will only have access to the subsetted Manta file. See #449 (comment) and #449 (comment) for more context. The `sv-analysis` module should be more robust to "missing" samples.
Next steps
@cansavvy @cbethell @jashapiro I'd recommend splitting this up such that modifications to each module are in separate pull requests, so you can go through and make sure you catch any documentation changes I may not have come across.
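For the `sv-analysis` failure described earlier, one way to tolerate "missing" samples is to check which per-sample files actually exist before reading them. The sketch below is a shell illustration of that check, not the module's R code: the helper name `existing_samples` is hypothetical, and the `_withoutYandM.tsv` suffix is taken from the file path quoted in this issue.

```bash
# Read sample IDs from stdin and print only those with a matching
# <id>_withoutYandM.tsv file in the given directory; warn about the rest.
existing_samples() {
  local sv_dir="$1" id
  while IFS= read -r id || [ -n "$id" ]; do
    [ -n "$id" ] || continue
    if [ -f "${sv_dir}/${id}_withoutYandM.tsv" ]; then
      echo "$id"
    else
      echo "Skipping ${id}: no file in ${sv_dir}" >&2
    fi
  done
}
```

The same existence check could be done inside the R script itself before each file is read, which is probably the more natural home for it.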