-
Notifications
You must be signed in to change notification settings - Fork 83
Add shell script to generate analysis files that are included in data releases #1421
Add shell script to generate analysis files that are included in data releases #1421
Conversation
…-analysis-generation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
👍
# Run hotspot detection | ||
echo "Run hotspots detection" | ||
bash ${analyses_dir}/hotspots-detection/run_overlaps_hotspots.sh |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Noting that this module (and output file pbta-snv-scavenged-hotspots.maf.tsv.gz
) appears to be missing from here but it should be there, right?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes they should, but maybe that should get fixed on #1424 and not here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'll open an issue. Edit: #1428
# Set the working directory to the directory of this file | ||
cd "$(dirname "${BASH_SOURCE[0]}")" | ||
|
||
# If RUN_LOCAL is used, the time-intensive steps are skipped because they cannot |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
and memory-intensive!
# Compile all the files that need to be included in the release in one place | ||
# in the scratch directory | ||
compiled_dir=${scratch_dir}/analysis_files_for_release | ||
mkdir -p ${compiled_dir} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would move these lines up before you start running modules, right after other directories are defined in lines 19-21
|
||
# Run GISTIC step -- only the part that generates ZIP file | ||
echo "Run GISTIC" | ||
# Run a step that subs ploidy for NA to allow GISTIC to run |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just a question about the whether the CI if lines might be needed here.
OpenPBTA-analysis/analyses/run-gistic/run-gistic-module.sh
Lines 9 to 32 in e31b58c
# If this is CI, run the example included with GISTIC | |
# The sample size for the subset files are too small otherwise | |
IS_CI=${OPENPBTA_CI:-0} | |
if [[ "$IS_CI" -gt "0" ]] | |
then | |
# Environmental variables for MCR | |
ORIG_LD_LIBRARY_PATH=$LD_LIBRARY_PATH | |
export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:/opt/mcr/v83/runtime/glnxa64:/opt/mcr/v83/bin/glnxa64:/opt/mcr/v83/sys/os/glnxa64 | |
export XAPPLRESDIR=/opt/mcr/v83/X11/app-defaults | |
# We want this to fail if the GISTIC example fails only -- because we have | |
# some instances of running GISTIC that do not complete but do save some | |
# output | |
set -e | |
set -o pipefail | |
# Run the example that comes with GISTIC - that allows us to | |
cd /home/rstudio/gistic_install && ./run_gistic_example | |
# 'Undo' environmental variables for MCR | |
export LD_LIBRARY_PATH=$ORIG_LD_LIBRARY_PATH | |
unset XAPPLRESDIR | |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't follow. I don't necessarily expect this script (scripts/generate-analysis-files-for-release.sh
) to get run in CI.
We don't use analyses/run-gistic/run-gistic-module.sh
at all here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't necessarily expect this script (scripts/generate-analysis-files-for-release.sh) to get run in CI.
Just looking for a confirmation of this, and whether it might ever change! 👍
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The remaining small feedback items I left are quick and at your discretion, so I don't need to see again 🚀
…9-analysis-generation
Continuing to break up the changes described in #1399 (comment)
Here I'm adding
scripts/generate-analysis-files-for-release.sh
, a shell script for generating analysis files that are included in the data download. As a reminder, an analysis file is defined as:It is intended to be used as part of a series of steps during the data release process, which will be as follows:
scripts/generate-analysis-files-for-release.sh
, which should generate all the analysis files for a release and put them inscratch/analysis_files_for_release
, and commit any changes to files that are included in the repository.scripts/run-for-subtyping.sh
and commit any changes to files that are included in the repository. (PR coming soon!)pbta-histologies.tsv
to the release