Doc.Singularity.Usecases.OSG.Example
Example Use-Case for Compiling JETSCAPE or X-SCAPE and Submitting Jobs to the HTCondor Scheduler on the Open Science Grid (OSG) While Using a Singularity Image to Satisfy the Dependencies
This use-case assumes that you already have an OSG account and have already logged in using an ssh terminal. If you don't already have an OSG account or have not already logged in, please refer to the OSG's introductory documentation: OSPool Documentation
This use-case applies to both JETSCAPE and X-SCAPE. The repository names and branch names can be adjusted as needed.
On many HPC clusters, the execution node (where the program runs) has access to your user account's home directory. However, because the OSG is a distributed system, the execution nodes cannot directly access your home directory. It is therefore necessary to send your compiled code to an execution node along with any input files.
The first two scripts, jetcomp.submit and jetcomp.sh, can be placed in your home directory and used to submit a job that clones JETSCAPE, compiles it, and returns the compiled code for use in subsequent job submissions.
universe = vanilla
executable = jetcomp.sh
Requirements = HAS_SINGULARITY == TRUE
+SingularityImage = "/cvmfs/singularity.opensciencegrid.org/jetscape/base:stable"
# return the compiled JETSCAPE archive from the worker node to your OSDF space
arguments = $(Cluster) $(Process)
transfer_output_files = JETSCAPE-EXE.tar.gz
OSDF_LOCATION = osdf:///ospool/apXX/data/your_osg_user_name
transfer_output_remaps = "JETSCAPE-EXE.tar.gz=$(OSDF_LOCATION)/JETSCAPE-EXE.tar.gz"
should_transfer_files = YES
when_to_transfer_output = ON_EXIT
output = job_$(Cluster)_$(Process).out
error = job_$(Cluster)_$(Process).err
log = job_$(Cluster)_$(Process).log
+JobDurationCategory = "Medium"
request_memory = 8.0GB
request_disk = 4.0GB
request_cpus = 1
queue 1
- This is the submission script (jetcomp.submit) that will be submitted to the scheduler to compile JETSCAPE.
- The line
executable = jetcomp.sh
calls a script that will perform the tasks of the job. Ensure that the jetcomp.sh file is placed in your home directory along with the jetcomp.submit file.
- The jetscape/base Docker image, which provides the dependencies needed for JETSCAPE, has already been associated with the OSG and is referenced in the above script.
- This job's output is a tar archive of the compiled JETSCAPE code, which is specified on the line
transfer_output_files = JETSCAPE-EXE.tar.gz
- The line
OSDF_LOCATION = osdf:///ospool/apXX/data/your_osg_user_name
defines the location of your "public" folder. The OSG requires large files (over 1GB) to be placed in your public folder instead of your private home directory. More details about this can be found here. On the above line, change your_osg_user_name to your OSG user name, and change XX to the specific AP number that the OSG assigned to your login node. You can move to your OSDF directory with
cd /ospool/apXX/data/your_osg_user_name/
and return to your home directory with
cd /home/your_osg_user_name
- The line
transfer_output_remaps = "JETSCAPE-EXE.tar.gz=$(OSDF_LOCATION)/JETSCAPE-EXE.tar.gz"
remaps the output file from being placed in your home directory to being placed in the public OSDF space. If you expect your output file to be under 1GB, you can omit this transfer_output_remaps line and the file will be placed in your home directory. Note that the OSDF is publicly accessible; see the OSG's documentation here for other important details.
- The line
+JobDurationCategory = "Medium"
specifies jobs expected to complete in under 10 hours. Use "Long" for jobs expected to complete in under 20 hours. See more details about job duration here.
- The line
queue 1
specifies that one instance of this job should be submitted.
- Submit this job to the scheduler from your home directory using this command:
condor_submit jetcomp.submit
- Check the status of your job with the command:
condor_q
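If your access point provides it, the condor_watch_q tool (included with recent HTCondor releases) gives a live-updating view of your queue without re-running condor_q:
condor_watch_q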
#!/bin/bash
# clone the repository
echo "Cloning the JETSCAPE repository"
git clone https://github.com/JETSCAPE/JETSCAPE.git
# checkout the desired branch
cd ${_CONDOR_SCRATCH_DIR}/JETSCAPE
git checkout main
echo "JETSCAPE repository cloned and main branch checked out"
# build the external packages
echo "Building the external packages"
cd ${_CONDOR_SCRATCH_DIR}/JETSCAPE/external_packages
./get_music.sh
./get_iSS.sh
./get_freestream-milne.sh
./get_smash.sh
echo "External packages built"
# build the JETSCAPE code
echo "Building the JETSCAPE code"
cd ${_CONDOR_SCRATCH_DIR}/JETSCAPE
if [ -d "build" ]; then
rm -rf "build"
fi
mkdir build
cd build
export SMASH_DIR="${_CONDOR_SCRATCH_DIR}/JETSCAPE/external_packages/smash/smash_code"
cmake .. -DUSE_MUSIC=ON -DUSE_ISS=ON -DUSE_FREESTREAM=ON -DUSE_SMASH=ON
make
echo "JETSCAPE code built"
# create an archive of the JETSCAPE code (built on the execute node)
echo "Creating an archive of the JETSCAPE code"
cd ${_CONDOR_SCRATCH_DIR}
tar -czf JETSCAPE-EXE.tar.gz JETSCAPE
echo "Archive of the JETSCAPE code created"
- This is the bash script that was called from the jetcomp.submit submission file. The work of this script is to clone JETSCAPE, check out a desired branch, download the external packages, build JETSCAPE, and create a tar archive of the compiled code to be returned at the end of the job.
- This script can be modified to clone X-SCAPE instead of JETSCAPE or to check out a different repository branch; a sketch of the X-SCAPE variant follows.
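For example, here is a minimal sketch of the lines in jetcomp.sh that would change to build X-SCAPE instead. This assumes the X-SCAPE repository follows the same layout; its default branch and external-package scripts may differ.
# clone X-SCAPE instead of JETSCAPE
git clone https://github.com/JETSCAPE/X-SCAPE.git
cd ${_CONDOR_SCRATCH_DIR}/X-SCAPE
git checkout main
# ...adjust the remaining JETSCAPE paths to X-SCAPE, then archive the result:
cd ${_CONDOR_SCRATCH_DIR}
tar -czf JETSCAPE-EXE.tar.gz X-SCAPE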
Now that the JETSCAPE code has been compiled and placed in a tar archive, that tar archive can be passed as input to subsequent jobs.
universe = vanilla
executable = jetjob.sh
Requirements = HAS_SINGULARITY == TRUE
+SingularityImage = "/cvmfs/singularity.opensciencegrid.org/jetscape/base:stable"
# transfer the compiled JETSCAPE archive from your OSDF space to the worker node
OSDF_LOCATION = osdf:///ospool/apXX/data/your_osg_user_name
transfer_input_files = $(OSDF_LOCATION)/JETSCAPE-EXE.tar.gz
arguments = $(Cluster) $(Process)
should_transfer_files = YES
when_to_transfer_output = ON_EXIT
output = $(ClusterId)_$(ProcId)/output.txt
error = job_$(Cluster)_$(Process).err
log = job_$(Cluster)_$(Process).log
transfer_output_files = test_out.tar.gz
transfer_output_remaps = "test_out.tar.gz = $(ClusterId)_$(ProcId)/test_out.tar.gz"
+JobDurationCategory = "Medium"
request_memory = 8.0GB
request_disk = 4.0GB
request_cpus = 1
queue 4
- This is the submission script (jetjob.submit) that will be submitted to the scheduler to run JETSCAPE.
- Be sure to update the OSDF_LOCATION variable just as it was done in jetcomp.submit.
- The transfer_input_files line assumes that the tar archive of the compiled JETSCAPE code is in your OSDF space. If the tar archive of the compiled code is instead in your home directory (because it is under 1GB), you may omit the OSDF path.
- The output, error, and log files are named using the job id variables, so multiple jobs won't overwrite files with the same name.
- Because this example doesn't expect the archived output files to be greater than 1GB, OSDF is not specified in the transfer_output_remaps line.
- The line
queue 4
means that four instances of this submission will run with the same input parameters. If you need each submission to have distinct input parameters, the OSG describes various approaches here; one common pattern is sketched below.
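As a sketch of that pattern: under queue 4, $(Process) takes the values 0 through 3, so a per-job input file can be selected by number. The config_$(Process).xml files here are hypothetical and not part of this example:
transfer_input_files = $(OSDF_LOCATION)/JETSCAPE-EXE.tar.gz, config_$(Process).xml
queue 4
Note that jetjob.sh already receives $(Cluster) and $(Process) as its first two arguments (cluster_id and proc_id below), so the script itself can also branch on the process number.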
#!/bin/bash
# capture the cluster and process ids passed as arguments from the submit file
cluster_id=$1
proc_id=$2
# extract tar archive
echo "Extracting the JETSCAPE code..."
cd ${_CONDOR_SCRATCH_DIR}
mv JETSCAPE-EXE.tar.gz JETSCAPE.tar.gz
tar -xzf JETSCAPE.tar.gz
rm JETSCAPE.tar.gz
echo "JETSCAPE code extracted"
# create a directory for the output
echo "Creating a directory for the output..."
mkdir ${_CONDOR_SCRATCH_DIR}/test_out
echo "Output directory created"
# set the environment variables
export SMASH_DIR="${_CONDOR_SCRATCH_DIR}/JETSCAPE/external_packages/smash/smash_code"
export JETSCAPE_DIR="${_CONDOR_SCRATCH_DIR}/JETSCAPE"
# run the job
echo "Running the job..."
cd ${_CONDOR_SCRATCH_DIR}/JETSCAPE/build
./runJetscape ../config/publications_config/arXiv_1910.05481/jetscape_user_PP_1910.05481.xml
echo "Job completed"
# move files beginning with test_out to the test_out directory
echo "Moving output files..."
mv test_out* ${_CONDOR_SCRATCH_DIR}/test_out/
echo "Output files moved"
# archive the output directory
echo "Archiving the output directory..."
cd ${_CONDOR_SCRATCH_DIR}
tar -czf test_out.tar.gz test_out/
echo "Output directory archived"
# clean up
echo "Cleaning up..."
rm -rf JETSCAPE
echo "Clean up complete"
- This is the bash script that was called from the jetjob.submit submission file. The work of this script is to extract the compiled JETSCAPE code, run JETSCAPE, and package the desired output files in a tar archive to be returned to the user at the end of the job.
- Below we see the output files test_out_final_state_hadrons.dat and test_out_final_state_partons.dat returned to the home directory after the run successfully finishes. 16595125_0 is a unique job id for this job.
[my_osg_user_name@apXX ~]$ tar -tvf 16595125_0/test_out.tar.gz
drwxr-xr-x osgusers/domain users 0 2024-09-08 11:46 test_out/
-rw-r--r-- osgusers/domain users 10720017 2024-09-08 11:46 test_out/test_out_final_state_hadrons.dat
-rw-r--r-- osgusers/domain users 1855471 2024-09-08 11:46 test_out/test_out_final_state_partons.dat
[my_osg_user_name@apXX ~]$
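The returned archive can then be unpacked in place:
tar -xzf 16595125_0/test_out.tar.gz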
If the code you wish to compile is included in a private repository or not yet pushed, clone or otherwise transfer the source code into your OSG home directory. There you can modify any of the source files before compiling.
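For example, a local working copy can be pushed to your OSG home directory with scp (a sketch; <your-access-point> is a placeholder for the access point hostname you normally ssh into):
# run this from your local machine, not the OSG
scp -r JETSCAPE your_osg_user_name@<your-access-point>:~/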
[my_osg_user_name@apXX JETSCAPE]$ ls
activate_jetscape.csh AUTHORS cmakemodules COPYING examples INSTALL.txt JetScapeDoxy.conf README.md
activate_jetscape.sh CMakeLists.txt config docker external_packages jail README_LINUX.md src
The jetcomp-alt.submit and jetcomp-alt.sh scripts (shown below) compile code that has been amended locally in the home directory.
Before submitting the compile job, create a tar archive of the JETSCAPE or X-SCAPE folder that you wish to compile. If the archive is under 1GB, it can be submitted from the home directory.
tar -czf JETSCAPE-ALT.tar.gz JETSCAPE
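Before submitting, you can confirm that the archive is under the 1GB limit:
ls -lh JETSCAPE-ALT.tar.gz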
universe = vanilla
executable = jetcomp-alt.sh
Requirements = HAS_SINGULARITY == TRUE
+SingularityImage = "/cvmfs/singularity.opensciencegrid.org/jetscape/base:stable"
# transfer JETSCAPE from home directory to the worker node
transfer_input_files = JETSCAPE-ALT.tar.gz
arguments = $(Cluster) $(Process)
transfer_output_files = JETSCAPE-ALT-EXE.tar.gz
should_transfer_files = YES
when_to_transfer_output = ON_EXIT
output = job_$(Cluster)_$(Process).out
error = job_$(Cluster)_$(Process).err
log = job_$(Cluster)_$(Process).log
+JobDurationCategory = "Medium"
request_memory = 8.0GB
request_disk = 4.0GB
request_cpus = 1
queue 1
- The above jetcomp-alt.submit script passes the tar archive JETSCAPE-ALT.tar.gz to an execution node for compilation.
- Because JETSCAPE-ALT.tar.gz is under 1GB, it can be passed from the home directory.
- The compiled code, also expected to be under 1GB, is returned to the home directory as JETSCAPE-ALT-EXE.tar.gz. This archive can be passed to other job submissions to run JETSCAPE.
#!/bin/bash
# extract tar archive
mv JETSCAPE-ALT.tar.gz JETSCAPE.tar.gz
tar -xzf JETSCAPE.tar.gz
rm JETSCAPE.tar.gz
# build the JETSCAPE code
cd ${_CONDOR_SCRATCH_DIR}/JETSCAPE
# if the build directory exists, remove it
if [ -d "build" ]; then
rm -rf "build"
fi
mkdir build
cd build
# add flags such as -DUSE_MUSIC=ON or -DUSE_SMASH=ON here if your archive includes the external packages
cmake ..
make
# create an archive of the JETSCAPE directory (built on the execute node)
cd ${_CONDOR_SCRATCH_DIR}
tar -czf JETSCAPE-ALT-EXE.tar.gz JETSCAPE
- With jetcomp-alt.submit, jetcomp-alt.sh, and JETSCAPE-ALT.tar.gz in your home directory, the compile job can be submitted with the command
condor_submit jetcomp-alt.submit
and monitored with
condor_q
- JETSCAPE-ALT-EXE.tar.gz, which contains the compiled code, is returned to your home directory at the end of the job; a sketch of how to use it in a run job follows.
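For example, to run with the amended build, a copy of jetjob.submit would take its input from the home directory instead of the OSDF, and the corresponding mv line in jetjob.sh would name the matching archive (a sketch under those assumptions):
# in the run job's submit file: transfer the archive from the home directory
transfer_input_files = JETSCAPE-ALT-EXE.tar.gz

# in the run script: rename the matching archive before extracting
mv JETSCAPE-ALT-EXE.tar.gz JETSCAPE.tar.gz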