Example run on UltraViolet (NYU Langone Health HPC)

javrodriguez edited this page Apr 11, 2024 · 5 revisions

HiC-Bench is designed to work on the NYULMC HPC UltraViolet cluster using the SLURM job scheduler. The pipeline can run in other environments, but doing so requires several modifications.

In this example we show how to set up and start a HiC-Bench run on this HPC using two samples: mouse embryonic stem cells (ES) and mouse embryonic fibroblasts (MEF).

1) Download and install HiC-Bench

module load git

git clone --depth 1 https://github.com/NYU-BFX/hic-bench.git

2) Organize the fastq files

i) First make a 'fastq' directory in 'hic-bench/pipelines/hicseq-standard/inputs/'

cd hic-bench/pipelines/hicseq-standard/inputs/

mkdir fastq

ii) Create directories for every sample and populate them with the respective fastq files.

cd fastq

ln -sn /gpfs/data/courses/bminga3004/2024/Practicum11/HiC/fastq/ES-untreated-Arima-rep1/ ./

ln -sn /gpfs/data/courses/bminga3004/2024/Practicum11/HiC/fastq/MEF-untreated-Arima-rep1/ ./

Recommended format: celltype-treatment-enzyme-replicate. This directory name format, with 4 fields separated by "-", helps fill in the sample sheet correctly in the next step. In this example, pre-formatted directories are already provided, so we only create symbolic links to them.
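As a quick sanity check before creating the sample sheet, the directory names can be tested against the four-field format. The sketch below is not part of HiC-Bench; `check_sample_name` and `check_fastq_dir` are hypothetical helper names, and the check assumes that no individual field contains a "-" of its own.

```shell
# check_sample_name NAME
# Succeeds only if NAME has exactly four non-empty fields separated by "-"
# (celltype-treatment-enzyme-replicate). Hypothetical helper, not a HiC-Bench script.
check_sample_name() {
    printf '%s\n' "$1" | awk -F'-' '
        NF == 4 { for (i = 1; i <= 4; i++) if ($i == "") exit 1; exit 0 }
        { exit 1 }
    '
}

# check_fastq_dir DIR
# Checks every subdirectory (or symlink to a directory) in DIR and reports
# names that do not fit the format. Returns non-zero if any name is rejected.
check_fastq_dir() {
    fastq_dir="$1"
    status=0
    for path in "$fastq_dir"/*/; do
        [ -d "$path" ] || continue          # skip the unexpanded glob if DIR is empty
        name=$(basename "$path")
        if ! check_sample_name "$name"; then
            echo "unexpected sample directory name: $name" >&2
            status=1
        fi
    done
    return "$status"
}
```

From inside the 'fastq' directory, `check_fastq_dir .` prints nothing when all sample directories follow the format.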

3) Create a sample sheet using the "create-sample-sheet.tcsh" script.

cd ..

./create-sample-sheet.tcsh mm10 # the only argument required is the genome version (e.g. hg19, hg38, mm10)

4) Check the sample sheet and edit it manually if needed
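One simple check is that every row of the sheet has the same number of columns as its first line. The sketch below assumes the sheet is a tab-separated text file; the file name `sample-sheet.tsv` used in the usage note is an assumption, so substitute whatever file the script actually wrote. `check_sheet` is a hypothetical helper, not a HiC-Bench command.

```shell
# check_sheet FILE
# Prints the line number of every non-empty row whose tab-separated field count
# differs from that of the first line; returns non-zero if any such row exists.
# Assumes FILE is tab-separated (an assumption, not confirmed by the pipeline docs).
check_sheet() {
    awk -F'\t' '
        NF == 0 { next }                      # ignore blank lines
        !seen   { expected = NF; seen = 1; next }
        NF != expected {
            printf "line %d: %d fields, expected %d\n", NR, NF, expected
            bad = 1
        }
        END { exit (bad ? 1 : 0) }
    ' "$1"
}
```

For example, `check_sheet sample-sheet.tsv` run from the 'inputs' directory prints nothing if all rows are consistent.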

5) Run the alignment step

Go to the alignment step directory 'hic-bench/pipelines/hicseq-standard/__01a-align' and start the alignment run:

cd ../__01a-align

sbatch submit_step_run.sh

6) Wait until the alignment is finished. Then run the filter step.
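One way to tell whether the alignment is still running is to look at your jobs in the SLURM queue with `squeue`. The sketch below wraps that in two hypothetical helpers; it counts all of your queued jobs, not only the alignment jobs, and it assumes that an empty queue means the step has finished, so checking the step's output as well is still advisable.

```shell
# count_my_jobs
# Prints how many jobs the current user has in the SLURM queue (pending or running).
# squeue -h drops the header line; -u restricts the listing to one user.
count_my_jobs() {
    squeue -h -u "${USER:-$(id -un)}" | wc -l | tr -d ' '
}

# wait_for_jobs [SECONDS]
# Polls the queue every SECONDS (default 60) and returns once it is empty.
wait_for_jobs() {
    interval="${1:-60}"
    while [ "$(count_my_jobs)" -gt 0 ]; do
        sleep "$interval"
    done
}
```

Calling `wait_for_jobs 300` blocks the shell and checks the queue every five minutes.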

Go to the filter step directory 'hic-bench/pipelines/hicseq-standard/__02a-filter' and start the filter run the same way:

cd ../__02a-filter

sbatch submit_step_run.sh

Once the filtering step has finished, you can run the next steps by following the same instructions. Some of the most commonly used steps are:

'__02b-filter-stats': Produce barplots with important QC information.

'__03a-tracks': Generate contact matrices in '.hic' format, a highly compressed binary format. These files can be visualized in the Juicebox Web App (https://www.aidenlab.org/juicebox/).

'__05a-matrix-filtered' and '__06a-matrix-ic': Make normalized contact matrices in standard text format.

'__09a-domains': Identify topologically associating domains (TADs).
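Since each later step is started the same way as the alignment step, the `cd` and `sbatch` pair can be wrapped in a small helper. This is only a sketch: `run_step` is a hypothetical function, and it assumes that every step directory contains a 'submit_step_run.sh' script like the alignment step does.

```shell
# run_step PIPELINE_DIR STEP
# Submits one pipeline step from inside its own directory, mirroring the manual
# "cd <step directory>" followed by "sbatch submit_step_run.sh".
# Assumes each step directory provides submit_step_run.sh (not verified here).
run_step() {
    pipeline_dir="$1"
    step="$2"
    if [ ! -d "$pipeline_dir/$step" ]; then
        echo "no such step directory: $pipeline_dir/$step" >&2
        return 1
    fi
    # Run in a subshell so the caller's working directory is left unchanged
    ( cd "$pipeline_dir/$step" && sbatch submit_step_run.sh )
}
```

For example, `run_step hic-bench/pipelines/hicseq-standard __02b-filter-stats` submits the QC step. As with alignment and filtering, wait for a step to finish before submitting one that depends on it.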