Simulates structural variants (SVs) of different types at different purity levels. This pipeline was used to generate test data for validating the read counting used in SVclone.
Install these dependencies:
- Java Development Kit
- Samtools -- make sure samtools is in your $PATH
- Rubra
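Before going further, it can help to confirm that the command-line dependencies are visible on your $PATH. The following is a minimal sketch, not part of the pipeline: the helper name `missing_tools` is hypothetical, and it assumes the JDK exposes a `java` executable and Samtools a `samtools` executable. Rubra is left out because how it is invoked depends on how you installed it.

```python
import shutil

# Executables assumed to be needed on PATH (an assumption based on the
# dependency list above; extend this tuple to match your own setup).
REQUIRED_TOOLS = ("java", "samtools")


def missing_tools(tools=REQUIRED_TOOLS):
    """Return the names in `tools` that cannot be found on PATH."""
    return [tool for tool in tools if shutil.which(tool) is None]


if __name__ == "__main__":
    missing = missing_tools()
    if missing:
        print("Not found on PATH: " + ", ".join(missing))
    else:
        print("All required tools were found on PATH.")
```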
The following commands clone this repository and install SimSeq and refpluspipeline:
git clone git@github.com:mcmero/sv_simu_pipe.git
cd sv_simu_pipe
git clone git@github.com:mcmero/refpluspipeline.git
cd refpluspipeline
./compile_javasim.sh
cd ..
git clone https://github.com/jstjohn/SimSeq.git
Create a reference directory under your sv_simu_pipe directory, and add the following files:
- hg19 chromosome reference (pick one chromosome; chromosome 12 is used in the example) -- generate an index with Bowtie2
- hg19 repeat track -- from UCSC (name it hg19_repeats.txt)
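A Bowtie2 index is built with `bowtie2-build <reference.fa> <index_prefix>`, which for a single chromosome writes six files ending in `.1.bt2`, `.2.bt2`, `.3.bt2`, `.4.bt2`, `.rev.1.bt2` and `.rev.2.bt2`. The sketch below checks that the reference directory holds everything listed above. The FASTA name `12.fa` and the index prefix `12` are placeholders chosen for illustration, as is the helper name; only `hg19_repeats.txt` comes from these instructions.

```python
import os

# Suffixes written by bowtie2-build for a small (non-large) index.
BT2_SUFFIXES = (".1.bt2", ".2.bt2", ".3.bt2", ".4.bt2", ".rev.1.bt2", ".rev.2.bt2")


def missing_reference_files(ref_dir, fasta_name="12.fa", index_prefix="12",
                            repeats_name="hg19_repeats.txt"):
    """Return the expected reference files that are absent from `ref_dir`.

    `fasta_name` and `index_prefix` are hypothetical defaults; use whatever
    names you gave the chromosome FASTA and its Bowtie2 index.
    """
    expected = [fasta_name, repeats_name]
    expected += [index_prefix + suffix for suffix in BT2_SUFFIXES]
    return [name for name in expected
            if not os.path.isfile(os.path.join(ref_dir, name))]


if __name__ == "__main__":
    missing = missing_reference_files("reference")
    if missing:
        print("Missing from reference/: " + ", ".join(missing))
    else:
        print("reference/ looks complete.")
```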
Configure the sv_simu_config.py configuration file, making sure the reference genome, repeats file, SimSeq directory and the directories for the Java simulator are set correctly.
Run as:
./batch_simu.sh
This will generate 20 simulations: 20%, 40%, 60%, 80% and 100% purity mixtures for each of four SV types (deletions, duplications, inversions and translocations). The simulated data can be used as input for SV callers, and the VAFs of the resulting calls can then be characterised using SVclone.
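The count of 20 is the product of five purity levels and four SV types. The sketch below enumerates that grid, which can be handy when checking that every expected simulation was produced. The labels and the function name are illustrative only and do not necessarily match the file or directory names written by batch_simu.sh.

```python
from itertools import product

# Purity levels (percent) and SV types taken from the description above.
PURITIES = (20, 40, 60, 80, 100)
SV_TYPES = ("deletion", "duplication", "inversion", "translocation")


def simulation_grid(sv_types=SV_TYPES, purities=PURITIES):
    """Return every (sv_type, purity) combination, grouped by SV type."""
    return list(product(sv_types, purities))


if __name__ == "__main__":
    grid = simulation_grid()
    for sv_type, purity in grid:
        print(f"{sv_type}\t{purity}%")
    print(f"{len(grid)} simulations in total")
```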