High reference-free compression of genomic data
git clone https://github.com/cobioders/jarvis2.git
cd jarvis2/src/
make
Run JARVIS2 using level 9:
./JARVIS2 -v -l 9 File.seq
To see the available options, type:
./JARVIS2 -h
This will print the following options:
SYNOPSIS
./JARVIS2 [OPTION]... [FILE]
SAMPLE
Run Compression -> ./JARVIS2 -v -l 4 sequence.txt
Run Decompression -> ./JARVIS2 -v -d sequence.txt.jc
DESCRIPTION
Lossless compression and decompression of genomic
sequences for efficient storage and analysis purposes.
Measure an upper bound of the sequence complexity.
-h, --help
Usage guide (help menu).
-a, --version
Display program and version information.
-x, --explanation
Explanation of the context and repeat models.
-f, --force
Force mode. Overwrites old files.
-v, --verbose
Verbose mode (more information).
-d, --decompress
Decompression mode.
-e, --estimate
It creates a file with the extension ".iae" with the
respective information content. If the file is FASTA or
FASTQ it will only use the "ACGT" (genomic) sequence.
-s, --show-levels
Show pre-computed compression levels (configured).
-l [NUMBER], --level [NUMBER]
Compression level (integer).
Default level: 4.
It defines compressibility in balance with computational
resources (RAM & time). Use -s for levels perception.
-hs [NUMBER], --hidden-size [NUMBER]
Hidden size of the neural network (integer).
Default value: 40.
-lr [DOUBLE], --learning-rate [DOUBLE]
Neural Network leaning rate (double).
Default value: 0.030.
[FILE]
Input sequence filename (to compress) -- MANDATORY.
File to compress is the last argument.
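Because compression is lossless, a compress/decompress round trip should reproduce the input byte for byte. The sketch below checks this using only the -l and -d options listed above. The function name jarvis2_roundtrip is hypothetical, and the name of the decompressed file is not stated in the help text, so the sketch looks for any newly created file that matches the input instead of assuming an extension.

```shell
# Round-trip check (bash): compress, decompress, compare with the original.
# Assumptions, not taken from the JARVIS2 help text:
#   - compressing input.seq writes input.seq.jc (as the SAMPLE lines suggest);
#   - decompression writes its output as a new file in the same directory.
# $1 must be an absolute path to the binary, since the work happens in a temp dir.
jarvis2_roundtrip() {
  local bin=$1 input=$2 level=${3:-4} work status
  work=$(mktemp -d) || return 1
  cp "$input" "$work/input.seq" || { rm -rf "$work"; return 1; }
  (
    cd "$work" || exit 1
    "$bin" -l "$level" input.seq >/dev/null || exit 1
    [ -f input.seq.jc ] || exit 1
    "$bin" -d input.seq.jc >/dev/null || exit 1
    # Succeed if any newly created file is identical to the original input.
    for f in *; do
      case $f in input.seq|input.seq.jc) continue ;; esac
      cmp -s "$f" input.seq && exit 0
    done
    exit 1
  )
  status=$?
  rm -rf "$work"
  return "$status"
}
```

For example, jarvis2_roundtrip "$PWD/JARVIS2" File.seq 9 returns 0 when the decompressed data matches File.seq and non-zero otherwise.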
To see the possible levels (automatically chosen compression parameters), type:
./JARVIS2 -s
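Since the level trades compression ratio against RAM and time, it can help to try a few levels on a representative file and compare the resulting sizes. The following is a minimal sketch under the assumption that compressing FILE writes FILE.jc, as the SAMPLE lines in the help text suggest. The function name jarvis2_compare_levels is hypothetical; -f is used so that each run overwrites the previous output.

```shell
# Print "level<TAB>compressed size in bytes" for each requested level (bash).
# Assumption: compressing FILE produces FILE.jc next to it.
jarvis2_compare_levels() {
  local bin=$1 input=$2 level
  shift 2
  for level in "$@"; do
    # -f overwrites the .jc file left by the previous level.
    "$bin" -f -l "$level" "$input" >/dev/null || return 1
    # tr strips the padding that some wc implementations add.
    printf '%s\t%s\n' "$level" "$(wc -c < "$input.jc" | tr -d ' ')"
  done
}
```

For example, jarvis2_compare_levels ./JARVIS2 File.seq 1 4 9 prints one line per level, so the smallest output can be weighed against the time each run took.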
Preparing JARVIS2 for FASTA:
cd FASTA/
chmod +x *.sh
./JARVIS2_FASTA --install
Compression:
./JARVIS2_FASTA.sh --threads 8 --block 10MB --input sample.fa
Decompression:
./JARVIS2_FASTA.sh --decompress --threads 4 --input sample.fa.tar
Preparing JARVIS2 for FASTQ:
cd FASTQ/
chmod +x *.sh
./JARVIS2_FASTQ --install
Compression:
./JARVIS2_FASTQ.sh --threads 8 --block 40MB --input sample.fq
Decompression:
./JARVIS2_FASTQ.sh --decompress --threads 4 --input sample.fq.tar
In progress...
For any issue let us know at issues link.
For more information:
http://www.gnu.org/licenses/gpl-3.0.html