Help
CarineRey edited this page Mar 16, 2016
/path/to/apytram/apytram.py -h
The help message provides information on all the possible options.
usage: apytram.py [-h] [--version] -d DATABASE -dt {single,paired,FR,RF,F,R}
[-fa [FASTA [FASTA ...]]] [-fq [FASTQ [FASTQ ...]]]
[-q QUERY] [-pep QUERY_PEP] [-i ITERATION_MAX]
[-i_start ITERATION_START] [-out OUTPUT_PREFIX] [-log LOG]
[-tmp TMP] [--keep_iterations] [--no_best_file]
[--only_best_file] [--stats] [--plot] [--plot_ali]
[-e EVALUE] [-id MIN_ID] [-mal MIN_ALI_LEN] [-len MIN_LEN]
[-required_coverage REQUIRED_COVERAGE] [--finish_all_iter]
[-flen FINAL_MIN_LEN] [-fid FINAL_MIN_ID]
[-fmal FINAL_MIN_ALI_LEN] [-threads THREADS]
[-memory MEMORY] [-time_max TIME_MAX]
Run apytram.py on a fastq file to retrieve homologous sequences of bait
sequences.
optional arguments:
-h, --help show this help message and exit
--version show program's version number and exit
Required arguments:
-d DATABASE, --database DATABASE
Database prefix name. If a database with the same name
already exists, the existing database will be kept and
the database will NOT be rebuilt.
-dt {single,paired,FR,RF,F,R}, --database_type {single,paired,FR,RF,F,R}
single: single unstranded data
paired: paired unstranded data
RF: paired stranded data (/1 = reverse ; /2 = forward)
FR: paired stranded data (/2 = reverse ; /1 = forward)
F: single stranded data (reads = forward)
R: single stranded data (reads = reverse)
WARNING: Paired read names must end with 1 or 2
Input Files:
-fa [FASTA [FASTA ...]], --fasta [FASTA [FASTA ...]]
Fasta-formatted RNA-seq data to build the database of
reads (only one file).
-fq [FASTQ [FASTQ ...]], --fastq [FASTQ [FASTQ ...]]
Fastq-formatted RNA-seq data to build the database of
reads (several space-delimited fastq file names are
allowed). WARNING: Paired read names must end with
1 or 2. (Fastq files are first converted to a
fasta file. This process can take some time.)
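The fastq-to-fasta conversion mentioned above can be pictured with the short sketch below. It is an illustration rather than apytram's own code: `fastq_to_fasta` is a hypothetical helper, and it assumes plain four-line FASTQ records (header, sequence, `+`, qualities) with no wrapped sequences.

```python
def fastq_to_fasta(lines):
    """Convert four-line FASTQ records into FASTA header/sequence lines.

    Assumption: each record is exactly four lines and starts with '@'.
    Quality lines are dropped; trailing incomplete records are ignored.
    """
    lines = [line.rstrip("\n") for line in lines]
    fasta = []
    for i in range(0, len(lines) - 3, 4):
        header, sequence = lines[i], lines[i + 1]
        if not header.startswith("@"):
            raise ValueError("record %d does not start with '@'" % (i // 4 + 1))
        fasta.append(">" + header[1:])
        fasta.append(sequence)
    return fasta


if __name__ == "__main__":
    record = ["@read7/1", "ACGTACGT", "+", "IIIIIIII"]
    print("\n".join(fastq_to_fasta(record)))
```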
Query File:
-q QUERY, --query QUERY
Fasta file (nucl) with homologous bait sequences which
will be treated together for the apytram run. If no
query is submitted, the program will just build the
database. WARNING: Sequences must not contain
characters other than a, t, g, c, n (e.g. - * . ).
-pep QUERY_PEP, --query_pep QUERY_PEP
Fasta file containing the query in the peptide format.
It will be used at the first iteration as bait
sequences to fish reads. The query must also be
provided in nucleotide format (-q option).
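Because the nucleotide query may only contain a, t, g, c and n, it can help to scan the bait file before a run. The sketch below is a hypothetical pre-check, not an apytram function; it assumes that upper-case letters are as acceptable as lower-case ones, which the help message does not state.

```python
ALLOWED = set("atgcn")


def invalid_characters(fasta_text):
    """Return the set of characters in sequence lines that are not a, t, g, c, n.

    Assumption: the check is case-insensitive. Header lines (starting with '>')
    and blank lines are skipped.
    """
    bad = set()
    for line in fasta_text.splitlines():
        stripped = line.strip()
        if not stripped or stripped.startswith(">"):
            continue
        bad.update(set(stripped.lower()) - ALLOWED)
    return bad


if __name__ == "__main__":
    aligned_query = ">bait1\nATG--CNNTA*\n"
    print(sorted(invalid_characters(aligned_query)))
```

An aligned query, for instance, typically contains gap characters and would need to be degapped first.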
Number of iterations:
-i ITERATION_MAX, --iteration_max ITERATION_MAX
Maximum number of iterations. (Default 5)
-i_start ITERATION_START, --iteration_start ITERATION_START
Number of the first iteration. If different from 1, the
tmp option must be used. (Default: 1)
Output Files:
-out OUTPUT_PREFIX, --output_prefix OUTPUT_PREFIX
Output prefix (Default ./apytram)
-log LOG A log file to report progress (default: apytram.log)
-tmp TMP Directory to store all intermediate files for the
apytram run. (default: a directory in /tmp which will
be removed at the end)
--keep_iterations A fasta file containing reconstructed sequences will
be created at each iteration. (default: False)
--no_best_file By default, a fasta file (Outprefix.best.fasta)
containing only the best sequence is created. If this
option is used, it will NOT be created.
--only_best_file By default, a fasta file (Outprefix.fasta) containing
all sequences from the last iteration is created. If
this option is used, it will NOT be created.
--stats Create files with statistics on each iteration.
(default: False)
--plot Create plots to represent the statistics on each
iteration. (default: False)
--plot_ali Create a file with a plot representing the alignment of
all sequences from the last iteration on the query
sequence. Takes a few seconds. (default: False)
Thresholds for EACH ITERATION:
-e EVALUE, --evalue EVALUE
Evalue threshold of the blastn of the bait queries on
the database of reads. (Default 1e-3)
-id MIN_ID, --min_id MIN_ID
Minimum identity percentage of a sequence with a query
on the length of their alignment so that the sequence
is kept at the end of an iteration (Default 50)
-mal MIN_ALI_LEN, --min_ali_len MIN_ALI_LEN
Minimum alignment length of a sequence on a query to
be kept at the end of an iteration (Default 180)
-len MIN_LEN, --min_len MIN_LEN
Minimum length to keep a sequence at the end of an
iteration. (Default 200)
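The per-iteration thresholds above can be read as a filter on reconstructed sequences. The sketch below shows one plausible reading using the documented defaults. It is not taken from the apytram source: the `Hit` record and `keep` function are hypothetical, and it assumes that the three thresholds are combined with a logical AND and are inclusive.

```python
from dataclasses import dataclass


@dataclass
class Hit:
    """Hypothetical record for one reconstructed sequence aligned on a query."""
    name: str
    identity: float  # percent identity over the alignment length
    ali_len: int     # alignment length on the query
    seq_len: int     # length of the reconstructed sequence


def keep(hit, min_id=50, min_ali_len=180, min_len=200):
    """Apply -id, -mal and -len (defaults from the help message).

    Assumption: all three conditions must hold, and each bound is inclusive.
    """
    return (
        hit.identity >= min_id
        and hit.ali_len >= min_ali_len
        and hit.seq_len >= min_len
    )


if __name__ == "__main__":
    hits = [Hit("seq1", 87.5, 420, 510), Hit("seq2", 91.0, 150, 230)]
    print([h.name for h in hits if keep(h)])
```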
Criteria to stop iteration:
-required_coverage REQUIRED_COVERAGE
Required coverage of a bait sequence to stop iteration
(Default: No threshold)
--finish_all_iter By default, iterations stop if there is no
improvement. If this option is used, apytram will finish
all iterations (-i).
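How the stopping criteria interact can be sketched as a single decision function. This is an interpretation of the help text, not apytram's implementation. The function name and the `improved` flag are hypothetical, the `-time_max` option it uses is the one listed under Miscellaneous options, and it assumes that the coverage and time limits still apply when `--finish_all_iter` is set.

```python
def should_start_next(
    completed,
    iteration_max=5,
    coverage=None,
    required_coverage=None,
    improved=True,
    finish_all_iter=False,
    elapsed=0.0,
    time_max=7200,
):
    """Decide whether another iteration should begin.

    completed: number of iterations already finished.
    coverage / required_coverage: bait coverage reached and the -required_coverage
        threshold (None means no threshold).
    improved: hypothetical flag, True if the last iteration improved the result.
    elapsed / time_max: job duration and -time_max, both in seconds.
    """
    if completed >= iteration_max:
        return False
    if elapsed > time_max:
        return False
    if (
        required_coverage is not None
        and coverage is not None
        and coverage >= required_coverage
    ):
        return False
    if not improved and not finish_all_iter:
        return False
    return True


if __name__ == "__main__":
    print(should_start_next(2, improved=False))
    print(should_start_next(2, improved=False, finish_all_iter=True))
```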
Thresholds for Final output files:
-flen FINAL_MIN_LEN, --final_min_len FINAL_MIN_LEN
Minimum PERCENTAGE of the query length to keep a
sequence at the end of the run. (Default: 0)
-fid FINAL_MIN_ID, --final_min_id FINAL_MIN_ID
Minimum identity PERCENTAGE of a sequence with a query
on the length of their alignment so that the sequence
is kept at the end of the run (Default 0)
-fmal FINAL_MIN_ALI_LEN, --final_min_ali_len FINAL_MIN_ALI_LEN
Alignment length between a sequence and a query must
be at least this PERCENTAGE of the query length to
keep this sequence at the end of the run. (Default: 0)
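Unlike the per-iteration thresholds, `-flen` and `-fmal` are percentages of the query length, so the same sequence can pass or fail depending on how long the bait is. The sketch below works through that arithmetic. The function `passes_final` is hypothetical, and inclusive comparisons are an assumption.

```python
def passes_final(seq_len, ali_len, identity, query_len,
                 final_min_len=0, final_min_id=0, final_min_ali_len=0):
    """Apply -flen, -fid and -fmal, all expressed as percentages.

    seq_len and ali_len are converted to percentages of query_len before
    comparison; identity is already a percentage over the alignment.
    """
    len_pct = 100.0 * seq_len / query_len
    ali_pct = 100.0 * ali_len / query_len
    return (
        len_pct >= final_min_len
        and identity >= final_min_id
        and ali_pct >= final_min_ali_len
    )


if __name__ == "__main__":
    # A 600-nt sequence aligned over 450 nt of a 1000-nt query:
    # 60% of the query length, alignment covering 45% of the query.
    print(passes_final(600, 450, 92.0, 1000,
                       final_min_len=50, final_min_id=90, final_min_ali_len=40))
```

With the default value of 0 for all three options, every sequence from the last iteration is kept.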
Miscellaneous options:
-threads THREADS Number of available threads. (Default 1)
-memory MEMORY Memory available for the assembly, in gigabytes. (Default 1)
-time_max TIME_MAX Do not begin a new iteration if the job duration (in
seconds) has exceeded this threshold. (Default 7200)
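To tie the options together, the sketch below assembles a command line from the flags documented above. It only builds and prints the command and does not run apytram. The helper `build_command` and all file names (`db/reads`, `reads_1.fq`, `reads_2.fq`, `bait.fasta`) are hypothetical placeholders, and `/path/to/apytram/apytram.py` is the same placeholder path used at the top of this page.

```python
import shlex

DATABASE_TYPES = ("single", "paired", "FR", "RF", "F", "R")


def build_command(database, database_type, fastq=(), query=None,
                  out="./apytram", threads=1, extra=()):
    """Build an apytram.py argument list from options shown in the help message."""
    if database_type not in DATABASE_TYPES:
        raise ValueError("unknown database type: %r" % (database_type,))
    cmd = ["/path/to/apytram/apytram.py", "-d", database, "-dt", database_type]
    if fastq:
        cmd += ["-fq", *fastq]      # several space-delimited fastq files are allowed
    if query:
        cmd += ["-q", query]        # without a query, only the database is built
    cmd += ["-out", out, "-threads", str(threads), *extra]
    return cmd


if __name__ == "__main__":
    cmd = build_command(
        "db/reads", "paired",
        fastq=["reads_1.fq", "reads_2.fq"],
        query="bait.fasta",
        extra=["--stats", "-i", "5"],
    )
    print(shlex.join(cmd))  # shlex.join requires Python 3.8 or later
```

Keeping the command as a list makes it easy to pass to `subprocess.run` later without worrying about shell quoting.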