CarineRey edited this page Mar 16, 2016 · 7 revisions

How to get help?

/path/to/apytram/apytram.py -h

The help message provides information on all the possible options.

usage: apytram.py [-h] [--version] -d DATABASE -dt {single,paired,FR,RF,F,R}
                  [-fa [FASTA [FASTA ...]]] [-fq [FASTQ [FASTQ ...]]]
                  [-q QUERY] [-pep QUERY_PEP] [-i ITERATION_MAX]
                  [-i_start ITERATION_START] [-out OUTPUT_PREFIX] [-log LOG]
                  [-tmp TMP] [--keep_iterations] [--no_best_file]
                  [--only_best_file] [--stats] [--plot] [--plot_ali]
                  [-e EVALUE] [-id MIN_ID] [-mal MIN_ALI_LEN] [-len MIN_LEN]
                  [-required_coverage REQUIRED_COVERAGE] [--finish_all_iter]
                  [-flen FINAL_MIN_LEN] [-fid FINAL_MIN_ID]
                  [-fmal FINAL_MIN_ALI_LEN] [-threads THREADS]
                  [-memory MEMORY] [-time_max TIME_MAX]

Run apytram.py on a fastq file to retrieve homologous sequences of bait
sequences.

optional arguments:
  -h, --help            show this help message and exit
  --version             show program's version number and exit

Required arguments:
  -d DATABASE, --database DATABASE
                        Database prefix name. If a database with the same name
                        already exists, the existing database will be kept and
                        the database will NOT be rebuilt.
  -dt {single,paired,FR,RF,F,R}, --database_type {single,paired,FR,RF,F,R}
                        single: single unstranded data
                        paired: paired unstranded data
                        RF: paired stranded data (/1 = reverse ; /2 = forward)
                        FR: paired stranded data (/2 = reverse ; /1 = forward)
                        F: single stranded data (reads = forward)
                        R: single stranded data (reads = reverse)
                        WARNING: Paired read names must end with 1 or 2.

Input Files:
  -fa [FASTA [FASTA ...]], --fasta [FASTA [FASTA ...]]
                        Fasta formatted RNA-seq data to build the database of
                        reads (only one file).
  -fq [FASTQ [FASTQ ...]], --fastq [FASTQ [FASTQ ...]]
                        Fastq formatted RNA-seq data to build the database of
                        reads (several space-delimited fastq file names are
                        allowed). WARNING: Paired read names must end with
                        1 or 2. (fastq files are first converted to a
                        fasta file. This process can take some time.)

Query File:
  -q QUERY, --query QUERY
                        Fasta file (nucl) with homologous bait sequences which
                        will be treated together for the apytram run. If no
                        query is submitted, the program will just build the
                        database. WARNING: Sequences must not contain
                        characters other than a t g c n (e.g. - * . ).
  -pep QUERY_PEP, --query_pep QUERY_PEP
                        Fasta file containing the query in peptide format.
                        It will be used at the first iteration as bait
                        sequences to fish reads. The query must also be
                        provided in nucleotide format (-q option).
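
The character restriction on the nucleotide query can be verified before a run. This is a minimal sketch, not apytram code: the function name is hypothetical, and treating upper-case letters as equivalent to lower-case ones is an assumption (the help text only lists lower-case characters).

```python
def invalid_query_characters(fasta_text):
    """Map each sequence name to its characters other than a, t, g, c, n."""
    allowed = set("atgcn")
    problems = {}
    name = None
    for line in fasta_text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # Sequence name: first token after '>' (empty if none is given).
            fields = line[1:].split()
            name = fields[0] if fields else ""
            continue
        # Assumption: case is ignored when checking characters.
        extra = set(line.lower()) - allowed
        if extra:
            problems.setdefault(name, set()).update(extra)
    return problems


# Usage: an aligned sequence with gaps and a stop symbol is reported.
print(invalid_query_characters(">ok\nATGCnn\n>aligned\nAT-G*\n"))
```

An empty result means every sequence uses only the accepted characters; gap or stop symbols left over from an alignment show up under the sequence name.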

Number of iterations:
  -i ITERATION_MAX, --iteration_max ITERATION_MAX
                        Maximum number of iterations. (Default 5)
  -i_start ITERATION_START, --iteration_start ITERATION_START
                        Number of the first iteration. If different from 1,
                        the -tmp option must be used. (Default: 1)

Output Files:
  -out OUTPUT_PREFIX, --output_prefix OUTPUT_PREFIX
                        Output prefix (Default ./apytram)
  -log LOG              A log file to report progress (default: apytram.log)
  -tmp TMP              Directory to store all intermediate files for the
                        apytram run. (default: a directory in /tmp which will
                        be removed at the end)
  --keep_iterations     A fasta file containing reconstructed sequences will
                        be created at each iteration. (default: False)
  --no_best_file        By default, a fasta file (Outprefix.best.fasta)
                        containing only the best sequence is created. If this
                        option is used, it will NOT be created.
  --only_best_file      By default, a fasta file (Outprefix.fasta) containing
                        all sequences from the last iteration is created. If
                        this option is used, it will NOT be created.
  --stats               Create files with statistics on each iteration.
                        (default: False)
  --plot                Create plots to represent the statistics on each
                        iteration. (default: False)
  --plot_ali            Create a file with a plot representing the alignment
                        of all sequences from the last iteration on the query
                        sequence. Takes a few seconds. (default: False)

Thresholds for EACH ITERATION:
  -e EVALUE, --evalue EVALUE
                        Evalue threshold of the blastn of the bait queries on
                        the database of reads. (Default 1e-3)
  -id MIN_ID, --min_id MIN_ID
                        Minimum identity percentage of a sequence with a query
                        on the length of their alignment so that the sequence
                        is kept at the end of an iteration. (Default 50)
  -mal MIN_ALI_LEN, --min_ali_len MIN_ALI_LEN
                        Minimum alignment length of a sequence on a query to
                        be kept at the end of an iteration. (Default 180)
  -len MIN_LEN, --min_len MIN_LEN
                        Minimum length to keep a sequence at the end of an
                        iteration. (Default 200)

Criteria to stop iteration:
  -required_coverage REQUIRED_COVERAGE
                        Required coverage of a bait sequence to stop iteration
                        (Default: No threshold)
  --finish_all_iter     By default, iterations stop when there is no
                        improvement. If this option is used, apytram will
                        finish all iterations (-i).
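
How the stopping options might interact can be illustrated as follows. This is only a sketch of the behaviour described in the help text, not apytram's actual implementation: the function, its parameter names, and the order in which the criteria are tested are assumptions, and `coverage` is taken to be in the same unit as -required_coverage.

```python
def should_stop(iteration, iteration_max, improved, coverage=None,
                required_coverage=None, finish_all_iter=False):
    """Return True if no further iteration should be started."""
    # -i: never go beyond the maximum number of iterations.
    if iteration >= iteration_max:
        return True
    # -required_coverage: stop once the bait is covered well enough.
    # By default there is no threshold, so this test is skipped.
    if (required_coverage is not None and coverage is not None
            and coverage >= required_coverage):
        return True
    # Default: stop when the last iteration brought no improvement,
    # unless --finish_all_iter asks to run every iteration.
    if not improved and not finish_all_iter:
        return True
    return False


# Usage: no improvement at iteration 2 of 5.
print(should_stop(2, 5, improved=False))
print(should_stop(2, 5, improved=False, finish_all_iter=True))
```

Whether --finish_all_iter also overrides -required_coverage is not stated in the help message; the sketch assumes it does not.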

Thresholds for Final output files:
  -flen FINAL_MIN_LEN, --final_min_len FINAL_MIN_LEN
                        Minimum PERCENTAGE of the query length to keep a
                        sequence at the end of the run. (Default: 0)
  -fid FINAL_MIN_ID, --final_min_id FINAL_MIN_ID
                        Minimum identity PERCENTAGE of a sequence with a query
                        on the length of their alignment so that the sequence
                        is kept at the end of the run (Default 0)
  -fmal FINAL_MIN_ALI_LEN, --final_min_ali_len FINAL_MIN_ALI_LEN
                        Alignment length between a sequence and a query must
                        be at least this PERCENTAGE of the query length to
                        keep this sequence at the end of the run. (Default: 0)
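
Unlike the per-iteration thresholds, -flen and -fmal are percentages of the query length. The sketch below shows how they translate into absolute lengths; it is an illustration based on the help text, not apytram's code, and the function name and the use of inclusive comparisons are assumptions.

```python
def passes_final_filters(seq_len, ali_len, identity, query_len,
                         flen=0, fid=0, fmal=0):
    """Apply -flen, -fid and -fmal (all percentages) to one sequence."""
    # -flen: sequence length as a percentage of the query length.
    min_len = query_len * flen / 100
    # -fmal: alignment length as a percentage of the query length.
    min_ali_len = query_len * fmal / 100
    # -fid: identity percentage over the alignment.
    return seq_len >= min_len and identity >= fid and ali_len >= min_ali_len


# Usage: with a 1000 nt query, -flen 50 requires at least 500 nt
# and -fmal 40 requires an alignment of at least 400 nt.
print(passes_final_filters(600, 450, 85, query_len=1000,
                           flen=50, fid=80, fmal=40))
```

With the default values (0 for all three options), every sequence passes.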

Miscellaneous options:
  -threads THREADS      Number of available threads. (Default 1)
  -memory MEMORY        Memory available for the assembly, in GB. (Default 1)
  -time_max TIME_MAX    Do not begin a new iteration if the job duration (in
                        seconds) has exceeded this threshold. (Default 7200)
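
To put the options together, here is a small helper that assembles a command line from the flags documented above. It is a sketch under assumptions: the helper name and the file names in the usage example are hypothetical, and only a subset of the options is covered. It builds the argument list without running anything.

```python
import shlex


def build_apytram_command(script, database, database_type, fastq=(),
                          query=None, output_prefix=None, threads=None):
    """Build an apytram.py argument list from a few documented options."""
    allowed_types = {"single", "paired", "FR", "RF", "F", "R"}
    if database_type not in allowed_types:
        raise ValueError("unknown database type: %s" % database_type)
    # -d and -dt are the two required arguments.
    command = [script, "-d", database, "-dt", database_type]
    if fastq:
        # -fq accepts several space-delimited file names.
        command += ["-fq"] + list(fastq)
    if query is not None:
        command += ["-q", query]
    if output_prefix is not None:
        command += ["-out", output_prefix]
    if threads is not None:
        command += ["-threads", str(threads)]
    return command


# Usage with hypothetical file names.
command = build_apytram_command(
    "/path/to/apytram/apytram.py", "db/reads", "paired",
    fastq=["reads_1.fq", "reads_2.fq"], query="bait.fasta",
    output_prefix="out/run1", threads=4)
print(" ".join(shlex.quote(part) for part in command))
```

The resulting list can be passed to `subprocess.run` once apytram is installed; omitting `query` corresponds to the case where the program only builds the database.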