Basic Features

finaletoolkit.frag.coverage(input_file: str | AlignmentFile, interval_file: str, output_file: str, scale_factor: float = 1000000.0, quality_threshold: int = 30, workers: int = 1, verbose: bool | int = False)

Return estimated fragment coverage over the intervals specified in interval_file. Fragments are read from input_file, which may be a SAM, BAM, CRAM, or Frag.gz file. The midpoint of each fragment is calculated, and coverage is tabulated from the midpoints that fall into the specified region. Not suitable for fragments whose size approaches the interval size.

Parameters

input_file : str or pysam.AlignmentFile
    SAM, BAM, CRAM, or Frag.gz file containing paired-end fragment reads, or its path. An AlignmentFile must be opened in read mode.
interval_file : str
    BED4 file containing intervals over which to generate coverage statistics.
output_file : str, optional
    Path of the BED file to write coverages to. If output_file = _, results will be printed to stdout.
scale_factor : float, optional
    Amount to multiply coverages by. Default is 10^6.
quality_threshold : int, optional
verbose : int or bool, optional

Returns

coverage : int
    Fragment coverage over contig and region.
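
The midpoint-based counting described for coverage can be illustrated with a minimal pure-Python sketch. This is not finaletoolkit's implementation: the function name midpoint_coverage, the (contig, start, stop) tuple representation, and the half-open interval convention are assumptions made for illustration, and scale_factor is simply applied as a multiplier, as the parameter description states.

```python
# Hypothetical fragment representation: (contig, start, stop), 0-based, half-open.
def midpoint_coverage(fragments, contig, start, stop, scale_factor=1.0):
    """Count fragments whose midpoint lies in [start, stop) on contig."""
    count = 0
    for frag_contig, frag_start, frag_stop in fragments:
        midpoint = (frag_start + frag_stop) // 2
        if frag_contig == contig and start <= midpoint < stop:
            count += 1
    # Assumption: the scale factor is a plain multiplier on the raw count.
    return count * scale_factor


fragments = [
    ("chr1", 100, 260),   # midpoint 180: counted
    ("chr1", 900, 1100),  # midpoint 1000: not counted for [0, 1000)
    ("chr1", 950, 1040),  # midpoint 995: counted although it overhangs the interval
    ("chr2", 100, 260),   # different contig: not counted
]
result = midpoint_coverage(fragments, "chr1", 0, 1000)
```

Because each fragment is either counted in full or not at all, the estimate becomes coarse when fragments are nearly as long as the interval, which is consistent with the caveat in the description above.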

finaletoolkit.frag.frag_length(input_file: str | AlignmentFile | TabixFile, contig: str = None, start: int = None, stop: int = None, intersect_policy: str = 'midpoint', output_file: str = None, quality_threshold: int = 30, verbose: bool = False) -> ndarray

Return an np.ndarray containing the lengths of fragments in input_file that are above the quality threshold and are proper-paired reads.

Parameters

input_file : str or pysam.AlignmentFile
    BAM, SAM, or CRAM file containing paired-end fragment reads, or its path. An AlignmentFile must be opened in read mode.
contig : str, optional
    Contig or chromosome to get fragments from.
start : int, optional
    0-based left-most coordinate of interval.
stop : int, optional
    1-based right-most coordinate of interval.
intersect_policy : str, optional
    Specifies what policy is used to include fragments in the given interval. Default is “midpoint”. Policies include:
    - midpoint: the average of end coordinates of a fragment lies in the interval.
    - any: any part of the fragment is in the interval.
output_file : str, optional
quality_threshold : int, optional
verbose : bool, optional

Returns

lengths : numpy.ndarray
    ndarray of fragment lengths from file and contig if specified.
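
The difference between the two intersect_policy values can be shown with a short sketch. It is a hedged illustration, not the library's code: in_interval and fragment_lengths are hypothetical helper names, fragments are plain (start, stop) tuples, and intervals are treated as half-open, matching the 0-based start and 1-based stop described above.

```python
def in_interval(frag_start, frag_stop, start, stop, intersect_policy="midpoint"):
    """Decide whether a fragment belongs to [start, stop) under a policy."""
    if intersect_policy == "midpoint":
        # The average of the fragment's end coordinates must lie in the interval.
        midpoint = (frag_start + frag_stop) // 2
        return start <= midpoint < stop
    if intersect_policy == "any":
        # Any overlap between the fragment and the interval is enough.
        return frag_start < stop and frag_stop > start
    raise ValueError(f"unknown intersect_policy: {intersect_policy!r}")


def fragment_lengths(fragments, start, stop, intersect_policy="midpoint"):
    """Return lengths of the fragments selected by the given policy."""
    return [
        frag_stop - frag_start
        for frag_start, frag_stop in fragments
        if in_interval(frag_start, frag_stop, start, stop, intersect_policy)
    ]


fragments = [(100, 267), (940, 1100), (990, 1150), (1200, 1360)]
midpoint_lengths = fragment_lengths(fragments, 0, 1000, "midpoint")
any_lengths = fragment_lengths(fragments, 0, 1000, "any")
```

In this toy data, the two fragments straddling position 1000 are kept only under the "any" policy, since their midpoints fall outside the interval.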

finaletoolkit.frag.frag_length_bins(input_file: str | AlignmentFile, contig: str = None, start: int = None, stop: int = None, bin_size: int = None, output_file: str = None, contig_by_contig: bool = False, histogram: bool = False, intersect_policy: str = 'midpoint', quality_threshold: int = 30, verbose: bool | int = False) -> Tuple[ndarray, ndarray]

Takes input_file, computes the lengths of its fragments, and returns two arrays containing bins and counts by size. Optionally prints the data to output as a tab-delimited table or histogram.

Parameters

input_file : str or AlignmentFile
contig : str, optional
start : int, optional
stop : int, optional
bin_size : int, optional
output_file : str, optional
contig_by_contig : bool, optional
histogram : bool, optional
intersect_policy : str, optional
    Specifies what policy is used to include fragments in the given interval. Default is “midpoint”. Policies include:
    - midpoint: the average of end coordinates of a fragment lies in the interval.
    - any: any part of the fragment is in the interval.
workers : int, optional

Returns

bins : ndarray
counts : ndarray
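
As a rough illustration of the bins and counts pair, the sketch below groups a list of fragment lengths into fixed-width bins. It is an assumption-laden example rather than the library's method: length_bins is a hypothetical name, it returns plain lists instead of ndarrays, each bin is labelled by its lower edge, and empty bins are omitted, which may differ from what frag_length_bins actually returns.

```python
from collections import Counter


def length_bins(lengths, bin_size=1):
    """Group fragment lengths into bins of width bin_size.

    Returns (bins, counts), where each bin is labelled by its lower edge.
    Only non-empty bins are reported in this sketch.
    """
    tallies = Counter((length // bin_size) * bin_size for length in lengths)
    bins = sorted(tallies)
    counts = [tallies[b] for b in bins]
    return bins, counts


bins, counts = length_bins([145, 167, 168, 171, 320], bin_size=10)

# Print a tab-delimited table, similar in spirit to the optional output.
for lower_edge, count in zip(bins, counts):
    print(f"{lower_edge}\t{count}")
```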