Merge pull request #114 from epifluidlab/107-wps-error-in-processing-…
…the-input-bed-files

107 wps error in processing the input bed files
jamesli124 authored Nov 28, 2024
2 parents 38fe104 + 78549dc commit 83a30e7
Showing 12 changed files with 82 additions and 13 deletions.
26 changes: 26 additions & 0 deletions CHANGELOG.md
@@ -7,6 +7,32 @@ The format is based on
and this project adheres to
[Semantic Versioning](https://semver.org/spec/v2.0.0.html).

## [0.7.8] - 2024-11-28

### Fixed
- update docs, docstring, and help message for wps to mention that
`site_bed` must be sorted.

### Added
- `normalize` keyword argument for the `finaletoolkit.frag.coverage` function
and `--normalize` flag for the `finaletoolkit coverage` subcommand. When set,
the output is normalized by the total coverage and `scale_factor` is ignored,
even if specified.
- `--intersect_policy` (`-p`) flag for the `finaletoolkit coverage` subcommand.

## [0.7.7] - 2024-11-27

### Fixed
- subpackages can now be accessed when importing `finaletoolkit`. Previously,
the following code resulted in an error:
```python
>>> import finaletoolkit as ftk
>>> help(ftk.frag)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
AttributeError: module 'finaletoolkit' has no attribute 'frag'
```
Now this is a valid way to access subpackages `cli`, `frag`, `genome`, and
`utils`.

## [0.7.6] - 2024-11-18

### Fixed
Binary file modified docs/_build/doctrees/documentation/api_reference/wps.doctree
Binary file not shown.
Binary file modified docs/_build/doctrees/documentation/cli_reference/index.doctree
Binary file not shown.
Binary file modified docs/_build/doctrees/environment.pickle
Binary file not shown.
4 changes: 3 additions & 1 deletion docs/_build/html/documentation/api_reference/wps.html
@@ -425,7 +425,9 @@ <h1>Window Protection Score (WPS)<a class="headerlink" href="#window-protection-
<dd class="field-odd"><ul class="simple">
<li><p><strong>input_file</strong> (<em>str</em><em> or </em><em>pysam.AlignmentFile</em>) – BAM, SAM, or tabix file containing paired-end fragment reads or its
path. <cite>AlignmentFile</cite> must be opened in read mode.</p></li>
<li><p><strong>site_bed</strong> (<em>str</em>) – Bed file containing intervals to perform WPS on.</p></li>
<li><p><strong>site_bed</strong> (<em>str</em>) – BED file containing intervals to perform WPS on. The intervals
in this BED file should be sorted, first by <cite>contig</cite> then
<cite>start</cite>.</p></li>
<li><p><strong>output_file</strong> (<em>string</em><em>, </em><em>optional</em>) – </p></li>
<li><p><strong>window_size</strong> (<em>int</em><em>, </em><em>optional</em>) – Size of window to calculate WPS. Default is k = 120, equivalent
to L-WPS.</p></li>
9 changes: 4 additions & 5 deletions docs/_build/html/documentation/cli_reference/index.html
@@ -535,8 +535,7 @@ <h4>Named Arguments<a class="headerlink" href="#named-arguments_repeat3" title="
<section id="cleavage-profile">
<h3>cleavage-profile<a class="headerlink" href="#cleavage-profile" title="Permalink to this heading">#</a></h3>
<p>Calculates cleavage proportion over intervals defined in a BED file based on alignment data from a BAM/SAM/CRAM/Fragment file.</p>
<div class="highlight-default notranslate"><div class="highlight"><pre><span></span><span class="n">finaletoolkit</span> <span class="n">cleavage</span><span class="o">-</span><span class="n">profile</span> <span class="p">[</span><span class="o">-</span><span class="n">h</span><span class="p">]</span> <span class="p">[</span><span class="o">-</span><span class="n">o</span> <span class="n">OUTPUT_FILE</span><span class="p">]</span> <span class="p">[</span><span class="o">-</span><span class="n">lo</span> <span class="n">FRACTION_LOW</span><span class="p">]</span> <span class="p">[</span><span class="o">-</span><span class="n">hi</span> <span class="n">FRACTION_HIGH</span><span class="p">]</span> <span class="p">[</span><span class="o">-</span><span class="n">q</span> <span class="n">QUALITY_THRESHOLD</span><span class="p">]</span> <span class="p">[</span><span class="o">-</span><span class="n">l</span> <span class="n">LEFT</span><span class="p">]</span> <span class="p">[</span><span class="o">-</span><span class="n">r</span> <span class="n">RIGHT</span><span class="p">]</span> <span class="p">[</span><span class="o">-</span><span class="n">w</span> <span class="n">WORKERS</span><span class="p">]</span>
<span class="p">[</span><span class="o">-</span><span class="n">v</span><span class="p">]</span>
<div class="highlight-default notranslate"><div class="highlight"><pre><span></span><span class="n">finaletoolkit</span> <span class="n">cleavage</span><span class="o">-</span><span class="n">profile</span> <span class="p">[</span><span class="o">-</span><span class="n">h</span><span class="p">]</span> <span class="p">[</span><span class="o">-</span><span class="n">o</span> <span class="n">OUTPUT_FILE</span><span class="p">]</span> <span class="p">[</span><span class="o">-</span><span class="n">lo</span> <span class="n">FRACTION_LOW</span><span class="p">]</span> <span class="p">[</span><span class="o">-</span><span class="n">hi</span> <span class="n">FRACTION_HIGH</span><span class="p">]</span> <span class="p">[</span><span class="o">-</span><span class="n">q</span> <span class="n">QUALITY_THRESHOLD</span><span class="p">]</span> <span class="p">[</span><span class="o">-</span><span class="n">l</span> <span class="n">LEFT</span><span class="p">]</span> <span class="p">[</span><span class="o">-</span><span class="n">r</span> <span class="n">RIGHT</span><span class="p">]</span> <span class="p">[</span><span class="o">-</span><span class="n">w</span> <span class="n">WORKERS</span><span class="p">]</span> <span class="p">[</span><span class="o">-</span><span class="n">v</span><span class="p">]</span>
<span class="n">input_file</span> <span class="n">interval_file</span>
</pre></div>
</div>
@@ -592,8 +591,8 @@ <h4>Named Arguments<a class="headerlink" href="#named-arguments_repeat4" title="
<section id="wps">
<h3>wps<a class="headerlink" href="#wps" title="Permalink to this heading">#</a></h3>
<p>Calculates Windowed Protection Score (WPS) over intervals defined in a BED file based on alignment data from a BAM/SAM/CRAM/Fragment file.</p>
<div class="highlight-default notranslate"><div class="highlight"><pre><span></span><span class="n">finaletoolkit</span> <span class="n">wps</span> <span class="p">[</span><span class="o">-</span><span class="n">h</span><span class="p">]</span> <span class="p">[</span><span class="o">-</span><span class="n">o</span> <span class="n">OUTPUT_FILE</span><span class="p">]</span> <span class="p">[</span><span class="o">-</span><span class="n">i</span> <span class="n">INTERVAL_SIZE</span><span class="p">]</span> <span class="p">[</span><span class="o">-</span><span class="n">W</span> <span class="n">WINDOW_SIZE</span><span class="p">]</span> <span class="p">[</span><span class="o">-</span><span class="n">lo</span> <span class="n">FRACTION_LOW</span><span class="p">]</span> <span class="p">[</span><span class="o">-</span><span class="n">hi</span> <span class="n">FRACTION_HIGH</span><span class="p">]</span> <span class="p">[</span><span class="o">-</span><span class="n">q</span> <span class="n">QUALITY_THRESHOLD</span><span class="p">]</span>
<span class="p">[</span><span class="o">-</span><span class="n">w</span> <span class="n">WORKERS</span><span class="p">]</span> <span class="p">[</span><span class="o">-</span><span class="n">v</span><span class="p">]</span>
<div class="highlight-default notranslate"><div class="highlight"><pre><span></span><span class="n">finaletoolkit</span> <span class="n">wps</span> <span class="p">[</span><span class="o">-</span><span class="n">h</span><span class="p">]</span> <span class="p">[</span><span class="o">-</span><span class="n">o</span> <span class="n">OUTPUT_FILE</span><span class="p">]</span> <span class="p">[</span><span class="o">-</span><span class="n">i</span> <span class="n">INTERVAL_SIZE</span><span class="p">]</span> <span class="p">[</span><span class="o">-</span><span class="n">W</span> <span class="n">WINDOW_SIZE</span><span class="p">]</span> <span class="p">[</span><span class="o">-</span><span class="n">lo</span> <span class="n">FRACTION_LOW</span><span class="p">]</span> <span class="p">[</span><span class="o">-</span><span class="n">hi</span> <span class="n">FRACTION_HIGH</span><span class="p">]</span> <span class="p">[</span><span class="o">-</span><span class="n">q</span> <span class="n">QUALITY_THRESHOLD</span><span class="p">]</span> <span class="p">[</span><span class="o">-</span><span class="n">w</span> <span class="n">WORKERS</span><span class="p">]</span>
<span class="p">[</span><span class="o">-</span><span class="n">v</span><span class="p">]</span>
<span class="n">input_file</span> <span class="n">site_bed</span>
</pre></div>
</div>
@@ -604,7 +603,7 @@ <h4>Positional Arguments<a class="headerlink" href="#positional-arguments_repeat
<dd><p>Path to a BAM/SAM/CRAM/Fragment file containing fragment data.</p>
</dd>
<dt><kbd>site_bed</kbd></dt>
<dd><p>Path to a BED file containing intervals to calculate WPS over.</p>
<dd><p>Path to a BED file containing intervals to calculate WPS over. The intervals in this BED file should be sorted, first by <cite>contig</cite> then <cite>start</cite>.</p>
</dd>
</dl>
</section>
2 changes: 1 addition & 1 deletion docs/_build/html/searchindex.js

Large diffs are not rendered by default.

10 changes: 10 additions & 0 deletions src/finaletoolkit/__init__.py
@@ -4,3 +4,13 @@
"""

from finaletoolkit.version import __version__


__all__ = ["cli", "frag", "genome", "utils"]

# Delay imports until the submodule is actually accessed.
def __getattr__(name):
    if name in __all__:
        import importlib
        return importlib.import_module(f".{name}", __name__)
    raise AttributeError(f"Module {__name__} has no attribute {name}")
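
The module-level `__getattr__` above relies on PEP 562, under which Python calls the hook only when normal attribute lookup on the module fails. The following is a minimal, self-contained sketch of that pattern, not the package's own code: it writes a throwaway package to a temporary directory (the name `lazy_demo_pkg` and the contents of its `frag.py` are hypothetical) and checks that the submodule is imported only on first access.

```python
import importlib
import sys
import tempfile
from pathlib import Path

# Hypothetical package name, used only for this demonstration.
PKG = "lazy_demo_pkg"

# Source of the demo package's __init__.py, following the same pattern
# as the diff: submodules listed in __all__ are imported on first access.
INIT_SOURCE = '''
__all__ = ["frag"]

def __getattr__(name):
    # Called only when normal attribute lookup on the module fails.
    if name in __all__:
        import importlib
        return importlib.import_module(f".{name}", __name__)
    raise AttributeError(f"module {__name__!r} has no attribute {name!r}")
'''


def build_demo_package(root: Path) -> None:
    """Write a tiny package with one submodule under `root`."""
    pkg_dir = root / PKG
    pkg_dir.mkdir()
    (pkg_dir / "__init__.py").write_text(INIT_SOURCE)
    (pkg_dir / "frag.py").write_text("VALUE = 42\n")


tmp = tempfile.mkdtemp()
build_demo_package(Path(tmp))
sys.path.insert(0, tmp)
importlib.invalidate_caches()

pkg = importlib.import_module(PKG)
loaded_before = f"{PKG}.frag" in sys.modules  # submodule not imported yet
value = pkg.frag.VALUE                        # triggers __getattr__("frag")
loaded_after = f"{PKG}.frag" in sys.modules   # submodule now imported

print(loaded_before, value, loaded_after)
```

After the first access, the import system binds the submodule as an attribute of the package, so later lookups of `pkg.frag` no longer go through `__getattr__`.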
19 changes: 18 additions & 1 deletion src/finaletoolkit/cli/main_cli.py
@@ -60,12 +60,28 @@ def main_cli_parser():
default='-',
help='A BED file containing coverage values over the intervals '
'specified in interval file.')
cli_coverage.add_argument(
'-n',
'--normalize',
action='store_true',
help="If set, ignores any user-supplied scale factor and "
"normalizes output by total coverage."
)
cli_coverage.add_argument(
'-s',
'--scale-factor',
default=1e6,
type=float,
help='Scale factor for coverage values.')
cli_coverage.add_argument(
'-p',
'--intersect_policy',
choices=['midpoint',
'any'],
default='midpoint',
type=str,
help='Specifies what policy is used to include fragments in the'
' given interval. See User Guide for more information.')
cli_coverage.add_argument(
'-q',
'--quality_threshold',
@@ -287,7 +303,8 @@ def main_cli_parser():
cli_wps.add_argument(
'site_bed',
help='Path to a BED file containing intervals to calculate WPS '
'over.')
'over. The intervals in this BED file should be sorted, first '
'by `contig` then `start`.')
cli_wps.add_argument(
'-o',
'--output_file',
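
To illustrate how the coverage options added in this file behave at parse time, here is a small standalone `argparse` sketch. It mirrors only the three options visible in the diff (`-n/--normalize`, `-s/--scale-factor`, `-p/--intersect_policy`); the program name `coverage-sketch` and the helper `build_coverage_parser` are hypothetical, and the real subcommand defines more arguments than shown here.

```python
import argparse


def build_coverage_parser() -> argparse.ArgumentParser:
    """Build a parser with the coverage options shown in the diff."""
    parser = argparse.ArgumentParser(prog="coverage-sketch")
    parser.add_argument(
        "-n", "--normalize",
        action="store_true",
        help="Ignore the scale factor and normalize by total coverage.")
    parser.add_argument(
        "-s", "--scale-factor",
        default=1e6,
        type=float,
        help="Scale factor for coverage values.")
    parser.add_argument(
        "-p", "--intersect_policy",
        choices=["midpoint", "any"],
        default="midpoint",
        type=str,
        help="Policy used to include fragments in an interval.")
    return parser


parser = build_coverage_parser()
args = parser.parse_args(["-n", "-p", "any"])
defaults = parser.parse_args([])

# argparse maps --scale-factor to the attribute name scale_factor.
print(args.normalize, args.scale_factor, args.intersect_policy)
print(defaults.normalize, defaults.scale_factor, defaults.intersect_policy)
```

Because `choices` is set, a value other than `midpoint` or `any` makes the parser exit with a usage error instead of reaching the coverage function.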
19 changes: 16 additions & 3 deletions src/finaletoolkit/frag/coverage.py
@@ -110,12 +110,13 @@ def _single_coverage_star(args):
"""
return single_coverage(*args)

# TODO: add normalized coverage

def coverage(
input_file: Union[str, pysam.TabixFile, pysam.AlignmentFile, Path],
interval_file: str,
output_file: str,
scale_factor: float=1e6,
normalize: bool=False,
intersect_policy: str="midpoint",
quality_threshold: int=30,
workers: int=1,
@@ -142,6 +143,9 @@ def coverage(
results will be printed to stdout.
scale_factor : float, optional
Amount to multiply coverages by. Default is 10^6.
normalize : bool, optional
When True, ignore `scale_factor` and divide by total coverage.
Default is False.
intersect_policy: str, optional
Specifies how to determine whether fragments are in interval.
'midpoint' (default) calculates the central coordinate of each
@@ -169,6 +173,7 @@ def coverage(
interval_file: {interval_file}
output_file: {output_file}
scale_factor: {scale_factor}
normalize: {normalize}
intersect_policy: {intersect_policy}
quality_threshold: {quality_threshold}
workers: {workers}
@@ -216,6 +221,10 @@ def coverage(
if verbose:
tqdm.write(f'Total coverage is {total_coverage}\n')

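# normalize: the output expressions already divide by total_coverage[4],
# so a unit scale factor yields coverage / total coverage
if normalize:
    scale_factor = 1.0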

# Output
output_is_file = False

@@ -244,15 +253,19 @@ def coverage(
f'{contig}\t{start}\t{stop}\t'
f'{coverage/total_coverage[4]*scale_factor}\n'
)
returnVal.append((contig,start,stop,name,coverage/total_coverage[4]*scale_factor))
returnVal.append(
(contig, start, stop, name,
coverage/total_coverage[4]*scale_factor))
else:
for contig, start, stop, name, coverage in coverages:
output.write(
f'{contig}\t{start}\t{stop}\t'
f'{name}\t'
f'{coverage/total_coverage[4]*scale_factor}\n'
)
returnVal.append((contig,start,stop,name,coverage/total_coverage[4]*scale_factor))
returnVal.append(
(contig, start, stop, name,
coverage/total_coverage[4]*scale_factor))
finally:
if output_is_file:
output.close()
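
The effect of `normalize` on the reported values can be sketched with plain numbers. The helper below is a hypothetical, standalone illustration: the name `scale_coverages` and its tuple layout are assumptions modeled on the `(contig, start, stop, name, coverage)` tuples in the diff. It assumes the intended behavior is the one the docstring describes, namely dividing each coverage by the total and skipping `scale_factor` when `normalize` is set.

```python
from typing import List, Tuple

# (contig, start, stop, name, value), mirroring the tuples in the diff.
Row = Tuple[str, int, int, str, float]


def scale_coverages(
    coverages: List[Row],
    total_coverage: float,
    scale_factor: float = 1e6,
    normalize: bool = False,
) -> List[Row]:
    """Divide each coverage by the total, then apply the scale factor.

    With normalize=True the scale factor is ignored (treated as 1).
    """
    if total_coverage <= 0:
        raise ValueError("total_coverage must be positive")
    factor = 1.0 if normalize else scale_factor
    return [
        (contig, start, stop, name, cov / total_coverage * factor)
        for contig, start, stop, name, cov in coverages
    ]


rows = [("chr1", 0, 100, ".", 25.0), ("chr1", 100, 200, ".", 75.0)]
normalized = scale_coverages(rows, total_coverage=100.0, normalize=True)
scaled = scale_coverages(rows, total_coverage=100.0)

print([row[4] for row in normalized])  # [0.25, 0.75]
print([row[4] for row in scaled])      # [250000.0, 750000.0]
```

Which fragments count toward the total is not visible in this hunk, so the sketch takes the total as an explicit argument rather than computing it.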
4 changes: 3 additions & 1 deletion src/finaletoolkit/frag/multi_wps.py
@@ -39,7 +39,9 @@ def multi_wps(
BAM, SAM, or tabix file containing paired-end fragment reads or its
path. `AlignmentFile` must be opened in read mode.
site_bed: str
Bed file containing intervals to perform WPS on.
BED file containing intervals to perform WPS on. The intervals
in this BED file should be sorted, first by `contig` then
`start`.
output_file : string, optional
window_size : int, optional
Size of window to calculate WPS. Default is k = 120, equivalent
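
Since `site_bed` is expected to be sorted by contig and then by start, a quick pre-check can catch unsorted input before running WPS. The sketch below is a hypothetical helper, not part of finaletoolkit. It assumes plain lexicographic ordering of contig names (the order `sort -k1,1 -k2,2n` would produce), which may differ from the ordering a particular reference or tool expects.

```python
from typing import Iterable, List, Tuple

Interval = Tuple[str, int, int]


def parse_bed_lines(lines: Iterable[str]) -> List[Interval]:
    """Read (contig, start, stop) from tab-separated BED lines."""
    intervals = []
    for line in lines:
        line = line.strip()
        # Skip blank lines and common BED header lines.
        if not line or line.startswith(("#", "track", "browser")):
            continue
        contig, start, stop = line.split("\t")[:3]
        intervals.append((contig, int(start), int(stop)))
    return intervals


def is_sorted(intervals: List[Interval]) -> bool:
    """True if intervals are ordered by contig, then by start."""
    keys = [(contig, start) for contig, start, _ in intervals]
    return all(a <= b for a, b in zip(keys, keys[1:]))


def sort_intervals(intervals: List[Interval]) -> List[Interval]:
    """Return intervals sorted by contig (lexicographic), then start."""
    return sorted(intervals, key=lambda iv: (iv[0], iv[1]))


bed_lines = ["chr2\t100\t200", "chr1\t500\t600", "chr1\t50\t150"]
parsed = parse_bed_lines(bed_lines)
already_sorted = is_sorted(parsed)
fixed = sort_intervals(parsed)

print(already_sorted)  # False
print(fixed)
```

For files on disk, the same check can be applied by passing an open file handle to `parse_bed_lines`, since it accepts any iterable of lines.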
2 changes: 1 addition & 1 deletion src/finaletoolkit/version.py
@@ -2,5 +2,5 @@
Single-source module for the package version number.
"""

__version__ = "0.7.6"
__version__ = "0.7.8"
