Update version 3.5.1 (#290)
* remove constraint on openblas

* add message if bedtools is not installed.

* overcome potential other error

* Test tabix independently of extension

* modify get_scores to reach end_region

* linting

* mention installation of BEDTools in the doc

* update the requirements in the doc

* Highlight BEDTools as external dependency

* raise an error when BEDTools is not installed

* use one more log in utilities

* copy the expected pdf before compare_images

* linting

* add line in FAQ regarding hicmatrix14

* fix broken link in all properties table

* update installation doc

* change error to warning

* update version of HiCMatrix

* update version

* copy the bug fix line of hicexplorer to pgt

Co-authored-by: Joachim Wolff <wolffj@informatik.uni-freiburg.de>
lldelisle and joachimwolff authored Oct 12, 2020
1 parent 709aa6e commit 1c6b592
Showing 20 changed files with 224 additions and 103 deletions.
3 changes: 3 additions & 0 deletions README.md
Expand Up @@ -56,6 +56,9 @@ Also, pyGenomeTracks can be installed using pip
$ pip install pyGenomeTracks
```

Since version 3.5, pyGenomeTracks uses BEDTools; don't forget to install it or load it into your environment.


To install the latest version, use:

```bash
118 changes: 59 additions & 59 deletions docs/content/all_default_properties_rst.txt

Large diffs are not rendered by default.

19 changes: 17 additions & 2 deletions docs/content/faq.rst
Expand Up @@ -4,12 +4,27 @@ FAQ
.. contents::
:local:

-Why the scale of my Hi-C plot suddenly changed
-----------------------------------------------
+Why the scale of my Hi-C plot suddenly changed?
+-----------------------------------------------
pyGenomeTracks is using `HiCMatrix <https://github.com/deeptools/HiCMatrix>`_ to read the matrix from ``h5`` and ``cool`` format.
From version 12 to version 13, a normalization step when reading ``cool`` file was removed. This normalization was mostly used
when you were providing ``cool`` file from `cooler balance <https://cooler.readthedocs.io/en/latest/cli.html#cooler-balance>`_.

If you want to keep the old scale you need to downgrade to HiCMatrix version 12, but version 13 also corrects some bugs, so we advise
changing the ``max_value`` in your parameter file to adjust to the new scale.

No output generated with version 3.5 installed with pip
-------------------------------------------------------
If you used pyGenomeTracks version 3.5 and the last line you get is:

.. code:: bash

    INFO:pygenometracks.tracksClass:initialize x. [xxxxx]

It is highly probable that BEDTools is not installed or not loaded in your environment.

My Hi-C plot looks like no correction was applied when using cool matrix
------------------------------------------------------------------------
pyGenomeTracks is using `HiCMatrix <https://github.com/deeptools/HiCMatrix>`_ to read the matrix from ``cool`` format.
Unfortunately, a bug introduced in version 14 caused the correction factors to be ignored.
This bug was fixed in version 15, so updating HiCMatrix to the latest version should fix it.
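
To see which HiCMatrix version is installed before following the FAQ advice above, a check along these lines could be used. This is a minimal sketch and not part of the commit: `installed_major_version` is a hypothetical helper, and the distribution name `HiCMatrix` is assumed to be the one registered with the package installer.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_major_version(package):
    """Return the major version of an installed package, or None."""
    try:
        # Keep only the part before the first dot, e.g. "15" or "3.5.1" -> 15 or 3
        return int(version(package).split(".")[0])
    except (PackageNotFoundError, ValueError):
        # Not installed, or a version string that does not start with an integer
        return None


major = installed_major_version("HiCMatrix")  # assumed distribution name
if major is None:
    print("HiCMatrix does not seem to be installed")
elif major < 15:
    print(f"HiCMatrix {major} found: consider upgrading to version 15 or later")
else:
    print(f"HiCMatrix {major} found")
```

Returning `None` rather than raising keeps the check usable in environments where the package is absent.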
26 changes: 18 additions & 8 deletions docs/content/installation.rst
Expand Up @@ -10,14 +10,22 @@ Remember -- pyGenomeTracks is available for **command line usage** as well as fo
Requirements
-------------

-* Python >=3.6
+Python dependencies:
+
+* Python >= 3.6
 * numpy >= 1.16
-* intervaltree >=2.1.0
-* pyBigWig >= 0.3.4
-* hicmatrix >= 0.14
-* pysam >= 0.8
-* matplotlib >= 3.1.1
-* gffutils >=0.9
+* intervaltree >= 2.1.0
+* pyBigWig >= 0.3.16
+* hicmatrix >= 15
+* pysam >= 0.14
+* matplotlib == 3.1.1
+* gffutils >= 0.9
+* pybedtools >= 0.8.1
+* tqdm >= 4.20
+
+External dependencies:
+
+* BEDTools
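
Because BEDTools is an external dependency that `pip` does not install, it can help to check that it is reachable before plotting. The following is a minimal sketch, assuming the executable is called `bedtools` and is expected on `PATH`; `bedtools_available` is a hypothetical helper, not a pyGenomeTracks function.

```python
import shutil


def bedtools_available(executable="bedtools"):
    """Return True if the executable can be found on the current PATH."""
    return shutil.which(executable) is not None


if not bedtools_available():
    print("BEDTools was not found: install it or load it into your environment")
```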

Command line installation using ``conda``
-----------------------------------------
Expand All @@ -44,6 +52,8 @@ Install pyGenomeTracks using the following command:

All python requirements should be automatically installed.

Since version 3.5, pyGenomeTracks requires BEDTools; do not forget to install it or load it into your environment.

If you need to specify a specific path for the installation of the tools, make use of `pip install`'s numerous options:

.. code:: bash
Expand All @@ -55,7 +65,7 @@ Command line installation without ``pip``

We highly recommend using `conda install` rather than the following complicated steps.

-1. Install the requirements listed above in the "requirements" section. This is done automatically by `pip`.
+1. Install the requirements listed above in the "requirements" section. This is done automatically by `pip` (except BEDTools).

2. Download source code
::
1 change: 1 addition & 0 deletions docs/content/releases.rst
Expand Up @@ -4,6 +4,7 @@ Releases
.. toctree::
:maxdepth: 1

releases/3.5.1
releases/3.5
releases/3.4
releases/3.3
13 changes: 13 additions & 0 deletions docs/content/releases/3.5.1.rst
@@ -0,0 +1,13 @@
3.5.1
=====

Bugfixes:
^^^^^^^^^

- Get a message when BEDTools is not installed instead of crashing without any message.

- Always test if a bedgraph is tabix indexed without checking the extension.

- Fix a bug which happened when ``operation`` or ``summary_method`` was used on a bedgraph that had some missing intervals.

- Enforce version 15 of HiCMatrix. Version 14 had a bug concerning the application of the correction factors of cool files.
2 changes: 1 addition & 1 deletion environment.yml
Expand Up @@ -11,7 +11,7 @@ dependencies:
   - intervaltree >=2.1.0
   - pybigwig >=0.3.16
   - future >=0.17.0
-  - hicmatrix >=13
+  - hicmatrix >=15
   - pysam >=0.14
   - pytest
   - gffutils >=0.9
2 changes: 1 addition & 1 deletion pygenometracks/_version.py
Expand Up @@ -2,4 +2,4 @@
 # This file is originally generated from Git information by running 'setup.py
 # version'. Distribution tarballs contain a pre-generated copy of this file.

-__version__ = '3.5'
+__version__ = '3.5.1'
2 changes: 1 addition & 1 deletion pygenometracks/getAllDefaultsAndPossible.py
Expand Up @@ -85,7 +85,7 @@ def main():

     # For the default they are summarized in a matrix
     mat = np.empty((len(all_default_parameters) + 2, len(all_tracks_with_default) + 1),
-                   dtype='U25')
+                   dtype='U100')
     mat[0, 0] = 'parameter'
     mat[1, 0] = '--'
     for j, track_type in enumerate(all_tracks_with_default, start=1):
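
The commit does not state why the string width was raised from `U25` to `U100`. One plausible reason is that NumPy silently truncates strings assigned into a fixed-width unicode array, which would cut long default values. A minimal sketch of that behaviour, using a made-up 40-character value:

```python
import numpy as np

label = "x" * 40  # hypothetical 40-character default value

narrow = np.empty((1, 1), dtype='U25')
narrow[0, 0] = label  # silently truncated to 25 characters

wide = np.empty((1, 1), dtype='U100')
wide[0, 0] = label    # stored in full

print(len(narrow[0, 0]), len(wide[0, 0]))
```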
1 change: 1 addition & 0 deletions pygenometracks/tests/generateAllOutput.sh
Expand Up @@ -34,6 +34,7 @@ bin/pgt --tracks ./pygenometracks/tests/test_data/bedgraph_useMid.ini --region c
bin/pgt --tracks ./pygenometracks/tests/test_data/operation_bdg.ini --region X:2700000-3100000 --trackLabelFraction 0.2 --dpi 130 -o ./pygenometracks/tests/test_data/master_operation_bdg.png
bin/pgt --tracks ./pygenometracks/tests/test_data/bedgraph_withNA.ini --region X:2700000-3100000 --trackLabelFraction 0.2 --dpi 130 -o ./pygenometracks/tests/test_data/master_bedgraph_withNA.png
bin/pgt --tracks ./pygenometracks/tests/test_data/bedgraph_negative.ini --region X:2700000-3100000 --trackLabelFraction 0.2 --dpi 130 -o ./pygenometracks/tests/test_data/master_negative.png
bin/pgt --tracks ./pygenometracks/tests/test_data/bedgraph_end_not_covered.ini --region chr7:100-400 --trackLabelFraction 0.2 --dpi 130 -o ./pygenometracks/tests/test_data/master_bedgraph_end_not_covered.png

# test bigWigTrack:
bin/pgt --tracks ./pygenometracks/tests/test_data/bigwig.ini --region X:2700000-3100000 --trackLabelFraction 0.2 --dpi 130 -o ./pygenometracks/tests/test_data/master_bigwig.png
36 changes: 36 additions & 0 deletions pygenometracks/tests/test_bedGraphTrack.py
Expand Up @@ -214,6 +214,18 @@
 with open(os.path.join(ROOT, "log1pm_bedgraph.ini"), 'w') as fh:
     fh.write(log1p_with_neg)


+bedgraph_end_not_covered = """
+[bedgraph]
+file = simple.bdg
+height = 3
+summary_method = max
+[x-axis]
+"""
+with open(os.path.join(ROOT, "bedgraph_end_not_covered.ini"), 'w') as fh:
+    fh.write(bedgraph_end_not_covered)

 tolerance = 13 # default matplotlib pixed difference tolerance


Expand Down Expand Up @@ -268,6 +280,14 @@ def test_plot_bedgraph_tracks_rasterize():
     ini_file = os.path.join(ROOT, "bedgraph_useMid.ini")
     region = "chr2:73,800,000-75,744,000"
     expected_file = os.path.join(ROOT, 'master_bedgraph_useMid.pdf')
+    # matplotlib compare on pdf will create a png next to it.
+    # To avoid issues related to write in test_data folder
+    # We copy the expected file into a temporary place
+    new_expected_file = NamedTemporaryFile(suffix='.pdf',
+                                           prefix='pyGenomeTracks_test_',
+                                           delete=False)
+    os.system(f'cp {expected_file} {new_expected_file.name}')
+    expected_file = new_expected_file.name
     args = f"--tracks {ini_file} --region {region} "\
         "--trackLabelFraction 0.2 --width 38 --dpi 130 "\
         f"--outFileName {outfile.name}".split()
Expand Down Expand Up @@ -419,3 +439,19 @@ def test_bedgraph_neg_log1p():

     os.remove(ini_file)
     os.remove(os.path.join(ROOT, "bedgraph_chrx_2e6_5e6_m.bg"))


+def test_bedgraph_end_not_covered():
+    region = "chr7:100-400"
+    outfile = NamedTemporaryFile(suffix='.png', prefix='bedgraph_end_not_covered_', delete=False)
+    args = "--tracks {ini} --region {region} --trackLabelFraction 0.2 " \
+           "--dpi 130 --outFileName {outfile}" \
+           "".format(ini=os.path.join(ROOT, "bedgraph_end_not_covered.ini"),
+                     outfile=outfile.name, region=region).split()
+    pygenometracks.plotTracks.main(args)
+    print("saving test to {}".format(outfile.name))
+    res = compare_images(os.path.join(ROOT, 'master_bedgraph_end_not_covered.png'),
+                         outfile.name, tolerance)
+    assert res is None, res

+    os.remove(outfile.name)
7 changes: 7 additions & 0 deletions pygenometracks/tests/test_data/bedgraph_end_not_covered.ini
@@ -0,0 +1,7 @@

[bedgraph]
file = simple.bdg
height = 3
summary_method = max

[x-axis]
6 changes: 6 additions & 0 deletions pygenometracks/tests/test_data/simple.bdg
@@ -0,0 +1,6 @@
track type=bedGraph name="400-404notIncluded"
chr7 100 200 1
chr7 200 300 2
chr7 300 350 3
chr7 350 399 4
chr7 405 450 5
21 changes: 16 additions & 5 deletions pygenometracks/tests/test_hiCMatrixTracks.py
Expand Up @@ -468,13 +468,19 @@ def test_plot_tracks_with_hic_rasterize_height_2chr():
         output_file = outfile.name[:-4] + '_' + region_str + extension
         expected_file = os.path.join(ROOT, 'master_plot_hic_rasterize_height_'
                                      + region_str + extension)
+        # matplotlib compare on pdf will create a png next to it.
+        # To avoid issues related to write in test_data folder
+        # We copy the expected file into a temporary place
+        new_expected_file = NamedTemporaryFile(suffix='.pdf',
+                                               prefix='pyGenomeTracks_test_',
+                                               delete=False)
+        os.system(f'cp {expected_file} {new_expected_file.name}')
+        expected_file = new_expected_file.name
         res = compare_images(expected_file,
                              output_file, tolerance)
         assert res is None, res

         os.remove(output_file)
-        if extension == '.pdf':
-            os.remove(expected_file.replace(extension, '_pdf.png'))


def test_plot_tracks_with_hic_rasterize_height_2chr_individual():
Expand All @@ -485,15 +491,20 @@ def test_plot_tracks_with_hic_rasterize_height_2chr_individual():
                                  delete=False)
     expected_file = os.path.join(ROOT, 'master_plot_hic_rasterize_height_'
                                  + region.replace(':', '-') + extension)
-
+    # matplotlib compare on pdf will create a png next to it.
+    # To avoid issues related to write in test_data folder
+    # We copy the expected file into a temporary place
+    new_expected_file = NamedTemporaryFile(suffix='.pdf',
+                                           prefix='pyGenomeTracks_test_',
+                                           delete=False)
+    os.system(f'cp {expected_file} {new_expected_file.name}')
+    expected_file = new_expected_file.name
     args = f"--tracks {ini_file} --region {region} "\
            "--trackLabelFraction 0.23 --width 38 --dpi 10 "\
            f"--outFileName {outfile.name}".split()
     pygenometracks.plotTracks.main(args)
     res = compare_images(expected_file,
                          outfile.name, tolerance)
-    if extension == '.pdf':
-        os.remove(expected_file.replace(extension, '_pdf.png'))
     assert res is None, res
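
The tests above copy the expected PDF to a temporary location with `os.system('cp ...')`, which relies on a `cp` command being available. A portable variant could use `shutil.copyfile` instead. The following is only a sketch of that alternative, not what the commit does; `copy_to_temp` is a hypothetical helper and the placeholder bytes stand in for a real master PDF.

```python
import os
import shutil
from tempfile import NamedTemporaryFile


def copy_to_temp(expected_file, suffix='.pdf'):
    """Copy expected_file to a new temporary file and return the new path."""
    tmp = NamedTemporaryFile(suffix=suffix,
                             prefix='pyGenomeTracks_test_',
                             delete=False)
    tmp.close()  # only the path is needed
    shutil.copyfile(expected_file, tmp.name)
    return tmp.name


# Usage with a throwaway file standing in for a master PDF
source = NamedTemporaryFile(suffix='.pdf', delete=False)
source.write(b"%PDF-1.4 placeholder")
source.close()

copied = copy_to_temp(source.name)
with open(copied, 'rb') as fh:
    copied_bytes = fh.read()

# Clean up both files
os.remove(copied)
os.remove(source.name)
```

Any side file that the image comparison writes next to the copy then lands in the temporary directory rather than in `test_data`.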


35 changes: 17 additions & 18 deletions pygenometracks/tracks/BedGraphTrack.py
Expand Up @@ -140,15 +140,13 @@ def __init__(self, properties_dict):
" requires to set the parameter"
" second_file.")
             else:
-                if self.properties['second_file'].endswith(".bgz"):
+                # First try to open it as a Tabix file
+                try:
                     # from the tabix file is not possible to know the
                     # global min and max
-                    try:
-                        self.tbx2 = pysam.TabixFile(self.properties['second_file'])
-                    except IOError:
-                        self.interval_tree2, __, __ = file_to_intervaltree(self.properties['second_file'])
-                # load the file as an interval tree
-                else:
+                    self.tbx2 = pysam.TabixFile(self.properties['second_file'])
+                except IOError:
+                    # load the file as an interval tree
                     self.interval_tree2, __, __ = file_to_intervaltree(self.properties['second_file'])

def set_properties_defaults(self):
Expand Down Expand Up @@ -183,17 +181,13 @@ def set_properties_defaults(self):

     def load_file(self):
         self.tbx = None
-        # try to load a tabix file is available
-        if self.properties['file'].endswith(".bgz"):
+        # try to load a tabix file if available
+        try:
             # from the tabix file is not possible to know the
             # global min and max
-            try:
-                self.tbx = pysam.TabixFile(self.properties['file'])
-            except IOError:
-                self.interval_tree, __, __ = file_to_intervaltree(self.properties['file'],
-                                                                  self.properties['region'])
-        # load the file as an interval tree
-        else:
+            self.tbx = pysam.TabixFile(self.properties['file'])
+        except IOError:
+            # load the file as an interval tree
             self.interval_tree, __, __ = file_to_intervaltree(self.properties['file'],
                                                               self.properties['region'])
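
The two hunks above replace an extension check with an attempt to open the file, falling back to the slower loader when that attempt fails. The same "try first, whatever the extension" pattern can be sketched with the standard library alone, using `gzip` as a stand-in for `pysam.TabixFile`. This is an illustration under that assumption, not the project's code, and `read_text_any` is a hypothetical name.

```python
import gzip
import os
import tempfile


def read_text_any(path):
    """Try to read path as gzip first, whatever its extension; fall back to plain text."""
    try:
        with gzip.open(path, 'rt') as fh:
            return fh.read()
    except OSError:
        # gzip signals "not a gzip file" with an OSError subclass
        with open(path, 'r') as fh:
            return fh.read()


tmp_dir = tempfile.mkdtemp()

# Plain-text bedgraph line
plain_path = os.path.join(tmp_dir, "plain.bdg")
with open(plain_path, 'w') as fh:
    fh.write("chr7\t100\t200\t1\n")

# Compressed content stored under the same, misleading extension
gz_path = os.path.join(tmp_dir, "compressed.bdg")
with gzip.open(gz_path, 'wt') as fh:
    fh.write("chr7\t200\t300\t2\n")

plain_text = read_text_any(plain_path)
gz_text = read_text_any(gz_path)
```

Both files are read correctly although neither name reveals the format, which is the property the release notes describe as testing for a tabix index "without checking the extension".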

Expand Down Expand Up @@ -235,15 +229,15 @@ def get_scores(self, chrom_region, start_region, end_region,
                    return_nans=True, tbx_var='self.tbx', inttree_var='self.interval_tree'):
         """
         Retrieves the score (or scores or whatever fields are in a bedgraph like file) and the positions
-        for a given region.
+        for a given region. If return_nans is True the pos_list goes until at least end_region.
         In case there is no item in the region. It returns [], []
         Args:
             chrom_region:
             start_region:
             end_region:
         Returns:
             tuple:
-                scores_list, post_list
+                scores_list, pos_list
         """
         score_list = []
         pos_list = []
Expand Down Expand Up @@ -294,6 +288,11 @@ def get_scores(self, chrom_region, start_region, end_region,
             score_list.append(values)
             pos_list.append((start, end))

+        # Add a last value if needed:
+        if prev_end < end_region and return_nans:
+            score_list.append(np.repeat(np.nan, self.num_fields))
+            pos_list.append((prev_end, end_region))

         return score_list, pos_list

     def plot(self, ax, chrom_region, start_region, end_region):
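
The effect of the added "last value" block can be illustrated on the intervals of `simple.bdg` for the region `chr7:100-400` used in the new test. The function below is a simplified, standalone sketch: `pad_scores` is a hypothetical name, it handles a single score per interval, and it is not the `get_scores` method itself.

```python
import math


def pad_scores(intervals, start_region, end_region):
    """Collect scores in a region, filling uncovered stretches with NaN up to end_region."""
    score_list, pos_list = [], []
    prev_end = start_region
    for start, end, value in intervals:
        # Skip intervals entirely outside the region
        if end <= start_region or start >= end_region:
            continue
        # Fill a gap between the previous interval and this one
        if prev_end < start:
            score_list.append(math.nan)
            pos_list.append((prev_end, start))
        score_list.append(value)
        pos_list.append((start, end))
        prev_end = end
    # Add a last value if the data stop before the end of the region
    if prev_end < end_region:
        score_list.append(math.nan)
        pos_list.append((prev_end, end_region))
    return score_list, pos_list


# Intervals taken from simple.bdg (chr7 only)
simple_bdg = [(100, 200, 1), (200, 300, 2), (300, 350, 3), (350, 399, 4), (405, 450, 5)]
scores, positions = pad_scores(simple_bdg, 100, 400)
print(positions[-1])
```

Here the last covered position is 399, so a NaN entry spanning `(399, 400)` is appended and the position list reaches the end of the plotted region.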
2 changes: 1 addition & 1 deletion pygenometracks/tracks/HiCMatrixTrack.py
Expand Up @@ -155,7 +155,7 @@ def set_properties_defaults(self):
         else:
             return
         # We need to get the size before masking bins because
-        # HiCMatrix v13 give smaller chromosome_sizes after:
+        # HiCMatrix>=v13 give smaller chromosome_sizes after:
         self.chrom_sizes = self.hic_ma.get_chromosome_sizes()
         if self.properties['show_masked_bins']:
             pass
27 changes: 23 additions & 4 deletions pygenometracks/utilities.py
Expand Up @@ -7,6 +7,13 @@
 import pybedtools
 import tempfile
 import warnings
+import logging


+FORMAT = "[%(levelname)s:%(filename)s:%(lineno)s - %(funcName)20s()] %(message)s"
+logging.basicConfig(format=FORMAT)
+log = logging.getLogger(__name__)
+log.setLevel(logging.DEBUG)


class InputError(Exception):
Expand Down Expand Up @@ -86,6 +93,14 @@ def temp_file_from_intersect(file_name, plot_regions=None, around_region=0):
             file_to_open = original_file.intersect(regions, wa=True, u=True).fn
         except pybedtools.helpers.BEDToolsError:
             file_to_open = file_name
+        except NotImplementedError:
+            log.warning("BEDTools is not installed pygenometracks"
+                        " will be slower.")
+            file_to_open = file_name
+        except Exception as e:
+            log.warning(f"BEDTools intersect raised: {e}"
+                        "\nWill not subset the file.")
+            file_to_open = file_name
         sys.stderr.close()
         sys.stderr = sys.__stderr__
         with open(temporary_file.name, 'r') as f:
Expand All @@ -94,9 +109,9 @@ def temp_file_from_intersect(file_name, plot_regions=None, around_region=0):
         error_lines = [line for line in temp_std_error if 'error' in line.lower()]
         if len(error_lines) > 0:
             error_lines_printable = '\n'.join(error_lines)
-            sys.stderr.write("Bedtools intersect raised an error:\n"
-                             f"{error_lines_printable}\n"
-                             "Will not use bedtools.\n")
+            log.warning("BEDTools intersect raised an error:\n"
+                        f"{error_lines_printable}\n"
+                        "Will not use BEDTools.\n")
             file_to_open = file_name
     return file_to_open

Expand Down Expand Up @@ -171,7 +186,11 @@ def file_to_intervaltree(file_name, plot_regions=None):
         valid_intervals += 1

     if valid_intervals == 0:
-        sys.stderr.write(f"No valid intervals were found in file {file_name}")
+        if file_to_open == file_name:
+            suffix = " after intersection with the plotted region"
+        else:
+            suffix = ""
+        log.warning(f"No valid intervals were found in file {file_name}{suffix}")
     file_h.close()

     return interval_tree, min_value, max_value
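
The hunks above turn a hard failure into a logged warning followed by a fallback to the original, unsubsetted file. Reduced to the standard library, the pattern might look like the sketch below. The `FORMAT` string is the one added by the commit; `subset_or_original`, `missing_tool` and the logger name `pgt_sketch` are hypothetical and exist only for this illustration.

```python
import logging

FORMAT = "[%(levelname)s:%(filename)s:%(lineno)s - %(funcName)20s()] %(message)s"
logging.basicConfig(format=FORMAT)
log = logging.getLogger("pgt_sketch")
log.setLevel(logging.DEBUG)


def subset_or_original(file_name, subset_func):
    """Return the subset file if subset_func succeeds, else warn and return file_name."""
    try:
        return subset_func(file_name)
    except NotImplementedError:
        # The commit treats this exception as the sign that BEDTools is missing
        log.warning("BEDTools is not installed; using the whole file, which will be slower.")
    except Exception as e:
        log.warning(f"Subsetting raised: {e}\nWill not subset the file.")
    return file_name


def missing_tool(_file_name):
    """Stand-in for a subsetting step whose external tool is absent."""
    raise NotImplementedError("bedtools not found")


result = subset_or_original("simple.bdg", missing_tool)
print(result)
```

The caller always gets a usable file name back, so plotting can continue, only more slowly, when the external tool is unavailable.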