Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix for building and passing tests for 32-bit archs #39

Merged
merged 1 commit into from
Oct 12, 2020

Conversation

nileshpatra
Copy link
Contributor

Hi,

I suppose NCLS is compatible to run with 32-bit archs right?

The tests actually seem to fail with those archs due to very large values as well as type mismatch at a couple of points.
This is my attempt to fix it.

I tried testing it in a 32-bit environment and it works.

@nileshpatra
Copy link
Contributor Author

@endrebak this is a bug in Debian with NCLS failing in 32-bit arches. Could you please review this?

@endrebak
Copy link
Collaborator

Thanks! I will do so after work today :)

@endrebak
Copy link
Collaborator

I will need to look more closely. This might break macOS. Let me test.

@endrebak
Copy link
Collaborator

I am not an expert on C types, but long means different things on different architectures, I think. So it is best to use int64_t, not long.

The test is for 64-bit architectures (actually, I do not test NCLS directly, only indirectly through the extensive tests in pyranges).

Why does the build fail on 32-bit architectures?

Wouldn't it be better to just skip the tests on 32-bit architectures?

I need more context. I think by using long instead of int64_t you are going to break NCLS for windows.

@nileshpatra
Copy link
Contributor Author

I am not an expert on C types, but long means different things on different architectures, I think. So it is best to use int64_t, not long.

I'm not sure here - if the dtype of output_arr is np.long, why should the datatype of output be int64_t?

This lead to a build failure leading to inconsistent types here.

The test is for 64-bit architectures (actually, I do not test NCLS directly, only indirectly through the extensive tests in pyranges).

OK,

Why does the build fail on 32-bit architectures?

Because of two things:

  1. The values assigned in the numpy arrays in the test suite are extremely large and unsuited for 32-bit archs. The log for the same can be seen here

  2. The assignment of output_arr to output is incompatible due to the former being a long dtype. This is a bigger problem since this means that the function is broken for 32-bit archs.

Wouldn't it be better to just skip the tests on 32-bit architectures?

That can be an option but it might be better to actually fix the problem instead.

I need more context. I think by using long instead of int64_t you are going to break NCLS for windows.

I admit, I don't use windows. Maybe modifying .travis.yml can give us the results?

Again, since you're the author of the package you are much more aware of how this works - please let me know what you'd think should be the best solution?

@endrebak
Copy link
Collaborator

I will try to make the types consistent. I will also look into how to check whether we are testing in a 32-bit arch. Then I can run the appropriate tests for 32 and 64-bit arches.

@endrebak
Copy link
Collaborator

Does 0.0.55 work? I've added a check in the tests to see whether we are on a 32 or 64-bit architecture: https://github.com/biocore-ntnu/ncls/blob/master/tests/test_ncls.py

@nileshpatra
Copy link
Contributor Author

Just tested it. It doesn't, fails with:

=============================================================================================== test session starts ================================================================================================
platform linux -- Python 3.8.6, pytest-4.6.11, py-1.9.0, pluggy-0.13.0
rootdir: /ncls/python-ncls
collected 2 items                                                                                                                                                                                                  

tests/test_ncls.py .F                                                                                                                                                                                        [100%]

===================================================================================================== FAILURES =====================================================================================================
____________________________________________________________________________________________ test_all_containments_both ____________________________________________________________________________________________

    def test_all_containments_both():
    
        starts = np.array([5, 10], dtype=int)
        ends = np.array([6, 50], dtype=int)
        ids = np.array([0, 1], dtype=int)
    
>       ncls = NCLS(starts, ends, ids)

tests/test_ncls.py:78: 
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 
ncls/__init__.py:11: in NCLS
    return NCLS32(starts, ends, ids)
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 

>   def __cinit__(self, const int32_t [::1] starts=None, const int32_t [::1] ends=None, const int64_t [::1] ids=None):
E   ValueError: Buffer dtype mismatch, expected 'const int64_t' but got 'long'

ncls/src/ncls32.pyx:36: ValueError
======================================================================================== 1 failed, 1 passed in 0.91 seconds ========================================================================================

@nileshpatra
Copy link
Contributor Author

nileshpatra commented Oct 12, 2020

@endrebak on applying this patch, it works. I've modified the PR accordingly.

diff --git a/tests/test_ncls.py b/tests/test_ncls.py
index 0637b5a..cc4a4c3 100644
--- a/tests/test_ncls.py
+++ b/tests/test_ncls.py
@@ -46,7 +46,7 @@ if struct.calcsize("P") * 8 == 64:
         assert list(subs) == [0, 1] == list(covers)
 
 else:
-    
+
     starts = np.array([5, 2_147_483_645], dtype=np.int64)
 
     ends = np.array([6, 2_147_483_646], dtype=np.int64)
@@ -71,9 +71,9 @@ else:
 
     def test_all_containments_both():
 
-        starts = np.array([5, 10], dtype=int)
-        ends = np.array([6, 50], dtype=int)
-        ids = np.array([0, 1], dtype=int)
+        starts = np.array([5, 10], dtype=np.int64)
+        ends = np.array([6, 50], dtype=np.int64)
+        ids = np.array([0, 1], dtype=np.int64)
 
         ncls = NCLS(starts, ends, ids)
         subs, covers = ncls.all_containments_both(starts, ends, ids)

Also, could you please tag NCLS with a 0.0.55 github release? It'd enable me to update the same package in Debian accordingly.

@endrebak
Copy link
Collaborator

I am confused though. Shouldn't it be np.int32 if you want to use it on a 32-bit architecture?

@endrebak
Copy link
Collaborator

endrebak commented Oct 12, 2020

Can you try with int32 instead of int64 and confirm that it still works? (I am guessing the 32-64 bits are just ignored and that is why it works)

@endrebak
Copy link
Collaborator

I've uploaded 0.0.56 where I corrected the test. It uses 32-bits consistently. I had made some sloppy mistakes, sorry.

@endrebak
Copy link
Collaborator

Please report whether 0.0.56 works :)

@nileshpatra
Copy link
Contributor Author

@endrebak it doesn't work again.

=============================================================================================== test session starts ================================================================================================
platform linux -- Python 3.8.6, pytest-4.6.11, py-1.9.0, pluggy-0.13.0
rootdir: /ncls/python-ncls
collected 2 items                                                                                                                                                                                                  

tests/test_ncls.py FF                                                                                                                                                                                        [100%]

===================================================================================================== FAILURES =====================================================================================================
____________________________________________________________________________________________________ test_ncls _____________________________________________________________________________________________________

    def test_ncls():
        # ids = starts
    
        print(starts, ends, ids)
    
>       ncls = NCLS(starts, ends, ids)

tests/test_ncls.py:61: 
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 
ncls/__init__.py:11: in NCLS
    return NCLS32(starts, ends, ids)
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 

>   def __cinit__(self, const int32_t [::1] starts=None, const int32_t [::1] ends=None, const int64_t [::1] ids=None):
E   ValueError: Buffer dtype mismatch, expected 'const int64_t' but got 'long'

ncls/src/ncls32.pyx:36: ValueError
----------------------------------------------------------------------------------------------- Captured stdout call -----------------------------------------------------------------------------------------------
[         5 2147483645] [         6 2147483646] [0 3]
____________________________________________________________________________________________ test_all_containments_both ____________________________________________________________________________________________

    def test_all_containments_both():
    
        starts = np.array([5, 10], dtype=np.int32)
        ends = np.array([6, 50], dtype=np.int32)
        ids = np.array([0, 1], dtype=np.int32)
    
>       ncls = NCLS(starts, ends, ids)

tests/test_ncls.py:78: 
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 
ncls/__init__.py:11: in NCLS
    return NCLS32(starts, ends, ids)
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 

>   def __cinit__(self, const int32_t [::1] starts=None, const int32_t [::1] ends=None, const int64_t [::1] ids=None):
E   ValueError: Buffer dtype mismatch, expected 'const int64_t' but got 'long'

ncls/src/ncls32.pyx:36: ValueError
============================================================================================= 2 failed in 0.25 seconds =============================================================================================

This was the case earlier as well with np.int32 - I've no idea why though.
I've again pushed what works in the PR, please consider merging and tagging this with the new (github)release

This is the output with the patch applied:

=============================================================================================== test session starts ================================================================================================
platform linux -- Python 3.8.6, pytest-4.6.11, py-1.9.0, pluggy-0.13.0
rootdir: /ncls/python-ncls
collected 2 items                                                                                                                                                                                                  

tests/test_ncls.py ..                                                                                                                                                                                        [100%]

============================================================================================= 2 passed in 0.20 seconds =============================================================================================
	rm -fr -- /tmp/dh-xdg-rundir-nRu9sb2c

Copy link
Collaborator

@endrebak endrebak left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You are just changing the test, which is fine by me 👍

@endrebak endrebak merged commit 5970b21 into pyranges:master Oct 12, 2020
@nileshpatra
Copy link
Contributor Author

@endrebak thanks a lot for your fixes and work. Could you please tag a git release to 0.0.56 so I can upload this new version which would fix NCLS in Debian?

@nileshpatra
Copy link
Contributor Author

Alrighty, it's done. Super thanks!

@endrebak
Copy link
Collaborator

I am so happy we were able to find a solution :)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants