Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Changing how optional dependencies are import state are checked #551

Merged
merged 18 commits into from
Dec 17, 2023

Conversation

alex-rakowski
Copy link
Contributor

This PR changes how optional dependencies import states are checked, to address #548. Importing on Colab now works without requiring adding os.envion["CUPY_PYLOPS] = "0". Instead of using importlib.util.find_spec to get state, this uses importlib.import_module in a try/except block.

!pip install git+https://github.com/alex-rakowski/pylops.git@optional-depends-checker > /dev/null
import pylops
states_to_check = [
    "cupy_enabled",
    "cusignal_enabled",
    "devito_enabled",
    "numba_enabled",
    "pyfftw_enabled",
    "pywt_enabled",
    "skfmm_enabled",
    "spgl1_enabled",
    "sympy_enabled",
    "torch_enabled",
]
for state in states_to_check:
  s = getattr(pylops.utils, state)
  print(f"{state}: {s}")
  
cupy_enabled: False
cusignal_enabled: False
devito_enabled: False
numba_enabled: True
pyfftw_enabled: False
pywt_enabled: True
skfmm_enabled: False
spgl1_enabled: False
sympy_enabled: True
torch_enabled: True

I changed np.float128 and np.complex256 to np.longdouble and np.clongdouble in test_fft.py to run tests on my local mac. Tests are passing on linux machine, and on mac (with the exception of the 7 mentioned in #550). I can revert these changes if desired.

@mrava87
Copy link
Collaborator

mrava87 commented Nov 25, 2023

Hi @alex-rakowski,
thanks!

I have one main comment:

  • what you did makes sense, but if you look below the # error message at import of available package you will see that for many packages (not cupy) a similar logic was already in place as I mentioned in the GitHub issue. Now, if we want to use your check_module_enabled routine for all packages, we should modify all subsequent functions. One of the reasons we decided to go for importlib.util.find_spec followed many functions instead of one is i) to avoid having to try importing many libraries when someone does import pylops but only do it when a specific one is required, ii) because we want to give customized error messages for each library (explaining what to do, e.g. run pip install x). We tried in the past having a single function like you suggest and didn't work out, eventually it would become messy and have many if...else statements.

I personally, believe that we should just make 2 new functions called cupy_import and cusignal_import that follow the same rationale of the current ones, and inside the backend.py file have something like:

cupy_message = deps.cupy_import("the cupy backend")

if cupy_message is None:
    import cupy as cp
    import cupyx
    ...

This way we will not change the entire logic of what has worked for long time, and still get around the cupy issue you have raised. @cako, what do you think?

And a couple of minor comments:

  • the docstring should follow the style we use in PyLops, which is numpydoc
  • let's take away the test_ffts.py changes from this PR and handle them separately

@mrava87 mrava87 requested review from cako and mrava87 November 25, 2023 18:39
@mrava87
Copy link
Collaborator

mrava87 commented Nov 26, 2023

Thanks!

The docstring is better now :) I'll wait to hear from @cako regarding my concerns and then we can agree all together what is the best way forward.

@cako
Copy link
Collaborator

cako commented Nov 26, 2023

Thanks for the issue and the patch @alex-rakowski. The way I see it, since find_spec does not actually try to import the module, it is susceptible to poorly installed libraries, which is the case of cupy in google colab. find_spec finds the module, but the import fails.

I see a few options in handling this:

  1. @alex-rakowski patch: switch everything to try/except. This means that no faulty module will ever be imported, regardless whether it is nominally imported or not. PyLops will never bug out due to faulty installations. But it will have the extraneous imports that @mrava87 mentioned. And linters will complain about the try/except block (currently noqa'd). Suggestion for improvement: add flags for all optional libraries, and check the flag before testing imports.
  2. The current "solution": recognize that one has a faulty import, and let the user add a flag to disable it. If well documented, this is not a bad solution. Suggestion to improve: like above, add flag for all other optional libraries; add better docs under Installation>Advanced Installation>Optional Dependencies.
  3. Use try/except for cupy/cusignal, use find_spec for everything else. This option "recognizes" that cupy is a troublesome library to be handled specially. It should flag/differentiate between a working cupy install, a nonworking cupy install and a missing cupy install. Downside is that if any other library has a fault install, we'll be back in the Issues or worse, lose users who.

I'm personally not a huge fan of catering to faulty installs. Someone with a faulty install but with a GPU may be confused why PyLops doesnt complain about import cupy but they can't use the GPU. But I also understand that the Colab issue is a common enough case that should be handled cleanly and clearly. Personally I'd like to see something like 3 where PyLops explicitly says that cupy could not be imported even though it is installed. PyLops would still be imported and work, but the warning would be registered.


Regarding the FFT issue, that's a great catch. Upon further investigation, whereas float64 and float32 always have the claimed precision, float128 is always has the same precision of longdouble. On the other hand, longdouble needs to only be as precise as double. In practice it can be anywhere from 64, 80 or 128 precision. So your patch actually does not degrade the precision.

For bookkeeping though, I would agree with @mrava87 that this would be better suited in another PR.

@mrava87
Copy link
Collaborator

mrava87 commented Nov 26, 2023

Thanks @cako for your thought!

Just to be clear, what I am saying is that right now we have a slight different approach for cupy/cusignal than for all the other optional dependencies:

  • for cupy/cusignal we have:

cupy_enabled = ( util.find_spec("cupy") is not None and int(os.getenv("CUPY_PYLOPS", 1)) == 1 )

and we use cupy_enabled at import time (e.g., in `backend.py)

  • for all others (taking devito as example) we have:
devito_enabled = util.find_spec("devito") is not None

and

def devito_import(message):
....

If I remember correctly, the need to introduce the env variable CUPY_PYLOPS was indeed for platforms like Colab where one may want to ensure the cupy backend is disabled even if cupy is installed (as the user has no freedom to choose what to install).

So to your points:

1 . we already use try/except everywhere but in cupy/cusignal. If we add def cupy_import(message): and def cusignal_import(message):, we will have a common strategy for all optional deps. Now using what @alex-rakowski proposed is a one function for all, it may be nice as it reduces code length but I struggle to see how to provide customized messages to users to help them understand what is the problem of an optional dependencies not working (see an example for devito in devito_import with different customized messages (eg how to pip install). Do you agree, or I am not making myself clear here?
2. we could do that, but as our code is currently written it would not make much sense for all deps other that cupy/cusignal as they are protected by try/except in x_import modules.
3. given my comments above, I believe this does not make sense as we already use try/except for all but cupy/cusignal.

@mrava87
Copy link
Collaborator

mrava87 commented Nov 26, 2023

Now, this is the main reason for having a 2 steps strategy:

In [3]: %time util.find_spec('numpy')
CPU times: user 317 µs, sys: 1.15 ms, total: 1.46 ms
Wall time: 1.49 ms
Out[3]: ModuleSpec(name='numpy', loader=<_frozen_importlib_external.SourceFileLoader object at 0x10f323410>, origin='/opt/anaconda3/envs/pylops_test/lib/python3.11/site-packages/numpy/__init__.py', submodule_search_locations=['/opt/anaconda3/envs/pylops_test/lib/python3.11/site-packages/numpy'])

In [4]: %time import numpy
Intel MKL WARNING: Support of Intel(R) Streaming SIMD Extensions 4.2 (Intel(R) SSE4.2) enabled only processors has been deprecated. Intel oneAPI Math Kernel Library 2025.0 will require Intel(R) Advanced Vector Extensions (Intel(R) AVX) instructions.
Intel MKL WARNING: Support of Intel(R) Streaming SIMD Extensions 4.2 (Intel(R) SSE4.2) enabled only processors has been deprecated. Intel oneAPI Math Kernel Library 2025.0 will require Intel(R) Advanced Vector Extensions (Intel(R) AVX) instructions.
CPU times: user 570 ms, sys: 69.2 ms, total: 639 ms
Wall time: 976 ms

So when you do import pylops you will not try to import all optional dependencies as the new code proposed by @alex-rakowski would do, but only later when someone tries to import a specific submodule... but maybe I am just overthinking and this doesn't matter too much in the grand scheme of things?

@alex-rakowski
Copy link
Contributor Author

Just to check, it seems that the proposed solution is to keep the util.find_spec function as it was for the other imports and create cupy_import and cusignal_import functions. These will be defined and called in depnds.py to set cupy_enabled and cusignal_enabled, and they should do so noisily (printing a UserWarning ~ Cupy appears to be installed but cannot be imported, falling back to CPU compute) if they detect cupy/cusignal is installed but cannot be imported. Is this correct?

@mrava87
Copy link
Collaborator

mrava87 commented Nov 27, 2023

Correct, this is my suggestion. @cako ?

@cako
Copy link
Collaborator

cako commented Nov 29, 2023

That sounds like the ideal solution to me as well!

@mrava87
Copy link
Collaborator

mrava87 commented Dec 5, 2023

@alex-rakowski, just checking that it is clear from your end that we are waiting for you to adapt your code based on the agreed solution... no rush from our end though, happy to handle this at any time :)

@alex-rakowski
Copy link
Contributor Author

@mrava87 @cako yes thanks for the clarification. I should have time to work on these changes this week.

@alex-rakowski alex-rakowski marked this pull request as ready for review December 12, 2023 07:26
@alex-rakowski
Copy link
Contributor Author

I believe this is ready and matches the behavior described above.

I've checked locally and on Google Colab which now imports, and the pylops.utils.deps booleans look correct for what I expected, and pylops can now be imported on CPU and GPU runtime.

I tested the behavior with the following:

!pip install git+https://github.com/alex-rakowski/pylops.git@optional-depends-checker > /dev/null

import pylops
print()
states_to_check = [
    "cupy_enabled",
    "cusignal_enabled",
    "devito_enabled",
    "numba_enabled",
    "pyfftw_enabled",
    "pywt_enabled",
    "skfmm_enabled",
    "spgl1_enabled",
    "sympy_enabled",
    "torch_enabled",
]
for state in states_to_check:
  s = getattr(pylops.utils, state)
  print(f"{state}: {s}")

On the CPU runtime, which is the environment with Cupy installed but no GPU, it prints an error on import pylops to say that ~ "cupy couldn't be imported and the ensure CUDA environment is correct" with a link to cupy install instructions. I expect the same behavior for cusignal, but I haven't found a way to create a malformed environment to test this.

@mrava87
Copy link
Collaborator

mrava87 commented Dec 12, 2023

Thanks! I'll get back to this later this week and let you know if I have any question :)

@cako
Copy link
Collaborator

cako commented Dec 16, 2023

LGTM. @mrava87 I'll leave it to you to merge if you approve as well!

@mrava87
Copy link
Collaborator

mrava87 commented Dec 17, 2023

I’m looking into it, as I found a few typos and a couple of things I want to test before merging… hopefully it will be ready by the end of the week :)

@mrava87 mrava87 merged commit 77eca0f into PyLops:dev Dec 17, 2023
15 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants