error during install of dependencies via pip; Python 3.8; gcc version 10.1.0 #307

Closed
johannesgiersdorf opened this issue Jun 7, 2020 · 20 comments

@johannesgiersdorf

Hello,
I have a problem during the installation of emukit via pip.

I am using Python 3.8.3. I try to install the current version of emukit (0.4.7).

I have tried to install it multiple times. If you want to see the details or need more information about my setup, I've created a more detailed gist. I have reproduced the bug below as a sort of summary.

How to reproduce the bug

What I wanted to do: I tried to install emukit via pip with the command pip install emukit that is suggested by the README and the Docs

First I create a virtual environment and activate it (I use venv because the first install attempt did not work out of the box, so I did not want to mess up my system and the python packaging tutorial suggested to do so)

python -m venv ./venv
source venv/bin/activate

Then I upgrade pip and install wheel (because the output of pip in previous attempts suggested to do so)

pip install --upgrade pip
pip install wheel

Then I try to install emukit via pip. For the bug report I logged stdout and stderr to separate files.

pip install emukit 1>install_f.txt  2>error_install_f.txt

The files containing the full output are attached.
error_install_f.txt
install_f.txt

The error occurs when pip tries to install scipy==1.1.0. (I find it odd that pip tries to install exactly this version of scipy even when a newer one is already installed, and even though emukit doesn't seem to depend on exactly this version.)
The whole error is very long.

Error: Rank mismatch between actual argument at (1) and actual argument at (2) (scalar and rank-1)
  21     scipy/sparse/linalg/eigen/arpack/ARPACK/SRC/znaitr.f:737:39:

Errors similar to the above line led me to assume it could be related to scipy/scipy#11611

I have also tried to install numpy, gpy and scipy before installing emukit. For details see gist.

Version of python, pip and gcc

$ python -V
Python 3.8.3
$ gcc -v
Using built-in specs.
COLLECT_GCC=gcc
COLLECT_LTO_WRAPPER=/usr/lib/gcc/x86_64-pc-linux-gnu/10.1.0/lto-wrapper
Target: x86_64-pc-linux-gnu
Configured with: /build/gcc/src/gcc/configure --prefix=/usr --libdir=/usr/lib --libexecdir=/usr/lib --mandir=/usr/share/man --infodir=/usr/share/info --with-bugurl=https://bugs.archlinux.org/ --enable-languages=c,c++,ada,fortran,go,lto,objc,obj-c++,d --with-isl --with-linker-hash-style=gnu --with-system-zlib --enable-__cxa_atexit --enable-cet=auto --enable-checking=release --enable-clocale=gnu --enable-default-pie --enable-default-ssp --enable-gnu-indirect-function --enable-gnu-unique-object --enable-install-libiberty --enable-linker-build-id --enable-lto --enable-multilib --enable-plugin --enable-shared --enable-threads=posix --disable-libssp --disable-libstdcxx-pch --disable-libunwind-exceptions --disable-werror gdc_include_dir=/usr/include/dlang/gdc
Thread model: posix
Supported LTO compression algorithms: zlib zstd
gcc version 10.1.0 (GCC) 
$ pip -V   
pip 20.1.1 from /home/<username>/Documents/uni/bayesian-quadrature/venv/lib/python3.8/site-packages/pip (python 3.8)
$ pip list                     
Package    Version
---------- -------
numpy      1.18.5
pip        20.1.1
setuptools 41.2.0
wheel      0.34.2

I appreciate any and all help.

@johannesgiersdorf
Author

It works if I use installation from sources

git clone https://github.com/amzn/Emukit.git
cd Emukit
pip install -r requirements/requirements.txt
python setup.py develop

@MashaNaslidnyk
Contributor

Thanks for this! I can reproduce it. It disappears when installing from source because the new requirements file has scipy>=1.1.0 rather than scipy==1.1.0. I can install scipy v1.4.0 through pip, but not scipy v1.1.0-v1.1.3 (I don't know why, but here's a link to the release notes: https://github.com/scipy/scipy/releases/tag/v1.4.0 ).

I think we should change requirements/requirements.txt to scipy>=1.4.0, and do a release soon - tagging @mmahsereci for review.

@apaleyes
Collaborator

apaleyes commented Jun 9, 2020

@johannesgiersdorf thanks for a very detailed error report, much appreciated!

@MashaNaslidnyk I'd be cautious with this, and would try to keep the deps version as inclusive as possible. Let's deep dive on the error and see if we can fix it, before considering other options.

The error file starts with this: ModuleNotFoundError: No module named 'numpy'. That's weird. Notice that when install is being executed, scipy is installed first, before numpy. I am not sure if that is correct.
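
A quick way to check the precondition behind that ModuleNotFoundError is to ask the interpreter whether numpy can be found before the scipy build starts. The following is a minimal sketch, not part of emukit or pip; the helper name `missing_modules` is made up for illustration, and it only tells you whether a module is findable in the current environment, not whether a later source build will succeed.

```python
import importlib.util


def missing_modules(names):
    """Return the names from `names` that cannot be found for import."""
    # find_spec returns None for a top-level module that is not installed.
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    # numpy is the module named in the ModuleNotFoundError from the log.
    missing = missing_modules(["numpy"])
    if missing:
        print("not importable in this environment:", ", ".join(missing))
    else:
        print("numpy is importable")
```

Run inside the activated virtual environment, this shows whether numpy was already present at the moment pip attempted to build scipy.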

@MashaNaslidnyk
Contributor

The issue is that there is no pre-built wheel for scipy v1.1.0, or that pip is not picking it up.

This fails:
pip install numpy scipy==1.1.0

This succeeds:

pip install numpy scipy==1.4.0

If we insist on installing the earlier version, we need to have C libraries needed to build the wheel installed - this worked for my env:
yum install lapack-devel.x86_64 blas-devel.x86_64
pip install numpy scipy==1.1.0

Or, potentially, one of the .whl files here might be suitable - and pip is wrong for deciding it isn't - in which case one can download it manually. I didn't try that because it wouldn't be a reasonable fix to suggest.

We can look into this further; changing scipy==1.1.0 to scipy>=1.1.0 in the meantime seems reasonable as it would solve the issue for many.

As to install order: here's my educated guess at what's happening. Pip proceeds roughly as follows:

  1. Build all wheels (numpy -> scipy -> ....)
    For numpy, no build is needed - a pre-built wheel is available (and was downloaded before, see line "Using cached numpy-1.18.5-cp38-cp38-manylinux1_x86_64.whl"). For scipy, for some reason, none of the pre-built wheels for v1.1.0 will do, so pip downloads the tarball and attempts to build it.

  2. Install all wheels (numpy -> scipy -> ....) OR run setup.py install if a wheel build failed

We see two errors in the logs:

  1. Scipy wheel can't be built without numpy - and numpy hasn't been installed yet, because we're still at step 1! If numpy is installed (I checked this), we fail to build the wheel still because of missing C libraries, in my case, lapack-devel.x86_64 blas-devel.x86_64 .

But that's okay because there's still the option of setup.py install . So the installation proceeds to step 2, installs numpy (this is what the somewhat ambiguous "Installing collected packages: numpy [...]" line says), attempts setup.py install of scipy because no wheel was built, and..

  2. Fails because of missing C libraries.

The only difference for 1.4.0 is that there is a pre-built wheel available.
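
To make the "pre-built wheel" point concrete: a wheel's filename encodes which interpreter it was built for, in the form name-version-pythontag-abitag-platformtag.whl (an optional build tag may follow the version). The sketch below is a deliberately simplified illustration, not pip's actual compatibility logic; it only compares the Python tag and ignores ABI and platform tags as well as generic tags such as py3. The numpy filename is the one from the log above, while the scipy filename is assumed purely for illustration.

```python
def parse_wheel_filename(filename):
    """Split a wheel filename into its name, version and tag components."""
    if not filename.endswith(".whl"):
        raise ValueError(f"not a wheel filename: {filename!r}")
    parts = filename[: -len(".whl")].split("-")
    # Five components normally, six when an optional build tag is present.
    if len(parts) not in (5, 6):
        raise ValueError(f"unexpected wheel filename layout: {filename!r}")
    python_tag, abi_tag, platform_tag = parts[-3:]
    return {
        "name": parts[0],
        "version": parts[1],
        "python": python_tag,
        "abi": abi_tag,
        "platform": platform_tag,
    }


def matches_interpreter(filename, interpreter_tag):
    """True if the wheel's Python tag lists `interpreter_tag`, e.g. 'cp38'."""
    # A Python tag may be a dot-separated set such as 'py2.py3'.
    return interpreter_tag in parse_wheel_filename(filename)["python"].split(".")


if __name__ == "__main__":
    # Filename taken from the pip log quoted in this thread.
    print(matches_interpreter("numpy-1.18.5-cp38-cp38-manylinux1_x86_64.whl", "cp38"))
    # Assumed filename, for illustration only: a wheel built for CPython 3.7.
    print(matches_interpreter("scipy-1.1.0-cp37-cp37m-manylinux1_x86_64.whl", "cp38"))
```

If none of the wheels published for a release carries a tag matching the running interpreter, pip falls back to the source tarball, which is the situation described above.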

@apaleyes
Collaborator

Amazing deep dive, thanks a ton @MashaNaslidnyk ! I agree that scipy>=1.1.0 is a good interim solution. There is probably no good reason to keep == in reqs file anyway.

@ekalosak
Contributor

ekalosak commented Jul 13, 2020

Maybe I'm paranoid; it's already dangerous enough to not require source-hash tagged requirements. Reading @MashaNaslidnyk 's post, it seems implicit that scipy 1.4.0 is close enough to 1.1.0 not to cause any problems, and that the pre-built wheel is the real solution to the installation issue. So it seems like an ==1.4.0 rather than a >=1.1.0 might suffice.

@apaleyes
Collaborator

@ekalosak forgive me my ignorance, what is "source-hash tagged requirements"?

@ekalosak
Contributor

ekalosak commented Jul 14, 2020

I'm sure there's a better way to put that, sorry. Let me try to explain - hopefully the sources are better written and more illustrative if I fall short.

We currently rely on version numbers maintained by humans to indicate which source code & pre-built wheels to install. This is dangerous for a number of reasons[1] - most relevantly because builds become nondeterministic. One specific example of the issue of nondeterministic builds is that supporting pre-built wheels can be introduced or removed without changing the semantic version tag.

Having, hopefully, motivated the need for a better solution - one such solution is to add the SHA256 checksum (what I referred to as "hash") of the source code to the requirements.txt. This is a popular solution for deterministic builds in the Haskell, Debian, and production R communities - it's gaining steam in the Python community[2,3] as one of the patches in the patchwork-quilt solution to Dependency Hell in the Python ecosystem.

It might be overkill here, but I bring it up to point out the risk of being free with dependency tagging - it might make issues like the present one worse rather than better going forward.
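
As a rough illustration of what a hash-pinned requirement looks like, the sketch below computes the SHA256 digest of a downloaded archive and formats a requirements line with pip's `--hash=sha256:` option. The function names are made up for this example, and the archive path in the usage note is hypothetical; note that pip checks the hash of the archive it downloads (sdist or wheel), so each acceptable file needs its own hash.

```python
import hashlib


def sha256_of_file(path, chunk_size=65536):
    """Return the hex SHA256 digest of the file at `path`."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        # Read in chunks so large archives need not fit in memory.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def pinned_requirement(name, version, digest):
    """Format a requirements.txt line pinned to one version and one hash."""
    return f"{name}=={version} --hash=sha256:{digest}"
```

For example, `pinned_requirement("scipy", "1.4.0", sha256_of_file("downloads/scipy-1.4.0.tar.gz"))` would produce a line that pip verifies against the downloaded file when run in hash-checking mode; the path here is an assumption, not something from this thread.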

Sources

  1. https://lil.law.harvard.edu/blog/2019/05/20/improving-pip-compile-generate-hashes/
  2. pip freeze with a hash pypa/pip#4732
  3. https://discuss.python.org/t/draft-pep-recording-the-source-hash-of-installed-distribution/4660

tl;dr see the following excerpt lifted from [1]
[Screenshot of the excerpt from [1]; image not reproduced]

@ZackArnold

ZackArnold commented Jul 30, 2020

Is there any plan for the next release that will include a solution to this scipy dependency issue? As others commented, I have noticed the requirements file has been updated. That would be sufficient for my problem, were it to be released in a new version.

I am working on a project that has dependencies on emukit and another package that both work with scipy, with package 2 requiring scipy>=1.4.0. However, I have found that emukit 0.4.7 requiring exactly scipy==1.1.0 causes a ContextualVersionConflict exception in my project when using setuptools entrypoints, requiring me to manually load the modules using importlib which does not check for matching versions.
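
To see which scipy constraint an installed distribution actually declares, one option is to read its metadata with the standard library's `importlib.metadata` (Python 3.8+). This is a minimal sketch with illustrative helper names; the name matching is a simplified approximation and does not parse the full requirement syntax.

```python
import re
from importlib import metadata


def _normalized(name):
    """Normalize a project name: lowercase, runs of -, _ and . become '-'."""
    return re.sub(r"[-_.]+", "-", name).lower()


def filter_requirements(requirements, dependency):
    """Keep the requirement strings whose project name is `dependency`."""
    wanted = _normalized(dependency)
    kept = []
    for requirement in requirements:
        # The project name is the leading run of name characters.
        match = re.match(r"\s*([A-Za-z0-9][A-Za-z0-9._-]*)", requirement)
        if match and _normalized(match.group(1)) == wanted:
            kept.append(requirement)
    return kept


def declared_requirements(dist_name, dependency):
    """Return what an installed distribution declares for `dependency`."""
    try:
        requirements = metadata.requires(dist_name) or []
    except metadata.PackageNotFoundError:
        return []  # the distribution is not installed in this environment
    return filter_requirements(requirements, dependency)
```

Calling `declared_requirements("emukit", "scipy")` in an environment with emukit installed lists the scipy constraint recorded in that release's metadata, which is the constraint that version checks at runtime compare against.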

I have also just noticed that pyDOE is abandonware at this point, while a pyDOE2 fork is still being maintained. Would be nice to see if Emukit can make the swap. https://pypi.org/project/pyDOE2/

@ZackArnold

Regarding the dependency issue: the upcoming pip release, with its more stringent requirements resolver, is going to start refusing to install emukit when newer scipy versions are required elsewhere. Here is an example of the error message when installing with --use-feature=2020-resolver, which is about to become the default.

ERROR: Cannot install testpkg 0.2.0 and emukit 0.4.7 because these package versions have conflicting dependencies.

The conflict is caused by:
testpkg 0.2.0 depends on scipy>=1.4.0
emukit 0.4.7 depends on scipy==1.1.0

To fix this you could try to:

  1. loosen the range of package versions you've specified
  2. remove package versions to allow pip attempt to solve the dependency conflict

ERROR: ResolutionImpossible: for help visit https://pip.pypa.io/en/latest/user_guide/#fixing-conflicting-dependencies
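
The conflict in that message can be reproduced with a toy model. The sketch below is not pip's resolver; it is a simplified illustration that understands only the `==` and `>=` operators and plain dotted numeric versions with the same number of components. The candidate version list is assumed for illustration.

```python
def parse_version(text):
    """Turn a dotted numeric version such as '1.4.0' into a tuple of ints."""
    return tuple(int(part) for part in text.split("."))


def satisfies(version, specifier):
    """Check `version` against a single '==X' or '>=X' specifier."""
    for operator in ("==", ">="):
        if specifier.startswith(operator):
            bound = parse_version(specifier[len(operator):])
            candidate = parse_version(version)
            return candidate == bound if operator == "==" else candidate >= bound
    raise ValueError(f"unsupported specifier: {specifier!r}")


def compatible_versions(candidates, specifiers):
    """Return the candidates that satisfy every specifier at once."""
    return [c for c in candidates if all(satisfies(c, s) for s in specifiers)]


if __name__ == "__main__":
    candidates = ["1.1.0", "1.4.0", "1.5.4"]  # illustrative scipy versions
    # emukit 0.4.7 pin combined with testpkg's requirement: nothing fits.
    print(compatible_versions(candidates, ["==1.1.0", ">=1.4.0"]))
    # With the relaxed requirement from the updated requirements file.
    print(compatible_versions(candidates, [">=1.1.0", ">=1.4.0"]))
```

The exact pin leaves no version that also satisfies >=1.4.0, whereas relaxing it to >=1.1.0 leaves every candidate from 1.4.0 upward available.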

@ekalosak
Contributor

ekalosak commented Jan 1, 2021

For those still experiencing pain, please try installing from source:

pyenv virtualenv --python=python3.8 ek3
git clone git@github.com:EmuKit/emukit.git
cd emukit && echo "ek3" > .python-version
pip install -r requirements/requirements.txt
python setup.py install

For those looking to develop emukit, substitute python setup.py develop for python setup.py install.

@ekalosak
Contributor

ekalosak commented Feb 5, 2021

Looks like the recent requirements/requirements.txt has scipy>=1.1.0 (link) - @apaleyes @MashaNaslidnyk what would it take to get the current code released to PyPI?

[Screenshot; image not reproduced]

Furthermore, with the switch to GitHub Actions, it'd be good to consider adding an issue requesting that someone add CD -> PyPI.

@apaleyes
Collaborator

apaleyes commented Feb 6, 2021

Yes, I shall release a new version, thanks for pinging @ekalosak . Can't promise any dates really, but will probably have time for that after 15th of Feb.

@apaleyes
Collaborator

apaleyes commented Feb 6, 2021

Although to be fair scipy update won't solve all our problems. GPy isn't behaving of late either: SheffieldML/GPy#874

@apaleyes
Collaborator

Update: I am going through recent PRs and issues at the moment, and will release a new version once I am done with that.

@apaleyes
Collaborator

Version 0.4.8 is now available in pip. Install on python 3.8 seems to work locally. Can anyone in this issue please confirm before we close it?

@johannesgiersdorf
Author

Install on python 3.9.1 works if wheel is installed. (GPy throws an error in case wheel is not installed, see issue mentioned above)

python -m venv ./venv
source venv/bin/activate
pip install --upgrade pip
pip install wheel
pip install emukit

@apaleyes
Collaborator

Thanks @johannesgiersdorf ! GPy is indeed broken on 3.9 at the moment, although relentless @ekalosak has a fix pending review: SheffieldML/GPy#888

@apaleyes
Collaborator

Thanks to @ekalosak we now have a new release of GPy, and Emukit was updated to work with it. Closing.

@apaleyes
Collaborator

An update here: I have just realized that while the dependency issue was fixed, it was never released, meaning people still could run into it.

New version 0.4.9 was just released, and hopefully pip install emukit shall now run smoothly on python 3.8 and 3.9
