Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

install only aiida package (+ subpackages) #4205

Closed
ltalirz opened this issue Jun 29, 2020 · 25 comments · Fixed by #4204
Closed

install only aiida package (+ subpackages) #4205

ltalirz opened this issue Jun 29, 2020 · 25 comments · Fixed by #4204
Assignees
Labels

Comments

@ltalirz
Copy link
Member

ltalirz commented Jun 29, 2020

The current generic line

packages=find_packages(),

will install all top-level python packages in the user's python environment.

As mentioned by @greschd here, this should be replaced by find_packages(include=['aiida', 'aiida.*']).

Note: This concerns both the install from the git repo and from source on PyPI (the tarball currently includes all tests as well, see https://pypi.org/project/aiida-core/#files)

@greschd
Copy link
Member

greschd commented Jun 29, 2020

While we're at it, I think this line in the MANIFEST file is outdated.. at least, I can't find these files in the source tree.

@ltalirz
Copy link
Member Author

ltalirz commented Jun 29, 2020

Yes, please update this as well!

Ah, by the way, the utils/ are needed for the fastentrypoints.py.
Perhaps we should consider moving this inside aiida?

@ltalirz
Copy link
Member Author

ltalirz commented Jun 29, 2020

This also highlights the fact that a regular pip install from the tarball is currently not tested (or your PR should have failed the tests).

This is the same issue with aiida-diff - would be very nice to fix this (if not testing everything from the tarball-installed AiiDA, then at least installing it and importing it once?).
This is a very generic problem - what do other popular packages do to mitigate this?

Edit: Perhaps I'm wrong - since the utils/ are still installed via the manifest, they will be in the tarball. Perhaps this is enough?
Anyhow, I think it means teh utils/ folder will be installed in the user's environment and I don't think this is something we really want.

@greschd
Copy link
Member

greschd commented Jun 29, 2020

Yes, please update this as well!

What's the correct update here? Have they moved somewhere else, or can this line be removed?

Edit: Perhaps I'm wrong - since the utils/ are still installed via the manifest, they will be in the tarball. Perhaps this is enough?

I think so, yes.

This also highlights the fact that a regular pip install from the tarball is currently not tested (or your PR should have failed the tests).

That may still be missing, though. In the aiida-wannier90 CI, we run (as one of the install options)

python setup.py sdist
ls -1 dist/ | xargs -I % pip install dist/%[testing]

That's still on Travis CI, so may have to be adapted for GHA.

@ltalirz
Copy link
Member Author

ltalirz commented Jun 29, 2020

What's the correct update here? Have they moved somewhere else, or can this line be removed?

Yes, these are AiiDA export files used for testing.
There are still a couple of them in the repo, but all in the tests/ folder (which we anyhow don't want to include).

@sphuber
Copy link
Contributor

sphuber commented Jun 29, 2020

Yes, please update this as well!

What's the correct update here? Have they moved somewhere else, or can this line be removed?

Well, if we do not want to distribute the tests with the package, then we can just remove this. Part of these have been moved to tests/ and some even to the package aiida-migration-tests. In any case, aiida-core does not need it to be functional, so I say we just remove it. Same then goes for the line directly below it, the .mako which is to generate SqlAlchemy migration files. I would also say this is not needed for the package as only developers need access to this.

@ltalirz
Copy link
Member Author

ltalirz commented Jun 29, 2020

That may still be missing, though. In the aiida-wannier90 CI, we run (as one of the install options)

python setup.py sdist
ls -1 dist/ | xargs -I % pip install dist/%[testing]

That's still on Travis CI, so may have to be adapted for GHA.

I think adding such a test would be very useful as well.

@greschd
Copy link
Member

greschd commented Jun 29, 2020

Ok, two more points:

  • Should utils/* be included, or would utils/fastentrypoints.py be sufficient?
  • Do we want to distribute the docs with the package? I'm not sure if that was complete before (probably the *.rst were missing anyway) - in any case, now only the explicitly included docs/source/images/*.png are there. I think including the docs sources is not super useful, if anything the built docs should be included. So I'd suggest removing docs/source/images/*.png also.

I think adding such a test would be very useful as well.

Since I'm not super familiar with the GHA infrastructure, does one of you want to add that?

@ltalirz
Copy link
Member Author

ltalirz commented Jun 29, 2020

  • Should utils/* be included, or would utils/fastentrypoints.py be sufficient?

Yes, fastentrypoints.py plus the __init__.py should be enough.

  • Do we want to distribute the docs with the package? I'm not sure if that was complete before (probably the *.rst were missing anyway) - in any case, now only the explicitly included docs/source/images/*.png are there. I think including the docs sources is not super useful, if anything the built docs should be included. So I'd suggest removing docs/source/images/*.png also.

I certainly agree; and I don't think including e.g. a PDF in the PyPI package would be very useful either (+ it could be quite large).

I think adding such a test would be very useful as well.

Since I'm not super familiar with the GHA infrastructure, does one of you want to add that?

Sure, we can do it separately.

@greschd
Copy link
Member

greschd commented Jun 29, 2020

  • Should utils/* be included, or would utils/fastentrypoints.py be sufficient?

Yes, fastentrypoints.py plus the __init__.py should be enough.

What do we need the __init__.py for?

@ltalirz
Copy link
Member Author

ltalirz commented Jun 29, 2020

in order to be able to import it, no?

See

from utils import fastentrypoints # pylint: disable=unused-import

@greschd
Copy link
Member

greschd commented Jun 29, 2020

That works without __init__.py in Python3 - it makes it a "namespace package" instead of a "regular package", but I don't think that matters much.

@ltalirz
Copy link
Member Author

ltalirz commented Jun 29, 2020

Anyhow, shipping this in utils is certainly not good practice (any other package doing the same make break ours).

Although the top level of the package is already relatively crowded, it would be better to put it there - or we can specify it as a build-level dependency in the pyproject.toml https://github.com/ninjaaron/fast-entry_points#usage

@greschd
Copy link
Member

greschd commented Jun 29, 2020

Anyhow, shipping this in utils is certainly not good practice (any other package doing the same make break ours).

I doubt it would break, because our utils would have the highest precedence by virtue of being in the CWD at install-time..

Although the top level of the package is already relatively crowded, it would be better to put it there - or we can specify it as a build-level dependency in the pyproject.toml https://github.com/ninjaaron/fast-entry_points#usage

..but nevertheless, this is probably the better option.

@greschd
Copy link
Member

greschd commented Jun 29, 2020

The disadvantage of "un-vendoring" the fastentrypoints like this is that it will not be used when setup.py is executed directly (not via pip). So for the purpose of being able to python setup.py bdist_wheel, we'd have to add a try-except clause, like here.

@sphuber @ltalirz I'll let you decide this one, for me both approaches are fine.

@ltalirz
Copy link
Member Author

ltalirz commented Jun 29, 2020

I prefer the try/except to vendoring (which also then creates this utils/ package in the user's python environment).
What about @sphuber ?

@greschd
Copy link
Member

greschd commented Jun 29, 2020

which also then creates this utils/ package in the user's python environment

Not unless we include it in the packages - it's only visible to the install because that happens at the project root.

@sphuber
Copy link
Contributor

sphuber commented Jun 29, 2020

I have been reading up a bit, especially on this issue of setuptools. Writing this here to collect my thoughts and record if it for future reference.

If I understand correctly, the original problem was that the wrapper script that setuptools would generate for console_script entry points, which we use for verdi, would import pkg_resources which is too slow. The fastentrypoints would simply monkey patch setuptools to write a simpler wrapper script. Since then pip has been fixed to not use setuptools to install console scripts and so unlike the old version:

#!/home/sph/.virtualenvs/aiida_testing/bin/python
# EASY-INSTALL-ENTRY-SCRIPT: 'aiida-testing','console_scripts','aiida-testing'
__requires__ = 'aiida-testing'
import re
import sys
from pkg_resources import load_entry_point

if __name__ == '__main__':
    sys.argv[0] = re.sub(r'(-script\.pyw?|\.exe)?$', '', sys.argv[0])
    sys.exit(
        load_entry_point('aiida-testing', 'console_scripts', 'aiida-testing')()
    )

which uses the slow pkg_resources, it now automatically installs the version that fastentrypoints would hack:

#!/home/sph/.virtualenvs/aiida_testing/bin/python
# -*- coding: utf-8 -*-
import re
import sys
from aiida_testing.cli import cmd_root
if __name__ == '__main__':
    sys.argv[0] = re.sub(r'(-script\.pyw|\.exe)?$', '', sys.argv[0])
    sys.exit(cmd_root())

The former takes roughly 280 ms on average on my machine, versus 130 ms for the latter.

Except! if you don't have wheel installed.... Then pip has to fall back on the old one and we get the original slow version again. This in turn can be prevented by having a pyproject.toml containing:

[build-system]
requires = ["setuptools", "wheel"]
build-backend = "setuptools.build_meta"

With this, wheel will be properly installed and pip will create the correct console script wrapper. We might hope we are finally at the end of this story, but not quite. Editable installs follow another path yet again, and will produce slow console scripts with pip.
Even with a pyproject.toml if we install as pip install -e package, we get the following monstrosity:

#!/home/sph/.virtualenvs/aiida_testing/bin/python
# EASY-INSTALL-ENTRY-SCRIPT: 'aiida-testing','console_scripts','aiida-testing'
import re
import sys

# for compatibility with easy_install; see #2198
__requires__ = 'aiida-testing'

try:
    from importlib.metadata import distribution
except ImportError:
    try:
        from importlib_metadata import distribution
    except ImportError:
        from pkg_resources import load_entry_point


def importlib_load_entry_point(spec, group, name):
    dist_name, _, _ = spec.partition('==')
    matches = (
        entry_point
        for entry_point in distribution(dist_name).entry_points
        if entry_point.group == group and entry_point.name == name
    )
    return next(matches).load()


globals().setdefault('load_entry_point', importlib_load_entry_point)


if __name__ == '__main__':
    sys.argv[0] = re.sub(r'(-script\.pyw?|\.exe)?$', '', sys.argv[0])
    sys.exit(load_entry_point('aiida-testing', 'console_scripts', 'aiida-testing')())

which goes towards the old loading times of around 280 ms, because importlib.metadata (which only ships as built-in sinec Python 3.8) nor the backport importlb_metadata (which has to be manually installed as package) can be imported.

So long story short, we could get rid of fastentrypoints if we provide a pyproject.toml which installs wheel, which we do, except that it breaks editable installs. Since those are used quite a lot, I think that means we cannot drop fastentrypoints yet, as much as I would like to.

@greschd what is the purpose of python setup.py bdist_wheel? Who would be doing this for what reason? I guess the same goes for calling python setup.py install directly, but that is also highly discouraged and not supported by setuptools, so we shouldn't support that. What is the difference with bdist_wheel?

@greschd
Copy link
Member

greschd commented Jun 29, 2020

Nice writeup - I think I agree on the conclusion. For completeness' sake, do you happen to know which version of pip started using the "fast" method for the non-editable installs?

The python setup.py bdist_wheel is used to create the wheels, not install them. See https://github.com/aiidateam/aiida-core/wiki/How-to-make-a-new-release#manual-pypi-deployment
I guess "who will do that" will be you, @sphuber 😄

@sphuber
Copy link
Contributor

sphuber commented Jun 29, 2020

Nice writeup - I think I agree on the conclusion. For completeness' sake, do you happen to know which version of pip started using the "fast" method for the non-editable installs?

Unfortunately not, found out about this fix buried in the comments of the setuptools issue that I linked, but wasn't mentioned when that happened.

The python setup.py bdist_wheel is used to create the wheels, not install them. See https://github.com/aiidateam/aiida-core/wiki/How-to-make-a-new-release#manual-pypi-deployment
I guess "who will do that" will be you, @sphuber smile

I have to admit that when it comes to this part of doing the releases I am merely studiously copy-pasting those commands. I have never really delved into the specifics of package building and distribution. That's also the reason for the write up, because I have no experience with this whatsoever. But, does that mean then, that if we were to remove fastentrypoints, that the wheels will not have it and so the console script installed by those will be slow? I am afraid that here already I don't even know what a wheel exactly is. I imagine this is already some pre-compiled version? Regardless, if that is the case, we can't unvendor fastentrypoints, no? Because all pipinstalls will have a slowverdi`?

@greschd
Copy link
Member

greschd commented Jun 29, 2020

In my limited understanding, wheels are pre-built packages with all the metadata etc. baked in (in a standardized format). That means setup.py won't be executed when pip (or whichever tool) installs the package. [1]

From the fastentrypoints README:

This is ripped directly from the way wheels do it and is faster than whatever the heck the normal console scripts do.

Note:

This bug in setuptools only affects packages built with the normal setup.py method. Building wheels avoids the problem and has many other benefits as well. fastentrypoints simply ensures that your user scripts will not automatically import pkg_resources, no matter how they are built.

So we need fastentrypoints only for the case of non-wheel installs. Conveniently, that's also exactly when the setup.py will be executed.

[1] This is also the source of our reentry frustrations...

@sphuber
Copy link
Contributor

sphuber commented Jun 29, 2020

So we need fastentrypoints only for the case of non-wheel installs. Conveniently, that's also exactly when the setup.py will be executed.

So we need it for non-wheel installs, but do we need it for wheel building? I am a bit confused, the quote you included seems to suggest that when dealing with wheels (building and installing?) we don't need all this hackery. But then you mentioned in the beginning of this thread that "un-vendoring" fastentrypoints would prevent python install bdist_wheel from working? So is that correct, or am I mistaken in thinking that the latter has anything to do with building wheels?

@greschd
Copy link
Member

greschd commented Jun 29, 2020

We don't need it for wheel building, but the way setup.py is written right now it would fail if it's not there. A simple

try:
    import fastentrypoints
except ImportError:
    pass

should do the trick.

@greschd
Copy link
Member

greschd commented Jun 29, 2020

I've added a commit to #4204 doing exactly this now. As mentioned before, both the "vendored" and "un-vendored" version are fine IMO, so you can let me know which one you'd prefer.

@sphuber
Copy link
Contributor

sphuber commented Jun 29, 2020

As long as both work for all routes of installing/building where it is necessary and for both we get the fast entry points (don't care about overhead once when installing) then I largely favor not vendoring it. Since these conditions seem to have been met, I think we can move ahead with your PR. I am just looking into the final question of the logo. Would be good if we remove it from the MANIFEST and use a permalink. I will see if I can get one for you and comment to the PR.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants