Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix #97 PEP420: find_packages() #1312

Merged
merged 5 commits into from
Jul 11, 2018
Merged

Conversation

silkentrance
Copy link
Contributor

make find_packages() find also PEP420 namespace packages when supported

@jaraco
Copy link
Member

jaraco commented Mar 31, 2018

I think there were objections to making the Namespace packages finder the default. I don't see those objections addressed here or in #97. This change, if I understand it correctly, will break the expectation of many packages, where they have 'tests' and 'docs' without a __init__.py that are currently not installed, but with this change will get installed. Surely such a change can't be accepted without serious consideration of the consequences and a good writeup of what users should expect to do about it.

@pganssle
Copy link
Member

I'm also -1 on changing the default, especially changing the default only on Python 3. People find it hard enough to reason about writing a 2/3 compatible setup.py without this.

I think having a way to use find_packages with namespace packages is important, but having find_packages do it automatically is not worth the mental overhead this adds.

@pganssle
Copy link
Member

One consideration, these implicit namespace packages only work for Python 3-only packages, right?

I guess it happens at different times, but it would probably be a good idea to check the python_requires when an implicit namespace package is detected and throw a warning if it is either empty or includes Python 2.

@silkentrance
Copy link
Contributor Author

@jaraco in #97 a requirement was stated that in case that the PEP420 was supported, it should be default, so this PR is about meeting that requirement.

However, I would be totally fine with just exporting a find_packages_ns() method instead.

What do you think, @jaraco @pganssle

@jaraco
Copy link
Member

jaraco commented Apr 6, 2018

Yes, I was thinking about this in my waking hours this morning. Ultimately, Python 3.4 (and earlier) will get old and all namespace packages will look like PEP420 namespace packages. When that happens, we'd like for setuptools and libraries like setuptools to have an interface that's clean and free of legacy cruft.

To get there without breaking existing packages, I believe we need a mechanism for the find_packages behavior to remain unchanged (for now), but signal that in future versions, the behavior will change.

Here's an idea. What if:

phase 1

find_packages will invoke both finders and where the results differ, warn that in the future find_packages will return the larger set of packages and indicate that to retain existing behavior, exclude those package, but setuptools use the compatible result.

phase 2

About 6 months later, as a backward-incompatible release, find_packages uses PEP420Finder.find.


I could imagine a 3-phase release that retains the legacy behavior, but the more I think about it, the less important that seems to retain.

@pganssle
Copy link
Member

pganssle commented Apr 6, 2018

@jaraco I mostly don't like that the default behavior changes based on what interpreter you use to run setup.py with. I'm fine with the default behavior changing to something else, but it should happen for all interpreters, possibly issuing a warning when a project uses implicit namespaces on a too-old version of Python.

That said, the "legacy behavior" is probably the overwhelmingly common case here. I think most packages use a single folder with an __init__.py, and that's not going to change even once everyone is 3.4+. A large fraction of those have a tests/ folder or other various non-package folders without an __init__.py in it, sibling to the main package folder. If I'm understanding the failure mode correctly here, I think there's reasonably strong case for leaving namespace packages as opt-in, or at least making the legacy behavior as simple as a feature flag or variant function.

@silkentrance
Copy link
Contributor Author

silkentrance commented Apr 12, 2018

@pganssle the current solution supports both, the legacy and the future, yet one might need to exclude additional paths.

@jaraco perhaps we need to add additional checks, preventing non python packages from being included in the legacy (default) case while still making sure that PEP420 packages are being found.

Maybe searching for *.py would do the trick. This, however, might still lead to unwanted results, which, however, are the sole responsibility of the user, i.e. defining proper excludes.

@pganssle
Copy link
Member

the current solution supports both, the legacy and the future, yet one might need to exclude additional paths.

What I meant was that if the PEP 420 find_packages ends up creating a lot of extra work for non-namespace packages who now have to write find_packages(exclude=['tests/', 'scripts/'] or something when they didn't have to before, then I don't think this is a real improvement.

The default should be the thing that works correctly in the common case, with an easy upgrade path for the uncommon case. If 1% of packages use implicit namespaces (I think this number is probably a dramatic over-estimate), and 20% have a tests/ and/or scripts/ file with .py files in them, that means we are making 20% of people write:

find_packages(exclude=['scripts/', 'test'])

Just so that 1% of people don't have to write:

find_packages(namespace=True)

We're also at a pretty significant risk of having a bunch of packages not realize that this has happened and accidentally installing scripts and tests packages into the root namespace!

@pganssle
Copy link
Member

pganssle commented Apr 12, 2018

>>> from setuptools import PEP420PackageFinder as pf
>>> from setuptools import find_packages

>>> print(find_packages())
['dateutil', 'dateutil.tz', 'dateutil.test', 'dateutil.zoneinfo', 'dateutil.parser']

>>> print(pf.find())'
['dateutil', 'build', 'ci_tools', 'docs', 'dist', 'dateutil.tz', 'dateutil.test',
'dateutil.zoneinfo', 'dateutil.parser', 'build.lib', 'build.lib.dateutil',
'build.lib.dateutil.tz', 'build.lib.dateutil.zoneinfo', 'build.lib.dateutil.parser',
'docs.samples']

This would be a very dangerous change to the default, for very little gain. Some of those directories are generated directories (meaning that running setup.py twice would give different results). I think implicit namespace packages should remain opt-in indefinitely.

@silkentrance
Copy link
Contributor Author

silkentrance commented Apr 12, 2018

@pganssle you need to define the excludes when running find().
so, the least you would do is to define the includes 😁 that should be considered when finding any packages, e.g. find(include=['src']). and, if you have any non packages below that src, you would need to define excludes for these, e.g. find(..., exclude=['src/justresources']).
For my part, this could not be any easier to do.

personally, i think that omitting the include is blatant ignorance on behalf of the users of setuptools, always believing that __init__.py is for granted.

@silkentrance
Copy link
Contributor Author

silkentrance commented Apr 12, 2018

@pganssle considering for example the maven over at java world, it defines some common grounds, e.g. test-sources, test-resources, sources, and resources. all of these are based on specific conventions, especially the location on where to find these resources. yet, one can still configure the otherwise heavily biased build system to their likings.
and the same is true with include and exclude, both of which have been there for a very long time, but due to python supporting only one kind of package idiom, namely the one with __init__.py, everybody, especially the more lazy ones, didn't actually care about these mechanisms. everybody? no wait, there are actually a lot of people using these idioms, especially the exclude.

and given the current PR, one can always choose to import just the old version of find packages, namely find_packages_std.

and considering also that all of this happens during the packaging phase, it will not affect users of the package. so it is solely up to the users of setuptools to make it happen. i.e. adapt, adopt and improve, to cite monty python.

for my part i strongly believe that the migration to the new approach is rather easy to do, unless of course most of the users are unwilling to touch their existing code bases. in that case i'd say: wait for contributions to happen, i.e. people wanting to install your package from source and failing to do so because you fucked up and did no manage to maintain your package accordingly. and if you still fail to maintain your package, be prepared for people to fork and do their own thing, even more quickly than you.

@pganssle
Copy link
Member

pganssle commented Apr 13, 2018

@silkentrance I know how to write the excludes, what I'm saying is that currently find_packages() just works for a large fraction of all Python projects. Most Python projects are not using namespaces and the src layout is not as popular as the "package directly in root".

What that means is that by changing the default thing that find_packages does, for most people we're making find_packages harder to use, not easier. What's more, if you continued to use find_packages as the majority of people do - which is to say using find_packages() or find_packages(excludes=['tests']) or some limited thing, when someone starts using python setup.py with a recent version of setuptools, it will silently start including a bunch of directories like build and test. A lot of them won't even notice this is happening.

Changing the default for find_packages would cause a bunch of headaches and break a bunch of existing code for very little benefit. On the other hand, we could easily make this behavior opt-in, so that anyone with a namespace package can just do:

PACKAGES = find_packages(include=['src'], implicit_namespace=True)

And it will work just fine. If it's so easy for people to manually include and exclude their package directories, why is it so difficult to ask people with implicit namespace packages to opt-in to this behavior? After all, you want the default option to be the common case and the opt-in to be the uncommon case. Implicit namespace packages are decidedly the uncommon case.

I don't think that the "new approach" is inherently more virtuous, so changing the default in direct contradiction to how the majority of packages are laid out in a backwards incompatible way is hardly justified.

@jaraco
Copy link
Member

jaraco commented Apr 13, 2018

we could easily make this behavior opt-in

Just please don't have a boolean flag in the parameters to the function. Use a different function name for selecting different behavior (e.g. find_packages_ns or similar with the same signature as find_packages).

@pganssle
Copy link
Member

Just please don't have a boolean flag in the parameters to the function.

I'm fine with it being a separate function. Normally I'm not a big fan of a bunch of proliferating boolean parameters, but I think this is a borderline case. It's fundamentally the same function used in one of two modes. All the rest of the input parameters are the same and have the same effects and the result is the same (conceptually). This sort of "mode" flag is just providing information about whether this is a package with implicit namespaces or not in the same way that include tells you where the base package locations are and exclude tells you what parts are not part of the package.

In any case, if you're fundamentally against a boolean parameter I don't see a problem with keeping it separate. I've been promoting using alternate constructor-style variants lately because I prefer find_packages.namespace to find_packages_namespace, but find_package_ns works too.

@silkentrance
Copy link
Contributor Author

@pganssle @jaraco I have changed the commit accordingly. find_packages will behave as normal and will not change its behaviour with newer python versions. find_packages_ns needs some more work though, in case it is running on a python version that does not support PEP420 namespace packages.

@@ -17,7 +18,7 @@

__all__ = [
'setup', 'Distribution', 'Feature', 'Command', 'Extension', 'Require',
'find_packages',
'find_packages'
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we add find_packages_ns here?

@pganssle
Copy link
Member

@silkentrance I think you maybe forgot to push or commit your changes or something, because I'm still seeing the version where find_packages is set to find_packages_ns in Python 3.

@silkentrance
Copy link
Contributor Author

@pganssle I have changed the exports and tests accordingly. find_packages_ns is exported only for Python 3.3+. Similarly, the tests will now be executed for Python 3.3+.

@codecov
Copy link

codecov bot commented Apr 14, 2018

Codecov Report

Merging #1312 into master will increase coverage by 0.54%.
The diff coverage is 100%.

Impacted file tree graph

@@            Coverage Diff             @@
##           master    #1312      +/-   ##
==========================================
+ Coverage    80.9%   81.45%   +0.54%     
==========================================
  Files         105      103       -2     
  Lines       12346    12162     -184     
==========================================
- Hits         9988     9906      -82     
+ Misses       2358     2256     -102
Impacted Files Coverage Δ
setuptools/tests/test_find_packages.py 100% <100%> (ø) ⬆️
setuptools/__init__.py 100% <100%> (ø) ⬆️
setuptools/glibc.py 70.96% <0%> (-6.46%) ⬇️
setuptools/pep425tags.py 68.07% <0%> (-3.19%) ⬇️
pkg_resources/__init__.py 86.27% <0%> (-0.1%) ⬇️
setuptools/config.py 99.54% <0%> (-0.05%) ⬇️
pkg_resources/tests/test_resources.py 100% <0%> (ø) ⬆️
setuptools/tests/test_wheel.py 100% <0%> (ø) ⬆️
setuptools/tests/test_egg_info.py 100% <0%> (ø) ⬆️
setuptools/tests/test_pep425tags.py
... and 13 more

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update de686c2...65b8c13. Read the comment docs.

# just catch
pass

py33_upwards = pytest.mark.xfail(sys.version_info < (3,3), reason="Test runs on Python >= 3.3 only")
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This should be skipif not xfail.

xfail indicates that it is currently a failing test that needs to be fixed. skipif indicates that the tested behavior isn't intended to occur under the skipif conditions.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@pganssle i found that all other conditional tests use xfail, as in expected failure. but i can make this a skipif.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hm, I was planning on going through and checking on those other xfail tests because I did notice a few of them passing inappropriately.

I think it's a bit strange to use xfail instead of skipif, because xfail is more useful if you would like it to start passing at some point in the future and it's unnecessary to run tests where you don't actually care whether or not they pass.

@pganssle
Copy link
Member

One possible objection in favor of letting this find_packages_ns exist on older versions of Python is that as a general rule you cannot count on the version of Python used to invoke setup.py being the same as the version of Python for which you are installing the package and the script doesn't actually require Python 3 to work.

That said, it would be pretty unusual to create Python 3-only packages using Python 2, so I think I'm in favor of leaving it the way it is.

Copy link
Contributor

@jmbowman jmbowman left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It looks like the discussion on this has mostly settled on the current form; is there anything further beyond documentation updates and a changelog.d entry that needs to be done before merging?

@@ -107,7 +115,11 @@ def _looks_like_package(path):
return True


find_packages = PackageFinder.find
find_packages = find_packages_std = PackageFinder.find
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is find_packages_std actually used anywhere?

Copy link
Member

@pganssle pganssle May 16, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No, I think the idea is that it would be a more explicit public interface, but I think let's just drop it.

@pganssle
Copy link
Member

@jmbowman Yes, I think there are no substantive objections to adding this, it just needs documentation and a changelog entry as you mention.

Ideally once we have the documentation built on PRs, we can rebase this PR and look at the docs. I'd really like to reconsider how this looks in the API documentation before merging.

@silkentrance
Copy link
Contributor Author

@pganssle I have rebased to master and dropped find_packages_std.

@silkentrance
Copy link
Contributor Author

@pganssle rebased and resolved conflicts. what is the status about this?

find_420_packages = setuptools.PEP420PackageFinder.find
try:
from setuptools import find_packages_ns
except:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this needs to be except ImportError

@silkentrance silkentrance force-pushed the gh-97 branch 2 times, most recently from 457446d to 382ec75 Compare July 4, 2018 12:32
@pganssle
Copy link
Member

pganssle commented Jul 4, 2018

Python 2.7 is still supported.

@silkentrance
Copy link
Contributor Author

@pganssle argh. okay, will include the checks again.

@silkentrance silkentrance force-pushed the gh-97 branch 2 times, most recently from 4fef13b to bb87cd4 Compare July 4, 2018 12:47
CHANGES.rst Outdated
@@ -1,6 +1,7 @@
v39.2.0
-------

* #97: PEP 420: introduce find_packages_ns()
Copy link
Member

@pganssle pganssle Jul 4, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For the future, changes go in their own separate files in the changelog.d folder (for various reasons, including avoiding merge conflicts with everyone editing the same CHANGELOG file). In this case you should create a stub called changelog.d/1312.change.rst.

See the contributing guide for more details. (Though I think when you first made this PR we hadn't set up town crier, so there's no way you could have known).

This fixes GH pypa#97 by introducing an alternate version of find_packages
that works with PEP 420 namespace packages.
@pganssle
Copy link
Member

pganssle commented Jul 4, 2018

@silkentrance FYI I moved the changelog entry into changelog.d and rewrote the history of the branch to separate the commits into the logical changes (that's not necessary it's just something I like to do). I'd like to take a quick look at the documentation before merging but I think it's essentially ready.

@@ -107,6 +110,21 @@ As you can see, it doesn't take much to use setuptools in a project.
Run that script in your project folder, alongside the Python packages
you have developed.

For Python 3.3+, and whenever you are using PEP 420 compliant implicit
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think this should be under "Basic Use". The majority of people will not be using namespace packages, and we should probably go into more detail about the exact caveats of using this.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hm, where should we put it then? As for exact caveats, what do you have in mind? Present the user with a possible project structure and folders underneath some source root that look like packages but should be excluded?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I will put it into its own section directly below Using find_packages().

@silkentrance
Copy link
Contributor Author

@pganssle how did you separate the forced commits? or did you do new commits, assuming my user name when committing?

@pganssle
Copy link
Member

pganssle commented Jul 4, 2018

Hm, where should we put it then? As for exact caveats, what do you have in mind? Present the user with a possible project structure and folders underneath some source root that look like packages but should be excluded?

I made some changes to the documentation and pushed them to this branch. I was going to make it a PR against your fork, but there was some sort of problem with that, not sure what was going on there.

how did you separate the forced commits? or did you do new commits, assuming my user name when committing?

Are you sure you want to know this? I have been afflicted by this knowledge in the sense that now that I know it is possible to rewrite commit histories to align with my aesthetic preferences, I feel compelled to do so, which really makes the "Merge pull request" button a less useful :P.

In any case, I did git rebase -i, set the commit I wanted to split as edit, then when it stopped on that commit I did git reset HEAD^ (which loses the original commit message mind you, so make sure to record it if you care about keeping it), and then made new commits one-by-one with git commit --author="Carsten Klein <trancesilken@gmail.com>".

This moves the documentation out of "Basic Usage" and expands on some of
the caveats of this approach.
@silkentrance
Copy link
Contributor Author

@pganssle Okay, so you've been using the rebase trickery then 👍

@pganssle
Copy link
Member

pganssle commented Jul 4, 2018

Here is the relevant section of the documentation: https://deploy-preview-1312--ecstatic-morse-dada75.netlify.com/setuptools.html#find-packages-ns

@jmbowman Any interest in giving it a once over?

@silkentrance
Copy link
Contributor Author

@pganssle ok, this is looking good. I was about to do something similar:

+Using ``find_packages_ns()``
+----------------------------
+
+For Python 3.3+, and whenever you are using PEP 420 compliant implicit
+namespace packages, you can use ``find_packages_ns()`` instead.
+
+It is basically a drop-in replacement for ``find_packages()``, with the
+difference being that it supports finding of packages that do not have an
+``__init__.py`` module.
+
+That being said, ``find_packages_ns()`` will require you to provide exclusions
+whenever you have package-like folders in your source and which must not
+be detected as packages.
+
+Given the following source root::
+
+    caching_json_generator/
+        tests/
+            test_cache.py
+            test_fixtures/
+                fixture_1.json
+        internal/
+            cache.py
+            integration.py
+        resources/
+            template.json
+        __init__.py
+
+The following packages will be detected by ``find_packages_ns()``::
+
+    caching_json_generator
+    caching_json_generator.tests
+    caching_json_generator.tests.test_fixtures
+    caching_json_generator.internal
+    caching_json_generator.resources
+
+While excluding the tests might be familiar to you, excluding simple
+resource folders hasn't been a requirement yet.
+
+So in order to make ``find_packages_ns()`` work as expected, you should
+use proper exclusions::
+
+    from setuptools import setup, find_packages
+    setup(
+        name="HelloWorld",
+        version="0.1",
+        packages=find_packages(exclude=["tests", "resources"]),
+    )

But I think that your version is far more elaborate. Thanks!

Copy link
Contributor

@jmbowman jmbowman left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks like everything's been addressed, 👍

@pganssle pganssle merged commit 89155ab into pypa:master Jul 11, 2018
@pganssle
Copy link
Member

Thanks for your contribution @silkentrance!

@silkentrance
Copy link
Contributor Author

You are welcome.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants