Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

boto3-stubs not detected as (naively) expected #437

Open
tungol opened this issue Jun 4, 2024 · 5 comments
Open

boto3-stubs not detected as (naively) expected #437

tungol opened this issue Jun 4, 2024 · 5 comments
Labels
P3 minor: not priorized parsing-imports

Comments

@tungol
Copy link

tungol commented Jun 4, 2024

Describe the bug
boto3-stubs is the type stubs package for boto3, and it's a bit of an oddball because it and several hundred mypy-boto3-* packages are programmatically generated from the same code base.

The issue at hand is that the documented way to use boto3-stubs relies on the extras mechanism to specify which AWS services you intend to use, rather then including all 300+ of them by default. In poetry's pyproject.toml format an example might be:

boto3-stubs = { extras = ["ec2"], version = "^1.22.2" }

Which means you'll have the base stubs for boto3, and boto3-stubs will pull in mypy-boto3-ec2 as a dependency. In my project, there's a couple places where I import some types from mypy_boto3_ec2 for annotation purposes, gating it behind a if TYPE_CHECKING statement.

Fawltydeps (correrctly) detects this as an import without a matching dependency. However, it's conceptually a little redundant to list mypy_boto3_ec2 on it's own, because the requested 'ec2' extra for boto3-stubs is pretty explicit already.

Expected behavior
It'd be nice if fawltydeps understood how boto3-stubs uses package extras. That said, following package extras is probably the wrong thing to do in a lot of other cases, and I don't know if there's a way to handle that short of just special-casing it.

I think this issue is slightly similar to the case in #176 , but only in that boto3-stubs and mypy-boto3-* packages are conceptually a single unit while being separate packages in implementation. The specific namespace issues in that ticket have no parallel here.

I waffled on opening this ticket at all. I think there's a pretty reasonable case to made that taking the explicit dependency is the right thing to do when you get to the point of importing something. It did surprise me though, because I don't usually think of mypy-boto3-ec2 as a separate package from boto3-stubs.

@tungol
Copy link
Author

tungol commented Jun 4, 2024

Possibly related to #45 as well

@jherland
Copy link
Member

jherland commented Jun 5, 2024

Thanks for raising this issue, @tungol, it is indeed interesting.

My first thought was that we already have some code to deal with type stubs in FawltyDeps, but after a second look, it does not apply to your scenario (in fact, it only handles the ~opposite scenario)1.

However, with this precedence, we are not opposed to adding special handling for common type stubs patterns2, and I wonder if your case might be best solved with a type stubs-related special case. This could be something like:

  • imports inside if TYPE_CHECKING are skipped (in effect saying: "you do not need to declare type stubs")
  • import names that look like type stubs (i.e. _stubs suffix or mypy_ prefix) are skipped (in effect saying the same as above)
  • import names that look like type stubs (i.e. _stubs suffix or mypy_ prefix) are skipped, but only if the corresponding non-stubs import name is imported elsewhere (Question: do you also import boto3_ec2 elsewhere in your code?)
  • More ideas here?

If we don't make a special case, I think we would somehow need to accept that importing a transitive dependency is OK, and we may not want to open that can of worms: Not only would it loosen the principle of "always declare what you import", but also FawltyDeps currently does not look at transitive package dependencies at all, and I suspect that adding that functionality would carry both complexity and runtime overhead that we would rather avoid.

That said, maybe extras should not be counted as transitive dependencies?

On the one hand, when you say boto3-stubs = { extras = ["ec2"], ... }, you are indeed fairly explicit in your declaration, and it should not be lumped together with "vanilla" transitive dependencies (e.g. "package foo depends on requests, and therefore I can import requests without declaring it...").

On the other hand, you don't control the exact meaning of boto3-stubs[ec2], and a future version of this extra could e.g. alter the stubs naming scheme and leave your import mypy_boto3_ec2 high and dry.

Also, having to look at extras will carry some of the same complexity/overhead mentioned above.

I believe some more thinking is needed here. Is a type stubs-related special case the right way to solve this? Are there more general solutions that solve the problem without regressions or significant overhead?

Footnotes

  1. This will prevent an "unused" type stubs dependency from being reported when the corresponding non-stubbed name is imported. In your scenario this would correspond to you declaring boto3-stubs as a dependency, and your code only containing import boto3, but no import boto3_stubs. The rationale for this code is that you can declare a dependency on a type stubs packages, without intending to import that package (instead relying on your type checker to benefit from the type stubs being installed in your environment). FawltyDeps will thus not consider the type stubs package as unused.

  2. And for the existing type stubs handling in our code, it certainly seems like we should handle packages with a mypy- prefix the same as we handle those with a -stubs suffix.

@tungol
Copy link
Author

tungol commented Jun 5, 2024

do you also import boto3_ec2 elsewhere in your code?)

The relevant "real" import in this case is just boto3, which unlike the type stubs package does handle all AWS services out of a single package. So that wouldn't work as a check. (You can import boto3.ec2 but the most common pattern for using boto3 is import boto3; client = boto3.client('ec2') so that wouldn't be a helpful check either.)

As far as rules for type stubs go, I think that "if TYPE_CHECKING doesn't count" would be a better rule than "import names that look like type stubs" - there are reasons to use type stubs at runtime, and I'd expect those imports to be enforced by fawltydeps. That might be best with a flag to control strictness? I can easily imagine some projects wanting to validate TYPE_CHECKING imports and other projects not.

Honestly I've mostly talked myself around to "the right solution is the current behavior" on this.

@jherland
Copy link
Member

jherland commented Jun 6, 2024

I think I agree that a configurable include/exclude TYPE_CHECKING imports might be the way to go here. Not sure how much work that entails in terms of keeping track of TYPE_CHECKING conditionals when we traverse the parsed Python code.

I'll leave this issue as a P3 ("we'll get around to it at some point") for now (agree, @mknorps?), and suggest one of these workarounds for you, in the meantime:

  • Set up a custom mapping that "shortcuts" your boto3-stubs dependency to the mypy_boto3_ec2 import name. E.g. in your pyproject.toml:
    [tool.fawltydeps.custom_mapping]
    boto3-stubs = ["mypy_boto3_ec2"]
  • Tell FawltyDeps to simply ignore the undeclared mypy_boto3_ec2 import, with --ignore-undeclared mypy_boto3_ec2 on the command line, or a corresponding directive in pyproject.toml.

@jherland jherland added P3 minor: not priorized parsing-imports labels Jun 6, 2024
@mknorps
Copy link
Collaborator

mknorps commented Jun 10, 2024

For context: a comment I made on 5th June, that was removed:

From my short exploration, the situation is following:
Given that you have mypy-boto-ec2 installed, FawltyDeps is using the following code to establish what import names are provided by the package https://github.com/tweag/FawltyDeps/blob/main/fawltydeps/packages.py#L250:

from importlib_metadata import (
    _top_level_declared,
    _top_level_inferred,
)
...
            imports = list(
                _top_level_declared(dist)  # type: ignore[no-untyped-call]
                or _top_level_inferred(dist)  # type: ignore[no-untyped-call]
            )

In top_level.txt file of boto3-stubs there is only boto3-stubs name exported. We will have to investigate what is inferred in this case.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
P3 minor: not priorized parsing-imports
Projects
None yet
Development

No branches or pull requests

3 participants