Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Bad inference for asyncio.gather with *args #5124

Closed
JelleZijlstra opened this issue May 30, 2018 · 12 comments
Closed

Bad inference for asyncio.gather with *args #5124

JelleZijlstra opened this issue May 30, 2018 · 12 comments
Labels
bug mypy got something wrong topic-overloads

Comments

@JelleZijlstra
Copy link
Member

JelleZijlstra commented May 30, 2018

On the following file:

import asyncio

reveal_type(asyncio.gather(*[asyncio.sleep(1), asyncio.sleep(1)]))

I expect to get something like Future[Tuple[Any, ...]], but actually with current master I get:

$ mypy bin/asgat.py 
bin/asgat.py:3: error: Revealed type is 'asyncio.futures.Future[Tuple[<nothing>]]'

(i.e., a single-element tuple).

The asyncio.gather stubs at https://github.com/python/typeshed/blob/master/stdlib/3.4/asyncio/tasks.pyi#L27 look right to me.

I found this in real code that was doing something like a, b = await asyncio.gather(*[task1, task2]).

cc @Michael0x2a

@ilevkivskyi
Copy link
Member

The <nothing> part that confused me first is unrelated, it is a known issue:

reveal_type(asyncio.sleep(1))  # Future[<nothing>]

see #3032

The actual issue is better seen here,

reveal_type(asyncio.gather(*[asyncio.sleep(1, 1.0), asyncio.sleep(1, 2.0)]))  # Future[Tuple[builtins.float*]]

A type is correct, but single item tuple is inferred instead of variable length one. I think the problem is that the first overload matches the star signature, because they are overlapping. There already a TODO item to prohibit such unsafe overlaps in #5119, second bullet, but I am not sure what should we actually write in typeshed in such cases. Maybe we need to re-think the second bullet. @Michael0x2a your opinion?

@ilevkivskyi ilevkivskyi added bug mypy got something wrong topic-overloads labels May 30, 2018
@JelleZijlstra
Copy link
Member Author

(To be clear this is a regression on master; it didn't happen in 0.590 or 0.600.)

@ilevkivskyi
Copy link
Member

To be clear this is a regression on master

This is (kind of) expected. It was a necessary compromise, we decided to split overload overhaul in few PRs to avoid single mega-PR.

@Michael0x2a
Copy link
Collaborator

I think this is a case of a previously silent bug becoming more visible. If you try running @ilevkivskyi's example on mypy 0.600, the revealed type will be just Any.

But my PR removes the "let's have overloads sometimes give up and return Any" behavior which, unfortunately, ends up breaking some code.

I think the easiest way to fix this would be to flip the order in which the overloads are defined so that the alternative with the long list of arguments and the return type of Future[Tuple[Any, ...]] comes first. I believe this should be a safe rearrangement: it looks like the alternatives all have non-overlapping arity so it should be safe to rearrange them in any order.

The only catch is that when I tried making that change, mypy ended up inferring a return type of Future[Tuple[Any]] instead of Future[Tuple[Any, ...]] for some reason.

I think this might actually be yet another unrelated bug: I get the same issue when I try doing this:

def foo() -> Tuple[int, ...]: ...

reveal_type(foo())  # Revealed type is 'Tuple[int]', not 'Tuple[int, ...]'?

(That said, I suspect this sort of pattern might end up being mildly confusing: you typically want the signature with the widest types to come last so you avoid shadowing another signature, but you want the signatures with the largest number of arguments to come first so that func(*args) is most likely to do the right thing. There's a bit of an awkward tension there -- not sure what to do about that.)

@Michael0x2a
Copy link
Collaborator

Also, something I can also do is write a script to search to search typescript to search for long overloads/overloads with a steadily increasing number of arguments -- basically, search for cases where people are trying to use overloads to hack around the lack of "variable type vars" and manually check to make sure they're all ok.

@ilevkivskyi
Copy link
Member

I think this might actually be yet another unrelated bug:

This is a bug in reveal_type (or actually in how types are repred internally). I just tried few more tests and the resulting type behaves as variable length tuple and appears as such in error messages.

In the original example it is however a single element tuple (mypy gives "Index out of range" error), so indeed the first overload is matched. We just had a discussion with @JukkaL and agreed the best way to proceed is to modify the rule "in ambiguous cases first overload wins" to "in ambiguous cases type checker are free to determine best match; if a checker sees more than one equally best match, the first one wins." This has two upsides:

  • Some type-checkers may be not happy about alway choosing first overloads in cases of empty containers, multiple inheritance, etc. We give some freedom here, since this is anyway a corner case.
  • We can special case this situation by preferably matching star to star if there are multiple matches.

@Michael0x2a what do you think? If you agree, could you please update the checklist in #5119

@Michael0x2a
Copy link
Collaborator

@ilevkivskyi -- I'd be ok with leaving the behavior of overloads undefined when the ambiguity is due to things like multiple inheritance.

I'm a bit more hesitant about allowing the behavior to be undefined in other cases, though. In particular, I think it's worth pinning down what exactly ought to happen when you do my_overload(*args) and there are multiple matches: I think that'll end up being a relatively common case.

I'm not sure what exactly the spec should look like though.

One idea I had was that if a call could pass in a potentially infinite number of arguments (e.g. when doing my_overload(*args) or my_overload(**kwargs) or both), we automatically disqualify any overload alternative that can accept only a finite number of arguments as a potential match. We then take the alternatives that are left and pick the first one that matches.

If there are no matches left (e.g. none of the alternative signatures contain an *arg or **kwarg parameter), we could either be strict and throw an error or could just fall back to the existing behavior.


As a minor aside, I think the way we handle empty containers I think is already undefined. Basically, when we do my_overload([]), type checkers need to infer what the type of [] will end up being -- but we never specify anywhere (either in PEP 484 or in my proposal) how exactly that should be done.

Mypy currently will pick the first matching overload and use that to infer the correct type if the empty container is inline; if we try assigning the empty list to a variable it mandates a type annotation. I could see other type checkers being more strict in the case of the former -- they could maybe infer some sort of union type or maybe require an explicit cast or something.

@ilevkivskyi
Copy link
Member

I'm a bit more hesitant about allowing the behavior to be undefined in other cases, though. In particular, I think it's worth pinning down what exactly ought to happen when you do my_overload(*args) and there are multiple matches: I think that'll end up being a relatively common case.

OK, makes sense. Although the overloads like in this issue may be less important when we will have variadic type variables, the calls like this may still happen quite often.

The problem however is that the behaviour of star args is not well specified even for non-overloaded calls. A better spec may be also part of the upcoming shape types PEP (the "numpy" one). With the new PEP each collection will be given not only item type, but also shape (generalisation of size for 1D collections). The latter can be also Any-shape or object-shape. Currently, there is a bit of inconsistency in that sometimes arbitrary length collections behave like object-shape (Tuple[int, ...] is not a subtype of Tuple[<fixed number of ints>]) while sometimes like Any-shape (f(*lst) calls are accepted if f takes only fixed number of args). Ideally we should specify this at some point.

Coming back to overloads, current behaviour for non-overloaded calls is mostly Any-like with respect to shape (just length in this case), so we should be consistent. First I think this call should result in superclass, not Any as currently:

class B(A): ...

@overload
def k(x: int) -> B: ...
@overload
def k(x: Any) -> A: ...   # note _explicit_ Any overload

a: Any
k(a)  # Currently this is Any, but it is safe to infer 'A' I think.

Second, we can continue this to containers:

@overload
def h(x: List[int]) -> List[int]: ...
@overload
def h(x: List[Any]) -> List[Any]: ...   # Again, explicit Any

ll: List[Any]
h(ll)  # I think this should be List[Any], not Any as currently

Now coming back to star args. I think they should behave as Any-shape (or Any-size). So continuing the logic above, if there is an explicit star overload, we should match it (as I think we should match explicit Any-type overload, instead of assuming call is ambiguous). If there is no explicit star overload, then the call is ambiguous and type checkers are free to chose first (default) or infer a union, or Any result as we do for ambiguous matches due to Any.

@Michael0x2a what do you think? Sorry for potentially fuzzy logic, I hope you get my idea, I really need to run now.

@Michael0x2a
Copy link
Collaborator

@ilevkivskyi -- sorry for the delayed response; I didn't see this until now.

  1. Oh, I didn't realize that the behavior of star args was undefined. In that case, I'm a bit more comfortable with mypy doing its own thing.

  2. Regarding the Any inference thing -- I think we can get the behavior you're looking for relatively easily by checking the list of matching return types: if the last match is a supertype of all the previous ones, we return that instead of inferring 'Any'.

    I guess I'm not really sure what this has to do with "shape", though.

  3. I agree with your proposal for star overloads -- I think it actually amounts to the same thing (or at least a very similar thing) to my proposal.

    And just to make sure we're on the same page: if a call matches multiple alternatives that have star args, we're ok with picking and using the first one, right?

@ilevkivskyi
Copy link
Member

I guess I'm not really sure what this has to do with "shape", though.

This is normal :-) I mean I was thinking a lot about them recently in context of numeric stack support, so now I see analogies everywhere.

if the last match is a supertype of all the previous ones, we return that instead of inferring 'Any'.

This sounds right.

And just to make sure we're on the same page: if a call matches multiple alternatives that have star args, we're ok with picking and using the first one, right?

Yes, exactly.

Michael0x2a added a commit to Michael0x2a/mypy that referenced this issue Jun 7, 2018
This pull request implements the changes discussed in
python#5124.

Specifically...

1. When two overload alternatives match due to Any, we return the last
   matching return type if it's a supertype of all of the previous ones.
   If it's not a supertype, we give up and return 'Any' as before.

2. If a user calls an overload with a starred expression, we try
   matching alternatives with a starred arg or kwarg first, even if
   those alternatives do not appear first in the list. If none of the
   starred alternatives are a valid match, we fall back to checking the
   other remaining alternatives in order.
ilevkivskyi pushed a commit that referenced this issue Jun 13, 2018
This pull request implements the changes discussed in #5124.

Specifically...

1. When two overload alternatives match due to Any, we return the last matching return type if it's a supertype of all of the previous ones. If it's not a supertype, we give up and return 'Any' as before.

2. If a user calls an overload with a starred expression, we try matching alternatives with a starred arg or kwarg first, even if those alternatives do not appear first in the list. If none of the starred alternatives are a valid match, we fall back to checking the other remaining alternatives in order.
@Michael0x2a
Copy link
Collaborator

@JelleZijlstra -- sorry, I'm a bit late in following up on this, but we merged in #5166 a day or two ago which should hopefully make your code typecheck. When you get the chance, can you check and see if these regressions are fixed in your codebase?

@JelleZijlstra
Copy link
Member Author

Yes, all good now. Thanks!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug mypy got something wrong topic-overloads
Projects
None yet
Development

No branches or pull requests

3 participants