Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Perform lifetime elision (more) syntactically, before type-checking. #39305

Merged
merged 12 commits into from
Jan 28, 2017

Conversation

eddyb
Copy link
Member

@eddyb eddyb commented Jan 25, 2017

The initial goal of this patch was to remove the (contextual) &RegionScope argument passed around rustc_typeck::astconv and allow converting arbitrary (syntactic) hir::Ty to (semantic) Ty.
I've tried to closely match the existing behavior while moving the logic to the earlier resolve_lifetime pass, and the crater report suggests none of the changes broke real code, but I will try to list everything:

There are few cases in lifetime elision that could trip users up due to "hidden knowledge":

type StaticStr = &'static str; // hides 'static
trait WithLifetime<'a> {
    type Output; // can hide 'a
}

// This worked because the type of the first argument contains
// 'static, although StaticStr doesn't even have parameters.
fn foo(x: StaticStr) -> &str { x }

// This worked because the compiler resolved the argument type
// to <T as WithLifetime<'a>>::Output which has the hidden 'a.
fn bar<'a, T: WithLifetime<'a>>(_: T::Output) -> &str { "baz" }

In the two examples above, elision wasn't using lifetimes that were in the source, not even needed by paths in the source, but rather happened to be part of the semantic representation of the types.
To me, this suggests they should have never worked through elision (and they don't with this PR).

Next we have an actual rule with a strange result, that is, the return type here elides to &'x str:

impl<'a, 'b> Trait for Foo<'a, 'b> {
    fn method<'x, 'y>(self: &'x Foo<'a, 'b>, _: Bar<'y>) -> &str {
        &self.name
    }
}

All 3 of 'a, 'b and 'y are being ignored, because the &self elision rule only cares that the first argument is "self by reference". Due implementation considerations (elision running before typeck), I've limited it in this PR to a reference to a primitive/struct/enum/union, but not other types, but I am doing another crater run to assess the impact of limiting it to literally &self and self: &Self (they're identical in HIR).

It's probably ideal to keep an "implicit Self for self" type around and only apply the rule to &self itself, but that would result in more bikeshed, and #21400 suggests some people expect otherwise.
Another decent option is treating self: X, ... -> Y like X -> Y (one unique lifetime in X used for Y).

The remaining changes have to do with "object lifetime defaults" (see RFCs 599 and 1156):

trait Trait {}
struct Ref2<'a, 'b, T: 'a+'b>(&'a T, &'b T);

// These apply specifically within a (fn) body,
// which allows type and lifetime inference:
fn main() {
    // Used to be &'a mut (Trait+'a) - where 'a is one
    // inference variable - &'a mut (Trait+'b) in this PR.
    let _: &mut Trait;

    // Used to be an ambiguity error, but in this PR it's
    // Ref2<'a, 'b, Trait+'c> (3 inference variables).
    let _: Ref2<Trait>;
}

What's happening here is that inference variables are created on the fly by typeck whenever a lifetime has no resolution attached to it - while it would be possible to alter the implementation to reuse inference variables based on decisions made early by resolve_lifetime, not doing that is more flexible and works better - it can compile all testcases from #38624 by not ending up with &'static mut (Trait+'static).

The ambiguity specifically cannot be an early error, because this is only the "default" (typeck can still pick something better based on the definition of Trait and whether it has any lifetime bounds), and having an error at all doesn't help anyone, as we can perfectly infer an appropriate lifetime inside the fn body.

TODO: write tests for the user-visible changes.

cc @nikomatsakis @arielb1

@rust-highfive
Copy link
Collaborator

r? @nrc

(rust_highfive has picked a reviewer for you, use r? to override)

@nrc nrc assigned nikomatsakis and unassigned nrc Jan 25, 2017
@arielb1
Copy link
Contributor

arielb1 commented Jan 25, 2017

Another decent option is treating self: X, ... -> Y like X -> Y (one unique lifetime in X used for Y).

That's certainly an interesting option with the rule being "resolution-based". I would like either that or to treat "just &self with HIR changes".

If we're changing things here, I would also prefer to change &mut *const Trait to be &mut *const Trait+'static, to be consistent with &mut RawPtr<Trait+'static>.

@eddyb
Copy link
Member Author

eddyb commented Jan 25, 2017

The second crater report (enabling the &self rule only for &self and self: &Self):

  • a bunch of breaking libsyntax changes in between crater runs (I rebased)
  • @joelself's tomllib (failure) is the only true regression
pub fn get_children<S>(self: &Parser<'a>, key: S) -> Option<&Children>

This is really bad because it doesn't fit the "only literally &self" rule, not even the "self: X with exactly one unique lifetime in X" rule, and I'd rather not break anything in practice, at least to start off with.

@eddyb
Copy link
Member Author

eddyb commented Jan 25, 2017

@alexcrichton I'm getting a travis failure but all I did was move E0106's explanation (including 2 code examples) from src/librustc_typeck/diagnostics.rs to src/librustc/diagnostics.rs - is the former not being checked at all, or is this somehow my bug still? (I checked them locally and the result is the same)

@alexcrichton
Copy link
Member

@eddyb whoa I can't even interpret that failure at all... I'm not too familiar with error index checking, but others may know more

@eddyb
Copy link
Member Author

eddyb commented Jan 26, 2017

I checked and the indices do correspond to E0106's code examples (in the combined error code page).
Since it's combined, my earlier question doesn't make sense at all, now that I think about it.

cc @rust-lang/docs

@GuillaumeGomez
Copy link
Member

@eddyb: I found 9 references to E0106 but they're lost in the middle of other errors. Seems a bit strange and not unexpected, no? Not sure of what is the problem in here.

@bors
Copy link
Contributor

bors commented Jan 26, 2017

☔ The latest upstream changes (presumably #39309) made this pull request unmergeable. Please resolve the merge conflicts.

@nikomatsakis
Copy link
Contributor

@arielb1

If we're changing things here, I would also prefer to change &mut *const Trait to be &mut *const Trait+'static, to be consistent with &mut RawPtr<Trait+'static>.

Hmm, this does seem like an oversight in the original RFC. I think the "spirit" of the rules would indeed be that *const T and *mut T "reset" the default, just as Box<T> does.


@eddyb @arielb1

Another decent option is treating self: X, ... -> Y like X -> Y (one unique lifetime in X used for Y).

I agree this is potentially a nice rule but I do not feel comfortable making changes like this to the RFC without at minimum wider discussion. What I had personally envisioned is that we would create a new elision rule which avoids some of the more confusing cases and open an RFC to lint for cases that fall outside of that newer rule. Similarly, I might want to pair such a rule with some improvements that allow for more anonymous or lightweight parameters -- i.e., so that even if you don't have to say nothing at all, you maybe don't have to go all the way towards declaring a named parameter -- but that depends on how many errors we see on crates.io.

A prime example of something I would want to be a lint is this:

struct Ref<'a, T> { ... }
impl Something {
    fn foo(&self) -> Ref<i32>;
}

But I do not want you to have to write fn foo<'a>(&'a self) -> Ref<'a, i32>. I think I would rather you can write something like Ref<&, i32>. I'm sure many will hate this but anyway my point is that while I do think the current rules can be improved, I think this is something we should discuss more broadly, unless there is something clearly broken in current implementation.

@nikomatsakis
Copy link
Contributor

In terms of what's actually implemented, then, the main change is that this all happens before typeck and thus has to be driven by what is syntactically written, right? So in particular the two examples you gave no longer "work":

fn foo(x: StaticStr) -> &str { x }
fn bar<'a, T: WithLifetime<'a>>(_: T::Output) -> &str { "baz" }

But any case like &self, self: &Self, or self: &Parser<'a, 'b> continues to work as it did before, right?

@nikomatsakis
Copy link
Contributor

Somehow those two changes feel "in the spirit" of the original RFC to me. I'm not sure how I feel about changing the behavior of self: &Parser<'a, 'b>. This seems a bit less clear. I think I am thinking in part of the (not accepted) RFC to disable elision in impls with lifetime parameters, for fear of confusion -- which would have been similar but more extreme.

@nikomatsakis
Copy link
Contributor

In any case, I think that this is something that @rust-lang/lang should weigh in on.

@eddyb
Copy link
Member Author

eddyb commented Jan 26, 2017

I don't want to change any more here, just noting it may be a good idea to do it, so that we can have simpler and more uniform rules (considering we can't just evaluate arbitrary types).

@nikomatsakis
Copy link
Contributor

@eddyb

I don't want to change any more here, just noting it may be a good idea to do it, so that we can have simpler and more uniform rules (considering we can't just evaluate arbitrary types).

Agreed. I think the main thing I want to change is that there should never be "invisible" lifetimes. Instead we should have some sufficiently light notation for acknowledging them. So e.g., fn(&T) is ok, but instead of fn(Ref<T>) you might write fn(Ref<&, T>). I feel like I'd prefer something lighter still but haven't figured out what that might be.

Copy link
Contributor

@nikomatsakis nikomatsakis left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks good, I think, the main thing I would like is more tests, as we discussed on IRC. I'll try to think of some edge cases.

let module = self.lower_mod(&c.module);
let attrs = self.lower_attrs(&c.attrs);
let exported_macros = c.exported_macros.iter().map(|m| self.lower_macro_def(m)).collect();
struct MiscCollector<'lcx, 'interner: 'lcx> {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why such a generic name? This is basically the "lifetime collector" or something, right?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We'd use this for anything we may need to access out of order.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Seems fine. I'd probably just call it "collector" then but whatever.

@@ -89,6 +89,12 @@ impl fmt::Debug for Lifetime {
}
}

impl Lifetime {
pub fn is_elided(&self) -> bool {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I feel like we should leave a comment on the Name field, though I might prefer to make an enum here. Something like

enum LifetimeName {
     Elided,
    Present(Name)
}

Using a "sentinel" value feels a bit non-obvious.

@bors
Copy link
Contributor

bors commented Jan 27, 2017

☔ The latest upstream changes (presumably #39139) made this pull request unmergeable. Please resolve the merge conflicts.

@eddyb
Copy link
Member Author

eddyb commented Jan 27, 2017

Oh, these are the failures:

error[E0106]: missing lifetime specifier
 --> <anon>:6:10
  |
6 |     foo: Foo,     // error: expected 1, found 0
  |          ^^^ expected lifetime parameter

thread 'rustc' panicked at 'Some expected error codes were not found: ["E0107"]'
error[E0106]: missing lifetime specifier
 --> <anon>:5:15
  |
5 | impl Quux for Foo { } // error: expected 1, found 0
  |               ^^^ expected lifetime parameter

thread 'rustc' panicked at 'Some expected error codes were not found: ["E0107"]

Looks like this is about E0107 changing to E0106, not E0106 examples themselves.

@eddyb
Copy link
Member Author

eddyb commented Jan 27, 2017

@bors r=nikomatsakis

@bors
Copy link
Contributor

bors commented Jan 27, 2017

📌 Commit 8fbf336 has been approved by nikomatsakis

@bors
Copy link
Contributor

bors commented Jan 28, 2017

☔ The latest upstream changes (presumably #37057) made this pull request unmergeable. Please resolve the merge conflicts.

@eddyb
Copy link
Member Author

eddyb commented Jan 28, 2017

@bors r=nikomatsakis

@bors
Copy link
Contributor

bors commented Jan 28, 2017

📌 Commit c0e474d has been approved by nikomatsakis

bors added a commit that referenced this pull request Jan 28, 2017
@bors
Copy link
Contributor

bors commented Jan 28, 2017

⌛ Testing commit c0e474d with merge 0f49616...

bors added a commit that referenced this pull request Jan 28, 2017
Perform lifetime elision (more) syntactically, before type-checking.

The *initial* goal of this patch was to remove the (contextual) `&RegionScope` argument passed around `rustc_typeck::astconv` and allow converting arbitrary (syntactic) `hir::Ty` to (semantic) `Ty`.
I've tried to closely match the existing behavior while moving the logic to the earlier `resolve_lifetime` pass, and [the crater report](https://gist.github.com/eddyb/4ac5b8516f87c1bfa2de528ed2b7779a) suggests none of the changes broke real code, but I will try to list everything:

There are few cases in lifetime elision that could trip users up due to "hidden knowledge":
```rust
type StaticStr = &'static str; // hides 'static
trait WithLifetime<'a> {
    type Output; // can hide 'a
}

// This worked because the type of the first argument contains
// 'static, although StaticStr doesn't even have parameters.
fn foo(x: StaticStr) -> &str { x }

// This worked because the compiler resolved the argument type
// to <T as WithLifetime<'a>>::Output which has the hidden 'a.
fn bar<'a, T: WithLifetime<'a>>(_: T::Output) -> &str { "baz" }
```

In the two examples above, elision wasn't using lifetimes that were in the source, not even *needed* by paths in the source, but rather *happened* to be part of the semantic representation of the types.
To me, this suggests they should have never worked through elision (and they don't with this PR).

Next we have an actual rule with a strange result, that is, the return type here elides to `&'x str`:
```rust
impl<'a, 'b> Trait for Foo<'a, 'b> {
    fn method<'x, 'y>(self: &'x Foo<'a, 'b>, _: Bar<'y>) -> &str {
        &self.name
    }
}
```
All 3 of `'a`, `'b` and `'y` are being ignored, because the `&self` elision rule only cares that the first argument is "`self` by reference". Due implementation considerations (elision running before typeck), I've limited it in this PR to a reference to a primitive/`struct`/`enum`/`union`, but not other types, but I am doing another crater run to assess the impact of limiting it to literally `&self` and `self: &Self` (they're identical in HIR).

It's probably ideal to keep an "implicit `Self` for `self`" type around and *only* apply the rule to `&self` itself, but that would result in more bikeshed, and #21400 suggests some people expect otherwise.
Another decent option is treating `self: X, ... -> Y` like `X -> Y` (one unique lifetime in `X` used for `Y`).

The remaining changes have to do with "object lifetime defaults" (see RFCs [599](https://github.com/rust-lang/rfcs/blob/master/text/0599-default-object-bound.md) and [1156](https://github.com/rust-lang/rfcs/blob/master/text/1156-adjust-default-object-bounds.md)):
```rust
trait Trait {}
struct Ref2<'a, 'b, T: 'a+'b>(&'a T, &'b T);

// These apply specifically within a (fn) body,
// which allows type and lifetime inference:
fn main() {
    // Used to be &'a mut (Trait+'a) - where 'a is one
    // inference variable - &'a mut (Trait+'b) in this PR.
    let _: &mut Trait;

    // Used to be an ambiguity error, but in this PR it's
    // Ref2<'a, 'b, Trait+'c> (3 inference variables).
    let _: Ref2<Trait>;
}
```
What's happening here is that inference variables are created on the fly by typeck whenever a lifetime has no resolution attached to it - while it would be possible to alter the implementation to reuse inference variables based on decisions made early by `resolve_lifetime`, not doing that is more flexible and works better - it can compile all testcases from #38624 by not ending up with `&'static mut (Trait+'static)`.

The ambiguity specifically cannot be an early error, because this is only the "default" (typeck can still pick something better based on the definition of `Trait` and whether it has any lifetime bounds), and having an error at all doesn't help anyone, as we can perfectly infer an appropriate lifetime inside the `fn` body.

**TODO**: write tests for the user-visible changes.

cc @nikomatsakis @arielb1
@bors
Copy link
Contributor

bors commented Jan 28, 2017

☀️ Test successful - status-appveyor, status-travis
Approved by: nikomatsakis
Pushing 0f49616 to master...

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

8 participants