Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Random lifetime may not live long enough from other part of the code #96645

Open
DzenanJupic opened this issue May 2, 2022 · 20 comments
Open
Labels
A-incr-comp Area: Incremental compilation A-NLL Area: Non-lexical lifetimes (NLL) NLL-complete Working towards the "valid code works" goal P-medium Medium priority T-compiler Relevant to the compiler team, which will review and decide on the PR/issue.

Comments

@DzenanJupic
Copy link

I just stumbled across a weird and inconsistent behaviour, that seems unreasonable to me.

The project I'm working on is a big company project (closed source) with quite some files. While fixing a bug and editing one line in one module (src/dispatcher/compiler/module_range.rs) the compiler suddenly decided to reject the code of a function from a completely different module (src/services/module_state_service.rs) that has absolutely nothing to do with the module I'm working on.

The change that I made:

fn get_next_module_start(&self, module: &FunctionalModule<'a>, starter: &'a Fmid) -> Result<usize> {
        // -- snip --
        WorkflowIterator::exe_path_iter(
            self.modules,
            starter,
            |m| m.fmd.next_module.as_ref().into_option(),
-           |m| super::sort_exe_paths(&m.fmd.exe_paths),
+           |m| (m != module).then(|| super::sort_exe_paths(&m.fmd.exe_paths)).into_iter().flatten(),
            |m| m.md.kind,
        )
        // -- snip --
}

The compiler error:

error: lifetime may not live long enough
  --> src/services/module_state_service.rs:35:44
   |
32 |           &'_ self,
   |            -- let's call the lifetime of this reference `'1`
33 |           workflow_id: &WorkflowGroupId,
34 |           fmid: &Fmid,
   |                 - let's call the lifetime of this reference `'2`
35 |       ) -> Option<RwLockReadGuard<'_, [u8]>> {
   |  ____________________________________________^
36 | |         let map_guard = self.module_state.read().await;
37 | |
38 | |         let contains_state = map_guard
...  |
49 | |         }))
50 | |     }
   | |_____^ associated function was supposed to return data with lifetime `'1` but it is returning data with lifetime `'2`

error: lifetime may not live long enough
  --> src/services/module_state_service.rs:47:9
   |
32 |           &'_ self,
   |            -- let's call the lifetime of this reference `'1`
33 |           workflow_id: &WorkflowGroupId,
34 |           fmid: &Fmid,
   |                 - let's call the lifetime of this reference `'2`
...
47 | /         Some(RwLockReadGuard::map(map_guard, |map| {
48 | |             &**map.get(workflow_id).unwrap().get(fmid).unwrap()
49 | |         }))
   | |___________^ associated function was supposed to return data with lifetime `'2` but it is returning data with lifetime `'1`

The function the compiler complains about:

pub async fn get_module_state(
    &'_ self,
    workflow_id: &WorkflowGroupId,
    fmid: &Fmid,
) -> Option<RwLockReadGuard<'_, [u8]>> {
    let map_guard = self.module_state.read().await;

    let contains_state = map_guard
        .get(workflow_id)
        .map(|map| map.contains_key(fmid))
        .unwrap_or_default();

    if !contains_state {
        return None;
    }

    Some(RwLockReadGuard::map(map_guard, |map| {
        &**map.get(workflow_id).unwrap().get(fmid).unwrap()
    }))
}

When undoing the one-line change, the code compiled fine again, and after redoing it, it failed to compile again.
Now, after indenting a few pieces of code, to make pasting them into GitHub easier, the compiler again accepts the code.
I'm also no longer able to recreate the error.

I know, that's essentially nothing to work with, and I'm not even able to recreate the issue myself anymore, but it seems worth reporting to me.

@DzenanJupic
Copy link
Author

Could be closed immediately if not of value.

@lcnr lcnr added the A-incr-comp Area: Incremental compilation label May 9, 2022
@estebank estebank added the T-compiler Relevant to the compiler team, which will review and decide on the PR/issue. label May 10, 2022
@Edgeworth
Copy link

Edgeworth commented May 14, 2022

This has also been happening to me on recent nightlies (for the past week or two). Unfortunately, it does sound like a hard to repro bug. I have found this problem happens much more in rust-analyzer than in command line compiles, not sure if that means anything though.

As a workaround, I am doing a clean rebuild.

@Edgeworth
Copy link

This is now happening constantly for me (almost every build). Unfortunately, I don't have a repro. It does seem to be somehow related to rust-analyzer though (very low confidence for this).

@DzenanJupic
Copy link
Author

I don't use rust-analyzer, so it's probably not the origin. Though I didn't have any more problems so far.

@Demindiro
Copy link

I also had this happen though I haven't been able to reproduce it either. As far as I can tell it's related to async functions since I haven't seen it occur in any other case. Specifying lifetime bounds manually seems to help though.

@audunska
Copy link

audunska commented Jun 13, 2022

Seeing something similar in our closed-source module. The beta compiler (1.62.0-beta.4) suddenly started rejecting some sqlx code which looks like this (details omitted):

#[derive(Debug, sqlx::FromRow)]
struct Grid {
  id: i64,
  row: f32,
};

struct Target {
  id: i64,
  grid_id: Option<i64>,
  ...
}

/// Just for namespacing
struct GridSql;

impl GridSql {
    pub async fn add_empty<'t>(
        target: &Target,
        mut tx: sqlx::Transaction<'t, Postgres>,
    ) -> sqlx::Result<sqlx::Transaction<'t, Postgres>> {
        let grid_id = sqlx::query_as!(
            Grid,
            r#"
            INSERT INTO grid (
               row
            ) VALUES($1)
            RETURNING
                id, ...
            "#,
            0.0,
        )
        .fetch_one(&mut tx)
        .await?
        .id;
        sqlx::query!(
            "UPDATE target SET grid_id = $1 WHERE id = $2",
            grid_id,
            target.id,
        )
        .execute(&mut tx)
        .await?;
        Ok(tx)
    }
}

The error message is:

error: lifetime may not live long enough
  --> src/lib/grid.rs:70:56
   |
67 |       pub async fn add_empty<'t>(
   |                              -- lifetime `'t` defined here
68 |           target: &Target,
   |                         - let's call the lifetime of this reference `'1`
69 |           mut tx: sqlx::Transaction<'t, Postgres>,
70 |       ) -> sqlx::Result<sqlx::Transaction<'t, Postgres>> {
   |  ________________________________________________________^
71 | |         let grid_id = sqlx::query_as!(
72 | |             Grid,
73 | |             r#"
...  |
97 | |         Ok(tx)
98 | |     }
   | |_____^ associated function was supposed to return data with lifetime `'t` but it is returning data with lifetime `'1`

error: lifetime may not live long enough
  --> src/lib/grid.rs:97:9
   |
67 |     pub async fn add_empty<'t>(
   |                            -- lifetime `'t` defined here
68 |         target: &Target,
   |                       - let's call the lifetime of this reference `'1`
...
97 |         Ok(tx)
   |         ^^^^^^ associated function was supposed to return data with lifetime `'1` but it is returning data with lifetime `'t`

This does not happen on stable.

Unfortunately, pulling the code out of the codebase verbatim did not give an error, so creating a reproducible example seems very hard again.

@DzenanJupic
Copy link
Author

Hopefully, it has nothing to do with switching to the MIR borrow checker 😬. As far as I know, the plan is to stabilise it in August. Since @Edgeworth and I first noticed the bug on nightly, and it now propagated to beta this might be the case.

I have no idea when the NLL borrow checking was merged though, so it might be something different.

@DzenanJupic
Copy link
Author

@rustbot label +A-NLL +NLL-complete

@rustbot rustbot added A-NLL Area: Non-lexical lifetimes (NLL) NLL-complete Working towards the "valid code works" goal labels Jun 13, 2022
@Edgeworth
Copy link

By the way, I just found a breaking case for this even in the explicitly specified lifetimes case.

Originally, I had this, which sometimes produced the error message in the original post.
pub async fn rs(&mut self, key: &RecKey) -> Result<&RecSeries>

The frequency of this error decreased after I changed it to this signature:

pub async fn rs<'a, 'b>(&'a mut self, key: &'b RecKey) -> Result<&'a RecSeries>

However, I just got this error recently:

lifetime may not live long enough
consider adding the following bound: `'b: 'a`

Note that there is no reason for 'b to have to outlive 'a in my actual code (and indeed, it almost always compiles without issue).

@DzenanJupic
Copy link
Author

Looks like one of the new error messages of the NLL borrow checker: https://jackh726.github.io/rust/2022/06/10/nll-stabilization.html

@audunska
Copy link

For what it's worth, I haven't seen this since upgrading to the 1.63 beta.

@Jasperav
Copy link

I had this problem as well. I updated to 1.64 and the problem didn't occur when building clean.

@Jasperav
Copy link

Now I got the problem on 1.64 after a few builds, clean builds help but they take forever

@estebank estebank added I-prioritize Issue: Indicates that prioritization has been requested for this issue. I-compiler-nominated Nominated for discussion during a compiler team meeting. labels Sep 15, 2022
@estebank
Copy link
Contributor

Nominating for @rust-lang/compiler discussion because it seems like a latent bug or regression in incr-comp with nll borrowck.

@DzenanJupic
Copy link
Author

@dtolnay dtolnay changed the title Random lifetime may not life long enought from other part of the code Random lifetime may not live long enough from other part of the code Sep 15, 2022
@apiraino
Copy link
Contributor

Discussed during T-compiler meeting on Zulip (notes).

@rustbot label -I-compiler-nominated

@rustbot rustbot removed the I-compiler-nominated Nominated for discussion during a compiler team meeting. label Sep 15, 2022
@DzenanJupic
Copy link
Author

I found a way to reliably reproduce the error locally (win 10). It, unfortunately, does not work in WSL or docker 😐.
Just published a repository, with the code I used to reproduce it: https://github.com/DzenanJupic/96645-reproduction

It's more or less a fuzzer, that generates random code from all the examples that were posted in this thread. It also sets up the exact same toolchain situation as on my local machine at the time I encountered the error (default: nightly-2022-07-12, override: stable-2022-06-30).

To run it, just execute cargo run.
!!! It will install new toolchains, and change your default toolchain !!!
It takes about max one or two minutes for me (when I run it in win 10)

Since I cannot reproduce it in WSL or docker, I included 5 target directories and Cargo.locks of times when it failed: https://github.com/DzenanJupic/96645-reproduction/tree/master/reproductions

Hopefully, that helps :) If I should run any other command locally, or I should share other files, I can do so.

@DzenanJupic
Copy link
Author

Cannot reproduce it with the latest stable/nightly on my machine

@lqd
Copy link
Member

lqd commented Sep 16, 2022

I feel there are multiple issues at play here, due to the multiple lifetime resolution on AST & incremental compilation issues, whose introduction and fixes spanned a release.

Some of these examples look similar to #98890, which matches that timeline:

Some also look coincidentally fixed by #97415 a few weeks beforehand (e.g. if you simply compile module B and then concat D, that reproduces the issue but the PR fixes that)

Some commenters mention having the issue on 1.64.0 nightlies but we don't have reproductions for those yet.

@apiraino
Copy link
Contributor

WG-prioritization assigning priority (Zulip discussion).

@rustbot label -I-prioritize +P-medium

@rustbot rustbot added the P-medium Medium priority label Jan 25, 2023
@rustbot rustbot removed the I-prioritize Issue: Indicates that prioritization has been requested for this issue. label Jan 25, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
A-incr-comp Area: Incremental compilation A-NLL Area: Non-lexical lifetimes (NLL) NLL-complete Working towards the "valid code works" goal P-medium Medium priority T-compiler Relevant to the compiler team, which will review and decide on the PR/issue.
Projects
None yet
Development

No branches or pull requests

10 participants