Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix anon const def-creation when macros are involved #129137

Merged
merged 2 commits into from
Sep 13, 2024

Conversation

camelid
Copy link
Member

@camelid camelid commented Aug 15, 2024

Fixes #128016.

Ever since #125915, some ast::AnonConsts turn into hir::ConstArgKind::Paths,
which don't have associated DefIds. To deal with the fact that we don't have
resolution information in DefCollector, we decided to implement a process
where if the anon const appeared to be trivial (i.e., N or { N }), we
would avoid creating a def for it in DefCollector. If later, in AST lowering,
we realized it turned out to be a unit struct literal, or we were lowering it
to something that didn't use hir::ConstArg, we'd create its def there.

However, let's say we have a macro m!() that expands to a reference to a free
constant FOO. If we use m!() in the body of an anon const (e.g., Foo<{ m!() }>),
then in def collection, it appears to be a nontrivial anon const and we create
a def. But the macro expands to something that looks like a trivial const arg,
but is not, so in AST lowering we "fix" the mistake we assumed def collection
made and create a def for it. This causes a duplicate definition ICE.

The long-term fix for this is to delay the creation of defs for all expression-like
nodes until AST lowering (see #128844 for an incomplete attempt at this). This
would avoid issues like this one that are caused by hacky workarounds. However,
doing this uncovers a pre-existing bug with opaque types that is quite involved
to fix (see #129023).

In the meantime, this PR fixes the bug by delaying def creation for anon consts
whose bodies are macro invocations until after we expand the macro and know
what is inside it. This is accomplished by adding information to create the
anon const's def to the data in Resolver.invocation_parents.

r? @BoxyUwU

@rustbot rustbot added S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. T-compiler Relevant to the compiler team, which will review and decide on the PR/issue. labels Aug 15, 2024
@camelid camelid added the A-const-generics Area: const generics (parameters and arguments) label Aug 15, 2024
@rust-log-analyzer

This comment has been minimized.

Copy link
Member

@BoxyUwU BoxyUwU left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you add a test for the version of the ICE that refers to an in scope generic parameter:

macro_rules! len {
    ($x:ident) => {
        $x
    };
}

fn bar<const N: usize>() {
    let val: [bool; len!(N)] = [true; N];
}

And also the one with Res::Err

macro_rules! len {
    () => {
        target
    };
}

fn main() {
    let val: [bool; len!()] = [];
}

@rustbot rustbot added S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author. and removed S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. labels Aug 15, 2024
@BoxyUwU
Copy link
Member

BoxyUwU commented Aug 15, 2024

cc @cjgillot

@rust-log-analyzer

This comment has been minimized.

@camelid camelid force-pushed the lazy-def-macro-const branch from 9e51204 to ef33827 Compare August 15, 2024 23:33
@camelid camelid added S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. and removed S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author. labels Aug 15, 2024
@rust-log-analyzer

This comment has been minimized.

@petrochenkov
Copy link
Contributor

Upd: I started reviewing this stuff today, starting from #125915.
I'm generally skeptical about moving any def collection to later stages (except for leaf DefIds perhaps), but I need to read the relevant PRs first to say more.

@petrochenkov
Copy link
Contributor

I think I disagree with both this fix and the "long term fix" from #129247.
I have left some comments in #129023.

#125915 didn't need to change anything in DefId creation and parenting.
The AnonConst def ids (or whole constants) can still be created for ConstArgKind::Path const expressions, it doesn't mean that they should be actually used in type checking and const evaluation for those expressions.

If min_generic_const_args goes further and supports arbitrary paths, then there will be situations in which it's not clear whether it is a "type system constant" even at AST lowering time.
E.g. in

Type1::Assoc<Type2::Assoc>

we do not know whether Type2::Assoc is an associated type or constant (or even enum variant constructor) because it's not resolved yet, and it cannot be determined from the expected Type1::Assoc's generic parameters as well because Type1::Assoc is not yet resolved either.

I'd prefer #125915 to be reverted and then re-landed with the HIR representation changes (hir::{ConstArg, ConstArgKind}), but without the DefId creation and parenting changes.

@petrochenkov petrochenkov added S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author. and removed S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. labels Aug 20, 2024
@camelid
Copy link
Member Author

camelid commented Aug 22, 2024

#125915 didn't need to change anything in DefId creation and parenting.
The AnonConst def ids (or whole constants) can still be created for ConstArgKind::Path const expressions, it doesn't mean that they should be actually used in type checking and const evaluation for those expressions.

I was under the impression that we have a HIR validation check that ensures all DefIds are used. Is that not right? In any case, it would seem unfortunate to create defs that don't end up being used. It's also worth pointing out that we already did (before my PR) def creation in ast_lowering for some anon consts that were not parsed as such, like const paths parsed as type paths. Finally, IIRC we need to delay def creation to properly do query feeding for anon consts, though @BoxyUwU knows much more about this than I. I'm just not sure that it's as simple as moving the def creation earlier -- though I could be wrong.

I'd prefer #125915 to be reverted and then re-landed with the HIR representation changes (hir::{ConstArg, ConstArgKind}), but without the DefId creation and parenting changes.

If we do decide to move def creation earlier, creating unused defs in the process, then I don't see the point of reverting the whole PR. We can just make a targeted fix to the def creation parts rather than redoing the whole thing.

On that note, would you be ok with landing this current PR provided that we work to figure out a longer-term solution that is effective and satisfies all parties? It would be nice to get this bug fixed on nightly without the churn of reverting #125915 and going back to square one.

@BoxyUwU
Copy link
Member

BoxyUwU commented Aug 22, 2024

I don't think we have such a check off the top of my head but I do think that creating needless DefIds is both hacky and will not work since we will have ICEs from calling queries on those DefId's which will have unexpected hir nodes associated with them (e.g. an expression instead of an anon const). This would happen when encoding queries for AnonConst defids.

I do not really understand the opposition to landing this PR, it gives us a fully correct DefId tree since all items defined in anon consts will have the anon const as the parent. It also does not result in having any extra DefIds defined which would be problematic ICE wise and also annoying when looking at debug output since we will wind up with {constant#1} when there is only a single constant.

I think it makes far more sense to land #129246 which will fix the immediate beta regression indefinitely, then allow this PR to be reviewed and merged which will solve things long term. As far as I can tell there is ~no downside to this PR? The only issues I've heard of with the current approach to creating defs for anon consts is that a "fully capable" def tree is desired and this PR allows for this.

I agree with @camelid that reverting the original PR seems like needless churn. IMO it is far too strong of a reaction considering there are only fairly minor issues with the current scheme and it was almost trivial to simply gate the changes so it does not affect stable (unlike a revert which I tried and ran into conflicts with).

@BoxyUwU
Copy link
Member

BoxyUwU commented Aug 22, 2024

I'm not sold that #129247 is what we want long term, I think if we can end up with an "obviously" correct DefId tree then that's better (which this PR does). I'm aware though that we will likely want anon const DefId's to be created after ast lowering at some point so that we can remove hacks in query feeding for type_of which would necessitate that issue being resolved though. I think this is largely a completely separate question to this PR however.

@petrochenkov
Copy link
Contributor

I do not really understand the opposition to landing this PR

Because it layers more hacks on top of #125915's hacks and increases the amount of changes that I'd like to revert.
I'll try to implement my alternative on Saturday and see whether it can be done without introducing even worse hacks.

@cjgillot
Copy link
Contributor

Reverting #125915 agrees with the general policy to revert PRs that cause high impact bugs or perf regressions, to re-land them better later.

On the general direction, we should strive for the simpler code. This PR introduces special cases in the def collector, to accommodate special cases in ast lowering.

This makes me wonder: is ast lowering the right place to have the special case. Or is hir->ty lowering ?

I'd be interested in reading the design you have in mind long term for gce feature, to understand where are your tradeoffs and what constraints you battle against and should be lifted.

@BoxyUwU
Copy link
Member

BoxyUwU commented Aug 23, 2024

I agree that in general a policy of "revert then reland makes sense" I tried making a revert PR and it has merge conflicts which is why instead I filed #129246 which was easier and would also give good assurance that there shouldn't be any regressions (and also if there is a mistake in that PR or the revert, I think it will be easier to justify a backport of "add a feature gate check" than "touch this random compiler logic to make it not scuffed from a merge conflcit")

Moving hacks to hir ty lowering doesn't help anything IMO, I would like for anon consts to have their defs created there since it will help with the type_of feeding hack. That wouldn't change the fact that DefCollector shouldn't be creating the DefIds for things we either don't want to create or do want to create later on. It also is more complex since iirc tcx.create_def to create defids that should have bodies is a lot more error prone.

@petrochenkov
Copy link
Contributor

petrochenkov commented Aug 23, 2024

I don't think we have such a check off the top of my head but I do think that creating needless DefIds is both hacky and will not work since we will have ICEs from calling queries on those DefId's which will have unexpected hir nodes associated with them (e.g. an expression instead of an anon const). This would happen when encoding queries for AnonConst defids.

Always creating DefIds (maybe even for all generic arguments, to avoid the "we already did def creation in ast_lowering for some anon consts" bit) would be the simplest alternative.
Not creating DefIds in some cases would be an optimization that increases complexity, and as with any optimizations it should first be proven that it is beneficial.
(At metadata encoding we already know precisely which def ids actually required anon constants and which not, that knowledge can be used to skip encoding.)

Moving hacks to hir ty lowering doesn't help anything IMO, I would like for anon consts to have their defs created there since it will help with the type_of feeding hack.

I think that would be the second viable strategy, directly opposite to the early def id creation in def collector.
We already create artificial statics late in compilation similarly to this (see intern_as_new_static).
Such nodes do not have any nontrivial HIR (all necessary information about them is fed through queries), and don't participate in any nontrivial parenting relations, neither in HirId nor in DefId sense, they are "fully virtual".
They can also be created exactly when necessary, when the constant expression is complex enough, including the Type1::Assoc<Type2::Assoc> cases.

So in both "create def ids very early" and "create def ids very late" we won't have approximations and hacks in def collector and AST lowering.
In the first case it's because we create them for everything, and in the second case because we have full information and can create them exactly when necessary.

@camelid camelid force-pushed the lazy-def-macro-const branch 2 times, most recently from fae1882 to 6834dfe Compare September 8, 2024 05:39
@rust-log-analyzer

This comment has been minimized.

@camelid camelid force-pushed the lazy-def-macro-const branch from 7fa114c to bbf0477 Compare September 9, 2024 03:59
@rust-log-analyzer

This comment has been minimized.

@camelid camelid force-pushed the lazy-def-macro-const branch from bbf0477 to 6d3a7c4 Compare September 9, 2024 04:42
@BoxyUwU
Copy link
Member

BoxyUwU commented Sep 12, 2024

Can you delete the tests/ui/feature-gates/feature-gate-const-arg-path.rs test

@bors
Copy link
Contributor

bors commented Sep 12, 2024

☔ The latest upstream changes (presumably #130249) made this pull request unmergeable. Please resolve the merge conflicts.

@camelid camelid force-pushed the lazy-def-macro-const branch from 6d3a7c4 to a3e6d29 Compare September 12, 2024 17:47
Ever since rust-lang#125915, some `ast::AnonConst`s turn into `hir::ConstArgKind::Path`s,
which don't have associated `DefId`s. To deal with the fact that we don't have
resolution information in `DefCollector`, we decided to implement a process
where if the anon const *appeared* to be trivial (i.e., `N` or `{ N }`), we
would avoid creating a def for it in `DefCollector`. If later, in AST lowering,
we realized it turned out to be a unit struct literal, or we were lowering it
to something that didn't use `hir::ConstArg`, we'd create its def there.

However, let's say we have a macro `m!()` that expands to a reference to a free
constant `FOO`. If we use `m!()` in the body of an anon const (e.g., `Foo<{ m!() }>`),
then in def collection, it appears to be a nontrivial anon const and we create
a def. But the macro expands to something that looks like a trivial const arg,
but is not, so in AST lowering we "fix" the mistake we assumed def collection
made and create a def for it. This causes a duplicate definition ICE.

The ideal long-term fix for this is a bit unclear. One option is to delay def
creation for all expression-like nodes until AST lowering (see rust-lang#128844 for an
incomplete attempt at this). This would avoid issues like this one that are
caused by hacky workarounds. However, this approach has some downsides as well,
and the best approach is yet to be determined.

In the meantime, this PR fixes the bug by delaying def creation for anon consts
whose bodies are macro invocations until after we expand the macro and know
what is inside it. This is accomplished by adding information to create the
anon const's def to the data in `Resolver.invocation_parents`.
...and remove the `const_arg_path` feature gate as a result. It was only
a stopgap measure to fix the regression that the new lowering introduced
(which should now be fixed by this PR).
@camelid camelid force-pushed the lazy-def-macro-const branch from a3e6d29 to e0bd011 Compare September 12, 2024 17:56
@BoxyUwU
Copy link
Member

BoxyUwU commented Sep 12, 2024

@bors r+ rollup=never

Merging this to unblock more work on min_generic_const_args .

I don't consider "dont create extraneous defids" to be solely an optimisation as it will affect diagnostics output as anon const's will be misnumbered (e.g. we'll see foo::{constant#1} in output when there is only a single anon const). Having this increase in complexity in DefCollector seems logical to me since fundamentally what we are trying to do is not represent const args with anon consts in some cases that are only able to be determined after macro expansion.

Unconditionally creating DefIds and introducing hacks to defid encoding to skip these "fake" DefIds for anon consts might work (with the caveat that it'll make diagnostics and debugging worse(?)). However I think it places complexity in the compiler where it is not necessarily expected or relevant.

After this lands we should be able to remove a lot of the DefId creation in ast lowering of const arguments by always representing things as ConstArg::Path instead of having the weird approximation in DefCollector of what needs a DefId. This would leave us in a final position where the only times we create DefIds for const args is:

  • DefCollector
  • rustc_legacy_const_generics handling which only knows that a function argument is actually a const argument after name resolution has been performed (unrelated but at some point we should fix the ICEs around having nested bodies in these arguments)

Hopefully once we have made it so that the DefId creation in DefCollector is not an under approximation of what defs we actually need the ast lowering logic will be less "hacky" and the state of all this will seem acceptable.

@bors
Copy link
Contributor

bors commented Sep 12, 2024

📌 Commit e0bd011 has been approved by BoxyUwU

It is now in the queue for this repository.

@bors bors added S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. and removed S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. labels Sep 12, 2024
@bors
Copy link
Contributor

bors commented Sep 13, 2024

⌛ Testing commit e0bd011 with merge d3a8524...

@bors
Copy link
Contributor

bors commented Sep 13, 2024

☀️ Test successful - checks-actions
Approved by: BoxyUwU
Pushing d3a8524 to master...

@bors bors added the merged-by-bors This PR was explicitly merged by bors. label Sep 13, 2024
@bors bors merged commit d3a8524 into rust-lang:master Sep 13, 2024
7 checks passed
@rustbot rustbot added this to the 1.83.0 milestone Sep 13, 2024
@camelid camelid deleted the lazy-def-macro-const branch September 13, 2024 04:32
@rust-timer
Copy link
Collaborator

Finished benchmarking commit (d3a8524): comparison URL.

Overall result: ❌✅ regressions and improvements - ACTION NEEDED

Next Steps: If you can justify the regressions found in this perf run, please indicate this with @rustbot label: +perf-regression-triaged along with sufficient written justification. If you cannot justify the regressions please open an issue or create a new PR that fixes the regressions, add a comment linking to the newly created issue or PR, and then add the perf-regression-triaged label to this PR.

@rustbot label: +perf-regression
cc @rust-lang/wg-compiler-performance

Instruction count

This is a highly reliable metric that was used to determine the overall result at the top of this comment.

mean range count
Regressions ❌
(primary)
0.3% [0.2%, 0.4%] 6
Regressions ❌
(secondary)
0.2% [0.2%, 0.2%] 2
Improvements ✅
(primary)
-0.7% [-0.9%, -0.5%] 6
Improvements ✅
(secondary)
-0.1% [-0.1%, -0.1%] 2
All ❌✅ (primary) -0.2% [-0.9%, 0.4%] 12

Max RSS (memory usage)

Results (primary 0.5%, secondary 0.6%)

This is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.

mean range count
Regressions ❌
(primary)
1.3% [0.6%, 2.5%] 5
Regressions ❌
(secondary)
1.7% [1.3%, 3.0%] 4
Improvements ✅
(primary)
-0.9% [-1.1%, -0.8%] 3
Improvements ✅
(secondary)
-1.5% [-2.0%, -1.1%] 2
All ❌✅ (primary) 0.5% [-1.1%, 2.5%] 8

Cycles

Results (secondary 7.9%)

This is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.

mean range count
Regressions ❌
(primary)
- - 0
Regressions ❌
(secondary)
7.9% [7.9%, 7.9%] 1
Improvements ✅
(primary)
- - 0
Improvements ✅
(secondary)
- - 0
All ❌✅ (primary) - - 0

Binary size

Results (primary -0.0%)

This is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.

mean range count
Regressions ❌
(primary)
0.3% [0.1%, 0.7%] 36
Regressions ❌
(secondary)
- - 0
Improvements ✅
(primary)
-0.4% [-1.0%, -0.0%] 24
Improvements ✅
(secondary)
- - 0
All ❌✅ (primary) -0.0% [-1.0%, 0.7%] 60

Bootstrap: 756.349s -> 757.342s (0.13%)
Artifact size: 341.21 MiB -> 341.24 MiB (0.01%)

@Mark-Simulacrum
Copy link
Member

Most or all of the regressions look to be spurious (perf recovered shortly after). Marking as triaged.

bors added a commit to rust-lang-ci/rust that referenced this pull request Sep 22, 2024
…elid

Fix anon const def-creation when macros are involved take 2

Fixes rust-lang#130321

There were two cases that rust-lang#129137 did not handle correctly:

- Given a const argument `Foo<{ bar!() }>` in which `bar!()` expands to `N`, we would visit the anon const and then visit the `{ bar() }` expression instead of visiting the macro call. This meant that we would build a def for the anon const as `{ bar!() }` is not a trivial const argument as `bar!()` is not a path.
- Given a const argument `Foo<{ bar!() }>` is which `bar!()` expands to `{ qux!() }` in which `qux!()` expands to `N`, it should not be considered a trivial const argument as `{{ N }}` has two pairs of braces.  If we only looked at `qux`'s expansion it would *look* like a trivial const argument even though it is not. We have to track whether we have "unwrapped" a brace already when recursing into the expansions of `bar`/`qux`/any macro

r? `@camelid`
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
A-const-generics Area: const generics (parameters and arguments) merged-by-bors This PR was explicitly merged by bors. perf-regression Performance regression. perf-regression-triaged The performance regression has been triaged. S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. T-compiler Relevant to the compiler team, which will review and decide on the PR/issue.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

ICE: adding a def'n for node-id NodeId(18) and def kind AnonConst but a previous def'n exists
9 participants