Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Align module base between invalidation and edge tracking #57625

Merged
merged 4 commits into from
Mar 4, 2025

Conversation

Keno
Copy link
Member

@Keno Keno commented Mar 3, 2025

Our implicit edge tracking for bindings does not explicitly store any edges for bindings in the current module. The idea behind this is that this is a good time-space tradeoff for validation, because substantially all binding references in a module will be to its defining module, while the total number of methods within a module is limited and substantially smaller than the total number of methods in the entire system.

However, we have an issue where the code that stores these edges and the invalidation code disagree on which module is the current one. The edge storing code was using the module in which the method was defined, while the invalidation code was using the one in which the MethodTable is defined. With these being misaligned, we can miss necessary invalidations.

Both options are in principle possible, but I think the former is better, because the module in which the method is defined is also the module that we are likely to have a lot of references to (since they get referenced implicitly by just writing symbols in the code).

However, this presents a problem: We don't actually have a way to iterate all the methods defined in a particular module, without just doing the brute force thing of scanning all methods and filtering.

To address this, build on the deferred scanning code added in #57615 to also add any scanned modules to an explicit list in Module. This costs some space, but only proportional to the number of defined methods, (and thus proportional to the written source code).

Note that we don't actually observe any issues in the test suite on master due to this bug. However, this is because we are grossly over-invalidating, which hides the missing invalidations from this issue (#57617).

@Keno Keno force-pushed the kf/fiximplicitedgemodule branch from c4e665a to a6f501e Compare March 3, 2025 19:59
@Keno Keno changed the base branch from kf/invalidatemerge to master March 3, 2025 19:59
@Keno Keno force-pushed the kf/fiximplicitedgemodule branch from a6f501e to a12e1cc Compare March 3, 2025 20:00
@Keno
Copy link
Member Author

Keno commented Mar 3, 2025

This branch includes the previous three PRs and uses some utilities from them. However, because the other PRs expose this bug, everything needs to be merged together. My plan is to focus on this PR, get it passing and then non-squash merge it.

@nsajko nsajko added the don't squash Don't squash merge label Mar 3, 2025
Keno added 4 commits March 3, 2025 23:18
On master, when we invalidate a CodeInstance, we also invalidate the
entire associated MethodInstance. However, this is highly problematic,
because we have a lot of CodeInstances that are associated with
`getproperty(::Module, ::Symbol)` through constant propagation.
If one of these CodeInstances gets invalidated (e.g. because
the resolution of const-propagated M.s binding changed), it
would invalidate essentially the entire world.

Prevent this by re-checking the forward edges list to make sure
that the code instance we're invalidating is actually in there.
This addresses one of the last remaining TODOs of the binding partition
work by performing invalidations when bindings transition from being
undefined to being defined. This in particular finally addresses the
performance issue that #54733 was intended to address (the issue was
closed when we merged the mechanism, but it had so far been turned
off). Turning on the invalidations themselves were always easy (a
one line deletion). What is harder is making sure that the additional
invalidations don't take extra time.

To this end, we add two additional flags, one on Bindings, and one on
methods. The flag on bindings tells us whether any method scan has so
far found an implicit (not tracked in ->backedges) reference to this
binding in any method body. The insight here is that most undefined
bindings will not have been referenced previously (because they did
not exist), so with a simple one bit saturating counter of the number
of edges that would exist (if we did store them), we can fast-path
the invalidation.

However, this is not quite sufficient, as people often do things like:
```
foo() = bar()
bar() = ...
...
```
which, without further improvements would incur an invalidation upon
the definition of `bar`.

The second insight (and what the flag on `Method` is for) is that we
don't actually need to scan the method body until there is something
to invalidate (i.e. until some `CodeInstance` has been created for
the method). By defering the scanning until the first time that inference
accesses the lowered code (with a flag to only do it once), we can
easily avoid invalidation in the above scenario (while still invalidating
if `foo()` was called before the definition of `bar`).

As a further bonus, this also speeds up bootstrap by about 20% (putting
us about back to where we used to be before the full bpart change) by
skipping unnecessary invalidations even for non-guard transitions.

Finally, this does not yet turn on inference's ability to infer guard
partitions as `Union{}`. The reason for this is that such partitions
can be replaced by backdated constants without invalidation. However,
as soon as we remove the backdated const mechanism, this PR will allow
us to turn on that change, further speeding up inference (by cutting off
inference on branches known to error due to missing bindings).
This implements the optimizations I promised in #57602 by checking
in invalidation whether or not the information that inference sees
changes. The primary situation in which this is useful is avoiding
an invalidation for `export` flag changes or changes of which module
a binding is being imported from that do not change what the actual
binding being imported is. Base itself uses these patterns sparingly,
so the bootstrap impact over #57615 is minimal (though non-zero).
Our implicit edge tracking for bindings does not explicitly store
any edges for bindings in the *current* module. The idea behind
this is that this is a good time-space tradeoff for validation,
because substantially all binding references in a module will be
to its defining module, while the total number of methods within
a module is limited and substantially smaller than the total number
of methods in the entire system.

However, we have an issue where the code that stores these edges
and the invalidation code disagree on which module is the *current*
one. The edge storing code was using the module in which the method
was defined, while the invalidation code was using the one in which
the MethodTable is defined. With these being misaligned, we can
miss necessary invalidations.

Both options are in principle possible, but I think the former is
better, because the module in which the method is defined is also
the module that we are likely to have a lot of references to (since
they get referenced implicitly by just writing symbols in the code).

However, this presents a problem: We don't actually have a way to
iterate all the methods defined in a particular module, without just
doing the brute force thing of scanning all methods and filtering.

To address this, build on the deferred scanning code added in #57615
to also add any scanned modules to an explicit list in `Module`. This
costs some space, but only proportional to the number of defined methods,
(and thus proportional to the written source code).

Note that we don't actually observe any issues in the test suite on
master due to this bug. However, this is because we are grossly
over-invalidating, which hides the missing invalidations from this
issue (#57617).
@Keno Keno added the backport 1.12 Change should be backported to release-1.12 label Mar 3, 2025
@Keno Keno force-pushed the kf/fiximplicitedgemodule branch from a12e1cc to a8a1c3b Compare March 3, 2025 23:28
@topolarity
Copy link
Member

Excellent catch - Is it worth setting up some better test harnesses so that we can write unit tests, etc. for invalidation directly?

Only being able to test indirectly via end-to-end invalidation checks seems to lead to a fair number of #265 bugs leaking through recently (to be expected since we're adding lots of new invalidation circuitry, but I think the point stands)

@Keno
Copy link
Member Author

Keno commented Mar 4, 2025

@qinsoon @udesou heads up this will require a change to the mmtk binding when merged.

@udesou
Copy link
Contributor

udesou commented Mar 4, 2025

@Keno I can open a PR to the binding with the fix, but unless you're merging it first and updating the version inside the BinaryBuilder and Julia itself in this PR, we will need additional PRs for pushing the fix.

@Keno
Copy link
Member Author

Keno commented Mar 4, 2025

Yes, I'm planning to merge this without the mmtk changes. I'm just giving a heads up.

@udesou
Copy link
Contributor

udesou commented Mar 4, 2025

No worries. Thank you, I've updated the binding and I'm testing it with your branch here. This was also a good reminder about having to make this process (of updating the binding) easier in the future.

@Keno Keno merged commit 274d80e into master Mar 4, 2025
4 of 7 checks passed
@Keno Keno deleted the kf/fiximplicitedgemodule branch March 4, 2025 04:23
@KristofferC KristofferC mentioned this pull request Mar 4, 2025
39 tasks
KristofferC pushed a commit that referenced this pull request Mar 4, 2025
Our implicit edge tracking for bindings does not explicitly store any
edges for bindings in the *current* module. The idea behind this is that
this is a good time-space tradeoff for validation, because substantially
all binding references in a module will be to its defining module, while
the total number of methods within a module is limited and substantially
smaller than the total number of methods in the entire system.

However, we have an issue where the code that stores these edges and the
invalidation code disagree on which module is the *current* one. The
edge storing code was using the module in which the method was defined,
while the invalidation code was using the one in which the MethodTable
is defined. With these being misaligned, we can miss necessary
invalidations.

Both options are in principle possible, but I think the former is
better, because the module in which the method is defined is also the
module that we are likely to have a lot of references to (since they get
referenced implicitly by just writing symbols in the code).

However, this presents a problem: We don't actually have a way to
iterate all the methods defined in a particular module, without just
doing the brute force thing of scanning all methods and filtering.

To address this, build on the deferred scanning code added in #57615 to
also add any scanned modules to an explicit list in `Module`. This costs
some space, but only proportional to the number of defined methods, (and
thus proportional to the written source code).

Note that we don't actually observe any issues in the test suite on
master due to this bug. However, this is because we are grossly
over-invalidating, which hides the missing invalidations from this issue
(#57617).

(cherry picked from commit 274d80e)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
backport 1.12 Change should be backported to release-1.12 don't squash Don't squash merge
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants