[red-knot] per-module arenas #11152
Conversation
Force-pushed from a7b304f to b087d97, then from b087d97 to 35fdb40.
| code | total | + violation | - violation | + fix | - fix |
|---|---|---|---|---|---|
| PYI001 | 1 | 1 | 0 | 0 | 0 |
Formatter (stable)
✅ ecosystem check detected no format changes.
Formatter (preview)
✅ ecosystem check detected no format changes.
Also requesting review from @BurntSushi, not necessarily to review all the code (though that would be awesome too!) but mostly just to double-check whether any of what I wrote in the summary sounds totally off base at a high level.
And @AlexWaygood, mostly just to observe that I switched unions and intersections from |
Nice!
> And so we will introduce contention on the atomic reference count even for reads of highly-used types.
I'm not sure I follow the reasoning here. Isn't the main contention the module lock that guards access to the type? And if so, isn't holding on to the lock for longer than necessary (for longer than incrementing a reference counter) causing worse contention?
I think my main worry is that it's non-obvious to callers that they're holding on to a lock, making it very easy to hold a lock for too long ("I just call this very expensive function and pass along a `&FunctionType` reference").
I wonder if Salsa solves some of this by making the accessors methods on the Id types:
```rust
let function: FunctionId = /* somehow get a function id */;
let name = function.name(db); // `SmolStr::clone` is O(1)
// We could intern `Parameters` into an arena so that `parameters` is just an id.
let parameters = function.parameters(db);
for parameter in parameters.iter(db) {
    let name = parameter.name(db);
}
```
I must admit, it's a bit awkward dealing with so many ids and I don't know what the overhead of all the hash table lookups is.
But I'm happy to try this out and iterate on it later on.
I am really not sure how all this will play out in practice with our actual workload, so take all of this with a grain of salt! (Really my top-line motivation here was just to do the quickest thing that would work and move on, and re-evaluate once we can actually benchmark on real workloads and see how the contention plays out.)

Dashmap allows multiple simultaneous readers (they have to take a reader lock, but they don't contend with each other, only with writers). With the approach in this PR, there is no contention between readers, but readers may block writers for longer (however long a `TypeRef` is held). If we add Arc, then even readers have to contend over incrementing the atomic reference count (even if there is no writer).

It may be that this is a non-issue in practice, I don't know. I just know that e.g. for CPython, which also does reference counting, contention over reference counts for hot shared objects is a huge performance issue if you try to run Python code concurrently and make reference counts atomic. And I know in general (just from searching on the topic) that Arc contention can be a big problem for scaling parallel Rust workloads too.
Yes, I'm also concerned that this will be a foot-gun.
Yes, I saw you did this with the |
My other issue with using Arc here is similar to your objection to using a GC crate: it's just overhead that we don't actually need. We don't need reference counting, because we are using arenas to deallocate the types all at once, and IDs to refer between them. So all the work to update the reference counts (and any contention it causes) is just waste. (Not pure waste, because it does provide a solution to the problem of callers taking temporary references to the type data and knowing they will keep it alive, without blocking writers to the arena. But it still feels like a big hammer for that problem, which should be solvable with just references and lifetimes.)
On the |
Another option is using a different concurrent hash-table library, like |
Where would the savings come from here? It looks to me like weakrefs only really have a cost if you use them. You could save a tiny bit of memory in
Ooh, thank you for pointing this out! I spent a few minutes yesterday looking for exactly this, but for some reason |
I got this all working and solved the API lifetime issues without Arc, by means of a new set of `TypeRef` structs. The remaining potential performance issue is that anytime you hold on to any of the new `TypeRef` structs, you lock a shard of the `TypeStore::modules` dashmap to writes (because you are holding a reference into it). So it will be important to minimize the use and scope of these type-refs. I think we can do this to some degree by caching type judgments using just type IDs. I also think for CLI use when we want to be highly parallel, we can be smart about ordering (check all module bodies first, then check function bodies when module-level types are all populated) to minimize write contention. Also, if needed we can break up `ModuleTypeStore`, or use inner mutability and internal locking to have finer-grained locking within it.

I went with this version instead of rewriting to have the type arenas hold Arc to the types, because I am not totally convinced the Arc version will be better. With Arc every "read" turns into a write to the atomic reference count, which introduces overhead (which is really useless overhead for us, since ultimately we rely on the arenas for garbage collection). And so we will introduce contention on the atomic reference count even for reads of highly-used types. So for both versions we will have to be careful with our use of references. I think the Arc-free version is lower overhead and sets us up better for future optimization of the locking strategy, once we have more working code to optimize against.

Even if I turn out to be wrong about the above and eventually we decide to use Arc, I'd rather go with this for now and move on to type evaluation, and make the Arc change later when we can evaluate the effects better.