Use atomic reads and writes in code that uses double-checked locking. #819
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
In a couple of places in nanobind we see this idiom:
This is the classic double-checked locking idiom, which on architectures that don't have total store ordering is racy (e.g. ARM, not x86). To use this pattern correctly, we need to use an atomic acquire load for the read outside the lock, and to use an atomic store release for the store inside the lock. These add the necessary fences to ensure that, for example, the contents of
tp
do not appear populated to the reader before the writer has stored them to memory.This PR adds an include of
<atomic>
to nb_internals.h if free-threading is enabled. I was unable to think of a good way to avoid this, bar using intrinsics. The use of atomics seemed appropriate to me in the presence of free threading.