-
Notifications
You must be signed in to change notification settings - Fork 756
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Optimize struct RMW ops in OptimizeInstructions #7225
Conversation
When the RMW operation can be proven not to change the accessed value, optimize it to a simple atomic get instead. This is valid because a write that does not change an in-memory value does not synchronize with any subsequent reads of that value, since those reads can be considered to be reading from the previous write. Also optimize RMW operations on unshared structs to their non-atomic equivalent operations. This can increase code size, but can also enable follow-on optimizations of the simpler operations and can be less expensive at runtime.
@conrad-watt, could I ask you to double check that these optimizations are valid? In particular, can you confirm that it is valid to optimize atomic RMW operations that do not write new values to atomic gets? |
src/passes/OptimizeInstructions.cpp
Outdated
|
||
Builder builder(*getModule()); | ||
|
||
// Even when the access to shared memory, we can optimize out the modify and |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
// Even when the access to shared memory, we can optimize out the modify and | |
// Even when we access shared memory, we can optimize out the modify and |
// operation into several non-atomic operations is safe because no other | ||
// thread can observe an intermediate state in the unshared memory. This | ||
// initially increases code size, but the more basic operations may be | ||
// more optimizable than the original RMW. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How likely are we to succeed? I am somewhat worried that generally the size increase here will stick around, and might also be slower.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not sure! We can experiment with it once we have real test cases. I figure it will be easier to experimentally disable this optimization than it would be to write the optimization just for an experiment, though. I could imagine in the long run we would want to do this only when optimizing for speed over size.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok, sounds good. Perhaps add a TODO for that?
// operation into several non-atomic operations is safe because no other | ||
// thread can observe an intermediate state in the unshared memory. This | ||
// initially increases code size, but the more basic operations may be | ||
// more optimizable than the original RMW. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok, sounds good. Perhaps add a TODO for that?
When the RMW operation can be proven not to change the accessed value,
optimize it to a simple atomic get instead. This is valid because a
write that does not change an in-memory value does not synchronize with
any subsequent reads of that value, since those reads can be considered
to be reading from the previous write.
Also optimize RMW operations on unshared structs to their non-atomic
equivalent operations. This can increase code size, but can also enable
follow-on optimizations of the simpler operations and can be less
expensive at runtime.