Accessing a field of a Vector4 causes later codegen to be inefficient if inlined #10045
Labels: area-CodeGen-coreclr, optimization
As a simple example, look at:
This is currently producing:
Manually inlining this code to be:
Produces the much more efficient code:
We should recognize when the accessed value is 16 bytes and just do a block copy (as was done with the manually inlined code).
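For readers without the original snippets, here is a minimal sketch of the general shape the title describes: an inlinable method that reads one field of a `Vector4` parameter and then uses the whole value again. This is not the issue's original example; `ScaleByW` and `Caller` are hypothetical names, and whether this exact code triggers the inefficient copy depends on the JIT version.

```csharp
using System;
using System.Numerics;
using System.Runtime.CompilerServices;

public static class Program
{
    // Hypothetical helper: reads a single field, then uses the full vector.
    // AggressiveInlining asks the JIT to inline it into the caller.
    [MethodImpl(MethodImplOptions.AggressiveInlining)]
    public static Vector4 ScaleByW(Vector4 value)
    {
        float w = value.W;   // field access on the by-value parameter
        return value * w;    // later use of the whole 16-byte value
    }

    // NoInlining keeps this method as a separate unit so its
    // disassembly can be inspected on its own.
    [MethodImpl(MethodImplOptions.NoInlining)]
    public static Vector4 Caller(Vector4 input) => ScaleByW(input);

    public static void Main()
    {
        Console.WriteLine(Caller(new Vector4(1, 2, 3, 4)));
    }
}
```

Comparing the disassembly of `Caller` against a version where the body of `ScaleByW` is pasted in by hand is one way to check whether the two forms still produce different code.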
It may also be beneficial to recognize that `[rsp+20H]` contains the "last use" of `value` and that we can just reference the memory directly (without needing to copy).

category:cq
theme:vector-codegen
skill-level:intermediate
cost:small
impact:small