Suboptimal codegen for memset/memcpy unrolling #83277
Labels
area-CodeGen-coreclr
CLR JIT compiler in src/coreclr/src/jit and related components such as SuperPMI
When the JIT unrolls memset/memcpy, it makes suboptimal decisions for certain sizes. For example, to zero 30 bytes it ends up using general-purpose register (GPR) stores twice. It would be better to keep using SIMD stores and overlap the last one with the previously zeroed part.
The same applies to other sizes.
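To illustrate the overlapping idea, here is a minimal sketch in portable C. It is not JIT code or JIT output: `zero_16_to_32` is a hypothetical helper, and it assumes the compiler lowers each fixed-size 16-byte `memcpy` to a single SIMD store, which is typical on x64 with optimizations on but not guaranteed. For 30 bytes, the first store covers bytes 0..15 and the second covers bytes 14..29, so bytes 14 and 15 are simply written twice.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper (not part of the JIT): zero n bytes, where
 * 16 <= n <= 32, using exactly two 16-byte stores. The second store
 * is anchored at the end of the range, so it may overlap the first. */
static void zero_16_to_32(unsigned char *dst, size_t n)
{
    static const unsigned char zeros[16] = {0};

    assert(n >= 16 && n <= 32);

    /* First 16-byte store: covers bytes [0, 16). */
    memcpy(dst, zeros, 16);

    /* Second 16-byte store: covers bytes [n - 16, n).
     * For n == 30 this starts at offset 14 and rewrites bytes 14 and 15. */
    memcpy(dst + (n - 16), zeros, 16);
}
```

The redundant writes to the overlapped bytes are harmless for zeroing, and the sequence stays at two wide stores for every size in the range instead of adding narrower tail stores.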
PS: it seems that arm64 already does the right thing here.