[CPU] Improve vector tile sizes for sub-byte matmuls on Aarch64 (#16143)
This PR introduces a simple heuristic that ensures the vector tile size fills at least one vector register for the smallest data type used in the matmul. For example, given a 128-bit vector and an `i32 <- i4, i4` matmul, we previously used a tile size of 16 for the main vector dimension (16 x 4 bits = 64 bits, half a vector register). With this PR we use 32 (32 x 4 bits = 128 bits, a full vector register).
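The arithmetic behind the heuristic can be sketched as follows. This is only an illustration of the rule described above, not the code added by this PR: the function names `min_vector_tile_size` and `adjust_tile_size` are hypothetical, and the assumption that the existing tile size is only ever raised (never lowered) is mine.

```python
from typing import Sequence


def min_vector_tile_size(vector_bits: int, element_bits: Sequence[int]) -> int:
    """Number of elements of the narrowest type needed to fill one vector register.

    Hypothetical helper: `element_bits` lists the bit widths of all operand
    and result types of the matmul (e.g. [4, 4, 32] for `i32 <- i4, i4`).
    """
    if vector_bits <= 0 or not element_bits or min(element_bits) <= 0:
        raise ValueError("vector_bits and all element widths must be positive")
    smallest = min(element_bits)
    return vector_bits // smallest


def adjust_tile_size(current_tile: int, vector_bits: int,
                     element_bits: Sequence[int]) -> int:
    """Raise the tile size, if needed, so the narrowest type fills a register.

    Assumption: a tile size that is already large enough is left unchanged.
    """
    return max(current_tile, min_vector_tile_size(vector_bits, element_bits))


# Example from the description: 128-bit vector, `i32 <- i4, i4` matmul,
# previous tile size of 16 for the main vector dimension.
new_tile = adjust_tile_size(16, 128, [4, 4, 32])
print(new_tile)  # 32
```

For a matmul whose narrowest type is already 32 bits wide, the same sketch leaves a tile size of 16 untouched, since 128 / 32 = 4 elements already fit within it.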
Showing 2 changed files with 126 additions and 10 deletions.