add detection for zen 5 #56967
base: master
src/processor_x86.cpp
Outdated
@@ -236,6 +237,7 @@ constexpr auto znver2 = znver1 | get_feature_masks(clwb, rdpid, wbnoinvd);
constexpr auto znver3 = znver2 | get_feature_masks(shstk, pku, vaes, vpclmulqdq);
constexpr auto znver4 = znver3 | get_feature_masks(avx512f, avx512cd, avx512dq, avx512bw, avx512vl, avx512ifma, avx512vbmi,
                                                   avx512vbmi2, avx512vnni, avx512bitalg, avx512vpopcntdq, avx512bf16, gfni, shstk, xsaves);
constexpr auto znver5 = znver4 | get_feature_masks(avxvnni, movdiri, movdir64b, avx512vp2intersect, /*prefetchi,*/ avxvnni);
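For readers unfamiliar with the pattern in this hunk, here is a minimal sketch of how a compile-time mask builder of this shape could work. It is not Julia's actual `get_feature_masks`; the names `FeatureMask`, `make_mask`, `feat_a`, `feat_b`, `feat_c`, `tier1` and `tier2` are hypothetical, and the sketch assumes C++14 or later.

```cpp
#include <cstdint>

// Hypothetical feature indices for illustration only (not Julia's table).
enum Feature : uint32_t { feat_a = 0, feat_b = 5, feat_c = 33 };

// Two 32-bit words are enough for this toy table.
struct FeatureMask {
    uint32_t words[2];
};

// Combining two masks is a word-by-word OR.
constexpr FeatureMask operator|(FeatureMask x, FeatureMask y) {
    return FeatureMask{{x.words[0] | y.words[0], x.words[1] | y.words[1]}};
}

// Base case: no features, empty mask.
constexpr FeatureMask make_mask() { return FeatureMask{{0, 0}}; }

// Recursive case: set the bit for the first feature, then handle the rest.
template <typename... Rest>
constexpr FeatureMask make_mask(Feature f, Rest... rest) {
    FeatureMask m = make_mask(rest...);
    m.words[f / 32] |= uint32_t(1) << (f % 32);
    return m;
}

// Each tier extends the previous one, mirroring znverN = znverN-1 | ...
constexpr FeatureMask tier1 = make_mask(feat_a);
constexpr FeatureMask tier2 = tier1 | make_mask(feat_b, feat_c, feat_b);

static_assert(tier2.words[0] == ((1u << 0) | (1u << 5)), "feat_a and feat_b set");
static_assert(tier2.words[1] == (1u << 1), "feat_c lands in the second word");
```

In this sketch, naming a feature twice (as `feat_b` is here, and as `avxvnni` is in the line above) does not change the resulting mask, because OR-ing the same bit again is a no-op.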
I assume prefetchi needs to be added to src/features_x86.h, but I didn't know how.
Line 113 in 4750dc2
JL_FEATURE_DEF(avxvnni, 32 * 9 + 4, 120000)
Now you need to look in the CPU docs for how prefetchi is encoded.
From the "Processor Programming Reference" https://www.amd.com/content/dam/amd/en/documents/processor-tech-docs/programmer-references/57896.zip
The bits are here: https://github.com/llvm/llvm-project/blob/3edbe36c3eb01d1c35ac1761da108e3a493258ee/clang/lib/Headers/cpuid.h#L220, though you will need to add the // EAX=7,ECX=1: EDX branch IIUC.
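As a rough illustration of what reading that register involves, the sketch below queries CPUID leaf 7, sub-leaf 1 and tests one bit of EDX. It assumes GCC or Clang on x86 (for `__get_cpuid_count` from `<cpuid.h>`) and returns 0 elsewhere. The bit position 14 is an assumption based on `bit_PREFETCHI` (0x00004000) in the linked header and should be checked against the CPU documentation; `kPrefetchiBit`, `test_bit`, `leaf7_subleaf1_edx` and `cpu_has_prefetchi` are hypothetical names, not part of Julia.

```cpp
#include <cstdint>
#if (defined(__GNUC__) || defined(__clang__)) && (defined(__x86_64__) || defined(__i386__))
#include <cpuid.h>
#define SKETCH_HAVE_CPUID_H 1
#endif

// Assumption: bit 14, matching bit_PREFETCHI (0x00004000) in clang's cpuid.h.
constexpr uint32_t kPrefetchiBit = 14;

// True if the given bit of a 32-bit register value is set.
constexpr bool test_bit(uint32_t reg, uint32_t bit) {
    return ((reg >> bit) & 1u) != 0;
}

static_assert(test_bit(0x00004000u, kPrefetchiBit), "0x4000 is bit 14");

// EDX of CPUID leaf 7, sub-leaf 1, or 0 when it cannot be queried.
inline uint32_t leaf7_subleaf1_edx() {
#ifdef SKETCH_HAVE_CPUID_H
    unsigned int eax = 0, ebx = 0, ecx = 0, edx = 0;
    // Sub-leaf 0 reports the highest supported sub-leaf of leaf 7 in EAX.
    if (__get_cpuid_count(7, 0, &eax, &ebx, &ecx, &edx) && eax >= 1 &&
        __get_cpuid_count(7, 1, &eax, &ebx, &ecx, &edx)) {
        return edx;
    }
#endif
    return 0;
}

inline bool cpu_has_prefetchi() {
    return test_bit(leaf7_subleaf1_edx(), kPrefetchiBit);
}
```

Keeping the bit test separate from the CPUID call makes the decoding easy to check on any machine, whether or not the host CPU reports the feature.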
Thanks for the hints! What I don't get is where the 32 * 8, 32 * 9 etc. are coming from.
Is this the correct patch or are the 32 * 9 bits incorrect?
diff --git a/src/features_x86.h b/src/features_x86.h
index 2ecc8fee32..b817781404 100644
--- a/src/features_x86.h
+++ b/src/features_x86.h
@@ -113,6 +113,9 @@ JL_FEATURE_DEF(wbnoinvd, 32 * 8 + 9, 0)
JL_FEATURE_DEF(avxvnni, 32 * 9 + 4, 120000)
JL_FEATURE_DEF(avx512bf16, 32 * 9 + 5, 0)
+// EAX=7,ECX=1: EDX
+JL_FEATURE_DEF(prefetchi, 32 * 9 + 20, 0)
+
// EAX=0x14,ECX=0: EBX
JL_FEATURE_DEF(ptwrite, 32 * 10 + 4, 0)
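Judging only from the definitions visible in this hunk, the second argument appears to be `32 * word + bit`: each CPUID output register listed in a `// EAX=...` comment gets its own 32-bit word, and the added term is the bit position inside that register. The sketch below decodes an index under that reading; `FeatureSlot`, `slot_of` and `has_feature` are hypothetical names, and the flat word array is an assumption about how such indices would be consumed.

```cpp
#include <cstdint>

// Where a feature index lands: which 32-bit word, and which bit in it.
struct FeatureSlot {
    uint32_t word;
    uint32_t bit;
};

// Split a flat index of the form 32 * word + bit back into its parts.
constexpr FeatureSlot slot_of(uint32_t index) {
    return FeatureSlot{index / 32, index % 32};
}

// Test a feature in a flat array holding one 32-bit word per CPUID register.
constexpr bool has_feature(const uint32_t *words, uint32_t index) {
    return ((words[index / 32] >> (index % 32)) & 1u) != 0;
}

// Index taken from the existing avxvnni definition in the hunk above.
constexpr uint32_t avxvnni_index = 32 * 9 + 4;

static_assert(slot_of(avxvnni_index).word == 9, "tenth word");
static_assert(slot_of(avxvnni_index).bit == 4, "bit 4 of that register");
```

Under this reading, a definition placed below a new `// EAX=7,ECX=1: EDX` comment would need a word number of its own rather than reusing 9, and its bit term should match the documented bit in EDX. Which word number is free depends on the rest of the file, which this sketch does not cover.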
I'm implementing it and maybe adding some comments
Won't we need to wait for #56130 to be merged before we can use Zen5 since that is only in LLVM 19?
Yes, to take full advantage of zen 5 features I believe LLVM 19 is needed, but this PR is still an improvement since we now fall back to the
src/features_x86.h
Outdated
JL_FEATURE_DEF(avx512vnniw, 32 * 4 + 2, 0)
JL_FEATURE_DEF(avx512fmaps, 32 * 4 + 3, 0)
JL_FEATURE_DEF(uintr, 32 * 4 + 5, 140000)
Isn't the last argument an indication of which LLVM version introduced support?
As it turns out those were never implemented :)
ref llvm/llvm-project@149a150