enable DYNAMIC_BMI2 by default on x86 (32-bit mode) #4252
Merged
Following #4251 and #4248, it's now possible to get `bmi2` working in x86 32-bit mode. This opens the opportunity to enable `DYNAMIC_BMI2` automatically in 32-bit mode (so far, it was only enabled for x64).

`DYNAMIC_BMI2` will only capture a portion of the speed benefits of native `bmi2` + `avx2`, as illustrated by the benchmark below:
Decompression speed benchmark, measured on an i7-9700k, Ubuntu 24.04, `gcc` 13.3.0. Configurations compared: `dev` with `-m32`; this PR with `-m32` (w/ `DYNAMIC_BMI2`); `-m32 -mavx2 -mbmi2`.
As one can see, `DYNAMIC_BMI2` does not match the speed of native `bmi2` + `avx2` (although it still outperforms the absence of `bmi2`). This discrepancy may be attributed to the specific benefits of `avx2`, which likely alleviates register pressure on the `x86` architecture.

`DYNAMIC_BMI2` can still be manually overridden, allowing it to be disabled as needed.
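
For readers unfamiliar with the technique, here is a minimal sketch of what runtime `bmi2` dispatch with a manual override can look like. It is an illustration under assumptions, not the code from this PR: `SKETCH_DYNAMIC_BMI2`, `lowbits`, `lowbits_generic` and `lowbits_bmi2` are hypothetical names, and the sketch relies on the GCC/Clang `target` function attribute and `__builtin_cpu_supports`.

```c
#include <stdint.h>

/* Hypothetical switch, standing in for a DYNAMIC_BMI2-style macro.
 * The #ifndef lets a build override it, e.g. -DSKETCH_DYNAMIC_BMI2=0. */
#ifndef SKETCH_DYNAMIC_BMI2
#  if defined(__GNUC__) && (defined(__i386__) || defined(__x86_64__))
#    define SKETCH_DYNAMIC_BMI2 1   /* enabled for both 32-bit and 64-bit x86 */
#  else
#    define SKETCH_DYNAMIC_BMI2 0
#  endif
#endif

/* Baseline variant: keep the low nbBits bits of value (nbBits in 0..31). */
static uint32_t lowbits_generic(uint32_t value, unsigned nbBits)
{
    return value & ((1u << nbBits) - 1u);
}

#if SKETCH_DYNAMIC_BMI2
/* Same body, but the compiler is allowed to use bmi2 instructions
 * inside this one function only. */
__attribute__((target("bmi2")))
static uint32_t lowbits_bmi2(uint32_t value, unsigned nbBits)
{
    return value & ((1u << nbBits) - 1u);
}
#endif

/* Public entry point: picks the variant at run time. */
uint32_t lowbits(uint32_t value, unsigned nbBits)
{
#if SKETCH_DYNAMIC_BMI2
    if (__builtin_cpu_supports("bmi2"))
        return lowbits_bmi2(value, nbBits);
#endif
    return lowbits_generic(value, nbBits);
}
```

Both variants return the same result on any CPU; only the generated instructions differ. Because the rest of the translation unit is still compiled for the baseline target, this kind of dispatch cannot benefit from `avx2` elsewhere in the code, which is consistent with the gap to a native `-mavx2 -mbmi2` build described above.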