Better vectorization and crc64 #79

Merged
merged 119 commits into from
Sep 5, 2024
c6248e0
Added CRC32C AVX512 support.
javazque Jun 22, 2023
cf22bca
Fixed routine name to indicate crc32c
pbadari Jun 27, 2023
def3a68
Merge pull request #1 from pbadari/avx512_support
pbadari Jun 27, 2023
375fa35
Add sse42 avx512_intrinsics support
javazque Jul 14, 2023
9e18b50
Merge pull request #2 from pbadari/sse42_avx512_intrinsics
pbadari Jul 14, 2023
469ef71
Merge branch 'main' into main
JonathanHenson Jul 14, 2023
2eb5578
Refactoring work for the AVX512 code path. Testing shows it not quite…
JonathanHenson Jul 19, 2023
837d5a1
Keep the naive avx512 path on for figuring out codebuild capabilities…
JonathanHenson Jul 19, 2023
2289c96
Fix build and do correct cpu feature detection.
JonathanHenson Jul 19, 2023
ee3e5da
fix 32-bit builds and builds that need to work without intrinsics ava…
JonathanHenson Jul 19, 2023
005ed7c
Not sure how the avx512 code got called. hopefully codebuild is just…
JonathanHenson Jul 19, 2023
39094d4
Found why the wrong build files were being used at least.
JonathanHenson Jul 19, 2023
d4ffdc1
Make test pass when it passes.
JonathanHenson Jul 19, 2023
bf79936
Try it again.
JonathanHenson Jul 19, 2023
1e24d06
fix leftover symbol collision.
JonathanHenson Jul 19, 2023
907e721
Added more compile gates and assertions.
JonathanHenson Jul 19, 2023
5ab0046
Fix osx build.
JonathanHenson Jul 20, 2023
ca43c51
make the bitflips uniform.
JonathanHenson Jul 20, 2023
28dde8b
add additional runtime cpuid check and run formatter.
JonathanHenson Jul 20, 2023
a00a8e3
work around nasty bitflipping logic.
JonathanHenson Jul 20, 2023
5138407
Addressed review comments, use ternary logic instructions and optimiz…
pbadari Jan 1, 2024
2a084ac
Added crc64 implementations for arm and intel, Added some avx512 code…
JonathanHenson Jan 18, 2024
3f0092f
ran formatter.
JonathanHenson Jan 18, 2024
d78dcb8
Updated formatter and intel compile branch.
JonathanHenson Jan 18, 2024
1cda49a
lets try again.
JonathanHenson Jan 18, 2024
145267d
Manually format and updated compiler flag.
JonathanHenson Jan 18, 2024
53fb00f
msvc fix, more formatting, append compiler flag rather than resetting…
JonathanHenson Jan 18, 2024
1e50c18
sucks to only have an arm machine to test this on.
JonathanHenson Jan 18, 2024
fca8adc
See if cleaning this up helps.
JonathanHenson Jan 22, 2024
dab6a82
Merge branch 'main' into better_vectorization_and_crc64
JonathanHenson Jan 23, 2024
05799a5
Another clmul typo.
JonathanHenson Jan 23, 2024
ae55f55
who knows.
JonathanHenson Jan 23, 2024
160d587
use the new macros
JonathanHenson Jan 23, 2024
dd35f50
missed one.
JonathanHenson Jan 23, 2024
765313f
okay, now we're back to code being broken babuy
JonathanHenson Jan 23, 2024
166ddd9
there's the vl we needed.
JonathanHenson Jan 23, 2024
9bad62d
maybe this is all i needed.
JonathanHenson Jan 23, 2024
543c487
add sse4.2 flag back to the avx512 build for gcc8.
JonathanHenson Jan 23, 2024
ea25508
windows build fixes, as well as x86 build mismatch.
JonathanHenson Jan 23, 2024
7b63f06
more windows fixes and macros.
JonathanHenson Jan 23, 2024
58ece21
non-standard intrinsics headers?
JonathanHenson Jan 23, 2024
c6e65ea
incorrect macro syntax.
JonathanHenson Jan 23, 2024
b074483
run formatter, fix function declarations.
JonathanHenson Jan 23, 2024
a3ab193
Maybe the headers i need for older compilers.
JonathanHenson Jan 23, 2024
d7ccb7d
more formatter fixes.
JonathanHenson Jan 23, 2024
4a04be9
linters.
JonathanHenson Jan 23, 2024
42f6e10
Magical IDE tabs are the frickin worst.
JonathanHenson Jan 23, 2024
b0875b6
Check 64-bit arch before assuming can use arm8.1
JonathanHenson Jan 23, 2024
3456c37
don't compile the crc64 arm stuff if not 64 bit.
JonathanHenson Jan 23, 2024
032a7e5
use consistent header includes.
JonathanHenson Jan 23, 2024
5473f71
use an actually widely documented intrinsic.
JonathanHenson Jan 23, 2024
d61841a
fix test build and exported symbol needed for tests.
JonathanHenson Jan 23, 2024
f9a7709
Use sse2 on the msvc version when 4.2. is specified.
JonathanHenson Jan 23, 2024
856e2a2
work around old microsoft compiler.
JonathanHenson Jan 23, 2024
7ad72d5
add it to the correct file this time.
JonathanHenson Jan 23, 2024
0cda5eb
get your types right dude.
JonathanHenson Jan 23, 2024
73330e1
don't use the clmul version on old msvc.
JonathanHenson Jan 23, 2024
af2952e
run formatter.
JonathanHenson Jan 23, 2024
24165d7
format again.
JonathanHenson Jan 23, 2024
bfb8600
Added more thorough testing to make sure all the hw accelerated branc…
JonathanHenson Jan 24, 2024
86ca022
msvc compiler errors.
JonathanHenson Jan 24, 2024
82a8ed3
i think the warning was actually right.
JonathanHenson Jan 24, 2024
449c8b4
Use runtime cpu checks for arm.
JonathanHenson Jan 24, 2024
995ea61
Clean up cmake.
JonathanHenson Jan 24, 2024
1a3d6bd
Remove unneeded glob.
JonathanHenson Jan 24, 2024
52ed9e7
make sure tests use the testing allocator.
JonathanHenson Jan 24, 2024
a8fecf8
fix windows test compiler warning.
JonathanHenson Jan 24, 2024
c91a81d
restructured the code so the fallthroughs are less complicated.
JonathanHenson Jan 29, 2024
4319bab
typo.
JonathanHenson Jan 29, 2024
34de264
put the asm file back.
JonathanHenson Jan 29, 2024
1fa581a
compile guard on the sse cmul fallback.
JonathanHenson Jan 29, 2024
aaa5a02
make sure that uber file has the right flags.
JonathanHenson Jan 29, 2024
048bf58
Move includes inside their macro guards.
JonathanHenson Jan 29, 2024
333a6d3
Add the null implementation back
JonathanHenson Jan 30, 2024
7021062
fix the null inclusion in cmake.
JonathanHenson Jan 30, 2024
e028f3e
add the generic implementation, renamed from null.
JonathanHenson Jan 30, 2024
8516cc9
Shave that yak!
JonathanHenson Jan 30, 2024
a43f739
just learned why a c-style cast is memory unsafe even when you know y…
JonathanHenson Jan 31, 2024
43e87ad
try just making sure the data is aligned first.
JonathanHenson Jan 31, 2024
4ac54af
fix constness.
JonathanHenson Jan 31, 2024
a7d22dd
run formatter and fix conditional.
JonathanHenson Jan 31, 2024
0e31476
Use the correct branch this time.
JonathanHenson Jan 31, 2024
4870411
see what happens without an alignment on those arrays.
JonathanHenson Feb 1, 2024
0d4f728
see what happens without an alignment on those arrays.
JonathanHenson Feb 1, 2024
ef83ed9
Try not telling ASAN quite so much info about the type and see if it…
JonathanHenson Feb 1, 2024
86604f0
restrict the input size to always hit the sw implementation on smaller…
JonathanHenson Feb 1, 2024
19d5344
Try intrinsics we can actually use everywhere.
JonathanHenson Feb 1, 2024
2e73b17
zmm, not xmm.
JonathanHenson Feb 1, 2024
778ed1d
Fix xlmuil intel build.
JonathanHenson Feb 1, 2024
96067a6
Use cmake function more widely available.
JonathanHenson Feb 1, 2024
3dfaaf6
update crc32c and clean up macros.
JonathanHenson Feb 5, 2024
1c2625c
Don't do the generic fallback file as its unneeded.
JonathanHenson Feb 5, 2024
4c51525
Remove unneeded clang format stuff.
JonathanHenson Feb 5, 2024
729bf32
visual studio not saving without an explicit ctrl+s is some b.s.
JonathanHenson Feb 5, 2024
0ab865e
More build fixes.
JonathanHenson Feb 5, 2024
2e61be3
Update tests to have randomized input data.
JonathanHenson Feb 5, 2024
d86e10c
Fixed avx512 detection for compiling crc64_avx512 impl.
JonathanHenson Feb 6, 2024
9cd74b9
Added runner for apple arm.
JonathanHenson Feb 8, 2024
8fd01c7
fix build of profiler run.
JonathanHenson Feb 8, 2024
4eb218d
use branch name for builder.
JonathanHenson Feb 8, 2024
c776234
use actual branch name.
JonathanHenson Feb 8, 2024
8b11a21
why is it not using the host arch
JonathanHenson Feb 8, 2024
3f37ea6
read the source code i guess...
JonathanHenson Feb 8, 2024
49fefa1
specify target.
JonathanHenson Feb 8, 2024
49e1aa0
specify target.
JonathanHenson Feb 8, 2024
95bd82e
try a different target.
JonathanHenson Feb 8, 2024
5983351
Fix memory read for stack allocated buffers.
JonathanHenson Feb 8, 2024
94b6f5d
Fix windows conversion errors on profile run.
JonathanHenson Feb 8, 2024
1657ac5
run profiler as part of tests.
JonathanHenson Feb 8, 2024
499ec69
run profiler as part of tests.
JonathanHenson Feb 8, 2024
d86d672
use correct value name for test steps.
JonathanHenson Feb 8, 2024
4a61537
try just invoking ctest.
JonathanHenson Feb 8, 2024
eb95b28
use default tester path.
JonathanHenson Feb 8, 2024
1657807
switch over crc64 to nvme flavor
DmitriyMusatkin Sep 3, 2024
15b9716
fix benchmark
DmitriyMusatkin Sep 3, 2024
4915177
Merge branch 'main' into better_vectorization_and_crc64
DmitriyMusatkin Sep 3, 2024
0517612
lint
DmitriyMusatkin Sep 3, 2024
478c6cb
lets try again
DmitriyMusatkin Sep 3, 2024
2f55d46
minor
DmitriyMusatkin Sep 5, 2024
fix windows test compiler warning.
JonathanHenson committed Jan 24, 2024
commit a8fecf8837979f3f95c3b8a20f06a33e61d21d83
2 changes: 1 addition & 1 deletion tests/crc64_test.c
@@ -91,7 +91,7 @@ static int s_test_known_crc64xz(struct aws_allocator *allocator, const char *fun
     // Spin through buffer offsets
     for (int off = 0; off < 16; off++) {
         // Fill the test buffer with different values for each iteration
-        aws_byte_buf_write_u8_n(&test_buf, off + 129, test_buf.capacity - test_buf.len);
+        aws_byte_buf_write_u8_n(&test_buf, (uint8_t)off + 129, test_buf.capacity - test_buf.len);
         uint64_t expected = 0;
         int len = 1;
         // Spin through input data lengths
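For context on what a test like `s_test_known_crc64xz` can compare against, here is a minimal bit-at-a-time sketch of CRC-64/XZ. It is not the library's implementation, and the name `crc64xz_reference` is hypothetical. It assumes the standard CRC-64/XZ parameters (polynomial 0x42F0E1EBA9EA3693 processed in reflected form, all-ones initial value and final XOR), and takes a `previous` value so that a buffer can be checksummed in pieces.

```c
#include <stddef.h>
#include <stdint.h>

/* Bit-reflected form of the CRC-64/XZ polynomial 0x42F0E1EBA9EA3693. */
#define CRC64XZ_POLY_REFLECTED 0xC96C5795D7870F42ULL

/*
 * Hypothetical reference routine (an assumption for illustration, not the
 * library's API). Pass 0 as `previous` for a fresh checksum, or the result
 * of an earlier call to continue over the next chunk of data.
 */
static uint64_t crc64xz_reference(const uint8_t *data, size_t len, uint64_t previous) {
    uint64_t crc = ~previous; /* undo the final XOR of the previous call */
    for (size_t i = 0; i < len; i++) {
        crc ^= data[i];
        for (int bit = 0; bit < 8; bit++) {
            /* mask is all ones when the low bit is set, otherwise zero */
            uint64_t mask = (uint64_t)0 - (crc & 1);
            crc = (crc >> 1) ^ (CRC64XZ_POLY_REFLECTED & mask);
        }
    }
    return ~crc; /* final XOR with all ones */
}
```

A slow routine like this is only useful as a cross-check: the offset and length sweep in the hunk above can run the same input through it and through the accelerated path, then compare the two results.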
2 changes: 1 addition & 1 deletion tests/crc_test.c
@@ -106,7 +106,7 @@ static int s_test_vs_reference_crc_32(
     // Spin through buffer offsets
     for (int off = 0; off < 16; off++) {
         // Fill the test buffer with different values for each iteration
-        aws_byte_buf_write_u8_n(&test_buf, off + 129, test_buf.capacity - test_buf.len);
+        aws_byte_buf_write_u8_n(&test_buf, (uint8_t)off + 129, test_buf.capacity - test_buf.len);
         uint32_t expected = 0;
         int len = 1;
         // Spin through input data lengths
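A side note on the cast used in both hunks: in C the cast binds tighter than `+`, so `(uint8_t)off + 129` converts `off` first, and the sum is then promoted back to `int`. Whether a particular compiler still reports a narrowing conversion for that argument is not something this diff shows. The sketch below casts the whole sum instead. `fill_u8_n` is a hypothetical stand-in for `aws_byte_buf_write_u8_n`, and the helper names are assumptions for illustration only.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for aws_byte_buf_write_u8_n: writes `count` copies of `value`. */
static void fill_u8_n(uint8_t *dst, uint8_t value, size_t count) {
    memset(dst, value, count);
}

/* The cast applies to the left operand only; the sum has type int after promotion. */
static_assert(sizeof((uint8_t)0 + 129) == sizeof(int), "sum is promoted to int");

/* Casting the parenthesized sum gives an expression of type uint8_t. */
static_assert(sizeof((uint8_t)(0 + 129)) == sizeof(uint8_t), "cast sum is uint8_t");

/* For off in [0, 16) the sum is 129..144, which fits in a uint8_t without wrapping. */
static uint8_t fill_value_for_offset(int off) {
    return (uint8_t)(off + 129);
}

/* Fills the buffer with the per-offset value, mirroring the test loop body. */
static void fill_for_offset(uint8_t *dst, size_t count, int off) {
    fill_u8_n(dst, fill_value_for_offset(off), count);
}
```

For the offsets the tests use (0 through 15), both spellings produce the same byte values; the difference is only in the type of the expression handed to the function.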