complete all pack and unpack method #1

Open: wants to merge 2 commits into base: main

tang-hi commented Jul 5, 2023

I have implemented pack and unpack for the remaining bit widths, except 32, which does not seem to need compression and only has to be copied. Surprisingly, switching to IntVector made the code significantly faster than the Long version. I haven't finished running the benchmark yet; I will post the results once it is done. I hope this PR saves you time.
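
To make the pack/unpack idea and the 32-bit special case concrete, here is a minimal scalar sketch. It is not the code in this PR: the class name `ScalarBitPacking`, the method signatures, and the LSB-first bit layout are all assumptions made for illustration, and a vectorized implementation will most likely lay the bits out differently across lanes.

```java
import java.util.Arrays;

/** Hypothetical scalar reference for bit packing; not the PR's implementation. */
final class ScalarBitPacking {

    /** Packs the low bitsPerValue bits of each value into 32-bit words, LSB first (assumed layout). */
    static int[] pack(int[] values, int bitsPerValue) {
        checkWidth(bitsPerValue);
        if (bitsPerValue == 32) {
            return values.clone(); // nothing to compress: a plain copy is enough
        }
        int mask = (1 << bitsPerValue) - 1;
        int[] packed = new int[(values.length * bitsPerValue + 31) / 32];
        long buffer = 0;      // pending bits that have not been written yet
        int bitsInBuffer = 0; // always < 32 at the top of the loop
        int out = 0;
        for (int v : values) {
            buffer |= ((long) (v & mask)) << bitsInBuffer;
            bitsInBuffer += bitsPerValue;
            if (bitsInBuffer >= 32) { // a full word is ready
                packed[out++] = (int) buffer;
                buffer >>>= 32;
                bitsInBuffer -= 32;
            }
        }
        if (bitsInBuffer > 0) {
            packed[out] = (int) buffer; // flush the partially filled last word
        }
        return packed;
    }

    /** Reverses pack: reads count values of bitsPerValue bits each. */
    static int[] unpack(int[] packed, int bitsPerValue, int count) {
        checkWidth(bitsPerValue);
        if (bitsPerValue == 32) {
            return Arrays.copyOf(packed, count);
        }
        long mask = (1L << bitsPerValue) - 1;
        int[] values = new int[count];
        long buffer = 0;
        int bitsInBuffer = 0;
        int in = 0;
        for (int i = 0; i < count; i++) {
            if (bitsInBuffer < bitsPerValue) { // refill with the next word
                buffer |= (packed[in++] & 0xFFFFFFFFL) << bitsInBuffer;
                bitsInBuffer += 32;
            }
            values[i] = (int) (buffer & mask);
            buffer >>>= bitsPerValue;
            bitsInBuffer -= bitsPerValue;
        }
        return values;
    }

    private static void checkWidth(int bitsPerValue) {
        if (bitsPerValue < 1 || bitsPerValue > 32) {
            throw new IllegalArgumentException("bitsPerValue must be in [1, 32]");
        }
    }

    public static void main(String[] args) {
        int[] values = {1, 5, 7, 0, 3, 6, 2, 4, 7};
        int[] packed = pack(values, 3); // 9 values * 3 bits = 27 bits, one word
        System.out.println(packed.length);
        System.out.println(Arrays.equals(values, unpack(packed, 3, values.length)));
    }
}
```

A round trip like the one in `main`, run for every bit width, is also a cheap way to check a scalar implementation before comparing it against the vectorized one.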

I think our next steps are:

  1. Test the scalar code.
  2. Generate the code automatically. (I will also post this later.)

tang-hi commented Jul 5, 2023

Here is the encode benchmark

Benchmark                    Mode  Cnt    Score    Error  Units
Benchmark.encode10ForUtil   thrpt    5   24.931 ±  0.462  ops/us
Benchmark.encode10SimdPack  thrpt    5  137.347 ±  2.547  ops/us
Benchmark.encode11ForUtil   thrpt    5   24.184 ±  0.433  ops/us
Benchmark.encode11SimdPack  thrpt    5  133.867 ±  4.950  ops/us
Benchmark.encode12ForUtil   thrpt    5   26.851 ±  4.366  ops/us
Benchmark.encode12SimdPack  thrpt    5  137.089 ±  3.596  ops/us
Benchmark.encode13ForUtil   thrpt    5   22.474 ±  4.161  ops/us
Benchmark.encode13SimdPack  thrpt    5  123.514 ± 37.684  ops/us
Benchmark.encode14ForUtil   thrpt    5   25.812 ±  3.500  ops/us
Benchmark.encode14SimdPack  thrpt    5  121.874 ± 25.787  ops/us
Benchmark.encode15ForUtil   thrpt    5   25.394 ±  2.692  ops/us
Benchmark.encode15SimdPack  thrpt    5  127.692 ±  2.167  ops/us
Benchmark.encode16ForUtil   thrpt    5   53.138 ±  3.727  ops/us
Benchmark.encode16SimdPack  thrpt    5  151.850 ±  2.213  ops/us
Benchmark.encode17ForUtil   thrpt    5   15.232 ±  1.092  ops/us
Benchmark.encode17SimdPack  thrpt    5  124.229 ±  4.657  ops/us
Benchmark.encode18ForUtil   thrpt    5   14.183 ±  0.321  ops/us
Benchmark.encode18SimdPack  thrpt    5  123.324 ±  2.841  ops/us
Benchmark.encode19ForUtil   thrpt    5   11.748 ±  0.303  ops/us
Benchmark.encode19SimdPack  thrpt    5  119.589 ±  9.774  ops/us
Benchmark.encode1ForUtil    thrpt    5   33.853 ±  1.695  ops/us
Benchmark.encode1SimdPack   thrpt    5  117.328 ± 17.573  ops/us
Benchmark.encode20ForUtil   thrpt    5   15.283 ±  1.104  ops/us
Benchmark.encode20SimdPack  thrpt    5  124.027 ±  5.945  ops/us
Benchmark.encode21ForUtil   thrpt    5   14.914 ±  0.978  ops/us
Benchmark.encode21SimdPack  thrpt    5  117.567 ±  3.683  ops/us
Benchmark.encode22ForUtil   thrpt    5   15.325 ±  0.492  ops/us
Benchmark.encode22SimdPack  thrpt    5  118.220 ±  3.754  ops/us
Benchmark.encode23ForUtil   thrpt    5   14.541 ±  0.859  ops/us
Benchmark.encode23SimdPack  thrpt    5  107.146 ± 38.841  ops/us
Benchmark.encode24ForUtil   thrpt    5   16.600 ±  3.122  ops/us
Benchmark.encode24SimdPack  thrpt    5  117.694 ±  4.493  ops/us
Benchmark.encode25ForUtil   thrpt    5   14.250 ±  0.338  ops/us
Benchmark.encode25SimdPack  thrpt    5  111.049 ±  1.944  ops/us
Benchmark.encode26ForUtil   thrpt    5   14.268 ±  0.415  ops/us
Benchmark.encode26SimdPack  thrpt    5  106.436 ±  4.357  ops/us
Benchmark.encode27ForUtil   thrpt    5   13.871 ±  0.976  ops/us
Benchmark.encode27SimdPack  thrpt    5  108.070 ±  6.025  ops/us
Benchmark.encode28ForUtil   thrpt    5   16.362 ±  2.884  ops/us
Benchmark.encode28SimdPack  thrpt    5  105.754 ±  4.053  ops/us
Benchmark.encode29ForUtil   thrpt    5   13.323 ±  1.909  ops/us
Benchmark.encode29SimdPack  thrpt    5  103.449 ±  9.690  ops/us
Benchmark.encode2ForUtil    thrpt    5   42.112 ±  1.316  ops/us
Benchmark.encode2SimdPack   thrpt    5  129.143 ±  3.222  ops/us
Benchmark.encode30ForUtil   thrpt    5   16.772 ±  0.496  ops/us
Benchmark.encode30SimdPack  thrpt    5  110.330 ±  1.753  ops/us
Benchmark.encode31ForUtil   thrpt    5   17.295 ±  0.610  ops/us
Benchmark.encode31SimdPack  thrpt    5  106.625 ±  5.102  ops/us
Benchmark.encode3ForUtil    thrpt    5   35.811 ±  3.596  ops/us
Benchmark.encode3SimdPack   thrpt    5  123.668 ±  4.353  ops/us
Benchmark.encode4ForUtil    thrpt    5   45.050 ±  1.472  ops/us
Benchmark.encode4SimdPack   thrpt    5  144.834 ±  2.866  ops/us
Benchmark.encode5ForUtil    thrpt    5   32.496 ±  1.882  ops/us
Benchmark.encode5SimdPack   thrpt    5  138.943 ±  4.958  ops/us
Benchmark.encode6ForUtil    thrpt    5   34.773 ±  0.595  ops/us
Benchmark.encode6SimdPack   thrpt    5  136.707 ±  8.658  ops/us
Benchmark.encode7ForUtil    thrpt    5   34.792 ±  0.523  ops/us
Benchmark.encode7SimdPack   thrpt    5  126.586 ±  2.790  ops/us
Benchmark.encode8ForUtil    thrpt    5   53.890 ±  1.244  ops/us
Benchmark.encode8SimdPack   thrpt    5  132.190 ±  4.752  ops/us
Benchmark.encode9ForUtil    thrpt    5   22.176 ±  0.513  ops/us
Benchmark.encode9SimdPack   thrpt    5  131.336 ±  4.564  ops/us
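
To read the table as a speedup rather than as raw throughput, divide the SimdPack score by the ForUtil score for the same bit width. The sketch below does this for three rows copied from the results above; the class name `EncodeSpeedup` and its helper are hypothetical. Some rows have wide error bounds (for example encode13SimdPack and encode23SimdPack), so the ratios are best treated as rough indications.

```java
import java.util.Locale;

/** Hypothetical helper that turns two throughput scores into a speedup ratio. */
final class EncodeSpeedup {

    /** Ratio of SimdPack throughput to ForUtil throughput (both in ops/us). */
    static double speedup(double forUtilScore, double simdPackScore) {
        return simdPackScore / forUtilScore;
    }

    public static void main(String[] args) {
        // {bits per value, ForUtil score, SimdPack score}, copied from the benchmark table
        double[][] rows = {
            {10, 24.931, 137.347},
            {16, 53.138, 151.850},
            {19, 11.748, 119.589},
        };
        for (double[] row : rows) {
            System.out.printf(Locale.ROOT, "encode%d: %.2fx%n",
                    (int) row[0], speedup(row[1], row[2]));
        }
    }
}
```

For these rows the ratios come out at roughly 5.5x, 2.9x, and 10.2x. The gain is smallest for 16 bits, where ForUtil is already comparatively fast.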

tang-hi commented Jul 5, 2023

Here is the code generator script:
https://gist.github.com/tang-hi/02708ccd3bf8837f774f2da9a16a3e2d

Usage: python vectorized.py <pack/unpack> <bitPerValue>
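
For readers who have not opened the gist, the sketch below illustrates the general idea of generating unrolled code for one fixed bit width. It does not reproduce the script: it is written in Java rather than Python, it emits scalar statements rather than IntVector code, and the class name `UnrolledPackGenerator`, the `in`/`out` array names, and the LSB-first layout are all assumptions made for illustration.

```java
/** Hypothetical generator of unrolled scalar pack statements; not the gist's script. */
final class UnrolledPackGenerator {

    /**
     * Returns Java statements that pack in[0..31] into out[0..bitsPerValue-1].
     * The generated code assumes every in[i] already fits in bitsPerValue bits.
     */
    static String generatePack(int bitsPerValue) {
        if (bitsPerValue < 1 || bitsPerValue > 31) {
            throw new IllegalArgumentException("bitsPerValue must be in [1, 31]");
        }
        StringBuilder sb = new StringBuilder();
        int word = 0; // index of the output word currently being filled
        int used = 0; // number of bits already occupied in that word
        for (int i = 0; i < 32; i++) {
            // The first value written to a word assigns it; later values are OR-ed in.
            sb.append("out[").append(word).append(used == 0 ? "] = " : "] |= ")
              .append("in[").append(i).append("]");
            if (used > 0) {
                sb.append(" << ").append(used);
            }
            sb.append(";\n");
            used += bitsPerValue;
            if (used >= 32) {
                word++;
                used -= 32;
                if (used > 0) {
                    // Value i straddles two words: its high bits start the next word.
                    sb.append("out[").append(word).append("] = in[").append(i)
                      .append("] >>> ").append(bitsPerValue - used).append(";\n");
                }
            }
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // Mirrors the spirit of the script's <bitPerValue> argument.
        int bitsPerValue = args.length > 0 ? Integer.parseInt(args[0]) : 3;
        System.out.print(generatePack(bitsPerValue));
    }
}
```

Because the bit width is fixed at generation time, every shift amount becomes a constant in the emitted code, which is the main reason to generate one method per width instead of writing a single generic loop.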
