add target features for the SIMD compare256 operations #160

Merged
folkertdev merged 1 commit into main from compare256-target-features
Aug 23, 2024


folkertdev
Collaborator

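This is important in the generic case where AVX2 is not statically enabled, for example when the crate is built without `-C target-feature=+avx2` or `-C target-cpu=native`.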
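To illustrate the idea, here is a minimal sketch of a `compare256`-style routine that carries its own `#[target_feature(enable = "avx2")]` attribute and is selected at runtime. It is not the code from this PR: the function names, the scalar fallback, and the dispatch via `is_x86_feature_detected!` are assumptions made for illustration. The attribute lets the compiler emit and inline AVX2 instructions for that one function even when the rest of the build targets a baseline x86_64 CPU.

```rust
/// Hypothetical AVX2 variant: returns the number of leading bytes on which
/// `a` and `b` agree (0..=256). The attribute enables AVX2 code generation
/// for this function only, independent of the crate-wide target features.
#[cfg(target_arch = "x86_64")]
#[target_feature(enable = "avx2")]
unsafe fn compare256_avx2(a: &[u8; 256], b: &[u8; 256]) -> usize {
    use std::arch::x86_64::{
        __m256i, _mm256_cmpeq_epi8, _mm256_loadu_si256, _mm256_movemask_epi8,
    };

    let mut len = 0;
    while len < 256 {
        // SAFETY: `len` is a multiple of 32 below 256, so each unaligned
        // 32-byte load stays inside the 256-byte arrays.
        let mask = unsafe {
            let va = _mm256_loadu_si256(a.as_ptr().add(len) as *const __m256i);
            let vb = _mm256_loadu_si256(b.as_ptr().add(len) as *const __m256i);
            // One bit per byte lane: 1 where the bytes are equal.
            _mm256_movemask_epi8(_mm256_cmpeq_epi8(va, vb)) as u32
        };
        if mask != u32::MAX {
            // The lowest zero bit marks the first mismatching byte.
            return len + (!mask).trailing_zeros() as usize;
        }
        len += 32;
    }
    256
}

/// Portable fallback with the same contract (illustrative only).
fn compare256_scalar(a: &[u8; 256], b: &[u8; 256]) -> usize {
    a.iter().zip(b.iter()).take_while(|(x, y)| x == y).count()
}

/// Runtime dispatch: use the AVX2 variant only when the CPU reports support.
pub fn compare256(a: &[u8; 256], b: &[u8; 256]) -> usize {
    #[cfg(target_arch = "x86_64")]
    {
        if std::arch::is_x86_feature_detected!("avx2") {
            // SAFETY: AVX2 availability was checked on the line above.
            return unsafe { compare256_avx2(a, b) };
        }
    }
    compare256_scalar(a, b)
}
```

Without the attribute, a generic build would typically compile the intrinsics as non-inlined calls, which is consistent with the instruction-count difference in the benchmark below.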


Benchmark 1 (53 runs): ./target/release/examples/baseline 1 rs silesia-small.tar
  measurement          mean ± σ            min … max           outliers         delta
  wall_time          95.0ms ±  720us    93.7ms … 97.2ms          1 ( 2%)        0%
  peak_rss           26.6MB ± 64.1KB    26.5MB … 26.6MB          0 ( 0%)        0%
  cpu_cycles          362M  ± 2.45M      357M  …  370M           1 ( 2%)        0%
  instructions        851M  ±  255       851M  …  851M           0 ( 0%)        0%
  cache_references   19.9M  ±  130K     19.7M  … 20.2M           0 ( 0%)        0%
  cache_misses        475K  ± 83.1K      374K  …  820K           1 ( 2%)        0%
  branch_misses      3.06M  ± 2.25K     3.06M  … 3.07M           2 ( 4%)        0%
Benchmark 2 (57 runs): ./target/release/examples/blogpost-compress 1 rs silesia-small.tar
  measurement          mean ± σ            min … max           outliers         delta
  wall_time          87.9ms ±  918us    86.8ms … 91.3ms          3 ( 5%)        ⚡-  7.5% ±  0.3%
  peak_rss           26.6MB ± 63.1KB    26.5MB … 26.6MB          0 ( 0%)          +  0.0% ±  0.1%
  cpu_cycles          330M  ± 3.22M      326M  …  343M           4 ( 7%)        ⚡-  9.0% ±  0.3%
  instructions        776M  ±  278       776M  …  776M           0 ( 0%)        ⚡-  8.9% ±  0.0%
  cache_references   19.8M  ±  145K     19.6M  … 20.3M           1 ( 2%)          -  0.4% ±  0.3%
  cache_misses        448K  ± 75.2K      349K  …  642K           0 ( 0%)          -  5.8% ±  6.3%
  branch_misses      3.06M  ± 2.53K     3.06M  … 3.07M           0 ( 0%)          -  0.1% ±  0.0%

@folkertdev folkertdev merged commit 2064779 into main Aug 23, 2024
17 checks passed
@folkertdev folkertdev deleted the compare256-target-features branch August 23, 2024 19:37