Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Simplify chunk_encoder #218

Closed
wants to merge 1 commit into from
Closed

Simplify chunk_encoder #218

wants to merge 1 commit into from

Conversation

mina86
Copy link
Contributor

@mina86 mina86 commented Jan 20, 2023

Firstly, use [T]::chunks rather than doing index calculation
manually.

Secondly, get rid of max_input_length. The function was overly
complicated for no reason. After all, no matter whether engine uses
padding or not, N-byte output buffer can accommodate (N/4*3) input
bytes.

Encode benchmarks here are kind of all over. Some improvements and
some regressions:

encode/
           3   29.272 ns   -0.4%   No change
          50   54.162 ns  +12.4%   Regression
         100   95.488 ns   -7.8%   Improvement
         500   255.84 ns   +3.5%   Regression
        3072   1.3527 µs   +0.1%   No change
     3145728   1.4584 ms   +0.2%   Within noise
    10485760   4.8923 ms   +1.6%   Regression
    31457280   16.087 ms   -2.6%   Improvement

encode_display/
           3   41.038 ns   -6.9%   Improvement
          50   70.877 ns   +7.2%   Regression
         100   91.190 ns   +1.1%   Within noise
         500   264.58 ns   -3.1%   Improvement
        3072   1.4515 µs   -3.8%   Improvement
     3145728   1.4239 ms   -0.2%   Within noise
    10485760   4.9768 ms   -0.7%   Within noise
    31457280   18.202 ms   -1.5%   Improvement

encode_reuse_buf/
           3   20.264 ns  -23.1%   Improvement
          50   46.631 ns   +2.1%   Regression
         100   66.405 ns   -2.0%   Improvement
         500   243.52 ns   +3.5%   Regression
        3072   1.3871 µs   -1.2%   Within noise
     3145728   1.4278 ms   +0.8%   Within noise
    10485760   5.0530 ms   -1.6%   Improvement
    31457280   15.798 ms   +0.7%   No change

encode_reuse_buf_stream/
           3   21.007 ns   +1.1%   Regression
          50   56.542 ns   -0.9%   No change
         100   75.214 ns   -1.4%   Improvement
         500   247.61 ns   -4.2%   Improvement
        3072   1.2836 µs   +0.1%   No change
     3145728   1.3177 ms   -0.5%   Within noise
    10485760   4.3843 ms   +0.3%   Within noise
    31457280   14.058 ms   -0.5%   Within noise

encode_slice/
           3   11.859 ns   +0.1%   No change
          50   30.239 ns   +2.8%   Regression
         100   53.383 ns   -6.4%   Improvement
         500   207.90 ns   +0.2%   No change
        3072   1.2050 µs   -0.1%   Within noise
     3145728   1.2294 ms   -0.5%   Within noise
    10485760   4.1025 ms   -0.1%   No change
    31457280   12.352 ms   +0.1%   Within noise

encode_string_reuse_buf_stream/
           3   39.237 ns   -0.9%   Within noise
          50   100.15 ns   -1.6%   Improvement
         100   124.03 ns   +3.5%   Regression
         500   312.15 ns  +10.0%   Regression
        3072   1.4781 µs   +2.7%   Regression
     3145728   1.4942 ms   +3.8%   Regression
    10485760   4.9965 ms   +4.3%   Regression
    31457280   16.990 ms   +7.4%   Regression

encode_string_stream/
           3   48.437 ns   -4.2%   Improvement
          50   152.06 ns   +3.5%   Regression
         100   170.02 ns   -5.2%   Improvement
         500   324.46 ns   -0.8%   Within noise
        3072   1.5332 µs   +1.6%   Regression
     3145728   1.4947 ms   +3.6%   Regression
    10485760   4.9710 ms   +3.2%   Regression
    31457280   19.075 ms   +2.2%   Regression

Firstly, use `[T]::chunks` rather than doing index calculation
manually.

Secondly, get rid of max_input_length.  The function was overly
complicated for no reason.  After all, no matter whether engine uses
padding or not, N-byte output buffer can accommodate (N/4*3) input
bytes.

Encode benchmarks here are kind of all over.  Some improvements and
some regressions:

    encode/
               3   29.272 ns   -0.4%   No change
              50   54.162 ns  +12.4%   Regression
             100   95.488 ns   -7.8%   Improvement
             500   255.84 ns   +3.5%   Regression
            3072   1.3527 µs   +0.1%   No change
         3145728   1.4584 ms   +0.2%   Within noise
        10485760   4.8923 ms   +1.6%   Regression
        31457280   16.087 ms   -2.6%   Improvement

    encode_display/
               3   41.038 ns   -6.9%   Improvement
              50   70.877 ns   +7.2%   Regression
             100   91.190 ns   +1.1%   Within noise
             500   264.58 ns   -3.1%   Improvement
            3072   1.4515 µs   -3.8%   Improvement
         3145728   1.4239 ms   -0.2%   Within noise
        10485760   4.9768 ms   -0.7%   Within noise
        31457280   18.202 ms   -1.5%   Improvement

    encode_reuse_buf/
               3   20.264 ns  -23.1%   Improvement
              50   46.631 ns   +2.1%   Regression
             100   66.405 ns   -2.0%   Improvement
             500   243.52 ns   +3.5%   Regression
            3072   1.3871 µs   -1.2%   Within noise
         3145728   1.4278 ms   +0.8%   Within noise
        10485760   5.0530 ms   -1.6%   Improvement
        31457280   15.798 ms   +0.7%   No change

    encode_reuse_buf_stream/
               3   21.007 ns   +1.1%   Regression
              50   56.542 ns   -0.9%   No change
             100   75.214 ns   -1.4%   Improvement
             500   247.61 ns   -4.2%   Improvement
            3072   1.2836 µs   +0.1%   No change
         3145728   1.3177 ms   -0.5%   Within noise
        10485760   4.3843 ms   +0.3%   Within noise
        31457280   14.058 ms   -0.5%   Within noise

    encode_slice/
               3   11.859 ns   +0.1%   No change
              50   30.239 ns   +2.8%   Regression
             100   53.383 ns   -6.4%   Improvement
             500   207.90 ns   +0.2%   No change
            3072   1.2050 µs   -0.1%   Within noise
         3145728   1.2294 ms   -0.5%   Within noise
        10485760   4.1025 ms   -0.1%   No change
        31457280   12.352 ms   +0.1%   Within noise

    encode_string_reuse_buf_stream/
               3   39.237 ns   -0.9%   Within noise
              50   100.15 ns   -1.6%   Improvement
             100   124.03 ns   +3.5%   Regression
             500   312.15 ns  +10.0%   Regression
            3072   1.4781 µs   +2.7%   Regression
         3145728   1.4942 ms   +3.8%   Regression
        10485760   4.9965 ms   +4.3%   Regression
        31457280   16.990 ms   +7.4%   Regression

    encode_string_stream/
               3   48.437 ns   -4.2%   Improvement
              50   152.06 ns   +3.5%   Regression
             100   170.02 ns   -5.2%   Improvement
             500   324.46 ns   -0.8%   Within noise
            3072   1.5332 µs   +1.6%   Regression
         3145728   1.4947 ms   +3.6%   Regression
        10485760   4.9710 ms   +3.2%   Regression
        31457280   19.075 ms   +2.2%   Regression
@marshallpierce
Copy link
Owner

Thanks for the contribution! Fortunately I didn't see quite as much benchmark variation, and the presence of changes in, say, encode_slice that don't touch this logic would imply that there's a fair amount of baseline noise (which my systems have as well). I've simplified it a little in #224.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants