-
-
Notifications
You must be signed in to change notification settings - Fork 1.3k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
basenc: perform faster, streaming encoding #6719
Conversation
Improve the performance, both in memory and time, of the encoding performed by the basenc (except in --z85 mode), base32, and base64 programs. These programs now perform encoding in a buffered/streaming manner, so encoding is not constrained by the amount of available memory.
Setup
No wrappingNew implementationElapsed (wall clock) time (h:mm:ss or m:ss): 0:00.70
Existing implementationElapsed (wall clock) time (h:mm:ss or m:ss): 0:01.99
GNU Core Utilities's implementationElapsed (wall clock) time (h:mm:ss or m:ss): 0:00.72
Default wrapping (76 characters)New implementationElapsed (wall clock) time (h:mm:ss or m:ss): 0:03.40
Existing implementationElapsed (wall clock) time (h:mm:ss or m:ss): 0:08.04
GNU Core Utilities's implementationElapsed (wall clock) time (h:mm:ss or m:ss): 0:00.99
|
GNU testsuite comparison:
|
Please use hyperfine for benchmarking. Time isn't reliable enough for performances |
(Was having trouble getting
No wrappingpoopNew implementation
Existing implementation
hyperfineNew implementation
Existing implementation
|
you should call hyperfine once with the old and new implementation |
No wrapping
Default wrapping (76 characters)
|
GNU testsuite comparison:
|
GNU testsuite comparison:
|
Same as previous changes, just applied to decoding
Decoding performance dataSetup
Hyperfine
poop
|
GNU testsuite comparison:
|
Last one:
|
a305350
to
27482c5
Compare
Encoding with line wrapping enabled was the last remaining scenario I'm aware of in which GNU Core Utilities was still faster, and now that's been fixed: hyperfine
poop
|
GNU testsuite comparison:
|
GNU testsuite comparison:
|
ec7261f
to
2ebce39
Compare
GNU testsuite comparison:
|
Not directly related, but I'm mentioning it for completeness: #6008 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The code looks good, and your performance comparison looks awesome, thank you!
I have some suggestions about some names and style decisions that I feel are way too unusual and surprising, and would like to ask you to change them.
Also, Codecov complains about lines being untested, and there seem to be no unit test for the hundreds of new lines. Please fix that.
This PR improves base*
substantially, and I'm looking forward to seeing it merged!
992a352
to
9eb35c7
Compare
I have added property testing of the core decoding/encoding logic to the PR. Doing so helped catch two bugs: one that I introduced in The other bug was related to my misunderstanding of input length validation for Z85. Anyway, this definitely sold me on property testing. I have tuned the parameters so that the property tests for |
GNU testsuite comparison:
|
9c82332
to
a0718ef
Compare
Could you please add your methodology into src/uu/base64/BENCHMARKING.md like we are doing for other programs? thanks |
GNU testsuite comparison:
|
a0718ef
to
7672dd9
Compare
Benchmarking instructions, and some benchmark results, added: |
GNU testsuite comparison:
|
great doc, it is a nice read :) |
7672dd9
to
11b4640
Compare
11b4640
to
f3b2304
Compare
GNU testsuite comparison:
|
26782d5
to
9fa405f
Compare
GNU testsuite comparison:
|
GNU testsuite comparison:
|
Improve the performance, both in memory and time, of the encoding performed by the basenc (except in --z85 mode), base32, and base64 programs.
These programs now perform encoding in a buffered/streaming manner, so encoding is not constrained by the amount of available memory.