🎉 Zstd 1.5.0 Release 🎉 #2636
Memory-constrained use cases that manage multiple archives benefit from retaining multiple archive seek tables without retaining a ZSTD_seekable instance for each.

* New opaque type for seek table: ZSTD_seekTable.
* ZSTD_seekable_copySeekTable() supports copying the seek table out of a ZSTD_seekable.
* ZSTD_seekTable_[eachSeekTableOp]() defines a seek table API that mirrors existing seek table operations.
* Existing ZSTD_seekable_[eachSeekTableOp]() retained; they delegate to the ZSTD_seekTable variant.

These changes allow the above-mentioned use cases to initialize a ZSTD_seekable, extract its ZSTD_seekTable, then throw the ZSTD_seekable away to save memory. Standard ZSTD operations can then be used to decompress frames based on seek table offsets.

The copy and delegate patterns are intended to minimize impact on existing code and clients. Using copy instead of move for the infrequent operation of extracting a seek table ensures that the extraction does not render the ZSTD_seekable useless. Delegating to *new* seek table-oriented APIs ensures that this is not a breaking change for existing clients while supporting all meaningful operations that depend only on seek table data.
[contrib] Support seek table-only API
read-only objects are properly const-ified in parameters
Seekable hang fix
and simple roundtrip test
New direct seekTable access methods
It is a stack high-point for some compression strategies and has an easy fix. This moves the normalized count into the entropy workspace.
Reduce stack usage of ZSTD_buildCTable()
This saves ~700 bytes of stack space in HUF_writeCTable.
Add HUF_writeCTable_wksp() function
* Use `HUF_readStats_wksp()`
* Use workspace in `HUF_fillDTableX2*()`
* Clean up workspace usage to use a workspace struct
* Move `counting` into the workspace
* Increase `HUF_DECOMPRESS_WORKSPACE_SIZE` by 512 bytes
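The stack-reduction commits above share one pattern: a scratch array that used to be a local variable is moved into a caller-provided workspace. The sketch below illustrates that pattern only. It is not the code of `HUF_writeCTable_wksp()` or `ZSTD_buildCTable()`, and every name prefixed with `sketch_` or `SKETCH_` is hypothetical.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define SKETCH_MAX_SYMBOL 255

/* Hypothetical workspace layout: the 1 KB histogram lives here
 * instead of on the callee's stack. */
typedef struct {
    unsigned count[SKETCH_MAX_SYMBOL + 1];
} sketch_histWksp;

/* Computes the count of the most frequent byte in src.
 * Returns 0 on success, -1 if the workspace is too small or misaligned. */
static int sketch_countMax_wksp(unsigned* maxCountPtr,
                                const unsigned char* src, size_t srcSize,
                                void* workspace, size_t wkspSize)
{
    sketch_histWksp* w;
    unsigned max = 0;
    size_t i;

    if (wkspSize < sizeof(sketch_histWksp)) return -1;
    if (((uintptr_t)workspace % sizeof(unsigned)) != 0) return -1;
    w = (sketch_histWksp*)workspace;

    memset(w->count, 0, sizeof(w->count));
    for (i = 0; i < srcSize; i++) w->count[src[i]]++;
    for (i = 0; i <= SKETCH_MAX_SYMBOL; i++) {
        if (w->count[i] > max) max = w->count[i];
    }
    *maxCountPtr = max;
    return 0;
}
```

The caller decides where the workspace lives (heap, a larger context struct, or a shared entropy workspace), so the callee's stack frame stays small regardless of the table size.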
doc: ZSTD_free*() functions accept NULL pointer
Make the number of physical CPU cores detection more robust
This commit introduces a GitHub action that is triggered on release creation, which creates the release tarball, compresses it, hashes it, signs it, and attaches all of those files to the release.
changed strategy, now unconditionally prefetch the first 2 cache lines, instead of cache lines corresponding to the first and last bytes of the match. This better corresponds to cpu expectation, which should auto-prefetch following cachelines on detecting the sequential nature of the read. This is globally positive, by +5%, though exact gains depend on compiler (from -2% to +15%). The only negative counter-example is gcc-9.
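The two prefetch strategies compared above can be sketched as follows. This is an illustration under stated assumptions, not the decoder's actual code: the 64-byte cache line size, the macro, and both function names are hypothetical, and `__builtin_prefetch` is the GCC/clang builtin.

```c
#include <stddef.h>

/* Assumption: 64-byte cache lines, typical for x86-64 and many ARM64 cores. */
#define SKETCH_CACHELINE_SIZE 64

/* Read prefetch with high temporal locality; no-op on other compilers. */
#if defined(__GNUC__) || defined(__clang__)
#  define SKETCH_PREFETCH(ptr) __builtin_prefetch((ptr), 0, 3)
#else
#  define SKETCH_PREFETCH(ptr) ((void)(ptr))
#endif

/* Previous strategy: prefetch the lines holding the first and last bytes. */
static void sketch_prefetchFirstLast(const void* match, size_t matchLength)
{
    const char* const p = (const char*)match;
    SKETCH_PREFETCH(p);
    if (matchLength > 0) SKETCH_PREFETCH(p + matchLength - 1);
}

/* New strategy: unconditionally prefetch the first two cache lines and
 * let the hardware prefetcher follow the sequential read from there.
 * The caller is assumed to pass a buffer of at least two cache lines. */
static void sketch_prefetchFirstTwoLines(const void* match)
{
    const char* const p = (const char*)match;
    SKETCH_PREFETCH(p);
    SKETCH_PREFETCH(p + SKETCH_CACHELINE_SIZE);
}
```

A prefetch is only a hint, so neither function changes program results; the difference shows up in timing alone, which is consistent with the compiler-dependent gains reported above.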
…_prefetch_refactor
This seems to bring an additional ~+1.2% decompression speed on average across 10 compilers x 6 scenarios.
Refactor prefetching for the decoding loop
The new alignment setting is better for gcc-9 and gcc-10 by about +5%. Unfortunately, it's worse for essentially all other compilers. Make the new alignment setting conditional on gcc-9+.
Apply flags to libzstd-nomt in libzstd style
improved gcc-9 and gcc-10 decoding speed
When running armv6 userspace on armv8 hardware with a 64 bit Linux kernel, the mode 2 caused SIGBUS (unaligned memory access). Running all our arm builds in the build farm only on armv8 simplifies administration a lot. Depending on compiler and environment, this change might slow down memory accesses (did not benchmark it). The original analysis is 6 years old. Fixes #2632
Avoid SIGBUS on armv6
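For context, a direct pointer cast on an unaligned address is the kind of access that can raise SIGBUS on strict-alignment hardware. A common portable alternative is to go through `memcpy`, which compilers usually lower to a plain load where that is safe. The sketch below assumes that approach; the function name is hypothetical and this is not necessarily what the commit itself changes.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: read a 32-bit value from a possibly unaligned
 * address without dereferencing a misaligned pointer. */
static uint32_t sketch_read32(const void* memPtr)
{
    uint32_t value;
    memcpy(&value, memPtr, sizeof(value));
    return value;
}
```

The trade-off mentioned in the commit message applies here as well: depending on compiler and target, this can be slower than a direct access, so it is worth benchmarking.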
As expected, extended fuzzer tests started during the weekend have not found anything so far.
This seems good to go.
On Windows 10, this release may have a performance regression. Just replace the
This change is missing from the changelog: [1.5.0] Enable multithreading in lib build by default (#2584)
and restored limit to 256 when in 64-bit mode (it was reduced to 200 to give more room for 32-bit). This should fix test instability issues when using a lot of threads in 32-bit environments.
With small enough input files, the inferred value of fileWindowLog could be smaller than ZSTD_WINDOWLOG_MIN. This can be reproduced like so:

    $ echo abc > small
    $ echo abcdef > small2
    $ zstd --patch-from small small2 -o patch

Previously, this would fail with the error "zstd: error 11 : Parameter is out of bound".
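One plausible way to avoid this kind of failure is to clamp the inferred window log to the library minimum. The sketch below is an assumption about the shape of such a fix, not the actual patch: the helper names are hypothetical, and the constant mirrors `ZSTD_WINDOWLOG_MIN`, which `zstd.h` defines as 10.

```c
/* Assumed lower bound, mirroring ZSTD_WINDOWLOG_MIN (10) from zstd.h. */
#define SKETCH_WINDOWLOG_MIN 10u

/* Index of the highest set bit of v (v must be non-zero). */
static unsigned sketch_highbit64(unsigned long long v)
{
    unsigned n = 0;
    while (v >>= 1) n++;
    return n;
}

/* Hypothetical helper: pick a window log whose window covers a
 * reference file of fileSize bytes, never going below the minimum. */
static unsigned sketch_inferWindowLog(unsigned long long fileSize)
{
    unsigned const log = (fileSize == 0) ? 0 : sketch_highbit64(fileSize) + 1;
    return (log < SKETCH_WINDOWLOG_MIN) ? SKETCH_WINDOWLOG_MIN : log;
}
```

With the 4-byte file from the reproduction, the unclamped value would be 3, well under the minimum; clamping raises it to 10 so that parameter validation no longer rejects it.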
reduce ZSTDMT_NBWORKERS_MAX in 32-bit mode
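A pointer-width-dependent limit of this kind can be expressed with a compile-time `sizeof` test. The sketch below is illustrative only: the 64-bit value of 256 comes from the commit message, while the 32-bit value of 64 and both `sketch_` names are assumptions.

```c
#include <stddef.h>

/* 256 in 64-bit mode per the commit message; the 32-bit value (64)
 * is an assumed, illustrative figure. */
#define SKETCH_NBWORKERS_MAX ((sizeof(void*) == 4) ? 64u : 256u)

/* Hypothetical helper: cap a requested worker count at the limit. */
static unsigned sketch_clampNbWorkers(unsigned requested)
{
    unsigned const max = SKETCH_NBWORKERS_MAX;
    return (requested > max) ? max : requested;
}
```

Because `sizeof(void*)` is a constant expression, the compiler resolves the limit at build time and no runtime check of the address mode is needed.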
hopefully, bionic will have a more recent version of python required to install meson.
Fixed meson test on travisCI
Changelog
ZSTD_defaultCLevel()
ZSTD_getDictID_fromCDict()
ZSTD_compress_advanced()
ZSTD_compress_usingCDict_advanced()
ZSTD_compressBegin_advanced()
ZSTD_compressBegin_usingCDict_advanced()
ZSTD_initCStream_srcSize()
ZSTD_initCStream_usingDict()
ZSTD_initCStream_usingCDict()
ZSTD_initCStream_advanced()
ZSTD_initCStream_usingCDict_advanced()
ZSTD_resetCStream()
clang and for --long modes (faster speed for decompressSequencesLong #2614, improved gcc-9 and gcc-10 decoding speed #2630, @Cyan4973)
ZSTD_entropyCost(), fix superblocks no sequences case #2592, @senhuang42)
ZSTD_estimateCCtxSize*() monotonically increases with compression level (Add memory monotonicity test over srcSize #2538, @senhuang42)
zdict.h dictionary training API documentation ([zdict] Add a FAQ to the top of zdict.h #2622, @terrelln)
ZSTD_free*() functions accept NULL pointers (doc: ZSTD_free*() functions accept NULL pointer #2521, @animalize)
zstd_errors.h and zdict.h to lib/ root ([1.5.0] Move zstd_errors.h and zdict.h to lib/ root #2597, @terrelln)
build/ directory (Move Single-File Build Script from contrib/ to build/ #2618, @felixhandte)
ZSTDMT_JOBSIZE_MIN to be configured at compile-time, reduce default to 512KB (allow jobSize to be as low as 512 KB #2611, @Cyan4973)
ZBUFF_*() is no longer built by default ([1.5.0] Remove ZBUFF #2583, @senhuang42)
md5 on Darwin (Detect Presence of md5 on Darwin #2609, @felixhandte)
--progress flag added to always display progress bar (Add --progress flag #2595, @senhuang42)
--force (Allow Reading from Block Devices with --force #2613, @felixhandte)
--filelist end-of-line bug (fix --filelist compatibility with Windows cr+lf line ending #2620, @Cyan4973)