Home
For test results, look here.
For comparison we used a Win32 assembly-based optimized version (`Asm`), a Win32 C-based optimized version (`C`), a Win32 DLL-based version (`DLL`), and a Win32 C-based version built from the original sources (`Orig`). The DLL-based version used a prebuilt zlib 1.2.3 obtained from the WinImage zlib page. These DLL files were built from zlib with the official assembly patches.
The performance-oriented zlib fork zlib-ng was also tested.
Tests were built with Visual Studio 2010 with speed optimization enabled. The test system runs Windows 8.1 64-bit on an Intel Core i7-4702MQ @ 2.2 GHz.
For testing we used three data sets:
- Unreal Engine 4 source code (Engine/Source)
- Unreal Engine 4 binaries (Engine/Binaries/Win64)
- Graphical files containing geometry and textures in tif and png formats.
Each data set contains up to 256 Mb of data (the Unreal Engine 4 source code is smaller, at 137 Mb).
For the 64-bit target, the original C implementation ("Orig") is 6-10% slower than the 32-bit build. The same holds for the "DLL" build. However, the optimized C version ("C") built for 64 bits is faster than the 32-bit assembly code, so the code gets an additional performance boost (compared to the "C" version) simply from being 64-bit.
For the 32-bit target, the optimized assembly version is ~5% faster than the optimized C version, so the performance of the 64-bit "C" version falls somewhere between 32-bit "Asm" and 32-bit "C".
I've tested the library on 32-bit Ubuntu, compiled with GCC 5.4.0. The optimized code performs 1-4% slower than the Win32 version. The original zlib implementation performs 7-8% slower.
Test mode is the name of the compared version, as listed in the "Compared versions" paragraph. The number that follows is the compression level. Each table cell contains data in the following format: `<elapsed time> / <compressed size> / <compression speed>`.
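
The cell format can be reproduced from raw measurements. The following is a minimal sketch, assuming that the compression speed is the uncompressed input size divided by the elapsed time and that "Mb" means 2^20 bytes. Neither detail is stated on this page, and `format_cell` is a hypothetical helper, not part of the test application.

```python
def format_cell(elapsed_s: float, compressed_bytes: int, input_bytes: int) -> str:
    """Render one cell as '<elapsed time> / <compressed size> / <compression speed>'."""
    # Assumption: 1 Mb = 2**20 bytes, and speed is measured against the
    # uncompressed input size, not the compressed output size.
    input_mb = input_bytes / 2**20
    speed_mb_s = input_mb / elapsed_s
    return f"{elapsed_s:.1f}s / {compressed_bytes} b / {speed_mb_s:.2f} Mb/s"


if __name__ == "__main__":
    # Hypothetical measurement: a 256 Mb input compressed in 12.8 seconds.
    print(format_cell(12.8, 54473779, 256 * 2**20))
```

Under these assumptions, the elapsed time and the speed in a cell determine the size of the uncompressed data set, which gives a quick consistency check for a table row.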
As you can see, the Asm version is only slightly faster than the C version. The optimized version performs nearly 2.5-10 times faster than the original C version. The slower the compression, the greater the performance improvement.
The test application was designed to exclude file access times as much as possible. File reading times are fully excluded; file writing times are still included, but we can expect the OS to buffer the writes and perform them asynchronously. The test application source code can be seen here.
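
The idea of keeping file access out of the measurement can be illustrated with a short sketch. It is not the test application itself: it uses Python's standard `zlib` module (which wraps the stock library, not the optimized build), the helper name `time_compression` is hypothetical, and unlike the test application it keeps the output in memory, so no write time is included either.

```python
import time
import zlib
from typing import Tuple


def time_compression(data: bytes, level: int) -> Tuple[float, int]:
    """Return (elapsed seconds, compressed size) for one in-memory compression."""
    # The input is already in memory, so only the compression call is timed.
    start = time.perf_counter()
    compressed = zlib.compress(data, level)
    elapsed = time.perf_counter() - start
    return elapsed, len(compressed)


if __name__ == "__main__":
    # In a real run the payload would be read from disk here, before timing starts.
    payload = b"zlib benchmark sample " * 100_000
    for level in (1, 6, 9):
        elapsed, size = time_compression(payload, level)
        print(f"level {level}: {elapsed:.4f}s, {size} bytes")
```

Loading the whole input before starting the timer is what removes read time from the result. Repeating each measurement several times and taking the minimum would further reduce noise from the OS scheduler.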
Current tests are for zlib version 1.2.11.
This release performs 15-35% faster than Release 1. The slower the compression, the greater the performance boost.
Test mode | Source code | Binaries | Geometry data |
---|---|---|---|
Asm -9 | 5.2s / 28858611 b / 26.49 Mb/s | 12.3s / 54473779 b / 20.74 Mb/s | 18.7s / 114915875 b / 13.72 Mb/s |
C -9 | 5.1s / 28858611 b / 27.08 Mb/s | 12.6s / 54473779 b / 20.29 Mb/s | 19.1s / 114915875 b / 13.42 Mb/s |
zlib-ng -9 | 9.5s / 28859066 b / 14.53 Mb/s | 45.3s / 54485028 b / 5.65 Mb/s | 128.1s / 114934697 b / 2.00 Mb/s |
Dll -9 | 9.6s / 28859103 b / 14.37 Mb/s | 42.0s / 54485836 b / 6.10 Mb/s | 150.0s / 114942962 b / 1.71 Mb/s |
Orig -9 | 10.7s / 28859103 b / 12.84 Mb/s | 52.6s / 54485836 b / 4.87 Mb/s | 184.4s / 114942962 b / 1.39 Mb/s |
Test mode | Source code | Binaries | Geometry data |
---|---|---|---|
Asm -9 | 11.8s / 51136013 b / 21.65 Mb/s | 15.4s / 54456154 b / 16.64 Mb/s | 25.4s / 114917505 b / 10.07 Mb/s |
C -9 | 12.3s / 51136013 b / 20.81 Mb/s | 16.1s / 54456154 b / 15.88 Mb/s | 26.6s / 114917505 b / 9.63 Mb/s |
Dll -9 | 22.6s / 51145811 b / 11.35 Mb/s | 42.0s / 54485836 b / 6.10 Mb/s | 150.0s / 114942962 b / 1.71 Mb/s |
Orig -9 | 26.3s / 51145811 b / 9.75 Mb/s | 52.6s / 54485836 b / 4.87 Mb/s | 184.4s / 114942962 b / 1.39 Mb/s |
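
The speedup of one version over another can be recomputed from the elapsed times in the tables. A minimal sketch, using the `Asm -9` and `Orig -9` rows of the first table; the `speedup` helper and the `FIRST_TABLE` dictionary are hypothetical names introduced only for this illustration.

```python
def speedup(baseline_s: float, optimized_s: float) -> float:
    """How many times faster the optimized run is than the baseline run."""
    return baseline_s / optimized_s


# Elapsed times in seconds, copied from the first table on this page.
FIRST_TABLE = {
    "Source code": {"Asm": 5.2, "Orig": 10.7},
    "Binaries": {"Asm": 12.3, "Orig": 52.6},
    "Geometry data": {"Asm": 18.7, "Orig": 184.4},
}

if __name__ == "__main__":
    for data_set, times in FIRST_TABLE.items():
        ratio = speedup(times["Orig"], times["Asm"])
        print(f"{data_set}: {ratio:.2f}x")
```

Because both runs compress the same input, the ratio of elapsed times equals the ratio of compression speeds, so either column of a cell can be used for the comparison.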