Add xz decoder #14434

Merged: 5 commits, Jan 26, 2023

Conversation

FnControlOption (Contributor) commented Jan 23, 2023

Initial implementation for #2851

Currently this only supports files that contain a single xz stream and use no filters other than LZMA2 (which seems to be the most common use case).

File format: https://tukaani.org/xz/xz-file-format.txt (markdown)

LZMA2 code ported from https://github.com/gendx/lzma-rs

Test files from https://github.com/xz-mirror/xz/tree/master/tests/files
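
For readers unfamiliar with the container, here is a minimal Python sketch of the 12-byte xz stream header check described in the linked file format document: magic bytes, two stream-flag bytes, and a little-endian CRC32 of those flags. It is not the Zig code from this PR; `parse_stream_header`, `BadHeader`, and `CHECK_NAMES` are hypothetical names used only for illustration.

```python
import lzma
import struct
import zlib

# Magic bytes that open every xz stream: FD 37 7A 58 5A 00.
XZ_MAGIC = b"\xfd7zXZ\x00"

# Check IDs defined by the xz file format; other IDs are reserved.
CHECK_NAMES = {0x00: "none", 0x01: "crc32", 0x04: "crc64", 0x0A: "sha256"}


class BadHeader(Exception):
    """Hypothetical error type, named after the error seen in the CI log."""


def parse_stream_header(data: bytes) -> str:
    """Validate the 12-byte stream header and return the check name."""
    if len(data) < 12 or data[:6] != XZ_MAGIC:
        raise BadHeader("magic bytes do not match")
    flags = data[6:8]
    # The CRC32 of the two flag bytes is stored little-endian.
    (stored_crc,) = struct.unpack("<I", data[8:12])
    if zlib.crc32(flags) != stored_crc:
        raise BadHeader("stream flags CRC32 mismatch")
    # First flag byte and the high nibble of the second are reserved (zero).
    if flags[0] != 0 or flags[1] & 0xF0:
        raise BadHeader("reserved stream flag bits are set")
    check_id = flags[1] & 0x0F
    return CHECK_NAMES.get(check_id, f"reserved-{check_id}")


if __name__ == "__main__":
    blob = lzma.compress(b"hello", check=lzma.CHECK_CRC32)
    print(parse_stream_header(blob))
```

A decoder with the limitations listed above would go on to read exactly one stream after this header and reject any block whose filter chain is not LZMA2 alone.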

andrewrk (Member)

LZMA is useful independently from xz, is it not?

FnControlOption (Contributor, Author) commented Jan 24, 2023

Kinda. The original LZMA (aka LZMA1) is also used by lzip (and probably other things) while LZMA2 was developed solely for xz. Or at least, that's my understanding. From the lzma-rs code, it seems there's some overlap in decoding LZMA1 and LZMA2, but the LZMA format is poorly documented so tbh that doesn't sound fun to me to investigate lol. Should I delete lzma.zig and only support uncompressed LZMA2 streams for now? Or perhaps we should revisit this after proper LZMA1 and/or LZMA2 support is added?

andrewrk (Member) commented Jan 24, 2023

I am perfectly content with xz-specific lzma code, at the very least for the time being.

P.S. can you please add 4 more lines to your diff? (just kidding) 😜

andrewrk (Member)

Nice work.

I added a commit to your branch, making some executive decisions to better fit this implementation into the standard library API (as well as a breaking change to gzip to go with it). I hope you will find them to your liking.

Additionally, I added another commit to the branch, hooking up the new xz implementation into the package manager, enabling fetching of .tar.xz files. So, this PR closes #14300 in addition to #2851.

FnControlOption and others added 4 commits January 24, 2023 15:24
 * add xz to std.compress
 * prefer importing std.zig by file name, to reduce reliance on the
   standard library being a special case.
 * extract some types from inside generic functions. These types are the
   same regardless of the generic parameters.
 * expose some more types in the std.compress.xz namespace.
 * rename xz.stream to xz.decompress
 * rename check.Kind to Check
 * use std.leb for LEB instead of a redundant implementation
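
As background for the last bullet: the xz format stores sizes as variable-length integers, 7 bits per byte with the high bit marking continuation, which is the unsigned LEB128 layout. The Python sketch below illustrates that decoding. It is not `std.leb` and not code from this PR; `decode_uleb128` is a hypothetical helper, and the 9-byte cap is taken from the xz file format document.

```python
def decode_uleb128(data: bytes, max_bytes: int = 9) -> tuple[int, int]:
    """Decode an unsigned LEB128 integer; return (value, bytes_consumed)."""
    value = 0
    for i, byte in enumerate(data[:max_bytes]):
        # The low 7 bits carry payload, least significant group first.
        value |= (byte & 0x7F) << (7 * i)
        # A clear high bit marks the final byte of the integer.
        if (byte & 0x80) == 0:
            return value, i + 1
    raise ValueError("truncated or overlong LEB128 integer")


if __name__ == "__main__":
    print(decode_uleb128(b"\xe5\x8e\x26"))
```

The xz format also rejects non-minimal encodings (a trailing zero byte); this sketch leaves that check out for brevity.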
This includes a breaking change:

std.compress.gzip.GzipStream renamed to
std.compress.gzip.Decompress

This follows the same naming convention as std.compress.xz so that the
stream type can be passed as a comptime parameter.

FnControlOption (Contributor, Author) commented Jan 25, 2023

Looks good! No idea why CI is failing though since the tests are passing on both my macOS and Windows machines

andrewrk (Member) commented Jan 25, 2023

The version of your patch that passed the CI checks was missing the line in compress.zig that exposes xz, so the CI wasn't actually testing it at all. Now that the CI is actually running the new code, it found this:

2023-01-25T00:16:04.8712159Z 1711/2400 test.compressed data... FAIL (BadHeader)
2023-01-25T00:16:06.9388340Z /home/ci/actions-runner/_work/zig/zig/lib/std/compress/xz.zig:55:17: 0x1657ea7 in init (test)
2023-01-25T00:16:06.9392790Z                 return error.BadHeader;
2023-01-25T00:16:06.9394804Z                 ^
2023-01-25T00:16:07.4059531Z /home/ci/actions-runner/_work/zig/zig/lib/std/compress/xz.zig:34:5: 0x1657cef in decompress__anon_128280 (test)
2023-01-25T00:16:07.4060059Z     return Decompress(@TypeOf(reader)).init(allocator, reader);
2023-01-25T00:16:07.4060363Z     ^
2023-01-25T00:16:07.8704360Z /home/ci/actions-runner/_work/zig/zig/lib/std/compress/xz/test.zig:8:21: 0x165ae83 in decompress (test)
2023-01-25T00:16:07.8704949Z     var xz_stream = try xz.decompress(testing.allocator, in_stream.reader());
2023-01-25T00:16:07.8705291Z                     ^
2023-01-25T00:16:08.3345492Z /home/ci/actions-runner/_work/zig/zig/lib/std/compress/xz/test.zig:15:17: 0x165b82b in testReader__anon_128279 (test)
2023-01-25T00:16:08.3345997Z     const buf = try decompress(data);
2023-01-25T00:16:08.3346277Z                 ^
2023-01-25T00:16:08.7970138Z /home/ci/actions-runner/_work/zig/zig/lib/std/compress/xz/test.zig:22:5: 0x165c9d3 in test.compressed data (test)
2023-01-25T00:16:08.7970769Z     try testReader(@embedFile("testdata/good-0-empty.xz"), "");
2023-01-25T00:16:08.7971099Z     ^
2023-01-25T00:16:08.7973502Z 1712/2400 test.unsupported... expected error.Unsupported, found error.BadHeader
2023-01-25T00:16:08.7974036Z FAIL (TestExpectedError)
2023-01-25T00:16:09.2618280Z /home/ci/actions-runner/_work/zig/zig/lib/std/compress/xz.zig:55:17: 0x1657ea7 in init (test)
2023-01-25T00:16:09.2618720Z                 return error.BadHeader;
2023-01-25T00:16:09.2618980Z                 ^
2023-01-25T00:16:09.7292202Z /home/ci/actions-runner/_work/zig/zig/lib/std/compress/xz.zig:34:5: 0x1657cef in decompress__anon_128280 (test)
2023-01-25T00:16:09.7292712Z     return Decompress(@TypeOf(reader)).init(allocator, reader);
2023-01-25T00:16:09.7293024Z     ^
2023-01-25T00:16:10.1991481Z /home/ci/actions-runner/_work/zig/zig/lib/std/compress/xz/test.zig:8:21: 0x165ae83 in decompress (test)
2023-01-25T00:16:10.1992075Z     var xz_stream = try xz.decompress(testing.allocator, in_stream.reader());
2023-01-25T00:16:10.1992414Z                     ^
2023-01-25T00:16:10.6759206Z /home/ci/actions-runner/_work/zig/zig/lib/std/testing.zig:37:13: 0x165d0b7 in expectError__anon_128536 (test)
2023-01-25T00:16:10.6759686Z             return error.TestExpectedError;
2023-01-25T00:16:10.6759969Z             ^
2023-01-25T00:16:11.1468891Z /home/ci/actions-runner/_work/zig/zig/lib/std/compress/xz/test.zig:75:9: 0x165d3a3 in test.unsupported (test)
2023-01-25T00:16:11.1469469Z         try testing.expectError(
2023-01-25T00:16:11.1469738Z         ^
2023-01-25T00:16:12.4551099Z 2333 passed; 65 skipped; 2 failed.
2023-01-25T00:16:12.4661738Z error: the following test command failed with exit code 1:
2023-01-25T00:16:12.4662357Z qemu-mips /home/ci/actions-runner/_work/zig/zig/build-debug/zig-local-cache/o/640e7a9605343bd9fee3884791bdde4d/test
2023-01-25T00:16:12.4722885Z error: test...
2023-01-25T00:16:12.4723136Z error: The following command exited with error code 1:
2023-01-25T00:16:12.4725213Z /home/ci/actions-runner/_work/zig/zig/build-debug/stage3-debug/bin/zig test /home/ci/actions-runner/_work/zig/zig/lib/std/std.zig --test-name-prefix std-mips-linux-none-Debug-bare-multi-default  --cache-dir /home/ci/actions-runner/_work/zig/zig/build-debug/zig-local-cache --global-cache-dir /home/ci/actions-runner/_work/zig/zig/build-debug/zig-global-cache --name test -fno-single-threaded -target mips-linux-none -mcpu mips32 --test-cmd qemu-mips --test-cmd-bin -I /home/ci/actions-runner/_work/zig/zig/test -L /home/ci/deps/zig+llvm+lld+clang-x86_64-linux-musl-0.11.0-dev.971+19056cb68/lib -I /home/ci/deps/zig+llvm+lld+clang-x86_64-linux-musl-0.11.0-dev.971+19056cb68/include --zig-lib-dir /home/ci/actions-runner/_work/zig/zig/lib --enable-cache 

Note the target: mips-linux-none

It failed on the first big-endian target that is checked by the CI.
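
To illustrate why a big-endian target can trip over a header check like this, here is a small Python sketch. It assumes, without confirmation from the log above, that the failure comes from reading a little-endian on-disk field with the wrong byte order. `header_crc_ok` is a hypothetical helper, not code from this PR.

```python
import lzma
import zlib


def header_crc_ok(header: bytes, byteorder: str) -> bool:
    """Compare the stored stream-flags CRC32, read with the given byte order."""
    flags = header[6:8]
    stored = int.from_bytes(header[8:12], byteorder)
    return zlib.crc32(flags) == stored


if __name__ == "__main__":
    # First 12 bytes of any xz stream: magic, stream flags, CRC32 of flags.
    header = lzma.compress(b"")[:12]
    # The format stores the CRC32 little-endian, so this check passes.
    print("little:", header_crc_ok(header, "little"))
    # Reading the same bytes big-endian gives a different number, which is
    # what a native-endian read would produce on a big-endian CPU.
    print("big:", header_crc_ok(header, "big"))
```

Code that always reads the field as little-endian behaves the same on every host, which is why a bug of this kind would only show up once a big-endian target such as mips runs the tests.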

You can test this target locally on Linux, if you install QEMU:

[nix-shell:~/dev/zig/build-release]$ stage3/bin/zig test ../lib/std/std.zig -target mips-linux 
warning: the host system (x86_64-linux.5.15.82...5.15.82-gnu.2.35) does not appear to be capable of executing binaries from the target (mips-linux.3.16...5.10.81-musl). Consider using '--test-cmd qemu-mips --test-cmd-bin' to run the tests
error: the following command failed with 'InvalidExe':
/home/andy/dev/zig/zig-cache/o/6f1b172430cddcf12b9304a8f2057769/test

Follow the advice printed at the end. Example:

qemu-mips /home/andy/dev/zig/zig-cache/o/6f1b172430cddcf12b9304a8f2057769/test

or add it to your original invocation:

stage3/bin/zig test ../lib/std/std.zig -target mips-linux --test-cmd qemu-mips --test-cmd-bin

auto-merge was automatically disabled January 25, 2023 16:49

Head branch was pushed to by a user without write access
