
Zstd compression level always reported as 1? #53

Closed
javafanboy opened this issue Aug 19, 2024 · 3 comments

Comments

@javafanboy

I created a bunch of Parquet files with Zstd compression and tried different levels. The time taken and the file sizes differed, but PQRS always reported Zstd(level 1) when I ran schema --detailed.

This is on a Mac (macOS, M3).

@ttencate

ttencate commented Aug 19, 2024

Heh, I noticed this too last week, and just now spotted your issue. I think I can explain. The compression level that was used to create the file isn't actually stored in the Parquet file anywhere. Level 1 is reported because the parquet library requires a compression level in the enum, and 1 is the default for ZstdLevel.

pqrs could fix this without any changes needed to the upstream parquet library by omitting everything from the first ( onwards. I'll send a PR.

Edit: actually the printing does come from upstream, here. I'll send the PR there :)
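
To illustrate the explanation above, here is a minimal, self-contained sketch. It is a simplified model, not the parquet crate's actual code: `StoredCodec`, `to_stored`, `from_stored` and `without_level` are hypothetical names made up for this example, and the `Compression` and `ZstdLevel` types here are stand-ins that only mimic the behaviour described (the level lives in the in-memory enum, the file records only the codec, and the default level is 1).

```rust
// Simplified, hypothetical model of the behaviour described in this thread.
// These are NOT the parquet crate's real types or functions.

/// Stand-in for a Zstd level wrapper whose default is 1.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
struct ZstdLevel(i32);

impl Default for ZstdLevel {
    fn default() -> Self {
        ZstdLevel(1)
    }
}

/// In-memory enum: the Zstd variant must carry some level.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Compression {
    Uncompressed,
    Zstd(ZstdLevel),
}

/// What the file is assumed to record: the codec only, no level.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum StoredCodec {
    Uncompressed,
    Zstd,
}

/// Writing drops the level, because the file has no place to keep it.
fn to_stored(c: Compression) -> StoredCodec {
    match c {
        Compression::Uncompressed => StoredCodec::Uncompressed,
        Compression::Zstd(_) => StoredCodec::Zstd,
    }
}

/// Reading has to invent a level, so it falls back to the default.
fn from_stored(s: StoredCodec) -> Compression {
    match s {
        StoredCodec::Uncompressed => Compression::Uncompressed,
        StoredCodec::Zstd => Compression::Zstd(ZstdLevel::default()),
    }
}

/// The workaround suggested above: cut the printed form at the first '('.
fn without_level(c: Compression) -> String {
    let s = format!("{:?}", c);
    match s.find('(') {
        Some(i) => s[..i].to_string(),
        None => s,
    }
}

fn main() {
    let written = Compression::Zstd(ZstdLevel(19));
    let read_back = from_stored(to_stored(written));
    println!("{:?}", read_back); // prints "Zstd(ZstdLevel(1))"
    println!("{}", without_level(read_back)); // prints "Zstd"
    println!("{}", without_level(from_stored(to_stored(Compression::Uncompressed))));
}
```

In this model a file written at level 19 still reads back as level 1, which matches what was reported. Printing only the codec name avoids showing a level that was never read from the file.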

@ttencate

It's been fixed upstream. I suppose this issue can be closed, or do you want to wait for a release?

@javafanboy
Author

No, that is fine. I mostly wanted a confirmation: this "1" sent me on a chase for errors in my code, since I assumed it was not setting the desired compression level ...
