[zstd_stream] Don't block in reader.Read if a zstd block is available #96
Changes from all commits
42c5dcb
0b51bae
So I think you are correct that if `r.decompSize == len(r.decompressionBuffer)`, then zstd should output a decompressed block without additional input (except if its estimation was not correct, but in that case it will ask for more data on the next loop).

But I don't think the opposite is correct though, i.e. `r.decompSize < len(r.decompressionBuffer)` does not mean "without any additional input, no output will be produced". From the documentation (https://github.com/DataDog/zstd/blob/1.x/zstd.h#L777), a previous output could actually produce many blocks even if no additional input is given (and `retCode` is only a hint), so I think it would be safer to defer the return of `io.EOF` until after the actual call to C zstd, if `err == EOF` and `compressionLeft == 0` (no compression left).

You might have more background than me on non-EOFed pipes, but my understanding is that we should get (and return) EOF only when the pipe will not provide data anymore at any point in the future.
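As an illustration of the "defer `io.EOF` until after the decoder call" idea above, here is a minimal sketch in plain Go. It does not use the real C zstd API: `toyDecoder`, `reader`, and `decodeAll` are hypothetical names, and the "decoder" simply buffers its input and hands it back, which is only an assumption chosen to mimic a decoder that can still hold output after the source is exhausted.

```go
package main

import (
	"fmt"
	"io"
	"strings"
)

// toyDecoder is a hypothetical stand-in for the C zstd stream: it buffers
// whatever input it is given and hands it back in output-sized chunks.
type toyDecoder struct{ pending []byte }

// step feeds in to the decoder and writes as much output as fits into out.
func (d *toyDecoder) step(in, out []byte) int {
	d.pending = append(d.pending, in...)
	n := copy(out, d.pending)
	d.pending = d.pending[n:]
	return n
}

// reader remembers the source error instead of returning it immediately.
type reader struct {
	src    io.Reader
	dec    toyDecoder
	srcErr error
}

func (r *reader) Read(p []byte) (int, error) {
	if len(p) == 0 {
		return 0, nil
	}
	for {
		var in []byte
		if r.srcErr == nil {
			buf := make([]byte, 8)
			n, err := r.src.Read(buf)
			in, r.srcErr = buf[:n], err
		}
		// Always give the decoder a chance to flush, even after the source
		// reported EOF.
		if n := r.dec.step(in, p); n > 0 {
			return n, nil
		}
		// Only surface the source error once the decoder produced nothing.
		if r.srcErr != nil {
			return 0, r.srcErr
		}
	}
}

// decodeAll drains the reader using a small output buffer of size chunk.
func decodeAll(data string, chunk int) (string, error) {
	r := &reader{src: strings.NewReader(data)}
	var out []byte
	p := make([]byte, chunk)
	for {
		n, err := r.Read(p)
		out = append(out, p[:n]...)
		if err == io.EOF {
			return string(out), nil
		}
		if err != nil {
			return string(out), err
		}
	}
}

func main() {
	out, err := decodeAll("hello zstd stream", 3)
	fmt.Println(out, err) // hello zstd stream <nil>
}
```

With a 3-byte output buffer the source hits EOF while the toy decoder still holds several bytes, and the reader keeps returning them before it finally reports `io.EOF`.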
I'm initially thinking of changing the condition to:
My understanding is that if `r.decompSize < len(r.decompressionBuffer)`, i.e. if `output.pos < output.size`, the decoder flushed all it could (see https://github.com/DataDog/zstd/blob/1.x/zstd.h#L774); in other words, the decoder needs more data in order to output anything else.

The line you linked seems to refer to a special condition where `len(r.decompressionBuffer) > ZSTD_BLOCKSIZE_MAX`. However, this will never be the case, because the decompression buffer is allocated by a pool to a size of `ZSTD_DStreamOutSize`, which is == `ZSTD_BLOCKSIZE_MAX`. (Which is the purpose of using `ZSTD_DStreamOutSize()`.)

So the condition looks correct to me as is. What do you think?
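The rule being discussed (a short write means the decoder has nothing left to flush, a full write means it may) can be sketched with a toy model. This is not the zstd decoder: `toyDecoder`, `flush`, and `fullyFlushed` are hypothetical names, and the model assumes a decoder that always copies as much buffered output as fits.

```go
package main

import "fmt"

// toyDecoder is a hypothetical model: it holds already-decoded bytes that
// have not been handed to the caller yet.
type toyDecoder struct{ buffered []byte }

// flush copies as much buffered output as fits into out and returns the count.
func (d *toyDecoder) flush(out []byte) int {
	n := copy(out, d.buffered)
	d.buffered = d.buffered[n:]
	return n
}

// fullyFlushed mirrors the condition output.pos < output.size: if the output
// buffer was not filled, the decoder had nothing more to give.
func fullyFlushed(n, outSize int) bool { return n < outSize }

func main() {
	out := make([]byte, 4)
	d := &toyDecoder{buffered: []byte("abcde")}

	n := d.flush(out)
	fmt.Println(n, fullyFlushed(n, len(out))) // 4 false

	n = d.flush(out)
	fmt.Println(n, fullyFlushed(n, len(out))) // 1 true
}
```

A full write is only a hint: if the buffered data had been exactly 4 bytes, the first call would still report `false` and the next call would return 0 bytes, which matches the "next loop it will ask for more data" case mentioned earlier in the thread.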
Indeed! I didn't read up on `ZSTD_DStreamOutSize == ZSTD_BLOCKSIZE_MAX`, but yes, you are correct.
Thinking also about an input stream that got accidentally cut off (so it starts returning `EOF` while we still have some partial zstd data), we could return an `io.ErrUnexpectedEOF`: https://golang.org/pkg/io/#pkg-variables
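To make the clean-EOF versus truncated-stream distinction concrete, here is a small sketch. It uses a toy frame format (a 1-byte length followed by that many payload bytes) rather than zstd, and `readFrame` and `countFrames` are hypothetical helpers; only `io.ReadFull`, `io.EOF`, and `io.ErrUnexpectedEOF` come from the standard library.

```go
package main

import (
	"bytes"
	"fmt"
	"io"
)

// readFrame reads one frame of a toy format: a 1-byte length header followed
// by that many payload bytes.
func readFrame(r io.Reader) ([]byte, error) {
	var hdr [1]byte
	if _, err := io.ReadFull(r, hdr[:]); err != nil {
		// io.EOF here means the stream ended cleanly between frames.
		return nil, err
	}
	payload := make([]byte, hdr[0])
	if _, err := io.ReadFull(r, payload); err != nil {
		// The header promised more data, so EOF at this point is a truncation.
		if err == io.EOF {
			err = io.ErrUnexpectedEOF
		}
		return nil, err
	}
	return payload, nil
}

// countFrames reads frames until an error occurs and returns how many
// complete frames were read along with the terminating error.
func countFrames(data []byte) (int, error) {
	r := bytes.NewReader(data)
	n := 0
	for {
		if _, err := readFrame(r); err != nil {
			return n, err
		}
		n++
	}
}

func main() {
	complete := []byte{2, 'h', 'i', 1, '!'}
	fmt.Println(countFrames(complete))     // 2 EOF
	fmt.Println(countFrames(complete[:4])) // 1 unexpected EOF
}
```

The same idea would apply to the zstd reader: a source EOF while the decoder is mid-frame is reported as `io.ErrUnexpectedEOF`, while a source EOF on a frame boundary stays a plain `io.EOF`.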
Added some code & comment about that. https://github.com/DataDog/zstd/pull/96/files#diff-ff3e40bda515a4af0936733c5415920eb09cc903913a7243bdec1074266784b0R470 Thoughts?