Pass through non-UTF8 bytes in lines preprocessor #107
Attempting to `CSV.decode` a stream that contains non-UTF8 bytes raises a `FunctionClauseError`. This makes it impossible to handle encoding errors per-line or use machinery like `Decoder`'s `replacement` option.

The code that would prevent this crash was accidentally deleted in 4f5069b because it is "unused" for files that only contain valid UTF8.
This PR restores the deleted clause and adds a high-level test; existing tests cover `Decoder` and `Lexer` but not the complete pipeline.
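
To illustrate the failure mode, here is a minimal sketch of a scanner that walks a line one code point at a time. `LinesSketch` and `count_quotes` are hypothetical names invented for this example, not the library's actual preprocessor code, and the clause restored by this PR may look different. The sketch only shows why a byte-level fallback clause is needed once input can contain invalid UTF8.

```elixir
defmodule LinesSketch do
  # Hypothetical stand-in for a character-by-character line scanner.
  # It counts the double quotes in a line.
  def count_quotes(line), do: count_quotes(line, 0)

  # A double quote, matched as a UTF-8 code point.
  defp count_quotes(<<?"::utf8, rest::binary>>, n), do: count_quotes(rest, n + 1)

  # Any other valid UTF-8 code point.
  defp count_quotes(<<_char::utf8, rest::binary>>, n), do: count_quotes(rest, n)

  # Fallback: a byte that does not start a valid UTF-8 sequence is skipped,
  # so it passes through untouched. Without this clause, such input matches
  # no clause and the call raises FunctionClauseError.
  defp count_quotes(<<_byte, rest::binary>>, n), do: count_quotes(rest, n)

  # End of input.
  defp count_quotes(<<>>, n), do: n
end

# 0xFF is never valid in UTF-8, so only the fallback clause can consume it.
LinesSketch.count_quotes(<<"a,\"b", 0xFF, "\"\n">>)
```

With the fallback in place, the scanner leaves the decision about invalid bytes to a later stage, which is what allows errors to be reported or replaced per line instead of crashing the whole stream.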