fix(parsers.csv): Allow null-delimiters #12247
Conversation
Hi, thanks for putting up this PR. It appears that this is essentially a "find and replace" before doing any parsing. Instead of having this hard-code the replacement, can we instead rename …? Essentially, if I had:

It would replace those null bytes with commas. What do you think about that? Before you go off and make changes, I would like to hear from @srebhan as well.
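For illustration, a minimal sketch of the "find and replace" behavior described above, with the null-to-comma replacement hard-coded; every name below is invented for the example and this is not the PR's actual code:

```go
package main

import (
	"encoding/csv"
	"fmt"
	"strings"
)

func main() {
	// A record that uses NUL bytes instead of commas as field separators.
	raw := "3.4\u000070\u0000test_name"

	// Hard-coded "find and replace": swap every null byte for a comma,
	// then hand the result to an ordinary CSV reader.
	replaced := strings.ReplaceAll(raw, "\u0000", ",")

	r := csv.NewReader(strings.NewReader(replaced))
	record, err := r.Read()
	if err != nil {
		panic(err)
	}
	fmt.Println(record) // [3.4 70 test_name]
}
```

One weakness of this hard-coded variant is that fields which legitimately contain commas would be split incorrectly, which is part of what the suggestion below avoids.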
@Meceron thanks for your PR! Instead of forcing the user to give another option, how about the following (in pseudo-code):
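The pseudo-code itself is not preserved here. A rough Go sketch of the flow the thread converges on, reconstructed as an assumption rather than the merged code: keep the single delimiter option, detect delimiters that Go's encoding/csv cannot accept, and transparently swap them for U+FFFD before parsing. The helper names are hypothetical.

```go
package main

import (
	"encoding/csv"
	"fmt"
	"strings"
	"unicode/utf8"
)

// The Unicode replacement character, assumed not to occur in real input.
const replacement = '\ufffd'

// validDelim mirrors the check encoding/csv applies to Reader.Comma.
func validDelim(r rune) bool {
	return r != 0 && r != '"' && r != '\r' && r != '\n' && utf8.ValidRune(r)
}

// parse is a hypothetical helper, not the parser's real API.
func parse(data, delimiter string) ([]string, error) {
	comma := []rune(delimiter)[0]
	if !validDelim(comma) {
		// The configured delimiter is unusable as csv.Reader.Comma:
		// rewrite the input and parse with U+FFFD instead.
		data = strings.ReplaceAll(data, delimiter, string(replacement))
		comma = replacement
	}
	r := csv.NewReader(strings.NewReader(data))
	r.Comma = comma
	return r.Read()
}

func main() {
	record, err := parse("3.4\u000070\u0000test_name", "\u0000")
	if err != nil {
		panic(err)
	}
	fmt.Println(record) // [3.4 70 test_name]
}
```

Unlike the hard-coded comma variant, replacing with U+FFFD keeps fields that legitimately contain commas intact, and the user never needs a new option.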
What do you think?
Thanks for the advice! I've changed the code.
Thanks for the quick turnaround! I have put up some comments that clarify the docs and rename the bool to `invalidDelimiter`. Take a look and let us know!
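For context, the renamed flag would sit on the parser roughly like this; a sketch, with the surrounding fields reduced to the ones quoted elsewhere in this thread and the exact layout treated as an assumption:

```go
package csv

// Parser is reduced to the fields relevant here; the real struct has more.
type Parser struct {
	Delimiter string `toml:"csv_delimiter"`

	// invalidDelimiter records, at Init() time, that Delimiter cannot be
	// used directly as csv.Reader.Comma and must be replaced before parsing.
	invalidDelimiter bool
}
```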
@Meceron any news on this PR?
Whoops, I didn't finish my review...
plugins/parsers/csv/parser.go (outdated)
```go
var replacementByte = "\ufffd"
var commaByte = "\u002C"
```
Maybe use const instead of var, and store these directly as byte arrays so no conversion is needed on usage?
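For reference, a sketch of that suggestion. One caveat: Go only allows constants of basic types, so a true []byte cannot be const; the choice is between string constants and var byte slices:

```go
// String constants work; indexing or slicing them yields bytes directly.
const (
	replacementByte = "\ufffd" // U+FFFD, the Unicode replacement character
	commaByte       = "\u002C" // ','
)

// A byte slice can avoid per-use conversions entirely, but it cannot be
// declared const in Go; it has to remain a var.
var replacementBytes = []byte("\ufffd")
```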
```go
err := p.Init()
require.NoError(t, err)

testCSV := strings.Join([]string{"3.4", "70", "test_name"}, "\u0000")
```
Suggested change:

```diff
- testCSV := strings.Join([]string{"3.4", "70", "test_name"}, "\u0000")
+ testCSV := "3.4\u000070\u0000test_name"
```
I prefer the existing version, as it makes it very clear what is going on, versus having to count digits to determine where a Unicode escape ends.
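To see the trade-off concretely, a standalone check (not part of the PR) showing the two spellings are the same string:

```go
package main

import (
	"fmt"
	"strings"
)

func main() {
	// \u always consumes exactly four hex digits, so "\u000070" is a NUL
	// byte followed by "70". Correct, but the reader has to count digits
	// to see where the escape ends; strings.Join makes the separator obvious.
	joined := strings.Join([]string{"3.4", "70", "test_name"}, "\u0000")
	literal := "3.4\u000070\u0000test_name"
	fmt.Println(joined == literal) // true
}
```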
Sorry, I was on sick leave :(. As I can see, you have all done everything already, thank you!
Co-authored-by: Sven Rebhan <36194019+srebhan@users.noreply.github.com>
I wanted to try to land this for today's release, so I accepted the changes.
Download PR build artifacts for linux_amd64.tar.gz, darwin_amd64.tar.gz, and windows_amd64.zip. 🥳 This pull request decreases the Telegraf binary size by 1.35 % for linux amd64 (new size: 157.2 MB, nightly size: 159.3 MB). 📦 Click here to get additional PR build artifacts.
@powersj my comments are still open and unaddressed...
@Meceron can you please at least make those two strings const?
This was done already; I accepted your change that did this. Edit: what Hipska said :)
Oh, I didn't see a commit here... Sorry, I should take a better look at the code... :-/
Thanks for the nice improvement @Meceron! LGTM.
Merging, as the integration test failure is a known issue already fixed on master.
https://github.com/influxdata/telegraf/issues/12243
resolves #12243
Fixes null delimiters in the CSV parser by replacing the selected bytes before handing the data on to the parsing pipeline.
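As a usage sketch, a test along the lines of the snippets quoted in this review might look like the following; the exact Parser fields and import paths are assumptions, not verified against the merged code:

```go
package csv_test

import (
	"strings"
	"testing"

	"github.com/influxdata/telegraf/plugins/parsers/csv"
	"github.com/stretchr/testify/require"
)

func TestNullDelimiter(t *testing.T) {
	// Assumed struct layout, modeled on the review snippets above.
	p := &csv.Parser{
		MetricName:  "csv",
		ColumnNames: []string{"first", "second", "third"},
		Delimiter:   "\u0000", // a delimiter encoding/csv itself rejects
	}
	require.NoError(t, p.Init())

	testCSV := strings.Join([]string{"3.4", "70", "test_name"}, "\u0000")
	metrics, err := p.Parse([]byte(testCSV))
	require.NoError(t, err)
	require.Len(t, metrics, 1)
}
```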