-
Notifications
You must be signed in to change notification settings - Fork 3.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
importccl: Preserve '\r\n' during CSV import #28181
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
some of the lint rules are failing because of imported code. e.g. you don't need to check for errors on Write
or Flush
because you call Error
immediately afterwards. This is the semantics of how the csv writer is designed. How can ask the linter to ignore these?
Reviewable status: complete! 0 of 0 LGTMs obtained
Can you post the output of the failed lint rule? I'm not seeing it in the failed TeamCity job. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@benesh this is the link to TC lint job:
https://teamcity.cockroachdb.com/viewLog.html?buildTypeId=Cockroach_UnitTests_Check&buildId=810996&branch_Cockroach_UnitTests_Check=28181
[/TestUnused] lint_test.go:879: /go/src/github.com/cockroachdb/cockroach/pkg/util/encoding/csv/reader.go:85:2: var ErrTrailingComma is unused (U1000)
lint_test.go:879: /go/src/github.com/cockroachdb/cockroach/pkg/util/encoding/csv/reader.go:134:2: field TrailingComma is unused (U1000)
[23:49:08][TestLint] /TestErrCheck
[23:49:08]
[/TestErrCheck] lint_test.go:1145: pkg/util/encoding/csv/example_test.go:121:12: w.WriteAll(records) // calls Flush internally <- unchecked error
lint_test.go:1145: pkg/util/encoding/csv/writer.go:103:11: w.w.Flush() <- unchecked error
lint_test.go:1145: pkg/util/encoding/csv/writer_test.go:69:9: f.Write([]string{"abc"}) <- unchecked error
lint_test.go:1145: pkg/util/encoding/csv/writer_test.go:78:9: f.Write([]string{"abc"}) <- unchecked error
[/TestForbiddenImports] lint_test.go:1031:
github.com/cockroachdb/cockroach/pkg/util/encoding/csv: log <- please use "util/log" instead of "log"
The Unused variables is easy to remove.
The ErrCheck rule conflicts with the semantics of write/flush.
Use of log
package means adding contexts.
I don't want to change this stdlib code more than I have to, as it will make it hard to maintain. I'm open to suggestions on how you normally handle this.
Reviewable status: complete! 0 of 0 LGTMs obtained
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The job ran on revision1 of this PR - in case the line numbers don't align with the current version. I tried to fix some of these but did a bad job.
Reviewable status: complete! 0 of 0 LGTMs obtained
See cockroachdb#25344. It appears this is being caused by the behaviour of Golang's encoding/csv library, which folds \r\n into \n when reading. This was fixed in golang/go#21201 but then reverted golang/go#22746. It appears based on that second issue that Go is unlikely to change that behavior. Check in the stdlib `encoding/csv` into `pkg/util` with golang/go#22746 reverted. Release note: `\r\n` characters in CSV files were silently converted into `\n`. This causes imported data to be different. This is now fixed.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've added test case for this in pkg/ccl/importccl/import_stmt_test.go
Reviewable status: complete! 0 of 0 LGTMs obtained
To suppress the errcheck warnings, I'd be OK with you adding cockroach/pkg/testutils/lint/lint_test.go Line 1133 in 2e4952d
If you do so, please add a comment indicating why it's OK. |
I too would be fine with exempting from linters to minimize diff from stdlib, but also LGTM as-is. Thanks! |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I’d rather not list (*bufio.Writer).Flush
in errcheck_excludes. It’s not in general safe to ignore its error. It’s only safe here because encoding/csv knows that it can find that error later by calling w.Write(nil)
. Which actually seems like an undocumented implementation detail to rely on; but whatever, parity with the stdlib definitely wins here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
-ignorepkg
doesn't do what we thought. -ignorepkg ./pkg/util/encoding/csv
doesn't work - need to do -ignorepkg bufio
which defeats the purpose of what we want to do.
https://github.com/kisielk/errcheck
The -ignorepkg flag takes a comma-separated list of package import paths to ignore
It's not which packages to ignore when running the checks.
If you are both happy with this, could @benesh or @dt give an LGTM and trigger bors?
Reviewable status: complete! 0 of 0 LGTMs obtained
pkg/ccl/importccl/import_stmt_test.go, line 179 at r2 (raw file):
create: `t text`, typ: "CSV", data: "\"hello\r\nworld\"\n\"friend\nfoe\"\n\"mr\rmrs\"",
Is this the correct way to handle this?
D’oh, sorry to lead you astray. Your approach is definitely the lesser poison. Leaving to @dt to answer your other question and merge. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: complete! 0 of 0 LGTMs obtained
pkg/ccl/importccl/import_stmt_test.go, line 179 at r2 (raw file):
Previously, neeral (Neeral Dodhia) wrote…
Is this the correct way to handle this?
Yeah, looks right to me! It might be a little easier to read using backticks rather than escaping all the quotes, but not enough to justify another CI cycle.
bors r+ |
👎 Rejected by code reviews |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@dt could you give an LGTM, bors is complaining no one has approved this PR
Reviewable status: complete! 0 of 0 LGTMs obtained
Sorry, my review was in the way. bors r+ |
Thanks for the fix, @neeral! |
Build failed (retrying...) |
Build succeeded |
27868: backport-2.0: storage: prevent unbounded raft log growth without quorum r=nvanbenschoten a=nvanbenschoten Backport 2/2 commits from #27774. /cc @cockroachdb/release --- Fixes #27772. This change adds safeguards to prevent cases where a raft log would grow without bound during loss of quorum scenarios. It also adds a new test that demonstrates that the raft log does not grow without bound in these cases. There are two cases that need to be handled to prevent the unbounded raft log growth observed in #27772. 1. When the leader proposes a command and cannot establish a quorum. In this case, we know the leader has the entry in its log, so there's no need to refresh it with `reasonTicks`. To avoid this, we no longer use `refreshTicks` as a leader. 2. When a follower proposes a command that is forwarded to the leader who cannot establish a quorum. In this case, the follower can't be sure (currently) that the leader got the proposal, so it needs to refresh using `reasonTicks`. However, the leader now detects duplicate forwarded proposals and avoids appending redundant entries to its log. It does so by maintaining a set of in-flight forwarded proposals that it has received during its term as leader. This set is reset after every leadership change. Both of these cases are tested against in the new TestLogGrowthWhenRefreshingPendingCommands. Without both of the safeguards introduced in this commit, the test fails. Release note (bug fix): Prevent loss of quorum situations from allowing unbounded growth of a Range's Raft log. 28225: release-2.0: importccl: Preserve '\r\n' during CSV import r=dt a=dt Backport 1/1 commits from #28181. /cc @cockroachdb/release --- See #25344. Co-authored-by: Nathan VanBenschoten <nvanbenschoten@gmail.com> Co-authored-by: neeral <neeral@users.noreply.github.com> Co-authored-by: David Taylor <tinystatemachine@gmail.com>
See #25344.