Adds ability to quote always in CSV.build #6723
Maybe
Yep, I investigated the other programming languages.
Well, I found two points.
Naming the enum
763a583 to e20f63e
I modified the code to use an enum:
puts CSV.build { |csv| csv.row 1, "," }
puts CSV.build(quoting: CSV::Builder::Quoting::NONE) { |csv| csv.row 1, "," }
puts CSV.build(quoting: CSV::Builder::Quoting::RFC) { |csv| csv.row 1, "," }
puts CSV.build(quoting: CSV::Builder::Quoting::ALL) { |csv| csv.row 1, "," }
return true
case @quoting
when .rfc?
  if value.is_a?(String)
If the value is a Char or Symbol, the quoting logic is not applied.
That was a pre-existing corner case that was not handled.
Should we handle it now?
Well, I'd prefer that this PR only add the new quoting feature.
I'd then like to fix the corner case soon in a separate PR that addresses the existing issue.
Then perhaps revert the changes to def <<(value : Nil | Bool | Char | Number | Symbol), meaning value here is always String, and you can revert the if value.is_a? String change. And leave all this to a different PR.
FYI: There was almost no performance change in compatible mode:
Benchmark.ips(warmup: 0, calculation: 10) do |x|
  x.report("csv(orig)") { build_csv1(reports) }
  x.report("csv(RFC) ") { build_csv2(reports, "RFC") }
  x.report("csv(ALL) ") { build_csv2(reports, "ALL") }
end
src/csv/builder.cr
Outdated
@@ -35,15 +35,21 @@ require "csv"
# 4,5,6,7,8
# ```
class CSV::Builder
  enum Quoting
    NONE # No quotes
This documentation doesn't work; it needs to be placed before the constants to show up in crystal doc.
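As a sketch of the placement this comment asks for, each doc comment can sit on the line above its member instead of trailing it. The enum name `DocumentedQuoting` is hypothetical, and the wording of the `RFC` and `ALL` comments is an assumption; only `NONE # No quotes` appears in the diff above.

```crystal
# Controls when fields are wrapped in quotes.
# (Hypothetical stand-in for the enum under review.)
enum DocumentedQuoting
  # No quotes
  NONE

  # Quote only the fields that need it (assumed wording)
  RFC

  # Always quote (assumed wording)
  ALL
end
```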
# ```
# result = CSV.build do |csv|
#   csv.row "one", "two"
#   csv.row "three"
# end
# result # => "one,two\nthree\n"
# result = CSV.build(quoting: CSV::Builder::Quoting::ALL) do |csv|
quoting: :all
Wow, I didn't know such a cool syntax! Thank you!
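For readers unfamiliar with the syntax suggested here: Crystal can cast a symbol literal to an enum member when the parameter is restricted to that enum type. The sketch below uses a hypothetical enum and method (neither is from the PR) and passes the value positionally.

```crystal
# Hypothetical enum and method, defined only for this illustration.
enum QuotingExample
  NONE
  RFC
  ALL
end

# The type restriction is what allows a symbol literal to be cast.
def quoting_label(quoting : QuotingExample) : String
  quoting.to_s
end

# Both calls pass the same enum member.
short_form = quoting_label(:all)
long_form = quoting_label(QuotingExample::ALL)
puts short_form == long_form
```

This only applies to symbol literals at the call site; a Symbol stored in a variable is not cast.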
What's the point of no quotes? Isn't that a broken format?
I don't see why it was made an enum; it should just be a boolean.
Actually, I've changed my mind. If all these CSV libs have an option for no quoting, there must be a reason. An enum is fine.
Right, but I'd like to know the reason.
To be honest, I just imitated other libraries, so I don't know the exact reasons. My guess is that it skips the quoting check and speeds things up a little, for example when the data consists only of digits or letters.
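The guess above can be illustrated with a simplified decision function. This is not the PR's implementation: `QuotingMode` and `quote_field?` are hypothetical names, the separator and quote character are assumed to be `,` and `"`, and the RFC check is reduced to three characters.

```crystal
# Hypothetical stand-in for the PR's enum, so the sketch is self-contained.
enum QuotingMode
  NONE
  RFC
  ALL
end

# Hypothetical helper: decides whether a field gets wrapped in quotes.
# Assumes ',' as the separator and '"' as the quote character.
def quote_field?(value : String, mode : QuotingMode) : Bool
  case mode
  when QuotingMode::NONE
    false # the field is never scanned
  when QuotingMode::ALL
    true # the field is never scanned; every field is quoted
  else
    # RFC mode: the field has to be inspected (simplified check)
    value.includes?(',') || value.includes?('"') || value.includes?('\n')
  end
end
```

Under this sketch, a field such as `a,b` is written unquoted in `NONE` mode, which is the "broken format" concern raised earlier, so that mode only looks safe when fields are known to contain no special characters.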
Makes sense
@asterite the compile failed; looks like a compiler bug?
Oh, I forgot about special keywords
@maiha the error is not about the special enum members All & None; it looks like an automatic cast error
Yes, there's a bug regarding named arguments and automatic casts. I think it's #6623. I don't have time to fix it.
Since it seems it will take time to solve the problem, I reverted the refactoring that used a Symbol for the enum.
This is actually a new bug introduced since 0.26.1: notice how the first std_spec compiles with the 0.26.1 compiler, then fails when rebuilt on master.
So it's definitely #6618
Thank you @maiha 👍
I think this is GTG.
A follow-up PR should handle the quoting corner case for Char and Symbol.
Currently, CSV.build quotes only those fields which contain special characters. This PR adds the ability to quote all fields via the quote_always option.
This allows us to link to tools that are not RFC compliant. For example, ClickHouse can't parse the former.
ClickHouse/ClickHouse#2192
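The description refers to a quote_always option, but the conversation shows the option being reworked into a `quoting` enum. A minimal sketch of the difference, using the enum form from the thread and string fields only (the field values are illustrative):

```crystal
require "csv"

# Default quoting: only the field containing the separator is quoted.
rfc = CSV.build do |csv|
  csv.row "one", "a,b"
end

# ALL quoting: every field is quoted.
all = CSV.build(quoting: CSV::Builder::Quoting::ALL) do |csv|
  csv.row "one", "a,b"
end

puts rfc # one,"a,b"
puts all # "one","a,b"
```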
Best regards,