PartiQL should use idiomatic CSV parsing #366
We should really consider using an existing RFC 4180 CSV parser as a dependency in the CLI project and removing the hand-rolled one we created.
The use cases below are organized so that they can be implemented in sequence and each has value on its own; i.e., it would be helpful to implement use cases 1 and 2 first, while 3, 4, and 5 can come some time later.
Use case 1: the CSV file has header row. In this case, the columns should be bound to
This selects the first three columns of
Use case 2: the CSV file has a header row. In this case, the columns should be bound to their names:
Use case 3: the user does not know if there are column names. We should try to autodetect.
Use case 4: the user would like to specify a predefined CSV format:
If Possible values for
Use case 5: the user would like to create a custom format:
For a complete list of all the different configuration options, see:
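As a rough illustration of the header autodetection in use case 3, here is a minimal sketch in Python. It uses the standard library's `csv.Sniffer`, which is only a heuristic and is not what the PartiQL CLI uses. The helper `read_rows` and the fallback names `column_1`, `column_2`, ... are hypothetical and chosen only for this example.

```python
import csv
import io


def read_rows(text):
    """Parse CSV text, guessing whether the first row is a header.

    Returns (has_header, rows), where each row is a dict keyed by column name.
    The fallback names column_1, column_2, ... are hypothetical placeholders.
    """
    # Sniffer.has_header compares the first row against the rows below it;
    # it is a heuristic and can guess wrong on small or uniform samples.
    has_header = csv.Sniffer().has_header(text)
    rows = list(csv.reader(io.StringIO(text)))
    if has_header:
        names, data = rows[0], rows[1:]
    else:
        names = [f"column_{i + 1}" for i in range(len(rows[0]))]
        data = rows
    return has_header, [dict(zip(names, row)) for row in data]


with_header = "id,name,age\n1,alice,30\n2,bob,25\n3,carol,41\n"
without_header = "1,alice,30\n2,bob,25\n3,carol,41\n"

print(read_rows(with_header))
print(read_rows(without_header))
```

Because the detection is a guess, a real implementation would probably still want an explicit option that lets the user override it.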
PartiQL's CSV parser today doesn't take quotation marks around fields into account. Typical CSV writers surround a field that contains a comma with quotation marks. This is mentioned in the PartiQL code as a TODO here.
Apache Commons CSV is a great example of how to handle this easily. A one-line fix could just be replacing this line with a CSVParser.parse call.
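To make the quoting problem concrete, here is a small sketch in Python. It is not the PartiQL code; Python's standard `csv` module stands in for a quote-aware library parser, and the sample line is made up for illustration.

```python
import csv
import io

# A made-up record whose second field contains a comma, so a typical CSV
# writer wraps that field in quotation marks.
line = 'widget,"Portland, OR",42'

# Naive approach: splitting on every comma ignores the quotation marks,
# so the quoted field is broken in two and the quotes are left in place.
naive_fields = line.split(",")

# Quote-aware approach: the parser treats the quoted text as one field
# and strips the surrounding quotation marks.
parsed_fields = next(csv.reader(io.StringIO(line)))

print(naive_fields)   # ['widget', '"Portland', ' OR"', '42']
print(parsed_fields)  # ['widget', 'Portland, OR', '42']
```

The naive split yields four fields where the writer intended three, which is the kind of error a library parser avoids.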