[doc] Convert custom datetime column when reading a CSV file #27854

Merged
pcmoritz merged 5 commits into ray-project:master from csv-datetime-column
Sep 1, 2022

Conversation

pcmoritz
Contributor

Why are these changes needed?

Give an example of how a datetime column can be parsed when reading a CSV file.

Related issue number

Checks

  • I've signed off every commit (by using the -s flag, i.e., git commit -s) in this PR.
  • I've run scripts/format.sh to lint the changes in this PR.
  • I've included any doc changes needed for https://docs.ray.io/en/master/.
  • I've made sure the tests are passing. Note that there might be a few flaky tests, see the recent failures at https://flakey-tests.ray.io/
  • Testing Strategy
    • Unit tests
    • Release tests
    • This PR is not tested :(

Signed-off-by: Philipp Moritz <pcmoritz@gmail.com>
Contributor

@c21 c21 left a comment

LGTM

Comment on lines 565 to 570
>>> # Convert a date column with a custom format from a CSV file.
>>> from pyarrow import csv
>>> convert_options = csv.ConvertOptions(
...     timestamp_parsers=["%m/%d/%Y"])
Contributor

Would it make sense to generalize this example to all convert_options (pyarrow.csv.ConvertOptions) or even arrow_csv_args?

In other words, should we delegate this documentation over to Arrow?

Contributor Author

Yes, part of my desire to document this is to make sure that people know how to put in arbitrary ConvertOptions (and I wanted to show one that is probably particularly useful as an example). I'll add a note about the more general case and also link the Arrow documentation :)

@pcmoritz pcmoritz force-pushed the csv-datetime-column branch from 4783086 to 9a8934a Compare August 31, 2022 23:28
@pcmoritz
Contributor Author

Haha yeah, I fixed it now :)

@pcmoritz pcmoritz merged commit 1bba657 into ray-project:master Sep 1, 2022
@pcmoritz pcmoritz deleted the csv-datetime-column branch September 1, 2022 04:25
XiaodongLv pushed a commit to XiaodongLv/ray that referenced this pull request Sep 2, 2022
ilee300a pushed a commit to ilee300a/ray that referenced this pull request Sep 12, 2022
…ject#27854)

Signed-off-by: Philipp Moritz <pcmoritz@gmail.com>
Signed-off-by: ilee300a <ilee300@anyscale.com>
4 participants