Support writing hive style partitioned files in DataFrame::write command #9237
DataFrame::write_parquet and related methods use the COPY logical/physical plans under the hood, so if we knock out #8493 this ticket should come almost for free.
@devinjdangelo implemented the code in #9240. In order to close this ticket we just need to add test coverage for writing partitioned parquet via DataFrame::write_parquet. My suggestion is: the new test could basically do the same thing as the tests added in that PR.
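The suggested test could be sketched roughly as follows. This is a hypothetical outline only, assuming the `with_partition_by` option added in #9240 and using the `tempfile` crate; the table name, column names, and assertions are illustrative, not the final test:

```rust
use datafusion::dataframe::DataFrameWriteOptions;
use datafusion::prelude::*;

#[tokio::test]
async fn write_partitioned_parquet() -> datafusion::error::Result<()> {
    let ctx = SessionContext::new();
    // A tiny two-row dataframe with a `part` column to partition on
    // (hypothetical data, for illustration).
    let df = ctx
        .sql("SELECT 1 AS a, 'x' AS part UNION ALL SELECT 2, 'y'")
        .await?;

    let tmp = tempfile::tempdir().unwrap();
    let path = tmp.path().to_str().unwrap().to_string();

    df.write_parquet(
        &path,
        DataFrameWriteOptions::new().with_partition_by(vec!["part".to_string()]),
        None,
    )
    .await?;

    // Expect hive-style subdirectories: part=x/ and part=y/
    let mut dirs: Vec<String> = std::fs::read_dir(tmp.path())
        .unwrap()
        .map(|e| e.unwrap().file_name().to_string_lossy().into_owned())
        .collect();
    dirs.sort();
    assert_eq!(dirs, vec!["part=x", "part=y"]);
    Ok(())
}
```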
* tests: adds tests associated with #9237
* style: clippy
Is your feature request related to a problem or challenge?
@Omega359 asked on discord: https://discord.com/channels/885562378132000778/1166447479609376850/1207458257874984970
Q: Is there a way to write out a dataframe to parquet with hive-style partitioning without having to create a table provider? I am pretty sure that a ListingTableProvider or a custom table provider will work, but that seems like a ton of config for this.
Describe the solution you'd like
I would like to be able to use DataFrame::write_parquet and the other write APIs to write partitioned files. I suggest adding the table_partition_cols from ListingOptions as one of the options on https://docs.rs/datafusion/latest/datafusion/dataframe/struct.DataFrameWriteOptions.html. The way to specify partition information would then be as described on ListingOptions::with_table_partition_cols. So that would look something like:
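A hypothetical sketch of how the proposed option might be used; the `with_partition_by` method name, the column-list representation, and the output path are illustrative only, not a final API:

```rust
use datafusion::dataframe::DataFrameWriteOptions;
use datafusion::prelude::*;

async fn example(df: DataFrame) -> datafusion::error::Result<()> {
    df.write_parquet(
        "/tmp/output",
        // Partition the output by the `year` and `month` columns, producing
        // hive-style paths such as /tmp/output/year=2024/month=1/...
        // (column names are hypothetical).
        DataFrameWriteOptions::new()
            .with_partition_by(vec!["year".to_string(), "month".to_string()]),
        None,
    )
    .await?;
    Ok(())
}
```

The appeal of this design is that partitioning becomes a per-write option on the dataframe itself, rather than requiring a ListingTable registration just to describe the output layout.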
Describe alternatives you've considered
No response
Additional context
Possibly related to #8493