GpuInsertIntoHiveTable supports parquet format #10912
Conversation
Signed-off-by: Firestarman <firestarmanllc@gmail.com>
Signed-off-by: Firestarman <firestarmanllc@gmail.com>
build
Signed-off-by: Firestarman <firestarmanllc@gmail.com>
build
val storage = insertCmd.table.storage
// Configs check for Parquet write enabling/disabling

// FIXME Need to check serde and output format classes ?
Is this comment out of date, or is there more to do here?
Nice catch. Removed.
sql-plugin/src/main/scala/org/apache/spark/sql/hive/rapids/GpuHiveFileFormat.scala
s"as an int or a long")
}

// FIXME Need a new format type for Hive Parquet write ?
I don't think a new format type is required, but I could see it done that way if desired. Comment needs to be addressed in some way.
Thanks a lot for this info. Removed.
…HiveFileFormat.scala Co-authored-by: Jason Lowe <jlowe@nvidia.com>
…HiveFileFormat.scala Co-authored-by: Jason Lowe <jlowe@nvidia.com>
Signed-off-by: Firestarman <firestarmanllc@gmail.com>
Signed-off-by: Firestarman <firestarmanllc@gmail.com>
build
Signed-off-by: Firestarman <firestarmanllc@gmail.com>
build
# ProjectExec falls back on databricks due to a new expression named "MapFromArrays".
fallback_nodes = ['ProjectExec'] if is_databricks_runtime() else []
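As a side note, the way such a conditional fallback allow-list behaves can be sketched in plain Python. The `is_databricks_runtime` and `check_plan` helpers below are hypothetical stand-ins for illustration, not the plugin's real test API:

```python
def is_databricks_runtime():
    # Hypothetical stand-in for the integration-test helper of the same
    # name; pretend we are not running on Databricks here.
    return False

# Nodes permitted to fall back to the CPU without failing the test.
fallback_nodes = ['ProjectExec'] if is_databricks_runtime() else []

def check_plan(cpu_nodes, allowed):
    # Hypothetical check: any node that ran on the CPU but is not in the
    # allow-list is an unexpected fallback and fails the test.
    unexpected = [n for n in cpu_nodes if n not in allowed]
    assert not unexpected, f"unexpected CPU fallback: {unexpected}"

check_plan([], fallback_nodes)                 # passes: nothing fell back
check_plan(['ProjectExec'], ['ProjectExec'])   # passes: fallback is allowed
```

The point of the reviewer's concern below is that an entry in this allow-list silently tolerates a fallback the test is not explicitly checking for.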
Is it understood why MapFromArrays is appearing here? Note that MapFromArrays is not "new" in the sense that it's been in Apache Spark since Spark 2.4. The concern is that this test is allowing a fallback when we're not testing for a fallback.
Do we have confidence this won't appear in a normal query? I suspect it's an artifact of how map generation works from Python, but then I wonder why we're not needing to fallback on MapFromArrays in other tests that generate maps.
I don't know the details on MapFromArrays yet, so I filed an issue for it: #10948
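For context, MapFromArrays is the Spark expression behind `map_from_arrays` (available since Spark 2.4), which zips a keys array and a values array into a map column. Its semantics can be sketched in plain Python; this is an illustration of the behavior, not Spark code:

```python
def map_from_arrays(keys, values):
    # Mimics Spark's MapFromArrays: pair a keys array with a values
    # array of the same length to build a map (here, a Python dict).
    if keys is None or values is None:
        return None
    if len(keys) != len(values):
        raise ValueError("keys and values must have the same length")
    return dict(zip(keys, values))

print(map_from_arrays(["a", "b"], [1, 2]))  # {'a': 1, 'b': 2}
```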
Signed-off-by: Firestarman <firestarmanllc@gmail.com>
build
This is a new feature adding Parquet support to GpuInsertIntoHiveTable, which currently only supports text writes. The feature is covered by the new tests added in this PR. --------- Signed-off-by: Firestarman <firestarmanllc@gmail.com> Co-authored-by: Jason Lowe <jlowe@nvidia.com>
close #9939
This is a new feature adding Parquet support to
GpuInsertIntoHiveTable
, which currently only supports text writes. The feature is covered by the new tests added in this PR.