Spark 4: Fix miscellaneous tests including logic, repart, hive_delimited. [databricks] #11129
Signed-off-by: MithunR <mithunr@nvidia.com>
Test is inexplicably failing with ANSI off.
Still investigating.
Moved overflowing test into a separate function, tested with ANSI on/off.
Still a work in progress. A couple of other tests to be addressed.
Record comparisons do not currently account for legitimate whitespace differences. See NVIDIA#11154.
Build
Build
That last failure was an interesting one to track down. On Spark < 3.3, time interval calculations involve multiplication/division aggregation operations. These tend to fall off the GPU in ANSI mode because of #5114, so the test is guaranteed to fail there: part of the plan runs off the GPU. On Spark >= 3.3, the same calculations seem to use modulo operations, which don't appear susceptible to ANSI-mode failures. I've added a skip for this test when ANSI is enabled on Spark < 3.3. This can be rolled back once #5114 is addressed.
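For readers following along, the skip condition described above can be sketched roughly as follows. This is only an illustration, not the code in the PR: `parse_version` and `should_skip_interval_test` are hypothetical names, and the version-string format is an assumption.

```python
def parse_version(version: str) -> tuple:
    """Turn '3.2.4' or '3.3.0-SNAPSHOT' into a comparable tuple of ints.

    Hypothetical helper: assumes a dotted numeric version, optionally
    followed by a '-suffix'.
    """
    numeric = version.split("-")[0]
    return tuple(int(part) for part in numeric.split(".")[:3])


def should_skip_interval_test(spark_version: str, ansi_enabled: bool) -> bool:
    """Skip only when ANSI mode is on AND Spark is older than 3.3.

    On those versions part of the plan falls off the GPU (see #5114),
    so the test cannot pass.
    """
    return ansi_enabled and parse_version(spark_version) < (3, 3, 0)
```

Keeping the two conditions in a single predicate should make the skip easy to remove once #5114 is addressed.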
Build
Overall LGTM, a couple of minor documentation/comment nits.
1. Cited the source for the clumsy error message.
2. Fixed comment regarding fallback to CPU.
Build
@NVnavkumar, I was wondering if you might take another look at this one.
One nit left here.
Build
There seems to be an error on Spark 3.3, where the expected exception isn't thrown. It's taking a bit of time to repro. I'll update here once I have something.
Changed datagen to guarantee overflow. Dropped superfluous num_parts value.
I think I've addressed the Databricks failure. I'll kick off another build, and request the reviewers for another round.
Build
@NVnavkumar, I've fixed the last nit. Does this look agreeable?
LGTM
Thank you for reviewing, @NVnavkumar. This change has now been merged.
Fixes NVIDIA#11537. This commit addresses the failure of the `test_raise_error` test in `misc_expr_test.py` for Databricks 14.3. This is an extension of NVIDIA#11129, where this test was skipped for Apache Spark 4.0. The failure on Databricks 14.3 shares the same cause as in Spark 4.0, i.e. a backward-incompatible Spark change in the signature of RaiseError, as introduced in https://issues.apache.org/jira/browse/SPARK-44838. The work to support this change in a Spark-RAPIDS shim will be tracked in NVIDIA#10969. This test will be skipped until that work is completed. Signed-off-by: MithunR <mithunr@nvidia.com>
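As a rough illustration of the skip described in that commit message, here is a minimal sketch using the standard library's `unittest.skipIf`. It is not the integration-test code itself: `raise_error_signature_changed`, `RUNNING_ON`, and `MiscExprTest` are hypothetical stand-ins, and treating Databricks runtimes at or above 14.3 as affected is an assumption (the commit only names 14.3).

```python
import unittest


def raise_error_signature_changed(spark_version, databricks_runtime=None):
    """Hypothetical predicate: True where SPARK-44838 changed RaiseError.

    Assumption: Apache Spark 4.0+ and Databricks runtime 14.3+ are affected.
    Versions are (major, minor) tuples.
    """
    if databricks_runtime is not None:
        return databricks_runtime >= (14, 3)
    return spark_version >= (4, 0)


# Hypothetical description of the environment the tests run in.
RUNNING_ON = {"spark_version": (3, 5), "databricks_runtime": (14, 3)}


class MiscExprTest(unittest.TestCase):
    @unittest.skipIf(
        raise_error_signature_changed(**RUNNING_ON),
        "RaiseError signature changed in SPARK-44838; shim support pending",
    )
    def test_raise_error(self):
        # Never reached on an affected runtime, because the test is skipped.
        self.fail("should not run on an affected runtime")


def run_suite():
    """Run the test case and return the unittest.TestResult."""
    suite = unittest.defaultTestLoader.loadTestsFromTestCase(MiscExprTest)
    result = unittest.TestResult()
    suite.run(result)
    return result
```

Recording the reason string alongside the skip keeps the link to SPARK-44838 visible in test reports until the shim work lands.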
Fixes #11031.
This PR addresses tests that fail on Spark 4.0 in the following files:
integration_tests/src/main/python/datasourcev2_read_test.py
integration_tests/src/main/python/expand_exec_test.py
integration_tests/src/main/python/get_json_test.py
integration_tests/src/main/python/hive_delimited_text_test.py
integration_tests/src/main/python/logic_test.py
integration_tests/src/main/python/repart_test.py
integration_tests/src/main/python/time_window_test.py
integration_tests/src/main/python/json_matrix_test.py
integration_tests/src/main/python/misc_expr_test.py
integration_tests/src/main/python/orc_write_test.py