[FEA] Support RaiseError for DB 14.3 and Spark 4.0.0 #10969
Labels
Spark 4.0+
Spark 4.0+ issues
Comments
This was referenced Jun 4, 2024
@razajafri, could you please confirm if this issue is necessary, if we already have #10107? Can we close this as a dupe? (Or vice versa.)
mythrocks added a commit to mythrocks/spark-rapids that referenced this issue on Oct 28, 2024
Fixes NVIDIA#11537. This commit addresses the failure of the `test_raise_error` test in `misc_expr_test.py` for Databricks 14.3. This is an extension of NVIDIA#11129, where this test was skipped for Apache Spark 4.0. The failure on Databricks 14.3 shares the same cause as in Spark 4.0, i.e. a backward-incompatible Spark change in the signature of RaiseError, as introduced in https://issues.apache.org/jira/browse/SPARK-44838. The work to support this change in a Spark-RAPIDS shim will be tracked in NVIDIA#10969. This test will be skipped until that work is completed. Signed-off-by: MithunR <mithunr@nvidia.com>
mythrocks added a commit that referenced this issue on Nov 4, 2024
Fixes #11537. This commit addresses the failure of the `test_raise_error` test in `misc_expr_test.py` for Databricks 14.3. This is an extension of #11129, where this test was skipped for Apache Spark 4.0. The failure on Databricks 14.3 shares the same cause as in Spark 4.0, i.e. a backward-incompatible Spark change in the signature of RaiseError, as introduced in https://issues.apache.org/jira/browse/SPARK-44838. The work to support this change in a Spark-RAPIDS shim will be tracked in #10969. This test will be skipped until that work is completed. Signed-off-by: MithunR <mithunr@nvidia.com>
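The platform condition described in the commit message above (skip on Databricks 14.3 and on Apache Spark 4.0) can be sketched as a small predicate. This is only an illustration: the function name `uses_binary_raise_error` and its parameters are hypothetical and are not taken from the Spark-RAPIDS test helpers, and the "or later" comparisons are an assumption.

```python
from typing import Optional, Tuple


def uses_binary_raise_error(
    spark_version: Tuple[int, int, int],
    databricks_version: Optional[Tuple[int, int]] = None,
) -> bool:
    """Return True if the platform has the changed RaiseError signature.

    Hypothetical helper for illustration only. It assumes the change is
    present on Databricks 14.3 or later and on Apache Spark 4.0.0 or later.
    """
    if databricks_version is not None:
        # On Databricks, the runtime version decides, not the Spark version.
        return databricks_version >= (14, 3)
    return spark_version >= (4, 0, 0)


# A test runner could use the predicate to decide whether to skip.
for spark, dbr in [((3, 4, 1), None), ((3, 5, 0), (14, 3)), ((4, 0, 0), None)]:
    action = "skip" if uses_binary_raise_error(spark, dbr) else "run"
    print(spark, dbr, action)
```

Keeping the check in a single predicate means the skip can be removed in one place once the shim work tracked in #10969 lands.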
mythrocks added a commit to mythrocks/spark-rapids that referenced this issue on Jan 16, 2025
Fixes NVIDIA#10969. This commit adds support for `raise_error()` on Databricks 14.3 and Spark 4.0. On these new Spark versions, the `RaiseError` expression (that powers the `raise_error()` API function) was changed from a Unary expression to a Binary one. This was done without modifying the arity of `raise_error()`. The ostensible reason seems to have been to eventually allow user-code to raise custom errors via `raise_error()`. This commit allows `raise_error()` to work on the GPU as it currently does on the CPU: as a unary function powered by a binary expression in the background. The tests have been modified to verify both the new behaviour and the legacy one on new platforms, while continuing to run as before on legacy platforms. Signed-off-by: MithunR <mithunr@nvidia.com>
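The idea of "a unary function powered by a binary expression" in the commit message above can be modelled with a toy sketch in plain Python. It is not the Spark or Spark-RAPIDS implementation: the class names, the `binary_platform` flag, the error-class string, and the `errorMessage` key are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Dict, Union

# Assumed error-class name, used only for this sketch.
USER_RAISED = "USER_RAISED_EXCEPTION"


@dataclass
class LegacyRaiseError:
    """Unary shape: the expression carries only the message."""
    message: str


@dataclass
class BinaryRaiseError:
    """Binary shape: an error class plus a map of error parameters."""
    error_class: str
    params: Dict[str, str]


def raise_error(message: str, binary_platform: bool) -> Union[LegacyRaiseError, BinaryRaiseError]:
    """User-facing function: it takes one argument on every platform.

    Only the expression built behind the scenes differs.
    """
    if binary_platform:
        return BinaryRaiseError(USER_RAISED, {"errorMessage": message})
    return LegacyRaiseError(message)


def evaluate(expr: Union[LegacyRaiseError, BinaryRaiseError]) -> None:
    """Evaluating either shape raises an error that carries the user's message."""
    if isinstance(expr, BinaryRaiseError):
        raise RuntimeError(f"[{expr.error_class}] {expr.params['errorMessage']}")
    raise RuntimeError(expr.message)


for flag in (False, True):
    try:
        evaluate(raise_error("boom", binary_platform=flag))
    except RuntimeError as err:
        print(type(raise_error("boom", flag)).__name__, "->", err)
```

In this model the caller's code is identical on both platforms, which is the property the commit message describes: the arity of `raise_error()` is unchanged even though the backing expression gained a second child.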
PR #5487 added the ability to convert a UDF that can throw a SparkException into a Catalyst expression that uses RaiseError.
Support for RaiseError was added in #5540, but Apache Spark 4.0 fundamentally changed how it throws exceptions, so we have to match that change.
We'd like to have GpuRaiseError so that we can prevent a columnar-to-row transition when an error should be raised.