[SPARK-43440][PYTHON][CONNECT] Support registration of an Arrow-optimized Python UDF

### What changes were proposed in this pull request?

This PR adds support for registering an Arrow-optimized Python UDF in both vanilla PySpark and Spark Connect.

### Why are the changes needed?

Currently, when users register an Arrow-optimized Python UDF, it is registered as a pickled Python UDF and is therefore executed without Arrow optimization. We should support registration of Arrow-optimized Python UDFs and execute them with Arrow optimization.

### Does this PR introduce _any_ user-facing change?

Yes. There are no API changes, but result differences are expected in some cases.

Previously, a registered Arrow-optimized Python UDF was executed without Arrow optimization. Now, it is executed with Arrow optimization, as shown below.

```sh
>>> df = spark.range(2)
>>> df.createOrReplaceTempView("df")
>>> from pyspark.sql.functions import udf
>>> @udf(useArrow=True)
... def f(x):
...     return str(x)
...
>>> spark.udf.register('str_f', f)
<pyspark.sql.udf.UserDefinedFunction object at 0x7fa1980c16a0>
>>> spark.sql("select str_f(id) from df").explain()  # Executed with Arrow optimization
== Physical Plan ==
*(2) Project [pythonUDF0#32 AS f(id)#30]
+- ArrowEvalPython [f(id#27L)#29], [pythonUDF0#32], 101
   +- *(1) Range (0, 2, step=1, splits=16)
```

Enabling or disabling Arrow optimization can produce result differences in some cases - we are working on minimizing the result differences though.

### How was this patch tested?

Unit test.

Closes #41125 from xinrong-meng/registerArrowPythonUDF.

Authored-by: Xinrong Meng <xinrong@apache.org>
Signed-off-by: Hyukjin Kwon <gurwls223@apache.org>
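The physical plan in the commit message is what shows whether a registered UDF runs with Arrow: the `ArrowEvalPython` operator appears in place of `BatchEvalPython`, the operator Spark uses for pickled Python UDFs. A minimal sketch of checking this programmatically follows. The helper name `python_udf_eval_mode` is hypothetical and not part of PySpark; it only inspects plan text that has already been captured as a string, and the sample plan is copied from the commit message.

```python
import re


# Hypothetical helper (not a PySpark API): classify how a Python UDF is
# evaluated, based on operator names found in the text of a physical plan.
def python_udf_eval_mode(plan_text: str) -> str:
    if re.search(r"\bArrowEvalPython\b", plan_text):
        return "arrow"      # Arrow-optimized evaluation
    if re.search(r"\bBatchEvalPython\b", plan_text):
        return "pickled"    # pickled (non-Arrow) evaluation
    return "none"           # no Python UDF evaluation operator found


# Plan text copied from the commit message above.
plan = """== Physical Plan ==
*(2) Project [pythonUDF0#32 AS f(id)#30]
+- ArrowEvalPython [f(id#27L)#29], [pythonUDF0#32], 101
   +- *(1) Range (0, 2, step=1, splits=16)
"""

print(python_udf_eval_mode(plan))  # prints "arrow"
```

In a live session, one way to obtain the plan text is to wrap the `explain()` call in `contextlib.redirect_stdout`, assuming `explain()` writes the plan through Python's `print`.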
1 parent dd4db21, commit 7cd8f90

Showing 4 changed files with 39 additions and 14 deletions.