Commit
…e numpy ndarray

### What changes were proposed in this pull request?
Make `lit` accept `str` and `bool` type numpy ndarray

### Why are the changes needed?
to be consistent with PySpark Classic

```python
In [4]: spark.range(1).select(sf.lit(np.array(["a", "b"], np.str_))).show()
+---------------+
|ARRAY('a', 'b')|
+---------------+
|         [a, b]|
+---------------+
```

### Does this PR introduce _any_ user-facing change?
yes

before:
```python
In [3]: spark.range(1).select(sf.lit(np.array(["a", "b"], np.str_))).show()
---------------------------------------------------------------------------
PySparkTypeError                          Traceback (most recent call last)
Cell In[3], line 1
----> 1 spark.range(1).select(sf.lit(np.array(["a", "b"], np.str_))).schema

File ~/Dev/spark/python/pyspark/sql/utils.py:272, in try_remote_functions.<locals>.wrapped(*args, **kwargs)
    269 if is_remote() and "PYSPARK_NO_NAMESPACE_SHARE" not in os.environ:
    270     from pyspark.sql.connect import functions
--> 272     return getattr(functions, f.__name__)(*args, **kwargs)
    273 else:
    274     return f(*args, **kwargs)

File ~/Dev/spark/python/pyspark/sql/connect/functions/builtin.py:274, in lit(col)
    272 dt = _from_numpy_type(col.dtype)
    273 if dt is None:
--> 274     raise PySparkTypeError(
    275         errorClass="UNSUPPORTED_NUMPY_ARRAY_SCALAR",
    276         messageParameters={"dtype": col.dtype.name},
    277     )
    279 # NumpyArrayConverter for Py4J can not support ndarray with int8 values.
    280 # Actually this is not a problem for Connect, but here still convert it
    281 # to int16 for compatibility.
    282 if dt == ByteType():

PySparkTypeError: [UNSUPPORTED_NUMPY_ARRAY_SCALAR] The type of array scalar 'str32' is not supported.
```

after:
```python
In [4]: spark.range(1).select(sf.lit(np.array(["a", "b"], np.str_))).show()
+-----------+
|array(a, b)|
+-----------+
|     [a, b]|
+-----------+
```

### How was this patch tested?
ci

### Was this patch authored or co-authored using generative AI tooling?
no

Closes #48591 from zhengruifeng/connect_lit_bool_str.

Authored-by: Ruifeng Zheng <ruifengz@apache.org>
Signed-off-by: Hyukjin Kwon <gurwls223@apache.org>
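
The traceback in the "before" case shows `lit` raising because `_from_numpy_type(col.dtype)` returned `None` for the string dtype. As a rough illustration of that kind of dtype dispatch, and not the actual PySpark code, the sketch below maps NumPy dtype kind codes to Spark SQL type names. The helpers `spark_type_name_for_kind` and `check_supported` and the lookup table are hypothetical; only the kind codes (`"b"` for boolean, `"U"` for Unicode string) come from NumPy.

```python
from typing import Optional

# Hypothetical lookup table keyed by NumPy dtype "kind" codes.
# Illustrative only; it does not mirror PySpark's internal mapping.
_KIND_TO_SPARK_TYPE = {
    "b": "boolean",  # kind code of np.bool_ arrays
    "U": "string",   # kind code of np.str_ arrays
}


def spark_type_name_for_kind(kind: str) -> Optional[str]:
    """Return a Spark SQL type name for a dtype kind code, or None if unmapped."""
    return _KIND_TO_SPARK_TYPE.get(kind)


def check_supported(kind: str) -> str:
    """Raise TypeError for unmapped kinds, similar in shape to the failure above."""
    name = spark_type_name_for_kind(kind)
    if name is None:
        raise TypeError(f"dtype kind {kind!r} is not supported")
    return name
```

With NumPy installed, `np.array(["a", "b"], np.str_).dtype.kind` is `"U"` and `np.array([True, False]).dtype.kind` is `"b"`, so both would resolve under this sketch, while an unmapped kind would raise.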