-
Notifications
You must be signed in to change notification settings - Fork 28.5k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[SPARK-49889][PYTHON] Add argument trim
for functionstrim/ltrim/rtrim
#48363
Conversation
return _invoke_function_over_columns("ltrim", col) | ||
def ltrim(col: "ColumnOrName", trim: Optional["ColumnOrName"] = None) -> Column: | ||
if trim is not None: | ||
return _invoke_function_over_columns("ltrim", trim, col) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
it is kind of confusing on the arg order here:
1, connect: _invoke_function_over_columns("ltrim", trim, col)
-> follow the order in constructor
spark/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/stringExpressions.scala
Line 1507 in d85e7bc
def this(trimStr: Expression, srcStr: Expression) = this(srcStr, Option(trimStr)) |
2, classic: _invoke_function_over_columns("ltrim", col, trim)
follows the signature in scala functions
def ltrim(e: Column, trimString: String): Column
def ltrim(e: Column, trim: Column): Column
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
probably we can unify the function invocations of both connect and classic in some way
thanks, merged to master |
* @group string_funcs | ||
* @since 4.0.0 | ||
*/ | ||
def ltrim(e: Column, trim: Column): Column = Column.fn("ltrim", trim, e) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is trim
the expression for the trimString? Looks a bit weird to call this parameter trim
. It's also not consistent with the function doc of expression StringTrim
, which names this parameter as trimStr
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
unfortanately, the argument names were already not consistent between python and scala, in many functions.
in this PR I use trim
to be consistent with python side btrim
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
spark/python/pyspark/sql/functions/builtin.py
Line 13399 in 78135dc
def btrim(str: "ColumnOrName", trim: Optional["ColumnOrName"] = None) -> Column: |
…rim` ### What changes were proposed in this pull request? Add argument `trim` for functions`trim/ltrim/rtrim` ### Why are the changes needed? this argument is missing in PySpark: we can specify the it in scala side but cannot do it in python. ### Does this PR introduce _any_ user-facing change? yes, new argument supported ### How was this patch tested? added doctests ### Was this patch authored or co-authored using generative AI tooling? no Closes apache#48363 from zhengruifeng/func_trim_str. Authored-by: Ruifeng Zheng <ruifengz@apache.org> Signed-off-by: Ruifeng Zheng <ruifengz@apache.org>
What changes were proposed in this pull request?
Add argument
trim
for functionstrim/ltrim/rtrim
Why are the changes needed?
this argument is missing in PySpark: we can specify the it in scala side but cannot do it in python.
Does this PR introduce any user-facing change?
yes, new argument supported
How was this patch tested?
added doctests
Was this patch authored or co-authored using generative AI tooling?
no