Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[SPARK-49889][PYTHON] Add argument trim for functionstrim/ltrim/rtrim #48363

Closed
wants to merge 1 commit into from

Conversation

zhengruifeng
Copy link
Contributor

What changes were proposed in this pull request?

Add argument trim for functionstrim/ltrim/rtrim

Why are the changes needed?

this argument is missing in PySpark: we can specify the it in scala side but cannot do it in python.

Does this PR introduce any user-facing change?

yes, new argument supported

How was this patch tested?

added doctests

Was this patch authored or co-authored using generative AI tooling?

no

return _invoke_function_over_columns("ltrim", col)
def ltrim(col: "ColumnOrName", trim: Optional["ColumnOrName"] = None) -> Column:
if trim is not None:
return _invoke_function_over_columns("ltrim", trim, col)
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it is kind of confusing on the arg order here:
1, connect: _invoke_function_over_columns("ltrim", trim, col) -> follow the order in constructor

def this(trimStr: Expression, srcStr: Expression) = this(srcStr, Option(trimStr))

2, classic: _invoke_function_over_columns("ltrim", col, trim) follows the signature in scala functions

  def ltrim(e: Column, trimString: String): Column
  def ltrim(e: Column, trim: Column): Column

Copy link
Contributor Author

@zhengruifeng zhengruifeng Oct 7, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

probably we can unify the function invocations of both connect and classic in some way

@zhengruifeng
Copy link
Contributor Author

thanks, merged to master

@zhengruifeng zhengruifeng deleted the func_trim_str branch October 8, 2024 06:55
* @group string_funcs
* @since 4.0.0
*/
def ltrim(e: Column, trim: Column): Column = Column.fn("ltrim", trim, e)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is trim the expression for the trimString? Looks a bit weird to call this parameter trim. It's also not consistent with the function doc of expression StringTrim, which names this parameter as trimStr

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

unfortanately, the argument names were already not consistent between python and scala, in many functions.

in this PR I use trim to be consistent with python side btrim

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

def btrim(str: "ColumnOrName", trim: Optional["ColumnOrName"] = None) -> Column:

himadripal pushed a commit to himadripal/spark that referenced this pull request Oct 19, 2024
…rim`

### What changes were proposed in this pull request?
Add argument `trim` for functions`trim/ltrim/rtrim`

### Why are the changes needed?
this argument is missing in PySpark: we can specify the it in scala side but cannot do it in python.

### Does this PR introduce _any_ user-facing change?
yes, new argument supported

### How was this patch tested?
added doctests

### Was this patch authored or co-authored using generative AI tooling?
no

Closes apache#48363 from zhengruifeng/func_trim_str.

Authored-by: Ruifeng Zheng <ruifengz@apache.org>
Signed-off-by: Ruifeng Zheng <ruifengz@apache.org>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants