Expressified expr.name.map #18358

Open
rben01 opened this issue Aug 25, 2024 · 2 comments
Labels
enhancement New feature or an improvement of an existing feature

Comments

@rben01
Contributor

rben01 commented Aug 25, 2024

Description

expr.name.map takes a function (str) -> str. This means that a user whose function takes a string Expr and returns a new Expr can't pass it to expr.name.map. For instance, one might have:

def to_canonical_ident(expr: pl.Expr) -> pl.Expr:
    return (
        expr.str.replace_all(r"[^\w]", "_")
        .str.replace_all(r"__+", "_")
        .str.to_lowercase()
    )

As far as I can tell, this can only be used on column values, not column names. If you wanted to do expr.name.map(to_canonical_ident), you'd instead have to reimplement the function in “plain” Python.

It would be nice if there were an expr.name.map_expr that would pass lit(name) to the function and extract the new name from the returned Expr. I am currently working around this as follows, but it's a lot of ceremony:

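from typing import overload

import polars as pl


@overload
def to_canonical_ident(s: str) -> str: ...

@overload
def to_canonical_ident(s: pl.Expr) -> pl.Expr: ...

def to_canonical_ident(s: str | pl.Expr) -> str | pl.Expr:
    # Wrap a plain string in a literal so both input types share one pipeline.
    is_str = isinstance(s, str)
    if is_str:
        s = pl.lit(s)

    expr = (
        s.str.replace_all(r"[^\w]", "_").str.replace_all(r"__+", "_").str.to_lowercase()
    )

    # For a string input, evaluate the expression and return the scalar result.
    if is_str:
        return pl.select(expr).item()

    return expr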

rben01 added the enhancement label Aug 25, 2024
@cmdlineluser
Contributor

Not exactly what you're asking for, but may be of interest:

There is a pending .name.replace() PR for regex manipulation of column names.

@rben01
Contributor Author

rben01 commented Aug 26, 2024

On second thought, it might be better to simply have a function that converts an (Expr) -> Expr function into a (str) -> str function.

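from typing import Callable

import polars as pl


def expr_fn_to_str_fn(f: Callable[[pl.Expr], pl.Expr]) -> Callable[[str], str]:
    # Apply f to a string literal, evaluate it, and return the resulting scalar.
    return lambda s: pl.select(f(pl.lit(s))).item()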

def to_canonical_ident(expr: pl.Expr) -> pl.Expr:
    return (
        expr.str.replace_all(r"[^\w]", "_")
        .str.replace_all(r"__+", "_")
        .str.to_lowercase()
    )

expr.name.map(expr_fn_to_str_fn(to_canonical_ident))
