fix: slice expr can be taken in cse #11628

Merged: 1 commit, Oct 10, 2023

reswqa (Collaborator) commented Oct 10, 2023

This fixes #11594.

I'm not 100% sure, but I believe slice expressions can be reused by CSE (common subexpression elimination).


Besides:

There appears to be another bug in CSE reuse that can produce the following plan. I will dig into how to fix it in the next PR.

SELECT [
col("__POLARS_CSER_14783041659764827719").slice(offset=1, length=[(count()) - (1)]).alias("1"),
col("__POLARS_CSER_14783041659764827719").slice(offset=2, length=[(count()) - (1)]).alias("2"),
[(count()) - (1)].alias("__POLARS_CSER_14783041659764827719")] FROM
  DF ["a", "b"]; PROJECT 1/2 COLUMNS; SELECTION: "None"

@github-actions github-actions bot added fix Bug fix python Related to Python Polars rust Related to Rust Polars labels Oct 10, 2023
@reswqa reswqa marked this pull request as ready for review October 10, 2023 04:37
ritchie46 (Member) left a comment

I think it should. Why did this panic?

@ritchie46 ritchie46 merged commit 52a2632 into pola-rs:main Oct 10, 2023
29 checks passed
reswqa (Collaborator, Author) commented Oct 10, 2023

@ritchie46

TBH, I cannot reproduce the panic, but the output is indeed incorrect.

>>> import polars as pl
>>> df = pl.DataFrame({
...     "a": [1,2,1,2,1,2],
...     "b": [None, None, 1,1,1,1]
... })
>>> df.lazy().select(
...     pl.col("a").slice(offset = 1, length=pl.count() - 1).alias("1"),
...     pl.col("a").slice(offset = 1, length=pl.count() - 1).alias("2")
... ).collect()
shape: (0, 2)
┌─────┬─────┐
│ 1   ┆ 2   │
│ --- ┆ --- │
│ u32 ┆ u32 │
╞═════╪═════╡
└─────┴─────┘

It seems that col("a") has been mistakenly replaced with the CSE expression:

SELECT [
col("__POLARS_CSER_14783041659764827719").slice(offset=1, length=[(count()) - (1)]).alias("1"),
col("__POLARS_CSER_14783041659764827719").slice(offset=1, length=[(count()) - (1)]).alias("2"),
[(count()) - (1)].alias("__POLARS_CSER_14783041659764827719")] FROM
  DF ["a", "b"]; PROJECT 1/2 COLUMNS; SELECTION: "None"
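
For illustration only, here is a toy model of what the rewrite should look like. It is a minimal sketch in plain Python, not Polars' actual optimizer: the tuple encoding of expressions, the helper names (`subtrees`, `find_common`, `replace`), and the temporary column name `__cse_0` are all assumptions made up for this example. It shows that only the shared `count() - 1` subtree should be swapped for the temporary column, while the slice input stays `col("a")`.

```python
from collections import Counter

# Toy expression encoding (hypothetical, not Polars internals):
# an expression is a tuple (op, *children); leaves are ("col", name),
# ("lit", value), or ("count",).

def subtrees(expr):
    """Yield expr and every nested sub-expression."""
    yield expr
    for child in expr[1:]:
        if isinstance(child, tuple):
            yield from subtrees(child)

def is_leaf(expr):
    """A leaf has no nested expressions, so caching it gains nothing."""
    return not any(isinstance(c, tuple) for c in expr[1:])

def find_common(exprs):
    """Return the non-leaf subtrees that occur more than once."""
    counts = Counter(s for e in exprs for s in subtrees(e))
    return {s for s, n in counts.items() if n > 1 and not is_leaf(s)}

def replace(expr, target, name):
    """Replace exact matches of target with a reference to column `name`."""
    if expr == target:
        return ("col", name)
    return (expr[0], *(replace(c, target, name) if isinstance(c, tuple) else c
                       for c in expr[1:]))

# Two slices over the same column with different offsets and a shared length.
length = ("sub", ("count",), ("lit", 1))
e1 = ("slice", ("col", "a"), ("lit", 1), length)
e2 = ("slice", ("col", "a"), ("lit", 2), length)

common = find_common([e1, e2])
rewritten = [replace(e, length, "__cse_0") for e in (e1, e2)]
print(rewritten[0])
```

In this toy model the slice input is still `("col", "a")` after the rewrite and only the length argument refers to the temporary column, whereas in the plan above the input column itself was replaced.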

reswqa (Collaborator, Author) commented Oct 10, 2023

All right, I think I found the culprit. I will create a PR to fix this.

Labels
fix Bug fix python Related to Python Polars rust Related to Rust Polars
Development

Successfully merging this pull request may close these issues.

Slice with Length specified by Expr in Lazy Context cannot Determine Length Correctly
2 participants