Slice with Length specified by Expr in Lazy Context cannot Determine Length Correctly #11594

Closed
2 tasks done
abstractqqq opened this issue Oct 8, 2023 · 1 comment · Fixed by #11628
Labels
accepted Ready for implementation bug Something isn't working python Related to Python Polars

Comments

@abstractqqq
Contributor

abstractqqq commented Oct 8, 2023

Checks

  • I have checked that this issue has not already been reported.

  • I have confirmed this bug exists on the latest version of Polars.

Reproducible example

import polars as pl

df = pl.DataFrame({
    "a": [1, 2, 1, 2, 1, 2],
    "b": [None, None, 1, 1, 1, 1],
})
# Eager: runs without error
df.select(
    pl.col("a").slice(offset=0, length=pl.count() - 2).alias("1"),
    pl.col("a").slice(offset=1, length=pl.count() - 2).alias("2"),
)
# Lazy: panics because the slice length is not determined correctly
df.lazy().select(
    pl.col("a").slice(offset=0, length=pl.count() - 2).alias("1"),
    pl.col("a").slice(offset=1, length=pl.count() - 2).alias("2"),
).collect()

Log output

No response

Issue description

In a lazy context, pl.col("").slice() with length given as an expression does not determine the length of the selected column correctly. Collecting the query above fails with:

PanicException: The column lengths in the DataFrame are not equal.

This is likely a problem in the order of execution.

Expected behavior

Eager and lazy execution should return the same result.
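
For reference, the values both modes should produce for the example above can be worked out with plain Python lists. This is only a sketch of the expected semantics, not Polars code: it assumes slice(offset, length) takes `length` items starting at `offset`, and that pl.count() evaluates to the number of rows (6 here). The helper name `expected_slice` is made up for illustration.

```python
def expected_slice(values, offset, length):
    """Plain-Python model of slice(offset, length): `length` items from `offset`."""
    return values[offset:offset + length]


a = [1, 2, 1, 2, 1, 2]
n_rows = len(a)  # stands in for pl.count() in the select context (assumption)

col_1 = expected_slice(a, 0, n_rows - 2)  # column aliased "1"
col_2 = expected_slice(a, 1, n_rows - 2)  # column aliased "2"

print(col_1)  # [1, 2, 1, 2]
print(col_2)  # [2, 1, 2, 1]

# Both columns have the same length, so no length mismatch should occur.
assert len(col_1) == len(col_2) == 4
```

Under these assumptions both columns have 4 rows, so the "column lengths are not equal" panic would not be expected in either mode.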

Installed versions

--------Version info---------
Polars:              0.19.7
Index type:          UInt32
Platform:            Windows-10-10.0.19045-SP0
Python:              3.11.3 | packaged by Anaconda, Inc. | (main, Apr 19 2023, 23:46:34) [MSC v.1916 64 bit (AMD64)]
@orlp
Collaborator

orlp commented Oct 9, 2023

I think a more informative example of what goes wrong is this:

>>> df = pl.DataFrame({
...     "a": [1,2,1,2,1,2],
...     "b": [None, None, 1,1,1,1]
... })
>>> df.lazy().select(
...     pl.col("a").slice(offset = 1, length=pl.count() - 1).alias("1"),
...     pl.col("a").slice(offset = 1, length=pl.count() - 1).alias("2")
... ).collect()
shape: (0, 2)
┌─────┬─────┐
│ 1   ┆ 2   │
│ --- ┆ --- │
│ u32 ┆ u32 │
╞═════╪═════╡
└─────┴─────┘
