Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

TST: added test for pd.where overflow error GH31687 #49926

Merged
merged 4 commits into from
Nov 28, 2022

Conversation

mliu08
Copy link
Contributor

@mliu08 mliu08 commented Nov 27, 2022

Added a test for pd.where to /indexing/test_where.py

Copy link
Member

@MarcoGorelli MarcoGorelli left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

thanks for working on this!



@pytest.mark.parametrize(
"replacement", [0.001, True, "snake", DATETIME_JAN_1_1900_OPTIONAL_TZ]
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

DATETIME_JAN_1_1900_OPTIONAL_TZ is gonna be a LazyStrategy object here, which I guess isn't quite what you wanted, I think you'll need @given to use hypothesis tests
Not sure we need that though, just some date should be fine if you want to test using a date as a replacement

Also, the original issue used None, shall we include that?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You're right, I didn't want a LazyStrategy object! I'll change that to a date, and also include None to align better with the original issue. I was also wondering if there was already a quick way to test all scalar types rather than having to use a long parametrized list. Or would that be overkill?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yeah I think just keeping the original snippet from the issue as a test should be fine

)
def test_where_int_overflow(replacement):
# GH 31687
df = DataFrame([[1.0, 2e19, "nine"], [np.nan, 0.1, None]])
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ideally for regression tests it's better to keep them as close as possible to the original - any reason to use 2e19 instead of 2e25?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think I reduced it down to the smallest order of magnitude that still had the problem in pandas 1.4.1, but I don't have strong feelings about using that instead of the 2e25 from the original.

@mliu08 mliu08 force-pushed the pd.where-overflow-error-test branch from abc954c to 48f81cd Compare November 28, 2022 00:31
Copy link
Member

@MarcoGorelli MarcoGorelli left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @mliu08 !

@MarcoGorelli MarcoGorelli added the Needs Tests Unit test(s) needed to prevent regressions label Nov 28, 2022
@MarcoGorelli MarcoGorelli added this to the 2.0 milestone Nov 28, 2022
@MarcoGorelli MarcoGorelli merged commit 494025c into pandas-dev:main Nov 28, 2022
@mliu08 mliu08 deleted the pd.where-overflow-error-test branch November 29, 2022 01:14
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Needs Tests Unit test(s) needed to prevent regressions
Projects
None yet
Development

Successfully merging this pull request may close these issues.

pd.where OverflowError with large numbers
2 participants