BUG: Row-wise comparison between two series always evaluates to all False when one series contains pd.NA
#45599
Closed
2 of 3 tasks
Labels
ExtensionArray: Extending pandas with custom dtypes or arrays.
Numeric Operations: Arithmetic, Comparison, and Logical operations
Usage Question
Pandas version checks
I have checked that this issue has not already been reported.
I have confirmed this bug exists on the latest version of pandas.
I have confirmed this bug exists on the main branch of pandas.
Reproducible Example
Issue Description
In my actual use case, I'm performing row-wise comparisons between an integer column and the same column shifted by various periods (column == column.shift(periods=i) for 1 <= i <= 6) to check if a previous row's value is the same as the current row's value.
Because of the behavior described in my example, these comparisons all incorrectly evaluate to columns filled entirely with False values, even if there are rows with the same values between both columns.
Note: I did not try reproducing this bug with the main branch version of pandas, but I scanned through the list of commit messages from commits pushed since pandas version 1.4.0 and did not notice any that sound like they would address this issue.
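A minimal sketch of the shifted comparison described above, not the reporter's original code. The data and the names column, repeats, and same_as_previous are made up for illustration. It assumes the column uses the nullable Int64 extension dtype rather than object dtype, so the rows that shift fills in hold pd.NA and the remaining rows are compared element-wise:

```python
import pandas as pd

# Hypothetical data: a nullable integer column with some repeated values.
column = pd.Series([1, 1, 2, 2, 2, 3], dtype="Int64")

# Compare each row with the row i positions earlier, for 1 <= i <= 6.
repeats = {i: column == column.shift(periods=i) for i in range(1, 7)}

# The first i rows have nothing to compare against, so they come back as <NA>
# (boolean dtype); fillna(False) turns them into plain False if that is wanted.
same_as_previous = repeats[1].fillna(False)
print(same_as_previous.tolist())  # [False, True, False, True, True, False]
```

Whether a missing comparison should count as False or stay as <NA> depends on the use case; the fillna(False) step is just one possible choice.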
Expected Behavior
Since pandas.Series.eq performs an element-wise comparison between two series, I would expect comparisons involving pd.NA to behave like those involving np.nan.
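For illustration only (this is not the reporter's original snippet), the sketch below contrasts the two missing-value markers under stated assumptions: a float64 series holding np.nan and a nullable Int64 series holding pd.NA, both with made-up values. With np.nan the missing row compares as False. With the nullable dtype the missing row propagates as <NA>, while the other rows are still compared element-wise:

```python
import numpy as np
import pandas as pd

left = pd.Series([1, 2, 3])

# float64 series: np.nan never equals anything, so that row is False.
with_nan = pd.Series([np.nan, 2, 3])
nan_result = left.eq(with_nan)

# Nullable Int64 series: pd.NA propagates, so that row is <NA>, not False.
with_na = pd.Series([pd.NA, 2, 3], dtype="Int64")
na_result = left.eq(with_na)

print(nan_result.tolist())                # [False, True, True]
print(na_result.isna().tolist())          # [True, False, False]
print(na_result.fillna(False).tolist())   # [False, True, True]
```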
Installed Versions