pandas.DataFrame.duplicated to allow take_all #6511
Labels: Algos, Indexing, Numeric Operations
When working with external data, I often see rows with primary key violations. Currently, there is no easy way to select all of the violating rows. For example, suppose I have a massive file with some inconsistent data.
In this use case, it would be good to be able to write
df[df.duplicated('datecol', take_all=True)]
to get the bad rows directly.
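To make the request concrete, here is a minimal sketch of the difference between what `duplicated` returns by default and what the proposed `take_all=True` would return. The sample frame and its `value` column are made up for illustration, and `take_all` itself is only the proposed name, not an existing pandas argument. The workaround shown uses `isin`, which is one possible way to get the same rows today.

```python
import pandas as pd

# Hypothetical sample data: 'datecol' is supposed to be a unique key,
# but two dates appear twice.
df = pd.DataFrame({
    "datecol": ["2014-01-01", "2014-01-02", "2014-01-02",
                "2014-01-03", "2014-01-03"],
    "value": [1, 2, 3, 4, 5],
})

# Default behavior: only the repeated occurrences are flagged,
# so the first row of each violating group is missed.
repeats = df[df.duplicated("datecol")]

# Workaround: collect the keys that repeat, then select every row
# that carries one of those keys (first occurrences included).
dup_keys = df.loc[df.duplicated("datecol"), "datecol"]
all_violations = df[df["datecol"].isin(dup_keys)]

print(repeats)
print(all_violations)
```

Here `repeats` holds only the second row of each duplicated date, while `all_violations` holds every row that shares a date with another row, which is what the proposed option would return directly. For reference, later pandas releases cover this case with `df.duplicated('datecol', keep=False)`.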