ENH: Adding pd.options.observed_true_on_all_groupbys #49904
Labels
Categorical (Categorical Data Type)
Enhancement
Groupby
Needs Triage (Issue that has not been reviewed by a pandas team member)
Feature Type
Adding new functionality to pandas
Changing existing functionality in pandas
Removing existing functionality in pandas
Problem Description
After adding observed=True to a lot of my groupbys in order to avoid memory crashes, I wish I could change the default in a way that I can set once at the top of my script.
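For context, a minimal sketch of why observed matters, assuming categorical group keys (the column names and data are made up for illustration). With observed=False, a groupby reduction over several categorical columns returns one row per combination of declared categories, whether or not that combination occurs in the data:

```python
import pandas as pd

# Two categorical keys with 3 declared categories each; only 2 pairs occur.
df = pd.DataFrame(
    {
        "a": pd.Categorical(["x", "y"], categories=["x", "y", "z"]),
        "b": pd.Categorical(["p", "q"], categories=["p", "q", "r"]),
        "value": [1, 2],
    }
)

# observed=False: one row per combination of declared categories (3 * 3).
all_combinations = df.groupby(["a", "b"], observed=False)["value"].sum()

# observed=True: only the combinations that are present in the data.
observed_only = df.groupby(["a", "b"], observed=True)["value"].sum()

n_all = len(all_combinations)
n_observed = len(observed_only)
print(n_all, n_observed)
```

With many high-cardinality categorical keys, the first result grows with the product of the category counts, which is where the memory goes.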
Feature Description
I want to be able to set pd.options.observed_true_on_all_groupbys (the option named in the title).
I know this would mean that an explicit call such as:
df.groupby("var", observed=False)
would not be respected. But I don't think anybody would want that, and I have tried to make this as clear as possible in the naming.
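Until something like this exists, one possible workaround is a small wrapper. This is only a sketch: groupby_observed is a hypothetical helper, not a pandas API, and unlike the option proposed here it still honours an explicit observed=False.

```python
import pandas as pd


def groupby_observed(df, by, **kwargs):
    """Hypothetical helper: call DataFrame.groupby with observed=True by default."""
    # setdefault keeps an explicit observed=... passed by the caller.
    kwargs.setdefault("observed", True)
    return df.groupby(by, **kwargs)


# Made-up example data: 3 declared categories per key, 2 observed pairs.
df = pd.DataFrame(
    {
        "a": pd.Categorical(["x", "y"], categories=["x", "y", "z"]),
        "b": pd.Categorical(["p", "q"], categories=["p", "q", "r"]),
        "value": [1, 2],
    }
)

default_result = groupby_observed(df, ["a", "b"])["value"].sum()
explicit_false = groupby_observed(df, ["a", "b"], observed=False)["value"].sum()
print(len(default_result), len(explicit_false))
```

Replacing setdefault with a plain assignment (kwargs["observed"] = True) would mimic the stricter behaviour described above, where observed=False is ignored.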
Alternative Solutions
Implement #43999
Or emit a warning about memory usage if more than 100,000,000 buckets are used and any of the variables has fewer than 1,000,000 unique values, for example.
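A rough sketch of what such a check could look like when run by the user before calling groupby. The function names, the cardinality estimate, and the way the two thresholds are combined are my assumptions, not anything pandas provides; the default thresholds are the numbers suggested above.

```python
import math
import warnings

import pandas as pd


def estimate_buckets(df, by):
    """Rough upper bound on groups with observed=False: product of key cardinalities."""
    sizes = []
    for col in by:
        s = df[col]
        if isinstance(s.dtype, pd.CategoricalDtype):
            # Unobserved categories count too when observed=False.
            sizes.append(len(s.cat.categories))
        else:
            sizes.append(s.nunique())
    return math.prod(sizes)


def check_groupby_size(df, by, max_buckets=100_000_000, max_unique=1_000_000):
    """Hypothetical check: warn and return True if the groupby looks too large."""
    buckets = estimate_buckets(df, by)
    few_uniques = any(df[col].nunique() < max_unique for col in by)
    if buckets > max_buckets and few_uniques:
        warnings.warn(
            f"groupby on {by} may allocate about {buckets} buckets; "
            "consider observed=True",
            stacklevel=2,
        )
        return True
    return False


# Made-up example data with tiny thresholds so the warning triggers.
df = pd.DataFrame(
    {
        "a": pd.Categorical(["x", "y"], categories=["x", "y", "z"]),
        "b": pd.Categorical(["p", "q"], categories=["p", "q", "r"]),
    }
)

with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    flagged = check_groupby_size(df, ["a", "b"], max_buckets=5, max_unique=10)

print(flagged, len(caught))
```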
Additional Context
A lot of people are facing this problem: https://stackoverflow.com/questions/50051210/avoiding-memory-issues-for-groupby-on-large-pandas-dataframe