Add standard deviation requirement for ptax outliers #22
glue/sales_val_flagging.py
Outdated
group_string = "_".join(groups)
df["ptax_flag_original"] = df["sale_filter_ptax_flag"]

df["sale_filter_ptax_flag"] = df.apply(
suggestion (non-blocking): Rather than overwriting the original column, I would just change the SQL ingest query to return sale_filter_ptax_flag AS ptax_flag_original, then create a new column named ptax_flag_w_deviation (or similar).
Good call, should now be implemented.
df["ptax_flag_w_deviation"] = df["ptax_flag_original"] & (
    (df[f"sv_price_deviation_{group_string}"] >= ptax_sd[1])
    | (df[f"sv_price_deviation_{group_string}"] <= -ptax_sd[0])
    | (df[f"sv_price_per_sqft_deviation_{group_string}"] >= ptax_sd[1])
    | (df[f"sv_price_per_sqft_deviation_{group_string}"] <= -ptax_sd[0])
praise: Nice fixup!
All scripts tested. Added a parameter input for which standard deviations to use with the ptax outlier flag. The original ptax flag is preserved.
Closes #15
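For readers skimming the thread, here is a minimal, self-contained sketch of the flag logic discussed above. It assumes `ptax_sd` is a two-element sequence read as (lower, upper) standard deviation thresholds, which is how the diff indexes it. The helper name `add_ptax_deviation_flag`, the group names `township` and `class`, and the sample values are hypothetical and not taken from the PR.

```python
import pandas as pd


def add_ptax_deviation_flag(df, groups, ptax_sd):
    """Hypothetical helper: keep the ptax flag only for price outliers.

    Assumes ptax_sd = (lower, upper), both given as positive numbers of
    standard deviations, matching the indexing used in the diff.
    """
    group_string = "_".join(groups)
    price_dev = df[f"sv_price_deviation_{group_string}"]
    sqft_dev = df[f"sv_price_per_sqft_deviation_{group_string}"]
    lower, upper = ptax_sd

    # A sale counts as an outlier if either deviation falls outside the bounds.
    outside_bounds = (
        (price_dev >= upper)
        | (price_dev <= -lower)
        | (sqft_dev >= upper)
        | (sqft_dev <= -lower)
    )

    # Work on a copy so the original flag column is left untouched.
    out = df.copy()
    out["ptax_flag_w_deviation"] = out["ptax_flag_original"] & outside_bounds
    return out


# Made-up data for illustration only.
example = pd.DataFrame(
    {
        "ptax_flag_original": [True, True, False, True],
        "sv_price_deviation_township_class": [2.5, 0.3, 3.0, -0.2],
        "sv_price_per_sqft_deviation_township_class": [0.1, 0.4, 3.1, -1.5],
    }
)
flagged = add_ptax_deviation_flag(example, ["township", "class"], ptax_sd=(1.0, 1.0))
print(flagged["ptax_flag_w_deviation"].tolist())
```

In this sketch a row keeps the flag only when the original ptax flag is set and at least one of the two deviations lies outside the thresholds, so a flagged sale with an unremarkable price is no longer treated as an outlier.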