10 add support for condominium sale flagging #13

Merged
merged 0 commits into from
Sep 21, 2023

Conversation

wagnerlmichael
Member

First pass at condo inclusion. Largely incorporated from Billy's changes to the flagging code here, with boolean switches.

Everything seems to run well; I tested initial_flagging.py, manual_update.py, and the glue job.

@wagnerlmichael wagnerlmichael linked an issue Sep 6, 2023 that may be closed by this pull request
Member

@dfsnow dfsnow left a comment

Overall good work @wagnerlmichael! My main concern here is that there's already a bit of code duplication across the various flagging scripts, and this PR really only adds to that. I would refactor a little bit to try to reduce some of the larger duplicate bits. Hopefully Jean's concurrent work on #17 will let us reduce some of the duplicate code as well.

See my comments for necessary changes, but don't get hung up on them. Would rather get this merged and then fixed up later than sit on this PR for ages.

When you merge this, let's keep the merge history (rather than squashing) since this is a pretty big change.

As a general aside, this is a pretty large PR and isn't super easy to review. In the future, try to:

  • Break up something like this into a series of smaller PRs
  • Add PR comments for context on anything that might be confusing

f"sv_price_per_sqft_deviation_{group_string}"
).get(key)
sq_lower, sq_upper = sq_std_range
if condos == True:
Member

suggestion (non-blocking): To the extent possible, I would limit the conditional branching to only the parts of this function that depend on having square footage. There's a lot of duplicate code here, since you're basically making two copies of the same function with only a marginal difference between them. This also applies in other places, e.g. pricing_info(). However, it could be that I'm missing a constraint that makes such refactoring unwieldy!
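A minimal sketch of what this suggestion could look like, not the PR's actual code. The function name `classify_pricing`, its parameters, and the labels are hypothetical and chosen only for illustration; the point is that the `condos` switch guards just the square-footage checks while the rest of the logic is written once.

```python
def classify_pricing(price_dev, price_range, sqft_dev=None, sqft_range=None, condos=False):
    """Label one sale's pricing. All names and labels here are hypothetical."""
    lower, upper = price_range
    high = price_dev > upper
    low = price_dev < lower

    # The only branch: square-footage checks run for non-condo sales only.
    high_sqft = low_sqft = False
    if not condos:
        sq_lower, sq_upper = sqft_range
        high_sqft = sqft_dev > sq_upper
        low_sqft = sqft_dev < sq_lower

    # Shared labelling logic, written once for both property types.
    if high or high_sqft:
        return "High price"
    if low or low_sqft:
        return "Low price"
    return "Not price outlier"
```

With this shape, a condo call simply omits the square-footage arguments, for example `classify_pricing(3.0, (-2, 2), condos=True)`.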

Comment on lines 736 to 756
if condos == True:
    conditions = [
        (df["sv_short_owner"] == "Short-term owner")
        & (df["sv_pricing"].str.contains("High")),
        (df["sv_name_match"] != "No match")
        & (df["sv_pricing"].str.contains("High")),
        (df["sv_transaction_type"] == "legal_entity-legal_entity")
        & (df["sv_pricing"].str.contains("High")),
        (df["sv_anomaly"] == "Outlier") & (df["sv_pricing"].str.contains("High")),
        (df["sv_pricing"].str.contains("High price swing")),
        (df["sv_pricing"].str.contains("High")),
        (df["sv_short_owner"] == "Short-term owner")
        & (df["sv_pricing"].str.contains("Low")),
        (df["sv_name_match"] != "No match")
        & (df["sv_pricing"].str.contains("Low")),
        (df["sv_transaction_type"] == "legal_entity-legal_entity")
        & (df["sv_pricing"].str.contains("Low")),
        (df["sv_anomaly"] == "Outlier") & (df["sv_pricing"].str.contains("Low")),
        (df["sv_pricing"].str.contains("Low price swing")),
        (df["sv_pricing"].str.contains("Low")),
    ]
Member

suggestion: Again, try to reduce the duplication of code here. I would define the set of conditions and labels shared by both res and condos, then just use .insert() to add list elements for condos.
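One way to read this suggestion, sketched in plain Python rather than with the pandas masks used in the PR. Everything here is an assumption made for illustration: the helper names `build_rules` and `label_sale`, the labels, the dictionary keys, and the choice of which rule is the "extra" one. The sketch also assumes the first matching condition wins.

```python
def build_rules(extra_rules=()):
    """Return (label, predicate) pairs in priority order.

    `extra_rules` holds (index, rule) pairs. Each index refers to the list
    as it stands after the previous inserts have been applied.
    """
    # Rules shared by every property type (hypothetical labels and keys).
    rules = [
        ("High price, short-term owner",
         lambda s: s["short_owner"] and "High" in s["pricing"]),
        ("High price", lambda s: "High" in s["pricing"]),
        ("Low price, short-term owner",
         lambda s: s["short_owner"] and "Low" in s["pricing"]),
        ("Low price", lambda s: "Low" in s["pricing"]),
    ]
    for index, rule in extra_rules:
        rules.insert(index, rule)
    return rules


def label_sale(sale, rules, default="Not outlier"):
    """Return the label of the first rule whose predicate matches."""
    for label, predicate in rules:
        if predicate(sale):
            return label
    return default


# The shared list, plus a variant with one placeholder rule inserted at index 1.
shared_rules = build_rules()
swing_rule = ("High price swing", lambda s: "High price swing" in s["pricing"])
variant_rules = build_rules(extra_rules=[(1, swing_rule)])
```

Because the labels travel with their conditions as pairs, an insert cannot leave the two lists out of step, which is the main risk when conditions and labels are kept in separate parallel lists.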


df_condo_flagged = go(
    df=df_condo_to_flag,
    groups=tuple(stat_groups_list),
Member

issue: We actually don't want condos to use class as a grouping variable, as there are very few 297s or 399s. Instead, all condo classes should be considered one class, then partitioned/grouped by township. Let's table this for now and break it out into a separate issue so as not to block this PR.

Member Author

Let me know about this. I'm happy to make an issue.

Member

@wagnerlmichael Please do make a separate issue for this.
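For the follow-up issue, the grouping change discussed in this thread might look roughly like the sketch below. It is only an illustration in plain Python: the keys `township_code`, `class`, and `sale_price`, the helper names, and the sample values are all assumptions, and the real change would presumably adjust the group columns passed to the flagging code instead.

```python
from collections import defaultdict
from statistics import mean


def group_key(sale, condos):
    """Build the grouping key for one sale (hypothetical column names)."""
    if condos:
        # All condo classes are treated as one class, grouped by township only.
        return (sale["township_code"],)
    return (sale["township_code"], sale["class"])


def group_mean_price(sales, condos):
    """Mean sale price per group, as a stand-in for the real group statistics."""
    groups = defaultdict(list)
    for sale in sales:
        groups[group_key(sale, condos)].append(sale["sale_price"])
    return {key: mean(prices) for key, prices in groups.items()}


# Sample data, invented for the example.
condo_sales = [
    {"township_code": "70", "class": "297", "sale_price": 100},
    {"township_code": "70", "class": "399", "sale_price": 300},
    {"township_code": "71", "class": "399", "sale_price": 500},
]
by_township = group_mean_price(condo_sales, condos=True)
by_township_and_class = group_mean_price(condo_sales, condos=False)
```

Grouping by township alone pools the rare classes with the rest, so each group has more sales behind its statistics.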

@@ -173,7 +172,7 @@ def pricing_info(
) -> pd.DataFrame:
Member Author

Was able to refactor pricing_info() a bit. price_column() was giving me trouble, so I skipped it for now.

Member

@dfsnow dfsnow left a comment

Nice cleanup @wagnerlmichael! Really condensed the code quite a lot this round. See my minor comments/suggestions. Otherwise, this looks good to go.

Comment on lines 404 to 405
if not condos:
    df = deviation_dollars(df, groups)
Member

suggestion: Can you collapse this into a single conditional with line 399?

Member Author

Unfortunately it seems like this needs to be run in this order for proper column name creation.

@wagnerlmichael wagnerlmichael merged this pull request into main Sep 21, 2023
@wagnerlmichael wagnerlmichael deleted the 10-add-support-for-condominium-sale-flagging branch September 21, 2023 14:49
jeancochrane pushed a commit that referenced this pull request Sep 21, 2023
…sale-flagging

10 add support for condominium sale flagging
Development

Successfully merging this pull request may close these issues.

Add support for condominium sale flagging
2 participants