10 add support for condominium sale flagging #13

wagnerlmichael · 2023-09-06T18:30:32Z

First pass at condo inclusion. Largely incorporated from Billy's changes to the flagging code here, with boolean switches.

Everything seems to run well, tested initial_flagging.py, manual_update.py, and the glue job.

dfsnow

Overall good work @wagnerlmichael! My main concern here is that there's already a bit of code duplication across the various flagging scripts, and this PR really only adds to that. I would refactor a little bit to try to reduce some of the larger duplicate bits. Hopefully Jean's concurrent work on #17 will let us reduce some of the duplicate code as well.

See my comments for necessary changes, but don't get hung up on them. Would rather get this merged and then fixed up later than sit on this PR for ages.

When you merge this, let's keep the merge history (rather than squashing) since this is a pretty big change.

As a general aside, this is a pretty large PR and isn't super easy to review. In the future, try to:

Break up something like this into a series of smaller PRs
Add PR comments for context on anything that might be confusing

glue/flagging_script_glue/flagging_f7265c.py

dfsnow · 2023-09-14T15:38:47Z

glue/flagging_script_glue/flagging_f7265c.py

-            f"sv_price_per_sqft_deviation_{group_string}"
-        ).get(key)
-        sq_lower, sq_upper = sq_std_range
+    if condos == True:


suggestion (non-blocking): To the extent possible, I would apply control flow only to the parts of this function that depend on having square footage. There's a lot of duplicate code here since you're basically making two copies of the same function with a very marginal difference between them. This also applies in other places, e.g. pricing_info(). However, it could be that I'm missing a constraint that makes such refactoring unwieldy!
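
A minimal sketch of that idea, not the PR's actual code: build the shared pieces unconditionally and branch only where square footage is needed. The function name `deviation_columns`, the `sv_price_deviation_` prefix, and the way `group_string` is built are assumptions for illustration; only `sv_price_per_sqft_deviation_{group_string}` and the `condos` switch come from the diff.

```python
def deviation_columns(groups: tuple, condos: bool) -> list:
    """Return the deviation column names to check for a property type."""
    # Assumption: group_string is the grouping columns joined by underscores.
    group_string = "_".join(groups)

    # Shared by residential and condo sales (hypothetical column name).
    columns = [f"sv_price_deviation_{group_string}"]

    # Only the square-footage piece sits behind the switch.
    if not condos:
        columns.append(f"sv_price_per_sqft_deviation_{group_string}")

    return columns
```

With this shape, the condo and residential paths share one function body and differ in a single guarded block.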

glue/flagging_script_glue/flagging_f7265c.py

dfsnow · 2023-09-14T15:46:26Z

glue/flagging_script_glue/flagging_f7265c.py

+    if condos == True:
+        conditions = [
+            (df["sv_short_owner"] == "Short-term owner")
+            & (df["sv_pricing"].str.contains("High")),
+            (df["sv_name_match"] != "No match")
+            & (df["sv_pricing"].str.contains("High")),
+            (df["sv_transaction_type"] == "legal_entity-legal_entity")
+            & (df["sv_pricing"].str.contains("High")),
+            (df["sv_anomaly"] == "Outlier") & (df["sv_pricing"].str.contains("High")),
+            (df["sv_pricing"].str.contains("High price swing")),
+            (df["sv_pricing"].str.contains("High")),
+            (df["sv_short_owner"] == "Short-term owner")
+            & (df["sv_pricing"].str.contains("Low")),
+            (df["sv_name_match"] != "No match")
+            & (df["sv_pricing"].str.contains("Low")),
+            (df["sv_transaction_type"] == "legal_entity-legal_entity")
+            & (df["sv_pricing"].str.contains("Low")),
+            (df["sv_anomaly"] == "Outlier") & (df["sv_pricing"].str.contains("Low")),
+            (df["sv_pricing"].str.contains("Low price swing")),
+            (df["sv_pricing"].str.contains("Low")),
+        ]


suggestion: Again, try to reduce the duplication of code here. I would define the set of conditions and labels shared by both res and condos, then just use .insert() to add list elements for condos.
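
One possible shape for this, as a sketch under stated assumptions rather than the code in this repo: keep each condition next to its label, define the shared list once, and insert only the type-specific entries. Plain-dict predicates stand in for the pandas boolean masks in the diff, and every label string and helper name here (`SHARED_RULES`, `build_rules`, `flag`) is hypothetical.

```python
# Rules are (label, predicate) pairs; order matters because the first match wins.
def high(row):
    return "High" in row["sv_pricing"]


def low(row):
    return "Low" in row["sv_pricing"]


# Conditions common to both property types (labels are hypothetical).
SHARED_RULES = [
    ("Short-term owner, high price",
     lambda r: r["sv_short_owner"] == "Short-term owner" and high(r)),
    ("High price", high),
    ("Short-term owner, low price",
     lambda r: r["sv_short_owner"] == "Short-term owner" and low(r)),
    ("Low price", low),
]


def build_rules(extra_rules=()):
    """Copy the shared rules, then insert type-specific ones.

    extra_rules is an iterable of (index, (label, predicate)), where each
    index is a position in the shared list.
    """
    rules = list(SHARED_RULES)
    # Insert from the highest index down so earlier inserts don't shift later ones.
    for index, rule in sorted(extra_rules, key=lambda pair: pair[0], reverse=True):
        rules.insert(index, rule)
    return rules


def flag(row, rules):
    """Return the label of the first matching rule."""
    for label, predicate in rules:
        if predicate(row):
            return label
    return "Not outlier"
```

Pairing each label with its condition keeps the two lists from drifting out of alignment when entries are inserted. The same list operations apply if the predicates are replaced by boolean masks and the labels are passed as the choice list to `np.select`.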

glue/sales_val_flagging.py

dfsnow · 2023-09-14T15:54:17Z

glue/sales_val_flagging.py

+
+        df_condo_flagged = go(
+            df=df_condo_to_flag,
+            groups=tuple(stat_groups_list),


issue: We actually don't want condos to use class as a grouping variable, as there are very few 297s or 399s. Instead, all condo classes should be considered one class, then partitioned/grouped by township. Let's table this for now and break it out into a separate issue so as not to block this PR.

Let me know about this. I'm happy to make an issue.

@wagnerlmichael Please do make a separate issue for this.
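
For the follow-up issue, a rough sketch of the grouping change, assuming the class column is simply dropped from the condo groups so that all condo classes pool together within each township. The helper name `condo_stat_groups` and the column names in the usage note are hypothetical; the real `stat_groups_list` contents are not shown in this PR.

```python
def condo_stat_groups(stat_groups, class_column="class"):
    """Return the grouping columns for condos with the class column removed."""
    # Dropping the class column treats 297s, 399s, and every other condo
    # class as a single class within each remaining group.
    return tuple(g for g in stat_groups if g != class_column)
```

For example, `condo_stat_groups(["rolling_window", "township_code", "class"])` would yield a tuple without `"class"`, which could then be passed as `groups=` in place of `tuple(stat_groups_list)`.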

manual_flagging/initial_flagging.py

wagnerlmichael · 2023-09-15T17:03:55Z

glue/flagging_script_glue/flagging_f7265c.py

@@ -173,7 +172,7 @@ def pricing_info(
 ) -> pd.DataFrame:


Was able to refactor pricing_info() a bit. price_column() was giving me trouble, so I skipped it for now.

dfsnow

Nice cleanup @wagnerlmichael! Really condensed the code quite a lot this round. See my minor comments/suggestions. Otherwise, this looks good to go.

dfsnow · 2023-09-18T20:43:05Z

glue/flagging_script_glue/flagging_324938.py

+    if not condos:
        df = deviation_dollars(df, groups)


suggestion: Can you collapse this into a single conditional with line 399?

Unfortunately it seems like this needs to be run in this order for proper column name creation.


wagnerlmichael requested a review from dfsnow September 6, 2023 18:30

wagnerlmichael linked an issue Sep 6, 2023 that may be closed by this pull request

Add support for condominium sale flagging #10

Closed

dfsnow reviewed Sep 14, 2023


dfsnow mentioned this pull request Sep 14, 2023

9 add support for min number of sales when performing statistical flagging #14

Merged

wagnerlmichael commented Sep 15, 2023


wagnerlmichael requested a review from dfsnow September 15, 2023 19:00

dfsnow approved these changes Sep 18, 2023


wagnerlmichael merged this pull request into main Sep 21, 2023

wagnerlmichael deleted the 10-add-support-for-condominium-sale-flagging branch September 21, 2023 14:49

jeancochrane pushed a commit that referenced this pull request Sep 21, 2023

Merge pull request #13 from ccao-data/10-add-support-for-condominium-…

fd79d2e

…sale-flagging 10 add support for condominium sale flagging
