Resolve errors with time to convert bins #5283

neilkakkar · 2021-07-22T13:10:36Z

Changes

Fixes #5116 (patch) and item no. 6 in #5249 (comment).

For now, we discard all negative and null values in time to convert analysis.

We aren't doing this at a lower level (say, in the original funnel class) because the other queries work fine, and the original safeguards in those places (increasing ordering of event times) ensure they don't run into this issue.

It's hard to write a deterministic test for this, since the issue is intermittent. If I un-skip the new test now, it will be terribly flaky. I don't want to delete the test either, because coming back to it much later would mean reloading all the context and writing a proper test from scratch.

Checklist

All querysets/queries filter by Organization, by Team, and by User
Django backend tests
Jest frontend tests
Cypress end-to-end tests
Migrations are safe to run at scale (e.g. PostHog Cloud) – present proof if not obvious
New/changed UI is decent on smartphones (viewport width around 360px)

macobo · 2021-07-22T13:20:59Z

ee/clickhouse/queries/funnels/funnel_time_to_convert.py

        query = f"""
            WITH
                step_runs AS (
-                    {steps_per_person_query}
+                    SELECT * FROM (
+                        {steps_per_person_query}


Q: Why not push the predicates in here?

We aren't doing this at a lower level (in, say, the original funnel class) because the other things work fine, and the following safeguards in those places (increasing ordering of event times) ensure we don't run into this issue.

Good question! I see good arguments for either, but opted for "where the problem shows up", as it also keeps things in one place. Doing it in steps_per_person_query would also mean doing it in all types of funnel orderings.
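To make the trade-off concrete, here is a minimal sketch of the "filter where the problem shows up" approach: the inner query is left untouched and wrapped in an outer SELECT that drops invalid rows. The helper name `wrap_with_validity_filter`, the column names, and joining the predicates with `AND` are assumptions for illustration; the diff above only shows the wrapping and the list of `>= 0` conditions.

```python
from typing import List


def wrap_with_validity_filter(steps_per_person_query: str, identifiers: List[str]) -> str:
    """Wrap an inner query so rows with invalid conversion times are dropped.

    Hypothetical helper: the inner query is treated as an opaque string,
    so no funnel ordering class has to change.
    """
    # One ">= 0" predicate per conversion-time column.
    conditions = [f"{identifier} >= 0" for identifier in identifiers]
    # Assumption: a row is kept only if every conversion time is valid.
    where_clause = " AND ".join(conditions)
    return f"SELECT * FROM ({steps_per_person_query}) WHERE {where_clause}"


wrapped = wrap_with_validity_filter(
    "SELECT 1",  # placeholder for the real steps-per-person query
    ["step_1_average_conversion_time", "step_2_average_conversion_time"],
)
print(wrapped)
```

Because the filter lives in the outer query, it applies regardless of which funnel ordering produced the inner query.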

macobo · 2021-07-22T13:33:17Z

ee/clickhouse/queries/funnels/funnel_time_to_convert.py

@@ -60,10 +60,19 @@ def get_query(self) -> str:
        ]
        steps_average_conversion_time_expression_sum = " + ".join(steps_average_conversion_time_identifiers)

+        steps_average_conditional_for_invalid_values = [
+            f"{identifier} >= 0" for identifier in steps_average_conversion_time_identifiers


Are NULLs an issue? Should we add a NOT NULL check before >= 0?

NULL values ought to be removed, and this check implies NULL values won't make it through, either!

This is partly covered by an existing test, test_auto_bin_count_total: the step_1 time is > 0 while the step_2 time is NULL, and, as expected, it's removed from consideration.
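The reasoning here is standard SQL three-valued logic: `NULL >= 0` evaluates to NULL rather than true, and a WHERE clause keeps only rows whose predicate is true. The sketch below demonstrates this with Python's built-in `sqlite3` as a stand-in for ClickHouse (an assumption made so the example runs without a server); the table and column names are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE step_runs (person TEXT, step_1_time REAL, step_2_time REAL)")
conn.executemany(
    "INSERT INTO step_runs VALUES (?, ?, ?)",
    [
        ("a", 5.0, 7.0),   # both times valid: kept
        ("b", 3.0, None),  # step_2 is NULL: comparison yields NULL, row dropped
        ("c", -1.0, 2.0),  # negative time: comparison is false, row dropped
    ],
)

rows = conn.execute(
    "SELECT person FROM step_runs "
    "WHERE step_1_time >= 0 AND step_2_time >= 0 "
    "ORDER BY person"
).fetchall()
kept = [row[0] for row in rows]
print(kept)
conn.close()
```

Only person "a" survives, so a single `>= 0` check covers both the negative and the NULL case without an explicit NOT NULL predicate.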

macobo · 2021-07-22T13:41:39Z

ee/clickhouse/queries/funnels/funnel_time_to_convert.py

+        steps_average_conditional_for_invalid_values = [
+            f"{identifier} >= 0" for identifier in steps_average_conversion_time_identifiers
+        ]
+        # this is protection against the CH bug: https://github.com/ClickHouse/ClickHouse/issues/26580


Nit: Suggestion on comment style:

        # :HACK: Protect against CH bug https://github.com/ClickHouse/ClickHouse/issues/26580
        #   once the issue is resolved, stop skipping the test: test_auto_bin_count_single_step_duplicate_events
        #   and remove this comment

I use the following meta-comments: :TRICKY:, :TODO:. They are easier to grep for, and they signal more clearly than free-form text that the reader should watch out.
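As a rough illustration of why tagged comments are easy to search for, the sketch below scans source text for the three tags mentioned in this thread. The function name `find_meta_comments` and the exact regular expression are assumptions, not an existing project tool.

```python
import re
from typing import List, Tuple

# Matches comments such as "# :HACK: some explanation".
META_COMMENT = re.compile(r"#\s*:(HACK|TRICKY|TODO):\s*(.*)")


def find_meta_comments(source: str) -> List[Tuple[int, str, str]]:
    """Return (line number, tag, message) for every tagged comment in source."""
    found = []
    for line_number, line in enumerate(source.splitlines(), start=1):
        match = META_COMMENT.search(line)
        if match:
            found.append((line_number, match.group(1), match.group(2).strip()))
    return found


example = """\
conditions = []
# :HACK: Protect against CH bug https://github.com/ClickHouse/ClickHouse/issues/26580
# this is protection against the CH bug
"""
hits = find_meta_comments(example)
print(hits)
```

The free-form comment on the third line is not matched, while the tagged one is found along with its line number.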

neilkakkar added 2 commits July 21, 2021 15:04

time to convert test

dc8ce05

rworkaround for time to convert bins

e7a4ec1

timgl temporarily deployed to posthog-pr-5283 July 22, 2021 13:12 Inactive

neilkakkar requested review from EDsCODE and macobo July 22, 2021 13:12

mariusandra mentioned this pull request Jul 22, 2021

Funnels functional bugs #5249

Closed

macobo reviewed Jul 22, 2021

make mypy happy + test skip explanation

f3a0b79

timgl temporarily deployed to posthog-pr-5283 July 22, 2021 13:25 Inactive

macobo reviewed Jul 22, 2021

macobo approved these changes Jul 22, 2021

address comment

7594fa2

timgl temporarily deployed to posthog-pr-5283 July 22, 2021 13:44 Inactive

neilkakkar enabled auto-merge (squash) July 22, 2021 13:44

neilkakkar merged commit ca724e1 into master Jul 22, 2021

neilkakkar deleted the timeconverch branch July 22, 2021 14:04

Twixes mentioned this pull request Oct 27, 2022

fix(funnels): Improve "Time to convert" behavior #12474

Merged
