Orphaned users for persons on events #11488
Comments
cc: @macobo @tiina303 - what does the underlying data look like when this happens? I need to understand this better to figure out what to do on the query side. When we alias, we move distinctIDs and delete the old person. I guess this change wouldn't reflect in
Let me create an easier-to-follow example: distinctID: distinctID: Now, Now, The entire events table looks something like: event1: distinctID Now, without person on events, total count = 3 and unique persons = 1 (since they're joined to But with person on events, total count = 3 and unique persons = 2 (two different person IDs), and the person modal will have only 1 person (I think, since the other would be lost when doing the postgres filtering).
@timgl is this a fair representation^?
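A minimal sketch of the counting difference described above, in plain Python rather than real queries. The event names, distinct ids, and person ids are all hypothetical placeholders (the originals were not preserved in this thread), and the in-memory structures are not PostHog's schema.

```python
# Hypothetical data: three events, where id_b was merged into person_1
# after event2 had already been written with person_2 stamped on it.
events = [
    {"event": "event1", "distinct_id": "id_a", "person_id": "person_1"},
    {"event": "event2", "distinct_id": "id_b", "person_id": "person_2"},
    {"event": "event3", "distinct_id": "id_b", "person_id": "person_1"},
]

# Current distinct_id -> person mapping after the merge (person_2 was deleted).
current_mapping = {"id_a": "person_1", "id_b": "person_1"}
existing_persons = set(current_mapping.values())

total_count = len(events)

# Without person on events: resolve each event through the current mapping.
unique_joined = {current_mapping[e["distinct_id"]] for e in events}

# With person on events: trust the person_id stored on the event itself.
unique_on_events = {e["person_id"] for e in events}

# Person modal: only persons that still exist survive the filtering step.
modal_persons = unique_on_events & existing_persons

print(total_count, len(unique_joined), len(unique_on_events), len(modal_persons))
```

Under these assumptions the total stays at 3, while the unique person count is 1 via the join and 2 via the stored person_id, with only 1 of those 2 showing up in the modal.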
If that's the scenario, from the data side this is expected behavior under person-on-events and is one of the two breaking changes introduced by it. The conversion buffer now exists to avoid creating extra person_ids on a user's first landing, but subsequent logins/logouts would indeed end up with separate person_ids. This is something we've yet to document and need to do a good job on! Note: I'm not sure how the person modal works now, but we probably should not be showing "invalid" links here.
D'oh, of course. I think this is more of a UX issue: not only should we have docs about the change, but we should also show it in practice. When there's a count mismatch, we shouldn't show these invalid users (that seems specific to retention, something borked, because I'm pretty sure the trend graph just doesn't show those people). So when there are fewer people than expected, point users to a link explaining why? (@clarkus for better ideas). It can be because of person on events (expected) or a CH person mismatch (this is on us), including person deletions (deleted from postgres but not CH yet).
Oh, and the other (new) issue with person-on-events will be that our current pagination logic wouldn't work, since persons can be fewer than the limit. (cc: @EDsCODE)
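A rough sketch of why that could happen, assuming (this is a guess, not confirmed in the thread) that the person list pages over person_ids taken from events, filters out persons that no longer exist, and treats a short page as the last page. All names here are hypothetical.

```python
def fetch_page(person_ids, existing, offset, limit):
    """Take `limit` candidate ids from events, then drop ids with no person row."""
    candidates = person_ids[offset:offset + limit]
    return [pid for pid in candidates if pid in existing]


def naive_has_more(page, limit):
    """Assumed pagination rule: a page shorter than the limit means no more results."""
    return len(page) >= limit


# person_ids as seen on events; "orphan1" was deleted by a merge.
person_ids_on_events = ["p1", "orphan1", "p2", "p3"]
existing_persons = {"p1", "p2", "p3"}
limit = 2

first_page = fetch_page(person_ids_on_events, existing_persons, 0, limit)
second_page = fetch_page(person_ids_on_events, existing_persons, limit, limit)

print(first_page, naive_has_more(first_page, limit), second_page)
```

Here the first page comes back with a single person, so a "short page means done" rule would stop even though the second page still holds two valid persons.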
cc @pjhul for the documentation part - based on the discussion ^, we would see lower counts in insights, not higher, when persons were merged. Btw, I highlighted this problem in https://docs.google.com/document/d/19KN-WKFH19bpTdKw8nApf8qT9rMPNb9KDUr8fxP2CLY/edit?disco=AAAAdHt6c0Q
Documentation with a prominent link in product seems like a good place to start.
Yeah, even in trends it'd be good to somehow acknowledge "40 users in this graph, of which 10 were orphaned"
Since PoE now embraces joins, we don't run into this via merges, and the UI now reflects this (for person deletions without data deletions). I'm closing this.
Bug description
If you alias multiple distinct ids together, you create an orphan person_id.
Example:
DB0FD0F1-90EE-4F0E-B0AE-3115DB2ED254 existed first, with person_id 0182afac-4b8c-0000-d745-e85707d9208b
8EDA981A-32DA-4B7C-9558-293334D6FA92 used to have person_id 0182c2b9-1a27-0000-090b-69a06b56097b
DB0FD0F1-90EE-4F0E-B0AE-3115DB2ED254 was aliased to 96129, continued using person_id ...08b
8EDA981A-32DA-4B7C-9558-293334D6FA92 was aliased to 96129, now also using person_id ...08b
The problem is that if you run a query which has results for ...FA92, you get results with no distinct ids. Clicking on a data point in trends shows far fewer people than expected. If you click on a retention person modal this appears, and clicking on one of these goes to app.posthog.com/persons/undefined
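The sequence above can be replayed with a small in-memory model. This is only a sketch: the `persons` dict, `capture`, and `alias` helpers are hypothetical stand-ins, not PostHog's ingestion code, and it assumes each event stores the person_id that was current when the event was captured.

```python
PERSON_A = "0182afac-4b8c-0000-d745-e85707d9208b"
PERSON_B = "0182c2b9-1a27-0000-090b-69a06b56097b"
ID_1 = "DB0FD0F1-90EE-4F0E-B0AE-3115DB2ED254"
ID_2 = "8EDA981A-32DA-4B7C-9558-293334D6FA92"
ALIAS = "96129"

# person_id -> set of distinct ids (hypothetical stand-in for the persons table)
persons = {PERSON_A: {ID_1}, PERSON_B: {ID_2}}
events = []


def person_for(distinct_id):
    """Return the person_id that currently owns distinct_id, or None."""
    for person_id, ids in persons.items():
        if distinct_id in ids:
            return person_id
    return None


def capture(name, distinct_id):
    """Store the event with the person_id resolved at capture time."""
    events.append({"event": name, "distinct_id": distinct_id,
                   "person_id": person_for(distinct_id)})


def alias(distinct_id, alias_id):
    """Merge: move distinct ids onto the alias owner and delete the old person.

    Assumes distinct_id already belongs to some person.
    """
    source, target = person_for(distinct_id), person_for(alias_id)
    if target is None:
        persons[source].add(alias_id)  # alias is new: attach it to the source
    elif source != target:
        persons[target] |= persons.pop(source)  # old person row is deleted


capture("pageview", ID_1)
capture("pageview", ID_2)  # stamped with PERSON_B
alias(ID_1, ALIAS)         # PERSON_A now also owns 96129
alias(ID_2, ALIAS)         # ID_2 moves to PERSON_A, PERSON_B is deleted

# person_ids still referenced by events but with no person row left
orphaned = {e["person_id"] for e in events} - persons.keys()
print(orphaned, persons.get(PERSON_B, set()))
```

In this model the earlier event for ...FA92 still points at person_id ...97b, which no longer has any distinct ids attached, matching the "results with no distinct ids" symptom.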

Thank you for your bug report – we love squashing them!