Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix(insights): Fix timezone date issues #9678

Merged
merged 38 commits into from
May 20, 2022
Merged

Conversation

timgl
Copy link
Collaborator

@timgl timgl commented May 6, 2022

Problem

Lots of random assorted issues with timezones.

Changes

  • Fix loading persons if timezones is active
  • Fix loading same day for timezones
  • We now do not support date_to with a specific time in filters. We will always set date_to to the end of the day (23:59:59) OR to the current time if interval = hour. We don't set a specific time anywhere in the interface, so this shouldn't be an issue.
  • In general clean up how we move dates around. Most of that logic should now live in date_to and be consistent

👉 Stay up-to-date with PostHog coding conventions for a smoother review.

How did you test this code?

unit tests

self.assertEqual(
response[1]["days"],
[
"2019-12-21",
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We are changing the behaviour of compare here, but 2020-01-04 minus 14 days is 21 so this makes a lot more sense

Copy link
Contributor

@neilkakkar neilkakkar left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm a bit confused with how the time handling works now. Sanity check:

  1. Everything is stored in Clickhouse in UTC
  2. We convert datetimes we get into UTC before querying them in clickhouse
  3. .. Except when they have a time component, in which case we assume/coerce its always in UTC, and then don't try and convert these to whichever timezone was set.

Couldn't see it in the tests yet, but assuming the above is true, I'd expect to see timestamps in Clickhouse with starting and ending being something like: '2020-02-03 05:30:00' and '2020-02-05 21:59:59' instead of always starting at 00:00:00 and ending at 23:59:59, to adjust for the timezones 🤔

AND timestamp >= toDateTime('2020-01-02 00:00:00', 'UTC')
AND timestamp <= toDateTime('2020-01-02 23:59:59', 'UTC')
AND timestamp >= toDateTime('2020-01-02 00:00:00')
AND timestamp <= toDateTime('2020-01-03 00:00:00')
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

not quite sure why this is different from the rest now 🤔

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good catch! Fixed, slightly niche bug that only happens in tests

HAVING argMax(is_deleted, version) = 0) AS pdi ON e.distinct_id = pdi.distinct_id
WHERE team_id = 2
AND event IN ['step one', 'step three', 'step two']
AND timestamp >= toDateTime('2021-04-30 07:00:00')
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@neilkakkar I've added snapshots for the timezone specific tests where you can see the timezone wrangling in action. You don't see it in any other test as no other test uses a non-UTC timezone

@timgl
Copy link
Collaborator Author

timgl commented May 11, 2022

Everything is stored in Clickhouse in UTC
We convert datetimes we get into UTC before querying them in clickhouse
.. Except when they have a time component, in which case we assume/coerce its always in UTC, and then don't try and convert these to whichever timezone was set.

Yep that's right. The canonical best way of working with timezones is to use UTC everywhere, and only really use timezones for displaying to the user

Obviously any bucketing queries here are the exception as the bucketing changes depending on timezone

@timgl
Copy link
Collaborator Author

timgl commented May 18, 2022

@neilkakkar ready for re-review!

Copy link
Contributor

@neilkakkar neilkakkar left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this mostly makes sense!

It's a tricky PR though, and i'd appreciate @EDsCODE also taking a look for things I might've missed

Copy link
Member

@EDsCODE EDsCODE left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Some clarification on what's changing

ee/clickhouse/queries/trends/util.py Show resolved Hide resolved
posthog/queries/util.py Show resolved Hide resolved
@timgl timgl merged commit 967b50c into master May 20, 2022
@timgl timgl deleted the fix-timezone-date-issues branch May 20, 2022 10:17
fuziontech added a commit that referenced this pull request May 20, 2022
* master:
  chore: start stack once in cloud tests (#9879)
  feat(apps): frontend apps (#9831)
  chore: Fix snapshots on master (#9885)
  chore(apps): rename plugins to apps (#9755)
  refactor: Remove constance library dependency, use json-encoded model (#9852)
  chore(clickhouse): avoid creating kafka_events, events_mv (#9863)
  fix(insights): Fix timezone date issues (#9678)
  refactor(plugin-server): refactor the event pipeline (#9829)
  feat(object storage): add unused object storage (#9846)
  fix: make kafka health check timeout test reliable (#9857)
  fix: query elements from start of day (#9827)
alexkim205 pushed a commit that referenced this pull request May 23, 2022
* fix(insights): Fix timezone date issues

* fix formatting

* fix

* fix

* fix

* fix

* fix

* fix

* fix

* fix

* fix

* fix

* fix typescript

* Add snapshots for timezone tests

* Update snapshots

* fix with_data issue

* Update snapshots

* Update snapshots

* fix date_to and fix retention

* Fix retention

* Update snapshots

* Always set hourly intervals to 23:59:59, unless no date is given

* Update snapshots

* try always passing hourly through

* ifix

* fix

* fix mayb

* other fix

* fix

Co-authored-by: github-actions <41898282+github-actions[bot]@users.noreply.github.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants