[Feature Request] Start of week setting for date histogram aggregations #14816

Open
jjfalk opened this issue Jul 18, 2024 · 3 comments
Labels
enhancement (Enhancement or improvement to existing feature or request), Search:Aggregations

Comments

@jjfalk

jjfalk commented Jul 18, 2024

Is your feature request related to a problem? Please describe

Some of our date histogram aggregations require a calendar-aware weekly interval whose week starts on Sunday rather than on Monday, which OpenSearch always assumes. We tried an offset of "-1d" to work around this, but the offset ignores time zones. If the results span a weekend with a DST change, we end up with incorrect bucket keys.
Example:

  • Let's say we're in GMT+1 and over a given weekend this changes to GMT+2.
  • The OpenSearch start of week after that weekend is Monday 00:00:00, which is in GMT+2.
  • An offset of minus 1 day is applied as a fixed 24 hours, but that Sunday is only 23 hours long, so the shift lands back in GMT+1 and the bucket key we get is Saturday 23:00:00 instead of Sunday 00:00:00.
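
The arithmetic in this example can be reproduced outside OpenSearch. The sketch below is only an illustration in Python, not OpenSearch code; it assumes the zone is Europe/Berlin and the DST weekend is 30-31 March 2024, neither of which is stated in the report.

```python
from datetime import datetime, timedelta, timezone
from zoneinfo import ZoneInfo

# Assumed example zone: Europe/Berlin moved from GMT+1 to GMT+2 on Sunday 2024-03-31.
tz = ZoneInfo("Europe/Berlin")

# Monday 00:00:00 local time after the DST weekend (GMT+2).
monday = datetime(2024, 4, 1, 0, 0, tzinfo=tz)

# Fixed offset: subtract exactly 24 hours on the UTC timeline.
fixed_key = (monday.astimezone(timezone.utc) - timedelta(days=1)).astimezone(tz)

# Calendar-aware offset: step back one local calendar day (wall-clock arithmetic).
calendar_key = monday - timedelta(days=1)

print(fixed_key.isoformat())     # 2024-03-30T23:00:00+01:00 (Saturday)
print(calendar_key.isoformat())  # 2024-03-31T00:00:00+01:00 (Sunday)
```

The one-hour gap between the two keys is the hour skipped on the DST Sunday.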

Describe the solution you'd like

Either:

  1. A possibility to alter start of week / set it to Sunday,
  2. An option to use calendar and timezone aware offset.
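
To make option 1 concrete, here is a minimal sketch of what a Sunday-based, time-zone-aware week rounding could compute. `sunday_week_start` is a hypothetical helper written for this illustration; it is not an existing OpenSearch setting or API, and it assumes local midnight exists in the chosen zone.

```python
from datetime import datetime, time, timedelta, timezone
from zoneinfo import ZoneInfo


def sunday_week_start(instant: datetime, tz: ZoneInfo) -> datetime:
    """Return local midnight of the Sunday that starts the week containing `instant`."""
    local = instant.astimezone(tz)
    # weekday(): Monday=0 ... Sunday=6, so Sunday maps to 0 days back.
    days_since_sunday = (local.weekday() + 1) % 7
    start_date = local.date() - timedelta(days=days_since_sunday)
    # Rebuild midnight from the local date so the zone offset is resolved
    # for that date, rather than carried over from `instant`.
    return datetime.combine(start_date, time.min, tzinfo=tz)


berlin = ZoneInfo("Europe/Berlin")  # assumed example zone
doc_time = datetime(2024, 4, 3, 10, 0, tzinfo=timezone.utc)  # a Wednesday, stored in UTC
print(sunday_week_start(doc_time, berlin).isoformat())  # 2024-03-31T00:00:00+01:00
```

Because the bucket key is derived from the local date, it stays at Sunday 00:00:00 even when the week contains a DST change.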

Related component

Search:Aggregations

Describe alternatives you've considered

No response

Additional context

No response

@finnegancarroll
Contributor

Hi @jjfalk! If I understand correctly, the -1d offset is causing a timestamp that previously fell within DST to exit DST. The shifted timestamp then loses an additional hour, which ultimately changes the weekday and therefore the aggregation bucket? Is the data being ingested or queried with a specified time_zone?

@jjfalk
Author

jjfalk commented Aug 2, 2024

Hi @finnegancarroll, yes, your description is pretty much correct.
Data is ingested in UTC, while the aggregation request specifies the relevant time zone ID.

getsaurabh02 moved this from Todo to In Progress in Performance Roadmap Aug 19, 2024
@finnegancarroll
Contributor

finnegancarroll commented Sep 3, 2024

The core issue here seems to be that the offset field of a date_histogram is always fixed. That is, any offset is immediately converted into milliseconds and is eventually used as the offset for an OffsetRounding, which the date histogram aggregator uses to determine bucket boundaries.

This initial conversion of the offset to ms is a misstep, since not every day is the same length. I think the fix here, which will support the above feature, is to ingest the offset as a DateTimeUnit. Once we are able to dynamically look up interval lengths, we can support the 'non-fixed' day, week, month, and year offsets.
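
As a rough illustration of the difference between a fixed and a unit-aware offset, the Python sketch below keeps the unit of the offset string instead of collapsing it to milliseconds. `apply_offset` and its unit tables are hypothetical names for this example only; this is not how OffsetRounding or DateTimeUnit are implemented, and month and year offsets are left out for brevity.

```python
import re
from datetime import datetime, timedelta, timezone
from zoneinfo import ZoneInfo

FIXED_UNITS = {"s": "seconds", "m": "minutes", "h": "hours"}  # always the same length
CALENDAR_UNITS = {"d": 1, "w": 7}  # days per unit; real length depends on the calendar


def apply_offset(bucket_start: datetime, offset: str) -> datetime:
    """Shift a zone-aware bucket start by an offset such as '-1d' or '+6h'."""
    match = re.fullmatch(r"([+-]?)(\d+)([smhdw])", offset)
    if match is None:
        raise ValueError(f"unsupported offset: {offset!r}")
    sign, amount, unit = match.groups()
    n = int(amount) * (-1 if sign == "-" else 1)
    if unit in CALENDAR_UNITS:
        # Calendar-aware: wall-clock arithmetic in the bucket's own time zone.
        return bucket_start + timedelta(days=n * CALENDAR_UNITS[unit])
    # Fixed: exact duration on the UTC timeline, then back to local time.
    delta = timedelta(**{FIXED_UNITS[unit]: n})
    return (bucket_start.astimezone(timezone.utc) + delta).astimezone(bucket_start.tzinfo)


monday = datetime(2024, 4, 1, tzinfo=ZoneInfo("Europe/Berlin"))  # assumed example zone
print(apply_offset(monday, "-1d").isoformat())   # 2024-03-31T00:00:00+01:00
print(apply_offset(monday, "-24h").isoformat())  # 2024-03-30T23:00:00+01:00
```

With the unit preserved, "-1d" and "-24h" only diverge on days whose local length is not 24 hours, which is exactly the DST case reported above.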

mch2 moved this from In Progress to Todo in Performance Roadmap Sep 16, 2024
finnegancarroll moved this from Todo to Not In Plan in Performance Roadmap Nov 11, 2024
Projects
Status: Not In Plan
Status: 🆕 New
Development

No branches or pull requests

5 participants