Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

add index on started_at column in alert groups #2516

Merged
merged 6 commits into from
Jul 13, 2023

Conversation

joeyorlando
Copy link
Contributor

What this PR does

Adds an index on the started_at column in the alerts_alertgroup table. For the alert groups query used by the check_escalation_finished_task, this resulted in a huge performance boost, taking the query time from 89mins to 4secs (on our largest production dataset).

Which issue(s) this PR fixes

closes #724
closes https://github.com/grafana/oncall-private/issues/1713

Checklist

  • Unit, integration, and e2e (if applicable) tests updated
  • Documentation added (or pr:no public docs PR label added if not required)
  • CHANGELOG.md updated (or pr:no changelog PR label added if not required)

@joeyorlando joeyorlando requested a review from a team as a code owner July 13, 2023 08:56
@joeyorlando joeyorlando requested review from a team and Konstantinov-Innokentii July 13, 2023 08:56
Comment on lines -99 to -114
def get_auditable_alert_groups_started_at_range() -> typing.Tuple[datetime.datetime, datetime.datetime]:
"""
NOTE: this started_at__range is a bit of a hack..
we wanted to avoid performing a migration on the alerts_alertgroup table to update
alert groups where raw_escalation_snapshot was None. raw_escalation_snapshot being None is a legitimate case,
where the alert group's integration does not have an escalation chain associated with it.

However, we wanted a way to be able to differentiate between "actually None" and "there was an error writing to
raw_escalation_snapshot" (as this is performed async by a celery task).

This field was updated, in the commit that added this comment, to no longer be set to None by default.
As part of this celery task we do a check that this field is in fact not None, so if we were to check older
alert groups, whose integration did not have an escalation chain at the time the alert group was created
we would raise errors
"""
return (datetime.datetime(2023, 3, 25, tzinfo=pytz.UTC), timezone.now() - datetime.timedelta(days=2))
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this is no longer relevant and is safe to remove. We are only looking at alert groups for the past 2 days. It has been ~4 months since the release where raw_escalation_snapshot is no longer being set to None.

@joeyorlando joeyorlando merged commit 77f6ded into dev Jul 13, 2023
@joeyorlando joeyorlando deleted the jorlando/fix-check-escalation-finished-task branch July 13, 2023 09:23
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Investigate/fix check_escalation_finished_task
2 participants