Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix IntegrityErrors occuring when creating ResolutionNoteSlackMessage objects #2933

4 changes: 3 additions & 1 deletion CHANGELOG.md
Original file line number Diff line number Diff line change
Expand Up @@ -19,10 +19,12 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0
### Fixed

- Fix issue with helm chart when specifying `broker.type=rabbitmq` where Redis environment variables
were not longer being injected @joeyorlando ([#2927](https://github.com/grafana/oncall/pull/2927))
were not longer being injected by @joeyorlando ([#2927](https://github.com/grafana/oncall/pull/2927))
- Fix silence for alert groups with empty escalation chain @Ferril ([#2929](https://github.com/grafana/oncall/pull/2929))
- Fixed NPE when migrating legacy Grafana Alerting integrations
([#2908](https://github.com/grafana/oncall/issues/2908))
- Fix `IntegrityError` exceptions that occasionally would occur when trying to create `ResolutionNoteSlackMessage`
objects by @joeyorlando ([#2933](https://github.com/grafana/oncall/pull/2933))

## v1.3.29 (2023-08-29)

Expand Down
21 changes: 21 additions & 0 deletions engine/apps/alerts/migrations/0031_auto_20230831_1445.py
Original file line number Diff line number Diff line change
@@ -0,0 +1,21 @@
# Generated by Django 3.2.20 on 2023-08-31 14:45

from django.db import migrations, models


class Migration(migrations.Migration):

dependencies = [
('alerts', '0030_auto_20230731_0341'),
]

operations = [
migrations.AddIndex(
model_name='resolutionnoteslackmessage',
index=models.Index(fields=['ts', 'thread_ts', 'alert_group_id'], name='alerts_reso_ts_08f72c_idx'),
),
migrations.AddIndex(
model_name='resolutionnoteslackmessage',
index=models.Index(fields=['ts', 'thread_ts', 'slack_channel_id'], name='alerts_reso_ts_a9bdf7_idx'),
),
]
5 changes: 5 additions & 0 deletions engine/apps/alerts/models/resolution_note.py
Original file line number Diff line number Diff line change
Expand Up @@ -83,6 +83,11 @@ class ResolutionNoteSlackMessage(models.Model):
class Meta:
unique_together = ("thread_ts", "ts")

indexes = [
models.Index(fields=["ts", "thread_ts", "alert_group_id"]),
models.Index(fields=["ts", "thread_ts", "slack_channel_id"]),
]

Comment on lines +86 to +90
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

added indices to speed up querying these objects. These are the two filter patterns we use when querying these objects.

def get_resolution_note(self) -> typing.Optional["ResolutionNote"]:
try:
return self.resolution_note
Expand Down
54 changes: 20 additions & 34 deletions engine/apps/slack/scenarios/slack_channel_integration.py
Original file line number Diff line number Diff line change
Expand Up @@ -82,46 +82,32 @@ def save_thread_message_for_resolution_note(
if result["permalink"] is not None:
permalink = result["permalink"]

try:
slack_thread_message = ResolutionNoteSlackMessage.objects.get(
ts=message_ts,
thread_ts=thread_ts,
alert_group=alert_group,
)
if len(text) > 2900:
if slack_thread_message.added_to_resolution_note:
self._slack_client.api_call(
"chat.postEphemeral",
channel=channel,
user=slack_user_identity.slack_id,
text=":warning: Unable to update the <{}|message> in Resolution Note: the message is too long ({}). "
"Max length - 2900 symbols.".format(permalink, len(text)),
)
return
slack_thread_message.text = text
slack_thread_message.save()
slack_thread_message, created = ResolutionNoteSlackMessage.objects.get_or_create(
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it's better to check len(text) before creating ResolutionNoteSlackMessage. Too long text may cause issues either with saving it to the db or displaying it in Slack 🤔

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍

do you happen to know why there is this condition?:

if slack_thread_message.added_to_resolution_note:

if that can be removed, we can still use get_or_create

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

alternative solution is to keep the separate get/create calls as it was, and add another try/except when creating, and catch any possible IntegrityErrors (but this seems a bit messier imo)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ResolutionNoteSlackMessage is a message from alert group thread that potentially can be added to resolution notes.

It looks like we post warning message about too long text only either for messages that had already been added to resolution notes or for new messages in thread 🤔
If we post normal message and then edit it to make it long without adding to resolution notes, there won't be any warnings and the "Add Resolution notes" modal window in Slack will still have the old version of the message.
I think we can remove the check if slack_thread_message.added_to_resolution_note and post warning message in any case.

ts=message_ts,
thread_ts=thread_ts,
alert_group=alert_group,
defaults={
"user": self.user,
"added_by_user": self.user,
"text": text,
"slack_channel_id": channel,
"permalink": permalink,
},
)
Comment on lines +95 to +106
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the main issue here seemed to be Slack sending duplicate events to our API endpoint, shortly after one another, which led to this issue.

Example from logs (check trace IDs e3691dda80bd9e67476ceafb68c90a8f/2917201084877566781 and 978e7af66c11d0a680ce8175afdbda6f/15112916125666851927)

According to Django's docs, the solution to this is to use get_or_create:

Returns a tuple of (object, created), where object is the retrieved or created object and created is a boolean specifying whether a new object was created.

This is meant to prevent duplicate objects from being created when requests are made in parallel, and as a shortcut to boilerplatish code. For example:

try:
    obj = Person.objects.get(first_name="John", last_name="Lennon")
except Person.DoesNotExist:
    obj = Person(first_name="John", last_name="Lennon", birthday=date(1940, 10, 9))
    obj.save()

Here, with concurrent requests, multiple attempts to save a Person with the same parameters may be made. To avoid this race condition, the above example can be rewritten using get_or_create() like so:


except ResolutionNoteSlackMessage.DoesNotExist:
if len(text) > 2900:
if len(text) > 2900:
if created or slack_thread_message.added_to_resolution_note:
self._slack_client.api_call(
"chat.postEphemeral",
channel=channel,
user=slack_user_identity.slack_id,
text=":warning: The <{}|message> will not be displayed in Resolution Note: "
"the message is too long ({}). Max length - 2900 symbols.".format(permalink, len(text)),
text=":warning: Unable to update the <{}|message> in Resolution Note: the message is too long ({}). "
"Max length - 2900 symbols.".format(permalink, len(text)),
)
return

slack_thread_message = ResolutionNoteSlackMessage(
alert_group=alert_group,
user=self.user,
added_by_user=self.user,
text=text,
slack_channel_id=channel,
thread_ts=thread_ts,
ts=message_ts,
permalink=permalink,
)
return

if not created:
slack_thread_message.text = text
slack_thread_message.save()

def delete_thread_message_from_resolution_note(
Expand Down