Reduce alarms noise filtering out duplicate key errors #626

Open
4 tasks
jimleroyer opened this issue Apr 25, 2022 · 6 comments
Assignees
Labels
Blocked l Bloqué Dev Task for implementation of a technical solution Medium Priority | Priorité moyenne Support l Soutien maintenance and bugs while on call

Comments

@jimleroyer
Member

jimleroyer commented Apr 25, 2022

Description

We get too much noise in the #notification-ops channel from unique key violation errors. These are expected because we use standard SQS queues, which do not guarantee exactly-once delivery, so a message can occasionally be delivered more than once. Fine-tuning the alarm to ignore a small number of duplicates should therefore be OK, as long as it still raises an alarm when too many are reported in a short period of time (i.e. a few minutes).

As a person on support,
I need to have fewer alarms about inconsequential duplicate key errors
so that I can focus on higher priority alarms.

WHY are we building?

Less noise in the notification-ops channel and more focus on important alarms.

WHAT are we building?

A better alarm filter.

VALUE created by our solution

More focus and less distraction.

Acceptance Criteria (Definition of done)

  • Low volume unique key violation errors are no longer reported as an alarm.
  • Medium and high volume unique key violation errors are still reported as an alarm.
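
To make the criteria above concrete, here is a minimal sketch of a sliding-window filter that stays quiet for sparse duplicates and fires once they cluster. The class name, the threshold of 10 and the 5 minute window are all hypothetical placeholders, not agreed values, and this is not how the current alarms are implemented; the real fix would more likely live in the alarm or monitoring configuration.

```python
from collections import deque


class DuplicateKeyAlarm:
    """Hypothetical sliding-window filter: alarm only when duplicates cluster."""

    def __init__(self, threshold: int = 10, window_seconds: float = 300.0):
        # Both defaults are placeholder values, to be tuned against real traffic.
        self.threshold = threshold
        self.window_seconds = window_seconds
        self._timestamps = deque()

    def record(self, timestamp: float) -> bool:
        """Record one duplicate key error; return True if the alarm should fire."""
        self._timestamps.append(timestamp)
        # Forget errors that are older than the window.
        cutoff = timestamp - self.window_seconds
        while self._timestamps and self._timestamps[0] <= cutoff:
            self._timestamps.popleft()
        return len(self._timestamps) >= self.threshold
```

With `threshold=3` and `window_seconds=60`, three errors spread over several minutes never fire, while three errors within one minute do.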

QA Steps

  • Verify that sparse unique key violation errors are no longer reported as an alarm.
  • If possible, reproduce a high volume of unique key violation errors to trigger the alarm.

Additional context

Patrick wrote:

Yeah, it’s just a whole wack of the expected unique violation warnings:
(psycopg2.errors.UniqueViolation) duplicate key value violates unique constraint "notifications_pkey"
I think it would be a good idea to filter these out into their own alert category so we don’t get all the alarm messages for them:
cds-snc/notification-api#1458 (comment)
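
One possible way to route these into their own alert category, sketched under assumptions: the pattern is taken from the error text quoted above, while the function name and the category names `duplicate-key` and `default` are hypothetical and would need to match whatever the alerting setup actually uses.

```python
import re

# Pattern based on the psycopg2 error message quoted in this issue.
DUPLICATE_KEY_PATTERN = re.compile(
    r"psycopg2\.errors\.UniqueViolation.*"
    r"duplicate key value violates unique constraint"
)


def alert_category(log_message: str) -> str:
    """Return a hypothetical alert category name for a log message."""
    if DUPLICATE_KEY_PATTERN.search(log_message):
        return "duplicate-key"
    return "default"
```

Any message that does not match the pattern keeps going to the existing alarm, so only the expected duplicates are separated out.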

@jimleroyer jimleroyer added the Support l Soutien maintenance and bugs while on call label Apr 25, 2022
@yaelberger-commits
Collaborator

@jimleroyer Is this still happening, duplicate key errors?

@yaelberger-commits
Collaborator

@jimleroyer
Member Author

@yaelberger-commits This topic came back this week. Definitely still valid and stealing support focus.

@yaelberger-commits
Collaborator

Revisit this in January or February to evaluate if changes to New Relic did the job or if more effort is needed

@yaelberger-commits yaelberger-commits self-assigned this Dec 22, 2022
@yaelberger-commits yaelberger-commits added the Dev Task for implementation of a technical solution label Dec 23, 2022
@yaelberger-commits
Collaborator

@jimleroyer @jzbahrai Has this issue been resolved or are we still seeing too many duplicate key errors?

