Update UnhealthyHost Alarm #4233

srtalbot · 2024-08-29T12:58:45Z

If one host is down, send the error message to Slack.
If both hosts are down, send an alarm to Opsgenie.
Investigate why we didn't get an error message for the high memory usage that hit our cap.

patheard · 2024-09-10T20:10:12Z

We didn't get an alarm for high memory use because of how our alarm is configured:
https://github.com/cds-snc/forms-terraform/blob/27de887afba0ce8d80a8bceb3fe22048795781c2/aws/alarms/cloudwatch_app.tf#L26-L30

It is set up to trigger on Average memory use above 50%, and only if that average is sustained over 4 minutes:

2 evaluation periods * 120 second period = 240 seconds
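
To make the timing concrete, here is a rough Python sketch of how an alarm like this is evaluated. It is not the CloudWatch implementation. It assumes every one of the evaluation periods must breach the threshold before the alarm fires, and `alarm_fires` and the sample datapoints are made up for illustration.

```python
PERIOD_SECONDS = 120        # length of one period, from the alarm config
EVALUATION_PERIODS = 2      # number of consecutive periods evaluated
THRESHOLD_PERCENT = 50.0    # alarm threshold on memory use

# Total time the breach has to be sustained: 2 * 120 = 240 seconds (4 minutes).
window_seconds = PERIOD_SECONDS * EVALUATION_PERIODS


def alarm_fires(datapoints, threshold, evaluation_periods):
    """Return True if the most recent `evaluation_periods` datapoints
    are all above `threshold` (one datapoint per period)."""
    if len(datapoints) < evaluation_periods:
        return False
    recent = datapoints[-evaluation_periods:]
    return all(value > threshold for value in recent)


# Hypothetical per-period values: one breaching period is not enough.
one_breach = alarm_fires([35.0, 60.0, 40.0], THRESHOLD_PERCENT, EVALUATION_PERIODS)
two_breaches = alarm_fires([35.0, 60.0, 55.0], THRESHOLD_PERCENT, EVALUATION_PERIODS)
```

With these made-up values, `one_breach` is `False` and `two_breaches` is `True`: a single period above the threshold does not trigger anything.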

Looking at the period when we had expected this alarm to trigger, Average memory use didn't get above 40%, but Maximum memory use hit 100%.

I'd recommend we switch the alarm to Maximum memory use to catch this in the future.
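
A small Python sketch of why the statistic matters. The samples below are invented for illustration (they are not our real metrics), and `period_statistic` is a hypothetical helper, not a CloudWatch API. A short spike to 100% barely moves the average for the period, while the maximum reflects it directly.

```python
from statistics import mean

THRESHOLD_PERCENT = 50.0


def period_statistic(samples, statistic):
    """Collapse the raw samples of one period into a single datapoint."""
    if statistic == "Average":
        return mean(samples)
    if statistic == "Maximum":
        return max(samples)
    raise ValueError(f"unsupported statistic: {statistic}")


# Hypothetical memory-use samples (percent) within a single period:
# mostly low, with one brief spike that hits the cap.
samples = [20.0, 25.0, 100.0, 25.0, 20.0]

average_value = period_statistic(samples, "Average")   # 38.0
maximum_value = period_statistic(samples, "Maximum")   # 100.0

average_breaches = average_value > THRESHOLD_PERCENT   # False
maximum_breaches = maximum_value > THRESHOLD_PERCENT   # True
```

The trade-off is that Maximum is more sensitive, so a single brief spike in each evaluated period is enough to breach the threshold.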

patheard · 2024-09-11T15:45:39Z

Alarms have all been updated.

srtalbot added the Core label Aug 29, 2024

srtalbot assigned patheard Sep 9, 2024

This was referenced Sep 10, 2024

fix: LB alarms trigger when no healthy hosts cds-snc/forms-terraform#817

Closed

fix: add HealthyHostCount alarms to App, IdP, API cds-snc/forms-terraform#818

Merged

fix: remove OK actions from critical alarms cds-snc/forms-terraform#819

Merged

patheard mentioned this issue Sep 10, 2024

fix: use maximum CPU/memory stat for alarms cds-snc/forms-terraform#820

Merged

patheard closed this as completed Sep 11, 2024
