Commit

[terraform] sets up cpu and memory and utilization alarms for blob service

Summary: This sets up CPU and memory utilization alarms for the blob service. Each alarm fires when average utilization reaches 90% over a single 60-second period.

Test Plan:
Will trigger high utilization by creating a blob service clone on staging, reducing its instance size, and overloading it with requests.

Per Bartek's advice, we might need to lower the thresholds to something like 25% and use `blob_performance_tests`.
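
One way to make a temporary lower threshold easy to apply on the staging clone is to read it from a variable. This is only a sketch: the variable `blob_utilization_alarm_threshold` is hypothetical and not part of this commit, and only the CPU alarm is shown.

```hcl
# Hypothetical variable; not part of this commit.
variable "blob_utilization_alarm_threshold" {
  description = "Utilization percentage at which the blob service alarms fire"
  type        = number
  default     = 90
}

resource "aws_cloudwatch_metric_alarm" "blob_cpu_utilization" {
  # The threshold is interpolated into the name so it stays accurate when overridden.
  alarm_name          = "ecs-cpu-utilization-${var.blob_utilization_alarm_threshold}"
  comparison_operator = "GreaterThanOrEqualToThreshold"
  evaluation_periods  = 1
  metric_name         = "CPUUtilization"
  namespace           = "AWS/ECS"
  period              = 60
  statistic           = "Average"
  threshold           = var.blob_utilization_alarm_threshold
  alarm_description   = "Alarm when Blob service CPU utilization reaches ${var.blob_utilization_alarm_threshold}%"
  dimensions = {
    ClusterName = aws_ecs_cluster.comm_services.name
    ServiceName = aws_ecs_service.blob_service.name
  }
  alarm_actions = [aws_sns_topic.blob_error_topic.arn]
}
```

With this in place, a test run could pass `-var="blob_utilization_alarm_threshold=25"` to `terraform apply`, and the default of 90 would apply otherwise.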

Reviewers: bartek, varun

Reviewed By: bartek

Subscribers: ashoat, tomek

Differential Revision: https://phab.comm.dev/D13429
wyilio committed Sep 24, 2024
1 parent ed94863 commit 52da93d
Showing 1 changed file with 34 additions and 0 deletions.
34 changes: 34 additions & 0 deletions services/terraform/remote/aws_cloudwatch_alarms.tf
@@ -248,3 +248,37 @@ resource "aws_cloudwatch_metric_alarm" "blob_error_alarms" {
  alarm_actions = [aws_sns_topic.blob_error_topic.arn]
}

resource "aws_cloudwatch_metric_alarm" "blob_memory_utilization" {
  alarm_name = "ecs-memory-utilization-90"
  comparison_operator = "GreaterThanOrEqualToThreshold"
  evaluation_periods = 1
  metric_name = "MemoryUtilization"
  namespace = "AWS/ECS"
  period = 60
  statistic = "Average"
  threshold = 90
  alarm_description = "Alarm when Blob service memory utilization exceeds 90%"
  dimensions = {
    ClusterName = aws_ecs_cluster.comm_services.name
    ServiceName = aws_ecs_service.blob_service.name
  }
  alarm_actions = [aws_sns_topic.blob_error_topic.arn]
}


resource "aws_cloudwatch_metric_alarm" "blob_cpu_utilization" {
  alarm_name = "ecs-cpu-utilization-90"
  comparison_operator = "GreaterThanOrEqualToThreshold"
  evaluation_periods = 1
  metric_name = "CPUUtilization"
  namespace = "AWS/ECS"
  period = 60
  statistic = "Average"
  threshold = 90
  alarm_description = "Alarm when Blob service CPU utilization exceeds 90%"
  dimensions = {
    ClusterName = aws_ecs_cluster.comm_services.name
    ServiceName = aws_ecs_service.blob_service.name
  }
  alarm_actions = [aws_sns_topic.blob_error_topic.arn]
}
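
The memory and CPU alarms differ only in metric name, alarm name, and description, so they could in principle be written as a single resource with `for_each`. The following is a sketch of that alternative, not what this commit does: the local `blob_utilization_metrics` and the resource name `blob_utilization` are hypothetical, and the cluster, service, and SNS topic references are assumed to be the same ones used in the diff.

```hcl
locals {
  # Hypothetical map: alarm-name slug => CloudWatch metric name and display label.
  blob_utilization_metrics = {
    memory = { metric = "MemoryUtilization", label = "memory" }
    cpu    = { metric = "CPUUtilization", label = "CPU" }
  }
}

# Creates one alarm per entry in the map above.
resource "aws_cloudwatch_metric_alarm" "blob_utilization" {
  for_each = local.blob_utilization_metrics

  alarm_name          = "ecs-${each.key}-utilization-90"
  comparison_operator = "GreaterThanOrEqualToThreshold"
  evaluation_periods  = 1
  metric_name         = each.value.metric
  namespace           = "AWS/ECS"
  period              = 60
  statistic           = "Average"
  threshold           = 90
  alarm_description   = "Alarm when Blob service ${each.value.label} utilization reaches 90%"
  dimensions = {
    ClusterName = aws_ecs_cluster.comm_services.name
    ServiceName = aws_ecs_service.blob_service.name
  }
  alarm_actions = [aws_sns_topic.blob_error_topic.arn]
}
```

The trade-off is that the resource addresses change (for example to `aws_cloudwatch_metric_alarm.blob_utilization["cpu"]`), so adopting this form after the alarms already exist would require moving state or recreating them.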
