Observability expanded with Slack delivery, MSK, ECS, and RDS coverage

· One min read
Atlas Infra

Atlas observability now includes a broader CloudWatch monitoring layer across the shared runtime stack.

This update adds:

  • a dedicated Slack delivery path for CloudWatch alarms using a separate SNS topic and notifier Lambda
  • expanded MSK alarms for under-replicated partitions, broker disk usage, native CPU, and low free memory
  • per-service ECS alarms for CPU, memory, running-versus-desired task health, and unhealthy ALB targets where applicable
  • RDS alarms for low free storage, CPU utilization, and DBLoad
  • a CloudWatch dashboard per root with organized sections for alarm status, MSK, ECS, and RDS runtime signals
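As an illustration of the Slack delivery path in the first item, here is a minimal sketch of what a notifier Lambda subscribed to the alarm SNS topic might look like. It is not the Atlas implementation: the `SLACK_WEBHOOK_URL` environment variable, the function names, and the message layout are assumptions made for the example. The `AlarmName`, `NewStateValue`, and `NewStateReason` fields are the ones CloudWatch includes in the JSON body of an alarm notification.

```python
import json
import os
import urllib.request

# Hypothetical mapping from alarm state to a Slack emoji; purely cosmetic.
STATE_ICON = {
    "ALARM": ":red_circle:",
    "OK": ":large_green_circle:",
    "INSUFFICIENT_DATA": ":white_circle:",
}


def build_slack_payload(sns_record):
    """Turn one SNS record carrying a CloudWatch alarm into a Slack message body."""
    alarm = json.loads(sns_record["Sns"]["Message"])
    state = alarm.get("NewStateValue", "UNKNOWN")
    icon = STATE_ICON.get(state, ":grey_question:")
    name = alarm.get("AlarmName", "unknown alarm")
    reason = alarm.get("NewStateReason", "")
    return {"text": f"{icon} *{name}* is {state}\n{reason}"}


def handler(event, context):
    """Lambda entry point: post each alarm notification to a Slack incoming webhook."""
    # Assumption: the webhook URL is injected through an environment variable.
    webhook_url = os.environ["SLACK_WEBHOOK_URL"]
    records = event.get("Records", [])
    for record in records:
        body = json.dumps(build_slack_payload(record)).encode("utf-8")
        request = urllib.request.Request(
            webhook_url,
            data=body,
            headers={"Content-Type": "application/json"},
        )
        with urllib.request.urlopen(request, timeout=5) as response:
            response.read()
    return {"delivered": len(records)}
```

Keeping the formatting in `build_slack_payload` separate from the HTTP call makes the message shape testable without a network connection or a real webhook.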

The handbook was updated to reflect the current operating model, including the optional New Relic AWS pull integration already wired in staging.

Slack delivery remains isolated from the earlier environment-operations DevOps Agent test wiring. Production keeps the same observability shape available, and the committed values do not force the Slack path to be enabled.
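One way to picture an opt-in Slack path is a flag that decides whether the Slack topic is attached to an alarm's actions. The sketch below is only an assumption about how such a toggle could work; the function name, parameters, and ARNs are hypothetical and are not taken from the Atlas configuration.

```python
def resolve_alarm_actions(base_topic_arns, slack_topic_arn=None, slack_enabled=False):
    """Return the SNS topic ARNs an alarm should notify.

    The Slack topic is appended only when the (hypothetical) flag is on and
    an ARN is actually configured, so the default output is unchanged.
    """
    actions = list(base_topic_arns)
    if slack_enabled and slack_topic_arn:
        actions.append(slack_topic_arn)
    return actions


# Example with placeholder ARNs: Slack stays off unless explicitly enabled.
ops_topic = "arn:aws:sns:us-east-1:123456789012:ops-alarms"
slack_topic = "arn:aws:sns:us-east-1:123456789012:slack-alarms"
default_actions = resolve_alarm_actions([ops_topic], slack_topic)
enabled_actions = resolve_alarm_actions([ops_topic], slack_topic, slack_enabled=True)
```

With the flag defaulting to off, the same alarm definitions can be deployed everywhere and an environment only gains Slack notifications when its values turn the flag on.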