Observability expanded with Slack delivery, MSK, ECS, and RDS coverage
Atlas observability now includes a broader CloudWatch monitoring layer across the shared runtime stack.
This update adds:
- a dedicated Slack delivery path for CloudWatch alarms using a separate SNS topic and notifier Lambda
- expanded MSK alarms for under-replicated partitions, broker disk usage, native CPU, and low free memory
- per-service ECS alarms for CPU, memory, running-versus-desired task health, and unhealthy ALB targets where applicable
- RDS alarms for low free storage, CPU utilization, and DBLoad
- a CloudWatch dashboard per root with organized sections for alarm status, MSK, ECS, and RDS runtime signals
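As a rough illustration of the RDS alarm set described above, the sketch below builds alarm definitions as plain dictionaries whose keys match the parameters of boto3's `put_metric_alarm`. The function name, alarm naming scheme, thresholds, periods, and statistics are all assumptions made for the example, not the values Atlas actually uses.

```python
from typing import Any, Dict, List


def rds_alarm_definitions(db_instance_id: str, sns_topic_arn: str) -> List[Dict[str, Any]]:
    """Build illustrative CloudWatch alarm definitions for one RDS instance.

    Every threshold below is an assumed placeholder, not a recommendation.
    """
    # (alarm suffix, metric name, comparison operator, threshold)
    specs = [
        # Assumed: alarm when free storage drops below 10 GiB (metric is in bytes).
        ("low-free-storage", "FreeStorageSpace", "LessThanThreshold", 10 * 1024**3),
        # Assumed: alarm when average CPU exceeds 80 percent.
        ("high-cpu", "CPUUtilization", "GreaterThanThreshold", 80.0),
        # Assumed: alarm when average DB load exceeds 4 active sessions.
        ("high-db-load", "DBLoad", "GreaterThanThreshold", 4.0),
    ]
    definitions = []
    for suffix, metric, operator, threshold in specs:
        definitions.append(
            {
                "AlarmName": f"{db_instance_id}-{suffix}",  # hypothetical naming scheme
                "Namespace": "AWS/RDS",
                "MetricName": metric,
                "Dimensions": [{"Name": "DBInstanceIdentifier", "Value": db_instance_id}],
                "Statistic": "Average",
                "Period": 300,  # seconds; assumed
                "EvaluationPeriods": 3,  # assumed
                "Threshold": threshold,
                "ComparisonOperator": operator,
                "AlarmActions": [sns_topic_arn],
            }
        )
    return definitions
```

Each dictionary could then be passed to a CloudWatch client as `cloudwatch.put_metric_alarm(**definition)`. Keeping the definitions as data makes it straightforward to point `AlarmActions` at a different SNS topic per environment.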
The handbook was updated to reflect the current operating model, including the optional New Relic AWS pull integration already wired in staging.
Slack delivery remains isolated from the earlier environment-operations DevOps Agent test wiring. Production keeps the same observability shape available, but committed values do not force the Slack path to be enabled.
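To make the optional Slack path more concrete, here is a minimal sketch of what a notifier Lambda subscribed to the alarm SNS topic might look like. It is not the actual Atlas notifier: the `SLACK_WEBHOOK_URL` environment variable, the message format, and the return shape are assumptions for illustration. When the variable is unset, the handler skips delivery, which mirrors the idea of keeping the Slack path disabled unless it is explicitly configured.

```python
import json
import os
import urllib.request


def format_alarm_message(sns_message: str) -> dict:
    """Turn the JSON body of a CloudWatch alarm notification into a Slack payload."""
    alarm = json.loads(sns_message)
    name = alarm.get("AlarmName", "unknown alarm")
    state = alarm.get("NewStateValue", "UNKNOWN")
    reason = alarm.get("NewStateReason", "")
    return {"text": f"[{state}] {name}: {reason}"}


def handler(event, context):
    """Lambda entry point for SNS-delivered CloudWatch alarm notifications."""
    # Hypothetical variable name; leaving it unset disables Slack delivery.
    webhook_url = os.environ.get("SLACK_WEBHOOK_URL")
    records = event.get("Records", [])
    if not webhook_url:
        return {"delivered": 0, "skipped": len(records)}

    delivered = 0
    for record in records:
        payload = format_alarm_message(record["Sns"]["Message"])
        request = urllib.request.Request(
            webhook_url,
            data=json.dumps(payload).encode("utf-8"),
            headers={"Content-Type": "application/json"},
            method="POST",
        )
        # Post to the Slack incoming webhook; errors propagate so Lambda can retry.
        with urllib.request.urlopen(request, timeout=5) as response:
            response.read()
        delivered += 1
    return {"delivered": delivered, "skipped": 0}
```

Separating the formatting step from the HTTP call keeps the message shape testable without network access.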