
Manage secrets

Terraform creates the Secrets Manager resources with safe placeholder values and then intentionally ignores later changes to the secret JSON, so operator edits are not reverted on the next apply. Operators are expected to populate the real values out-of-band after terraform apply.

For terraform/staging2, the committed secret names carry a 2 suffix: poc-atlas-dev/app-config2, poc-atlas-dev/dashboard-backend2, poc-atlas-dev/scoring2, and poc-atlas-dev/camunda2.
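
As a rough sketch, the naming rule can be expressed as a small shell helper. The function name secret_names is hypothetical, and the unsuffixed base names for the other roots are an assumption inferred from the staging2 names above; the Terraform outputs remain the authoritative source.

```shell
# Hypothetical helper: print the four secret names for a given suffix.
# An empty suffix is assumed to give the unsuffixed names; "2" gives the
# terraform/staging2 names listed above.
secret_names() {
  for _base in app-config dashboard-backend scoring camunda; do
    printf 'poc-atlas-dev/%s%s\n' "$_base" "$1"
  done
}

secret_names 2
```

Each printed name can then be passed to aws secretsmanager describe-secret --secret-id to confirm that the secret exists before you update it.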

Events ingestion secret

The events ingestion task definition reads its full runtime config from one JSON secret. The current template in terraform/staging/locals.tf includes keys for:

  • HTTP server settings
  • MSK brokers and topic names
  • IAM + TLS Kafka client configuration
  • buffer and replication toggles used by the current task definition
  • log level
  • the NEW_RELIC_* contract expected by the application image and reused by the optional newrelic-infra sidecar

Example update command:

aws secretsmanager put-secret-value \
  --secret-id "poc-atlas-dev/app-config" \
  --secret-string '{
    "APP_NAME": "atlas-events",
    "APP_VERSION": "0.1.0",
    "APP_ENVIRONMENT": "development",
    "HTTP_PORT": "8080",
    "HTTP_READ_TIMEOUT": "5s",
    "HTTP_WRITE_TIMEOUT": "10s",
    "HTTP_SHUTDOWN_TIMEOUT": "30s",
    "KAFKA_BROKERS": "b-1.example.kafka.us-east-1.amazonaws.com:9098,b-2.example.kafka.us-east-1.amazonaws.com:9098",
    "KAFKA_EVENTS_TOPIC": "atlas.events.raw",
    "KAFKA_TOPIC": "atlas.events.raw",
    "KAFKA_DLQ_TOPIC": "atlas.events.dlq",
    "KAFKA_PRODUCER_MAX_RETRIES": "3",
    "KAFKA_REQUIRED_ACKS": "-1",
    "KAFKA_DIAL_TIMEOUT": "10s",
    "KAFKA_WRITE_TIMEOUT": "10s",
    "KAFKA_BATCH_SIZE": "100",
    "KAFKA_BATCH_TIMEOUT": "1ms",
    "KAFKA_TLS_ENABLED": "true",
    "KAFKA_TLS_INSECURE_SKIP_VERIFY": "false",
    "KAFKA_SASL_MECHANISM": "AWS_MSK_IAM",
    "KAFKA_AWS_REGION": "us-east-1",
    "KAFKA_BUFFER_ENABLED": "false",
    "KAFKA_REPLICATION_FACTOR": "1",
    "LOG_LEVEL": "info",
    "NEW_RELIC_ENABLED": "true",
    "NEW_RELIC_APP_NAME": "atlas-events",
    "NEW_RELIC_LICENSE_KEY": "CHANGE_ME",
    "NEW_RELIC_DISTRIBUTED_TRACING_ENABLED": "true",
    "NEW_RELIC_WAIT_FOR_CONNECTION": "false",
    "NEW_RELIC_CONNECTION_TIMEOUT": "15s",
    "NEW_RELIC_AI_MONITORING_ENABLED": "false",
    "NEW_RELIC_AI_MONITORING_STREAMING_ENABLED": "true",
    "NEW_RELIC_AI_MONITORING_RECORD_CONTENT_ENABLED": "true",
    "NEW_RELIC_CUSTOM_INSIGHTS_EVENTS_MAX_SAMPLES_STORED": "10000"
  }'
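
After an update like the one above, it can be useful to confirm that no placeholder values are left. The following is a minimal sketch, not part of the repository: placeholder_keys is a hypothetical helper, and it assumes placeholders are the literal string CHANGE_ME, as in the example. It is a text heuristic for flat string values rather than a JSON parser.

```shell
# Hypothetical helper: read a secret JSON document on stdin and print the
# name of every key whose value is still the literal placeholder CHANGE_ME.
# Heuristic: splits on commas, so it handles both compact and
# pretty-printed JSON as long as the values are flat strings.
placeholder_keys() {
  tr ',' '\n' |
    sed -n 's/^[[:space:]{]*"\([^"]*\)"[[:space:]]*:[[:space:]]*"CHANGE_ME".*/\1/p'
}

# Usage sketch against the live secret (requires AWS credentials):
#   aws secretsmanager get-secret-value \
#     --secret-id "poc-atlas-dev/app-config" \
#     --query SecretString --output text | placeholder_keys
```

An empty result suggests every placeholder in the document has been replaced.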

When events_enable_newrelic_sidecar = true, the sidecar's NRIA_LICENSE_KEY is populated from the NEW_RELIC_LICENSE_KEY key in the same secret document, so the sidecar does not need a dedicated Secrets Manager secret.

Dashboard backend secret

The dashboard backend uses a dedicated secret with database, Auth0, session, CORS, and web app settings.

Useful outputs for building the real DATABASE_URL and for populating the other service secrets:

terraform output -raw dashboard_backend_secret_name
terraform output -raw dashboard_db_endpoint
terraform output -raw dashboard_db_port
terraform output -raw dashboard_db_name
terraform output dashboard_db_master_user_secret_arn
terraform output -raw scoring_secret_name
terraform output -raw camunda_secret_name
terraform output -raw camunda_db_endpoint
terraform output -raw camunda_db_port
terraform output -raw camunda_db_name
terraform output camunda_db_master_user_secret_arn
terraform output -raw msk_bootstrap_brokers_tls
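
As an illustration, the dashboard outputs above could be assembled into a connection string as follows. This is a sketch under assumptions: build_database_url is a hypothetical helper, the postgres:// URL shape is assumed rather than confirmed for the dashboard backend, and the username and password are assumed to come from the RDS master user secret. A password containing reserved URL characters would need percent-encoding first.

```shell
# Hypothetical helper: build a PostgreSQL-style URL from its parts.
# Args: user, password, host, port, database name.
build_database_url() {
  # Strip a trailing ":port" in case the endpoint output already includes it.
  printf 'postgres://%s:%s@%s:%s/%s\n' "$1" "$2" "${3%%:*}" "$4" "$5"
}

# Usage sketch (run from the Terraform root; DB_USER and DB_PASSWORD are
# placeholders for values read from the master user secret):
#   build_database_url "$DB_USER" "$DB_PASSWORD" \
#     "$(terraform output -raw dashboard_db_endpoint)" \
#     "$(terraform output -raw dashboard_db_port)" \
#     "$(terraform output -raw dashboard_db_name)"
```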

Important dashboard keys defined in locals.dashboard_secret_template include:

  • DATABASE_URL
  • WEB_APP_URL
  • AUTH0_DOMAIN
  • AUTH0_CLIENT_ID
  • AUTH0_CLIENT_SECRET
  • AUTH0_MANAGEMENT_CLIENT_ID
  • AUTH0_MANAGEMENT_CLIENT_SECRET
  • AUTH_SESSION_SECRET

The committed dashboard task definition key set is intentionally frozen. You can add extra runtime keys manually in AWS Secrets Manager for the live service without expecting Terraform to continuously reconcile those keys back into the ECS task definition.
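
One way to sanity-check a hand-edited dashboard secret is to confirm that the important keys are still present while tolerating any extra runtime keys. The sketch below is illustrative only: missing_keys is a hypothetical helper, and it uses a plain text match rather than a JSON parser.

```shell
# Hypothetical helper: read a secret JSON document on stdin and print each
# key named in the arguments that does not appear in it. Returns non-zero
# if any key is missing. Extra keys in the document are ignored.
missing_keys() {
  _doc=$(cat)
  _status=0
  for _key in "$@"; do
    # The surrounding quotes keep AUTH0_CLIENT_ID from matching inside
    # AUTH0_MANAGEMENT_CLIENT_ID.
    if ! printf '%s\n' "$_doc" | grep -q "\"$_key\"[[:space:]]*:"; then
      printf '%s\n' "$_key"
      _status=1
    fi
  done
  return "$_status"
}

# Usage sketch (requires AWS credentials and the Terraform root):
#   aws secretsmanager get-secret-value \
#     --secret-id "$(terraform output -raw dashboard_backend_secret_name)" \
#     --query SecretString --output text |
#     missing_keys DATABASE_URL WEB_APP_URL AUTH0_DOMAIN AUTH0_CLIENT_ID \
#       AUTH0_CLIENT_SECRET AUTH0_MANAGEMENT_CLIENT_ID \
#       AUTH0_MANAGEMENT_CLIENT_SECRET AUTH_SESSION_SECRET
```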

Scoring secret

The scoring service uses a dedicated secret with HTTP, Kafka, Camunda, logging, and score-weight settings.

Important scoring keys defined in locals.scoring_secret_template include the following (Kafka is used as a producer only):

  • CAMUNDA_BASE_URL
  • KAFKA_BROKERS
  • KAFKA_OUTPUT_TOPIC
  • KAFKA_DLQ_TOPIC
  • SCORE_WEIGHT_FINANCIAL
  • SCORE_WEIGHT_AML
  • SCORE_WEIGHT_BEHAVIOR

The same “base contract only” rule applies here: Terraform owns the secret resource and the bootstrap key set, but operators can keep richer scoring runtime values in AWS without forcing Terraform to roll a new ECS task definition.

If the scoring or dashboard services should consume Valkey, read the connection details from:

  • terraform output elasticache_valkey_primary_endpoint
  • terraform output elasticache_valkey_port

Then add the application-specific cache keys manually in the relevant secret JSON. Terraform owns the secret resource and its initial placeholder shape, but not the runtime cache contract used by each application repository.

Camunda secret

Camunda uses a dedicated secret for its PostgreSQL runtime wiring.

Important Camunda keys defined in locals.camunda_secret_template include:

  • DB_DRIVER
  • DB_URL
  • DB_USERNAME
  • DB_PASSWORD

On an existing environment, Terraform reads the current password from the RDS-managed master secret and applies that same value as the static RDS master password for the Camunda instance. After the apply, keep DB_PASSWORD in the Camunda runtime secret aligned with that preserved value.

For staging2, Camunda is created directly with a static master password, so any out-of-band rotation or reset of the RDS password must be followed by a matching update to poc-atlas-dev/camunda2.

A brand-new environment that starts from the managed-secret flow has no existing managed secret to reuse yet, so it still needs a first apply that creates the database before any static-password migration.
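
For reference, a PostgreSQL JDBC URL has the form jdbc:postgresql://host:port/database. The sketch below assembles a DB_URL value from the Camunda Terraform outputs; build_camunda_db_url is a hypothetical helper, and pairing it with DB_DRIVER set to org.postgresql.Driver is an assumption based on the standard PostgreSQL JDBC driver class, not something confirmed from locals.camunda_secret_template.

```shell
# Hypothetical helper: build a PostgreSQL JDBC URL.
# Args: host, port, database name.
build_camunda_db_url() {
  # Strip a trailing ":port" in case the endpoint output already includes it.
  printf 'jdbc:postgresql://%s:%s/%s\n' "${1%%:*}" "$2" "$3"
}

# Usage sketch (run from the Terraform root):
#   build_camunda_db_url \
#     "$(terraform output -raw camunda_db_endpoint)" \
#     "$(terraform output -raw camunda_db_port)" \
#     "$(terraform output -raw camunda_db_name)"
```

The credentials are deliberately kept out of the URL here, since the secret carries them separately as DB_USERNAME and DB_PASSWORD.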

Guardrails

What Terraform owns

Terraform owns the existence of the secret resource and the initial placeholder document shape.

What operators own

Operators own the real runtime values after apply. Those values should not be committed to Git and should not be encoded directly into Terraform variables.

What to verify after rotation

Confirm the affected ECS service can still start, read the secret, and pass the ALB target-group health check.
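
A possible way to script this check with the AWS CLI is sketched below. The function name verify_after_rotation and its three arguments (cluster name, service name, target group ARN) are hypothetical, and the sketch assumes that the service picks up new secret values only when its tasks restart, which is why it forces a new deployment first.

```shell
# Hypothetical helper. Args: cluster name, service name, target group ARN.
verify_after_rotation() {
  # Start fresh tasks so they read the current secret value.
  aws ecs update-service --cluster "$1" --service "$2" \
    --force-new-deployment >/dev/null || return 1

  # Block until the service reports a steady state (or the waiter times out).
  aws ecs wait services-stable --cluster "$1" --services "$2" || return 1

  # Count the targets that the ALB currently considers healthy.
  _healthy=$(aws elbv2 describe-target-health \
    --target-group-arn "$3" \
    --query "length(TargetHealthDescriptions[?TargetHealth.State=='healthy'])" \
    --output text) || return 1

  # Succeed only if at least one target is healthy.
  [ "$_healthy" -gt 0 ]
}
```

A non-zero return means one of the steps failed or no healthy target was found, in which case the task logs are usually the next place to look.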

Warning

The secret names and ARNs are environment-specific. Always query the active root outputs before updating production values.