Deploy environment

Use this flow when bringing up or updating an environment from this repository.

Prerequisites

Terraform >= 1.14.7, < 1.15.0
AWS CLI configured for the target account
Docker or CI pipelines capable of pushing application images to ECR
Remote state backend already bootstrapped (see Step 0 if it is not)
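
The Terraform version constraint above can be checked before any plan is run. The following is a minimal sketch, not part of this repository: the function name terraform_version_ok is hypothetical, and it assumes a sort that supports -V (GNU coreutils or a recent BSD sort).

```shell
#!/usr/bin/env bash
# Hypothetical helper: returns 0 when $1 satisfies ">= 1.14.7, < 1.15.0".
terraform_version_ok() {
  local v="$1" min="1.14.7" max="1.15.0"
  # v >= min: the smaller of (min, v) in version order must be min.
  [ "$(printf '%s\n%s\n' "$min" "$v" | sort -V | head -n1)" = "$min" ] || return 1
  # v < max: v must differ from max and sort before it.
  [ "$v" != "$max" ] || return 1
  [ "$(printf '%s\n%s\n' "$v" "$max" | sort -V | head -n1)" = "$v" ] || return 1
}

# Usage (requires terraform on PATH; reads the version from the JSON output):
#   v="$(terraform version -json | sed -n 's/.*"terraform_version": *"\([^"]*\)".*/\1/p')"
#   terraform_version_ok "$v" || echo "unsupported Terraform version: $v" >&2
#   aws sts get-caller-identity   # confirms which account the AWS CLI targets
```

Pre-release version strings are not handled by this sketch.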

Step 0: bootstrap backend if needed

If the backend does not exist yet, run:

export AWS_REGION=us-east-1
bash scripts/bootstrap-state.sh

Step 1: create the application ECR repositories

Create the ECR repositories first so the service repositories' pipelines can push images before the ECS services try to start. terraform/staging2 reuses these same staging repositories and does not create separate *2 ECR repositories:

cd terraform/staging
terraform apply \
  -target=module.events_ecr \
  -target=module.dashboard_ecr \
  -target=module.scoring_ecr

Expected images:

Events ingestion API image in module.events_ecr
Dashboard backend image in module.dashboard_ecr
Scoring image in module.scoring_ecr
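
Once the repositories exist, an image can be pushed by hand if no CI pipeline is available yet. This is a rough sketch under assumptions: the function names, the repository name argument, and the tag are hypothetical, and the real repository names come from the ECR modules' outputs.

```shell
#!/usr/bin/env bash
# Builds the registry hostname for an account and region.
ecr_registry() {  # $1 = AWS account ID, $2 = region
  printf '%s.dkr.ecr.%s.amazonaws.com' "$1" "$2"
}

# Hypothetical helper: logs Docker in to ECR, then tags and pushes one image.
push_image() {  # $1 = local image:tag, $2 = ECR repository name, $3 = remote tag
  local region account registry
  region="${AWS_REGION:-us-east-1}"
  account="$(aws sts get-caller-identity --query Account --output text)" || return 1
  registry="$(ecr_registry "$account" "$region")"
  aws ecr get-login-password --region "$region" \
    | docker login --username AWS --password-stdin "$registry" || return 1
  docker tag "$1" "$registry/$2:$3" && docker push "$registry/$2:$3"
}

# Usage (repository name is a placeholder, not a value from this repo):
#   push_image events-api:local my-events-repository latest
```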

Step 2: prepare environment inputs

Staging:

cp terraform/staging/terraform.tfvars.example terraform/staging/terraform.tfvars
# edit terraform/staging/terraform.tfvars

Production:

# production values are committed in terraform/prod/production.auto.tfvars
# review and adjust environment-specific values before apply

Current committed production ingress values:

alb_certificate_id: fc539f7b-c3ab-4e90-882f-82392fc8b7a3
events_ingestion_host: atlas-ingest.lifters.tech
dashboard_backend_host: atlas-back.lifters.tech
kafka_ui_host: atlas-kafka.lifters.tech

Staging2:

cp terraform/staging2/terraform.tfvars.example terraform/staging2/terraform.tfvars
# edit terraform/staging2/terraform.tfvars

Committed staging2 example ingress values:

events_ingestion_host: atlas-ingest2.twinfo.io
dashboard_backend_host: atlas-back2.twinfo.io
kafka_ui_host: atlas-kafka2.twinfo.io

Review alb_target_group_deregistration_delay_seconds before apply if you want faster ECS rollouts on the public services. Lowering it reduces ALB connection draining time on deregistration, but it also reduces the grace period for in-flight requests.

The root interface also exposes events_task_cpu, events_task_memory, and the events_newrelic_* knobs. The committed staging and production values enable the newrelic-infra sidecar for the events ingestion task.

The same roots now expose the newrelic_* AWS account-integration variables. The committed staging and production values enable the pull integration for ALB, ECS, RDS, MSK, S3, and VPC.

Before planning or applying a root with newrelic_aws_integration_enabled = true, export the New Relic provider credential outside Git:

export TF_VAR_newrelic_api_key="NRAK-REPLACE_ME"
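
A small guard can catch a missing or placeholder credential before terraform plan runs. This is only a sketch; the function name require_newrelic_key is hypothetical, and the only placeholder it knows about is the NRAK-REPLACE_ME string shown above.

```shell
#!/usr/bin/env bash
# Hypothetical pre-flight check: fails when the New Relic key is unset
# or still holds the placeholder value from the example above.
require_newrelic_key() {
  case "${TF_VAR_newrelic_api_key:-}" in
    "")
      echo "TF_VAR_newrelic_api_key is not set" >&2
      return 1 ;;
    NRAK-REPLACE_ME)
      echo "TF_VAR_newrelic_api_key still holds the placeholder" >&2
      return 1 ;;
  esac
}

# Usage:
#   require_newrelic_key && terraform plan -out=tfplan
```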

The events, dashboard backend, and scoring ECS task definitions now act as bootstrap contracts only. Terraform intentionally ignores later container_definitions drift so manual image changes, sidecar edits, and out-of-band task definition revisions in AWS are not overwritten on the next apply.

Step 3: plan and apply

Staging:

cd terraform/staging
terraform plan -out=tfplan
terraform apply tfplan

Production:

cd terraform/prod
terraform plan -out=tfplan
terraform apply tfplan

Staging2:

cd terraform/staging
terraform plan -out=tfplan
terraform apply tfplan

cd ../staging2
terraform plan -out=tfplan
terraform apply tfplan

Apply terraform/staging first whenever shared outputs, moved resource addresses, or shared-foundation wiring changed. staging2 reads remote-state outputs from staging for the VPC, subnets, approved security groups, and MSK bootstrap brokers.
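
The ordering rule can be enforced with a small wrapper that applies roots in sequence and stops at the first failure, so staging2 is never applied against stale staging outputs. This is a sketch under assumptions: apply_root, apply_in_order, and the TF_BIN override are hypothetical names, not part of this repository.

```shell
#!/usr/bin/env bash
# TF_BIN is a hypothetical override; setting it to "echo" gives a dry run.
TF_BIN="${TF_BIN:-terraform}"

# Plans and applies a single root in a subshell so the caller's cwd is untouched.
apply_root() {  # $1 = path to a Terraform root
  ( cd "$1" && "$TF_BIN" plan -out=tfplan && "$TF_BIN" apply tfplan )
}

# Applies each root in the order given and stops at the first failure.
apply_in_order() {
  local root
  for root in "$@"; do
    echo "==> $root"
    apply_root "$root" || { echo "apply failed in $root" >&2; return 1; }
  done
}

# Usage from the repository root:
#   apply_in_order terraform/staging terraform/staging2
```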

Step 4: complete post-apply tasks

After apply:

update the events ingestion secret with real Kafka and New Relic values
update the dashboard backend secret with real Auth0 values and a real DATABASE_URL
update the scoring secret with the final runtime values if the placeholders are not sufficient
update the Camunda secret with the final JDBC connection values and database password; staging2 currently creates Camunda with a static RDS master password, so keep DB_PASSWORD in the secret aligned with whatever password you set or reset on the instance
confirm the service repositories have pushed deployable images to ECR
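
The last check in the list above can be scripted. The sketch below counts the images in a repository with aws ecr describe-images; the function name, the AWS_BIN override, and the repository name in the usage line are hypothetical.

```shell
#!/usr/bin/env bash
# AWS_BIN is a hypothetical override so the check can be exercised without AWS access.
AWS_BIN="${AWS_BIN:-aws}"

# Returns 0 when the named ECR repository holds at least one image.
repo_has_images() {  # $1 = ECR repository name
  local count
  count="$("$AWS_BIN" ecr describe-images --repository-name "$1" \
    --query 'length(imageDetails)' --output text)" || return 1
  [ "$count" -gt 0 ]
}

# Usage (repository name is a placeholder):
#   repo_has_images my-events-repository || echo "no image pushed yet" >&2
```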

Special case: optional MSK Connect S3 sink

If enable_msk_s3_sink = true, the plugin artifact must already exist in the plugin bucket, otherwise the full apply cannot succeed.

Create the plugin bucket first:

cd terraform/staging
terraform apply -target=module.msk_connect_plugin_bucket
terraform output -raw msk_connect_plugin_bucket_name

Then upload the Confluent ZIP to the printed bucket and set msk_s3_sink_plugin_file_key before running the full plan.

Warning: When enable_msk_s3_sink = true and msk_s3_sink_plugin_file_key = "", Terraform validation fails before apply.
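
The same rule can be checked in a wrapper script before calling Terraform, and the upload step can be scripted alongside it. Both functions below are hypothetical sketches; the local ZIP path and object key are placeholders, and the bucket name is read from the msk_connect_plugin_bucket_name output shown above.

```shell
#!/usr/bin/env bash
# Mirrors the validation rule: an enabled sink needs a non-empty plugin key.
sink_inputs_ok() {  # $1 = enable_msk_s3_sink (true/false), $2 = msk_s3_sink_plugin_file_key
  if [ "$1" = "true" ] && [ -z "$2" ]; then
    echo "msk_s3_sink_plugin_file_key must be set when enable_msk_s3_sink = true" >&2
    return 1
  fi
}

# Uploads the plugin ZIP to the bucket created by the targeted apply.
# Run from the Terraform root that owns the bucket output.
upload_plugin() {  # $1 = local ZIP path, $2 = object key
  local bucket
  bucket="$(terraform output -raw msk_connect_plugin_bucket_name)" || return 1
  aws s3 cp "$1" "s3://$bucket/$2"
}

# Usage (path and key are placeholders):
#   sink_inputs_ok true "plugins/s3-sink.zip" && upload_plugin ./s3-sink.zip plugins/s3-sink.zip
```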
