Three Tips for Easy Container Deployments on AWS
I deployed three microservices to AWS last week using ECS Fargate, an Application Load Balancer, ECR for container images, and GitLab CI for deployments. I've been running variations of this setup for years and it remains my go-to for services that need to be reliable without being over-engineered. No Kubernetes, no service mesh. Just containers behind a load balancer, all managed through Terraform.
Here's the setup and the three things that make it work well.
Architecture Overview
Here's how the pieces fit together:
Clients resolve the domain through Route53, which points to the ALB. The ALB terminates TLS using a certificate from ACM and forwards requests to containers running in ECS Fargate. On the deployment side, GitLab CI builds images and pushes them to ECR, then triggers ECS to pull the new version.
1. Store Container Images in ECR
I store all my container images in ECR rather than Docker Hub or another external registry. Since ECS pulls images from a registry in the same region, pulls are fast and there's no cross-region data transfer to pay for. ECR also integrates directly with IAM, so I can create scoped credentials for CI/CD without managing separate Docker Hub tokens.
The lifecycle policies are particularly useful. When you're deploying multiple times per day, untagged images accumulate quickly. I configure ECR to expire untagged images automatically, keeping only the five most recent, which keeps storage costs minimal without any manual cleanup.
In my Terraform, I create an IAM user specifically for GitLab CI with permissions scoped to ECR push and pull operations. The access keys come out as Terraform outputs, which I add to GitLab's CI/CD variables.
resource "aws_ecr_repository" "app" {
name = "${local.app_name}-${local.environment}"
image_scanning_configuration {
scan_on_push = true
}
}
resource "aws_ecr_lifecycle_policy" "app" {
repository = aws_ecr_repository.app.name
policy = jsonencode({
rules = [{
rulePriority = 1
description = "Keep last 5 images"
selection = {
tagStatus = "untagged"
countType = "imageCountMoreThan"
countNumber = 5
}
action = { type = "expire" }
}]
})
}

I also enable image scanning on push. It's free, runs automatically, and catches known CVEs in your base images. It won't find application-level vulnerabilities, but it's a good baseline.
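The CI credentials are nothing exotic: an IAM user, a scoped ECR policy, and an access key surfaced as sensitive outputs. Trimmed down from the full configuration at the end of the post:

# CI user with push/pull-only access to ECR (tags and descriptions trimmed).
resource "aws_iam_user" "gitlab_ci" {
  name = "${local.app_name}-${local.environment}-gitlab-ci"
}

resource "aws_iam_user_policy" "gitlab_ecr_policy" {
  name = "${local.app_name}-${local.environment}-gitlab-ecr-policy"
  user = aws_iam_user.gitlab_ci.name
  policy = jsonencode({
    Version = "2012-10-17"
    Statement = [{
      Effect = "Allow"
      Action = [
        "ecr:GetAuthorizationToken",
        "ecr:BatchCheckLayerAvailability",
        "ecr:GetDownloadUrlForLayer",
        "ecr:BatchGetImage",
        "ecr:InitiateLayerUpload",
        "ecr:UploadLayerPart",
        "ecr:CompleteLayerUpload",
        "ecr:PutImage"
      ]
      Resource = "*"
    }]
  })
}

resource "aws_iam_access_key" "gitlab_ci" {
  user = aws_iam_user.gitlab_ci.name
}

# These go into GitLab's CI/CD variables.
output "gitlab_aws_access_key_id" {
  value     = aws_iam_access_key.gitlab_ci.id
  sensitive = true
}

output "gitlab_aws_secret_access_key" {
  value     = aws_iam_access_key.gitlab_ci.secret
  sensitive = true
}

Because the outputs are marked sensitive, Terraform hides them in the normal output listing; request them by name (terraform output gitlab_aws_secret_access_key) when setting up the GitLab variables.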
2. Use Fargate for Serverless Container Execution
ECS supports two launch types: EC2 and Fargate. With EC2, you provision and manage the instances that run your containers. With Fargate, you specify CPU and memory requirements and AWS handles the underlying compute. For most workloads, Fargate is the better choice because there are no instances to patch, no capacity to plan, and no cluster autoscaling to configure.
I define the CPU and memory in my task definition, and Fargate provisions the right amount of compute for each task. If a task fails health checks, ECS replaces it automatically. All I see are the container logs.
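One gap worth flagging: the task definition later in this post doesn't configure log shipping, so "the container logs" assumes you route stdout/stderr somewhere yourself. A minimal sketch of what I'd add for CloudWatch Logs (this is not part of the configuration below):

# Hypothetical log group for the service.
resource "aws_cloudwatch_log_group" "app" {
  name              = "/ecs/${local.app_name}-${local.environment}"
  retention_in_days = 30
}

# And inside the container definition's jsonencode([{ ... }]) entry:
#   logConfiguration = {
#     logDriver = "awslogs"
#     options = {
#       awslogs-group         = aws_cloudwatch_log_group.app.name
#       awslogs-region        = "us-east-1"
#       awslogs-stream-prefix = local.app_name
#     }
#   }

The managed AmazonECSTaskExecutionRolePolicy attached to the execution role already allows writing log streams, so no extra IAM work is needed.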
Here's the request flow from client to container:
The ALB receives HTTPS traffic on port 443, terminates TLS, and forwards HTTP to the target group on port 80. The target group routes requests to healthy ECS tasks. During deployments, ECS starts new tasks, waits for them to pass health checks, then drains connections from old tasks. This gives you zero-downtime deployments without any additional configuration.
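The health check that gates all of this lives on the target group. Here's that resource from the full configuration at the end of the post:

resource "aws_lb_target_group" "main" {
  name        = "${local.app_name}-${local.environment}"
  port        = local.container_port
  protocol    = "HTTP"
  vpc_id      = aws_vpc.main.id
  target_type = "ip"

  health_check {
    healthy_threshold   = 2
    unhealthy_threshold = 10
    timeout             = 5
    interval            = 30
    path                = "/health"
    port                = "traffic-port"
  }
}

target_type = "ip" is required for Fargate, since tasks register by ENI IP rather than instance ID, and your container has to answer 200 on /health (or whatever path you set) or ECS will keep replacing tasks.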
Fargate pricing is per-second based on the vCPU and memory you allocate. A task with 0.5 vCPU and 1GB memory running continuously costs around $30/month. That's more expensive than a small VPS, but you're not responsible for OS updates or instance failures.
resource "aws_ecs_task_definition" "main" {
family = "${local.app_name}-${local.environment}"
network_mode = "awsvpc"
requires_compatibilities = ["FARGATE"]
cpu = 512
memory = 1024
container_definitions = jsonencode([{
name = local.app_name
image = "${aws_ecr_repository.app.repository_url}:latest"
portMappings = [{
containerPort = 80
protocol = "tcp"
}]
}])
}
resource "aws_ecs_service" "main" {
name = "${local.app_name}-${local.environment}"
cluster = aws_ecs_cluster.main.id
task_definition = aws_ecs_task_definition.main.arn
desired_count = 1
launch_type = "FARGATE"
}

Scaling is straightforward. To run more tasks, I increase desired_count and apply. For automatic scaling, I add a target tracking policy based on CPU utilization or ALB request count. The load balancer distributes traffic across all running tasks.
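The target tracking option isn't part of the baseline configuration below; a sketch of what it looks like, reusing the cluster and service from above (the 60% CPU target and the 1–4 task range are just example values):

# Hypothetical autoscaling for the ECS service defined above.
resource "aws_appautoscaling_target" "ecs" {
  max_capacity       = 4
  min_capacity       = 1
  resource_id        = "service/${aws_ecs_cluster.main.name}/${aws_ecs_service.main.name}"
  scalable_dimension = "ecs:service:DesiredCount"
  service_namespace  = "ecs"
}

resource "aws_appautoscaling_policy" "cpu" {
  name               = "${local.app_name}-${local.environment}-cpu-target"
  policy_type        = "TargetTrackingScaling"
  resource_id        = aws_appautoscaling_target.ecs.resource_id
  scalable_dimension = aws_appautoscaling_target.ecs.scalable_dimension
  service_namespace  = aws_appautoscaling_target.ecs.service_namespace

  target_tracking_scaling_policy_configuration {
    predefined_metric_specification {
      predefined_metric_type = "ECSServiceAverageCPUUtilization"
    }
    target_value = 60
  }
}

Swapping the predefined metric to ALBRequestCountPerTarget works the same way, but needs a resource_label pointing at the ALB and target group.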
3. Automate DNS and TLS with Route53 and ACM
I use Route53 for DNS and ACM for TLS certificates. ACM provides free certificates for domains you can prove ownership of, and Route53 lets you prove that ownership automatically through DNS validation. Once configured, certificates renew automatically and you never have to think about expiration.
The validation flow works like this: when I request a certificate from ACM, it generates a CNAME record that I need to create in DNS. My Terraform creates that record in Route53. ACM sees the record, validates domain ownership, and issues the certificate. From then on, renewal happens automatically.
There are two ways to set up DNS for this:
Option A: Migrate your entire domain to Route53. Create a hosted zone for your root domain (example.com) and update the nameserver records at your registrar to point to AWS. This gives you full control over all DNS records in Terraform.
Option B: Delegate a subdomain to Route53. Create a hosted zone for just the subdomain (app.example.com) and add NS records for that subdomain at your existing DNS provider. This works well when you don't control the root domain or other teams manage it.
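Either way, the name servers you need come out of the route53_zone_name_servers output at the end of the post. For Option B, if the parent zone also happens to live in Route53 (managed elsewhere), the delegation record would look roughly like the sketch below; otherwise create the equivalent NS record in your DNS provider's console.

# Hypothetical delegation record; aws_route53_zone.parent is assumed to exist in
# whatever configuration manages example.com, and in practice it would take the
# name servers from the output above as an input.
resource "aws_route53_record" "delegation" {
  zone_id = aws_route53_zone.parent.zone_id
  name    = "app.example.com"
  type    = "NS"
  ttl     = 300
  records = aws_route53_zone.main.name_servers
}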
resource "aws_route53_zone" "main" {
name = local.domain_name
}
resource "aws_acm_certificate" "main" {
domain_name = local.domain_name
validation_method = "DNS"
}
resource "aws_route53_record" "cert_validation" {
for_each = {
for dvo in aws_acm_certificate.main.domain_validation_options :
dvo.domain_name => {
name = dvo.resource_record_name
record = dvo.resource_record_value
type = dvo.resource_record_type
}
}
zone_id = aws_route53_zone.main.zone_id
name = each.value.name
records = [each.value.record]
ttl = 60
type = each.value.type
}
resource "aws_acm_certificate_validation" "main" {
certificate_arn = aws_acm_certificate.main.arn
validation_record_fqdns = [
for record in aws_route53_record.cert_validation : record.fqdn
]
}
resource "aws_lb_listener" "https" {
load_balancer_arn = aws_lb.main.arn
port = 443
protocol = "HTTPS"
certificate_arn = aws_acm_certificate_validation.main.certificate_arn
default_action {
type = "forward"
target_group_arn = aws_lb_target_group.main.arn
}
}

The for_each on the validation record supports multi-domain certificates if needed. For a single domain, it creates one validation record.
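If you do need more than one name on the certificate, it's just subject_alternative_names on the same resource; a sketch with an extra www name, which my setup doesn't use:

# Hypothetical multi-domain variant of the certificate above.
resource "aws_acm_certificate" "main" {
  domain_name               = local.domain_name
  subject_alternative_names = ["www.${local.domain_name}"]
  validation_method         = "DNS"
}

The for_each above then produces one validation record per name, and the rest of the chain is unchanged.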
The Deployment Pipeline
I use GitLab CI to build and deploy. The goal is simple: push to main, and the new version is live within a few minutes. Here's the flow:
The build stage creates the Docker image and pushes it to ECR with two tags: the git commit ref (for rollbacks) and "latest" (for ECS to pull). The deploy stage triggers an ECS service update.
One important detail: the task definition references the image by tag, and ECS only pulls that tag when it launches new tasks, so pushing a new "latest" image to ECR doesn't trigger a deployment on its own. I use the --force-new-deployment flag to make ECS start replacement tasks, which pull the fresh image. Without this flag, ECS will keep running the old image even after you've pushed a new one.
.build_image:
stage: build
script:
    - docker build -t $ECR_REPOSITORY_URL:$CI_COMMIT_REF_SLUG -t $ECR_REPOSITORY_URL:latest .
- aws ecr get-login-password | docker login --username AWS --password-stdin $ECR_REPOSITORY_URL
- docker push $ECR_REPOSITORY_URL:$CI_COMMIT_REF_SLUG
- docker push $ECR_REPOSITORY_URL:latest
deploy_to_ecs:
stage: deploy
script:
    - aws ecs update-service --cluster "app-prod" --service "app-prod" --force-new-deployment

I run terraform apply manually for infrastructure changes rather than automating it in CI. For application deployments, the pipeline handles everything.
Summary
This setup handles the common requirements for running containers in production: load balancing, TLS termination, health checks, zero-downtime deployments, and automated certificate renewal. It doesn't include service discovery, distributed tracing, or canary deployments. If you need those capabilities, you can add them, but for most internal services and straightforward web applications, this is sufficient.
Because everything is defined in Terraform, I can create an identical environment for staging by changing a few variables and running apply. The full configuration is about 300 lines of HCL, which is small enough to understand completely and debug when something goes wrong.
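The staging copy is the same configuration with its own state key and adjusted locals; roughly something like this (how you split the state, whether a separate key, workspace, or directory, is up to you):

# Hypothetical staging variant: same resources, separate state, adjusted locals.
terraform {
  backend "s3" {
    bucket       = "your-terraform-state-bucket"
    key          = "staging/app/terraform.tfstate"
    region       = "us-east-1"
    use_lockfile = true
    encrypt      = true
  }
}

locals {
  app_name       = "myapp"
  environment    = "staging"
  container_port = 80
  cpu            = 512
  memory         = 1024
  desired_count  = 1
  domain_name    = "${local.app_name}-staging.example.com"
}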
Monthly cost for a small service is around $50: $16 for the ALB (minimum charge regardless of traffic), $30 for a single Fargate task, and a few dollars for Route53, ECR storage, and data transfer.
Need help setting this up? If you'd like assistance deploying your containers on AWS or building out your infrastructure, get in touch. I'm happy to discuss your project and provide a free estimate.
Full Terraform Configuration
Here's the complete Terraform configuration. You'll need to adjust the locals at the top for your app name, environment, and domain.
terraform {
backend "s3" {
bucket = "your-terraform-state-bucket"
key = "prod/app/terraform.tfstate"
region = "us-east-1"
use_lockfile = true
encrypt = true
}
}
provider "aws" {
region = "us-east-1"
}
locals {
app_name = "myapp"
environment = "prod"
container_port = 80
cpu = 512
memory = 1024
desired_count = 1
domain_name = "${local.app_name}.example.com"
}
resource "aws_vpc" "main" {
cidr_block = "10.0.0.0/16"
enable_dns_hostnames = true
enable_dns_support = true
tags = {
Name = "${local.app_name}-${local.environment}-vpc"
}
}
resource "aws_subnet" "main" {
vpc_id = aws_vpc.main.id
cidr_block = "10.0.1.0/24"
availability_zone = "us-east-1a"
map_public_ip_on_launch = true
tags = {
Name = "${local.app_name}-${local.environment}-subnet-1"
}
}
resource "aws_subnet" "secondary" {
vpc_id = aws_vpc.main.id
cidr_block = "10.0.2.0/24"
availability_zone = "us-east-1b"
map_public_ip_on_launch = true
tags = {
Name = "${local.app_name}-${local.environment}-subnet-2"
}
}
resource "aws_internet_gateway" "main" {
vpc_id = aws_vpc.main.id
tags = {
Name = "${local.app_name}-${local.environment}-igw"
}
}
resource "aws_route_table" "main" {
vpc_id = aws_vpc.main.id
route {
cidr_block = "0.0.0.0/0"
gateway_id = aws_internet_gateway.main.id
}
tags = {
Name = "${local.app_name}-${local.environment}-rt"
}
}
resource "aws_route_table_association" "main" {
subnet_id = aws_subnet.main.id
route_table_id = aws_route_table.main.id
}
resource "aws_route_table_association" "secondary" {
subnet_id = aws_subnet.secondary.id
route_table_id = aws_route_table.main.id
}
resource "aws_security_group" "alb" {
name_prefix = "${local.app_name}-${local.environment}-alb-"
vpc_id = aws_vpc.main.id
ingress {
protocol = "tcp"
from_port = 80
to_port = 80
cidr_blocks = ["0.0.0.0/0"]
}
ingress {
protocol = "tcp"
from_port = 443
to_port = 443
cidr_blocks = ["0.0.0.0/0"]
}
egress {
protocol = "-1"
from_port = 0
to_port = 0
cidr_blocks = ["0.0.0.0/0"]
}
}
resource "aws_lb" "main" {
name = "${local.app_name}-${local.environment}"
internal = false
load_balancer_type = "application"
security_groups = [aws_security_group.alb.id]
subnets = [aws_subnet.main.id, aws_subnet.secondary.id]
enable_deletion_protection = false
tags = {
Name = "${local.app_name}-${local.environment}-alb"
}
}
resource "aws_lb_target_group" "main" {
name = "${local.app_name}-${local.environment}"
port = local.container_port
protocol = "HTTP"
vpc_id = aws_vpc.main.id
target_type = "ip"
health_check {
healthy_threshold = 2
unhealthy_threshold = 10
timeout = 5
interval = 30
path = "/health"
port = "traffic-port"
}
}
resource "aws_lb_listener" "http" {
load_balancer_arn = aws_lb.main.arn
port = 80
protocol = "HTTP"
default_action {
type = "redirect"
redirect {
port = "443"
protocol = "HTTPS"
status_code = "HTTP_301"
}
}
}
resource "aws_route53_zone" "main" {
name = local.domain_name
tags = {
Name = "${local.app_name}-${local.environment}-zone"
}
}
resource "aws_acm_certificate" "main" {
domain_name = local.domain_name
validation_method = "DNS"
lifecycle {
create_before_destroy = true
}
tags = {
Name = "${local.app_name}-${local.environment}-cert"
}
}
resource "aws_route53_record" "cert_validation" {
for_each = {
for dvo in aws_acm_certificate.main.domain_validation_options : dvo.domain_name => {
name = dvo.resource_record_name
record = dvo.resource_record_value
type = dvo.resource_record_type
}
}
allow_overwrite = true
name = each.value.name
records = [each.value.record]
ttl = 60
type = each.value.type
zone_id = aws_route53_zone.main.zone_id
}
resource "aws_acm_certificate_validation" "main" {
certificate_arn = aws_acm_certificate.main.arn
validation_record_fqdns = [for record in aws_route53_record.cert_validation : record.fqdn]
}
resource "aws_route53_record" "main" {
zone_id = aws_route53_zone.main.zone_id
name = local.domain_name
type = "A"
alias {
name = aws_lb.main.dns_name
zone_id = aws_lb.main.zone_id
evaluate_target_health = true
}
}
resource "aws_lb_listener" "https" {
load_balancer_arn = aws_lb.main.arn
port = 443
protocol = "HTTPS"
ssl_policy = "ELBSecurityPolicy-TLS13-1-2-2021-06"
certificate_arn = aws_acm_certificate_validation.main.certificate_arn
default_action {
type = "forward"
target_group_arn = aws_lb_target_group.main.arn
}
}
resource "aws_ecs_cluster" "main" {
name = "${local.app_name}-${local.environment}"
}
resource "aws_ecr_repository" "app" {
name = "${local.app_name}-${local.environment}"
image_scanning_configuration {
scan_on_push = true
}
tags = {
Name = "${local.app_name}-${local.environment}-ecr"
}
}
resource "aws_ecr_lifecycle_policy" "app" {
repository = aws_ecr_repository.app.name
policy = jsonencode({
rules = [
{
rulePriority = 1
description = "Keep last 5 images"
selection = {
tagStatus = "untagged"
countType = "imageCountMoreThan"
countNumber = 5
}
action = {
type = "expire"
}
}
]
})
}
resource "aws_ecs_task_definition" "main" {
family = "${local.app_name}-${local.environment}"
network_mode = "awsvpc"
requires_compatibilities = ["FARGATE"]
cpu = local.cpu
memory = local.memory
execution_role_arn = aws_iam_role.ecs_execution_role.arn
container_definitions = jsonencode([
{
name = local.app_name
image = "${aws_ecr_repository.app.repository_url}:latest"
portMappings = [
{
containerPort = local.container_port
protocol = "tcp"
}
]
environment = []
}
])
}
resource "aws_ecs_service" "main" {
name = "${local.app_name}-${local.environment}"
cluster = aws_ecs_cluster.main.id
task_definition = aws_ecs_task_definition.main.arn
desired_count = local.desired_count
launch_type = "FARGATE"
network_configuration {
subnets = [aws_subnet.main.id, aws_subnet.secondary.id]
security_groups = [aws_security_group.ecs.id]
assign_public_ip = true
}
load_balancer {
target_group_arn = aws_lb_target_group.main.arn
container_name = local.app_name
container_port = local.container_port
}
  # the HTTPS listener is what attaches the target group to the ALB
  depends_on = [aws_lb_listener.https]
}
resource "aws_security_group" "ecs" {
name_prefix = "${local.app_name}-${local.environment}-ecs-"
vpc_id = aws_vpc.main.id
ingress {
protocol = "tcp"
from_port = local.container_port
to_port = local.container_port
security_groups = [aws_security_group.alb.id]
}
egress {
protocol = "-1"
from_port = 0
to_port = 0
cidr_blocks = ["0.0.0.0/0"]
}
}
resource "aws_iam_role" "ecs_execution_role" {
name = "${local.app_name}-${local.environment}-ecs-execution-role"
assume_role_policy = jsonencode({
Version = "2012-10-17"
Statement = [
{
Action = "sts:AssumeRole"
Effect = "Allow"
Principal = {
Service = "ecs-tasks.amazonaws.com"
}
}
]
})
}
resource "aws_iam_role_policy_attachment" "ecs_execution_role_policy" {
role = aws_iam_role.ecs_execution_role.name
policy_arn = "arn:aws:iam::aws:policy/service-role/AmazonECSTaskExecutionRolePolicy"
}
resource "aws_iam_role_policy" "ecs_execution_role_ecr" {
name = "${local.app_name}-${local.environment}-ecs-execution-ecr"
role = aws_iam_role.ecs_execution_role.id
policy = jsonencode({
Version = "2012-10-17"
Statement = [
{
Effect = "Allow"
Action = [
"ecr:GetAuthorizationToken",
"ecr:BatchCheckLayerAvailability",
"ecr:GetDownloadUrlForLayer",
"ecr:BatchGetImage"
]
Resource = "*"
}
]
})
}
resource "aws_iam_user" "gitlab_ci" {
name = "${local.app_name}-${local.environment}-gitlab-ci"
tags = {
Name = "${local.app_name}-${local.environment}-gitlab-ci-user"
}
}
resource "aws_iam_user_policy" "gitlab_ecr_policy" {
name = "${local.app_name}-${local.environment}-gitlab-ecr-policy"
user = aws_iam_user.gitlab_ci.name
policy = jsonencode({
Version = "2012-10-17"
Statement = [
{
Effect = "Allow"
Action = [
"ecr:GetAuthorizationToken",
"ecr:BatchCheckLayerAvailability",
"ecr:GetDownloadUrlForLayer",
"ecr:BatchGetImage",
"ecr:InitiateLayerUpload",
"ecr:UploadLayerPart",
"ecr:CompleteLayerUpload",
"ecr:PutImage"
]
Resource = "*"
}
]
})
}
resource "aws_iam_user_policy" "gitlab_ecs_policy" {
name = "${local.app_name}-${local.environment}-gitlab-ecs-policy"
user = aws_iam_user.gitlab_ci.name
policy = jsonencode({
Version = "2012-10-17"
Statement = [
{
Effect = "Allow"
Action = [
"ecs:UpdateService",
"ecs:DescribeServices",
"ecs:ListServices",
"ecs:DescribeTasks",
"ecs:DescribeTaskDefinition"
]
Resource = "*"
}
]
})
}
resource "aws_iam_access_key" "gitlab_ci" {
user = aws_iam_user.gitlab_ci.name
}
output "alb_dns_name" {
description = "The DNS name of the load balancer"
value = aws_lb.main.dns_name
}
output "ecs_cluster_name" {
description = "The name of the ECS cluster"
value = aws_ecs_cluster.main.name
}
output "ecs_service_name" {
description = "The name of the ECS service"
value = aws_ecs_service.main.name
}
output "ecr_repository_url" {
description = "The URL of the ECR repository"
value = aws_ecr_repository.app.repository_url
}
output "ecr_repository_name" {
description = "The name of the ECR repository"
value = aws_ecr_repository.app.name
}
output "gitlab_aws_access_key_id" {
description = "AWS Access Key ID for GitLab CI"
value = aws_iam_access_key.gitlab_ci.id
sensitive = true
}
output "gitlab_aws_secret_access_key" {
description = "AWS Secret Access Key for GitLab CI"
value = aws_iam_access_key.gitlab_ci.secret
sensitive = true
}
output "route53_zone_name_servers" {
description = "Name servers for the Route53 zone"
value = aws_route53_zone.main.name_servers
}
output "route53_zone_id" {
description = "ID of the Route53 zone"
value = aws_route53_zone.main.zone_id
}
output "acm_certificate_arn" {
description = "ARN of the ACM certificate"
value = aws_acm_certificate.main.arn
}