
Three Tips for Easy Container Deployments on AWS

Christian Scott

I deployed three microservices to AWS last week using ECS Fargate, an Application Load Balancer, ECR for container images, and GitLab CI for deployments. I've been running variations of this setup for years and it remains my go-to for services that need to be reliable without being over-engineered. No Kubernetes, no service mesh. Just containers behind a load balancer, all managed through Terraform.

Here's the setup and the three things that make it work well.

Architecture Overview

Here's how the pieces fit together:

Clients resolve the domain through Route53, which points to the ALB. The ALB terminates TLS using a certificate from ACM and forwards requests to containers running in ECS Fargate. On the deployment side, GitLab CI builds images and pushes them to ECR, then triggers ECS to pull the new version.

1. Store Container Images in ECR

I store all my container images in ECR rather than Docker Hub or another external registry. Since ECS pulls images from within the same AWS region, there are no data transfer charges and pulls are fast. ECR also integrates directly with IAM, so I can create scoped credentials for CI/CD without managing separate Docker Hub tokens.

The lifecycle policies are particularly useful. When you're deploying multiple times per day, untagged images accumulate quickly. I configure ECR to automatically expire untagged images beyond the most recent five, which keeps storage costs minimal without any manual cleanup.

In my Terraform, I create an IAM user specifically for GitLab CI, with permissions scoped to ECR push and pull plus the ecs:UpdateService call the deploy stage needs. The access keys come out as Terraform outputs, which I add to GitLab's CI/CD variables.
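
Condensed, that user and its ECR policy look like this (the access key resource, the separate ECS policy, and the sensitive key outputs are in the full configuration at the end of the post); the repository itself follows:

resource "aws_iam_user" "gitlab_ci" {
  name = "${local.app_name}-${local.environment}-gitlab-ci"
}

resource "aws_iam_user_policy" "gitlab_ecr_policy" {
  name = "${local.app_name}-${local.environment}-gitlab-ecr-policy"
  user = aws_iam_user.gitlab_ci.name

  # Scoped to ECR push and pull; the full configuration adds a separate
  # policy granting ecs:UpdateService for the deploy stage.
  policy = jsonencode({
    Version = "2012-10-17"
    Statement = [{
      Effect = "Allow"
      Action = [
        "ecr:GetAuthorizationToken",
        "ecr:BatchCheckLayerAvailability",
        "ecr:GetDownloadUrlForLayer",
        "ecr:BatchGetImage",
        "ecr:InitiateLayerUpload",
        "ecr:UploadLayerPart",
        "ecr:CompleteLayerUpload",
        "ecr:PutImage"
      ]
      Resource = "*"
    }]
  })
}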

resource "aws_ecr_repository" "app" {
  name = "${local.app_name}-${local.environment}"
  
  image_scanning_configuration {
    scan_on_push = true
  }
}

resource "aws_ecr_lifecycle_policy" "app" {
  repository = aws_ecr_repository.app.name
  
  policy = jsonencode({
    rules = [{
      rulePriority = 1
      description = "Keep last 5 images"
      selection = {
        tagStatus = "untagged"
        countType = "imageCountMoreThan"
        countNumber = 5
      }
      action = { type = "expire" }
    }]
  })
}

I also enable image scanning on push. It's free, runs automatically, and catches known CVEs in your base images. It won't find application-level vulnerabilities, but it's a good baseline.

2. Use Fargate for Serverless Container Execution

ECS supports two launch types: EC2 and Fargate. With EC2, you provision and manage the instances that run your containers. With Fargate, you specify CPU and memory requirements and AWS handles the underlying compute. For most workloads, Fargate is the better choice because there are no instances to patch, no capacity to plan, and no cluster autoscaling to configure.

I define the CPU and memory in my task definition, and Fargate provisions the right amount of compute for each task. If a task fails health checks, ECS replaces it automatically. All I see are the container logs.

Here's the request flow from client to container:

The ALB receives HTTPS traffic on port 443, terminates TLS, and forwards HTTP to the target group on port 80. The target group routes requests to healthy ECS tasks. During deployments, ECS starts new tasks, waits for them to pass health checks, then drains connections from old tasks. This gives you zero-downtime deployments without any additional configuration.

Fargate pricing is per-second based on the vCPU and memory you allocate. A task with 0.5 vCPU and 1GB memory running continuously costs around $30/month. That's more expensive than a small VPS, but you're not responsible for OS updates or instance failures.

resource "aws_ecs_task_definition" "main" {
  family = "${local.app_name}-${local.environment}"
  network_mode = "awsvpc"
  requires_compatibilities = ["FARGATE"]
  cpu = 512
  memory = 1024
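  # execution_role_arn omitted in this excerpt; see the full configuration at the end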
  
  container_definitions = jsonencode([{
    name = local.app_name
    image = "${aws_ecr_repository.app.repository_url}:latest"
    portMappings = [{
      containerPort = 80
      protocol = "tcp"
    }]
  }])
}

resource "aws_ecs_service" "main" {
  name = "${local.app_name}-${local.environment}"
  cluster = aws_ecs_cluster.main.id
  task_definition = aws_ecs_task_definition.main.arn
  desired_count = 1
  launch_type = "FARGATE"
}

Scaling is straightforward. To run more tasks, I increase desired_count and apply. For automatic scaling, I add a target tracking policy based on CPU utilization or ALB request count. The load balancer distributes traffic across all running tasks.
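
A rough sketch of what target tracking looks like in Terraform (the capacity bounds and the 60% CPU target here are illustrative, not values from my setup):

# Register the service's desired count as a scalable target.
resource "aws_appautoscaling_target" "ecs" {
  max_capacity       = 4
  min_capacity       = 1
  resource_id        = "service/${aws_ecs_cluster.main.name}/${aws_ecs_service.main.name}"
  scalable_dimension = "ecs:service:DesiredCount"
  service_namespace  = "ecs"
}

# Target tracking: add or remove tasks to hold average CPU near 60%.
resource "aws_appautoscaling_policy" "ecs_cpu" {
  name               = "${local.app_name}-${local.environment}-cpu-target"
  policy_type        = "TargetTrackingScaling"
  resource_id        = aws_appautoscaling_target.ecs.resource_id
  scalable_dimension = aws_appautoscaling_target.ecs.scalable_dimension
  service_namespace  = aws_appautoscaling_target.ecs.service_namespace

  target_tracking_scaling_policy_configuration {
    predefined_metric_specification {
      predefined_metric_type = "ECSServiceAverageCPUUtilization"
    }
    target_value = 60
  }
}

When a policy like this owns the task count, it's worth adding lifecycle { ignore_changes = [desired_count] } to the service so Terraform doesn't reset whatever the autoscaler has scaled to.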

3. Automate DNS and TLS with Route53 and ACM

I use Route53 for DNS and ACM for TLS certificates. ACM provides free certificates for domains you can prove ownership of, and Route53 lets you prove that ownership automatically through DNS validation. Once configured, certificates renew automatically and you never have to think about expiration.

The validation flow works like this: when I request a certificate from ACM, it generates a CNAME record that I need to create in DNS. My Terraform creates that record in Route53. ACM sees the record, validates domain ownership, and issues the certificate. From then on, renewal happens automatically.

There are two ways to set up DNS for this:

Option A: Migrate your entire domain to Route53. Create a hosted zone for your root domain (example.com) and update the nameserver records at your registrar to point to AWS. This gives you full control over all DNS records in Terraform.

Option B: Delegate a subdomain to Route53. Create a hosted zone for just the subdomain (app.example.com) and add NS records for that subdomain at your existing DNS provider. This works well when you don't control the root domain or other teams manage it.
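
In either case, the delegation itself is just a set of NS records for the subdomain that point at the name servers of the new hosted zone (exposed as the route53_zone_name_servers output in the full configuration). If the parent zone happened to live in Route53 as well, the record would look something like this sketch, where var.parent_zone_id is a placeholder for that existing zone:

resource "aws_route53_record" "delegation" {
  # Hypothetical: parent example.com zone managed elsewhere in Route53.
  zone_id = var.parent_zone_id
  name    = local.domain_name
  type    = "NS"
  ttl     = 300
  records = aws_route53_zone.main.name_servers
}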

resource "aws_route53_zone" "main" {
  name = local.domain_name
}

resource "aws_acm_certificate" "main" {
  domain_name = local.domain_name
  validation_method = "DNS"
}

resource "aws_route53_record" "cert_validation" {
  for_each = {
    for dvo in aws_acm_certificate.main.domain_validation_options : 
    dvo.domain_name => {
      name = dvo.resource_record_name
      record = dvo.resource_record_value
      type = dvo.resource_record_type
    }
  }
  
  zone_id = aws_route53_zone.main.zone_id
  name = each.value.name
  records = [each.value.record]
  ttl = 60
  type = each.value.type
}

resource "aws_acm_certificate_validation" "main" {
  certificate_arn = aws_acm_certificate.main.arn
  validation_record_fqdns = [
    for record in aws_route53_record.cert_validation : record.fqdn
  ]
}

resource "aws_lb_listener" "https" {
  load_balancer_arn = aws_lb.main.arn
  port = 443
  protocol = "HTTPS"
  certificate_arn = aws_acm_certificate_validation.main.certificate_arn
  
  default_action {
    type = "forward"
    target_group_arn = aws_lb_target_group.main.arn
  }
}

The for_each on the validation record supports multi-domain certificates if needed. For a single domain, it creates one validation record.
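
If extra names are needed, the only change is adding subject_alternative_names to the certificate resource; the same for_each then creates a validation record for each name. A sketch, with a hypothetical www name:

resource "aws_acm_certificate" "main" {
  domain_name = local.domain_name
  # Hypothetical extra name; each entry gets its own validation record via the for_each above.
  subject_alternative_names = ["www.${local.domain_name}"]
  validation_method = "DNS"
}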

The Deployment Pipeline

I use GitLab CI to build and deploy. The goal is simple: push to main, and the new version is live within a few minutes. Here's the flow:

The build stage creates the Docker image and pushes it to ECR with two tags: the git commit ref (for rollbacks) and "latest" (for ECS to pull). The deploy stage triggers an ECS service update.

One important detail: the task definition pins the image reference, so pushing a new "latest" tag to ECR doesn't automatically trigger a deployment. I use the --force-new-deployment flag to tell ECS to start fresh tasks, which pull the new image. Without this flag, ECS will keep running the old image even after you've pushed a new one.

build_image:
  stage: build
  script:
    - docker build -t $ECR_REPOSITORY_URL:$CI_COMMIT_REF_SLUG .
    - docker tag $ECR_REPOSITORY_URL:$CI_COMMIT_REF_SLUG $ECR_REPOSITORY_URL:latest
    - aws ecr get-login-password | docker login --username AWS --password-stdin $ECR_REPOSITORY_URL
    - docker push $ECR_REPOSITORY_URL:$CI_COMMIT_REF_SLUG
    - docker push $ECR_REPOSITORY_URL:latest

deploy_to_ecs:
  stage: deploy
  script:
    - aws ecs update-service --cluster "app-prod" --service "app-prod" --force-new-deployment

I run terraform apply manually for infrastructure changes rather than automating it in CI. For application deployments, the pipeline handles everything.

Summary

This setup handles the common requirements for running containers in production: load balancing, TLS termination, health checks, zero-downtime deployments, and automated certificate renewal. It doesn't include service discovery, distributed tracing, or canary deployments. If you need those capabilities, you can add them, but for most internal services and straightforward web applications, this is sufficient.

Because everything is defined in Terraform, I can create an identical environment for staging by changing a few variables and running apply. The full configuration is about 300 lines of HCL, which is small enough to understand completely and debug when something goes wrong.
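
For instance, a staging copy is mostly a different set of locals (the values below are illustrative; the backend "s3" key also needs to point at a separate state file, e.g. one under a staging/ prefix):

locals {
  app_name       = "myapp"
  environment    = "staging"   # everything named "${app_name}-${environment}" follows automatically
  container_port = 80
  cpu            = 256         # illustrative: a smaller task size for staging
  memory         = 512
  desired_count  = 1
  domain_name    = "${local.app_name}-staging.example.com"   # illustrative
}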

Monthly cost for a small service is around $50: $16 for the ALB (minimum charge regardless of traffic), $30 for a single Fargate task, and a few dollars for Route53, ECR storage, and data transfer.

Need help setting this up? If you'd like assistance deploying your containers on AWS or building out your infrastructure, get in touch. I'm happy to discuss your project and provide a free estimate.

Full Terraform Configuration

Here's the complete Terraform configuration. You'll need to adjust the locals at the top for your app name, environment, and domain.

terraform {
  backend "s3" {
    bucket         = "your-terraform-state-bucket"
    key            = "prod/app/terraform.tfstate"
    region         = "us-east-1"
    use_lockfile   = true
    encrypt        = true
  }
}

provider "aws" {
  region = "us-east-1"
}

locals {
  app_name       = "myapp"
  environment    = "prod"
  container_port = 80
  cpu            = 512
  memory         = 1024
  desired_count  = 1
  domain_name    = "${local.app_name}.example.com"
}

resource "aws_vpc" "main" {
  cidr_block           = "10.0.0.0/16"
  enable_dns_hostnames = true
  enable_dns_support   = true

  tags = {
    Name = "${local.app_name}-${local.environment}-vpc"
  }
}

resource "aws_subnet" "main" {
  vpc_id                  = aws_vpc.main.id
  cidr_block              = "10.0.1.0/24"
  availability_zone       = "us-east-1a"
  map_public_ip_on_launch = true

  tags = {
    Name = "${local.app_name}-${local.environment}-subnet-1"
  }
}

resource "aws_subnet" "secondary" {
  vpc_id                  = aws_vpc.main.id
  cidr_block              = "10.0.2.0/24"
  availability_zone       = "us-east-1b"
  map_public_ip_on_launch = true

  tags = {
    Name = "${local.app_name}-${local.environment}-subnet-2"
  }
}

resource "aws_internet_gateway" "main" {
  vpc_id = aws_vpc.main.id

  tags = {
    Name = "${local.app_name}-${local.environment}-igw"
  }
}

resource "aws_route_table" "main" {
  vpc_id = aws_vpc.main.id

  route {
    cidr_block = "0.0.0.0/0"
    gateway_id = aws_internet_gateway.main.id
  }

  tags = {
    Name = "${local.app_name}-${local.environment}-rt"
  }
}

resource "aws_route_table_association" "main" {
  subnet_id      = aws_subnet.main.id
  route_table_id = aws_route_table.main.id
}

resource "aws_route_table_association" "secondary" {
  subnet_id      = aws_subnet.secondary.id
  route_table_id = aws_route_table.main.id
}

resource "aws_security_group" "alb" {
  name_prefix = "${local.app_name}-${local.environment}-alb-"
  vpc_id      = aws_vpc.main.id

  ingress {
    protocol    = "tcp"
    from_port   = 80
    to_port     = 80
    cidr_blocks = ["0.0.0.0/0"]
  }

  ingress {
    protocol    = "tcp"
    from_port   = 443
    to_port     = 443
    cidr_blocks = ["0.0.0.0/0"]
  }

  egress {
    protocol    = "-1"
    from_port   = 0
    to_port     = 0
    cidr_blocks = ["0.0.0.0/0"]
  }
}

resource "aws_lb" "main" {
  name               = "${local.app_name}-${local.environment}"
  internal           = false
  load_balancer_type = "application"
  security_groups    = [aws_security_group.alb.id]
  subnets            = [aws_subnet.main.id, aws_subnet.secondary.id]

  enable_deletion_protection = false

  tags = {
    Name = "${local.app_name}-${local.environment}-alb"
  }
}

resource "aws_lb_target_group" "main" {
  name        = "${local.app_name}-${local.environment}"
  port        = local.container_port
  protocol    = "HTTP"
  vpc_id      = aws_vpc.main.id
  target_type = "ip"

  health_check {
    healthy_threshold   = 2
    unhealthy_threshold = 10
    timeout             = 5
    interval            = 30
    path                = "/health"
    port                = "traffic-port"
  }
}

resource "aws_lb_listener" "http" {
  load_balancer_arn = aws_lb.main.arn
  port              = 80
  protocol          = "HTTP"

  default_action {
    type = "redirect"
    redirect {
      port        = "443"
      protocol    = "HTTPS"
      status_code = "HTTP_301"
    }
  }
}

resource "aws_route53_zone" "main" {
  name = local.domain_name

  tags = {
    Name = "${local.app_name}-${local.environment}-zone"
  }
}

resource "aws_acm_certificate" "main" {
  domain_name       = local.domain_name
  validation_method = "DNS"

  lifecycle {
    create_before_destroy = true
  }

  tags = {
    Name = "${local.app_name}-${local.environment}-cert"
  }
}

resource "aws_route53_record" "cert_validation" {
  for_each = {
    for dvo in aws_acm_certificate.main.domain_validation_options : dvo.domain_name => {
      name   = dvo.resource_record_name
      record = dvo.resource_record_value
      type   = dvo.resource_record_type
    }
  }

  allow_overwrite = true
  name            = each.value.name
  records         = [each.value.record]
  ttl             = 60
  type            = each.value.type
  zone_id         = aws_route53_zone.main.zone_id
}

resource "aws_acm_certificate_validation" "main" {
  certificate_arn         = aws_acm_certificate.main.arn
  validation_record_fqdns = [for record in aws_route53_record.cert_validation : record.fqdn]
}

resource "aws_route53_record" "main" {
  zone_id = aws_route53_zone.main.zone_id
  name    = local.domain_name
  type    = "A"

  alias {
    name                   = aws_lb.main.dns_name
    zone_id                = aws_lb.main.zone_id
    evaluate_target_health = true
  }
}

resource "aws_lb_listener" "https" {
  load_balancer_arn = aws_lb.main.arn
  port              = 443
  protocol          = "HTTPS"
  ssl_policy        = "ELBSecurityPolicy-TLS13-1-2-2021-06"
  certificate_arn   = aws_acm_certificate_validation.main.certificate_arn

  default_action {
    type             = "forward"
    target_group_arn = aws_lb_target_group.main.arn
  }
}

resource "aws_ecs_cluster" "main" {
  name = "${local.app_name}-${local.environment}"
}

resource "aws_ecr_repository" "app" {
  name = "${local.app_name}-${local.environment}"

  image_scanning_configuration {
    scan_on_push = true
  }

  tags = {
    Name = "${local.app_name}-${local.environment}-ecr"
  }
}

resource "aws_ecr_lifecycle_policy" "app" {
  repository = aws_ecr_repository.app.name

  policy = jsonencode({
    rules = [
      {
        rulePriority = 1
        description  = "Keep last 5 images"
        selection = {
          tagStatus   = "untagged"
          countType   = "imageCountMoreThan"
          countNumber = 5
        }
        action = {
          type = "expire"
        }
      }
    ]
  })
}

resource "aws_ecs_task_definition" "main" {
  family                   = "${local.app_name}-${local.environment}"
  network_mode             = "awsvpc"
  requires_compatibilities = ["FARGATE"]
  cpu                      = local.cpu
  memory                   = local.memory
  execution_role_arn       = aws_iam_role.ecs_execution_role.arn

  container_definitions = jsonencode([
    {
      name  = local.app_name
      image = "${aws_ecr_repository.app.repository_url}:latest"
      portMappings = [
        {
          containerPort = local.container_port
          protocol      = "tcp"
        }
      ]
      environment = []
    }
  ])
}

resource "aws_ecs_service" "main" {
  name            = "${local.app_name}-${local.environment}"
  cluster         = aws_ecs_cluster.main.id
  task_definition = aws_ecs_task_definition.main.arn
  desired_count   = local.desired_count
  launch_type     = "FARGATE"

  network_configuration {
    subnets          = [aws_subnet.main.id, aws_subnet.secondary.id]
    security_groups  = [aws_security_group.ecs.id]
    assign_public_ip = true
  }

  load_balancer {
    target_group_arn = aws_lb_target_group.main.arn
    container_name   = local.app_name
    container_port   = local.container_port
  }

  depends_on = [aws_lb_listener.http]
}

resource "aws_security_group" "ecs" {
  name_prefix = "${local.app_name}-${local.environment}-ecs-"
  vpc_id      = aws_vpc.main.id

  ingress {
    protocol        = "tcp"
    from_port       = local.container_port
    to_port         = local.container_port
    security_groups = [aws_security_group.alb.id]
  }

  egress {
    protocol    = "-1"
    from_port   = 0
    to_port     = 0
    cidr_blocks = ["0.0.0.0/0"]
  }
}

resource "aws_iam_role" "ecs_execution_role" {
  name = "${local.app_name}-${local.environment}-ecs-execution-role"

  assume_role_policy = jsonencode({
    Version = "2012-10-17"
    Statement = [
      {
        Action = "sts:AssumeRole"
        Effect = "Allow"
        Principal = {
          Service = "ecs-tasks.amazonaws.com"
        }
      }
    ]
  })
}

resource "aws_iam_role_policy_attachment" "ecs_execution_role_policy" {
  role       = aws_iam_role.ecs_execution_role.name
  policy_arn = "arn:aws:iam::aws:policy/service-role/AmazonECSTaskExecutionRolePolicy"
}

resource "aws_iam_role_policy" "ecs_execution_role_ecr" {
  name = "${local.app_name}-${local.environment}-ecs-execution-ecr"
  role = aws_iam_role.ecs_execution_role.id

  policy = jsonencode({
    Version = "2012-10-17"
    Statement = [
      {
        Effect = "Allow"
        Action = [
          "ecr:GetAuthorizationToken",
          "ecr:BatchCheckLayerAvailability",
          "ecr:GetDownloadUrlForLayer",
          "ecr:BatchGetImage"
        ]
        Resource = "*"
      }
    ]
  })
}

resource "aws_iam_user" "gitlab_ci" {
  name = "${local.app_name}-${local.environment}-gitlab-ci"

  tags = {
    Name = "${local.app_name}-${local.environment}-gitlab-ci-user"
  }
}

resource "aws_iam_user_policy" "gitlab_ecr_policy" {
  name = "${local.app_name}-${local.environment}-gitlab-ecr-policy"
  user = aws_iam_user.gitlab_ci.name

  policy = jsonencode({
    Version = "2012-10-17"
    Statement = [
      {
        Effect = "Allow"
        Action = [
          "ecr:GetAuthorizationToken",
          "ecr:BatchCheckLayerAvailability",
          "ecr:GetDownloadUrlForLayer",
          "ecr:BatchGetImage",
          "ecr:InitiateLayerUpload",
          "ecr:UploadLayerPart",
          "ecr:CompleteLayerUpload",
          "ecr:PutImage"
        ]
        Resource = "*"
      }
    ]
  })
}

resource "aws_iam_user_policy" "gitlab_ecs_policy" {
  name = "${local.app_name}-${local.environment}-gitlab-ecs-policy"
  user = aws_iam_user.gitlab_ci.name

  policy = jsonencode({
    Version = "2012-10-17"
    Statement = [
      {
        Effect = "Allow"
        Action = [
          "ecs:UpdateService",
          "ecs:DescribeServices",
          "ecs:ListServices",
          "ecs:DescribeTasks",
          "ecs:DescribeTaskDefinition"
        ]
        Resource = "*"
      }
    ]
  })
}

resource "aws_iam_access_key" "gitlab_ci" {
  user = aws_iam_user.gitlab_ci.name
}

output "alb_dns_name" {
  description = "The DNS name of the load balancer"
  value       = aws_lb.main.dns_name
}

output "ecs_cluster_name" {
  description = "The name of the ECS cluster"
  value       = aws_ecs_cluster.main.name
}

output "ecs_service_name" {
  description = "The name of the ECS service"
  value       = aws_ecs_service.main.name
}

output "ecr_repository_url" {
  description = "The URL of the ECR repository"
  value       = aws_ecr_repository.app.repository_url
}

output "ecr_repository_name" {
  description = "The name of the ECR repository"
  value       = aws_ecr_repository.app.name
}

output "gitlab_aws_access_key_id" {
  description = "AWS Access Key ID for GitLab CI"
  value       = aws_iam_access_key.gitlab_ci.id
  sensitive   = true
}

output "gitlab_aws_secret_access_key" {
  description = "AWS Secret Access Key for GitLab CI"
  value       = aws_iam_access_key.gitlab_ci.secret
  sensitive   = true
}

output "route53_zone_name_servers" {
  description = "Name servers for the Route53 zone"
  value       = aws_route53_zone.main.name_servers
}

output "route53_zone_id" {
  description = "ID of the Route53 zone"
  value       = aws_route53_zone.main.zone_id
}

output "acm_certificate_arn" {
  description = "ARN of the ACM certificate"
  value       = aws_acm_certificate.main.arn
}