How to Fix ECS Fargate Tasks Unreachable from ALB in Private Subnets
ECS Fargate tasks unreachable from an ALB in private subnets usually come down to one of three misconfigurations: the target group uses the instance target type instead of ip, the ECS task security group doesn't allow inbound traffic from the ALB security group, or the private subnet lacks a NAT gateway route, so the Fargate agent can't pull the container image and the task never starts.
Work through three layers in order: target group type, security groups, then network routing. Here's a systematic debugging approach with the exact CLI commands and Terraform fixes.
Verify That the Target Group Uses IP Target Type
Fargate tasks use awsvpc networking, which assigns each task its own ENI and private IP. The ALB target group must use ip as its target type, not instance. This is the single most common misconfiguration: ECS cannot register task IPs into an instance-type target group, so the service either fails to attach to the load balancer or the ALB never routes traffic to it.
# Check the target group's target type.
aws elbv2 describe-target-groups \
  --names my-service-tg \
  --query 'TargetGroups[0].TargetType'
# Expected output: "ip"
# If it says "instance", you must recreate the target group.

# Verify registered targets and their health.
aws elbv2 describe-target-health \
  --target-group-arn arn:aws:elasticloadbalancing:us-east-1:123456789012:targetgroup/my-service-tg/abc123
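The target type is immutable after creation, so the fix is a new target group. A minimal Terraform sketch, assuming a VPC resource named aws_vpc.main and a container listening on port 8080 (names are illustrative):

```hcl
# Target group for Fargate tasks: target_type must be "ip" because
# awsvpc tasks register by ENI IP, not by EC2 instance ID.
resource "aws_lb_target_group" "app" {
  name        = "my-service-tg"
  port        = 8080
  protocol    = "HTTP"
  target_type = "ip"
  vpc_id      = aws_vpc.main.id
}
```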
Fix ALB and ECS Task Security Group Rules
You need two security groups that reference each other. The ALB security group allows inbound traffic from the internet (or your CIDR). The ECS task security group allows inbound traffic only from the ALB security group on the container port. The most common mistake is hardcoding a CIDR or omitting this rule entirely, which causes ALB health checks to time out and targets to report unhealthy.
resource "aws_security_group" "alb" {
  name   = "alb-sg"
  vpc_id = aws_vpc.main.id

  ingress {
    from_port   = 443
    to_port     = 443
    protocol    = "tcp"
    cidr_blocks = ["0.0.0.0/0"]
  }

  egress {
    from_port   = 0
    to_port     = 0
    protocol    = "-1"
    cidr_blocks = ["0.0.0.0/0"]
  }
}
resource "aws_security_group" "ecs_task" {
  name   = "ecs-task-sg"
  vpc_id = aws_vpc.main.id

  # Allow traffic only from the ALB on the container port.
  ingress {
    from_port       = 8080
    to_port         = 8080
    protocol        = "tcp"
    security_groups = [aws_security_group.alb.id]
  }

  # Outbound for image pulls and secrets.
  egress {
    from_port   = 0
    to_port     = 0
    protocol    = "-1"
    cidr_blocks = ["0.0.0.0/0"]
  }
}
Ensure Private Subnets Have a NAT Gateway Route
Fargate tasks in private subnets need outbound internet access to pull images from ECR, fetch secrets from Secrets Manager, and register with the ECS control plane. Without a NAT gateway (or VPC endpoints), the task fails to start and the target group shows zero registered targets. Check your route table. The private subnet must route 0.0.0.0/0 to a NAT gateway, not an internet gateway.
# List route tables associated with your private subnet.
aws ec2 describe-route-tables \
  --filters "Name=association.subnet-id,Values=subnet-0abc123" \
  --query 'RouteTables[0].Routes'
# You need a route like:
# { "DestinationCidrBlock": "0.0.0.0/0", "NatGatewayId": "nat-xxx" }
# If you see an InternetGatewayId, the subnet is public, not private.
# If there is no 0.0.0.0/0 route at all, add a NAT gateway.
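If the route is missing, the fix in Terraform looks roughly like the following sketch. Resource names are illustrative, and it assumes an existing public subnet (aws_subnet.public) and private route table (aws_route_table.private):

```hcl
# The NAT gateway needs an Elastic IP and must sit in a public subnet.
resource "aws_eip" "nat" {
  domain = "vpc"
}

resource "aws_nat_gateway" "main" {
  allocation_id = aws_eip.nat.id
  subnet_id     = aws_subnet.public.id
}

# Default route for the private subnets goes to the NAT gateway.
resource "aws_route" "private_nat" {
  route_table_id         = aws_route_table.private.id
  destination_cidr_block = "0.0.0.0/0"
  nat_gateway_id         = aws_nat_gateway.main.id
}
```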
Confirm the ECS Service Network Configuration
The ECS service definition must place tasks in the private subnets (not the ALB's public subnets) and attach the correct security group. The ALB itself must sit in public subnets across at least two Availability Zones. Mixing these up is a subtle but costly mistake. If you put an internet-facing ALB and the tasks both in private subnets, the ALB has no public IP and is unreachable from the internet. If you put tasks in public subnets without assigning a public IP, they can't pull images.
resource "aws_ecs_service" "app" {
  name            = "app-service"
  cluster         = aws_ecs_cluster.main.id
  task_definition = aws_ecs_task_definition.app.arn
  desired_count   = 2
  launch_type     = "FARGATE"

  network_configuration {
    # Tasks go in private subnets.
    subnets          = var.private_subnet_ids
    security_groups  = [aws_security_group.ecs_task.id]
    assign_public_ip = false
  }

  load_balancer {
    target_group_arn = aws_lb_target_group.app.arn
    container_name   = "app"
    container_port   = 8080
  }
}
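For completeness, the ALB side of this split might look like the following sketch, assuming var.public_subnet_ids spans at least two Availability Zones:

```hcl
resource "aws_lb" "main" {
  name               = "app-alb"
  internal           = false   # Internet-facing: requires public subnets.
  load_balancer_type = "application"
  security_groups    = [aws_security_group.alb.id]
  subnets            = var.public_subnet_ids
}
```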
Debug ALB Health Check Failures for Fargate Tasks
Even when networking is correct, the ALB might mark targets as unhealthy if the health check path returns a non-200 status or the task takes too long to boot. A common pitfall: the target group health check defaults to port traffic-port and path /, but your app might respond only on /health or require 30+ seconds to start. Check stopped task reasons and target health descriptions for the specific failure mode.
# Get the reason targets are unhealthy.
aws elbv2 describe-target-health \
  --target-group-arn "$TG_ARN" \
  --query 'TargetHealthDescriptions[*].{IP:Target.Id,State:TargetHealth.State,Reason:TargetHealth.Reason}'

# Check why tasks stopped (image pull failures, OOM, crash).
aws ecs describe-tasks \
  --cluster my-cluster \
  --tasks "$(aws ecs list-tasks --cluster my-cluster --service-name app-service --desired-status STOPPED --query 'taskArns[0]' --output text)" \
  --query 'tasks[0].stoppedReason'
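Two Terraform settings address these failure modes: point the health check at the path the app actually serves, and give slow-booting tasks a grace period. A sketch with illustrative values (each fragment belongs inside the resource named in its comment):

```hcl
# Inside the aws_lb_target_group resource: override the "/" default path.
health_check {
  path                = "/health"
  matcher             = "200"
  interval            = 30
  healthy_threshold   = 2
  unhealthy_threshold = 3
}

# Inside the aws_ecs_service resource: ignore health check failures
# during boot (seconds; tune to your app's startup time).
health_check_grace_period_seconds = 60
```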
VPC Endpoints as a NAT Gateway Alternative
NAT gateways cost roughly $32/month plus data processing charges. If your only outbound needs are ECR, CloudWatch Logs, and Secrets Manager, you can replace the NAT gateway with VPC endpoints. This is cheaper for low-traffic services and more secure because traffic never leaves the AWS network. You need interface endpoints for ecr.api, ecr.dkr, and logs, plus the S3 gateway endpoint (ECR stores image layers in S3); add a secretsmanager endpoint if your tasks fetch secrets.
# Minimal set of VPC endpoints for Fargate without a NAT gateway.
resource "aws_vpc_endpoint" "ecr_api" {
  vpc_id              = aws_vpc.main.id
  service_name        = "com.amazonaws.us-east-1.ecr.api"
  vpc_endpoint_type   = "Interface"
  subnet_ids          = var.private_subnet_ids
  security_group_ids  = [aws_security_group.vpc_endpoints.id]
  private_dns_enabled = true
}

# Repeat the same pattern for ecr.dkr and logs.
# S3 uses a gateway endpoint, which is free.
resource "aws_vpc_endpoint" "s3" {
  vpc_id            = aws_vpc.main.id
  service_name      = "com.amazonaws.us-east-1.s3"
  vpc_endpoint_type = "Gateway"
  route_table_ids   = var.private_route_table_ids
}
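The interface endpoints also need a security group that accepts HTTPS from the tasks (referenced above as aws_security_group.vpc_endpoints). A minimal sketch, assuming the ecs_task security group from earlier:

```hcl
# Interface endpoints receive HTTPS calls from the Fargate tasks.
resource "aws_security_group" "vpc_endpoints" {
  name   = "vpc-endpoints-sg"
  vpc_id = aws_vpc.main.id

  ingress {
    from_port       = 443
    to_port         = 443
    protocol        = "tcp"
    security_groups = [aws_security_group.ecs_task.id]
  }
}
```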
Quick Diagnostic Checklist
When you encounter this problem, run through these checks in order because each one depends on the previous. First, confirm that the target group type is ip. If it's instance, nothing else matters until you fix it. Second, verify that there are registered targets in the target group. If you see zero, the task isn't starting (check stopped task reasons and NAT/VPC endpoint routes). Third, if targets are registered but unhealthy, the problem is security groups or health check configuration. The describe-target-health output gives you the specific reason: Elb.InitialHealthChecking means you should wait, Target.Timeout means the security group is blocking the health check, and Target.ResponseCodeMismatch means your health check path is wrong.
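The reason-to-fix mapping at the end of the checklist can be captured in a small shell helper. A sketch; the function name is hypothetical:

```shell
#!/bin/sh
# Map an ALB TargetHealth.Reason code to the next debugging step.
diagnose_reason() {
  case "$1" in
    Elb.InitialHealthChecking)   echo "Wait: initial health checks are still running." ;;
    Target.Timeout)              echo "Check security groups: the task SG must allow the ALB SG on the health check port." ;;
    Target.ResponseCodeMismatch) echo "Fix the health check path or expected status code." ;;
    *)                           echo "Unhandled reason: $1" ;;
  esac
}

diagnose_reason "$(aws elbv2 describe-target-health \
  --target-group-arn "$TG_ARN" \
  --query 'TargetHealthDescriptions[0].TargetHealth.Reason' --output text)"
```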