
AWS ECS Auto Scaling

July 12, 2024

When I first started using AWS ECS, I was confused about how to scale a service. I thought (and hoped) it was one simple setting, but it turned out to be more complicated: several settings have to be configured together. This article summarizes the settings needed to enable auto scaling for an ECS service.

Settings to enable

The following settings should be enabled to allow ECS to scale the service:

The EC2 Auto Scaling group is not aware of the containers running on its EC2 instances. ‘EC2 Instance scale-in protection’ should therefore be enabled on the EC2 Auto Scaling group, so that the group does not terminate instances that still have running tasks.

For the ECS capacity provider, ‘Managed scaling’ should be enabled. With Managed scaling, ECS manages the number of instances in the Auto Scaling group based on the number of pending tasks.

For every service, Service Auto Scaling should be enabled and configured with at least one scaling policy. Multiple scaling policies can be created per service, for example to scale on CPU and/or memory usage or on custom CloudWatch metrics. Service Auto Scaling adjusts the number of tasks running in the ECS service based on the configured scaling policies.
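As a rough illustration, the three settings map onto API request parameters like the ones below. This is a minimal sketch that only builds the parameter dictionaries in the shape boto3 expects; all resource names are hypothetical, and `managedTerminationProtection` is my own assumption (it is the capacity provider option that relies on instance scale-in protection) rather than something covered in this article.

```python
def asg_protection_params(asg_name):
    """Parameters for autoscaling.update_auto_scaling_group.

    Note: this flag only applies to instances launched after the change.
    """
    return {
        "AutoScalingGroupName": asg_name,
        "NewInstancesProtectedFromScaleIn": True,
    }


def capacity_provider_params(name, asg_arn, target_capacity=100):
    """Parameters for ecs.create_capacity_provider with Managed scaling on."""
    return {
        "name": name,
        "autoScalingGroupProvider": {
            "autoScalingGroupArn": asg_arn,
            "managedScaling": {
                "status": "ENABLED",
                "targetCapacity": target_capacity,  # percent of instances in use
            },
            # Assumption: lets ECS decide which instances may be scaled in.
            "managedTerminationProtection": "ENABLED",
        },
    }


def service_scaling_params(cluster, service, min_tasks, max_tasks, cpu_target):
    """Parameters for register_scalable_target and put_scaling_policy
    (both on the application-autoscaling client)."""
    common = {
        "ServiceNamespace": "ecs",
        "ResourceId": f"service/{cluster}/{service}",
        "ScalableDimension": "ecs:service:DesiredCount",
    }
    target = {**common, "MinCapacity": min_tasks, "MaxCapacity": max_tasks}
    policy = {
        **common,
        "PolicyName": f"{service}-cpu-target-tracking",  # hypothetical name
        "PolicyType": "TargetTrackingScaling",
        "TargetTrackingScalingPolicyConfiguration": {
            "TargetValue": float(cpu_target),
            "PredefinedMetricSpecification": {
                "PredefinedMetricType": "ECSServiceAverageCPUUtilization"
            },
        },
    }
    return target, policy


def apply_all(asg_name, asg_arn, cluster, service):
    """Send the requests to AWS. Not called here; needs boto3 and credentials."""
    import boto3  # imported lazily so the builders above work without it

    boto3.client("autoscaling").update_auto_scaling_group(
        **asg_protection_params(asg_name)
    )
    boto3.client("ecs").create_capacity_provider(
        **capacity_provider_params(f"{cluster}-cp", asg_arn)
    )
    target, policy = service_scaling_params(cluster, service, 2, 10, 70)
    aas = boto3.client("application-autoscaling")
    aas.register_scalable_target(**target)
    aas.put_scaling_policy(**policy)
```

A memory-based policy would look the same with `ECSServiceAverageMemoryUtilization` as the predefined metric type. The capacity provider also has to be attached to the cluster and used by the service, which this sketch leaves out.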

How does the scaling work?

To get full auto scaling of services, you need to configure both Service Auto Scaling and capacity provider Managed scaling. Service Auto Scaling increases the number of tasks running in the ECS service when the CPU and/or memory usage rises above the configured threshold. ECS then tries to launch these tasks on the available EC2 instances in the ECS capacity provider.
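To get a feeling for how the task count reacts to load, here is a rough mental model in Python. It is not the algorithm AWS actually runs: it assumes a simple proportional rule (enough tasks to bring the average usage back to the target) and, when several policies are configured, takes the largest task count any of them asks for. The function names and the numbers are made up for illustration.

```python
import math


def desired_tasks(current, usage, target, min_tasks, max_tasks):
    """Proportional estimate of the task count for one policy.

    current: tasks running now; usage and target: percentages (e.g. 90, 60).
    """
    wanted = math.ceil(current * usage / target)
    # The scalable target's min and max always bound the result.
    return max(min_tasks, min(max_tasks, wanted))


def desired_tasks_all_policies(current, usage_by_metric, target_by_metric,
                               min_tasks, max_tasks):
    """With several policies, assume the largest requested count wins."""
    return max(
        desired_tasks(current, usage_by_metric[m], target_by_metric[m],
                      min_tasks, max_tasks)
        for m in target_by_metric
    )


# 4 tasks at 90% CPU with a 60% target: the CPU policy asks for 6 tasks.
cpu_only = desired_tasks(4, 90, 60, min_tasks=2, max_tasks=10)

# Memory is below its target, so the CPU policy decides the outcome.
combined = desired_tasks_all_policies(
    4, {"cpu": 90, "memory": 50}, {"cpu": 60, "memory": 70},
    min_tasks=2, max_tasks=10,
)
```

In this example both `cpu_only` and `combined` come out at 6 tasks, because the memory policy on its own would only ask for 3.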

If none of the EC2 instances in the capacity provider has enough free capacity for the pending tasks, the capacity provider scales out by starting one or more new EC2 instances in the EC2 Auto Scaling group. Once a new instance is available, ECS places the pending tasks on it.
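The scale-out decision described above can be sketched as a small simulation. This is a simplified model under my own assumptions: first-fit placement, identical tasks, identical instance sizes, and a reservation value computed as instances needed divided by instances running, times 100 (which is how I understand the metric behind Managed scaling). The class and function names are hypothetical.

```python
import math
from dataclasses import dataclass


@dataclass
class Instance:
    free_cpu: int     # CPU units not yet reserved by tasks
    free_memory: int  # MiB not yet reserved by tasks


def place_tasks(pending, task_cpu, task_memory, instances):
    """First-fit placement. Returns how many tasks remain pending."""
    remaining = pending
    for inst in instances:
        while (remaining > 0 and inst.free_cpu >= task_cpu
               and inst.free_memory >= task_memory):
            inst.free_cpu -= task_cpu
            inst.free_memory -= task_memory
            remaining -= 1
    return remaining


def extra_instances_needed(unplaced, task_cpu, task_memory,
                           instance_cpu, instance_memory):
    """New, empty instances required to fit the tasks that are still pending."""
    per_instance = min(instance_cpu // task_cpu, instance_memory // task_memory)
    if per_instance == 0:
        raise ValueError("task does not fit on an empty instance")
    return math.ceil(unplaced / per_instance)


def capacity_reservation(needed, running):
    """Reservation in percent; above the target (e.g. 100) means scale out."""
    if running == 0:
        raise ValueError("empty clusters are not modelled in this sketch")
    return 100 * needed / running


# Two instances with room for one more task each, five tasks pending.
cluster = [Instance(free_cpu=512, free_memory=1024),
           Instance(free_cpu=512, free_memory=1024)]
unplaced = place_tasks(5, task_cpu=512, task_memory=1024, instances=cluster)
extra = extra_instances_needed(unplaced, 512, 1024,
                               instance_cpu=2048, instance_memory=4096)
reservation = capacity_reservation(len(cluster) + extra, len(cluster))
```

Here three tasks stay pending, one extra instance is enough to hold them, and the reservation ends up at 150, which is above a target of 100 and therefore triggers a scale out.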

[Diagram: scale out]

[Diagram: scale in]

Conclusion

ECS Auto Scaling is a powerful feature that allows you to scale your services based on CPU and memory usage. It is not difficult to set up, but it requires multiple settings to be configured correctly.