Running scheduled tasks in AWS
— AWS, AWS ECS, tech — 7 min read
    
How many times have you been tasked with building a solution that performs the same job on a regular schedule? Quite a few, I'd guess. And what do you do with an environment that sits idle between runs? Let it just sit there until the next run? If not, how do you scale it down automatically? If you've asked yourself the same questions, this post is for you.
I suggest we first look at the most common approaches to this kind of task:

- Running an EC2 instance, perhaps even on a schedule, so you save costs because it isn't running all the time. But how do you guarantee that the task has completed before the instance is scaled down? You could send a message, or write a completion flag to a database or an SSM parameter, and check it regularly. However, that already feels like reinventing the wheel (a sketch of this workaround follows the list) when there is a better tool for the job.
- Running a Lambda. One problem here: it has an execution time limit. If your task takes longer than that limit, it won't finish properly. Again, this limitation can be worked around, for example by "remembering" how far the Lambda has progressed and restarting it from that point. However, this is not what Lambdas are designed for.
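To make the first option concrete, here is a rough sketch of the "completion flag" workaround; the parameter name and instance ID are hypothetical placeholders, and a real setup would need something scheduling the polling itself:

```bash
# At the end of the job, record completion in a (hypothetical) SSM parameter.
aws ssm put-parameter --name /jobs/data-sync/status --value done --type String --overwrite

# Elsewhere, poll the flag before stopping the instance (placeholder instance ID).
if [ "$(aws ssm get-parameter --name /jobs/data-sync/status --query Parameter.Value --output text)" = "done" ]; then
  aws ec2 stop-instances --instance-ids i-0123456789abcdef0
fi
```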
Here we are. Our scheduled tasks are waiting to be run, and we don't know how. Or do we? I'm pretty sure there are more good options for achieving the same goal, but today we are going to play with AWS Elastic Container Service (ECS).
First of all, once you decide to go with ECS, think about which base image you would like to use; this should be determined by the task you want to run. Is it a Java, Python, or PHP application, or something else? In practice, it is best to pick the base image that requires the least adjustment, and ideally one supported by a reputable maintainer. The third option is to build your own.
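For illustration, here is a minimal sketch of what such an image could look like, assuming a Python task whose entry point is a hypothetical sync.py script (the file names below are placeholders for your own):

```dockerfile
# Minimal sketch of a task image. sync.py and requirements.txt are
# hypothetical placeholders for your own task code and dependencies.
FROM python:3.12-slim

WORKDIR /app

# Install dependencies first so this layer is cached between builds.
COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt

COPY sync.py .

# The container runs the job once and exits; ECS starts a fresh one on the next schedule.
ENTRYPOINT ["python", "sync.py"]
```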
Once you've decided on the image, we need to push it to Amazon Elastic Container Registry (ECR), so our tasks can pull it when their environments are scaled up.
ECR and Docker:

- Inside your future image's folder, build it with:

```bash
docker build -t {organization}/{project-name}:{tag} .
```

- Then tag it:

```bash
docker tag {organization}/{project-name}:{tag} {aws-account-number}.dkr.ecr.{aws-region}.amazonaws.com/{project-name}:{tag}
```

- Push it to ECR:

```bash
docker push {aws-account-number}.dkr.ecr.{aws-region}.amazonaws.com/{project-name}:{tag}
```
If you are unable to push, it most likely means you are not logged in to the Docker CLI. Log in with:

```bash
aws ecr get-login-password --region {aws-region} | docker login --username AWS --password-stdin {aws-account-number}.dkr.ecr.{aws-region}.amazonaws.com
```
You can use AWS Vault to run the command above with different AWS profiles, which is useful when you are working with multiple accounts:

```bash
aws-vault exec {aws-profile-name} -- aws ecr get-login-password --region {aws-region} | docker login --username AWS --password-stdin {aws-account-number}.dkr.ecr.{aws-region}.amazonaws.com
```
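One more pitfall worth noting: the push will also fail if the target repository doesn't exist yet, since ECR doesn't create repositories on push. Create it once with:

```bash
aws ecr create-repository --repository-name {project-name} --region {aws-region}
```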
Now that our image is in ECR, let's build the infrastructure that will use it.
AWS CloudFormation for ECS
Let's walk through the most important bits of the template, written in YAML. First, we need to define the parameters that will be passed to the template when it's pushed to the CloudFormation API. These parameters define the environment in which the tasks will run: for example, how much CPU and memory is available to the container, which VPC it runs in, and so on.
```yaml
Parameters:
  # Later in the template I'm using references to different parameters,
  # like Account or Vpc; these should be passed to the template when it's
  # pushed to the CloudFormation API, or you can instead use AWS SSM to
  # refer to them dynamically. I'm omitting them in the Parameters section,
  # but don't forget to add them according to your use case.
  ContainerCpu:
    Type: Number
    Default: 1024
    Description: How much CPU power is available to the container. 1024 is 1 vCPU.
  ContainerMemory:
    Type: Number
    Default: 2048
    Description: How much memory in MB is available to the container.
```
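As an aside, the SSM option mentioned in the comment above could look like the following: the parameter type tells CloudFormation to resolve the value from SSM Parameter Store at deploy time (the /infra/vpc-id name is a hypothetical example):

```yaml
Vpc:
  # Resolved from SSM Parameter Store when the stack is created or updated;
  # /infra/vpc-id is a placeholder parameter name.
  Type: AWS::SSM::Parameter::Value<AWS::EC2::VPC::Id>
  Default: /infra/vpc-id
```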
Next, we need to define the resources that will be created in AWS. In this case we need a security group, an IAM role, a log group, an ECS cluster, a task definition, and a scheduled task. Let's start with the security group, the IAM role, and the log group. The security group defines which traffic is allowed to the container, and the IAM role defines which permissions the container has; here, the container will be able to write logs to CloudWatch, run our specific task, and tag resources in ECS. The log group stores the logs from the container, so we can check them later if needed.

```yaml
Resources:
  SecurityGroup:
    Type: AWS::EC2::SecurityGroup
    Properties:
      GroupDescription: permit VPC connection
      VpcId: {your-vpc-id}
      Tags:
        - Key: some tag
          Value: tags improve visibility

  ECSTaskExecutionRole:
    Type: AWS::IAM::Role
    Properties:
      RoleName: {role-name}
      AssumeRolePolicyDocument:
        Version: 2012-10-17
        Statement:
          - Effect: Allow
            Principal:
              Service: ecs-tasks.amazonaws.com
            Action: sts:AssumeRole
          - Effect: Allow
            Principal:
              Service: events.amazonaws.com
            Action: sts:AssumeRole
      Path: /
      Policies:
        - PolicyName: ecs-task-execution-policy
          PolicyDocument:
            Version: 2012-10-17
            Statement:
              - Effect: Allow
                Action:
                  - logs:CreateLogStream
                  - logs:PutLogEvents
                Resource: !GetAtt DataSyncLogGroup.Arn
              - Effect: Allow
                Action: ecs:RunTask
                Resource: !Sub arn:aws:ecs:${AWS::Region}:${AWS::AccountId}:task-definition/${your-stack-name}-*
              - Effect: Allow
                Action: iam:PassRole
                Resource: !Sub arn:aws:iam::${AWS::AccountId}:role/${your-stack-name}
                Condition:
                  StringLike:
                    iam:PassedToService: ecs-tasks.amazonaws.com
              - Effect: Allow
                Action: ecs:TagResource
                Resource: "*"
                Condition:
                  StringEquals:
                    ecs:CreateAction:
                      - RunTask
      ManagedPolicyArns:
        - arn:aws:iam::aws:policy/service-role/AmazonECSTaskExecutionRolePolicy

  DataSyncLogGroup:
    Type: AWS::Logs::LogGroup
    Properties:
      LogGroupName: !Sub /ecs/${your-stack-name}/${your-environment}
      RetentionInDays: 7
```

Now, let's create the ECS cluster, the task definition, and the scheduled task. The cluster is a logical grouping of tasks or services. The task definition is a blueprint for the tasks that will run in the cluster. The scheduled task is a rule that triggers the task at a specific time or interval.
```yaml
DataSyncCluster:
  Type: AWS::ECS::Cluster
  Properties:
    ClusterName: !Sub ${your-stack-name}/${your-environment}
    ClusterSettings:
      - Name: containerInsights
        Value: enabled
```
```yaml
DataSyncTaskDefinition:
  Type: AWS::ECS::TaskDefinition
  Properties:
    Family: !Sub ${your-stack-name}
    NetworkMode: awsvpc
    RequiresCompatibilities:
      - FARGATE
    Cpu: !Ref ContainerCpu
    Memory: !Ref ContainerMemory
    ExecutionRoleArn: !Ref ECSTaskExecutionRole
    ContainerDefinitions:
      - Name: !Sub ${your-stack-name}
        Cpu: !Ref ContainerCpu
        Memory: !Ref ContainerMemory
        Image: !Sub ${AWS::AccountId}.dkr.ecr.${AWS::Region}.amazonaws.com/${project-name}:${tag}
        Essential: true
        LogConfiguration:
          LogDriver: awslogs
          Options:
            mode: non-blocking
            awslogs-group: !Ref DataSyncLogGroup
            awslogs-region: !Sub ${AWS::Region}
            awslogs-create-group: true
            awslogs-stream-prefix: {your-stream-prefix}
        HealthCheck:
          Command:
            - "CMD-SHELL"
            - "exit 0"
          Interval: 30
          Timeout: 5
          Retries: 3
          StartPeriod: 60
        Environment:
          - Name: some
            Value: variable
        Command:
          - "--do"
          - "something"
```
```yaml
DataSyncScheduledTask:
  Type: AWS::Events::Rule
  Properties:
    Name: !Sub ${your-stack-name}-${your-environment}
    Description: Scheduled task
    ScheduleExpression: rate(20 minutes)
    State: ENABLED
    Targets:
      - Arn: !GetAtt DataSyncCluster.Arn
        Id: DataSyncScheduledTask
        EcsParameters:
          TaskDefinitionArn: !Ref DataSyncTaskDefinition
          TaskCount: 1
          LaunchType: FARGATE
          NetworkConfiguration:
            AwsVpcConfiguration:
              AssignPublicIp: DISABLED
              Subnets:
                - !Sub ${your-subnet}
              SecurityGroups:
                - !Ref SecurityGroup
        RoleArn: !GetAtt ECSTaskExecutionRole.Arn
```

As you can see, we are using the FARGATE launch type, which means the tasks will run in a serverless environment. We pass the CPU and memory parameters into the task definition so we can adjust them when needed. The scheduled task uses the cluster, task definition, and execution role defined earlier. Since it runs in a VPC, you need to pass a subnet and a security group in the task's network configuration. The health check is a way to verify that the task is running properly; in this case we only check that the container is up, and if it isn't, it will be restarted automatically. The image is the one we pushed to ECR earlier.
Environment variables can be passed to the task so it can use them. The Command is what runs when the task starts; I recommend reading about command and entrypoint in the Docker documentation, as the distinction can be confusing at first. AssignPublicIp is set to DISABLED because this example task doesn't need to be reachable from the outside world. Lastly, the task runs every 20 minutes, but the schedule can be adjusted to whatever you need.
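For example, if a fixed rate doesn't fit and you'd rather run at a specific time of day, EventBridge rules also accept cron expressions; a hypothetical nightly run at 06:00 UTC would be:

```yaml
# Run the task every day at 06:00 UTC instead of every 20 minutes.
ScheduleExpression: cron(0 6 * * ? *)
```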
Conclusion:
That's it. Now you can push this template to the CloudFormation API, and it will create all the resources needed to run your tasks. Feel free to add more tasks to the template, so you can run multiple tasks in the same environment. This way you can run scheduled tasks in AWS cost-effectively, without worrying about scaling environments up or down: the tasks run in a serverless environment, so you can focus on the tasks themselves. I hope this post was helpful and that you can use it to run your own tasks in AWS.
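As a sketch, assuming the template is saved as template.yaml and using placeholder names throughout, the deployment could look like this with the AWS CLI:

```bash
# Create or update the stack from the template. CAPABILITY_NAMED_IAM is
# required because the template creates an IAM role with an explicit name.
aws cloudformation deploy \
  --template-file template.yaml \
  --stack-name {your-stack-name} \
  --capabilities CAPABILITY_NAMED_IAM \
  --parameter-overrides ContainerCpu=1024 ContainerMemory=2048

# After the first scheduled run fires, check the task's output by
# tailing its log group (AWS CLI v2).
aws logs tail /ecs/{your-stack-name}/{your-environment} --follow
```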
Thank you for reading, and have a great day!