Manage Batch Jobs with AWS Batch

AWS Batch

AWS Batch allows us to run batch workloads without managing any compute resources. Although other services such as ECS might seem more appealing, we are going to take a deeper look at what AWS Batch provides, and along the way we will deploy a sample Batch application using AWS CDK. In case you haven’t heard of CDK, we recommend checking out our series of tutorials on CDK.

AWS Batch dynamically provisions the optimal quantity and type of compute resources (e.g., CPU- or memory-optimized instances) based on the volume and specific resource requirements of the batch jobs submitted. Spot Instances can also be used to reduce cost. After your jobs have finished, Batch can terminate the instances it no longer needs.

Let’s deploy a simple Batch application to explore the service a little further.

We will create a VPC, the necessary IAM roles, a Security Group, a Job Definition, a Job Queue, a Compute Environment and a Lambda Function that submits a job every 4 hours via CloudWatch Events.
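As a small illustration of the "every 4 hours" schedule, CloudWatch Events rules accept a rate expression such as rate(4 hours). The helper below is only a sketch; rateExpression and RateUnit are hypothetical names and not part of the sample project. It shows one detail that is easy to get wrong: the unit is singular when the value is 1 and plural otherwise.

```typescript
// Hypothetical helper (not from the sample project): builds a
// CloudWatch Events rate expression such as "rate(4 hours)".
type RateUnit = "minute" | "hour" | "day";

function rateExpression(value: number, unit: RateUnit): string {
  if (!Number.isInteger(value) || value < 1) {
    throw new RangeError("rate value must be a positive integer");
  }
  // The unit is singular for a value of 1 and plural for anything larger.
  const suffix = value === 1 ? unit : `${unit}s`;
  return `rate(${value} ${suffix})`;
}

// Schedule used in this tutorial: one job submission every 4 hours.
const everyFourHours = rateExpression(4, "hour");
console.log(everyFourHours);
```

The resulting string is what a scheduled rule would take as its schedule expression. A cron expression could be used instead if the job has to start at fixed clock times rather than at a fixed interval.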

Job Queue: Jobs are submitted to a job queue, where they reside until they can be scheduled to run in a compute environment.
Job Definition: Specifies how jobs are going to run. While each job must reference a job definition, many of the parameters specified in the job definition can be overridden at runtime.
Compute Environments: Job queues are mapped to one or more compute environments. There are two types: Managed and Unmanaged Compute Environments.
Managed Compute Environments: Batch manages the capacity and instance types of the compute resources within the environment, based on the compute resource specification that you define when you create the compute environment.
Unmanaged Compute Environments: In an unmanaged compute environment, you manage your own compute resources. You must ensure that the AMI you use for your compute resources meets the AMI specification.
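To make the relationship between these pieces more concrete, here is a minimal sketch of the request a scheduler function might send to Batch's SubmitJob API. It references a job queue and a job definition, and optionally overrides the container command and environment at runtime. The field names (jobName, jobQueue, jobDefinition, containerOverrides) follow the SubmitJob request shape, but the helper buildSubmitJobInput, the "scheduled-" naming scheme and the queue and definition names in the usage lines are assumptions made for illustration, not code from the sample project.

```typescript
// Subset of the Batch SubmitJob request used in this sketch.
interface KeyValuePair {
  name: string;
  value: string;
}

interface ContainerOverrides {
  command?: string[];
  environment?: KeyValuePair[];
}

interface SubmitJobInput {
  jobName: string;
  jobQueue: string;
  jobDefinition: string;
  containerOverrides?: ContainerOverrides;
}

// Hypothetical helper: builds the request object without calling AWS.
function buildSubmitJobInput(
  jobQueue: string,
  jobDefinition: string,
  overrides: { command?: string[]; env?: Record<string, string> } = {},
  now: Date = new Date(),
): SubmitJobInput {
  const input: SubmitJobInput = {
    // Assumed naming scheme: a fixed prefix plus the submission timestamp.
    jobName: `scheduled-${now.getTime()}`,
    jobQueue,
    jobDefinition,
  };

  // Only values passed in here replace what the job definition specifies.
  const containerOverrides: ContainerOverrides = {};
  if (overrides.command) {
    containerOverrides.command = overrides.command;
  }
  if (overrides.env) {
    containerOverrides.environment = Object.entries(overrides.env).map(
      ([name, value]) => ({ name, value }),
    );
  }
  if (Object.keys(containerOverrides).length > 0) {
    input.containerOverrides = containerOverrides;
  }
  return input;
}

// Placeholder queue and definition names, for illustration only.
const request = buildSubmitJobInput("sample-job-queue", "sample-job-definition", {
  command: ["echo", "hello from batch"],
  env: { STAGE: "dev" },
});
console.log(JSON.stringify(request, null, 2));
```

In a real Lambda function, an object like this would be handed to the AWS SDK's SubmitJob call. Keeping the construction in a pure function makes it easy to unit test without AWS credentials.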

To deploy the stack:

# Install dependencies
npm install
# Edit .env for environment variables
vim .env
# Deploy
cdk deploy

After we’ve deployed the stack, we can head over to the Batch console and see the newly created resources.

Since we’ve defined the minimum vCPU count as 0, no compute resources will be created when no jobs are running. To see how Batch handles the creation of the necessary compute resources, go to the Lambda console and run the scheduler function manually.

You should see the newly created job in the Jobs section under the Runnable tab. That means the job is scheduled to run and is waiting for the necessary compute resources to become active.
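The effect of a minimum of 0 vCPUs can be pictured with a toy model. This is not Batch's actual scaling algorithm, which is internal to the service. It only assumes that the capacity Batch aims for stays between the compute environment's minvCpus and maxvCpus settings. The function name desiredVcpus and the numbers in the usage lines are made up for illustration.

```typescript
// Toy model only: Batch's real scaling logic is internal to the service.
interface ComputeLimits {
  minvCpus: number;
  maxvCpus: number;
}

// Sums the vCPUs requested by waiting jobs and clamps the result
// to the compute environment's configured range.
function desiredVcpus(runnableJobVcpus: number[], limits: ComputeLimits): number {
  const demand = runnableJobVcpus.reduce((sum, vcpus) => sum + vcpus, 0);
  return Math.min(Math.max(demand, limits.minvCpus), limits.maxvCpus);
}

const limits: ComputeLimits = { minvCpus: 0, maxvCpus: 4 };
console.log(desiredVcpus([], limits));        // no jobs waiting, nothing to run
console.log(desiredVcpus([1], limits));       // one job asking for 1 vCPU
console.log(desiredVcpus([2, 2, 2], limits)); // demand above the maximum is capped
```

With minvCpus set to 0, an empty queue maps to zero vCPUs in this model, which matches what we observe: no instances until a job is submitted.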

After some time your job will be executed and you can see it under the Succeeded tab.
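The tabs in the console correspond to job statuses. As a rough sketch, the statuses a Batch job moves through can be modeled as below. The status names are the ones Batch reports, while the helpers isTerminal and isWaitingForCompute are hypothetical names added here for illustration.

```typescript
// Job statuses reported by AWS Batch, in the order a successful job passes through them
// (FAILED is the alternative final status).
const JOB_STATUSES = [
  "SUBMITTED",
  "PENDING",
  "RUNNABLE",
  "STARTING",
  "RUNNING",
  "SUCCEEDED",
  "FAILED",
] as const;

type JobStatus = (typeof JOB_STATUSES)[number];

// Hypothetical helper: true once the job can no longer change status.
function isTerminal(status: JobStatus): boolean {
  return status === "SUCCEEDED" || status === "FAILED";
}

// Hypothetical helper: true while the job sits in the queue waiting
// for compute resources, as described for the Runnable tab.
function isWaitingForCompute(status: JobStatus): boolean {
  return status === "RUNNABLE";
}

for (const status of JOB_STATUSES) {
  console.log(`${status}: terminal=${isTerminal(status)}`);
}
```

A polling script could use a check like isTerminal to decide when to stop asking Batch for the job's status.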

Now head over to the EC2 console and check the current instances. You will see that one instance is running. When Batch finishes the job, it will terminate the instance. When a new job is submitted, a new instance will be created for it.

You can also check the logs for the Batch job in the CloudWatch console.

That’s it, we’ve reached the end of this short tutorial. You can create different workloads and see which resources Batch will create for you. Batch handles job execution and compute resource management, allowing us to focus more on developing business-critical applications rather than setting up and managing complex resource combinations. AWS does not charge an additional fee for Batch itself; you pay only for the underlying resources that you use.

When you are done with the stack, don’t forget to delete it via the CDK CLI.

# Destroy the stack
cdk destroy

The completed project can be found here.
