This is the sample code for the AWS Batch Blog: Run High Performance Computing Workloads using AWS Batch MultiNode Jobs with Elastic Fabric Adapter
This library is licensed under the MIT-0 License. See the LICENSE file.
To get started, clone this repo locally:
git clone https://github.com/aws-samples/aws-batch-efa.git
cd aws-batch-efa-blogpost/
In part 1, we'll create all the necessary AWS Batch resources.
cd batch-resources/
First we'll create a launch template, this launch template installs EFA on the instance and configures the network interface to use EFA. In the launch_template.json
file substitute <Account Id>
, <Security Group>
, <Subnet Id>
and <KEY-PAIR-NAME>
with your your information.
Now create the launch template:
aws ec2 create-launch-template --cli-input-json file://launch_template.json
To ensure optimal physical locality of instances, we create a placement group, with strategy cluster
.
aws ec2 create-placement-group --group-name "efa" --strategy "cluster" --region [your_region]
Next, we'll create the compute environment, this defines the instance type, subnet and IAM role to be used. Edit the <same-subnet-as-in-LaunchTemplate>
and <account-id>
sections with the pertinent information. Then create the compute environment:
aws batch create-compute-environment --cli-input-json file://compute_environment.json
Finally we need a job queue to point to the compute environment:
aws batch create-job-queue --cli-input-json file://job_queue.json
In part 2, we build the docker image and upload it to Elastic Container Registery (ECR), so we can use it in our job.
cd ..
pwd # you should be in the aws-batch-efa-blogpost/ directory
First we'll build the docker image, to help with this, we included a Makefile
, simply run:
make
Then you can push the docker image to ECR, first modify the top of the Makefile
with your region and account id:
AWS_REGION=[your region]
ACCOUNT_ID=[your account id]
Next, push to ECR, note the Makefile
assumes you have an ECR repo named aws-batch-efa
:
make push # logs in, tags, and pushes to ECR
cd batch-resources/
Now we need a job definition, this defines which docker image to use for the job, edit the job_definition.json
file and substitute <ACCOUNT ID>
and <REGION>
. Then create the job definition:
aws batch register-job-definition --cli-input-json file://job_definition.json
{
"jobDefinitionArn": "arn:aws:batch:us-east-1:<account-id>:job-definition/EFA-MPI-JobDefinition:1",
"jobDefinitionName": "EFA-MPI-JobDefinition",
"revision": 1
}
Go back to the main directory:
cd ..
Finally we can submit a job!
make submit
- Arya Hezarkhani, Software Development Engineer, AWS Batch
- Jason Rupard, Principal Systems Development Engineer, AWS Batch
- Sean Smith, Software Development Engineer II, AWS HPC