AWS Batch with a custom Dockerfile

tilne edited this page Jul 24, 2020 · 3 revisions

Background

By default, AWS Batch jobs run in a Docker image that we've created. That image is based on Amazon Linux 2 and does two things:

Mounts the shared directories
Creates a hostfile, which MPI jobs require

Take a look at the Dockerfile for specifics.
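
To make the second item concrete, here is a minimal sketch of how a script could write a hostfile. It is not taken from the project's entrypoint, which may work differently: the function name write_hostfile, the slot count, and the addresses are hypothetical, and the "host slots=N" layout assumes the Open MPI hostfile format.

```shell
#!/bin/sh
# Hypothetical helper (not the project's entrypoint): write an
# Open MPI style hostfile with one "host slots=N" line per host.
# Usage: write_hostfile <output-file> <slots-per-host> <host>...
write_hostfile() {
    out=$1
    slots=$2
    shift 2
    : > "$out"    # create the file, or truncate an existing one
    for host in "$@"; do
        printf '%s slots=%s\n' "$host" "$slots" >> "$out"
    done
}
```

Calling write_hostfile /tmp/hostfile 2 10.0.0.11 10.0.0.12 would produce a two-line file that an MPI launcher can read through its hostfile option.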

Implementation

Create an AWS Batch cluster
Go to the ECR Console and find an image with a name similar to paral-docke-t6ayh0ia49nm

Copy that image's URI. It should look like: 112850485306.dkr.ecr.us-east-1.amazonaws.com/paral-docke-t6ajh0ia39nm
Create a Makefile with the following contents:

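# Makefile
# Recipe lines must be indented with a tab, not spaces.
distro=alinux
uri=[URI from ECR console]

.PHONY: build tag push

# Build the image from the Dockerfile in this directory.
build:
	docker build -f Dockerfile -t pcluster-$(distro) .

# Tag the local image with the ECR repository URI.
tag: build
	docker tag pcluster-$(distro) $(uri):$(distro)

# Push the tagged image to ECR.
push: tag
	docker push $(uri):$(distro)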
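
To preview what the three targets run, the same sequence can be sketched as a shell function. This is only an illustration under assumptions: push_image and the DOCKER override variable are hypothetical names, not part of ParallelCluster. Setting DOCKER=echo prints each command's arguments instead of running docker.

```shell
#!/bin/sh
# Shell sketch of the Makefile's build, tag and push steps.
# push_image and the DOCKER variable are hypothetical names.
# Usage: push_image <ecr-uri> [distro]
push_image() {
    uri=$1
    distro=${2:-alinux}
    docker_cmd=${DOCKER:-docker}    # set DOCKER=echo for a dry run
    "$docker_cmd" build -f Dockerfile -t "pcluster-${distro}" . &&
    "$docker_cmd" tag "pcluster-${distro}" "${uri}:${distro}" &&
    "$docker_cmd" push "${uri}:${distro}"
}
```

The steps are chained with && so that a failed build stops the tag and push, which mirrors how make stops at the first failing recipe.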

Now put your Dockerfile in the same directory as the Makefile above. You can then authenticate with ECR and push (the get-login command used here is from AWS CLI version 1 and was removed in version 2):

$ $(aws ecr get-login --no-include-email --region [your_region])
$ make push
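
With AWS CLI version 2, authentication goes through aws ecr get-login-password piped into docker login. The sketch below derives the registry host and region from the image URI so that neither has to be typed separately. The function names ecr_registry, ecr_region and ecr_login are hypothetical helpers, and the parsing assumes the URI has the form shown earlier on this page.

```shell
#!/bin/sh
# Hypothetical helpers for logging in to ECR with AWS CLI v2.

# Registry host: everything before the first "/" of the image URI.
ecr_registry() {
    printf '%s\n' "${1%%/*}"
}

# Region: the label that follows ".dkr.ecr." in the registry host.
ecr_region() {
    host=${1%%/*}
    rest=${host#*.dkr.ecr.}        # e.g. "us-east-1.amazonaws.com"
    printf '%s\n' "${rest%%.*}"    # e.g. "us-east-1"
}

# Pipe a temporary password from the AWS CLI into docker login.
ecr_login() {
    aws ecr get-login-password --region "$(ecr_region "$1")" |
        docker login --username AWS --password-stdin "$(ecr_registry "$1")"
}
```

After defining these, running ecr_login with the URI from the ECR console, followed by make push, replaces the two commands above.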

On the cluster, submit a job. It will automatically use the new image built from your Dockerfile:

$ awsbsub -n 2 hostname

Adding to existing Dockerfile

If you would rather extend the existing Dockerfile, for example to install packages your application requires, do the following:

$ git clone https://github.com/aws/aws-parallelcluster.git
$ cd aws-parallelcluster/cli/pcluster/resources/batch/docker/

Create the Makefile shown above in this directory and change the build step so that it points to $(distro)/Dockerfile:

build:
	docker build -f $(distro)/Dockerfile -t pcluster-$(distro) .

You can then edit alinux/Dockerfile and scripts/entrypoint.sh before pushing the image in the same way as shown above.
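
Before pushing, it can help to confirm that the build is being run from the right directory. The following is a minimal sketch of such a check. The function name check_build_context is hypothetical, and the check only looks for the two files named on this page.

```shell
#!/bin/sh
# Hypothetical pre-flight check: verify that <distro>/Dockerfile and
# scripts/entrypoint.sh exist under the given directory.
# Usage: check_build_context <dir> [distro]
check_build_context() {
    dir=$1
    distro=${2:-alinux}
    status=0
    for f in "${distro}/Dockerfile" scripts/entrypoint.sh; do
        if [ ! -f "${dir}/${f}" ]; then
            echo "missing: ${dir}/${f}" >&2
            status=1
        fi
    done
    return "$status"
}
```

From inside the docker/ directory, check_build_context . && make push would run the push only when both files are present.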