Merge pull request #263 from NBISweden/apptainer
Talk about Apptainer as the preferred Docker alternative instead of Singularity
fasterius authored Aug 26, 2024
2 parents 1913673 + d812fc3 commit 9f3d2cd
Showing 15 changed files with 544 additions and 380 deletions.
1 change: 1 addition & 0 deletions lectures/environment.yml
@@ -12,6 +12,7 @@ dependencies:
- r-ggplot2
- r-kableextra
- r-palmerpenguins
- r-rmarkdown
- r-stringi
- r-tidyr

113 changes: 78 additions & 35 deletions lectures/introduction/introduction.html

Large diffs are not rendered by default.

2 changes: 1 addition & 1 deletion lectures/introduction/introduction.qmd
@@ -17,7 +17,7 @@ format: nbis-revealjs

- How to generate automated reports using [Quarto]{.green} and [Jupyter]{.green}

- How to use [Docker]{.green} and [Singularity]{.green} to distribute containerized
- How to use [Docker]{.green} and [Apptainer]{.green} to distribute containerized
computational environments
:::

103 changes: 70 additions & 33 deletions lectures/putting-it-together/putting-it-together.html

Large diffs are not rendered by default.

2 changes: 1 addition & 1 deletion lectures/putting-it-together/putting-it-together.qmd
@@ -168,7 +168,7 @@ _Track the **full** environment and connect your code in a workflow_
[Containerization / virtualization]{.green}

- **Docker** – Used for packaging and isolating applications in containers. Dockerhub allows for convenient sharing. Requires root access.
- **Singularity/Apptainer** – Simpler Docker alternative geared towards high performance computing. Does not require root.
- **Apptainer/Singularity** – Simpler Docker alternative geared towards high performance computing. Does not require root.
- **Podman** – Open source daemonless container tool similar to Docker in many regards.
- **Shifter** – Similar ambition as Singularity, but less focus on mobility and more on resource management.
- **VirtualBox/VMWare** – Virtualisation rather than containerization. Less lightweight, but no reliance on host kernel.
6 changes: 3 additions & 3 deletions lectures/snakemake/snakemake.qmd
@@ -865,15 +865,15 @@ $ snakemake --configfile config.yaml
- Run rules with specific conda environments

```{.bash code-line-numbers=false}
$ snakemake --use-conda
$ snakemake --software-deployment-method conda
```
:::

::: {.fragment}
- Run rules with specific Singularity or Docker containers
- Run rules with specific Apptainer or Docker containers

```{.bash code-line-numbers=false}
$ snakemake --use-singularity
$ snakemake --software-deployment-method apptainer
```
:::

46 changes: 39 additions & 7 deletions lectures/wrapups/containers-wrapup.html

Large diffs are not rendered by default.

4 changes: 2 additions & 2 deletions lectures/wrapups/containers-wrapup.qmd
@@ -10,10 +10,10 @@ format: nbis-revealjs
- How to build docker images from Dockerfiles
- How to mount volumes inside containers
- How to push/pull docker images to/from DockerHub
- How to convert a docker image to singularity/apptainer format
- How to convert a docker image to Apptainer format

## If you want to learn more {.smaller}

- Docker documentation: [https://docs.docker.com/](https://docs.docker.com/)
- Dockerfile reference: [https://docs.docker.com/engine/reference/builder/](https://docs.docker.com/engine/reference/builder/)
- Apptainer documentation: [https://apptainer.org/docs/user/latest/](https://apptainer.org/docs/user/latest/)
- Apptainer documentation: [https://apptainer.org/docs/user/latest/](https://apptainer.org/docs/user/latest/)
18 changes: 9 additions & 9 deletions pages/containers/containers-1-introduction.md
@@ -2,7 +2,7 @@ Container-based technologies are designed to make it easier to create, deploy,
and run applications by isolating them in self-contained software units (hence
their name). The idea is to package software and/or code together with
everything it needs (other packages it depends on, various environment settings,
*etc.*) into one unit, *i.e.* a container. This way we can ensure that the
_etc._) into one unit, _i.e._ a container. This way we can ensure that the
software or code functions in exactly the same way regardless of where it's
executed. Containers are in many ways similar to virtual machines but more
lightweight. Rather than starting up a whole new operating system, containers
@@ -15,22 +15,22 @@ Containers have also proven to be a very good solution for packaging, running
and distributing scientific data analyses. Some applications of containers
relevant for reproducible research are:

* When publishing, package your analyses in a container image and let it
- When publishing, package your analyses in a container image and let it
accompany the article. This way interested readers can reproduce your analysis
at the push of a button.
* Packaging your analysis in a container enables you to develop on *e.g.* your
- Packaging your analysis in a container enables you to develop on _e.g._ your
laptop and seamlessly move to cluster or cloud to run the actual analysis.
* Say that you are collaborating on a project and you are using Mac while your
- Say that you are collaborating on a project and you are using Mac while your
collaborator is using Windows. You can then set up a container image specific
for your project to ensure that you are working in an identical environment.

One of the largest and most widely used container-based technologies is
*Docker*. Just as with Git, Docker was designed for software development but is
_Docker_. Just as with Git, Docker was designed for software development but is
rapidly becoming widely used in scientific research. Another container-based
technology is *Singularity*, which was developed to work well in computer
cluster environments, such as Uppmax. We will cover both Docker and Singularity
in this course, but the focus will be be on the former (since that is the most
widely used and runs on all three operating systems).
technology is _Apptainer_ (and the related _Singularity_), which was developed
to work well in computer cluster environments such as Uppmax. We will cover both
Docker and Apptainer in this course, but the focus will be on the former
(since that is the most widely used and runs on all three operating systems).

This tutorial depends on files from the course GitHub repo. Take a look at the
[setup](pre-course-setup) for instructions on how to install Docker if you
134 changes: 63 additions & 71 deletions pages/containers/containers-7-singularity.md
@@ -1,89 +1,83 @@
## Singularity as an alternative container tool

Singularity is a container software alternative to Docker. It was originally
developed by researchers at Lawrence Berkeley National Laboratory with focus on
security, scientific software, and HPC clusters. One of the ways in which
Singularity is more suitable for HPC is that it very actively restricts
permissions so that you do not gain access to additional resources while inside
the container. Singularity also, unlike Docker, stores images as single files
using the *Singularity Image Format* (SIF). A SIF file is self-contained and can
be moved around and shared like any other file, which also makes it easy to work
with on an HPC cluster.
## Apptainer as an alternative container tool

Apptainer is a container software alternative to Docker. It was originally
developed as _Singularity_ by researchers at Lawrence Berkeley National
Laboratory (read more about this below) with focus on security, scientific
software, and HPC clusters. One of the ways in which Apptainer is more suitable
for HPC is that it very actively restricts permissions so that you do not gain
access to additional resources while inside the container. Apptainer also,
unlike Docker, stores images as single files using the _Singularity Image
Format_ (SIF). A SIF file is self-contained and can be moved around and shared
like any other file, which also makes it easy to work with on an HPC cluster.

> **Singularity and Apptainer** <br>
> The open source Singularity project was recently renamed to *Apptainer*.
> Confusingly, the company *Sylabs* still keeps their commercial branch of
> The open source Singularity project was renamed to _Apptainer_ in 2021.
> Confusingly, the company _Sylabs_ still keeps their commercial branch of
> the project under the Singularity name, and offers a free 'Community
> Edition' version. The name change was done in order to clarify the
> distinction between the open source project and the various commercial
> versions.
> At the moment there is virtually no difference to you as a user whether you
> use Singularity or Apptainer, but eventually it's very likely that the two
> will diverge.
> We have opted to stick with the original name in the material for now,
> while the change is still being adopted by the community and various
> documentation online. In the future we will however move to using only
> Apptainer to follow the open source route of the project.
While it is possible to define and build Singularity images from scratch, in a
> versions. At the moment there is virtually no difference to you as a user
> whether you use Singularity or Apptainer, but eventually it's very likely that
> the two will diverge.
While it is possible to define and build Apptainer images from scratch, in a
manner similar to what you've already learned for Docker, this is not something
we will cover here (but feel free to read more about this in *e.g.* the
[Singularity docs](https://sylabs.io/guides/master/user-guide/) or the [Uppmax
Singularity user guide](https://www.uppmax.uu.se/support/user-guides/singularity-user-guide/)).
we will cover here (but feel free to read more about this in _e.g._ the
[Apptainer docs](https://apptainer.org/docs/user/main/index.html)).

The reasons for not covering Singularity more in-depth are varied, but it
The reasons for not covering Apptainer more in-depth are varied, but it
basically boils down to it being more or less Linux-only, unless you use Virtual
Machines (VMs). Even with this you'll run into issues of incompatibility of
various kinds, and these issues are further compounded if you're on one of the
new ARM64-Macs. You also need `root` (admin) access in order to actually *build*
Singularity images regardless of platform, meaning that you can't build them on
*e.g.* Uppmax, even though Singularity is already installed there. You can,
however, use the `--remote` flag, which runs the build on Singularity's own
new ARM64-Macs. You also need `root` (admin) access in order to actually _build_
Apptainer images regardless of platform, meaning that you can't build them on
_e.g._ Uppmax, even though Apptainer is already installed there. You can,
however, use the `--remote` flag, which runs the build on Apptainer's own
servers. This doesn't work in practice a lot of the time, though, since most
scientists will work in private Git repositories so that their research and code
is not available to anybody, and the `--remote` flag requires that *e.g.* the
is not available to anybody, and the `--remote` flag requires that _e.g._ the
`environment.yml` file is publicly available.

There are very good reasons to use Singularity, however, the major one being
There are very good reasons to use Apptainer, however, the major one being
that you aren't allowed to use Docker on most HPC systems! One of the nicer
features of Singularity is that it can convert Docker images directly for use
within Singularity, which is highly useful for the cases when you already built
your Docker image or if you're using a remotely available image stored on *e.g.*
features of Apptainer is that it can convert Docker images directly for use
within Apptainer, which is highly useful for the cases when you already built
your Docker image or if you're using a remotely available image stored on _e.g._
DockerHub. For a lot of scientific work based in R and/or Python, however, it is
most often the case that you build your own images, since you have a complex
dependency tree of software packages not readily available in existing images.
So, we now have another problem for building our own images:

1. Only Singularity is allowed on HPC systems, but you can't build images there
1. Only Apptainer is allowed on HPC systems, but you can't build images there
due to not having `root` access.
2. You can build Singularity images locally and transfer them to HPCs, but this
2. You can build Apptainer images locally and transfer them to HPCs, but this
is problematic unless you're running Linux natively.

Seems like a "catch 22"-problem, right? There are certainly workarounds (some of
which we have already mentioned) but most are roundabout or difficult to get
working for all use-cases. Funnily enough, there's a simple solution: run
Singularity locally from inside a Docker container! Conceptually very meta, yes,
Apptainer locally from inside a Docker container! Conceptually very meta, yes,
but works very well in practice. What we are basically advocating for is that
you stick with Docker for most of your container-based work, but convert your
Docker images using Singularity-in-Docker whenever you need to work on an HPC.
Docker images using Apptainer-in-Docker whenever you need to work on an HPC.
This is of course not applicable to Linux users or those of you who are fine
with working through using VMs and managing any issues that arise from doing
that.

> **Summary** <br>
> Singularity/Apptainer is a great piece of software that is easiest to use if
> you're working on a Linux environment. Docker is, however, easier to use from
> a cross-platform standpoint and covers all use-cases except running on HPCs.
> Apptainer is a great piece of software that is easiest to use if you're
> working in a Linux environment. Docker is, however, easier to use from a
> cross-platform standpoint and covers all use-cases except running on HPCs.
> Running on HPCs can be done by converting existing Docker images at runtime,
> while building images for use on HPCs can be done using local Docker images
> and Singularity-in-Docker.
> and Apptainer-in-Docker.
## Singularity-in-Docker
## Apptainer-in-Docker

By creating a bare-bones, Linux-based Docker image with Singularity you can
build Singularity images locally on non-Linux operating systems. There is
By creating a bare-bones, Linux-based Docker image with Apptainer you can
build Apptainer images locally on non-Linux operating systems. There is
already a good image setup for just this, and it is defined in this [GitHub
repository](https://github.com/kaczmarj/singularity-in-docker). Looking at the
repository](https://github.com/kaczmarj/apptainer-in-docker). Looking at the
instructions there we can see that we need to do the following:

```bash
@@ -101,53 +95,51 @@ listens to by default, meaning that it is needed for us to be able to
specify the location of the Docker container we want to convert to a SIF
file. The `kaczmarj/apptainer` part after the bind mounts is the image
location hosted at [DockerHub](https://hub.docker.com/r/kaczmarj/apptainer),
while the last line is the Singularity/Apptainer command that actually does
the conversion. All we need to do is to replace the `<IMAGE>` part with the
Docker image we want to convert, *e.g.* `my_docker_image`.
while the last line is the Apptainer command that actually does the conversion.
All we need to do is to replace the `<IMAGE>` part with the Docker image we want
to convert, _e.g._ `my_docker_image`.

* Replace `<IMAGE>` and `<TAG>` with one of your locally available Docker images
- Replace `<IMAGE>` and `<TAG>` with one of your locally available Docker images
and one of its tags and run the command - remember that you can use `docker
image ls` to check what images you have available.
image ls` to check what images you have available.

In the end you'll have a SIF file (*e.g.* `my_docker_image.sif`) that you can
In the end you'll have a SIF file (_e.g._ `my_docker_image.sif`) that you can
transfer to an HPC such as Uppmax and run whatever analyses you need. If you
want to be able to do this without having to remember all the code you can check
out [this script](https://github.com/fasterius/dotfiles/blob/main/scripts/apptainer-in-docker.sh).
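
As a rough illustration of the conversion step described above, the command can be wrapped in a small shell function. This is a minimal sketch and not the contents of the linked script: the function name `docker_to_sif`, the `DRY_RUN` variable, the `/work` mount point and the use of the `docker-daemon://` source are assumptions made for this example, based on the description of the bind mounts and the `kaczmarj/apptainer` image.

```bash
# Hypothetical helper (not part of the course material): convert a local
# Docker image into a SIF file using Apptainer-in-Docker.
# Set DRY_RUN=1 to print the command instead of running it.
docker_to_sif() {
    local image="$1"
    local tag="${2:-latest}"
    local sif
    local run=""

    if [ -z "$image" ]; then
        echo "usage: docker_to_sif <IMAGE> [TAG]" >&2
        return 1
    fi

    # Replace slashes so that e.g. "user/image" gives a valid file name
    sif="$(printf '%s' "$image" | tr '/' '_').sif"

    if [ "${DRY_RUN:-0}" = "1" ]; then
        run="echo"
    fi

    # First bind mount: the Docker socket, so Apptainer can reach local images.
    # Second bind mount: the current directory, where the SIF file is written
    # (assumed to be mounted at /work inside the container).
    $run docker run --rm \
        -v /var/run/docker.sock:/var/run/docker.sock \
        -v "$PWD":/work \
        kaczmarj/apptainer \
        build "/work/$sif" "docker-daemon://$image:$tag"
}
```

Running `DRY_RUN=1 docker_to_sif my_docker_image` only prints the command, which is a convenient way to check the image name and tag before starting a build.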

## Running Singularity
## Running Apptainer

The following exercises assume that you have a login to the Uppmax HPC cluster
in Uppsala, but will also work for any other system that has Singularity
installed - like if you managed to install Singularity on your local system or
in Uppsala, but will also work for any other system that has Apptainer
installed - like if you managed to install Apptainer on your local system or
have access to some other HPC cluster. Let's try to convert the Docker image for
this course directly from DockerHub:

```bash
singularity pull mrsa_proj.sif docker://nbisweden/workshop-reproducible-research
apptainer pull mrsa_proj.sif docker://nbisweden/workshop-reproducible-research
```

This should result in a SIF file called `mrsa_proj.sif`.
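
Since `apptainer pull` refuses to overwrite an existing file unless forced, it can be handy to skip the download when the SIF file is already present. The following is a minimal sketch under that assumption; the function name `pull_if_missing` is hypothetical and not part of the course material.

```bash
# Hypothetical helper: pull an image as a SIF file unless it already exists.
pull_if_missing() {
    local sif="$1"   # output file, e.g. mrsa_proj.sif
    local uri="$2"   # image source, e.g. docker://nbisweden/workshop-reproducible-research

    if [ -f "$sif" ]; then
        echo "$sif already exists, skipping pull"
    else
        apptainer pull "$sif" "$uri"
    fi
}
```

It would be called as `pull_if_missing mrsa_proj.sif docker://nbisweden/workshop-reproducible-research`, which behaves like the command above the first time and does nothing on later runs.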

In the Docker image we included the code needed for the workflow in the
`/course` directory of the image. These files are of course also available in
the Singularity image. However, a Singularity image is read-only (unless using
the [sandbox](https://sylabs.io/guides/master/user-guide/build_a_container.html#creating-writable-sandbox-directories)
feature). This will be a problem if we try to run the workflow
within the `/course` directory, since the workflow will produce files and
Snakemake will create a `.snakemake` directory. Instead, we need to provide
the files externally from our host system and simply use the Singularity image
as the environment to execute the workflow in (*i.e.* all the software and
dependencies).
the Apptainer image. However, an Apptainer image is read-only. This will be a
problem if we try to run the workflow within the `/course` directory, since the
workflow will produce files and Snakemake will create a `.snakemake` directory.
Instead, we need to provide the files externally from our host system and simply
use the Apptainer image as the environment to execute the workflow in (_i.e._
all the software and dependencies).

In your current working directory
(`workshop-reproducible-research/tutorials/containers/`) the vital MRSA project
files are already available (`Snakefile`, `config.yml` and
`code/supplementary_material.qmd`). Since Singularity bind mounts the current
`code/supplementary_material.qmd`). Since Apptainer bind mounts the current
working directory we can simply execute the workflow and generate the output
files using:

```bash
singularity run mrsa_proj.sif
apptainer run mrsa_proj.sif
```

This executes the default run command, which is `snakemake -rp -c 1 --configfile
@@ -158,6 +150,6 @@ directory, including the `results/` directory containing the final HTML report.
> **Quick recap** <br>
> In this section we've learned:
>
> - How to build a Singularity image using Singularity inside Docker.
> - How to convert Docker images to Singularity images.
> - How to run Singularity images.
> - How to build an Apptainer image using Apptainer inside Docker.
> - How to convert Docker images to Apptainer images.
> - How to run Apptainer images.