tracker for "physically bound" containers #644

Open
cgwalters opened this issue Jun 26, 2024 · 11 comments
Labels
area/client Related to the client/CLI area/install Issues related to `bootc install` enhancement New feature or request

Comments

@cgwalters
Collaborator

cgwalters commented Jun 26, 2024

Splitting this out from #128 and also from CentOS/centos-bootc#282

What we want to support is an opinionated way to "physically" embed (app) containers inside a (bootc) container.

From the UX point of view, a really key thing is there is one container image - keeping the problem domain of "versioning/mirroring" totally simple.

There are things like

FROM quay.io/centos-bootc/centos-bootc:stream9
RUN podman --storage-driver=vfs --root=/usr/share/containers/storage pull <someimage>
COPY somecontainer.container /usr/share/containers/systemd

as one implementation path. There are a lot of sub-issues here around nested overlayfs and whiteouts.

I personally think what would work best here is a model with an intelligent build process. Basically, we should support a flow that takes the underlying layers (tarballs) and renames all the files to be prefixed with /usr/share/containers/storage/overlay or so, plus a step that adds all the metadata as a final layer. This would help ensure that we never re-pull unchanged layers, even for "physically" bound images.

IOW it'd look like

[base image layer 1]
[base image layer 2]
...
[embedded content layer 1, but with all files included renamed to prefix with /usr/share/containers/storage/overlay/<blobid>]
...
[embedded layer with everything else in /usr/share/containers/storage *except* the layers]
...

The big difference between this and RUN podman --root pull is that the latter inherently results in a single "physical" layer in the bootc image, even if the input container image has multiple layers.

A reason I argue for this is that inherently RUN podman pull is (without forcing on stuff like podman build --timestamp) going to be highly subject to "timestamp churn" on the random JSON files that podman creates, and that is going to mean every time the base image changes the client has to download these "physically embedded" images, even if logically they didn't change. Of course there are still outstanding bugs like containers/buildah#5592 that defeat layer caching in general.

However...note that this model "squashes" all the layers in the app images into one layer in the base image, so on the network, if e.g. the base image used by an app changes, it will force a re-fetch of the entire app (all its layers), even if some of the app layers didn't change.

In other words, IMO this model breaks some of the advantages of the content-addressed storage in OCI by default. We'd need deltas to mitigate.
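
To make the cost of squashing concrete, here is a small Python model of what a client would have to download when only an app's base layer changes. The digests and sizes are made up for illustration, and `bytes_to_fetch` and `squash` are hypothetical helpers, not part of any tool.

```python
def bytes_to_fetch(old: dict[str, int], new: dict[str, int]) -> int:
    """Sum the sizes of layers in new whose digest the client lacks."""
    return sum(size for digest, size in new.items() if digest not in old)


def squash(layers: dict[str, int]) -> dict[str, int]:
    """Model the squashed case: all app layers collapse into one blob."""
    # The digest of the squashed blob changes whenever any input changes;
    # joining the input digests stands in for that here.
    return {"+".join(layers): sum(layers.values())}


# Hypothetical app image: only the base layer changes between versions.
app_v1 = {"sha256:base1": 80_000_000, "sha256:deps": 40_000_000, "sha256:app": 5_000_000}
app_v2 = {"sha256:base2": 80_000_000, "sha256:deps": 40_000_000, "sha256:app": 5_000_000}

per_layer = bytes_to_fetch(app_v1, app_v2)                  # only the new base layer
squashed = bytes_to_fetch(squash(app_v1), squash(app_v2))   # everything again

print(per_layer, squashed)
```

With per-layer content addressing, only the changed layer is fetched; in the squashed case the single blob has a new digest, so the unchanged layers are downloaded again as part of it.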

(For people using ostree-on-the-network for the host today, this is mitigated because ostree always behaves similarly to zstd:chunked and has static deltas; but I think we want to make this work with OCI)

Longer term though, IMO this approach clashes with the direction I think we need to take for e.g. configmaps - we really will need to get into the business of managing more than just one bootable container image, which leads to:

@cgwalters cgwalters added enhancement New feature or request area/install Issues related to `bootc install` area/client Related to the client/CLI labels Jun 26, 2024
@vrothberg
Member

Maybe additional image stores can play a role here. Images are read-only.
One difficulty would be to decide when to remove old images.

@cgwalters
Collaborator Author

OK people are going to keep doing RUN podman pull (ref e.g. this commit) and since one can kind of hack it into working (actually, regardless we basically need to support what people are doing today) I think we should probably promote this into a bootc-supported verb where it feels more declarative.

Something like RUN bootc image pull-to-embed or something? A super messy thing here is of course where the additional image store is; today /usr/lib/containers/storage is just a "convention".

But probably this command could: query the storage config and take the first AIS location in /usr it finds, and error out if it's not present?

Also, something we should definitely do here is canonicalize all the timestamps in the image store to help with reproducible builds. (Today of course, containers/buildah#4242 rains on that parade, but we can prepare for the day it gets fixed...)
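
As a sketch of what canonicalizing could look like at the filesystem level, the Python below stamps every entry under a store directory with one fixed time. The function name `canonicalize_mtimes` and the default epoch of 0 are assumptions; it only touches file metadata, so any timestamps written inside the JSON files themselves would still need separate handling.

```python
import os


def canonicalize_mtimes(root: str, epoch: int = 0) -> int:
    """Set atime/mtime of everything under root to epoch; return the count."""
    count = 0
    for dirpath, dirnames, filenames in os.walk(root, topdown=False):
        for name in dirnames + filenames:
            # follow_symlinks=False stamps the link itself, not its target.
            os.utime(os.path.join(dirpath, name), (epoch, epoch), follow_symlinks=False)
            count += 1
    # os.walk never yields the root as a child entry, so stamp it last.
    os.utime(root, (epoch, epoch))
    return count + 1
```

In a build this would run as the last step after the pull, so that nothing rewrites the store afterwards.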

@vrothberg
Member

I am undecided. It would be a very opinionated way of pulling/embedding images. If users wanted more flexibility, we might end up with flags similar to Podman's. In that case, users could very well just use Podman.

> But probably this command could: query the storage config and take the first AIS location in /usr it finds, and error out if it's not present?

Reading the configuration of another tool is somehow risky. I would prefer if podman info could display additional image stores.

@vrothberg
Member

Additional image stores are currently not correctly reported by podman info. I'll take a look at it ✔️

@vrothberg
Member

Opened containers/storage#2094 which will soon get addressed.

@cgwalters
Collaborator Author

> In that case, users could very well just use Podman.

Yeah I'd agree in the end fixes here should live in podman.

> Reading the configuration of another tool is somehow risky.

The conf isn't wholly owned just by podman, I think the risk is more "parse it without using the c/storage Go code" right?

(This isn't like the quadlet files which do only have parsing code that lives in podman)

@vrothberg
Member

> The conf isn't wholly owned just by podman, I think the risk is more "parse it without using the c/storage Go code" right?

That plus making sure that bootc and podman always use the same version of c/storage. If a new feature was added to the parsing code and the two tools differ, we can run into trouble.

@cgwalters
Collaborator Author

> That plus making sure that bootc and podman always use the same version of c/storage. If a new feature was added to the parsing code and the two tools differ, we can run into trouble.

That problem already exists today though with all the other things consuming c/storage, including buildah and skopeo but most notably crio which has a different release cadence - and did break in practice a while back.

This discussion may seem like a digression but I think it's quite relevant as we look towards unified storage for example and how that would work.

@ggiguash

ggiguash commented Sep 17, 2024

When running podman pull command in a bootc container build procedure, I'm seeing the following errors.
I should add that I'm running 2 builds in parallel pulling the same images - one for rhel and one for centos. I'm not sure if it's related to the problem or not.
Any idea how to work around those?

STEP 1/14: FROM localhost/cos9-bootc-source:latest
STEP 2/14: RUN --mount=type=secret,id=pullsecret,dst=/run/secrets/pull-secret.json     podman pull     --authfile /run/secrets/pull-secret.json     --root=/var/lib/containers/storage-preloaded     "quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:0c12ddea1c28a7287700c19e623c4bf66ce9931a6b8a8c882fd80204393996c6"
time="2024-09-17T13:05:47Z" level=warning msg="\"/\" is not a shared mount, this could cause issues or missing mounts with rootless containers"
Trying to pull quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:0c12ddea1c28a7287700c19e623c4bf66ce9931a6b8a8c882fd80204393996c6...
Getting image source signatures
Copying blob sha256:4f05dc73d94cbbd0cd91587184e671ba4112087f93f86139ebdcb3afa3b1aae7
Copying blob sha256:d8e5548c38deb03c17871de35e63257444e05109754c25725185b8bf2630cfce
Copying blob sha256:0ea779877a0c9ea774298c27e45ffa6589e7b157fef28dad9d6869859cf8077f
Copying blob sha256:abced7ee58448ba09b150b6a5d0083ceb67e2e834226c951cc0c678891596db9
Copying config sha256:fd1a10b5a8443d0436fb6447b24b6aee75848995909962aa1c681e5942c70a72
Writing manifest to image destination
fd1a10b5a8443d0436fb6447b24b6aee75848995909962aa1c681e5942c70a72
--> 79b9daa6d479
STEP 3/14: RUN --mount=type=secret,id=pullsecret,dst=/run/secrets/pull-secret.json     podman pull     --authfile /run/secrets/pull-secret.json     --root=/var/lib/containers/storage-preloaded     "quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:69219ccc460d06c01cfbbf5617795ed28eb3834704af811e8bf3469ace763bd3"
cannot set user namespace
Error: building at STEP "RUN --mount=type=secret,id=pullsecret,dst=/run/secrets/pull-secret.json podman pull     --authfile /run/secrets/pull-secret.json     --root=/var/lib/containers/storage-preloaded     "quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:69219ccc460d06c01cfbbf5617795ed28eb3834704af811e8bf3469ace763bd3"": while running runtime: exit status 1

The errors are sporadic, and the build process's built-in retry may get one or two steps further.

STEP 1/14: FROM localhost/cos9-bootc-source:latest
STEP 2/14: RUN --mount=type=secret,id=pullsecret,dst=/run/secrets/pull-secret.json     podman pull     --authfile /run/secrets/pull-secret.json     --root=/var/lib/containers/storage-preloaded     "quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:0c12ddea1c28a7287700c19e623c4bf66ce9931a6b8a8c882fd80204393996c6"
time="2024-09-17T13:06:55Z" level=warning msg="\"/\" is not a shared mount, this could cause issues or missing mounts with rootless containers"
Trying to pull quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:0c12ddea1c28a7287700c19e623c4bf66ce9931a6b8a8c882fd80204393996c6...
Getting image source signatures
Copying blob sha256:4f05dc73d94cbbd0cd91587184e671ba4112087f93f86139ebdcb3afa3b1aae7
Copying blob sha256:abced7ee58448ba09b150b6a5d0083ceb67e2e834226c951cc0c678891596db9
Copying blob sha256:0ea779877a0c9ea774298c27e45ffa6589e7b157fef28dad9d6869859cf8077f
Copying blob sha256:d8e5548c38deb03c17871de35e63257444e05109754c25725185b8bf2630cfce
Copying config sha256:fd1a10b5a8443d0436fb6447b24b6aee75848995909962aa1c681e5942c70a72
Writing manifest to image destination
fd1a10b5a8443d0436fb6447b24b6aee75848995909962aa1c681e5942c70a72
--> 2f0c0b7adb00
STEP 3/14: RUN --mount=type=secret,id=pullsecret,dst=/run/secrets/pull-secret.json     podman pull     --authfile /run/secrets/pull-secret.json     --root=/var/lib/containers/storage-preloaded     "quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:69219ccc460d06c01cfbbf5617795ed28eb3834704af811e8bf3469ace763bd3"
time="2024-09-17T13:07:23Z" level=warning msg="\"/\" is not a shared mount, this could cause issues or missing mounts with rootless containers"
Trying to pull quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:69219ccc460d06c01cfbbf5617795ed28eb3834704af811e8bf3469ace763bd3...
Getting image source signatures
Copying blob sha256:6cc34d95786b642867c7c3b45ea1916c6c6f51d4226421304d10e73f29ed9168
Copying blob sha256:abced7ee58448ba09b150b6a5d0083ceb67e2e834226c951cc0c678891596db9
Copying blob sha256:d8e5548c38deb03c17871de35e63257444e05109754c25725185b8bf2630cfce
Copying blob sha256:0ea779877a0c9ea774298c27e45ffa6589e7b157fef28dad9d6869859cf8077f
Copying config sha256:16fae6ed108982dff474bed16dd2f410e34560f453e9a1ffaca5dd904c3bfd5f
Writing manifest to image destination
16fae6ed108982dff474bed16dd2f410e34560f453e9a1ffaca5dd904c3bfd5f
--> 366d6aacc433
STEP 4/14: RUN --mount=type=secret,id=pullsecret,dst=/run/secrets/pull-secret.json     podman pull     --authfile /run/secrets/pull-secret.json     --root=/var/lib/containers/storage-preloaded     "quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:1ff1ff6c483757ff38b1ec392dd3bdb0386e75546c74e52a459fe39628b16914"
cannot set user namespace
Error: building at STEP "RUN --mount=type=secret,id=pullsecret,dst=/run/secrets/pull-secret.json podman pull     --authfile /run/secrets/pull-secret.json     --root=/var/lib/containers/storage-preloaded     "quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:1ff1ff6c483757ff38b1ec392dd3bdb0386e75546c74e52a459fe39628b16914"": while running runtime: exit status 1

@cgwalters
Collaborator Author

No idea honestly...when reporting errors like this it'd be super useful to know details like the environment (virt/physical, OS version, environment in general (is there any nested containerization going on?))

Just searching for the error string turns up containers/podman#10135

It's not even obviously clear to me whether the error message is from the outer build process or the inner RUN podman command?

@ggiguash

ggiguash commented Sep 19, 2024

> It's not even obviously clear to me whether the error message is from the outer build process or the inner RUN podman command?

I think it's from the inner podman pull command.

Adding podman debug logs as a follow-up. Does the last error give any clues about the possible problem?
Note the high parallel job count. This is because we're running CI on a RHEL 9.4 AWS c5n.metal machine with 72 cores.
72*3+1=217 as set by the podman code.
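
The arithmetic above, written out as a tiny Python helper. The formula is taken from this comment as stated (three jobs per CPU plus one); the function name `default_parallel_jobs` is made up here and is not podman's own identifier.

```python
import os


def default_parallel_jobs(ncpu: int) -> int:
    """Job count as described in the comment above: three per CPU plus one."""
    return ncpu * 3 + 1


print(default_parallel_jobs(72))                    # the 72-core CI machine → 217
print(default_parallel_jobs(os.cpu_count() or 1))   # whatever machine runs this
```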

STEP 11/24: RUN --mount=type=secret,id=pullsecret,dst=/run/secrets/pull-secret.json     podman pull --log-level=debug     --authfile /run/secrets/pull-secret.json     --root=/var/lib/containers/storage-preloaded     "quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:bc8368dbf15113fa74e6875f70df37f4cde587fd549f00cc56ac22fa6f23bbf5"
time="2024-09-19T17:09:25Z" level=info msg="podman filtering at log level debug"
time="2024-09-19T17:09:25Z" level=debug msg="Called pull.PersistentPreRunE(podman pull --log-level=debug --authfile /run/secrets/pull-secret.json --root=/var/lib/containers/storage-preloaded quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:bc8368dbf15113fa74e6875f70df37f4cde587fd549f00cc56ac22fa6f23bbf5)"
time="2024-09-19T17:09:25Z" level=debug msg="Using conmon: \"/usr/bin/conmon\""
time="2024-09-19T17:09:25Z" level=info msg="Using sqlite as database backend"
time="2024-09-19T17:09:25Z" level=debug msg="Using graph driver overlay"
time="2024-09-19T17:09:25Z" level=debug msg="Using graph root /var/lib/containers/storage-preloaded"
time="2024-09-19T17:09:25Z" level=debug msg="Using run root /run/containers/storage"
time="2024-09-19T17:09:25Z" level=debug msg="Using static dir /var/lib/containers/storage-preloaded/libpod"
time="2024-09-19T17:09:25Z" level=debug msg="Using tmp dir /run/libpod"
time="2024-09-19T17:09:25Z" level=debug msg="Using volume path /var/lib/containers/storage-preloaded/volumes"
time="2024-09-19T17:09:25Z" level=debug msg="Using transient store: false"
time="2024-09-19T17:09:25Z" level=debug msg="Not configuring container store"
time="2024-09-19T17:09:25Z" level=debug msg="Initializing event backend file"
time="2024-09-19T17:09:25Z" level=debug msg="Configured OCI runtime kata initialization failed: no valid executable found for OCI runtime kata: invalid argument"
time="2024-09-19T17:09:25Z" level=debug msg="Configured OCI runtime youki initialization failed: no valid executable found for OCI runtime youki: invalid argument"
time="2024-09-19T17:09:25Z" level=debug msg="Configured OCI runtime krun initialization failed: no valid executable found for OCI runtime krun: invalid argument"
time="2024-09-19T17:09:25Z" level=debug msg="Configured OCI runtime ocijail initialization failed: no valid executable found for OCI runtime ocijail: invalid argument"
time="2024-09-19T17:09:25Z" level=debug msg="Configured OCI runtime crun-vm initialization failed: no valid executable found for OCI runtime crun-vm: invalid argument"
time="2024-09-19T17:09:25Z" level=debug msg="Configured OCI runtime crun-wasm initialization failed: no valid executable found for OCI runtime crun-wasm: invalid argument"
time="2024-09-19T17:09:25Z" level=debug msg="Configured OCI runtime runj initialization failed: no valid executable found for OCI runtime runj: invalid argument"
time="2024-09-19T17:09:25Z" level=debug msg="Configured OCI runtime runsc initialization failed: no valid executable found for OCI runtime runsc: invalid argument"
time="2024-09-19T17:09:25Z" level=debug msg="Using OCI runtime \"/usr/bin/crun\""
time="2024-09-19T17:09:25Z" level=debug msg="Initialized SHM lock manager at path /libpod_lock"
time="2024-09-19T17:09:25Z" level=info msg="Setting parallel job count to 217"
time="2024-09-19T17:09:25Z" level=debug msg="Could not move to subcgroup: mkdir /sys/fs/cgroup/init: permission denied"
cannot set user namespace


3 participants