
Proposal: use ko to build a multi-platform lifecycle image #631

Closed
imjasonh opened this issue Jun 11, 2021 · 5 comments

@imjasonh
Member

Splitting off discussion from #435 as this is only one possible path forward for supporting Arm.

Background

The process for building the lifecycle image today, as I understand it, roughly speaking, is to:

  1. go build ./cmd/lifecycle*
  2. add symlinks so that the lifecycle binary is available as lifecycle/detector, lifecycle/analyzer, etc.
  3. run tools/packager/main.go to:
    1. templatize lifecycle.toml to set the version
    2. add the output from that, and the lifecycle binary, and its symlinks, to a gzipped tar file**.
  4. run tools/image/main.go to add that gzipped tar to a base image and either push it to a registry, or load it in the local docker daemon.
    • this process also sets working dir, adds labels, etc.
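As a rough illustration of steps 2 and 3, here is a minimal shell sketch that stages a placeholder binary, adds two of the phase symlinks, and writes a gzipped tar. The temporary directory, the placeholder file, and the output name lifecycle.tgz are assumptions made for illustration; the real work is done by tools/packager/main.go, which this does not reproduce (no lifecycle.toml templating, no Windows handling).

```shell
#!/bin/sh
set -eu

# Hypothetical staging directory; the real layout comes from tools/packager.
workdir="$(mktemp -d)"
mkdir -p "$workdir/lifecycle"

# Placeholder standing in for the output of `go build ./cmd/lifecycle`.
printf '#!/bin/sh\necho lifecycle\n' > "$workdir/lifecycle/lifecycle"
chmod +x "$workdir/lifecycle/lifecycle"

# Step 2: expose the single binary under each phase name via relative symlinks.
for phase in detector analyzer; do
  ln -s lifecycle "$workdir/lifecycle/$phase"
done

# Step 3: pack the binary and its symlinks into a gzipped tar.
tar -C "$workdir" -czf "$workdir/lifecycle.tgz" lifecycle

# List the archive to check that the symlinks were stored as links, not copies.
listing="$(tar -tzvf "$workdir/lifecycle.tgz")"
printf '%s\n' "$listing"
```

Because the symlinks are relative (detector -> lifecycle), they stay valid wherever the tar is extracted in the image.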

This process is described in the Makefile roughly twice, once for linux/amd64 images and once for windows/amd64. The release process then combines those two images into the multi-platform manifest list that ends up being available at buildpacksio/lifecycle.

* for Linux today this is done by using docker to run an alpine container that runs go build with musl libc available, since some code uses cgo (though it might not after #630 🤞 )
** tar file generation is slightly complicated by having to build an image for Windows. For Windows, paths in the tar file get prepended with Files/, some PAXRecords are set on executable files, probably lots more I don't understand yet.

This process clearly works fine today: the project is able to produce a lifecycle image for linux/amd64 and windows/amd64, which is no small feat. But adding support for other platforms (e.g., arm) might become cumbersome without refactoring.


Proposal

This model of go building, targzing and appending to a base image is really powerful, since Go makes cross-compilation really easy. If you don't require cgo, you can just GOOS=windows go build ./ from any platform. Building and tarring and pushing an image doesn't even require a Docker daemon.
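A minimal sketch of that cross-compilation loop, assuming cgo is disabled. The platform list, the out/ directory layout, and the DRY_RUN switch are hypothetical choices for illustration and are not part of the lifecycle Makefile. With DRY_RUN=1 (the default here) the script only prints the commands it would run, so it can be tried without a Go toolchain.

```shell
#!/bin/sh
set -eu

# Hypothetical platform list; adjust to whatever the base image supports.
platforms="linux/amd64 linux/arm64 windows/amd64"

# Print the go build command for one os/arch pair, or run it if DRY_RUN=0.
build_one() {
  os="${1%/*}"    # text before the slash, e.g. "linux"
  arch="${1#*/}"  # text after the slash, e.g. "amd64"
  ext=""
  if [ "$os" = "windows" ]; then
    ext=".exe"    # Windows executables need the .exe suffix
  fi
  out="out/$os-$arch/lifecycle$ext"
  if [ "${DRY_RUN:-1}" = "1" ]; then
    echo "CGO_ENABLED=0 GOOS=$os GOARCH=$arch go build -o $out ./cmd/lifecycle"
  else
    mkdir -p "$(dirname "$out")"
    CGO_ENABLED=0 GOOS="$os" GOARCH="$arch" go build -o "$out" ./cmd/lifecycle
  fi
}

for p in $platforms; do
  build_one "$p"
done
```

The only per-platform differences are the GOOS/GOARCH values and the .exe suffix, which is what makes adding another platform a one-word change to the list.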

This is, at its core, what ko does to build multi-arch images (and soon, Windows images too 🤞) -- it takes a base image (distroless, alpine, nanoserver, etc.), finds all the os/arch combinations it supports, runs go build for each of those, appends the binary as a layer on each platform base image, and pushes the resulting manifest list.

I'd like to propose using ko to build the lifecycle image. Instead of two parallel processes that build for linux/amd64 and windows/amd64, you could ko publish --platform=all atop a base image manifest list supporting all your desired platforms.*

One wrinkle is the addition of symlinks. ko doesn't support this workflow today, and since it's pretty unique to lifecycle I'm not sure it would. Luckily, you can achieve this by appending a new layer to the ko-built image containing your symlinks, and the templatized lifecycle.toml. Even this could be achieved without writing new code, by using crane append.

* you'd have to stitch together a base image that supports both Linux and Windows from existing base images -- I wrote a simple tool to produce a stitched manifest list. Tekton is exploring a similar approach, since they already use ko to build multi-arch Linux images, but are starting to look into Windows support.


Concretely, the build process for multi-platform lifecycle could be:

$ ko publish ./cmd/lifecycle --platform=all
2021/06/11 09:40:04 Building github.com/buildpacks/lifecycle/cmd/lifecycle for linux/amd64
2021/06/11 09:40:09 Building github.com/buildpacks/lifecycle/cmd/lifecycle for linux/arm
2021/06/11 09:40:12 Building github.com/buildpacks/lifecycle/cmd/lifecycle for linux/arm64
2021/06/11 09:40:15 Building github.com/buildpacks/lifecycle/cmd/lifecycle for linux/ppc64le
2021/06/11 09:40:18 Building github.com/buildpacks/lifecycle/cmd/lifecycle for linux/s390x
2021/06/11 09:40:21 Building github.com/buildpacks/lifecycle/cmd/lifecycle for windows/amd64
2021/06/11 09:40:24 Building github.com/buildpacks/lifecycle/cmd/lifecycle for windows/arm
some.registry/lifecycle@sha256:abcdef  # <- multi-platform image

# add symlinks
$ ln -s lifecycle layer/lifecycle/detector
$ ln -s lifecycle layer/lifecycle/analyzer
...etc...

# "templatize" lifecycle.toml
$ cat lifecycle.toml > layer/lifecycle.toml
$ cat << EOF >> layer/lifecycle.toml
[lifecycle]
  version = "$(git log ...)"
EOF

# append new layer with symlinks and lifecycle.toml
$ crane append -f <(tar -f - -c layer/) -t some.registry/lifecycle@sha256:abcdef
some.registry/lifecycle@sha256:badface  # <- image with binary+symlinks+lifecycle.toml 🎉

Obviously there are things I've missed, and changing the release process for a project like this is no small undertaking. ko is just one option for accomplishing this, and if it's not the right one I completely understand. At the very least, removing the cgo dependency should allow the linux/windows build processes to become more similar, which should make adding other variants simpler.

As a maintainer of ko and go-containerregistry (which both ko and buildpacks use heavily), I'm happy to make any reasonable changes or additions to either to enable this. Especially since I suspect Tekton will benefit greatly from this, both in terms of taking advantage of multi-platform buildpacks images, and in cribbing from your release processes. 😄

cc @ekcasey @micahyoung

@micahyoung
Member

micahyoung commented Jun 14, 2021

Just had a couple questions below, but my feeling is this could be a big win in reducing our build process code/complexity, especially for the upcoming proliferation of lifecycle image architectures. Having our foundational (and trusted) lifecycle binaries and images built each time from the same ko binary, all at once in the same runtime environment, would be nice for auditability too. Plus, any reduction in the LOC that buildpacks needs to learn and maintain would be a plus.

A side benefit is also having another Windows OCI image implementation in the wild to both increase image prevalence and solidify support from Microsoft.

After taking a look, a couple questions came up:

  • Can ko build and add two binaries to the same image? The lifecycle image has lifecycle (which is the target of the symlinks) and also launcher, which is a dependency of lifecycle, resides within the same image, and should ideally be built by the same ko process.
  • For buildpacks maintainers, is losing cgo support a show-stopper? Are there any other current or upcoming features I'm not thinking of that would make using ko more difficult?

@imjasonh
Member Author

Can ko build and add two binaries to the same image? The lifecycle image has lifecycle (which is the target of the symlinks) and also launcher, which is a dependency of lifecycle, resides within the same image, and should ideally be built by the same ko process.

No. To accomplish this you'd probably have to ko publish an image with one binary, then build and append the other as a separate layer. This might mean it's not worth using ko in the first place, but I think you can still get enough value from using crane append etc that this path is worth exploring.

In other words, whether or not buildpacks ends up literally adopting ko, I think there's a real benefit to adopting a ko-like build process, in terms of same-ifying per-platform builds so that adding linux/arm64 isn't difficult, and adding whatever the next three platforms are becomes downright easy.

@micahyoung
Member

Thanks, that makes sense - ko's feature set lets us get pretty close to what we have now, since we distribute lifecycle as an image. For the cost of additional layers and dropping cgo, we'd get simplified binary + image generation of all our OS/Arch permutations which does feel very valuable.

I put this on the agenda for the implementations sub-team discussion this Thurs 11:30am EST/EDT (15:30 UTC). You're more than welcome to join and speak to it, and I'll be sure to bring it up otherwise. We'll see if we can get some thoughts from maintainers and others.

@natalieparellano added the status/discussion-needed, type/research, and type/enhancement labels and removed the type/research label on Sep 14, 2021
@natalieparellano
Member

From what I recall of the mentioned sub-team discussion, @ekcasey was of the opinion that we can't build the lifecycle with ko for the same reason we can't build the lifecycle with the lifecycle - namely, the requirement that things end up in a certain location in the final image that may already be occupied. I could be misunderstanding and/or misremembering.

@imjasonh
Member Author

Yeah that sounds familiar. I'm fine to close this.
