
Drop Travis CI and replace with GitHub Actions #78

Merged
6 commits merged on Aug 2, 2022

Conversation

NucciTheBoss
Contributor

Hello there! I noticed that there have been some outstanding issues/pull requests on this repository. I myself have been trying to navigate around #77 recently while running OpenStack functional tests with Tempest. I saw that the major blocker right now is issue #76: CirrOS does not have its original pipeline anymore. This pull request adds new GitHub Actions pipelines so that we can start landing updates on CirrOS again.

build-cirros-bionic.yaml

This pipeline builds CirrOS images for aarch64, arm, i386, x86_64, powerpc, ppc64, and ppc64le in parallel, and then uploads the build artifacts to the workspace. These artifacts can be downloaded and the images can be tested further before release.

test-cirros-bionic.yaml

This pipeline tests new pull requests made to master. It checks that the proposed changes do not introduce any regressions that prevent CirrOS from building on the supported architectures. The main difference between this pipeline and the build pipeline is that this pipeline only produces the build log as an artifact, not any resulting images.

Next steps

Right now I am sticking with bionic as the build environment, even though pull request #68 updates the build instructions to focal for local environments. Bionic is supported until 2023, so this gives us time to land other critical updates.

@hrw
Collaborator

hrw commented Jul 28, 2022

Consider adding cache for ccache and downloads:

    - name: cache downloads
      uses: actions/cache@v3
      with:
        key: cache-downloads
        path: download/

    - name: cache ccache
      uses: actions/cache@v3
      with:
        key: cache-ccache
        path: ccache/

@NucciTheBoss
Contributor Author

@hrw I added the caching for both ccache and download to both pipelines. If I read the caching documentation for Actions correctly, the way I have it should load ccache and download before building CirrOS if there is a cache hit.

Let me know if there is anything else we should include in the pipeline before merging.

@hrw
Collaborator

hrw commented Jul 28, 2022

Please check how the cache works - from what I saw, it is created once and then reused without generating a new one (or I have not found a way to force it).

So you would need a separate ccache cache for each ARCH, while one cache for downloads/ should be enough.

@NucciTheBoss
Contributor Author

Hmm. I'll look into this further tomorrow morning, but upon further scrutiny, I'm not sure the cache action is going to do the trick with how CirrOS currently builds.

The reason is that, looking further at the caching documentation, the action keeps the cache fresh by pinning it to a file in the repository (package-lock.json, Cargo.lock, etc.) and using its hash in the key. When that file is updated and the hash changes, the action has a cache miss and rebuilds the cache. The problem is that the files these caches are usually pinned to already exist in the repository before building. I do not think we can do this with the ccache and download directories, since they are created at build time; I cannot hash them before the build step because they do not yet exist in the workspace on the runner.

It also seems that a new cache is only created when there is a cache miss; it will not be updated on subsequent runs if there is a modification. Therefore, to get the cache working, I think we would need some sort of flat file that already exists in the CirrOS repository that we can use to track the downloads and/or ccache contents for each architecture. We could then pin the caches to this file (or files) so that a new cache is created whenever we add a new download or modify ccache.
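For illustration, pinning a cache to a tracked file would look something like this (`download-manifest.txt` is a hypothetical manifest name, not a file that exists in the repository today):

```yaml
# Hypothetical sketch: pin the downloads cache to a manifest file that is
# checked into the repository, so that editing the manifest changes the
# hash, forces a cache miss, and makes actions/cache save a fresh cache.
- name: cache downloads
  uses: actions/cache@v3
  with:
    key: "downloads-${{ hashFiles('download-manifest.txt') }}"
    path: download/
```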

As for using only one download folder, what I could do is add a preparatory job that runs before the parallel build jobs for each architecture. This preparatory job could check for a downloads cache; on a hit, it could pass the download directory as an artifact to the parallel build jobs, and then each respective build job could check for its own ccache. But I still need something I can pin the caches to.
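A rough sketch of that preparatory-job idea (the job names, artifact name, and cache keys here are hypothetical, not part of the current pipelines):

```yaml
# Hypothetical sketch: a 'prepare' job restores the shared downloads cache
# once and publishes it as an artifact; the per-arch build jobs depend on
# it via 'needs' and fetch the artifact before restoring their own ccache.
jobs:
  prepare:
    runs-on: ubuntu-18.04
    steps:
      - uses: actions/checkout@v3
      - name: cache downloads
        uses: actions/cache@v3
        with:
          key: cache-downloads
          path: download/
      - name: Publish downloads for build jobs
        uses: actions/upload-artifact@v3
        with:
          name: downloads
          path: download/
  build:
    needs: prepare
    strategy:
      matrix:
        arch: ["aarch64", "x86_64"]
    runs-on: ubuntu-18.04
    steps:
      - uses: actions/checkout@v3
      - name: Fetch shared downloads
        uses: actions/download-artifact@v3
        with:
          name: downloads
          path: download/
      - name: cache ccache
        uses: actions/cache@v3
        with:
          key: "ccache-${{ matrix.arch }}"
          path: ccache/
```

Note that the upload step would need the download/ directory to actually exist (e.g. after a cache hit or an initial fetch), which is exactly the pinning problem described above.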

That being said, it feels like we need more advanced tooling to get the caching to behave the way we want with GitHub Actions. What might be best for the short term is to merge the pipelines as-is (without the caching), then fix #77 and maybe #75 since it is causing regressions in Tempest and Octavia. After fixing the rsa2 issue, we can prioritize adding the caching tooling for GitHub Actions. Let me know what your thoughts are and whether you have a different approach we could try for keeping the cache fresh. Luckily, each of the builds currently only takes about 20-35 minutes.

@hrw
Collaborator

hrw commented Jul 29, 2022

actions/cache@v3 adds two steps into build:

  • unpack cache (before build)
  • create cache (after successful build)

If there is no cache, then nothing gets unpacked. When the build is done, the cache is created:

/usr/bin/tar --posix --use-compress-program zstd -T0 -cf cache.tzst --exclude cache.tzst -P -C /home/runner/work/fork-cirros/fork-cirros --files-from manifest.txt

Note: that requires free space on the runner, so some kind of "make clean" would be needed beforehand.

You can have a "hashless" cache, but including some kind of hash is handy as a way to force cache recreation.

25-30 minutes per build may look fine. But that is 25-30 minutes per build, and you run several of them at once. Check what kind of time limits GitHub Actions provides. We had caching on Travis CI to make sure that we used as little time as possible.
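As an aside on the "hashless" case: one common workaround is a manual version suffix in the key (the `-v1` here is a hypothetical placeholder you bump by hand whenever you want a rebuild):

```yaml
- name: cache downloads
  uses: actions/cache@v3
  with:
    # Bump the trailing version number to force a cache miss, which
    # makes actions/cache save a fresh cache after the next run.
    key: cache-downloads-v1
    path: download/
```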

@NucciTheBoss
Contributor Author

Yes, I saw in your fork that the SSD runs out of space when you try to create the cache. I'm not sure if you have the exact numbers, but it seems like the GitHub-hosted bionic and focal runners are pretty bare-bones aside from a relatively decent amount of RAM:

  • 2-core CPU (x86_64)
  • 7 GB of RAM
  • 14 GB of SSD space

With the way I have it, with CirrOS building in parallel, it looks like we can potentially skirt the storage limit because each job is given its own 14 GB SSD. If you build CirrOS for every architecture in the same job, you will run out of space trying to do any larger steps after all the images have been built. However, looking at the docs for the cache action, I believe a new cache is only created at the end of a job if there was a cache miss. That is why I think we need some sort of requirements and/or manifest file to pin the cache(s) to; it would let us force a cache miss that updates the downloads folder.

Regarding the run time: luckily, since CirrOS is an open-source, public project, GitHub is very permissive with what we do with Actions. Each job is allowed to run for a maximum of 6 hours, an entire workflow can run for up to 35 days, and we can have 20 jobs running concurrently. This is where I am getting the information on the limits:

Given this information, I feel confident merging this pipeline for the short term to at least have something to test a fix for #77 (it will break Tempest functional tests for any distro that uses rsa2 as the default), and then get the caching working and bump the CirrOS build instructions to the focal series afterwards. The only thing I will need to do before the merge is drop the current cache mechanism so that it does not produce any unexpected behavior for new pull requests while we work on the caching.

@smoser
Member

smoser commented Jul 30, 2022

Hi, I just wanted to say thank you for looking at this. I will try to review in the next few days.
Thank you.

@hrw
Collaborator

hrw commented Aug 1, 2022

Check also the work done by @osfrickler (Dr. Jens Harbott) on migrating from GitHub to OpenDev: Zuul jobs doing the builds, an update to a newer buildroot, Ubuntu 22.04 kernels, etc.

https://review.opendev.org/q/project:cirros/cirros

Maybe this is a better way.

@NucciTheBoss
Contributor Author

> Hi, I just wanted to say thank you for looking at this. I will try to review in the next few days. Thank you.

Hi @smoser! Thank you for looking at this pull request! I also sent you a ping over Libera a few days ago (from #ubuntu-devel), but I'm not sure if you saw that. Let me know what you think of the pipeline. Unfortunately, I had to drop the caching feature due to the limitations of the GitHub caching action we found earlier. The action seems more geared towards things like npm or cargo, given that the cache key uses file hashes to keep the cache fresh.

I'm hoping that once this is merged, we can push a fix for the RSA2 compatibility issue. I have used OpenDev for other OpenStack-specific projects, but I cannot find anywhere on Gitea where we could publish the build artifacts and/or releases/daily builds so that other folks can easily pull them. Maybe that is a conversation for the future, or we could at least backport the new commits from OpenDev here.

@smoser
Member

smoser commented Aug 1, 2022

> > Hi, I just wanted to say thank you for looking at this. I will try to review in the next few days. Thank you.
>
> Hi @smoser! Thank you for looking at this pull request! I also sent you a ping over Libera a few days ago (from #ubuntu-devel), but I'm not sure if you saw that.

I just did now. Sorry, I'm not using IRC daily for work anymore, so my checking is less frequent.

You should join #cirros on libera. @hrw and @osfrickler are there also.

@smoser
Member

smoser commented Aug 1, 2022

I commented in #cirros that I'm fine to delegate the decision to @hrw and @osfrickler as they're more present than I am here.

What I'd like to have is:

  1. a branch for 0.5 that builds on $NEW_CI_PLATFORM
  2. move towards 0.6.0 release with all the things @osfrickler has taken care to add
  3. source available on github
  4. doc on how to contribute in README.md
  5. releases still published to github (that is free fast network with cdn)

@NucciTheBoss
Contributor Author

Thanks for responding to me 😃. I'll hop into #cirros then to see how we can adapt this pull request to fit the project's needs.

@hrw
Collaborator

hrw commented Aug 2, 2022

So when it comes to caching, it looks like ccache-per-arch plus downloads-per-arch, hashed by bin/build-release, should solve the caching problem.

There are differences in what is downloaded on each architecture.

@hrw
Collaborator

Looks nice. I would move some commands to separate steps and do it all in one file.

@hrw
Collaborator

hrw commented Aug 2, 2022

Here is something I am testing now in https://github.com/hrw/fork-cirros/pull/2

name: CirrOS image builder

on:
  push:
    branches:
      - master
  pull_request:
    branches:
      - master

jobs:
  build_matrix:
    strategy:
      matrix:
        arch: ["aarch64", "arm", "ppc64le", "x86_64"]
    runs-on: ubuntu-20.04

    env:
      ARCHES: "${{ matrix.arch }}"
      BOOTTEST: "true"
      QUIET: 1

    steps:
      - name: Pull cirros source artifacts
        uses: actions/checkout@v3

      - name: cache downloads
        uses: actions/cache@v3
        with:
          key: "downloads-${{ matrix.arch }}-${{ hashFiles('bin/build-release') }}"
          path: download/

      - name: cache ccache
        uses: actions/cache@v3
        with:
          key: "ccache-${{ matrix.arch }}-${{ hashFiles('bin/build-release') }}"
          path: ccache/

      - name: prepare system
        run: bin/system-setup

      - name: install job dependencies
        run: sudo apt install -y cloud-utils qemu-system openbios-ppc

      - name: Build image
        run: bin/build-release daily

      - name: "Upload cirros-${{ matrix.arch }} build artifacts"
        uses: actions/upload-artifact@v3
        with:
          name: "image-${{ matrix.arch }}"
          path: release/

@NucciTheBoss
Contributor Author

@hrw this looks good, but just a quick question: I noticed that you dropped i386, powerpc, and ppc64. smoser had mentioned something about dropping those architectures over IRC, but I missed that conversation in #cirros. Is it official that we want to drop those arches because the Ubuntu kernel we borrow does not support them?

@hrw
Collaborator

hrw commented Aug 2, 2022

In 0.5 we keep them.

And then drop them in the 0.6 release.

@NucciTheBoss
Contributor Author

> In 0.5 we keep them.
>
> And then drop them in the 0.6 release.

Sounds good, I will hang onto them for the time being then. I also added a comment so anyone looking at the workflow will know for later. Also, are we going to hold this pull request until #81 has been merged?

@hrw hrw merged commit 0a0c89d into cirros-dev:master Aug 2, 2022
@NucciTheBoss NucciTheBoss deleted the add-github-ci branch August 4, 2022 01:43