Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fully-automate merges for the simplest class of package publications PRs #26106

Open
3 tasks
shonfeder opened this issue Jun 17, 2024 · 3 comments
Open
3 tasks

Comments

@shonfeder
Copy link
Contributor

Even in the imperfect state of our current CI results, there are some package updates that are straightforward enough that we could fully automate their publication, provided we organize our CI sufficiently. We should do so.

Benefits

I proposed this in a maintainer meeting, and the sentiment was that the packages that currently fall into this bucket are so easy to review, that it should not be a high-priority. I think I disagree here: it is easy to underestimate the benefits of automating away small friction points, and it sets us up the processes for something we are going to need before long anyhow.

Main benefits I have in mind:

  • Better publication experience for package authors making these kinds of updates.
  • Less busywork for maintainers, allowing us to focus on the packages that actually need human review.
  • Lays the basis of a fully automated publication pipeline, upon which we can gradually expand.

Criteria

The current requirements for a fully automated publication would be, at least:

  1. The PR upates an existing package.
  2. All CI checks pass
  3. The author of the PR is a known maintainer/owner of the existing package

(3) Could be addressed via something like a .codeowners file in the package dir, or a linting check that compares the package data with the author of the PR triggering the CI process. Either of these would amount to piggybacking on github's auth, and using the repo contents as our source for permissioning configuration.

Required

Related


Additional thoughts and considerations on the requirements etc. would be helpful. :)

@hannesm
Copy link
Member

hannesm commented Jun 19, 2024

I'm not sure why #23789 is related.

My though is: has opam-repo-ci been reviewed and developed in a style to fail on errors? I remember there have been CI issues where it resulted in "ok" but didn't do anything. With the perspective of an attacker, if I manage to figure out something that leads to such CI confusion, and auto-merge is enabled -- I could easily take over the opam-repository. It may help to have a separate CI system, e.g. using GitHub action or likewise, that does a very small amount of checks (such as maintainer metadata, the diff is straightfotward (doesn't touch too many packages, checksums aren't changed, ...). You can see that I have some reservations with "ocurrent" and "opam-repo-ci" -- which is great to have, but at the same time a pretty big chunk of code - and I'm not aware of any design documents, neither a full system specification.

With the idea of "author/maintainer" of a package, this is still very much undefined. I worked years ago to find out "who's actually maintaining package X" by using the metadata from git. I guess this is a much more resilient approach than trying to decode the "maintainer" and "author" fields of opam -- which are very inconsistent. Turns out, in the long run, for conex we need that information about packages in any case.

@shonfeder
Copy link
Contributor Author

shonfeder commented Jun 20, 2024

Thank you for the considerations :)

I'm not sure why #23789 is related.

I guess you have a narrower topic in mind with the question in the title, "how does opam-repository scale", but the amount of manual work required is a bottleneck to scaling for a more active repo.

opam-repo-ci been reviewed and developed in a style to fail on errors?

That is certainly the hope! I'm sure there is room for improvement.

It may help to have a separate CI system, e.g. using GitHub action or likewise, that does a very small amount of checks

I agree that it would make sense for this kind of check to be in a simple, dedicated CI job, that was very easy to reason about.

With the idea of "author/maintainer" of a package, this is still very much undefined.

I don't think I get this point. We have a maintainer field, no? But using something like a .codeowners file may be better for the reasons you mention. One way of thinking about this effort is as a way of giving established package authors more autonomy over package publication, when everything is on the happy path. So I think it is worth considering adding support for that in way that may be orthogonal to opam's package meta data.

@shonfeder shonfeder changed the title Fully-automate merges for the simpelest class of package publications PRs Fully-automate merges for the simplest class of package publications PRs Jun 24, 2024
@yawaramin
Copy link
Contributor

Isn't checking the GitHub username good enough?

  • If the package is a new version of an existing one
  • And the GitHub username of the PR author is the same as the username in the opam file's dev-repo field
  • And the build is passing

Then auto-merge the PR?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants