-
Notifications
You must be signed in to change notification settings - Fork 48
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Use HEAD versions in presubmit CI #782
Conversation
…now lives in default values for URLREF_* build arguments
… the workflow is explicitly launched with it set to false.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Overall clear to me, was not able to spot issues. Also ran a quick test in a separate branch testing the logic to pin commits for reproducing older builds. LGTM.
I just think - as mentioned below - that some more documentation about manifest.yaml
would be helpful. Especially now where it became more sparse it is difficult to learn by examples.
.github/workflows/ci.yaml
Outdated
@@ -155,15 +159,15 @@ jobs: | |||
# converts manifest yaml to a json object of {SOFTWARE_NAME: URL#REF, ...} | |||
urlrefs=$( | |||
cat .github/container/manifest.yaml |\ | |||
yq -o=json 'to_entries | .[] | select(.value.mode == "git-clone") | {( .key | upcase | sub("-", "_") ): .value.url + "#" + .value.latest_verified_commit}' |\ | |||
yq -o=json 'to_entries | .[] | select(.value.mode == "git-clone") | {( .key | upcase | sub("-", "_") ): .value.url + "#" + .value.commit}' |\ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It was a bit suprising to me, that I had to add mode: git-clone
in order to pull a non-HEAD commit until I found this line. Maybe some documentation about manifest.yaml
would be helpful at some point but no high prio as most devs here are familiar with its mechanics I suppose
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not sure if the mode is still needed at all. But needing better documentation is an evergreen comment...
echo "${line}@${ref}" | ||
# Find the VCS installs, which are of the form | ||
# PACKAGE @ git+GIT_REPO_URL | ||
for line in $(sed -n -e 's/^\([^#].*\) @ git+\(.*\)$/\1=\2/p' requirements.pre); do |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why did the regex expression change? I do understand the logic afterwards (19:27) but just not how records in requirements.pre
differ such that the regrex matcher needs to be changed.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Before: it was just extracting PACKAGE
. Now: it's extracting the GIT_REPO_URL
as well.
This is because in the case that the package is not listed in the manifest file, we need the URL to query the current HEAD.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sorry for the late notice, but there is a small issue around feeding URLREFs into the build-args. See down below for an example/explanation.
# SOURCE_OVERRIDES is a comma-separated list of package=urlref pairs | ||
IFS=, read -ra overrides <<< "${{ inputs.SOURCE_OVERRIDES }}" | ||
for override in "${overrides[@]}"; do | ||
PACKAGE=$(cut -d= -f 1 <<< "${override}" | tr '[:lower:]' '[:upper:]' | tr '-' '_') | ||
URLREF=$(cut -d= -f 2- <<< "${override}") | ||
urlrefs=$(echo "$urlrefs" | jq -c ". + {\"$PACKAGE\": \"$URLREF\"}") | ||
urlrefs=$(echo "$urlrefs" | jq -c ". + {\"$PACKAGE\": \"URLREF_${PACKAGE}=$URLREF\"}") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hey @olupton , can you walk me through the idea of how manifest.yaml and SOURCE_OVERRIDES
are meant to be used together?
If I want to install FLAX on an older commit, I have two options: I can change the manifest.yaml, or use SOURCE_OVERRIDES
. When is it recommended to use what?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think the idea here is that there are two modes.
First, when you're using this machinery to test some feature or fix, you don't particularly want to commit/push to JAX-Toolbox, so you can inject your feature/fix branch using SOURCE_OVERRIDES
, URLREF_XXX
, etc.
Second, when you are preparing a release, you gradually want to converge on a fixed and reproducible set of versions. In that case, you would take some seed manifest.yaml
output (perhaps from a nightly) and commit it to a release branch of JAX-Toolbox, and then iterate from there.
Ultimately I don't see why you would want to mix the two modes, and the second one ought to be pedantic. In mode one, you default to taking the HEAD and recording what that was. In mode two, forgetting to specify a version of something should maybe be an error.
echo "SOURCE_URLREFS=${urlrefs}" >> $GITHUB_OUTPUT | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
urlrefs
is currently just a mapping of $PACKAGE: $URL which we feed into the build-args. However, when we feed them into the build-args, the actual build-arg-key is missing. The mapping should be $PACKAGE: URLREF_$PACKAGE=$PACKAGE
@@ -1,32 +1,14 @@ | |||
jax: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Correct me if I'm wrong, but you suggest to use default values for jax ref#commit
from Dockerfile rather than from manifest file?
What is the scenario for NGC release? Update Dockerfiles accordingly?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Definitely yes for the normal CI: by default we take the Dockerfile defaults, which is basically "HEAD".
When preparing releases, one option (noted here: #782 (comment)) is to commit a manifest file to the release branch and use that. Your suggestion of changing the default values in the Docker files in the release branches would also work, I think, and would require less machinery (to translate manifest --> URLREF build arg) but might be a little more tedious to seed at the beginning of the release process.
cc: @terrykong @ko3n1g
LGTM |
Why this change? It was a request to have PR be tested with fixed dependency. Why change this now? |
Because it didn't work. The previous architecture relied on the same commit of JAX-Toolbox being able to build both the nightlies and the "last known good" state of pinned versions. In reality, those two states of the world did not track closely enough to one another for that requirement to be viable. It was annoying in other ways too, but I think the above is the fundamental point. |
Following #770:
Flip the switch so that presubmit CIs run
bump.sh
. Because now theURLREF_XXX
build argument default values encode the upstream repository names + default branches, there is no need to duplicate that information in themanifest.yaml
committed to this repository.Leave the workflow for generating patches for Rosetta builds untouched for now, even though there is still some duplication.
The logic for tracking commits of
pip-vcs
transitive dependencies moves topip-finalize.sh
, which writes the commits used to a new/opt/manifest.d/pip-finalize.yaml
. Commits pinned in the "input" manifest file are still respected.Note that the Dockerfile changes included in #770 cause a new manifest to be written inside the containers, for example here is the file contained in the upstream-pax container from today's nightly (post-#770, pre-this-PR):
which can be fed back into the machinery if an old build needs to be reproduced.