OCPBUGS-9685: daemon: Always remove pending deployment before we do updates #3599

Merged on Mar 9, 2023 (2 commits)
38 changes: 22 additions & 16 deletions pkg/daemon/update.go
@@ -2120,6 +2120,28 @@ func (dn *CoreOSDaemon) applyLayeredOSChanges(mcDiff machineConfigDiff, oldConfi
 		defer os.Remove(extensionsRepo)
 	}

+	// Always clean up pending, because the RT kernel switch logic below operates on booted,
+	// not pending.
+	if err := removePendingDeployment(); err != nil {

Contributor:
In what ways can cleanup -p fail? I assume it won't error if there isn't a pending deployment?

Contributor:
Yeah, it doesn't return an error. I tested it on a local machine:

$ rpm-ostree cleanup -p
Deployments unchanged.
$ echo $?
0

Member, Author:
Right, it's idempotent. Also crucially, this code path is executed by CI, so if it didn't work, CI would fail.

+		return fmt.Errorf("failed to remove pending deployment: %w", err)
+	}
+
+	defer func() {
+		// Operations performed by rpm-ostree on the booted system are available
+		// as staged deployment. It gets applied only when we reboot the system.
+		// In case of an error during any rpm-ostree transaction, removing pending deployment
+		// should be sufficient to discard any applied changes.
+		if retErr != nil {
+			// Print out the error now so that if we fail to cleanup -p, we don't lose it.
+			glog.Infof("Rolling back applied changes to OS due to error: %v", retErr)
+			if err := removePendingDeployment(); err != nil {
+				errs := kubeErrs.NewAggregate([]error{err, retErr})
+				retErr = fmt.Errorf("error removing staged deployment: %w", errs)
+				return
+			}
+		}
+	}()
+
 	// If we have an OS update *or* a kernel type change, then we must undo the RT kernel
 	// enablement.
 	if mcDiff.osUpdate || mcDiff.kernelType {
@@ -2148,22 +2170,6 @@ func (dn *CoreOSDaemon) applyLayeredOSChanges(mcDiff machineConfigDiff, oldConfi
 	// if we're here, we've successfully pivoted, or pivoting wasn't necessary, so we reset the error gauge
 	mcdPivotErr.Set(0)

-	defer func() {
-		// Operations performed by rpm-ostree on the booted system are available
-		// as staged deployment. It gets applied only when we reboot the system.
-		// In case of an error during any rpm-ostree transaction, removing pending deployment
-		// should be sufficient to discard any applied changes.
-		if retErr != nil {
-			// Print out the error now so that if we fail to cleanup -p, we don't lose it.
-			glog.Infof("Rolling back applied changes to OS due to error: %v", retErr)
-			if err := removePendingDeployment(); err != nil {
-				errs := kubeErrs.NewAggregate([]error{err, retErr})
-				retErr = fmt.Errorf("error removing staged deployment: %w", errs)
-				return
-			}
-		}
-	}()
-
 	if mcDiff.kargs {
 		if err := dn.updateKernelArguments(oldConfig.Spec.KernelArguments, newConfig.Spec.KernelArguments); err != nil {
 			return err