Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

internal/exec/util: check if unit exists before disabling #1621

Merged
merged 1 commit into from
May 9, 2023

Conversation

prestist
Copy link
Collaborator

@prestist prestist commented May 4, 2023

Ignition depends on systemctl disable for disabling units. Currently
if the unit does not exist systemctl disable exits 1; however, before
systemd 252 systemctl disable exits 0 if --root is specified. Since
Ignition depends on systemctl's exit code the new behavior caused a
regression, causing the unit.remove.symlinks blackbox test to fail with:

removing enablement symlink(s) for "enoent.service": cannot remove symlink(s) for enoent.service: exit status 1: "Failed to disable unit, unit enoent.service does not exist.\n"

Before disabling a unit, use systemctl is-enabled to verify that the
unit exists and is enabled.

Fixes #1614.

@prestist prestist force-pushed the non-existing-systemd-check branch 2 times, most recently from 1d9815a to 3932159 Compare May 4, 2023 21:18
Copy link
Contributor

@bgilbert bgilbert left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In the commit message, let's specify the systemd version that changed the behavior. Bonus points for linking to the commit.

We should also link to the Ignition issue, and mention the blackbox test that this fixes.

internal/exec/util/unit.go Outdated Show resolved Hide resolved
internal/exec/util/unit.go Outdated Show resolved Hide resolved
internal/exec/util/unit.go Outdated Show resolved Hide resolved
@prestist prestist force-pushed the non-existing-systemd-check branch 2 times, most recently from f5fe758 to f659a9e Compare May 8, 2023 21:32
@prestist
Copy link
Collaborator Author

prestist commented May 8, 2023

In the commit message, let's specify the systemd version that changed the behavior. Bonus points for linking to the commit.

So re: the commit, I put the hash, but would it be better to include the whole link? (didnt know what convention was)

We should also link to the Ignition issue, and mention the blackbox test that this fixes.

11000001% for both. I need to track down the blackbox test really quick but I will add that as well, (sorry did not include it in the latest push)

internal/exec/util/unit.go Outdated Show resolved Hide resolved
docs/release-notes.md Outdated Show resolved Hide resolved
internal/exec/util/unit.go Outdated Show resolved Hide resolved
internal/exec/util/unit.go Outdated Show resolved Hide resolved
@bgilbert
Copy link
Contributor

bgilbert commented May 8, 2023

So re: the commit, I put the hash, but would it be better to include the whole link? (didnt know what convention was)

A link is the most future-proof; you can shorten the commit hash in the link to 10 characters or so. Since systemd is on GitHub, a middle ground is to say systemd/systemd@7a6c73d, which becomes a link on the GitHub site but doesn't get linkified in git log etc.

Also, you have a couple sentence fragments in the commit message.

@prestist prestist force-pushed the non-existing-systemd-check branch 2 times, most recently from 2b8c592 to be65cb4 Compare May 9, 2023 04:02
@bgilbert
Copy link
Contributor

bgilbert commented May 9, 2023

I agree that the problem was introduced in systemd 252. Fedora 37 has 251 and doesn't have the problem, while #1614 says that 252 is affected. However, systemd/systemd@7a6c73d can't have been the culprit, since that's in 251. If all we have to pinpoint the commit is the report in #1614, let's just drop the commit hash from the message. It's not worth trying to track down the exact change.

For the commit message, how about something like this?

internal/exec/util: check if unit exists before disabling

`systemctl disable` exits 1 if the unit does not exist, but on systemd
before 252, it exits 0 if `--root` is specified.  Ignition depended on that
behavior, causing the unit.remove.symlinks blackbox test to fail with:

    removing enablement symlink(s) for "enoent.service": cannot remove symlink(s) for enoent.service: exit status 1: "Failed to disable unit, unit enoent.service does not exist.\n"

Before disabling a unit, use `systemctl is-enabled` to verify that the
unit exists and is enabled.

Fixes https://github.com/coreos/ignition/issues/1614.

It's useful to record the details of the failing test for searchability.

docs/release-notes.md Outdated Show resolved Hide resolved
@prestist
Copy link
Collaborator Author

prestist commented May 9, 2023

I agree that the problem was introduced in systemd 252. Fedora 37 has 251 and doesn't have the problem, while #1614 says that 252 is affected. However, systemd/systemd@7a6c73d can't have been the culprit, since that's in 251. If all we have to pinpoint the commit is the report in #1614, let's just drop the commit hash from the message. It's not worth trying to track down the exact change.

Yeah I did not track down the exact change outside of the report. I will drop that detail thank you for fact checking, I should have done that.

For the commit message, how about something like this?

internal/exec/util: check if unit exists before disabling

`systemctl disable` exits 1 if the unit does not exist, but on systemd
before 252, it exits 0 if `--root` is specified.  Ignition depended on that
behavior, causing the unit.remove.symlinks blackbox test to fail with:

    removing enablement symlink(s) for "enoent.service": cannot remove symlink(s) for enoent.service: exit status 1: "Failed to disable unit, unit enoent.service does not exist.\n"

Before disabling a unit, use `systemctl is-enabled` to verify that the
unit exists and is enabled.

Fixes https://github.com/coreos/ignition/issues/1614.

yes that def sounds more clear :(

It's useful to record the details of the failing test for searchability.

Ok, I was on the edge (before your message) of not including the blackbox test, mostly because it felt like a large amount of focus on the cause of the change rather then the change.
As an example we generally dont add details of why we added a feature, but instead the description of the new feature.

Though after reading your commit message I do like it.

@prestist prestist force-pushed the non-existing-systemd-check branch from be65cb4 to 08741d9 Compare May 9, 2023 13:59
@prestist prestist marked this pull request as ready for review May 9, 2023 14:01
@prestist
Copy link
Collaborator Author

prestist commented May 9, 2023

I ended up taking some of your suggestion and adjusted it to my taste I hope that I did not make it worse lol.

@bgilbert
Copy link
Contributor

bgilbert commented May 9, 2023

Ok, I was on the edge (before your message) of not including the blackbox test, mostly because it felt like a large amount of focus on the cause of the change rather then the change.
As an example we generally dont add details of why we added a feature, but instead the description of the new feature.

Yeah, that's true. But, when there's a concise failure (error message or test failure), I think it's useful to include that to help with reproducing the issue later (or finding the previous fix if the fix regresses).

I like the additional detail in the commit message, but I can't let systemctrl stand. 😛 A few edits:

internal/exec/util: check if unit exists before disabling

Ignition depends on `systemctl disable` for disabling units. Currently
if the unit does not exist `systemctl disable` exits 1; however, before
systemd 252 `systemctl disable` exits 0 if `--root` is specified. Since
Ignition depends on systemctl's exit code the new behavior caused a
regression, causing the unit.remove.symlinks blackbox test to fail with:

    removing enablement symlink(s) for "enoent.service": cannot remove symlink(s) for enoent.service: exit status 1: "Failed to disable unit, unit enoent.service does not exist.\n"

Before disabling a unit, use `systemctl is-enabled` to verify that the
unit exists and is enabled.

Fixes coreos#1614.

And remember to update the PR title/description also.

Ignition depends on `systemctl disable` for disabling units. Currently
if the unit does not exist `systemctl disable` exits 1; however, before
systemd 252 `systemctl disable` exits 0 if `--root` is specified. Since
Ignition depends on systemctl's exit code the new behavior caused a
regression, causing the unit.remove.symlinks blackbox test to fail with:

    removing enablement symlink(s) for "enoent.service": cannot remove symlink(s) for enoent.service: exit status 1: "Failed to disable unit, unit enoent.service does not exist.\n"

Before disabling a unit, use `systemctl is-enabled` to verify that the
unit exists and is enabled.

Fixes coreos coreos#1614
@prestist prestist force-pushed the non-existing-systemd-check branch from 08741d9 to 13ecdf1 Compare May 9, 2023 17:41
@prestist prestist changed the title unit: check if unit exists before disabling internal/exec/util: check if unit exists before disabling May 9, 2023
@prestist prestist enabled auto-merge May 9, 2023 17:45
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

handle non-existing systemd units deactivation
2 participants