Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

featured: use run() to run cli commands in place of check_call() #177

Open
wants to merge 2 commits into
base: master
Choose a base branch
from

Conversation

anamehra
Copy link
Contributor

Signed-off-by: anamehra [email protected]

Fixes: sonic-net/sonic-buildimage#20662

During some reboots, it was observed that some times featured.service script command fails to start the services like pmon, snmp, lldp etc.

From logs, it was observed that 'sudo systemctl enable ' command failed with errorcode 13 (SIGPIPE.

2024 Oct 29 01:31:26.191236 aaa14-rp INFO featured: Running cmd: '['sudo', 'systemctl', 'unmask', 'pmon.service']'
2024 Oct 29 01:31:26.211167 aaa14-rp INFO systemd[1]: Reloading.
2024 Oct 29 01:31:27.212381 aaa14-rp INFO featured: Running cmd: '['sudo', 'systemctl', 'enable', 'pmon.service']'
2024 Oct 29 01:31:27.232428 aaa14-rp INFO systemd[1]: Reloading.
2024 Oct 29 01:31:28.135667 aaa14-rp ERR featured: ['sudo', 'systemctl', 'enable', 'pmon.service'] - failed: return code - -13, output:#012None
2024 Oct 29 01:31:28.135746 aaa14-rp ERR featured: Feature 'pmon.service' failed to be enabled and started

2024 Oct 29 01:34:08.661711 aaa14-rp INFO featured: Running cmd: '['sudo', 'systemctl', 'enable', 'gnmi.service']'
2024 Oct 29 01:34:08.677242 aaa14-rp INFO systemd[1]: Reloading.
2024 Oct 29 01:34:09.316554 aaa14-rp ERR featured: ['sudo', 'systemctl', 'enable', 'gnmi.service'] - failed: return code - -13, output:#012None
2024 Oct 29 01:34:09.316791 aaa14-rp ERR featured: Feature 'gnmi.service' failed to be enabled and started

The issue does not recover and the pmon and other services never starts. On supervisor this also leads to swss, syncd and other related docker to stay down.

In general systemctl enable does not work for some services like pmon, snmp, lldp etc as there is no WantBy directive set for these services in unit file.

The command returns stderr :

"The unit files have no installation config (WantedBy=, RequiredBy=, Also=,
Alias= settings in the [Install] section, and DefaultInstance= for template
units). This means they are not meant to be enabled using systemctl.

Possible reasons for having this kind of units are:
• A unit may be statically enabled by being symlinked from another unit's
  .wants/ or .requires/ directory.
• A unit's purpose may be to act as a helper for some other unit which has
  a requirement dependency on it.
• A unit may be started when needed via activation (socket, path, timer,
  D-Bus, udev, scripted systemctl call, ...).
• In case of template units, the unit is meant to be enabled with some
  instance name specified.
”

featured python script uses subprocess.check_call() function to invoke the command which looks like is not very reliable at handling the stderr and may cause SIGPIPE with big buffer data.

Modifying the function to use subprocess.run() resolves this issue.

run() is more reliable at handing the return data.

Validated the change with multiple reboots.

@anamehra
Copy link
Contributor Author

Hi @abdosi , please review.

@anamehra anamehra marked this pull request as draft October 31, 2024 22:08
@anamehra anamehra force-pushed the anamehra/featured_1 branch 2 times, most recently from feffc38 to 7b38356 Compare October 31, 2024 23:19
With run(), seeign extra data in buffer and causing order check failure:
2024-11-01T00:00:02.3225936Z E               Actual: [call(['sudo', 'systemctl', 'daemon-reload'], capture_output=True, check=True, text=True),
2024-11-01T00:00:02.3226361Z E                call().stdout.__str__(),
2024-11-01T00:00:02.3226594Z E                call().stderr.__str__(),
2024-11-01T00:00:02.3227055Z E                call(['sudo', 'systemctl', 'unmask', 'dhcp_relay.service'], capture_output=True, check=True, text=True),
2024-11-01T00:00:02.3227342Z E                call().stdout.__str__(),
2024-11-01T00:00:02.3227570Z E                call().stderr.__str__(),
@anamehra anamehra marked this pull request as ready for review November 1, 2024 00:29
@rlhui rlhui requested a review from judyjoseph November 6, 2024 18:22
@anamehra
Copy link
Contributor Author

anamehra commented Nov 8, 2024

Hi @judyjoseph , please help with this PR review. Thanks

Copy link
Contributor

@judyjoseph judyjoseph left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Lgtm

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
Status: No status
Development

Successfully merging this pull request may close these issues.

[202405]: featured enable/start failed for pmon, snmp, lldp services with SIGPIPE
2 participants