NUT service randomly stops connecting to UPS despite service active, restarting service workaround (Cyberpower CP1500PFCLCD) #2667

Open
chocmake opened this issue Oct 31, 2024 · 4 comments
Labels

  • Connection stability issues: Issues about driver<->device and/or networked connections (upsd<->upsmon...) going AWOL over time
  • CyberPower (CPS)
  • impacts-release-2.7.4: Issues reported against NUT release 2.7.4 (maybe vanilla or with minor packaging tweaks)
  • Linux: Some issues are specific to Linux as a platform
  • raspberry

Comments

@chocmake

chocmake commented Oct 31, 2024

Twice in the last couple of months I happened to notice that the NUT server wasn't returning info about the UPS when queried. Each time this was resolved by either restarting the nut-server service or rebooting the server. Below is the most recent experience.

Note that this may have occurred months earlier too, but since nothing happened (no power outages, and nothing set up to push logs to me) I wouldn't have noticed any prior events.

After SSH'ing into the server I checked the nut-server and nut-monitor logs; they were reporting that the connection to the UPS was 'unavailable' / timing out. Both services reported active status, though.

journalctl -u nut-server -n 20 -f output (these messages repeated, and the latest are from a week ago for some reason):

Oct 22 03:48:45 pi upsd[467]: Connected to UPS [cyberpower]: usbhid-ups-cyberpower
Oct 22 03:49:11 pi upsd[467]: Data for UPS [cyberpower] is stale - check driver
Oct 22 04:35:04 pi upsd[467]: Send ping to UPS [cyberpower] failed: Resource temporarily unavailable

journalctl -u nut-monitor -n 20 -f output (this message just repeats):

Oct 30 14:45:59 pi upsmon[478]: UPS cyberpower@localhost is unavailable
Oct 30 14:48:14 pi upsmon[478]: UPS [cyberpower@localhost]: connect failed: Connection failure: Connection timed out
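
Since both services stay "active" while these messages repeat, the failure is only visible in the logs. A minimal sketch of detecting it mechanically, assuming the message texts quoted above; the function name `has_ups_failure` is hypothetical and not part of NUT:

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of NUT): reads log text on stdin and
# succeeds (exit 0) if it contains one of the failure messages quoted
# in this report, otherwise fails (exit 1).
has_ups_failure() {
    grep -Eq 'is stale - check driver|Send ping to UPS .* failed|is unavailable|connect failed'
}

# Example use (assumes the unit names shown in this report):
#   journalctl -u nut-server -u nut-monitor --since "10 minutes ago" --no-pager \
#       | has_ups_failure && echo "UPS link looks down"
```

The patterns are matched loosely on purpose, so they may need adjusting if another NUT version words these messages differently.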

When I tried to restart nut-server via sudo systemctl restart nut-server it failed:

Job for nut-server.service failed because a timeout was exceeded.
See "systemctl status nut-server.service" and "journalctl -xe" for details.  

systemctl status nut-server.service output:

● nut-server.service - Network UPS Tools - power devices information server
     Loaded: loaded (/lib/systemd/system/nut-server.service; enabled; vendor preset: enabled)
     Active: failed (Result: timeout) since <date>; 1min 3s ago
    Process: 115904 ExecStart=/sbin/upsd (code=exited, status=0/SUCCESS)
        CPU: 49ms

Oct 30 15:22:15 pi upsd[115904]: fopen /run/nut/upsd.pid: No such file or directory
Oct 30 15:22:15 pi upsd[115904]: listening on 0.0.0.0 port 3493
Oct 30 15:22:15 pi upsd[115904]: listening on 0.0.0.0 port 3493
Oct 30 15:23:45 pi systemd[1]: nut-server.service: start operation timed out. Terminating.
Oct 30 15:23:45 pi upsd[115904]: Can't connect to UPS [cyberpower] (usbhid-ups-cyberpower): Interrupted system call
Oct 30 15:23:45 pi upsd[115904]: Can't connect to UPS [cyberpower] (usbhid-ups-cyberpower): Interrupted system call
Oct 30 15:23:45 pi upsd[115918]: Startup successful
Oct 30 15:23:45 pi upsd[115918]: Signal 15: exiting
Oct 30 15:23:45 pi systemd[1]: nut-server.service: Failed with result 'timeout'.
Oct 30 15:23:45 pi systemd[1]: Failed to start Network UPS Tools - power devices information server.

However, immediately upon starting the service using sudo systemctl start nut-server the connection to the UPS was logged as being established again (both in the server and client terminal SSH sessions) and everything went back to normal.


Environment:

  • OS: Raspberry Pi OS (Debian 11/Bullseye)
  • NUT: v2.7.4-13 (via apt package install)
  • Server HW: Raspberry Pi Zero 2 W
  • UPS: Cyberpower CP1500PFCLCD

It should be noted that nothing else is running on the Pi server besides NUT. The server also uses a wired connection (Wi-Fi is disabled).


/etc/nut/ups.conf

maxretry = 3
pollinterval = 2

[cyberpower]
    desc = "Cyberpower CP1500PFCLCD"
    driver = "usbhid-ups"
    port = "auto"
    vendorid = "0764"
    productid = "0501"
    product = "CRJB103.551"
    serial = "CPS"
    vendor = "CP1500EPFCLCD"
    bus = "001"
    offdelay = 120
    ondelay = 0

@desertwitch
Contributor

desertwitch commented Oct 31, 2024

NUT 2.7.4 is a couple of years old now, do you have any chance to try a newer version?
There have been numerous improvements in all areas since then, so it might be worth a shot.
See here on how to INSTALL from source: https://github.com/networkupstools/nut/blob/master/INSTALL.nut.adoc

@chocmake
Author

chocmake commented Oct 31, 2024

> NUT 2.7.4 is a couple of years old now, do you have any chance to try a newer version?

On Bullseye, 2.7.4-13 is the latest version available per apt-cache. It looks like Bookworm's latest is v2.8.0-7 (though I have read about various issues with Bookworm on Zero 2 W models).

Compiling from source looks a bit more involved, though if that's the only way around this I may have to try it sometime.

I suppose a workaround would be scheduling some script to periodically read the service's logs and trigger a service restart? Though I ran into the timeout issues with that above, so I'd also have to implement fallbacks.
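
A minimal sketch of such a watchdog, under stated assumptions: it queries the UPS with `upsc` rather than parsing logs, and falls back from `restart` to `start` because that is what worked in the report above. The function names and the `UPS_NAME` / `SERVICE` variables are hypothetical, not part of NUT:

```shell
#!/usr/bin/env bash
# Hypothetical watchdog sketch; names below are illustrative only.
UPS_NAME="${UPS_NAME:-cyberpower@localhost}"   # UPS name from this report
SERVICE="${SERVICE:-nut-server}"               # unit name from this report

# Succeeds if upsd can return the UPS status (e.g. "OL"); fails if the
# data is stale or the connection cannot be made.
ups_reachable() {
    upsc "$UPS_NAME" ups.status >/dev/null 2>&1
}

# Try a restart first; if that fails (e.g. the start timeout seen above),
# fall back to a plain start.
restart_service() {
    systemctl restart "$SERVICE" || systemctl start "$SERVICE"
}

# One watchdog pass: report "ok" when healthy, otherwise restart.
watchdog_once() {
    if ups_reachable; then
        echo "ok"
    else
        echo "restarting"
        restart_service
    fi
}
```

Calling `watchdog_once` from a root cron job or a systemd timer every few minutes would be one way to schedule it. This only papers over the disconnects and does not address their cause.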

@jimklimov
Member

Well, 2.7.4 is actually close to 8.5 years old now (Mar 2016).

As for building, check also https://github.com/networkupstools/nut/wiki/Building-NUT-for-in%E2%80%90place-upgrades-or-non%E2%80%90disruptive-tests - it lists dependencies/tools as well as the methodology; current recipes have a good chance to inherit build settings detected from the older installation to become a sort of in-place ad-hoc replacement.

With CPS, it may also help to increase the polling rate - their controllers apparently go into power-saving or something, if poked only every half a minute (default).
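
A sketch of what a higher polling rate could look like in /etc/nut/ups.conf, assuming the half-minute default mentioned here refers to the usbhid-ups `pollfreq` option (the interval between full updates) rather than the global `pollinterval`. The comment does not name the setting, and the value 15 is purely illustrative:

```ini
[cyberpower]
    driver = "usbhid-ups"
    port = "auto"
    # Assumption: pollfreq is the setting meant above; it is the number of
    # seconds between full updates in usbhid-ups. 15 is an example value.
    pollfreq = 15
```

The driver has to be restarted for a change in ups.conf to take effect.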

jimklimov added the labels CyberPower (CPS), raspberry, Linux, impacts-release-2.7.4, and Connection stability issues on Nov 1, 2024

@chocmake
Author

chocmake commented Nov 2, 2024

Thanks. My pollinterval is already rather low (2 seconds), so I upped my maxretry to 5 per the Arch wiki suggestion. I've also implemented the usb_resetter suggestion, and I'll see how it fares over the coming month.

Otherwise I'll probably have to try the source route.
