CUPS get stuck after restart on active cups-browsed print jobs #27

Open
Matze1224 opened this issue Apr 2, 2024 · 1 comment

Comments

@Matze1224

Describe the bug
Printer queues get stuck or are disabled, and sometimes report the status message "No suitable destination host found by cups-browsed, retrying later" or "No destination host name supplied by cups-browsed for printer , is cups-browsed running?". Printing to other cups-browsed printer queues works, at least most of the time.

We first tried to solve the problem by clearing all print queues on the affected workstations (stopping both daemons and clearing printers.conf), but that did not stop it. The patch 57d9351 from #23 did not really solve it either.

At the moment, I suspect a line of shell script in our configuration management (fai) which restarts the cups-browsed daemon after a configuration change.

To Reproduce
Steps to reproduce the behavior:

  1. Print a job to a printer queue managed by cups-browsed. It helps if the job is longer than a few pages, so you have more time to react.
  2. Wait until the application has sent the job and CUPS is processing it.
  3. Restart the cups or cups-browsed systemd unit. Either should result in one of the errors described above.
  4. Printing again to the same queue yields the same error. A single print job may pass through, but the same error is expected.
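
The steps above could be scripted roughly as follows. This is only a sketch: the queue name `example_queue`, the file path, and the 5 second wait are placeholders and not taken from the affected setup. By default the script only prints the commands. Set `DRY_RUN=0` to actually run them against a real cups-browsed-managed queue.

```shell
#!/bin/sh
# Sketch of the reproduction steps. QUEUE, FILE and the 5 s delay are
# placeholders (assumptions); adjust them to a real cups-browsed queue.
QUEUE="${QUEUE:-example_queue}"
FILE="${FILE:-/path/to/large-document.pdf}"
DRY_RUN="${DRY_RUN:-1}"

# Print the command in dry-run mode, otherwise execute it.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

reproduce() {
    run lp -d "$QUEUE" "$FILE"            # step 1: submit a multi-page job
    run sleep 5                           # step 2: give CUPS time to start processing
    run systemctl restart cups-browsed    # step 3: restart while the job is active
    run lpstat -p "$QUEUE"                # inspect queue state and status message
    run lp -d "$QUEUE" "$FILE"            # step 4: print again to the same queue
}

reproduce
```

Replacing `cups-browsed` with `cups` in the restart line covers the other variant of step 3.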

Expected behavior
Printing keeps working across a restart of either of the responsible daemons, or at least the error does not persist afterwards.

System Information:

  • OS: Ubuntu
  • Version 22.04 LTS

cups-browsed and cups-filters are backported from Ubuntu mantic (version 2.0.0-0ubuntu2) because earlier versions caused trouble in conjunction with our CUPS server on Debian 11.
The patch 57d9351 is applied as well.

Additional context
From my current understanding, the problem occurs when CUPS wants to print but cups-browsed hasn't detected the remote printer yet. This could happen in the following cases:

  1. cups-browsed is restarting and therefore printing isn't possible.
  2. CUPS has started and has unfinished jobs from the previous run. The daemon tries to print, but cups-browsed hasn't detected the remote printers yet, so there is no target to print to.

When restarting CUPS, these problematic print queues persist, while other queues only appear after cups-browsed has detected them. I don't know whether that is just CUPS behavior because a print job is still pending for the queue.

On most workstations that reported the problem, we found log messages showing that systemd had killed the service at some point, and cups-browsed reported the following message for all print queues at the next start:

Timeout happened during creation of the queue <name>, turn on DebugLogging for more info.
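
To see which queues are affected on a workstation, the message above can be filtered out of the log. The following is a minimal sketch that reads log text on stdin and relies only on the message wording quoted above. The function name `timed_out_queues` is made up for illustration.

```shell
#!/bin/sh
# Sketch: list the queue names from cups-browsed's "Timeout happened" messages.
# Reads log text on stdin; where that text comes from is up to the caller.
timed_out_queues() {
    sed -n 's/.*Timeout happened during creation of the queue \(.*\), turn on DebugLogging for more info\..*/\1/p' | sort -u
}
```

Assuming the service logs to the journal, it could be fed with `journalctl -u cups-browsed --no-pager | timed_out_queues`.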

As a temporary workaround, we are now trying the following systemd unit override for cups.service:

# /etc/systemd/system/cups.service.d/20-cups-jobs.conf
[Service]
ExecStartPre=/usr/bin/find /var/spool/cups -maxdepth 1 -type f -delete

It clears the spool directory before cupsd starts, so that no leftover job can be sent to a cups-browsed printer that has not been detected yet.
I can report back on whether this solves it, at the earliest after the next restart.

It would be nice if cups-browsed were more resilient in this situation.
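
Before enabling such an override, it may be worth checking what it would remove. The sketch below uses the same `find` expression as the `ExecStartPre` line, with `-print` in place of `-delete`, so nothing is deleted. The function name `preview_spool_cleanup` is hypothetical.

```shell
#!/bin/sh
# Sketch: list what the ExecStartPre line would delete, without deleting anything.
# Same find expression as in the override, with -print instead of -delete.
preview_spool_cleanup() {
    spool_dir="${1:-/var/spool/cups}"
    find "$spool_dir" -maxdepth 1 -type f -print
}
```

Because of `-maxdepth 1`, only regular files directly inside the spool directory are listed and files in subdirectories are left alone. Reading `/var/spool/cups` usually requires root.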

@Matze1224
Author

In addition to the reproducible error, we still had problems with this issue. It looks like 87 network printers plus a VPN connection isn't the best use case. In the debug log, I noticed long waits in the update_netifs function, especially if the client is connected via VPN. Maybe our recent VPN performance problems made it exponentially worse.

On in-house workstations, rechecking all printer queues took around 3 s; for VPN clients it took around 1-2 min. Now it consistently takes 0 s with the following configuration option (which makes it a lot more stable):

FrequentNetifUpdate No
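
To verify across many workstations whether the option is actually in effect, a small check like the following could be used. It is a sketch under assumptions: the default path `/etc/cups/cups-browsed.conf`, the helper name `browsed_setting`, and the "last uncommented occurrence wins" rule are not taken from this report.

```shell
#!/bin/sh
# Sketch: print the value of a cups-browsed.conf directive, or "unset".
# The default path is an assumption; pass another file as the second argument.
browsed_setting() {
    directive="$1"
    conf="${2:-/etc/cups/cups-browsed.conf}"
    awk -v d="$directive" '
        tolower($1) == tolower(d) { v = $2 }   # last uncommented occurrence wins (assumption)
        END { print (v == "" ? "unset" : v) }
    ' "$conf"
}
```

For example, `browsed_setting FrequentNetifUpdate` prints `No` once the line above is present and not commented out.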

The trouble that a long printer queue refresh causes is quite interesting: the daemon can't answer implicitclass, which expects the daemon to respond with the URL for the printer. Because the daemon doesn't interrupt its queue refresh for this, implicitclass times out, and CUPS reacts to this error according to its policy (disabling the printer, etc.).

I'd better not ask why this function needs to be called for each printer queue individually; I just wish I had found this configuration option sooner ;-)
I will check whether this one configuration option fixes everything or whether the original errors are still reproducible with these better timings. I have mixed feelings about this, because in-house workstations were affected by the original problems too.
