Sorry for posting a test case that focuses only on the busy-loop bug going away. It is also important that cups-browsed still performs its original function correctly: automatically creating print queues for network printers (IPP printers, remote CUPS queues, Printer Applications).
At least nobody has commented here that their printing ceased to work after the update. I checked the functionality on my Oracular system with my printers, and it is also verified, for both Oracular and Noble, by the package passing its autopkgtest (otherwise the package would not have made it into -proposed).

My autopkgtest is the script test/run-tests.sh in the upstream source of cups-browsed. While cups-browsed and cupsd are running, this script

- creates two software emulations of driverless IPP printers (Printer Applications) with unique names, which print jobs into a file
- waits until cups-browsed has auto-created CUPS queues for them, checking the presence of the queues with "lpstat -v"
- sends a print job to one of them
- waits until the job appears in the CUPS queue and disappears from it
- checks the presence of the print job output file
- kills the Printer Applications
- checks whether cups-browsed removes the queues
- shuts down cups-browsed

This test can easily be done by any of you, no printer required. Install it via

  sudo apt install cups-browsed-tests

Then run (no root or sudo required):

  run-tests.sh

Answer the questions with "3" (use system's cups-browsed) and "N" (do not use Valgrind).

The test sequence described above is then run, with verbose output on the screen telling what is happening. If everything completed correctly, the exit status is 0. Run

  echo $?

to check the exit status.

** Description changed:

[ Impact ]

During the past months users have often observed that cups-browsed takes 100% or 200% (2 cores) of CPU and ceases to do its actual work. This even happens for users who do not print anything: printers shared by other computers in the local network, triggering cups-browsed to make them available on the local machine, are enough to make cups-browsed get stuck hogging 1 or 2 CPU cores. To free the CPU cores and make cups-browsed work again one needs to kill and restart it.

The bug is not easily reproducible. It only occurs sporadically.
Restarting a stuck cups-browsed does not get it stuck again immediately.

It is annoying for users that suddenly a significant part of their CPU power gets hogged and the machine's fan produces noise, especially as most users do not know the root cause or how to stop it.

The problem is caused by concurrent use of one global HTTP connection to CUPS by several sub-threads. This corrupts the data structure, which makes the httpGets() function of CUPS fall into an infinite loop.

The proposed packages (for Noble and Oracular) contain a backport of the fix from upstream version 2.1.1 (already uploaded to Plucky). This fix lets each function create its own HTTP connection to CUPS instead of using one single global one. None of these connections is used by multiple threads and therefore the problem should go away.

I produced the fix solely by taking a few backtraces (from reporters of the bug and one where I observed the bug myself) to locate it (it always gets stuck in httpGets() of libcups), and by reviewing the HTTP-related code of libcups and of cups-browsed, discovering the described problem and remedying it as described. I did not do any before/after testing. I based myself only on my observations and code reviews.

What happens in httpGets() is described in my comments here (especially near the end):

https://github.com/OpenPrinting/cups/issues/879

[ Test Plan ]

UPDATE

Please everybody do this test, especially users of Oracular (24.10) as we still need verification there.

Paste the following script into a file

```
#!/bin/sh
while true; do
  service cups-browsed restart
  printf .
  sleep 15s
done
```

and make the file executable. Then execute the file and leave it running for some hours.

Without the update applied, cups-browsed will sooner or later get stuck with 100% CPU. Once this happens, it will require a SIGKILL signal ("kill -9") to be stopped.
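To illustrate the difference between the two signals, here is a minimal sketch that first asks a process to exit with SIGTERM and escalates to SIGKILL only if the process is still there after a grace period. It is not part of cups-browsed or its packaging; the function name stop_or_kill and the default grace period of 5 seconds are assumptions made up for this example.

```sh
#!/bin/sh
# Hypothetical helper, for illustration only (not shipped with cups-browsed).
# Usage: stop_or_kill PID [GRACE_SECONDS]
stop_or_kill() {
    pid=$1
    grace=${2:-5}    # seconds to wait after SIGTERM (assumed default)

    # Polite request first; if the signal cannot be delivered
    # (for example the process is already gone) there is nothing to do.
    kill -TERM "$pid" 2>/dev/null || return 0

    # Poll once per second to see whether the process has exited.
    while [ "$grace" -gt 0 ]; do
        kill -0 "$pid" 2>/dev/null || return 0
        sleep 1
        grace=$((grace - 1))
    done

    # Still alive after the grace period: SIGKILL cannot be ignored.
    kill -KILL "$pid" 2>/dev/null
    echo "PID $pid ignored SIGTERM, sent SIGKILL"
}
```

For a stuck cups-browsed this would have to be run as root, for example with the PID reported by "pidof cups-browsed".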
As the "service cups-browsed restart" command only sends SIGTERM, the stuck cups-browsed will keep running at 100% CPU while the script keeps spinning and trying to restart cups-browsed every 15 seconds. This way you will see the failure whenever you come back and check; there is no need to be present through the whole process.

Now stop the script with Ctrl+C and do

```
killall -9 cups-browsed
```

After that, update to the package with the fix proposed here.

Now run the script again and leave it running for some hours. cups-browsed should not get stuck.

Thanks a lot, Jeffrey Knockel (jeff250), for providing this testing method (comment #40).

UPDATE 2

Unfortunately, a failure of cups-browsed does not stop the script; cups-browsed gets killed after a timeout of 90 seconds. So to capture failures I ran the following command from another terminal:

  while true; do sleep 2; ps aux | grep /usr/sbin/cups-browsed | grep -v grep; done | tee log.txt

This produces a line like

  cups-br+ 949545 0.2 0.0 817024 20968 ? Ssl 08:58 0:00 /usr/sbin/cups-browsed

every 2 seconds, with the "0:00" right before "/usr/sbin/cups-browsed" being the accumulated CPU time of the process. When it does not get stuck, cups-browsed never accumulates visible CPU time while running for only 15 seconds, but when it hangs in a busy loop for up to 90 seconds the CPU time becomes visible. So

  grep -v ' 0:00 /usr/sbin/cups-browsed' log.txt

easily reveals the fact.

+
+ UPDATE 3
+
+ To test whether cups-browsed's original functionality did not get broken
+ by the update see comment #45. Especially the fact that the update has
+ passed its autopkgtest on both Oracular and Noble is evidence that
+ cups-browsed is still doing what it was designed for.

ORIGINAL TEXT

Due to the problem only occurring sporadically it is not easy to make a test, install the proposed, fixed version, do the same test again and see that the problem has disappeared.
But if any of you observes the bug with a certain frequency, for example in 1 of 10 attempts, you could try until you hit the bug with the old version, update, and then try again. If you reach a reasonably high number of tests without the bug occurring again, you could consider it fixed.

The bug requires cups-browsed to create or remove local CUPS queues for remote printers, so that it interacts with the local CUPS, which it does via IPP, using libcups' HTTP API. This requires network printers, emulations of them with tools like ippeveprinter, or shared remote CUPS queues to appear and disappear. Disruptions in the network connection between a remote server (printer, CUPS) and the client, like shutting down the network connection or suspending the machine, could also cause the problem.

A possible situation where it happened, though we have no proof, was at a Canonical Sprint (where all of Canonical's engineers meet physically). I have some print queues on my laptop which are shared (so other people could see them in their print dialogs), and during the event I often had to move from one room to another. For that I closed my laptop, it suspended, and I went to the other room, where I opened it again. Other people at the event observed the bug. I already tried to cause it myself, suspending a laptop which shares printers and observing cups-browsed on another laptop, but I was not able to reproduce it. Probably the Sprint, with a big network and many people, is a different situation.

So unfortunately I am not able to force the occurrence of this bug. A possible way could be brute-forcing with many printers, for example writing a script that starts hundreds of instances of ippeveprinter.

To test cups-browsed without having a printer one can use cups-browsed's own test script, test/run-tests.sh in the source code of cups-browsed.
The test script, applied to the installed cups-browsed, can be run most easily as follows:

$ sudo apt install cups-browsed-tests
$ mkdir test
$ cd test
$ cp /usr/share/cups-browsed-tests/* .
$ /usr/bin/run-tests.sh
3
no

The script creates 2 printers with ippeveprinter, checks whether cups-browsed creates queues for them, prints on one of them, checks completion of the job, then stops the ippeveprinter instances one by one and checks whether cups-browsed removes the queues. This is the autopkgtest of cups-browsed.

Running it manually as described I was not able to trigger this bug, but modifying it to run 100 instances of ippeveprinter or letting it do "kill -9" on ippeveprinter instances could perhaps cause the bug.

The script also serves as a regression test, and on cups-browsed 2.1.1 (which contains the fix) it works as described here. Also "make check" (which uses this script, too) passes on the Noble and Oracular packages proposed here and so does not reveal any regressions.

So I am also asking the reporters of these bugs, if they observe the bug with enough frequency or know how to force the bug to occur (please tell how, then), to check whether the proposed packages make the bug disappear and to report their results here.

[ Where problems could occur ]

The patch is rather long and I have done only basic tests to check whether cups-browsed is still working as designed. So there is regression potential.

So I also ask everybody reading this, including those who did not observe or are not able to reproduce the bug reported here, to test whether the fixed cups-browsed is still working as they expect, or whether there is some regression.

[ Original description ]

After waking up from standby cups-browsed runs incessantly on two cores:

18243 cups-br+ 20 0 432256 26348 17848 R 99.7 0.2 66:54.73 cups-br+
85147 cups-br+ 20 0 432256 26348 17848 R 99.7 0.2 66:52.08 cups-br+

cups-br+ 18243 18.9 0.1 432256 26348 ? Rsl 08:30 135:06 /usr/sbin/cups-browsed

Best regards

Heinrich

ProblemType: Bug
DistroRelease: Ubuntu 24.04
Package: cups-browsed 2.0.0-0ubuntu2
ProcVersionSignature: Ubuntu 6.6.0-14.14-generic 6.6.3
Uname: Linux 6.6.0-14-generic x86_64
ApportVersion: 2.27.0-0ubuntu6
Architecture: amd64
CasperMD5CheckResult: pass
CurrentDesktop: KDE
Date: Sun Jan 14 20:19:22 2024
InstallationDate: Installed on 2021-07-01 (927 days ago)
InstallationMedia: Kubuntu 21.04 "Hirsute Hippo" - Release amd64 (20210420)
MachineType: {report['dmi.sys.vendor']} {report['dmi.product.name']}
Papersize: a4
ProcKernelCmdLine: BOOT_IMAGE=/vmlinuz-6.6.0-14-generic root=/dev/mapper/vgubuntu-root ro
SourcePackage: cups-browsed
UpgradeStatus: No upgrade log present (probably fresh install)
dmi.bios.date: 02/08/2023
dmi.bios.release: 1.63
dmi.bios.vendor: LENOVO
dmi.bios.version: R0UET83W (1.63 )
dmi.board.asset.tag: Not Available
dmi.board.name: 20KV0008GE
dmi.board.vendor: LENOVO
dmi.board.version: SDK0J40697 WIN
dmi.chassis.asset.tag: No Asset Information
dmi.chassis.type: 10
dmi.chassis.vendor: LENOVO
dmi.chassis.version: None
dmi.ec.firmware.release: 1.63
dmi.modalias: dmi:bvnLENOVO:bvrR0UET83W(1.63):bd02/08/2023:br1.63:efr1.63:svnLENOVO:pn20KV0008GE:pvrThinkPadE585:rvnLENOVO:rn20KV0008GE:rvrSDK0J40697WIN:cvnLENOVO:ct10:cvrNone:skuLENOVO_MT_20KV_BU_Think_FM_ThinkPadE585:
dmi.product.family: ThinkPad E585
dmi.product.name: 20KV0008GE
dmi.product.sku: LENOVO_MT_20KV_BU_Think_FM_ThinkPad E585
dmi.product.version: ThinkPad E585
dmi.sys.vendor: LENOVO

-- 
You received this bug notification because you are a member of Ubuntu Bugs, which is subscribed to Ubuntu.
https://bugs.launchpad.net/bugs/2049315

Title:
  cups-browsed running non-stop on two cores

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/cups-browsed/+bug/2049315/+subscriptions

-- 
ubuntu-bugs mailing list
ubuntu-bugs@lists.ubuntu.com
https://lists.ubuntu.com/mailman/listinfo/ubuntu-bugs