Sorry for posting a test case that focuses only on the busy-loop bug
going away. It is also important that cups-browsed still performs its
original function of automatically creating print queues for network
printers (IPP printers, remote CUPS queues, Printer Applications)
correctly.

At least nobody commented here that their printing ceased to work after
the update.

I checked the functionality on my Oracular system with my printers, and
it is also verified, for both Oracular and Noble, by the package passing
its autopkgtest (otherwise it would not have made it into -proposed).

My autopkgtest is the script test/run-tests.sh in the upstream source of
cups-browsed.

While cups-browsed and cupsd are running, this script

- creates two software emulations of driverless IPP printers (Printer
  Applications) with unique names, which print jobs into a file
- waits until cups-browsed has auto-created CUPS queues for them, checking
  the presence of the queues with "lpstat -v"
- sends a print job to one of them
- waits until the job appears in the CUPS queue and disappears from it
- checks the presence of the print job output file
- kills the Printer Applications
- checks whether cups-browsed removes the queues
- shuts down cups-browsed
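
The queue-waiting step in the list above can be sketched in shell roughly
as follows. This is not the code of run-tests.sh, only an illustration:
the function name wait_for_queue, the LPSTAT override variable and the
default timeout are my assumptions, and the match relies on the
"device for <queue>: <uri>" lines which "lpstat -v" prints in the C locale.

```shell
#!/bin/sh
# Sketch only: poll "lpstat -v" until a queue with the given name exists.
# wait_for_queue and LPSTAT are hypothetical names, not from run-tests.sh.
: "${LPSTAT:=lpstat}"

wait_for_queue() {
    queue=$1
    timeout=${2:-60}      # seconds to wait before giving up (assumed default)
    elapsed=0
    while [ "$elapsed" -lt "$timeout" ]; do
        # In the C locale "lpstat -v" prints "device for <queue>: <uri>"
        if LC_ALL=C $LPSTAT -v 2>/dev/null | grep -qF "device for $queue:"; then
            return 0
        fi
        sleep 1
        elapsed=$((elapsed + 1))
    done
    return 1
}
```

Calling "wait_for_queue somequeue 30" returns 0 as soon as the queue is
listed and 1 if it has not appeared after about 30 seconds.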

This test can easily be run by any of you, no printer required.
Install it via

    sudo apt install cups-browsed-tests

Then run (no root or sudo required):

    run-tests.sh

Answer the questions with "3" (use system's cups-browsed) and "N" (do
not use Valgrind).

The above-mentioned test sequence is then run, reporting verbosely on the
screen what is happening. If everything completed correctly, the exit
status is 0.

Run

    echo $?

right after the script finishes to check the exit status.


** Description changed:

  [ Impact ]
  
  During the past months users have often observed that cups-browsed
  takes 100% or 200% (2 cores) of CPU and ceases to do its actual work.
  This happens even for users who do not print anything: printers shared
  by other computers in the local network, which trigger cups-browsed to
  make them available on the local machine, are enough to get
  cups-browsed stuck hogging 1 or 2 CPU cores. To free the CPU cores and
  make cups-browsed work again one needs to kill and restart it.
  
  The bug is not easily reproducible. It only occurs sporadically.
  Restarting a stuck cups-browsed will not end up getting it stuck
  immediately again.
  
  It is annoying for the user that suddenly a significant part of their
  CPU power gets hogged and the machine's fan starts producing noise,
  especially as most users do not know the root cause or how to stop it.
  
  The problem is caused by concurrent use of one global HTTP connection to
  CUPS by several sub-threads. This corrupts the data structure which
  makes the httpGets() function of CUPS fall into an infinite loop. The
  proposed packages (for Noble and Oracular) contain a backport of the fix
  of the upstream version 2.1.1 (already uploaded to Plucky). This fix
  lets each function create its own HTTP connection to CUPS instead of
  using one single global one. None of these is used by multiple threads
  and therefore the problem should go away.
  
  I developed the fix solely by taking a few backtraces (from reporters
  of the bug and one from when I observed the bug myself) to locate it
  (it always gets stuck in httpGets() of libcups), and by reviewing the
  HTTP-related code of libcups and of cups-browsed, which revealed the
  described problem and led to the remedy described above. I did not do
  any before/after testing. I relied only on my observations and code
  reviews.
  
  What happens in httpGets() is described in my comments here (especially
  near the end):
  
  https://github.com/OpenPrinting/cups/issues/879
  
  [ Test Plan ]
  
  UPDATE
  
  Everybody, please run this test, especially users of Oracular (24.10),
  as we still need verification there.
  
  Paste the following script into a file
  
  ```
  #!/bin/sh
  
  while true; do
      service cups-browsed restart
      printf .
      sleep 15s
  done
  ```
  
  and make the file executable. Then execute the file and leave it running
  for some hours.
  
  Without the update applied, cups-browsed will sooner or later get stuck
  with 100% CPU. Once this happens, it will require a SIGKILL signal
  ("kill -9") to be stopped. As the "service cups-browsed restart" command
  only sends SIGTERM, the stuck cups-browsed will keep running on 100% CPU
  while the script is spinning and trying to restart cups-browsed every 15
  sec. This way you will see the failure whenever you come back and check,
  no need to be present through the whole process.
  
  Now stop the script with Ctrl+C and do
  
  ```
  killall -9 cups-browsed
  ```
  
  After that, update to the package with the fix proposed here.
  
  Now run the script again and leave it running for some hours. cups-
  browsed should not get stuck.
  
  Thanks a lot, Jeffrey Knockel (jeff250), for providing this testing
  method (comment #40).
  
  UPDATE 2
  
  Unfortunately, a failure of cups-browsed does not stop the script:
  cups-browsed gets killed after a timeout of 90 seconds.
  
  So to capture failures I ran the following command from another
  terminal:
  
  while true; do sleep 2; ps aux | grep /usr/sbin/cups-browsed | grep -v grep; done | tee log.txt
  
  This produces a line like
  
  cups-br+ 949545 0.2 0.0 817024 20968 ? Ssl 08:58 0:00 /usr/sbin/cups-browsed
  
  every 2 seconds.
  
  The "0:00" right before "/usr/sbin/cups-browsed" is the accumulated CPU
  time of the process. When it does not get stuck, cups-browsed never
  accumulates visible CPU time in a run of only 15 seconds, but when it
  hangs in a busy loop for up to 90 seconds the CPU time becomes visible.
  
  So
  
  grep -v ' 0:00 /usr/sbin/cups-browsed' log.txt
  
  easily reveals the fact.
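  
  As a rough alternative to the grep above, the TIME column can also be
  matched explicitly with awk. This is only a sketch: the function name
  stuck_lines is made up for illustration, and it assumes the default
  "ps aux" column layout, where TIME is field 10 and the command is
  field 11.
  
  ```shell
  # Sketch only: print the cups-browsed lines of a "ps aux"-style log whose
  # accumulated CPU time (field 10) is not 0:00.
  # stuck_lines is a hypothetical helper name; pass log files as arguments
  # or feed the log on standard input.
  stuck_lines() {
      awk '$11 == "/usr/sbin/cups-browsed" && $10 != "0:00"' "$@"
  }
  ```
  
  For example, "stuck_lines log.txt" prints nothing if cups-browsed never
  accumulated visible CPU time.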
+ 
+ UPDATE 3
+ 
+ To test that cups-browsed's original functionality did not get broken
+ by the update, see comment #45. In particular, the fact that the update
+ has passed its autopkgtest on both Oracular and Noble is evidence that
+ cups-browsed is still doing what it was designed for.
  
  ORIGINAL TEXT
  
  Because the problem occurs only sporadically, it is not easy to run a
  test, install the proposed, fixed version, run the same test again and
  see that the problem has disappeared.
  
  But if any of you observes the bug with a certain frequency, for
  example in 1 of 10 attempts, you could try until you hit the bug with
  the old version, update, and then try again. If you reach a reasonably
  high number of tests without the bug occurring again, you could
  consider it fixed.
  
  The bug requires cups-browsed to create or remove local CUPS queues for
  remote printers, so that it interacts with the local CUPS, which it does
  by IPP, using libcups' HTTP API. This requires the appearing and
  disappearing of network printers, emulations of them with tools like
  ippeveprinter, or shared remote CUPS queues. Also disruption in the
  network connection between a remote server (printer, CUPS) and the
  client, like shutting down network connection or suspending the machine
  could cause the problem.
  
  A possible situation where it happened, though we have no proof, was at
  a Canonical Sprint (where all of Canonical's engineers meet physically).
  I have some print queues on my laptop which are shared (so other people
  could see them in their print dialogs), and during the event I often
  had to move from one room to another. For that I closed my laptop, it
  suspended, and I went to the other room where I opened it again. Other
  people at the event observed the bug. I already tried to cause it
  myself, suspending a laptop which shares printers and observing
  cups-browsed on another laptop, but I was not able to reproduce it.
  Probably the Sprint with a big network and many people is a different
  situation.
  
  So unfortunately I am not able to force the occurrence of this bug.
  
  A possible way could be brute-forcing with many printers, writing a
  script starting 100s of instances of ippeveprinter or so.
  
  To test cups-browsed without having a printer one can use cups-browsed's
  own test script, test/run-tests.sh in the source code of cups-browsed.
  
  The test script, applied to the installed cups-browsed, can be run most
  easily as follows:
  
      $ sudo apt install cups-browsed-tests
      $ mkdir test
      $ cd test
      $ cp /usr/share/cups-browsed-tests/* .
      $ /usr/bin/run-tests.sh 3 no
  
  The script creates 2 printers with ippeveprinter, checks whether
  cups-browsed creates queues for them, prints on one of them, checks
  completion of the job, then stops the ippeveprinter instances one by one
  and checks whether cups-browsed removes the queues. This is the
  autopkgtest of cups-browsed. Running it manually as described I was not
  able to trigger this bug, but modifying it to run 100 instances of
  ippeveprinter or letting it do "kill -9" on ippeveprinter instances
  could perhaps cause the bug.
  
  The script also serves as regression test and on cups-browsed 2.1.1
  (which contains the fix) it works as described here. Also "make check"
  (uses this script, too) on the Noble and Oracular packages proposed here
  passes and so does not reveal any regressions.
  
  So I am also asking any of the reporters of these bugs who observe the
  bug with enough frequency, or know how to force the bug to occur (please
  tell how, then), to check whether the proposed packages make the bug
  disappear and to report their results here.
  
  [ Where problems could occur ]
  
  The patch is rather long and I have done only basic tests to check
  whether cups-browsed is still working as designed. So there is a
  regression potential. So I also ask everybody reading this, including
  those who did not observe or are not able to reproduce the bug reported
  here to test whether the fixed cups-browsed is still working as they
  expected, or if there is some regression.
  
  [ Original description ]
  
  After waking up from standby cups-browsed runs incessantly on two cores:
  
    18243 cups-br+  20   0  432256  26348  17848 R  99.7   0.2  66:54.73 cups-br+
    85147 cups-br+  20   0  432256  26348  17848 R  99.7   0.2  66:52.08 cups-br+
  
  cups-br+   18243 18.9  0.1 432256 26348 ?        Rsl  08:30 135:06 /usr/sbin/cups-browsed
  
  Best regards
  
  Heinrich
  
  ProblemType: Bug
  DistroRelease: Ubuntu 24.04
  Package: cups-browsed 2.0.0-0ubuntu2
  ProcVersionSignature: Ubuntu 6.6.0-14.14-generic 6.6.3
  Uname: Linux 6.6.0-14-generic x86_64
  ApportVersion: 2.27.0-0ubuntu6
  Architecture: amd64
  CasperMD5CheckResult: pass
  CurrentDesktop: KDE
  Date: Sun Jan 14 20:19:22 2024
  InstallationDate: Installed on 2021-07-01 (927 days ago)
  InstallationMedia: Kubuntu 21.04 "Hirsute Hippo" - Release amd64 (20210420)
  MachineType: {report['dmi.sys.vendor']} {report['dmi.product.name']}
  Papersize: a4
  ProcKernelCmdLine: BOOT_IMAGE=/vmlinuz-6.6.0-14-generic root=/dev/mapper/vgubuntu-root ro
  SourcePackage: cups-browsed
  UpgradeStatus: No upgrade log present (probably fresh install)
  dmi.bios.date: 02/08/2023
  dmi.bios.release: 1.63
  dmi.bios.vendor: LENOVO
  dmi.bios.version: R0UET83W (1.63 )
  dmi.board.asset.tag: Not Available
  dmi.board.name: 20KV0008GE
  dmi.board.vendor: LENOVO
  dmi.board.version: SDK0J40697 WIN
  dmi.chassis.asset.tag: No Asset Information
  dmi.chassis.type: 10
  dmi.chassis.vendor: LENOVO
  dmi.chassis.version: None
  dmi.ec.firmware.release: 1.63
  dmi.modalias: dmi:bvnLENOVO:bvrR0UET83W(1.63):bd02/08/2023:br1.63:efr1.63:svnLENOVO:pn20KV0008GE:pvrThinkPadE585:rvnLENOVO:rn20KV0008GE:rvrSDK0J40697WIN:cvnLENOVO:ct10:cvrNone:skuLENOVO_MT_20KV_BU_Think_FM_ThinkPadE585:
  dmi.product.family: ThinkPad E585
  dmi.product.name: 20KV0008GE
  dmi.product.sku: LENOVO_MT_20KV_BU_Think_FM_ThinkPad E585
  dmi.product.version: ThinkPad E585
  dmi.sys.vendor: LENOVO

-- 
You received this bug notification because you are a member of Ubuntu
Bugs, which is subscribed to Ubuntu.
https://bugs.launchpad.net/bugs/2049315

Title:
  cups-browsed running non-stop on two cores

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/cups-browsed/+bug/2049315/+subscriptions


-- 
ubuntu-bugs mailing list
ubuntu-bugs@lists.ubuntu.com
https://lists.ubuntu.com/mailman/listinfo/ubuntu-bugs
