This bug was fixed in the package cups-browsed - 2.0.0-0ubuntu10.3

---------------
cups-browsed (2.0.0-0ubuntu10.3) noble; urgency=medium

  * 0003-separate-http-connections.patch: Fixed cups-browsed getting
    stuck with 100% CPU on 1 or 2 cores (LP: #2049315, LP: #2067918,
    LP: #2073504, CUPS upstream issue #879).

 -- Till Kamppeter <till.kamppe...@gmail.com>  Sun, 12 Jan 2025 14:39:33 +0100

** Changed in: cups-browsed (Ubuntu Noble)
       Status: Fix Committed => Fix Released

-- 
You received this bug notification because you are a member of Desktop
Packages, which is subscribed to cups-browsed in Ubuntu.
https://bugs.launchpad.net/bugs/2049315

Title:
  cups-browsed running non-stop on two cores

Status in cups-browsed package in Ubuntu:
  Fix Released
Status in cups-browsed source package in Noble:
  Fix Released
Status in cups-browsed source package in Oracular:
  Fix Released
Status in cups-browsed source package in Plucky:
  Fix Released

Bug description:
  [ Impact ]

  Over the past months users have often observed cups-browsed taking
  100% or 200% (2 cores) of CPU and ceasing to do its actual work. This
  even happens for users who do not print anything: printers shared by
  other computers in the local network, which trigger cups-browsed to
  make them available on the local machine, are enough to get
  cups-browsed stuck hogging 1 or 2 CPU cores. To free the CPU cores
  and make cups-browsed work again, one needs to kill and restart it.

  The bug is not easily reproducible. It only occurs sporadically.
  Restarting a stuck cups-browsed will not end up getting it stuck
  immediately again.

  It is annoying for the user that suddenly a significant part of their
  CPU power gets hogged and the machine's fan produces noise,
  especially as most users do not know the root cause or how to stop
  it.

  The problem is caused by concurrent use of one global HTTP connection
  to CUPS by several sub-threads. This corrupts the data structure which
  makes the httpGets() function of CUPS fall into an infinite loop. The
  proposed packages (for Noble and Oracular) contain a backport of the
  fix of the upstream version 2.1.1 (already uploaded to Plucky). This
  fix lets each function create its own HTTP connection to CUPS instead
  of using one single global one. None of these is used by multiple
  threads and therefore the problem should go away.

  I developed the fix solely by taking a few backtraces (from reporters
  of the bug and one from when I observed the bug myself) to locate the
  problem (it always gets stuck in httpGets() of libcups), reviewing
  the HTTP-related code of libcups and of cups-browsed, discovering the
  described problem, and remedying it as described. I did not do any
  before/after testing. I based the fix only on my observations and
  code reviews.

  What happens in httpGets() is described in my comments here
  (especially near the end):

  https://github.com/OpenPrinting/cups/issues/879

  [ Test Plan ]

  UPDATE

  Please everybody do this test, especially users of Oracular (24.10),
  as we still need verification there.

  Paste the following script into a file

  ```
  #!/bin/sh

  while true; do
      service cups-browsed restart
      printf .
      sleep 15s
  done
  ```

  and make the file executable. Then execute the file and leave it
  running for some hours.

  Without the update applied, cups-browsed will sooner or later get
  stuck with 100% CPU. Once this happens, it will require a SIGKILL
  signal ("kill -9") to be stopped. As the "service cups-browsed
  restart" command only sends SIGTERM, the stuck cups-browsed will keep
  running on 100% CPU while the script is spinning and trying to restart
  cups-browsed every 15 sec. This way you will see the failure whenever
  you come back and check, no need to be present through the whole
  process.

  Now stop the script with Ctrl+C and do

  ```
  killall -9 cups-browsed
  ```

  After that, update to the package with the fix proposed here.

  Now run the script again and leave it running for some hours.
  cups-browsed should not get stuck.

  Thanks a lot, Jeffrey Knockel (jeff250), for providing this testing
  method (comment #40).

  UPDATE 2

  Unfortunately, a failure of cups-browsed does not stop the script:
  the stuck cups-browsed gets killed after a timeout of 90 seconds.

  So to capture failures I ran the following command from another
  terminal:

  while true; do sleep 2; ps aux | grep /usr/sbin/cups-browsed | grep -v grep; done | tee log.txt

  This produces a line like

  cups-br+ 949545 0.2 0.0 817024 20968 ? Ssl 08:58 0:00 /usr/sbin/cups-browsed

  every 2 seconds.

  The "0:00" right before "/usr/sbin/cups-browsed" is the accumulated
  CPU time of the process. When it does not get stuck, cups-browsed
  never accumulates visible CPU time in a run of only 15 seconds, but
  when it hangs in a busy loop for up to 90 seconds the CPU time
  becomes visible.

  So

  grep -v ' 0:00 /usr/sbin/cups-browsed' log.txt

  easily reveals the fact.
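
  The grep above matches the exact string " 0:00 /usr/sbin/cups-browsed".
  A field-based variant could look like the following. This is only a
  minimal sketch: it assumes that log.txt holds unmodified "ps aux"
  lines, so that the accumulated CPU time is the 10th
  whitespace-separated field, and the function name stuck_samples is
  hypothetical, not part of any package.

  ```shell
  #!/bin/sh
  # Hypothetical helper: print only those logged "ps aux" lines whose
  # TIME column (assumed to be field 10) differs from "0:00".
  stuck_samples() {
      awk '$10 != "0:00"' "$1"
  }
  ```

  Calling "stuck_samples log.txt" after a test run should then print
  nothing if cups-browsed never accumulated visible CPU time.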

  UPDATE 3

  To test whether cups-browsed's original functionality was broken by
  the update, see comment #45. In particular, the fact that the update
  has passed its autopkgtest on both Oracular and Noble is evidence
  that cups-browsed is still doing what it was designed for.

  ORIGINAL TEXT

  Because the problem only occurs sporadically, it is not easy to run a
  test, install the proposed, fixed version, run the same test again,
  and see that the problem has disappeared.

  But if any of you observes the bug with a certain frequency, for
  example in 1 of 10 attempts, you could try until you get the bug
  with the old version, update, and then try again. If you reach a
  reasonably high number of attempts without the bug occurring again,
  you could consider it fixed.

  The bug requires cups-browsed to create or remove local CUPS queues
  for remote printers, so that it interacts with the local CUPS, which
  it does via IPP, using libcups' HTTP API. This requires network
  printers, emulations of them with tools like ippeveprinter, or shared
  remote CUPS queues to appear and disappear. Disruption of the network
  connection between a remote server (printer, CUPS) and the client,
  such as shutting down the network connection or suspending the
  machine, could also cause the problem.

  A possible situation where it happened, though we have no proof, was
  at a Canonical Sprint (where all of Canonical's engineers meet
  physically). I have some print queues on my laptop which are shared
  (so other people could see them in their print dialogs), and during
  the event I often had to move from one room to another. For that I
  closed my laptop, it suspended, and I went to the other room, where
  I opened it again. Other people at the event observed the bug. I
  already tried to cause it myself, suspending a laptop which shares
  printers and observing cups-browsed on another laptop, but I was not
  able to reproduce it. Probably the Sprint, with a big network and
  many people, is a different situation.

  So unfortunately I am not able to force the occurrence of this bug.

  A possible way could be brute-forcing with many printers, writing a
  script starting 100s of instances of ippeveprinter or so.
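
  Such a brute-force script could be sketched as follows. This is an
  untested assumption, not a known reproducer: the function name
  spawn_printers, the printer names, and the port range are
  hypothetical, and the only ippeveprinter option used is "-p" to set
  the port. By default the function only prints the commands it would
  run; set DRY_RUN=0 to actually start the instances in the background.

  ```shell
  #!/bin/sh
  # Hypothetical helper: start COUNT emulated printers on consecutive
  # ports, beginning at BASE_PORT. With DRY_RUN=1 (the default here) the
  # commands are only printed, nothing is started.
  spawn_printers() {
      count=$1
      base_port=$2
      i=0
      while [ "$i" -lt "$count" ]; do
          port=$((base_port + i))
          name="stress-printer-$i"
          if [ "${DRY_RUN:-1}" = 1 ]; then
              echo "ippeveprinter -p $port $name"
          else
              # Run each emulated printer in the background, discarding output.
              ippeveprinter -p "$port" "$name" >/dev/null 2>&1 &
          fi
          i=$((i + 1))
      done
  }
  ```

  For example, "DRY_RUN=0 spawn_printers 100 8000" would try to start
  100 instances on ports 8000 to 8099. Stopping them again (for example
  with "killall ippeveprinter") should make cups-browsed remove the
  corresponding queues, provided it had picked them up.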

  To test cups-browsed without having a printer one can use
  cups-browsed's own test script, test/run-tests.sh in the source code
  of cups-browsed.

  The test script, applied to the installed cups-browsed, can be run
  most easily as follows:

      $ sudo apt install cups-browsed-tests
      $ mkdir test
      $ cd test
      $ cp /usr/share/cups-browsed-tests/* .
      $ /usr/bin/run-tests.sh 3 no

  The script creates 2 printers with ippeveprinter, checks whether
  cups-browsed creates queues for them, prints on one of them, checks
  completion of the job, then stops the ippeveprinter instances one by
  one and checks whether cups-browsed removes the queues. This is the
  autopkgtest of cups-browsed. Running it manually as described, I was
  not able to trigger this bug, but modifying it to run 100 instances of
  ippeveprinter or letting it do "kill -9" on ippeveprinter instances
  could perhaps cause the bug.

  The script also serves as a regression test, and on cups-browsed 2.1.1
  (which contains the fix) it works as described here. Also "make check"
  (which uses this script, too) passes on the Noble and Oracular
  packages proposed here and so does not reveal any regressions.

  So I am also asking all reporters of these bugs who observe the bug
  with enough frequency or know how to force it to occur (if so,
  please tell us how) to check whether the proposed packages make the
  bug go away and to report their results here.

  [ Where problems could occur ]

  The patch is rather long and I have done only basic tests to check
  whether cups-browsed is still working as designed, so there is
  regression potential. I therefore also ask everybody reading this,
  including those who did not observe or are not able to reproduce the
  bug reported here, to test whether the fixed cups-browsed is still
  working as they expect, or whether there is some regression.

  [ Original description ]

  After waking up from standby cups-browsed runs incessantly on two
  cores:

    18243 cups-br+  20   0  432256  26348  17848 R  99.7   0.2  66:54.73 cups-br+
    85147 cups-br+  20   0  432256  26348  17848 R  99.7   0.2  66:52.08 cups-br+

  cups-br+   18243 18.9  0.1 432256 26348 ?        Rsl  08:30 135:06 /usr/sbin/cups-browsed

  Best regards

  Heinrich

  ProblemType: Bug
  DistroRelease: Ubuntu 24.04
  Package: cups-browsed 2.0.0-0ubuntu2
  ProcVersionSignature: Ubuntu 6.6.0-14.14-generic 6.6.3
  Uname: Linux 6.6.0-14-generic x86_64
  ApportVersion: 2.27.0-0ubuntu6
  Architecture: amd64
  CasperMD5CheckResult: pass
  CurrentDesktop: KDE
  Date: Sun Jan 14 20:19:22 2024
  InstallationDate: Installed on 2021-07-01 (927 days ago)
  InstallationMedia: Kubuntu 21.04 "Hirsute Hippo" - Release amd64 (20210420)
  MachineType: {report['dmi.sys.vendor']} {report['dmi.product.name']}
  Papersize: a4
  ProcKernelCmdLine: BOOT_IMAGE=/vmlinuz-6.6.0-14-generic root=/dev/mapper/vgubuntu-root ro
  SourcePackage: cups-browsed
  UpgradeStatus: No upgrade log present (probably fresh install)
  dmi.bios.date: 02/08/2023
  dmi.bios.release: 1.63
  dmi.bios.vendor: LENOVO
  dmi.bios.version: R0UET83W (1.63 )
  dmi.board.asset.tag: Not Available
  dmi.board.name: 20KV0008GE
  dmi.board.vendor: LENOVO
  dmi.board.version: SDK0J40697 WIN
  dmi.chassis.asset.tag: No Asset Information
  dmi.chassis.type: 10
  dmi.chassis.vendor: LENOVO
  dmi.chassis.version: None
  dmi.ec.firmware.release: 1.63
  dmi.modalias: dmi:bvnLENOVO:bvrR0UET83W(1.63):bd02/08/2023:br1.63:efr1.63:svnLENOVO:pn20KV0008GE:pvrThinkPadE585:rvnLENOVO:rn20KV0008GE:rvrSDK0J40697WIN:cvnLENOVO:ct10:cvrNone:skuLENOVO_MT_20KV_BU_Think_FM_ThinkPadE585:
  dmi.product.family: ThinkPad E585
  dmi.product.name: 20KV0008GE
  dmi.product.sku: LENOVO_MT_20KV_BU_Think_FM_ThinkPad E585
  dmi.product.version: ThinkPad E585
  dmi.sys.vendor: LENOVO

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/cups-browsed/+bug/2049315/+subscriptions


-- 
Mailing list: https://launchpad.net/~desktop-packages
Post to     : desktop-packages@lists.launchpad.net
Unsubscribe : https://launchpad.net/~desktop-packages
More help   : https://help.launchpad.net/ListHelp
