That didn't fix it.

So next I uninstalled nscd.  By that point, nothing else in apt's
purview explicitly depended on it, though it was still showing up in
straces.

Since then, the behavior has been significantly better.  So... maybe a
bug in nscd?


On that note, just curious: why is nscd showing up in an strace when the
root user - having logged in at the console at ctrl-alt-F2, with no GUI -
is trying to open a file using vim in its home directory?

-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux in Ubuntu.
https://bugs.launchpad.net/bugs/1886022

Title:
  multiple processes intermittently stall at same point in strace

Status in linux package in Ubuntu:
  Incomplete

Bug description:
  If this isn't a kernel bug, my apologies.  I didn't know where else to
  put it.  It affects seemingly unrelated processes, so there wasn't an
  obvious 'package'.  Possibly nscd, which is in glibc, which is close
  to the kernel...

  I noticed frequent, but intermittent hangs/stalls in multiple
  processes.  The processes usually go on to completion after a few
  seconds or so.  Occasionally, I get timeouts that I think might be
  this same symptom but aren't always strace-able.

  At first, I thought it was an authentication issue because I got hangs
  with "sudo -i" as well as intermittently very slow logins via ssh with
  ldap.  But, then it happened with "vim dum", which could still be
  authentication somehow, but I don't know enough to determine that.
  Htop will also hang intermittently, but I don't have a trace of it.

  I started running strace on the processes that were hanging.  Each
  time I managed to have strace running during a hang (it is
  intermittent), the hang came after the same few lines.

  Here is an example of the last lines before the hang, where a normal
  user issues "strace id username":

  socket(AF_UNIX, SOCK_STREAM|SOCK_CLOEXEC|SOCK_NONBLOCK, 0) = 3
  connect(3, {sa_family=AF_UNIX, sun_path="/var/run/nscd/socket"}, 110) = 0
  sendto(3, "\2\0\0\0\v\0\0\0\7\0\0\0passwd\0", 19, MSG_NOSIGNAL, NULL, 0) = 19
  poll([{fd=3, events=POLLIN|POLLERR|POLLHUP}], 1, 5000^Cstrace: Process 2816 detached
   <detached ...>

  
  I hit ctrl-C to stop execution so I could easily copy that info, hence
  "detached".
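
  For anyone reading the trace, the sendto() payload can be unpacked to
  see what was asked of nscd.  This is a minimal sketch, assuming the
  request starts with three 32-bit little-endian integers (version,
  type, key length) followed by the key.  The function name
  decode_nscd_request and the partial type table are illustrative; the
  type numbers are recalled from glibc's nscd-client.h and should be
  checked against the glibc actually installed.

```python
import struct

# Partial table of nscd request types. ASSUMPTION: values recalled from
# glibc's nscd-client.h; verify against the installed glibc sources.
REQUEST_TYPES = {
    0: "GETPWBYNAME",
    1: "GETPWBYUID",
    2: "GETGRBYNAME",
    3: "GETGRBYGID",
    11: "GETFDPW",
    12: "GETFDGR",
}


def decode_nscd_request(payload: bytes):
    """Split a raw nscd request into (version, type, key_len, key)."""
    # ASSUMPTION: the header is three 32-bit ints, little-endian on x86_64.
    version, req_type, key_len = struct.unpack_from("<iii", payload, 0)
    key = payload[12:12 + key_len]
    return version, req_type, key_len, key


# Bytes copied from the sendto() line in the trace (strace prints octal
# escapes, so \v is 0x0b and \7 is 0x07).
payload = b"\x02\x00\x00\x00\x0b\x00\x00\x00\x07\x00\x00\x00passwd\x00"

version, req_type, key_len, key = decode_nscd_request(payload)
print(version, REQUEST_TYPES.get(req_type, req_type), key_len, key)
```

  The 12-byte header plus the 7-byte key "passwd\0" accounts for the 19
  bytes sent.  The last argument to poll() is a timeout in milliseconds,
  so a client waiting on an unresponsive nscd could block for up to 5
  seconds per request, which would be consistent with stalls of a few
  seconds.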

  
  Here is an example from root using "strace vim dum":

                                                                                
  socket(AF_UNIX, SOCK_STREAM|SOCK_CLOEXEC|SOCK_NONBLOCK, 0) = 3
  connect(3, {sa_family=AF_UNIX, sun_path="/var/run/nscd/socket"}, 110) = 0
  sendto(3, "\2\0\0\0\v\0\0\0\7\0\0\0passwd\0", 19, MSG_NOSIGNAL, NULL, 0) = 19
  poll([{fd=3, events=POLLIN|POLLERR|POLLHUP}], 1, 5000

  
  ...that took practice to capture.

  I suspect that LDAP is involved somehow, but it might be victim rather
  than culprit.  Obviously, nscd has some involvement per the lines
  above.  Or possibly, these lines are about strace itself?

  I can capture full straces with timings, etc.  Just say what you need.
  Or direct me to the proper venue for this report.
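
  If timings would help, one way to log the stalls without strace is to
  time a lookup command in a loop and report the slow runs.  This is a
  minimal sketch, assuming Python 3 and the id command are available.
  The helper names (time_calls, slow_calls, run_id) and the thresholds
  are illustrative, not part of any existing tool.

```python
import subprocess
import time


def time_calls(fn, count, pause=0.0):
    """Call fn() count times; return each call's duration in seconds."""
    durations = []
    for _ in range(count):
        start = time.monotonic()
        fn()
        durations.append(time.monotonic() - start)
        time.sleep(pause)
    return durations


def slow_calls(durations, threshold):
    """Return (index, duration) pairs for calls slower than threshold."""
    return [(i, d) for i, d in enumerate(durations) if d > threshold]


def run_id(user="root"):
    """Run `id USER` in a fresh process, discarding its output."""
    # A new process per call, because in the traces the stall happens
    # when the process first contacts nscd.
    subprocess.run(
        ["id", user],
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
        check=False,
    )
```

  For example, slow_calls(time_calls(run_id, 20, pause=0.5), 1.0) would
  list any of 20 runs of "id root" that took longer than one second.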

  I looked in kern.log for evidence of a hardware issue, but didn't see
  anything that looked significantly unusual.  If it seems like
  hardware, I would appreciate hints as to the component.

  ProblemType: Bug
  DistroRelease: Ubuntu 20.04
  Package: linux-image-5.4.0-40-generic 5.4.0-40.44
  ProcVersionSignature: Ubuntu 5.4.0-40.44-generic 5.4.44
  Uname: Linux 5.4.0-40-generic x86_64
  NonfreeKernelModules: nvidia_modeset nvidia
  ApportVersion: 2.20.11-0ubuntu27.3
  Architecture: amd64
  AudioDevicesInUse:
   USER        PID ACCESS COMMAND
   /dev/snd/controlC0:  installer   1846 F.... pulseaudio
  CasperMD5CheckResult: skip
  CurrentDesktop: LXDE
  Date: Thu Jul  2 04:36:12 2020
  InstallationDate: Installed on 2020-06-25 (7 days ago)
  InstallationMedia: Ubuntu-MATE 20.04 LTS "Focal Fossa" - Release amd64 (20200423)
  MachineType: HP ProLiant DL380 G7
  ProcFB: 0 VESA VGA
  ProcKernelCmdLine: BOOT_IMAGE=/boot/vmlinuz-5.4.0-40-generic root=UUID=be36c28f-47f3-4536-8bb3-8b2f3856fa42 ro quiet splash vt.handoff=7
  RelatedPackageVersions:
   linux-restricted-modules-5.4.0-40-generic N/A
   linux-backports-modules-5.4.0-40-generic  N/A
   linux-firmware                            1.187.1
  RfKill:
   
  SourcePackage: linux
  UpgradeStatus: No upgrade log present (probably fresh install)
  dmi.bios.date: 05/05/2011
  dmi.bios.vendor: HP
  dmi.bios.version: P67
  dmi.chassis.type: 23
  dmi.chassis.vendor: HP
  dmi.modalias: dmi:bvnHP:bvrP67:bd05/05/2011:svnHP:pnProLiantDL380G7:pvr:cvnHP:ct23:cvr:
  dmi.product.family: ProLiant
  dmi.product.name: ProLiant DL380 G7
  dmi.product.sku: 583917-B21
  dmi.sys.vendor: HP

