On Mon, Jan 31, 2005 at 08:46:35PM +0000, Greg Kochanski wrote:
> Unexpectedly, it is reproducible.
>
> Here's the relevant bit of ps -e -F . This was taken
> a minute or so after I started the strace find .
>
> gpk 16650 16555 0 646 1480 0 20:35 pts/7 00:00:00 bash
> gpk 16659 16650 1 428 572 0 20:36 pts/7 00:00:01 strace
> find . -name #cvs
> gpk 16660 16659 0 382 444 0 20:36 pts/7 00:00:00 find .
> -name #cvs
> gpk 16681 16583 0 624 852 0 20:38 pts/4 00:00:00 ps -e -F
>
> /var/log/dmesg and /var/log/syslog show no relevant entries
> (and no entries at all since I started the find .)
>
>
> The directory from which I launched find
> is on a local disk; no disks are configured for NFS.
> The directory was reached via a symbolic link, though that
> ought not to be relevant.
>
> The disk it is on is the main system disk, and it seems
> to be functioning well.
>
>
> The tail end of the output of strace follows:
>
> getdents64(4, /* 113 entries */, 4096) = 4072
> getdents64(4, /* 50 entries */, 4096) = 1760
> getdents64(4, /* 0 entries */, 4096) = 0
> close(4) = 0
> chdir("22") = 0
> lstat64(".", {st_mode=S_IFDIR|0755, st_size=20480, ...}) = 0
> chdir("..") = 0
> lstat64(".", {st_mode=S_IFDIR|0755, st_size=4096, ...}) = 0
> lstat64("23", {st_mode=S_IFDIR|0755, st_size=20480, ...}) = 0
> open("23", O_RDONLY|O_NONBLOCK|O_LARGEFILE|O_DIRECTORY) = 4
> fstat64(4, {st_mode=S_IFDIR|0755, st_size=20480, ...}) = 0
> fcntl64(4, F_SETFD, FD_CLOEXEC) = 0
> getdents64(4,
>
> (The output stopped half-way through the last line.
> I had it going directly to a terminal, rather than a file
> to avoid any buffering.)

Ok, in a nutshell, what is going on here is that find has opened a
directory and is reading its entries (getdents64() is likely the
result of a call to readdir()), and the kernel is not returning from
that call. This is most likely because some IO is blocking
(permanently) somewhere. Using the non-straced output of find you
should be able to work out approximately where in the filesystem this
is occurring.

You mentioned in another bug report that you are experiencing a high
load average, yet the CPU seems idle. You also mentioned you have been
using a USB disk. I strongly suspect that this is in fact the same
issue: you have a large number of (find) processes blocked on IO
somewhere in your filesystem. My guess is that this is the mountpoint
where the system thinks the USB disk is, but the disk isn't there, and
the IO is blocking, waiting to access the disk.

That this is reproducible is not surprising in the least. Blocking IO
is very commonly used, and blocking means just that: the call blocks
until the result comes back. While it is blocking, the process is
usually stuck in the kernel, and you can't kill processes that are
stuck in the kernel.

-- 
Horms
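
P.S. A rough way to check the blocked-process theory is to list the
processes that ps reports in uninterruptible sleep (a STAT value
starting with "D"), which is the state a process is normally in while
it is stuck in the kernel waiting on IO. The sketch below is only
that, a sketch: the function names filter_blocked and list_blocked are
made up for illustration, and it assumes a procps-style ps that
accepts -eo pid,stat,wchan,args.

```shell
#!/bin/sh
# Sketch: show processes in uninterruptible sleep (STAT begins with "D").
# Assumption: ps is the procps version and supports -eo with these columns.

# filter_blocked (hypothetical helper): reads "PID STAT WCHAN COMMAND"
# rows on stdin and prints only the rows whose STAT field starts with "D".
# The header row is dropped too, since "STAT" does not start with "D".
filter_blocked() {
    awk '$2 ~ /^D/'
}

# list_blocked (hypothetical helper): runs ps over all processes and
# passes the rows through filter_blocked.
list_blocked() {
    ps -eo pid,stat,wchan,args | filter_blocked
}
```

Running list_blocked while the find is hung should show it, and any
other stuck processes, with a "D" in the second column if the theory
holds. If nothing shows up, the hang is probably something else.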