Hi, I have recently surmounted an odd problem involving Apache 1.3.27.0-2 on Debian testing/unstable (with a 2.4.22 kernel) and thought it may be worthwhile to explain what happened to a group of people who might care, in hopes that others find the information useful.
A few days ago my machine, which is a Dell P4 2.4 GHz box (pretty generic), started experiencing extremely high load averages and CPU usage. A couple of apache processes (anywhere from 1 to 5) had decided to eat 100% of CPU and 100% of RAM (1G physical 1G swap) between themselves (evenly distributed) for no apparent reason. Restarting apache would temporarily fix these runaway processes for anywhere from 30 seconds to 12 hours. Typically it would start up again within 10 or 15 minutes. Not being able to figure out what the hell these processes were doing by examining logs and the server-status page, I started stracing the runaway pids. The output looked like this: <-- strace output --> # strace -p <pid of a runaway apache> close(7) = 0 mremap(0x44735000, 304750592, 304754688, MREMAP_MAYMOVE) = 0x44735000 open("/etc/passwd", O_RDONLY) = 7 fcntl64(7, F_GETFD) = 0 fcntl64(7, F_SETFD, FD_CLOEXEC) = 0 _llseek(7, 0, [0], SEEK_CUR) = 0 fstat64(7, {st_mode=S_IFREG|0644, st_size=4158, ...}) = 0 mmap2(NULL, 4158, PROT_READ, MAP_SHARED, 7, 0) = 0x403f4000 _llseek(7, 4158, [4158], SEEK_SET) = 0 fstat64(7, {st_mode=S_IFREG|0644, st_size=4158, ...}) = 0 munmap(0x403f4000, 4158) = 0 close(7) = 0 <repeat> <-- end strace --> That would happen over and over about a hundred thousand times until the system ran out of memory and the process was killed by the kernel. I was stuck at this point. Why the hell is apache freaking out and eating tons of CPU while attempting to open /etc/passwd? Who knows? You wouldn't believe the amount of troubleshooting I did. New kernels, new source compiled apache binaries with default configurations, confirmed md5 sums of all apache binaries and linked libraries against a problem-free machine, a complete hardware swap, and even a run with apache-ssl. I tried everything and this was still happening. About to give up hope, I started looking at strace output again. I didn't notice this before: <-- more strace --> connect(7, {sa_family=AF_UNIX, path="/var/run/.nscd_socket"}, 110) = -1 \ ENOENT (No such file or directory) close(7) = 0 open("/etc/passwd", O_RDONLY) = 7 <-- end strace --> Hmm.. It.s trying to connect to NSCD before trying (and failing) to open /etc/passwd. It only was trying NSCD about once every 50 attempts. Not knowing what NSCD even is I Googled around and found out it provides cached passwd lookups (among other things). That sounded promising! Installed NSCD and, well, look at the graphs yourself: http://discore.org/cpusum-day.png (Daily CPU usage % updated every 5 min) NSCD was installed just after 3PM and the problem magically disappeared. Bug? Maybe. Error on my part? More likely. Odd? I think so. Fun? Not at all. I still don't know why apache couldn't just open /etc/passwd to get the information that NSCD provides more readily. The permissions are fine and nothing but apache is having problems. I'd be available to do further testing of this if anyone is interested. Mostly the point of this post is just to provide information on something that happened to me and was fixed quite easily after far too much troubleshooting. Thanks for reading, Tyler Morgan -- To UNSUBSCRIBE, email to [EMAIL PROTECTED] with a subject of "unsubscribe". Trouble? Contact [EMAIL PROTECTED]