I have a /home directory shared between two machines: a box called "strago", running OpenBSD 2.7, and a box called "shadow", running Debian woody.
/home is an entire hard disk on strago, mounted on shadow through NFS. The line I use in fstab to mount /home is:

    strago:/home /home nfs rsize=8192,wsize=8192,timeo=14,intr

Lately, the NFS connection to strago has been dying for unknown reasons, causing shadow to crash hard. (Ctrl-Alt-Del, the Magic SysRq key, and the like do not work, nor can I telnet/ssh in from another host and reboot from there.) This has been happening more and more frequently, to the point that it has now occurred 5 times today. Sometimes, if I can tell that the NFS connection has died, I can quickly umount /home as root and mount it again, after which everything works fine, with no crashes.

The error messages I get (as logged in /var/log/messages) are as follows:

    Nov 11 19:12:46 shadow kernel: nfs: server strago not responding, still trying
    Nov 11 19:13:01 shadow kernel: nfs: task 4940 can't get a request slot
    Nov 11 19:13:02 shadow kernel: nfs: task 4954 can't get a request slot
    Nov 11 19:13:09 shadow kernel: nfs: task 4955 can't get a request slot
    Nov 11 19:14:47 shadow kernel: nfs: task 4956 can't get a request slot

It's not a physical connection problem between the two machines (at least as far as I can tell): the status LEDs on both NICs still blink, and the connection works fine after the Debian box has been rebooted or if I can quickly umount and remount /home.

On the OpenBSD end, I was running nfsd with the options -tun 4 (which means "serve TCP and UDP clients, with 4 servers"). On the advice of the OpenBSD mailing list, I raised the number of servers to 16, but the problems persist. The machine isn't being used to export NFS to anywhere else, so 16 servers should be more than enough for my needs (right?).

Does anyone have any ideas on what's going wrong, and what I can try to fix it?

Thanks a lot, folks.

- Colin McMillen
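
P.S. The kernel messages quoted above follow a regular pattern, so a small script could in principle watch for them and flag a stalled mount early enough to umount and remount /home by hand. What follows is only a minimal sketch in Python, assuming the log format is exactly as shown in this post. The names scan_nfs_errors and looks_stalled and the min_stuck_tasks threshold are hypothetical, made up for illustration, and not part of any existing tool.

```python
import re

# Patterns taken from the kernel messages quoted in this post.
NOT_RESPONDING = re.compile(r"kernel: nfs: server (\S+) not responding")
NO_SLOT = re.compile(r"kernel: nfs: task (\d+) can't get a request slot")

# Sample input: a few of the lines from /var/log/messages shown above.
SAMPLE = """\
Nov 11 19:12:46 shadow kernel: nfs: server strago not responding, still trying
Nov 11 19:13:01 shadow kernel: nfs: task 4940 can't get a request slot
Nov 11 19:13:02 shadow kernel: nfs: task 4954 can't get a request slot
"""


def scan_nfs_errors(lines):
    """Return (servers reported as not responding, ids of tasks without a slot)."""
    servers = []
    tasks = []
    for line in lines:
        match = NOT_RESPONDING.search(line)
        if match:
            servers.append(match.group(1))
            continue
        match = NO_SLOT.search(line)
        if match:
            tasks.append(int(match.group(1)))
    return servers, tasks


def looks_stalled(lines, min_stuck_tasks=2):
    """Hypothetical heuristic: a server timeout plus several tasks without a slot."""
    servers, tasks = scan_nfs_errors(lines)
    return bool(servers) and len(tasks) >= min_stuck_tasks


if __name__ == "__main__":
    servers, tasks = scan_nfs_errors(SAMPLE.splitlines())
    print("servers not responding:", servers)
    print("tasks without a request slot:", tasks)
    print("mount looks stalled:", looks_stalled(SAMPLE.splitlines()))
```

In practice the lines would come from /var/log/messages rather than the embedded sample, and the threshold of two stuck tasks is an arbitrary guess that would need tuning. This only detects the symptom; it says nothing about why the server stops answering.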