Our NFS file server does double duty, serving both the compute nodes on the inside and some workstations on the outside. That is somewhat analogous to the intra-cluster situation. It turns out that updates some time in the last year introduced an issue where the lock manager stopped respecting the ports which were supposedly assigned to it. It took me a long time to notice this, since there wasn't very much file locking going on between the workstations and the file server. However, anybody who used GNOME (which I don't) would have seen it, since GNOME does some file locking at startup, and this bug was causing it to start very slowly.
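For anyone who wants to script the check, here is a minimal sketch that looks for the lock manager (registered as nlockmgr) in the text printed by rpcinfo -p and compares its port against the one you meant to assign. The function names and the sample listing are made up for illustration, and 4001 is just an example port; the parsing assumes the usual five-column "program vers proto port service" layout.

```python
# Illustrative sample of `rpcinfo -p` output (not captured from a real server).
SAMPLE = """\
   program vers proto   port  service
    100000    2   tcp    111  portmapper
    100021    1   udp  32768  nlockmgr
    100021    3   tcp  32768  nlockmgr
"""


def nlockmgr_ports(rpcinfo_output):
    """Return the set of (proto, port) pairs registered for nlockmgr."""
    ports = set()
    for line in rpcinfo_output.splitlines():
        fields = line.split()
        # Data rows have five columns: program vers proto port service.
        if len(fields) == 5 and fields[4] == "nlockmgr":
            ports.add((fields[2], int(fields[3])))
    return ports


def lockd_is_pinned(rpcinfo_output, expected_port=4001):
    """True only if nlockmgr is registered and every entry uses expected_port."""
    ports = nlockmgr_ports(rpcinfo_output)
    return bool(ports) and all(port == expected_port for _, port in ports)


print(sorted(nlockmgr_ports(SAMPLE)))
print(lockd_is_pinned(SAMPLE))
```

To run it against a live server, the SAMPLE string could be replaced with the captured output of rpcinfo -p (for example via subprocess.run with capture_output=True and text=True).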
Normally this port assignment issue wouldn't be a cluster issue, since one doesn't normally run a firewall between the file server and the compute nodes. However, this would be a problem for intra-cluster file sharing. To see if your file server has been bitten, run rpcinfo -p on it, and if nlockmgr isn't in the right place, welcome to the club. For more information, see for instance:

  https://bugzilla.redhat.com/show_bug.cgi?id=434795
  https://bugs.launchpad.net/ubuntu/+source/nfs-utils/+bug/28706

Long story short, the only way around this that I know of at present is to put:

  options lockd nlm_udpport=4001 nlm_tcpport=4001

(or whatever port you want) in /etc/modprobe.conf or an equivalent location and restart the NFS server. OK, make that: umount all mounts on the clients, restart the server, and remount on all the clients.

Regards,

David Mathog
mat...@caltech.edu
Manager, Sequence Analysis Facility, Biology Division, Caltech