On 04/19/2017 02:11 PM, Prentice Bisbal wrote:
Thanks for the suggestion(s). Just this morning I started considering
the network as a possible source of error. My stale file handle errors
are easily fixed by just restarting the nfs servers with 'service nfs
restart', so they aren't as severe as you describe.

If a restart on solely the /server-side/ gets you back into a good state, that is an interesting tidbit. Do you have some form of HA setup for NFS? Automatic failover (sometimes set up with IP aliasing) in the face of network hiccups can occasionally confuse the clients if they aren't set up properly to keep up with the change. A restart of the server will likely revert to using the primary, so the clients think everything is back up and healthy again. This situation varies so much between vendors that it's hard to say much more without more details on your setup.
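
If it helps narrow things down, here is a rough sketch of a client-side check you could run before and after a server restart to see which mounts actually report a stale handle. It only assumes that a stale NFS handle surfaces as ESTALE from stat(), which is the usual behaviour on Linux; the function names and the mount paths in the usage note are made up for illustration and are not from your setup.

```python
import errno
import os


def is_stale(path):
    """Return True if stat() on path fails with ESTALE (stale file handle)."""
    try:
        os.stat(path)
    except OSError as exc:
        # Other failures (ENOENT, EACCES, ...) are not treated as stale.
        return exc.errno == errno.ESTALE
    return False


def stale_mounts(paths):
    """Return the subset of paths whose stat() reports a stale handle."""
    return [p for p in paths if is_stale(p)]
```

Calling something like stale_mounts(["/home", "/scratch"]) (hypothetical mount points) on a few clients while the errors are occurring, and again after the restart, would show whether every client recovers at the same time. Note that stat() on a hard-mounted export can block if the server is unreachable, so this is best run when the server is answering.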

Best,

ellis

P.S., apologies for the top-post last time around.

--
Ellis H. Wilson III, Ph.D.
     www.ellisv3.com
_______________________________________________
Beowulf mailing list, Beowulf@beowulf.org sponsored by Penguin Computing
To change your subscription (digest mode or unsubscribe) visit 
http://www.beowulf.org/mailman/listinfo/beowulf