Hi, Prentice. Have you checked that the MTU matches on all NICs and is honored by the router?
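In case it helps, here is a rough sketch of the per-node half of that check. It assumes Linux nodes that expose each interface's MTU under /sys/class/net/<iface>/mtu. The function names and the expected value of 1500 are placeholders I made up for illustration, not anything from your setup.

```python
from pathlib import Path


def read_mtus(sysfs_root="/sys/class/net"):
    """Return {interface_name: mtu} for every interface under sysfs_root."""
    mtus = {}
    for iface in sorted(Path(sysfs_root).iterdir()):
        mtu_file = iface / "mtu"
        # Skip entries that are not interfaces (no readable mtu file).
        if mtu_file.is_file():
            mtus[iface.name] = int(mtu_file.read_text().strip())
    return mtus


def mismatched(mtus, expected, skip=("lo",)):
    """Return the interfaces whose MTU differs from the expected value."""
    return {name: mtu for name, mtu in mtus.items()
            if name not in skip and mtu != expected}


def icmp_payload_for(mtu):
    """Largest ICMP echo payload that fits in one unfragmented IPv4 packet."""
    # 20 bytes of IPv4 header (no options) plus 8 bytes of ICMP header.
    return mtu - 28


if __name__ == "__main__":
    expected = 1500  # assumption: use 9000 if jumbo frames are intended
    if Path("/sys/class/net").is_dir():
        for name, mtu in mismatched(read_mtus(), expected).items():
            print(f"{name}: MTU {mtu}, expected {expected}")
    print(f"ICMP payload for a DF ping at MTU {expected}: "
          f"{icmp_payload_for(expected)}")
```

This only covers the NIC side. For the router side, one option on Linux is a ping with fragmentation prohibited, e.g. `ping -M do -s 1472 <host>` for an MTU of 1500, between nodes on either side of the router. If that size fails while smaller ones succeed, something in the path is not honoring the MTU.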
Cheers.

On Wed 04/19/17 02:34PM EDT, Prentice Bisbal wrote:
>
> On 04/19/2017 02:17 PM, Ellis H. Wilson III wrote:
> >On 04/19/2017 02:11 PM, Prentice Bisbal wrote:
> >>Thanks for the suggestion(s). Just this morning I started considering
> >>the network as a possible source of error. My stale file handle errors
> >>are easily fixed by just restarting the nfs servers with 'service nfs
> >>restart', so they aren't as severe as you describe.
> >
> >If a restart on solely the /server-side/ gets you back into a good
> >state this is an interesting tidbit.
>
> That is correct, restarting NFS on the server-side is all it takes
> to fix the problem.
>
> >Do you have some form of HA setup for NFS? Automatic failover
> >(sometimes setup with IP aliasing) in the face of network hiccups
> >can occasionally goof the clients if they aren't setup properly to
> >keep up with the change. A restart of the server will likely
> >revert back to using the primary, resulting in the clients
> >thinking everything is back up and healthy again. This situation
> >varies so much between vendors it's hard to say much more without
> >more details on your setup.
> >
>
> My setup isn't nearly that complicated. Every node in this cluster
> has a /local directory that is shared out to the other nodes in the
> cluster. The other nodes automount this remote directory as
> /l/hostname, where "hostname" is the name of the owner of the
> filesystem. For example, hostB will mount hostA:/local as /l/hostA.
>
> No fancy fail-over or anything like that.
>
> >Best,
> >
> >ellis
> >
> >P.S., apologies for the top-post last time around.
> >
>
> No worries. I'm so used to people doing that in mailing lists that
> I've become numb to it.
>
> Prentice
>
> _______________________________________________
> Beowulf mailing list, Beowulf@beowulf.org sponsored by Penguin Computing
> To change your subscription (digest mode or unsubscribe) visit
> http://www.beowulf.org/mailman/listinfo/beowulf

--
Gavin W. Burris
Senior Project Leader for Research Computing
The Wharton School
University of Pennsylvania
Search our documentation: http://research-it.wharton.upenn.edu/about/
Subscribe to the Newsletter: http://whr.tn/ResearchNewsletterSubscribe