The main issue we see is that OSTs occasionally get hung up, which causes writes to hang as the OST flaps, connecting to and disconnecting from the MDS.  Rebooting the OSSes fixes the issue, as it forces a remount.  It seems to happen only when the filesystem is nearly full (i.e. above 95% usage) and under heavy load.  We didn't see this issue before our CentOS 7 upgrade, so we are convinced it is due to the mismatch in Lustre versions.  The fullness of the filesystem is almost certainly contributing as well, since the problem seems to go away when usage is lower.  Still, I have seen it a few times when the filesystem was at 85%.

Anyway, the obvious culprit is the version mismatch.  It may also be that some of the additional features/enhancements in 2.5.34 are conflicting with the mainline version, as 2.5.34 is something we got from Intel for the IEEL appliance we have been running.

Odds are your systems are fine, as they aren't taking quite the pounding ours are.  The problem doesn't happen that frequently.

-Paul Edmon-


On 07/23/2018 02:03 PM, Michael Di Domenico wrote:
> On Mon, Jul 23, 2018 at 1:34 PM, Paul Edmon <ped...@cfa.harvard.edu> wrote:
>> Yeah we've found out firsthand that it's problematic as we have been seeing
>> issues :).  Hence the urge to upgrade.
> What issues are you seeing?  I have 2.10.4 clients pointing at 2.5.1
> servers, haven't seen any obvious issues and it's been running for
> some time now.
_______________________________________________
Beowulf mailing list, Beowulf@beowulf.org sponsored by Penguin Computing
To change your subscription (digest mode or unsubscribe) visit 
http://www.beowulf.org/mailman/listinfo/beowulf
