On Wednesday, December 29, 2010 07:29:21 pm Stuart Barkley wrote:
> On Mon, 13 Dec 2010 at 17:43 -0000, Christopher Samuel wrote:
...
> > One of the checks we do is to check that there are no symbol errors
> > on the IB link. However, I'm wondering if simply saying a single
> > error is too brutal for this - what do other people do about these ?
>
> I'm looking at Infiniband problems currently and have been watching
> our SymbolErrorCounter values. I'm told a "small number" of these
> errors are okay. I don't know the definition of "small" or over how
> long a time period.

My personal take on this is that, over a week or so of data, a two-digit error count indicates a non-perfect link/port (but will probably not be a real problem). A three-digit count is a problem; fix it.

/Peter
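
For anyone who wants to script the rule of thumb above, here is a minimal sketch. It assumes you already have two SymbolErrorCounter readings for a port taken roughly a week apart. The function name, the labels, and the treatment of single-digit counts as fine are my own assumptions for illustration, as is the handling of a counter that was cleared in between; none of that comes from any InfiniBand tool.

```python
def classify_symbol_errors(start_count: int, end_count: int) -> str:
    """Classify a port by SymbolErrorCounter growth over roughly one week.

    Thresholds follow the rule of thumb in this thread:
      two digits   -> non-perfect link/port, probably not a real problem
      three digits -> a problem, fix it
    Single-digit growth is treated as fine, which is an assumption.
    """
    delta = end_count - start_count
    if delta < 0:
        # Assumption: the counter was cleared during the window, so the
        # second reading is only a lower bound on the errors seen.
        delta = end_count
    if delta >= 100:
        return "problem"
    if delta >= 10:
        return "imperfect"
    return "ok"


if __name__ == "__main__":
    # Hypothetical readings, one week apart, for three ports.
    for name, before, after in [("port1", 0, 3), ("port2", 5, 47), ("port3", 12, 260)]:
        print(name, classify_symbol_errors(before, after))
```

The thresholds are per week of data, so readings taken over a different interval would need the counts scaled accordingly before applying them.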
_______________________________________________
Beowulf mailing list, Beowulf@beowulf.org sponsored by Penguin Computing
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf