I can try to help you here, but I would need to understand your setup
and which port the drop is occurring on.  A bad cable seems very
unlikely to be the cause.

Gilad. 

-----Original Message-----
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED]]
On Behalf Of Prentice Bisbal
Sent: Tuesday, December 02, 2008 7:24 AM
To: Beowulf Mailing List
Subject: [Beowulf] InfiniBand VL15 error

I'm getting this error when I run ibchecknet on my cluster:

#warn: counter VL15Dropped = 476        (threshold 100) lid 1 port 1
Error check on lid 1 (aurora HCA-1) port 1:  FAILED
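
For anyone collecting these warnings across many nodes, the counter name,
value, threshold, lid and port can be pulled out of the warning line with
a short script.  This is only a minimal Python sketch: the pattern is
modeled on the single line quoted above, so other ibchecknet versions may
format the line differently, and parse_warning is a hypothetical helper
name, not part of any InfiniBand tool.

```python
import re

# Pattern modeled only on the one warning line quoted in this message;
# treat the exact layout as an assumption.
WARN_RE = re.compile(
    r"#warn: counter (?P<counter>\w+) = (?P<value>\d+)\s+"
    r"\(threshold (?P<threshold>\d+)\) lid (?P<lid>\d+) port (?P<port>\d+)"
)


def parse_warning(line):
    """Return the fields of one warning line as a dict, or None if no match."""
    match = WARN_RE.search(line)
    if match is None:
        return None
    fields = match.groupdict()
    return {
        "counter": fields["counter"],
        "value": int(fields["value"]),
        "threshold": int(fields["threshold"]),
        "lid": int(fields["lid"]),
        "port": int(fields["port"]),
    }


if __name__ == "__main__":
    sample = "#warn: counter VL15Dropped = 476        (threshold 100) lid 1 port 1"
    print(parse_warning(sample))
```

Lines that do not match the pattern, such as the "Error check" line, come
back as None, so they can simply be skipped.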

I've googled around this morning, but haven't found anything helpful.
Most of the hits turn up code with the phrase "VL15Dropped", but nothing
explaining what this error means, what causes it, or how to fix it.

After clearing the counters with 'perfquery -r', the VL15Dropped count
starts increasing from zero almost immediately.

Any ideas what this error represents or how to fix it? Could it be a bad
cable?

--
Prentice
_______________________________________________
Beowulf mailing list, Beowulf@beowulf.org To change your subscription
(digest mode or unsubscribe) visit
http://www.beowulf.org/mailman/listinfo/beowulf
