Jon,

> I have a rack full of identical compute
> nodes. One of them has become heat sensitive.
>
> When it's in the warm computer room it crashes.
> I can't even run memtest from the CentOS DVD
> for 2 seconds. However, when this node is
> in my much cooler office everything works
> fine. All the other nodes are working fine
> in the computer room.

I had a similar problem when one of the plastic clips that mount the base ring of the CPU cooler broke, leaving the cooler held on by only the remaining three clips. When I started to save a virtual machine that was compiling OpenFOAM from source, Ubuntu shut down because of overheating.

> I'm not convinced the problem is actually
> the memory. Other than opening the node
> to spray cooling liquid when it's in the warm
> room, what approach would you use to figure out which
> component(s) is(are) failing?
>
> Cordially,
> --
> Jon Forrest
> Research Computing Support
> College of Chemistry
> 173 Tan Hall
> University of California Berkeley
> Berkeley, CA
> 94720-1460
> 510-643-1032
> jlforr...@berkeley.edu
> _______________________________________________
> Beowulf mailing list, Beowulf@beowulf.org sponsored by Penguin Computing
> To change your subscription (digest mode or unsubscribe) visit
> http://www.beowulf.org/mailman/listinfo/beowulf

Sincerely,
Dmitry