Folks,

I have a Beowulf-type cluster of 16 dual PII 333 MHz boxes (Asus P2L97-DS) running Debian 1.3.1.
They run hard, often under full load with high network traffic for days if not weeks. The current application is an MPI application using a master-slave paradigm.

I've noticed that the "master" machine often hangs just after the daily or weekly cron jobs run. In fact, 6:48 or a bit later is a very popular hang time, and hangs on Sunday are the most popular (much to my chagrin!). The fact that it is the master that preferentially hangs correlates with high network traffic.

At first, I removed (dpkg -r) xntp3, which seemed to be an obvious culprit, but I'm still getting occasional hangs.

Any thoughts? I suppose I could just disable these crons altogether, but I'd like to see if any of you have run across this sort of problem first.

--M

===========================================================================
Martin Weinberg                      Phone: (413) 545-3821
Dept. of Physics and Astronomy       FAX:   (413) 545-2117/0648
530 Graduate Research Tower
University of Massachusetts
Amherst, MA 01003-4525