Re: [Beowulf] Re: cluster fails to boot with managed switch, but 5-port switch works OK

2009-12-04 Thread Greg Keller
On Dec 4, 2009, at 3:23 PM, Bogdan Costescu wrote: When loading/reloading the driver there seems to be an instantaneous drop of the link that forces a new delay cycle. Most likely the PXE stack doesn't reset the link; the link is up soon after the computer is powered on so, by the time the P

Re: [Beowulf] Re: cluster fails to boot with managed switch, but 5-port switch works OK

2009-12-04 Thread Bogdan Costescu
On Thu, Dec 3, 2009 at 9:17 PM, Greg Keller wrote:
> Essentially, once the port has a physical link light it may take a while
> before spanning tree allows traffic to actually flow through the port.
> Longer than a typical timeout.
The time taken to activate the link is around 60s, but I've been

Re: [Beowulf] mpirun command

2009-12-04 Thread Don Holmgren
Your PC is likely running a Linux distribution that has LAM installed by default, and LAM's "mpirun" is in your path ahead of mpich's mpirun. You can confirm this with `which mpirun`. Try /mirror/mpich-1.2.7p1/bin/mpirun -np 1 cpi instead. Or make sure that /mirror/mpich-1.2.7p1/bin is at the fron
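A minimal sketch of the PATH approach mentioned above, assuming a Bourne-style shell and that MPICH's binaries live under /mirror/mpich-1.2.7p1/bin as in the thread. The variable name MPICH_BIN is only illustrative.

```shell
# Install prefix taken from the thread; adjust it to your own system.
MPICH_BIN=/mirror/mpich-1.2.7p1/bin

# Put the MPICH directory first so its mpirun and mpicc win the PATH lookup.
export PATH="$MPICH_BIN:$PATH"

# Show which mpirun the shell will now pick up, or say so if none is found.
command -v mpirun || echo "mpirun not found on PATH"
```

This only affects the current shell session. To make it permanent you would typically add the export line to your shell startup file.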

Re: [Beowulf] mpirun command

2009-12-04 Thread Gus Correa
Hi Christian, your default mpirun seems to be the old LAM MPI. Check with "which mpirun" and "mpirun --showme". You can use the full path name to your MPICH mpirun. You should also use the full path name to MPICH mpicc to compile cpi.c, for compatibility. It is likely that both are somewhere in your /mirror

[Beowulf] mpirun command

2009-12-04 Thread christian suhendra
Hello guys, I'm using mpich-1.2.7p1 installed on my PC, but when I run mpirun I get this error: suhendr...@cluster2:/mirror/mpich-1.2.7p1/examples$ mpirun -np 1 cpi - It seems that there is no lamd running on the host

Re: [Beowulf] Dual head or service node related question ...

2009-12-04 Thread Reuti
Hi, On 04.12.2009 at 10:24, Hearns, John wrote: What is viewed as the best practice (or what are people doing) on something like an SGI ICE system with multiple service or head nodes? Does one service node generally assume the same role as the head node above (serving NFS, logins, and runni

RE: [Beowulf] New member, upgrading our existing Beowulf cluster

2009-12-04 Thread Hearns, John
It's not inevitable that the policy be that 3 month jobs are allowed. Three MONTHS. Some celebrities' careers are shorter than that these days. If people running jobs like this don't checkpoint, they deserve everything they get.

RE: [Beowulf] Dual head or service node related question ...

2009-12-04 Thread Hearns, John
What is viewed as the best practice (or what are people doing) on something like an SGI ICE system with multiple service or head nodes? Does one service node generally assume the same role as the head node above (serving NFS, logins, and running services like PBS pro)? Or ... if NFS is u