PS - And don't run the programs as root!

Gus Correa

Gus Correa wrote:
Hi Christian

Somehow your program was not attached to the message.

In any case, you didn't say anything about your "machinefile" contents.
You need to list the nodes you want to use there.
The command line will be something like this:

mpirun -np 4 -machinefile my_machinefile canon

"man mpirun" may help you with the details.
(I assume you are using the mpirun that comes with mpich1.)
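
As a rough sketch of what such a machinefile could look like: the snippet below assumes the four host names shown in the tstmachines output of the original message (cluster1 to cluster4) and a plain one-host-per-line layout. The file name my_machinefile is just the placeholder from the command above. Please check "man mpirun" for the exact format your installation expects.

```shell
# Hypothetical machinefile: one host name per line. The names are taken
# from the tstmachines output in the original message, not verified here.
cat > my_machinefile <<'EOF'
cluster1
cluster2
cluster3
cluster4
EOF

# Sanity check: print how many hosts are listed before launching.
grep -c . my_machinefile

# Then launch as in the command above (left as a comment, not run here):
# mpirun -np 4 -machinefile my_machinefile canon
```

With four hosts listed and "-np 4", one would expect the processes to be spread across the listed nodes, but how they are actually placed depends on the mpirun in use.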

Having said that, I suggest that you move from MPICH-1 to
OpenMPI or to MPICH2.
MPICH-1 (mpich-1.2.7p1) is old, no longer maintained or supported,
and often breaks on current Linux kernels.
The MPICH developers also recommend upgrading to MPICH2.

OpenMPI and MPICH2 are free, easy to install, stable, up to date,
and more efficient than MPICH-1.
Upgrading to one of them is likely to avoid more trouble later,
especially with your tight deadline.

See:
http://www.open-mpi.org/
http://www.mcs.anl.gov/research/projects/mpich2/


I hope this helps,
Gus Correa
---------------------------------------------------------------------
Gustavo Correa
Lamont-Doherty Earth Observatory - Columbia University
Palisades, NY, 10964-8000 - USA
---------------------------------------------------------------------


christian suhendra wrote:
Hello guys,
I have installed mpich-1.2.7p1 on Ubuntu 9.04, and I have configured NFS and RSH.
I use device=ch_p4,
but when I ran my program it did not seem to work. I got this result:
r...@cluster3:/mirror/mpich-1.2.7p1# mpirun -np 1 canon
Process 0 of 1 on cluster3
Total Time: 4.316000 msecs
r...@cluster3:/mirror/mpich-1.2.7p1# mpirun -np 4 canon
Process 0 of 4 on cluster3
Total Time: 21.552000 msecs
Process 2 of 4 on cluster2
Process 1 of 4 on cluster1
Process 3 of 4 on cluster1
r...@cluster3:/mirror/mpich-1.2.7p1#

The process only works on 1 node,
but when I test the machines, it connects to all nodes:
r...@cluster3:/mirror/mpich-1.2.7p1# /mirror/mpich-1.2.7p1/sbin/tstmachines -v LINUX
Trying true on cluster1 ...
Trying true on cluster2 ...
Trying true on cluster3 ...
Trying true on cluster4 ...
Trying ls on cluster1 ...
Trying ls on cluster2 ...
Trying ls on cluster3 ...
Trying ls on cluster4 ...
Trying user program on cluster1 ...
Trying user program on cluster2 ...
Trying user program on cluster3 ...
Trying user program on cluster4 ...

I don't know exactly where the problem is, or why my program cannot run on all nodes.
Please help me.
My deadline is about 1 week away.
I am very much hoping for your help.


I attached my program listing so you can test it on your system.
Thank you very much.




regards
christian



------------------------------------------------------------------------

_______________________________________________
Beowulf mailing list, Beowulf@beowulf.org sponsored by Penguin Computing
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf
