Hi guys,

ok, some more information. I am using OpenMPI-1.2.8 and I only start 4 processes per node, so my hostfile looks like this:

comp12 slots=4
comp18 slots=4
comp08 slots=4
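For completeness, the job is launched roughly like this (the input and output file names below are just placeholders, not the actual job):

    # illustrative only: 3 nodes x 4 slots = 12 MPI processes
    mpirun --hostfile ./hostfile -np 12 nwchem input.nw > job.out

and to see what actually ends up running on one of the nodes I check with something along these lines (a sketch of the check, not output from the real run):

    # list processes with CPU and memory usage, sorted by CPU
    ssh comp12 'ps -eo pid,ppid,%cpu,%mem,comm --sort=-%cpu | head -n 15'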
And yes, one process is the idle one which does things in the background. I have observed similar degradations before with a different program (GAMESS), where in the end running a job on one node was _faster_ than running it on more than one node. Clearly, there is a problem here.

Interestingly, the fifth process is consuming memory as well; I did not see that at the time I posted. That is somewhat odd too, as a different calculation (same program) does not show that behaviour. I assume it is one extra process per job group which acts as a master or shepherd for the slave processes. I know that GAMESS (which does not use MPI but DDI) has one additional process as a data server. IIRC, the extra process does come from NWChem, but I doubt I am oversubscribing the node, as it usually should not do much, as mentioned before.

I am still wondering whether that could be a network issue?

Thanks for your comments!

All the best

Jorg

On Wednesday 16 December 2009 04:42:59 beowulf-requ...@beowulf.org wrote:
> Hi Glen, Jorg
>
> Glen: Yes, you are right about MPICH1/P4 starting extra processes.
> However, I wonder if that is what is happening to Jorg,
> or if what he reported is just plain CPU oversubscription.
>
> Jorg: Do you use MPICH1/P4?
> How many processes did you launch on a single node, four or five?
>
> Glen: Out of curiosity, I dug out the MPICH1/P4 I still have on an
> old system, compiled and ran "cpi.c".
> Indeed there are extra processes there, besides the ones that
> I intentionally started in the mpirun command line.
> When I launch two processes on a two-single-core-CPU machine,
> I also get two (not only one) extra processes, for a total of four.
>
> However, as you mentioned,
> the extra processes do not seem to use any significant CPU.
> Top shows the two actual processes close to 100% and the
> extra ones close to zero.
> Furthermore, the extra processes don't use any
> significant memory either.
>
> Anyway, in Jorg's case all processes consumed about
> the same (low) amount of CPU, but ~15% memory each,
> and there were 5 processes (only one "extra"? Is it one per CPU socket?
> Is it one per core? One per node?).
> Hence, I would guess Jorg's context is different.
> But ... who knows ... only Jorg can clarify.
>
> These extra processes seem to be related to the
> mechanism used by MPICH1/P4 to launch MPI programs.
> They don't seem to appear in recent OpenMPI or MPICH2,
> which have other launching mechanisms.
> Hence my guess that Jorg had an oversubscription problem.
>
> Considering that MPICH1/P4 is old, no longer maintained,
> and seems to cause more distress than joy in current kernels,
> I would not recommend it to Jorg or to anybody anyway.
>
> Thank you,
> Gus Correa
> ---------------------------------------------------------------------
> Gustavo Correa
> Lamont-Doherty Earth Observatory - Columbia University
> Palisades, NY, 10964-8000 - USA
> ---------------------------------------------------------------------

--
*************************************************************
Jörg Saßmannshausen
Research Fellow
University of Strathclyde
Department of Pure and Applied Chemistry
295 Cathedral St.
Glasgow
G1 1XL

email: jorg.sassmannshau...@strath.ac.uk
web: http://sassy.formativ.net

Please avoid sending me Word or PowerPoint attachments.
See http://www.gnu.org/philosophy/no-word-attachments.html