Paul,

On 30 May 2008 at 15:47, Paul Hewson wrote:
| Hello,
|
| We have R working with Rmpi/openmpi, but I'm a little worried.
| Specifically, (a) the -np flag doesn't seem to override the hostfile
| (it works fine with a Fortran hello world) and (b) I appear to have
| twice as many processes running as I think I should.
|
| Rmpi version 0.5.5
| Openmpi version 1.1
That's old. Open MPI 1.2.* fixed and changed a lot of things. I am happy
with 1.2.6, the default on Debian.

| Viglen HPC with (effectively) 9 blades and 8 nodes on each blade.
| The myhosts file contains details of the 9 blades, but specifies that
| there are 4 slots on each blade (to make sure I leave room for other
| users).
|
| When running:
|   mpirun -bynode -np 2 -hostfile myhosts R --slave --vanilla task_pull.R
|
| 1. I get as many R slaves as there are slots defined in my myhosts file
|    (there are 36 slots defined, and I get 36 slaves, regardless of the
|    setting of -np); the master goes on the first machine in the myhosts
|    file.
| 2. The .Rout file confirms that I have 1 comm with 1 master and 36 slaves.
| 3. When I top each blade it indicates that there are in fact 8 processes
|    running on each blade, and
| 4. When I pstree each blade it indicates that there are two orted
|    processes, each with 4 subprocesses.

You never showed us task_pull.R ...

And as I readily acknowledge that this can be tricky, why don't you
experiment with a simpler setting? Consider this token littler [1]
invocation (or use Rscript if you prefer / have only that):

[EMAIL PROTECTED]:~> r -e'library(Rmpi); cat("Hello rank", mpi.comm.rank(0), "size", mpi.comm.size(0), "on", mpi.get.processor.name(), "\n")'
Hello rank 0 size 1 on ron
[EMAIL PROTECTED]:~>

So without an outer mpirun (or orterun, as the Open MPI group now calls
it) we get one instance. Makes sense. Now with two hosts defined on the
fly, and two instances each:

[EMAIL PROTECTED]:~> orterun -n 4 -H ron,joe r -e'library(Rmpi); cat("Hello rank", mpi.comm.rank(0), "size", mpi.comm.size(0), "on", mpi.get.processor.name(), "\n")'
Hello rank 0 size 4 on ron
Hello rank 2 size 4 on ron
Hello rank 3 size 4 on joe
Hello rank 1 size 4 on joe
[EMAIL PROTECTED]:~>

Adding '-bynode' and using '-np 4' instead of '-n 4' does not change
anything.
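As an aside, since we have not seen task_pull.R: with raw Rmpi, as with
snow, you can fix the number of workers from _within_ R via
mpi.spawn.Rslaves() and keep the outer mpirun/orterun count at one. A
minimal sketch only -- the nslaves value and the mpi.remote.exec()
payload are placeholders, not your actual script:

```r
## Hypothetical sketch, NOT the poster's task_pull.R (which was not shown).
## Spawning slaves from within R means the outer orterun -np can stay at 1,
## analogous to snow's makeCluster(4, "MPI").
library(Rmpi)

mpi.spawn.Rslaves(nslaves = 4)          # exactly 4 workers, chosen in R

## Placeholder work: have each slave report its rank and host.
res <- mpi.remote.exec(paste("rank", mpi.comm.rank(),
                             "on", mpi.get.processor.name()))
print(res)

mpi.close.Rslaves()                     # shut the slaves down cleanly
mpi.quit()
```

Run as e.g. 'orterun -np 1 -hostfile myhosts R --slave --vanilla' on that
script and check with top/pstree how many processes actually appear per
blade.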
| From the point of view of getting a job done this ***seems*** OK (it's
| running very quickly), but it doesn't seem quite right - given I'm
| sharing the machine with other users and so on. Is there something I've
| missed in the usage of mpirun with R/Rmpi?

I cannot quite determine from what you said here what your objective is.
What exactly are you trying to do that you are not getting done? Using
fewer instances? Maybe that is in fact an Open MPI 1.2.* versus 1.1.*
issue.

One thing to note is that if you wrap all this in the excellent snow
package by Tierney et al, then Open MPI's '-n' can always be one, as you
determine from _within_ R how many nodes you want:

[EMAIL PROTECTED]:~> orterun -bynode -np 1 -H ron,joe r -e'library(snow); cl <- makeCluster(4, "MPI"); res <- clusterCall(cl, function() Sys.info()["nodename"]); print(do.call(rbind, res))'
Loading required package: utils
Loading required package: Rmpi
        4 slaves are spawned successfully. 0 failed.
     nodename
[1,] "joe"
[2,] "ron"
[3,] "joe"
[4,] "ron"
[EMAIL PROTECTED]:~>

Note the outer '-np 1' and the inner makeCluster(4, "MPI") to give you 4
slaves. If you use a larger '-n $N' you will get $N instances, each
starting as many nodes as makeCluster asks for.

Hope this helps, Dirk

[1] Littler can be had via Debian / Ubuntu or from
    http://dirk.eddelbuettel.com/code/littler.html

--
Three out of two people have difficulties with fractions.

______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.