Follow-up Comment #2, bug #38092 (project findutils):

I didn't find an option called --process-slot-var in the man or info page of my xargs. Is it the same as --max-procs or -P? If so, those are not sufficient, because with MPI multiple copies of the same program are launched, one per shared-memory node, with identical command-line arguments and environment.
So, for example, if I have a list of files to process:

    for i in $(seq 1 1 1000); do echo file"$i"; done > files

I can easily parallelize their processing on one node by calling xargs (in many environments mpirun or its equivalent must be used to launch any program on the compute nodes):

    mpirun -n 1 xargs --arg-file files -P 12 -n 5

But trying to use more than one node doesn't work, because the same xargs is invoked on each node, meaning that each copy of xargs processes all of the files. I.e.:

    mpirun -n 1 xargs --arg-file files -P 12 -n 5 | wc -l
    200
    mpirun -n 2 xargs --arg-file files -P 12 -n 5 | wc -l
    400
    mpirun -n 3 xargs --arg-file files -P 12 -n 5 | wc -l
    600
    ...
    mpirun -n 10 xargs --arg-file files -P 12 -n 5 | wc -l
    2000

It would be nice if, in the above case where n copies of xargs are started by MPI, the file list were first divided into n parts, each of which would then be processed by only one xargs in the usual way. In principle adding support for that should be easy, but I didn't get very far in the few hours that I hacked on xargs.c. For a simple program the modifications would consist of something like this:

At the top:

    #ifdef HAVE_MPI
    #include <mpi.h>
    #endif

Then inside main:

    #ifdef MPI_VERSION
    int comm_size, rank;
    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &comm_size);
    // optionally check the return value of each MPI_* call...
    #endif

Then, in the place where the arguments are processed, only process for example every comm_size-th arg starting at offset rank:

    #ifdef MPI_VERSION
    size_t current_arg = 0;
    #endif

in the while loop (or for loop doing bc_push_arg?):

    #ifdef MPI_VERSION
    /* count every argument, whether or not this rank handles it */
    if (current_arg++ % (size_t) comm_size == (size_t) rank) {
    #endif
    ...do the usual stuff here...
    #ifdef MPI_VERSION
    } else {
    ...maybe do something here just to skip this argument?...
    }
    #endif

and just before returning from main:

    original_exit_value = child_error;
    #ifdef MPI_VERSION
    MPI_Finalize();
    #endif
    return child_error;

If not using MPI then the preprocessed source should stay identical to the current version, so these changes shouldn't be too invasive.

    _______________________________________________________

Reply to this item at:

  <http://savannah.gnu.org/bugs/?38092>

_______________________________________________
  Message sent via/by Savannah
  http://savannah.gnu.org/
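
As a quick sanity check of the every-comm_size-th-argument idea above, here is a small standalone C sketch. It does not use MPI and does not touch xargs.c: comm_size and rank are ordinary parameters standing in for the values MPI_Comm_size and MPI_Comm_rank would report, and args_for_rank and args_for_all_ranks are hypothetical helper names made up for this illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper (not part of xargs): count how many of total_args
   arguments one rank would handle if argument number current_arg is taken
   only when current_arg % comm_size == rank. */
size_t
args_for_rank (size_t total_args, int comm_size, int rank)
{
  size_t handled = 0;
  size_t current_arg;

  assert (comm_size > 0 && rank >= 0 && rank < comm_size);

  /* current_arg advances for every argument, taken or skipped. */
  for (current_arg = 0; current_arg < total_args; current_arg++)
    if (current_arg % (size_t) comm_size == (size_t) rank)
      handled++;

  return handled;
}

/* Hypothetical helper: total handled across all ranks. If the split neither
   drops nor duplicates arguments, this equals total_args. */
size_t
args_for_all_ranks (size_t total_args, int comm_size)
{
  size_t sum = 0;
  int rank;

  for (rank = 0; rank < comm_size; rank++)
    sum += args_for_rank (total_args, comm_size, rank);

  return sum;
}
```

For the 1000-entry file list and three copies, args_for_rank gives 334 arguments to rank 0 and 333 each to ranks 1 and 2, so the total stays at 1000 instead of growing with the number of copies.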