On 10 May 2005, at 1:05 am, Paul Brossier wrote:
Hi all,
I am looking at ways to distribute batch jobs on various hosts. Essentially, I have N different command lines, and M different hosts to run them on:
foo -i file1.data -p 0.1
foo -i file2.data -p 0.1
foo -i file3.data -p 0.1
...
foo -i file1.data -p 0.2
...
I had a try with 'queue' [1], but it seems rather obsolete now. I am now seeking recent alternatives. I came across a few solutions, such as DQS [2] (non-free, unmaintained), OpenPBS [3] (non-free), and distribulator [4] (looks interesting).
Now I feel like I have missed something obvious. Is there a tool out there that I could use as a drop-in replacement for queue?
This is not the right forum for this question.
However, I'll answer you anyway, since I know something about this. The two market leaders for this sort of processing are Sun GridEngine (which is free [as in beer, at least]) and Platform LSF, which is proprietary and costs $$$, but is very good at what it does.
Both products can do what you are asking. Personally, I use LSF in my day job on a ~1500 CPU cluster, running a mixture of Red Hat 8.0, Debian sarge (on newer x86 boxes), Tru64 5.1B (on Alphas) and SGI ProPack Linux (on our SGI Altixes), but I know SGE could run this as well.
In LSF, you'd submit that set of jobs (let's say your files are named file1.data - file100.data) as something like the following:
bsub -J"set1[1-100]" -o 0.1.output.%I foo -i file\$LSB_JOBINDEX.data -p 0.1
bsub -J"set2[1-100]" -o 0.2.output.%I foo -i file\$LSB_JOBINDEX.data -p 0.2
The standard output and standard error, as well as a job summary (CPU time and memory used, etc) would appear in output files named:
0.1.output.1
0.1.output.2
etc
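If there are more parameter values than the two above, the bsub lines can be generated in a loop. What follows is only a minimal sketch under a few assumptions: the program is still the placeholder foo, the inputs are named file1.data to fileN.data, and print_submissions is a hypothetical helper name chosen for this example. It prints the submissions rather than running them, so they can be checked first.

```shell
#!/bin/sh
# Dry run: print one bsub array submission per parameter value.
# Assumptions (not from LSF itself): the program is called "foo", the
# inputs are file1.data .. fileN.data, and set1, set2, ... are arbitrary
# job names. print_submissions is a hypothetical helper.
print_submissions() {
    nfiles=$1
    shift
    n=1
    for p in "$@"; do
        # The backslash before $LSB_JOBINDEX is printed literally, so the
        # variable is expanded when the job runs, not at submission time.
        # %%I prints a literal %I, which bsub replaces with the array index.
        printf 'bsub -J"set%s[1-%s]" -o %s.output.%%I foo -i file\\$LSB_JOBINDEX.data -p %s\n' \
            "$n" "$nfiles" "$p" "$p"
        n=$((n + 1))
    done
}
```

Calling print_submissions 100 0.1 0.2 prints the two bsub lines given above; piping that output to sh would submit them.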
GridEngine would have its own methods for submitting these so-called "job arrays".
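As a rough idea of what that might look like, here is a sketch of a GridEngine array-job script for the first set of jobs. It assumes the same placeholder program foo and the same file naming as above; the script name run_foo.sh and the helper input_for_task are made up for this example, and the options should be checked against the qsub manual page of your installation before relying on them.

```shell
#!/bin/sh
# Sketch of a GridEngine array job; submit with: qsub run_foo.sh
# (run_foo.sh is a hypothetical file name for this script.)
#$ -N set1
#$ -t 1-100
#$ -cwd
#$ -j y
#$ -o 0.1.output.$TASK_ID

# Hypothetical helper: map an array task number to its input file name.
input_for_task() {
    printf 'file%s.data\n' "$1"
}

# The scheduler sets SGE_TASK_ID for each array task (and sets it to
# "undefined" for non-array jobs), so foo is only started inside a task.
if [ -n "${SGE_TASK_ID:-}" ] && [ "$SGE_TASK_ID" != "undefined" ]; then
    foo -i "$(input_for_task "$SGE_TASK_ID")" -p 0.1
fi
```

The lines starting with #$ are read by qsub as embedded options: -t 1-100 asks for 100 tasks, and -j y merges standard error into the per-task output file. A second copy of the script with -p 0.2 and a different -o prefix would cover the other set.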
I looked at GNU queue a long time ago, and it looked (to me) as though its mode of operation was largely based on how LSF works, but when I looked at GNU queue it was pretty fundamentally broken (and it got removed from woody as a result). GridEngine is rather different in its organisation, but a lot of people swear by it.
Tim
-- Dr Tim Cutts GPG: 1024/D FC81E159 5BA6 8CD4 2C57 9824 6638 C066 16E2 F4F5 FC81 E159