On 19 Apr 2007, at 3:20 pm, Toon Knapen wrote:
> Tim Cutts wrote:
>> Optimising for throughput, at least with an embarrassingly
>> parallel workload of serial jobs like we have here, is trivial; a
>> single first-come-first-served queue is optimal, as long as the
>> code is well written and doesn't block too much on shared
>> resources like file servers or databases.
>
> But what if you have a dual-CPU, dual-core machine to which you
> assign 4 slots? Suppose one slot is used by a process that performs
> heavy I/O, and another heavy-I/O process is then launched. In that
> case the latter process should wait until the first one is done, to
> avoid reducing the efficiency of the system. Generally, however,
> cluster schedulers take only time and memory requirements into
> account.

I think that varies. LSF records the current I/O of a node as one of
its load indices, so you can request a node which is doing less than
a certain amount of I/O. I imagine the same is true of SGE, but I
wouldn't know.

> Additionally, in the case above, to optimise the efficiency of
> the node, I might prefer to launch just 1 process which uses 4
> threads to perform multi-threaded (BLAS) calculations.
That could certainly be requested with LSF:

    bsub -n 4 -R"select[io < 10] span[hosts=1]" my_four_thread_job

This selects a host currently performing less than 10 KB per second of
disk I/O, and requests four job slots on a single node.
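
If you submit jobs like this often, the request could be wrapped in a small shell function. This is only a sketch: build_bsub_cmd is a made-up helper name, not part of LSF, and it prints the command line as a dry run rather than submitting anything, so it can be tried on a machine without LSF installed.

```shell
#!/bin/sh
# Hypothetical helper (not part of LSF): build a bsub command line for a
# multi-threaded job that needs all of its slots on one lightly loaded host.
#   $1  number of job slots (one per thread)
#   $2  upper bound on the host's "io" load index, in KB/s
#   $3+ the command to run
build_bsub_cmd() {
    threads=$1
    max_io=$2
    shift 2
    # select[io < N] restricts the choice to hosts whose io index is below N;
    # span[hosts=1] keeps all requested slots on a single node.
    printf 'bsub -n %s -R "select[io < %s] span[hosts=1]" %s\n' \
        "$threads" "$max_io" "$*"
}

# Dry run: print the command instead of submitting it.
build_bsub_cmd 4 10 my_four_thread_job
```

Piping the printed line to sh would submit it for real. The job itself still has to be told to use four threads; for many BLAS builds that probably means setting OMP_NUM_THREADS=4 in its environment, but that depends on the library, so check yours.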
Tim
_______________________________________________
Beowulf mailing list, Beowulf@beowulf.org