but what if you have a bi-cpu bi-core machine to which you assign 4 slots? Say one slot is being used by a process which performs heavy IO, and then another process is launched that also performs heavy IO. In that case the latter process should wait until the first one is done, to avoid degrading the throughput of the node. Generally, however, clusters take only time and memory requirements into account.
that seems rather silly. conceptually, there's no reason the scheduler can't model IO capacity as it does cpu or memory. permit users to declare that their jobs have some IO requirement, with 1.0 meaning 1 share (the node's IO capacity divided by ncores).
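as a rough illustration of that idea (a minimal sketch, not any real scheduler's API): the Job and Node classes, their field names and the admission rule below are all hypothetical. the only thing taken from the text above is the unit, where a node with ncores cores has ncores IO shares and a job declares how many it needs.

```python
from dataclasses import dataclass, field


@dataclass
class Job:
    # Hypothetical job description: requirements are declared by the user.
    name: str
    slots: int = 1
    io: float = 1.0  # declared IO need in shares (1.0 = node capacity / ncores)


@dataclass
class Node:
    # Hypothetical node model: IO is tracked like any other consumable resource.
    ncores: int
    free_slots: int = field(init=False)
    free_io: float = field(init=False)

    def __post_init__(self):
        self.free_slots = self.ncores
        self.free_io = float(self.ncores)  # one IO share per core

    def fits(self, job: Job) -> bool:
        # A job is admitted only if every declared resource fits.
        return job.slots <= self.free_slots and job.io <= self.free_io

    def start(self, job: Job) -> bool:
        if not self.fits(job):
            return False  # job stays queued
        self.free_slots -= job.slots
        self.free_io -= job.io
        return True

    def finish(self, job: Job) -> None:
        # Return the reserved resources to the node.
        self.free_slots += job.slots
        self.free_io += job.io


node = Node(ncores=4)
heavy_a = Job("heavy_a", io=3.0)
heavy_b = Job("heavy_b", io=3.0)

started_a = node.start(heavy_a)        # fits: 3.0 of 4.0 shares
started_b = node.start(heavy_b)        # refused: only 1.0 share left
node.finish(heavy_a)
started_b_later = node.start(heavy_b)  # fits once heavy_a is done
```

note that in this sketch the shares are reserved from the declaration when the job starts, not from measured load, so it doesn't matter whether the job begins its IO immediately or only after a few minutes. it does assume users declare their requirements honestly.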
submits 4 jobs, the jobs do not directly start to generate heavy I/O. So the scheduler might think that the 4 jobs can easily coexist on this same node. However, after a few minutes all 4 jobs start eating disk BW and slow the node down horribly. What would your suggestion be to solve this?
this is why I've never found LSF's load indices very useful, except on large shared-memory machines. of course, everything about scheduling on such a machine is easier (in general: partitions make life harder.)
_______________________________________________
Beowulf mailing list, Beowulf@beowulf.org
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf