On Thu, 3 Dec 2009 at 2:29pm, Mark Hahn wrote
>>> if a single node goes down, you need to take down all the
>>> nodes in the chassis before you can remove the dead node. Not very
>>> practical.
>>
>> Eh? What's so hard about marking the other nodes as unusable in your
>> batch system, and waiting for them to become free?
>
> depends on your max job length. but yeah, idling three nodes for a week
> is not going to be noticeable in anything but a quite small cluster...

But doesn't the engineer in you just bristle at the (admittedly, rather
slight) inefficiency? Call me OCD (you wouldn't be the first), but it
just bugs me.
--
Joshua Baker-LePain
QB3 Shared Cluster Sysadmin
UCSF
_______________________________________________
Beowulf mailing list, Beowulf@beowulf.org sponsored by Penguin Computing