On Sat, Apr 04, 2009 at 05:24:23PM -0400, Jason Riedy wrote:
> And Robert G. Brown writes:
> > For them servicing/replacing a system is cheap: Box dies.
> > Employee notes this, grabs box from Big Stack of Boxes, carries
> > it to dead box, removes dead box, replace it with new working
> > box, presses power switch, walks away.
> 
> Plus, your operator can be unskilled.

Um, not completely. These clusters start with 3 copies of every chunk
of data, and as you work you have to make sure that you don't take
down the wrong system and leave the cluster with only 0 or 1 copies of
some chunk. There are software mechanisms you can use to help, but the
operator still needs to know how the rules work.
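
To make the rule concrete, here is a rough sketch of the kind of check
such a mechanism might do before an operator pulls a box. Everything in
it is made up for illustration: the replica map, the node and chunk
names, and the 2-copy threshold are assumptions, not how any particular
cluster software actually does it.

```python
# Hypothetical replica map: chunk id -> set of hosts holding a live copy.
# Names and layout are illustrative only.
replicas = {
    "chunk-a": {"node1", "node2", "node3"},
    "chunk-b": {"node2", "node3", "node4"},
    "chunk-c": {"node1", "node4", "node5"},
}


def unsafe_chunks(replicas, down_hosts, min_copies=2):
    """Return {chunk: copies_left} for chunks that would fall below
    min_copies if every host in down_hosts were offline."""
    down = set(down_hosts)
    at_risk = {}
    for chunk, hosts in replicas.items():
        remaining = len(set(hosts) - down)
        if remaining < min_copies:
            at_risk[chunk] = remaining
    return at_risk


def safe_to_take_down(replicas, already_down, candidate, min_copies=2):
    """True if taking candidate offline, on top of the hosts that are
    already down, leaves every chunk with at least min_copies copies."""
    down = set(already_down) | {candidate}
    return not unsafe_chunks(replicas, down, min_copies)


if __name__ == "__main__":
    # Healthy cluster: pulling node1 still leaves 2 copies of everything.
    print(safe_to_take_down(replicas, [], "node1"))
    # node2 is already dead: pulling node3 too would leave 1 copy
    # of chunk-a and chunk-b.
    print(safe_to_take_down(replicas, ["node2"], "node3"))
    print(unsafe_chunks(replicas, ["node2", "node3"]))
```

The point is that the answer depends on what else is already down, which
is exactly the part the operator has to understand.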

Some tasks, yeah, no problem, such as when the box is already dead. But
many tasks involve boxes which aren't dead yet: 1 disk has failed, or
the box needs a reboot to run a new kernel or a new release of the
application software, etc.

-- greg


_______________________________________________
Beowulf mailing list, Beowulf@beowulf.org
To change your subscription (digest mode or unsubscribe) visit 
http://www.beowulf.org/mailman/listinfo/beowulf
