Re: [Beowulf] Re: Purdue Supercomputer

Mark Hahn Sat, 10 May 2008 17:39:41 -0700

clusters. What if you have 1 of the systems in the cluster down or any
network failures. Can we make our cluster (2-5 systems only) work properly.


normally, the cluster's management software will monitor and deal with
node failure.  at least that means noticing a failure and ensuring that the
node isn't used (until fixed) and dealing with any jobs that involved the
node.  it's also fairly common for server nodes (not just slave/compute
nodes) to have some failover/high-availability features.  (HA can also be
done for compute jobs, but IMHO it's not worth considering in normal cases,
ie, infrequent node failures.)
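
as a rough illustration of the minimum described above (notice the failure, keep the node out of use until fixed, deal with the jobs that involved it), here is a toy sketch. the Cluster class, its fields and the node/job names are all hypothetical and not taken from any real management package; real schedulers do much more than this.

```python
from dataclasses import dataclass, field


@dataclass
class Cluster:
    """Toy model of a scheduler's view of a cluster (hypothetical, not a real tool)."""
    usable: set = field(default_factory=set)     # nodes eligible for new jobs
    offline: set = field(default_factory=set)    # nodes held out until fixed
    running: dict = field(default_factory=dict)  # job id -> set of nodes it uses
    queue: list = field(default_factory=list)    # job ids waiting to run

    def node_failed(self, node):
        """Take a failed node out of service and requeue every job that used it."""
        self.usable.discard(node)
        self.offline.add(node)
        hit = sorted(job for job, nodes in self.running.items() if node in nodes)
        for job in hit:
            del self.running[job]   # the job cannot continue without the node
            self.queue.append(job)  # so it goes back in the queue to be rerun
        return hit

    def node_repaired(self, node):
        """Return a fixed node to the pool of usable nodes."""
        self.offline.discard(node)
        self.usable.add(node)


# assumed example: three nodes, one two-node job and one single-node job
cluster = Cluster(
    usable={"n1", "n2", "n3"},
    running={"jobA": {"n1", "n2"}, "jobB": {"n3"}},
)
requeued = cluster.node_failed("n2")
```

in this sketch only the job that touched the failed node is requeued; jobs on healthy nodes keep running, which is usually all you need when node failures are infrequent.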

Also what about geographically distant cluster systems. Say 1 in USA


sure, there's nothing about clusters that really assumes locality,
though obviously geographic distribution has effects on achievable
performance for wide-area MPI or distant file access.  wide-area
clustering seems more of a political stunt to me (yes, including grids.)

and other in India. How do we manage our cluster in mishaps or
difficult conditions.


I find that with IPMI and console redirection, it's very rarely necessary to
care about where your nodes are, at least from a sysadmin perspective.
you need to ask what the benefit is, though, in a wide-area cluster
(versus separate, local ones.)  I wouldn't assume that management would
be easier, and obviously only gratuitously parallel apps (sometimes called
embarrassingly parallel) could use it.
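
to make the IPMI point concrete, here is a small sketch that only builds ipmitool command lines for a node's management controller; it does not run them. the BMC hostname, user name and password-file path are made-up placeholders, and it assumes the BMC speaks IPMI v2.0 over the network (the lanplus interface).

```python
def ipmi_command(host, user, password_file, *action):
    """Build an ipmitool argv list for a remote BMC (lanplus = IPMI v2.0 over LAN)."""
    return [
        "ipmitool",
        "-I", "lanplus",      # network interface type
        "-H", host,           # BMC address (placeholder below)
        "-U", user,           # BMC user name (placeholder below)
        "-f", password_file,  # read the password from a file, not the command line
        *action,
    ]


# hypothetical BMC address and credentials, for illustration only
bmc = ("node17-bmc.example.org", "admin", "/root/.ipmipass")

power_status = ipmi_command(*bmc, "chassis", "power", "status")
power_cycle = ipmi_command(*bmc, "chassis", "power", "cycle")
console = ipmi_command(*bmc, "sol", "activate")  # serial-over-LAN console
```

each list can be handed to subprocess.run() from an admin host; nothing in it depends on whether the node is in the next rack or on another continent.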

lastly, how about having beowulf cluster systems in space. putting 1 pc
on each planet or celestial body that we want to track and the server
in india.


just because it could be done doesn't mean it makes sense...

is linux the best choice in such cases...


your choice of OS depends primarily on your preference and experience.
