David Mathog wrote:
Do you really need a UPS for the whole cluster? In many instances it is good enough to put a UPS on the master node and just use surge suppressors on the compute nodes. The upside is that only a small and relatively inexpensive UPS is required. The downside, of course, is that any power failure will break ongoing calculations. However, if your work can be checkpointed, a power failure will only wipe out the work done since the last checkpoint. If your power is reasonably reliable, this is a reasonable way to save a lot of money. Also, unless you are also buying a generator, a whole-cluster UPS will only buy you limited uptime during a power failure, so you may well lose the calculation despite the large UPS.
As you remark, it really depends on how reliable your power is. The vast majority of our power outages are either just blips or last under a minute, but even the blips are enough to reboot the computers. Generally (for us), if power goes out for more than a couple of minutes, there is a good chance that it is going to be out for long enough that the nodes will need to be shut down. Having the whole cluster on UPS power saves a lot of downtime.
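To put rough numbers on that trade-off, here is a minimal sketch in Python. Everything in it is an assumption for illustration: the function name `lost_seconds`, the outage durations, the reboot time, and the checkpoint interval are made up, not measurements from our cluster. It assumes that an outage shorter than the UPS runtime costs nothing, that a longer outage costs its duration plus a reboot, and that a node with no UPS additionally redoes half a checkpoint interval on average.

```python
def lost_seconds(outages, ups_runtime, checkpoint_interval, reboot_time):
    """Estimate compute time lost per node, in seconds.

    outages: list of outage durations in seconds (hypothetical data)
    ups_runtime: seconds the UPS can carry the node (0 means no UPS)
    checkpoint_interval: seconds between checkpoints
    reboot_time: seconds to bring a node back into service
    """
    total = 0.0
    for duration in outages:
        if duration <= ups_runtime:
            # The UPS rides through the outage, so nothing is lost.
            continue
        # The node goes down for the outage and then has to reboot.
        total += duration + reboot_time
        if ups_runtime == 0:
            # No warning before power loss: assume that, on average,
            # half a checkpoint interval of work has to be redone.
            total += checkpoint_interval / 2
    return total


# Hypothetical year: two blips, two sub-minute outages, one long outage.
outages = [1, 1, 30, 45, 1800]
without_ups = lost_seconds(outages, 0, 3600, 300)
with_ups = lost_seconds(outages, 120, 3600, 300)
print(without_ups, with_ups)
```

With these made-up inputs the long outage costs about the same either way; almost all of the difference comes from the short outages that the UPS absorbs. Plug in your own outage history to see whether that holds for your site.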
Whether it is one large UPS or many small ones (a la Google) depends on your setup. One large one is easy to manage, but it is expensive (and noisy! Our Liebert emits a terrible high-frequency noise). The small ones (one per node) are a lot cheaper, but there is a lot of clutter unless you can do something similar to Google, and what they use aren't commodity parts.
Steve