On 1/11/19 7:59 AM, Richard Chang wrote:
Hi,
I would like to know if we have, or can make (or prepare), a USB-bootable OS that we can boot on a cluster and its nodes to test all its functionality.

The purpose of this is to boot a new or existing cluster to check its health, including the InfiniBand network, any cards, local hard disks, memory, etc., so that I don't have to disturb the existing OS and its configuration.

If possible, it would be nice to boot the compute nodes from the master node.

Does anyone know of any pre-existing distribution that will do the job? Or know how to do it with CentOS or Ubuntu?

FWIW: this is one of the use cases of https://github.com/joelandman/nyble . It works with CentOS, Debian, and Ubuntu (though I've not pushed the 18.04.1 changes yet).

I have a rudimentary USB target I was going to clean up soon, and the images can be centrally booted from a PXE server and can pull and run scripts post boot.
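
As an illustration of the kind of script a node could pull and run post boot, here is a minimal Python sketch of a health report covering memory, local disks, and InfiniBand port state. It is not part of nyble or tiburon; the function names and report layout are hypothetical. It assumes a Linux image that exposes /proc/meminfo, /sys/block, and (when an InfiniBand driver is loaded) /sys/class/infiniband.

```python
import json
from pathlib import Path


def read_mem_total_kib(meminfo_path="/proc/meminfo"):
    """Return MemTotal in KiB from a /proc/meminfo-style file, or None."""
    try:
        with open(meminfo_path) as fh:
            for line in fh:
                if line.startswith("MemTotal:"):
                    return int(line.split()[1])
    except OSError:
        pass  # file missing or unreadable: report "unknown"
    return None


def list_block_devices(sys_block="/sys/block"):
    """Return {device: size_in_bytes} for entries under /sys/block."""
    devices = {}
    root = Path(sys_block)
    if not root.is_dir():
        return devices
    for dev in sorted(root.iterdir()):
        size_file = dev / "size"
        if size_file.is_file():
            # The kernel reports this size in 512-byte sectors.
            devices[dev.name] = int(size_file.read_text().strip()) * 512
    return devices


def list_ib_port_states(sys_ib="/sys/class/infiniband"):
    """Return {"device/port": state} for InfiniBand ports, if any exist."""
    states = {}
    root = Path(sys_ib)
    if not root.is_dir():
        return states  # no InfiniBand driver loaded, or no HCA present
    for dev in sorted(root.iterdir()):
        ports = dev / "ports"
        if not ports.is_dir():
            continue
        for port in sorted(ports.iterdir()):
            state_file = port / "state"
            if state_file.is_file():
                # Typical content looks like "4: ACTIVE".
                states[f"{dev.name}/{port.name}"] = state_file.read_text().strip()
    return states


def build_report():
    """Collect the three checks into one dictionary (hypothetical layout)."""
    return {
        "mem_total_kib": read_mem_total_kib(),
        "block_devices": list_block_devices(),
        "ib_port_states": list_ib_port_states(),
    }


if __name__ == "__main__":
    print(json.dumps(build_report(), indent=2))
```

Run on each node, this prints one JSON document that could be sent back to the master node for comparison. Note that /sys/block also lists loop and RAM devices, so a real check would probably filter those out.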

It runs in RAM, and you can modify the distributions to your heart's content. I have a few private repos here which have NVIDIA + MLNX + other drivers and related bits already built in.

I've set up many systems with this, tying it together with https://github.com/joelandman/tiburon for boot control. This was originally used at Scalable Informatics when we were alive, and has evolved significantly since then.

If you want a simple pure USB distro for this, try SystemRescueCD, though I don't think it supports InfiniBand or most drivers.


--

Joe Landman
e: joe.land...@gmail.com
t: @hpcjoe
w: https://scalability.org
g: https://github.com/joelandman
l: https://www.linkedin.com/in/joelandman

_______________________________________________
Beowulf mailing list, Beowulf@beowulf.org sponsored by Penguin Computing
To change your subscription (digest mode or unsubscribe) visit 
http://www.beowulf.org/mailman/listinfo/beowulf
