On 1/11/19 7:59 AM, Richard Chang wrote:
Hi,
I would like to know if we have, or can prepare, a USB-bootable OS
that we can boot on a cluster and its nodes to test all of their
functionality.
The purpose of this is to boot a new or existing cluster to check its
health, including the InfiniBand network, any cards, local hard disks,
memory, etc., so that I don't have to disturb the existing OS and its
configuration.
If possible, it would be nice to boot the compute nodes from the
master node.
Does anyone know of a pre-existing distribution that will do the job?
Or know how to do it with CentOS or Ubuntu?
FWIW: this is one of the use cases of
https://github.com/joelandman/nyble . It works with CentOS, Debian, and
Ubuntu (though I've not pushed the 18.04.1 changes yet).
I have a rudimentary USB target I was going to clean up soon. The
images can also be centrally booted from a PXE server, and they can
pull and run scripts post boot.
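
As an illustration of the kind of script a node could pull and run post
boot, here is a minimal Python sketch of a health report. It is not part
of nyble or tiburon. The function names, the expected_mem_kib threshold,
and the root parameter (used only to point the checks at a fake tree for
testing) are assumptions made for this example. It reads only standard
Linux interfaces: /proc/meminfo, /sys/block, and
/sys/class/infiniband/<device>/ports/<n>/state.

```python
import json
import os
import socket


def read_meminfo_kib(meminfo_path="/proc/meminfo"):
    """Return MemTotal in KiB as reported by the kernel, or None."""
    try:
        with open(meminfo_path) as fh:
            for line in fh:
                if line.startswith("MemTotal:"):
                    return int(line.split()[1])
    except OSError:
        pass
    return None


def list_block_devices(sys_block="/sys/block"):
    """Return block device names, skipping loop and ram devices."""
    try:
        names = sorted(os.listdir(sys_block))
    except OSError:
        return []
    return [n for n in names if not n.startswith(("loop", "ram"))]


def infiniband_port_states(sys_ib="/sys/class/infiniband"):
    """Map 'device/port' to the text of its sysfs 'state' file."""
    states = {}
    try:
        devices = sorted(os.listdir(sys_ib))
    except OSError:
        return states  # no InfiniBand stack loaded, or no HCA present
    for dev in devices:
        ports_dir = os.path.join(sys_ib, dev, "ports")
        try:
            ports = sorted(os.listdir(ports_dir))
        except OSError:
            continue
        for port in ports:
            key = f"{dev}/{port}"
            try:
                with open(os.path.join(ports_dir, port, "state")) as fh:
                    states[key] = fh.read().strip()  # e.g. "4: ACTIVE"
            except OSError:
                states[key] = "unreadable"
    return states


def node_report(expected_mem_kib=0, root=""):
    """Collect a small health summary for the node this runs on."""
    mem = read_meminfo_kib(root + "/proc/meminfo")
    ib = infiniband_port_states(root + "/sys/class/infiniband")
    return {
        "hostname": socket.gethostname(),
        "mem_total_kib": mem,
        "mem_ok": mem is not None and mem >= expected_mem_kib,
        "block_devices": list_block_devices(root + "/sys/block"),
        "ib_ports": ib,
        # Healthy here means at least one port exists and all are ACTIVE.
        "ib_ok": bool(ib) and all(s.endswith(": ACTIVE") for s in ib.values()),
    }


if __name__ == "__main__":
    print(json.dumps(node_report(), indent=2))
```

Each node would print one JSON document, which the master node could
collect and compare across the cluster. A real check would likely go
further (SMART data for disks, a memory test, link speed and bandwidth
tests for the fabric); this only shows the shape of such a script.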
It runs in RAM, and you can modify the distributions to your heart's
content. I have a few private repos here which have NVIDIA + MLNX +
other drivers and related bits already built in.
I've set up many systems with this, tying it together with
https://github.com/joelandman/tiburon for boot control. This was
originally used at Scalable Informatics when we were alive, and has
evolved significantly since then.
If you want a simple, pure USB distro for this, try SystemRescueCD,
though I don't think it does InfiniBand, or most drivers.
--
Joe Landman
e: joe.land...@gmail.com
t: @hpcjoe
w: https://scalability.org
g: https://github.com/joelandman
l: https://www.linkedin.com/in/joelandman
_______________________________________________
Beowulf mailing list, Beowulf@beowulf.org sponsored by Penguin Computing
To change your subscription (digest mode or unsubscribe) visit
http://www.beowulf.org/mailman/listinfo/beowulf