Hi,
In the same vein, Benoit (cc'd on this mail) created an automated way to
install a whole cluster based on CentOS/Salt/Slurm. Take a look here:
https://github.com/oxedions/banquise
On my side, I play with LXD to automate all this stuff (but I have also
thought about doing it with Singularity). The main idea is to have a
stable OS on the hardware that you never need to (re-)install, and only
some containers that you can move or resize to fit your needs. I do not
know exactly how, but IMHO our work should move toward this kind of
DevOps practice, as in the cloud world. Currently I have some Salt recipes
to orchestrate hardware reinstallation for some nodes (not all my
clusters), and then I automatically apply other formulas to deploy the
containers based on some specifications.
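
To illustrate the idea of deploying containers from specifications, here
is a minimal Python sketch. It is not the Salt formulas mentioned above:
the spec format, the container names and the lxd_commands helper are
assumptions made up for the example. It only builds and prints the lxc
commands (launch, then config set for limits.cpu / limits.memory) and
does not run anything.

```python
import shlex


def lxd_commands(spec):
    """Return the LXD CLI commands that would realise one container spec.

    `spec` is a hypothetical dict with "name", "image" and an optional
    "limits" mapping (e.g. {"cpu": 4, "memory": "8GB"}).
    """
    name = spec["name"]
    # Create and start the container from the given image.
    cmds = [["lxc", "launch", spec["image"], name]]
    # Resizing a container is just a matter of changing its limits.
    for key, value in sorted(spec.get("limits", {}).items()):
        cmds.append(["lxc", "config", "set", name, f"limits.{key}", str(value)])
    return cmds


# Hypothetical specifications; names and sizes are illustrative only.
specs = [
    {"name": "compute01", "image": "images:centos/7",
     "limits": {"cpu": 4, "memory": "8GB"}},
    {"name": "login01", "image": "images:centos/7"},
]

for spec in specs:
    for cmd in lxd_commands(spec):
        print(shlex.join(cmd))
```

Keeping the command generation separate from execution makes it easy to
review what would change before touching a node.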
Best regards,
Remy
On 29/09/2016 at 13:33, Olli-Pekka Lehto wrote:
We have our latest cluster software stack for a distributed set of
clusters built on Ansible:
https://github.com/CSC-IT-Center-for-Science/fgci-ansible
A recent presentation at the SLURM User Group on Ansiblizing SLURM:
https://gitpitch.com/CSC-IT-Center-for-Science/ansible-role-slurm/gitpitch
I also see benefits in being able to share playbooks and collaborate
on improving them with other teams in our organization and the
universities, even ones working in non-HPC areas.
Best regards,
Olli-Pekka
--
Olli-Pekka Lehto
Development Manager
Computing Platforms
CSC - IT Center for Science Ltd.
E-Mail: olli-pekka.le...@csc.fi
Tel: +358 50 381 8604
skype: oplehto // twitter: ople
------------------------------------------------------------------------
*From: *"Craig Andrew" <cband...@wi.mit.edu>
*To: *"Tim Cutts" <t...@sanger.ac.uk>
*Cc: *beowulf@beowulf.org
*Sent: *Wednesday, 28 September, 2016 18:01:59
*Subject: *Re: [Beowulf] more automatic building
I agree with Tim.
We are finishing up an Ansible install and it has worked well for us.
Initially, we used it internally to help standardize our cluster
builds, but it has many more uses. We recently used it to
provision a VM that we saved off and uploaded to Amazon for
building an AMI. You can also use it to change attributes on your
running systems. I have used Cobbler in the past and it works
well, too. I just find Ansible to be a little easier.
Good luck,
Craig
Craig Andrew
Manager of Systems Administration
Whitehead Institute for Biomedical Research
------------------------------------------------------------------------
*From: *"Tim Cutts" <t...@sanger.ac.uk>
*To: *"Mikhail Kuzminsky" <mikk...@mail.ru>, beowulf@beowulf.org
*Sent: *Wednesday, September 28, 2016 10:46:41 AM
*Subject: *Re: [Beowulf] more automatic building
Any number of approaches will work. When I used to do this years
ago (I've long since passed on the technical side) I'd PXE boot,
partition the hard disk and set up a provisioning network and base
OS install using the Debian FAI (Fully Automated Install) system,
and then use cfengine to configure the machine once it had come up in
that minimal state. This approach was used across the board for
all of our Linux boxes, from Linux desktops to database servers to
HPC compute nodes.
These days the team uses tools like Cobbler and Ansible to achieve
the same thing. There are lots of ways to do it, but the principle
is the same.
Tim
--
Head of Scientific Computing
Wellcome Trust Sanger Institute
On 28/09/2016, 15:34, "Beowulf on behalf of Mikhail Kuzminsky"
<beowulf-boun...@beowulf.org on behalf of mikk...@mail.ru> wrote:
I have always worked with very small HPC clusters and built them
manually (each server).
But what is reasonable to do for clusters containing some tens
or hundreds of nodes?
With modern Xeon (or Xeon Phi KNL) and IB EDR, of course, during
the next year for example.
There are some automatic systems like OSCAR or even ROCKS.
But it looks like ROCKS doesn't support modern interconnects,
and there may be problems
with OSCAR versions supporting systemd-based distributions
like CentOS 7. For next year,
is it reasonable to wait for a new OSCAR version, or for something else?
Mikhail Kuzminsky,
Zelinsky Institute of Organic Chemistry RAS,
Moscow
-- The Wellcome Trust Sanger Institute is operated by Genome
Research Limited, a charity registered in England with number
1021457 and a company registered in England with number 2742969,
whose registered office is 215 Euston Road, London, NW1 2BE.
_______________________________________________
Beowulf mailing list, Beowulf@beowulf.org sponsored by Penguin
Computing
To change your subscription (digest mode or unsubscribe) visit
http://www.beowulf.org/mailman/listinfo/beowulf
--
Rémy Dernat
Ingénieur d'Etudes
MBB/ISE-M