Hi,

In the same vein, Benoit (cc'd on this mail) created an automated way to install a whole cluster based on CentOS/Salt/Slurm... Take a look here: https://github.com/oxedions/banquise

On my side, I am playing with LXD to automate all this (though I have thought about doing it with Singularity too). The main idea is to have a stable OS on the hardware that you never need to (re-)install, and then just some containers on top that you can move or resize to fit your needs. I do not know exactly how yet, but IMHO our work should move toward this kind of DevOps approach, as in the cloud world. Currently I have some Salt recipes to orchestrate hardware reinstallation for some nodes (not all my clusters), and then I automatically apply other formulas to deploy the containers based on some specifications.
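A rough sketch of what such a Salt formula could look like, driving LXD through the lxc CLI (the state file name, container name, image and CPU limit below are all hypothetical, just to illustrate the idea):

```yaml
# deploy_node.sls -- hypothetical Salt state; adapt names to your site
compute01_launch:
  cmd.run:
    - name: lxc launch images:centos/7 compute01
    - unless: lxc info compute01        # only create the container once

compute01_resize:
  cmd.run:
    - name: lxc config set compute01 limits.cpu 8
    - require:
      - cmd: compute01_launch
```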


Best regards,

Remy



On 29/09/2016 at 13:33, Olli-Pekka Lehto wrote:
We have our latest cluster software stack for a distributed set of clusters built on Ansible:
https://github.com/CSC-IT-Center-for-Science/fgci-ansible

A recent presentation at the SLURM User Group on Ansiblizing SLURM:
https://gitpitch.com/CSC-IT-Center-for-Science/ansible-role-slurm/gitpitch
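For a flavour of what applying such a role looks like, a minimal playbook might be (the group name here is made up; see the fgci-ansible repository for the real playbooks):

```yaml
# site.yml -- minimal sketch, not taken from the repository
- hosts: compute_nodes
  become: true
  roles:
    - role: ansible-role-slurm
```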

I also see benefits in being able to share playbooks and collaborate on improving them with other teams in our organization and the universities, even ones working in non-HPC areas.

Best regards,
Olli-Pekka
--
Olli-Pekka Lehto
Development Manager
Computing Platforms
CSC - IT Center for Science Ltd.
E-Mail: olli-pekka.le...@csc.fi
Tel: +358 50 381 8604
skype: oplehto // twitter: ople

------------------------------------------------------------------------

    *From: *"Craig Andrew" <cband...@wi.mit.edu>
    *To: *"Tim Cutts" <t...@sanger.ac.uk>
    *Cc: *beowulf@beowulf.org
    *Sent: *Wednesday, 28 September, 2016 18:01:59
    *Subject: *Re: [Beowulf] more automatic building

    I agree with Tim.

    We are finishing up an Ansible install and it has worked well for us.

    Initially, we used it internally to help standardize our cluster
    builds, but it has many more uses. We recently used it to
    provision a VM that we saved off and uploaded to Amazon to
    build an AMI. You can also use it to change attributes on your
    running systems. I have used Cobbler in the past and it works
    well, too; I just find Ansible to be a little easier.

    Good luck,
    Craig

    Craig Andrew
    Manager of Systems Administration
    Whitehead Institute for Biomedical Research

    ------------------------------------------------------------------------
    *From: *"Tim Cutts" <t...@sanger.ac.uk>
    *To: *"Mikhail Kuzminsky" <mikk...@mail.ru>, beowulf@beowulf.org
    *Sent: *Wednesday, September 28, 2016 10:46:41 AM
    *Subject: *Re: [Beowulf] more automatic building

    Any number of approaches will work.  When I used to do this years
    ago (I've long since passed on the technical side) I'd PXE boot,
    partition the hard disk, set up a provisioning network and base
    OS install using the Debian FAI (Fully Automated Install) system,
    and then use cfengine to configure the machine once it had come up
    in that minimal state.  This approach was used across the board for
    all of our Linux boxes, from Linux desktops to database servers to
    HPC compute nodes.

    These days the team uses tools like cobbler and ansible to achieve
    the same thing. There are lots of ways to do it, but the principle
    is the same.

    Tim

--
    Head of Scientific Computing

    Wellcome Trust Sanger Institute

    On 28/09/2016, 15:34, "Beowulf on behalf of Mikhail Kuzminsky"
    <beowulf-boun...@beowulf.org <mailto:beowulf-boun...@beowulf.org>
    on behalf of mikk...@mail.ru <mailto:mikk...@mail.ru>> wrote:

        I have always worked with very small HPC clusters and built them
        manually (server by server).
        But what is reasonable to do for clusters containing some tens
        or hundreds of nodes?
        Of course with modern Xeons (or Xeon Phi KNL) and IB EDR, during
        the next year for example.
        There are some automatic systems like OSCAR or even ROCKS.

        But it looks like ROCKS doesn't support modern interconnects,
        and there may be problems
        with OSCAR versions supporting systemd-based distributions
        like CentOS 7. For next year -
        is it reasonable to wait for a new OSCAR version, or for something else?

        Mikhail Kuzminsky,
        Zelinsky Institute of Organic Chemistry RAS,
        Moscow


    -- The Wellcome Trust Sanger Institute is operated by Genome
    Research Limited, a charity registered in England with number
    1021457 and a company registered in England with number 2742969,
    whose registered office is 215 Euston Road, London, NW1 2BE.
    _______________________________________________
    Beowulf mailing list, Beowulf@beowulf.org sponsored by Penguin
    Computing
    To change your subscription (digest mode or unsubscribe) visit
    http://www.beowulf.org/mailman/listinfo/beowulf





--
Rémy Dernat
Ingénieur d'Etudes
MBB/ISE-M
