Yeah, we've pinged Intel/Whamcloud about upgrade paths, as we wanted to know what the recommended procedure is.

Sure. We have three systems that we want to upgrade: one that is 1 PB and two that are 5 PB each.  I will describe just one and assume that everything scales linearly with size; they all have the same hardware.

The head nodes are Dell R620s, while the shelves are M3420 (MDS) and M3260 (OSS).  The MDT is 2.2T, with 466G and 268M inodes used.  Each OST is 30T, with each OSS hosting six.  The filesystem itself is 93% full.

-Paul Edmon-


On 07/23/2018 01:58 PM, Jeff Johnson wrote:
Paul,

How big are your ldiskfs volumes? What type of underlying hardware are they on? Running e2fsck (ldiskfs-aware) is wise and can be done in parallel. It could be done within a couple of days; the time all depends on the size and the underlying hardware.
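
As a rough illustration of the parallel approach, here is a minimal sketch using Python's standard library. The device paths are hypothetical placeholders for your own MDT/OST targets, and you would want the ldiskfs-aware e2fsck from Whamcloud's e2fsprogs on the servers:

#!/usr/bin/env python3
# Hedged sketch: run read-only e2fsck passes over several ldiskfs targets
# in parallel. The device paths are hypothetical -- substitute your own
# MDT/OST block devices. -f forces a full check even if the fs looks
# clean; -n opens the device read-only and answers "no" to every prompt,
# so this pass only reports problems without changing anything.
import subprocess
from concurrent.futures import ThreadPoolExecutor

TARGETS = ["/dev/mapper/mdt0", "/dev/mapper/ost0", "/dev/mapper/ost1"]

def check(dev):
    log = "/tmp/e2fsck-%s.log" % dev.replace("/", "_")
    with open(log, "w") as out:
        rc = subprocess.call(["e2fsck", "-fn", dev],
                             stdout=out, stderr=subprocess.STDOUT)
    return dev, rc  # 0 = clean; nonzero means e2fsck flagged something

# One worker per target is fine if the targets sit on separate spindles;
# throttle max_workers if they share a backend.
with ThreadPoolExecutor(max_workers=len(TARGETS)) as pool:
    for dev, rc in pool.map(check, TARGETS):
        print("%s: exit %d" % (dev, rc))

Running the same pass before and after the upgrade gives you a before/after comparison of anything e2fsck flags.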

Going from 2.5.34 to 2.10.4 is a significant jump. I would make sure there isn't a step upgrade advised. I know there have been step upgrades in the past; I'm not sure about going between these two particular versions.

--Jeff

On Mon, Jul 23, 2018 at 10:34 AM, Paul Edmon <ped...@cfa.harvard.edu> wrote:

    Yeah, we've found out firsthand that it's problematic, as we have
    been seeing issues :).  Hence the urge to upgrade.

    We've begun exploring this, but we wanted to reach out to other
    people who may have gone through the same thing to get their
    thoughts.  We also need to figure out how significant an outage
    this will be: if the upgrade takes a day or two of full outage,
    that is more acceptable than a week.  We also wanted to know
    whether people had experienced data loss/corruption in the
    process, or hit any other kinks.

    We were planning on playing around on VMs to test the upgrade
    path before committing to upgrading our larger systems.  One of
    the questions we had, though, was whether we needed to run e2fsck
    before/after the upgrade, as that could add significant time to
    the outage.

    -Paul Edmon-


    On 07/23/2018 01:18 PM, Jeff Johnson wrote:
    You're running 2.10.4 clients against 2.5.34 servers? I believe
    there are notable lnet attrs that don't exist in 2.5.34. Maybe a
    Whamcloud wiz will chime in, but I think that version mismatch
    might be problematic.
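
    One cheap sanity check, as a sketch: collect the running version
    from a few nodes so any skew is visible at a glance. The hostnames
    are hypothetical, and it assumes passwordless ssh and that
    lctl get_param version behaves the same on your builds:

    #!/usr/bin/env python3
    # Hedged sketch: gather the running Lustre version from a handful
    # of nodes so client/server skew stands out. Hostnames below are
    # hypothetical; assumes passwordless ssh from the admin node.
    import subprocess

    NODES = ["mds01", "oss01", "client01"]

    for node in NODES:
        out = subprocess.run(
            ["ssh", node, "lctl", "get_param", "-n", "version"],
            capture_output=True, text=True)
        print(node, ":", out.stdout.strip() or out.stderr.strip())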

    To be conservative, you can do a testbed upgrade to test taking an
    ldiskfs volume from 2.5.34 to 2.10.4.
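
    A minimal sketch of what that testbed could look like, driving the
    usual CLI tools from Python; the backing file, loop device, and
    fsname are all hypothetical, and it assumes Lustre server packages
    with ldiskfs support are already installed in the VM:

    #!/usr/bin/env python3
    # Hedged sketch: build a throwaway file-backed ldiskfs MDT under
    # the old (2.5.34) server stack, then upgrade the Lustre packages
    # in the VM and confirm the same target still mounts. Paths, size,
    # and fsname are hypothetical.
    import subprocess

    IMG = "/var/tmp/testfs-mdt0.img"   # hypothetical backing file
    LOOP = "/dev/loop0"                # hypothetical loop device
    MNT = "/mnt/testfs-mdt0"

    def run(*cmd):
        print("+", " ".join(cmd))
        subprocess.check_call(cmd)

    # 1. Create a small backing file and attach it to a loop device.
    run("truncate", "-s", "2G", IMG)
    run("losetup", LOOP, IMG)

    # 2. Format a combined MGS/MDT (done once, under the old release).
    run("mkfs.lustre", "--fsname=testfs", "--mgs", "--mdt",
        "--index=0", "--backfstype=ldiskfs", LOOP)

    # 3. Mount/unmount; rerun this step after upgrading the server
    #    packages to verify the on-disk target comes up cleanly under
    #    the new release.
    run("mkdir", "-p", MNT)
    run("mount", "-t", "lustre", LOOP, MNT)
    run("umount", MNT)

    An e2fsck -fn pass on the loop device before and after the remount
    would mirror the check you'd run on the production targets.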

    --Jeff


    On Mon, Jul 23, 2018 at 10:05 AM, Paul Edmon <ped...@cfa.harvard.edu> wrote:

        My apologies, I meant 2.5.34, not 2.6.34. We'd like to get up
        to 2.10.4, which is what our clients are running.  Recently we
        upgraded our cluster to CentOS 7, which necessitated the client
        upgrade.  Our storage servers, though, stayed behind on 2.5.34.

        -Paul Edmon-


        On 07/23/2018 01:00 PM, Jeff Johnson wrote:
        Paul,

        2.6.34 is a kernel version. What version of Lustre are you
        at now? Some updates are easier than others.

        --Jeff

        On Mon, Jul 23, 2018 at 8:59 AM, Paul Edmon <ped...@cfa.harvard.edu> wrote:

            We have some old large-scale Lustre installs that are
            running 2.6.34, and we want to get them up to the latest
            version of Lustre.  I was curious whether people in this
            group have any experience with doing this and could share
            it.  How do you handle upgrades like this?  How much time
            do they take?  What are the pitfalls?  How do you manage
            them with minimal customer interruption?  Should we just
            write off upgrading and stand up new servers on the
            correct version (in which case we would need to transfer
            several PB of data over to the new system)?

            Thanks for your wisdom.

            -Paul Edmon-

--
------------------------------
Jeff Johnson
Co-Founder
Aeon Computing

jeff.john...@aeoncomputing.com
www.aeoncomputing.com
t: 858-412-3810 x1001   f: 858-412-3845
m: 619-204-9061

4170 Morena Boulevard, Suite C - San Diego, CA 92117

High-Performance Computing / Lustre Filesystems / Scale-out Storage


_______________________________________________
Beowulf mailing list, Beowulf@beowulf.org sponsored by Penguin Computing
To change your subscription (digest mode or unsubscribe) visit 
http://www.beowulf.org/mailman/listinfo/beowulf
