Yeah, that's my preferred solution, as the hardware we have is nearing
end of life. In that case, though, we would have to coordinate the
cutover of the data to the new storage and forklift all those PBs over
to the new system, which brings its own unique challenges. Plus you
also have to have the budget to buy the new hardware.
Right now we are just exploring our options.
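(Purely as an illustration of what that forklift might look like, not a plan: a minimal sketch that walks the top-level directories on the old filesystem and drives a bounded pool of rsync workers against the new one. The mount points and worker count are placeholders, not our actual layout.)

import subprocess
from concurrent.futures import ThreadPoolExecutor
from pathlib import Path

OLD = Path("/mnt/lustre-old")   # hypothetical mount point of the old filesystem
NEW = Path("/mnt/lustre-new")   # hypothetical mount point of the new filesystem
WORKERS = 8                     # number of concurrent rsync streams

def copy_tree(subdir: Path) -> int:
    """Rsync one top-level directory onto the new filesystem; return rsync's exit code."""
    dest = NEW / subdir.name
    cmd = ["rsync", "-aHAX", "--numeric-ids", f"{subdir}/", f"{dest}/"]
    return subprocess.run(cmd).returncode

if __name__ == "__main__":
    tops = sorted(d for d in OLD.iterdir() if d.is_dir())
    with ThreadPoolExecutor(max_workers=WORKERS) as pool:
        for subdir, rc in zip(tops, pool.map(copy_tree, tops)):
            print(f"{subdir.name}: exit {rc}")

In practice you would also want restartability, a verification pass, and some throttling against production load, but the overall shape is the same.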
-Paul Edmon-
On 07/24/2018 04:52 AM, Jörg Saßmannshausen wrote:
Hi Paul,
with a file system that is 93% full, in my humble opinion it would make sense to
increase the underlying hardware capacity as well. The reasoning behind it is
that any given file system usually accumulates more data over time, so if
there is already a downtime, I would increase its size as well.
I would rather have a somewhat longer downtime and come out with a new version
of Lustre (of which I know little) and more capacity that will last longer,
than only upgrade Lustre and then run out of disc capacity a bit later.
It also means that in your case you could simply install the new system, test
it, and then migrate the data over. Depending on how it is set up, you could
even do that in stages.
As you mentioned 3 different Lustre servers, you could, for example, start with
the biggest one and use new hardware there. The capacity freed from the now-
obsolete hardware could then be utilized for the other systems.
Of course, I don't know your hardware etc.
Just some ideas from a hot London 8-)
Jörg
On Monday, 23 July 2018 at 14:11:40 BST, Paul Edmon wrote:
Yeah, we've pinged Intel/Whamcloud to find out upgrade paths, as we wanted
to know what the recommended procedure is.
Sure. So we have 3 systems that we want to upgrade: 1 that is 1 PB and 2
that are 5 PB each. I will just give you a description of one and
assume that everything scales linearly with size. They all have the
same hardware.
The head nodes are Dell R620s, while the shelves are M3420 (MDS) and
M3260 (OSS). The MDT is 2.2T, with 466G used and 268M inodes used. Each
OST is 30T, with each OSS hosting 6. The filesystem itself is 93% full.
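For a rough sense of scale (back-of-the-envelope only, assuming decimal TB/PB and uniform 30T OSTs):

# Back-of-the-envelope scale estimate, assuming decimal TB/PB and uniform 30T OSTs.
OST_TB = 30        # capacity per OST, from the description above
OSTS_PER_OSS = 6   # OSTs hosted per OSS

for name, pb in [("1 PB system", 1), ("5 PB system", 5)]:
    osts = pb * 1000 / OST_TB          # ~33 OSTs at 1 PB, ~167 at 5 PB
    oss_nodes = osts / OSTS_PER_OSS    # ~6 OSS nodes vs ~28
    print(f"{name}: ~{osts:.0f} OSTs across ~{oss_nodes:.0f} OSS nodes")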
-Paul Edmon-
On 07/23/2018 01:58 PM, Jeff Johnson wrote:
Paul,
How big are your ldiskfs volumes? What type of underlying hardware are
they on? Running e2fsck (the ldiskfs-aware version) is wise and can be done in
parallel. It could be done within a couple of days; the time all depends on
the size and the underlying hardware.
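Purely as a sketch of the "in parallel" part (the device paths and counts below are made up, and it assumes the ldiskfs-aware e2fsprogs is installed and the targets are unmounted), something along these lines can drive read-only checks concurrently:

import subprocess
from concurrent.futures import ThreadPoolExecutor

DEVICES = [f"/dev/mapper/ost{i:04d}" for i in range(6)]  # hypothetical OST block devices
WORKERS = 6                                              # how many checks run at once

def check(dev: str) -> int:
    # -f: force a check even if the filesystem looks clean; -n: read-only, answer "no" to all fixes
    return subprocess.run(["e2fsck", "-f", "-n", dev]).returncode

with ThreadPoolExecutor(max_workers=WORKERS) as pool:
    for dev, rc in zip(DEVICES, pool.map(check, DEVICES)):
        print(f"{dev}: e2fsck exit code {rc}")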
Going from 2.5.34 to 2.10.4 is a significant jump. I would make sure
there isn't a step upgrade advised. I know there have been step
upgrades in the past; I'm not sure about going between these two versions.
--Jeff
On Mon, Jul 23, 2018 at 10:34 AM, Paul Edmon <ped...@cfa.harvard.edu> wrote:
Yeah, we've found out firsthand that it's problematic, as we have
been seeing issues :). Hence the urge to upgrade.
We've begun exploring this, but we wanted to reach out to other
people who may have gone through the same thing to get their
thoughts. We also need to figure out how significant an outage
this will be: if it takes a day or two of full outage to do
the upgrade, that is more acceptable than a week. We also wanted
to know if people had experienced data loss/corruption in the
process, and any other kinks.
We were planning on playing around on VMs to test the upgrade
path before committing to upgrading our larger systems. One of
the questions we had, though, was whether we need to run e2fsck
before/after the upgrade, as waiting for that to complete could add
significant time to the outage.
-Paul Edmon-
On 07/23/2018 01:18 PM, Jeff Johnson wrote:
You're running 2.10.4 clients against 2.5.34 servers? I believe
there are notable LNet attributes that don't exist in 2.5.34. Maybe a
Whamcloud wiz will chime in, but I think that version mismatch
might be problematic.
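If you want to confirm what is actually deployed before deciding, a quick sketch like this (hostnames are placeholders; it assumes passwordless ssh and that lctl is in the remote PATH) compares the version each node reports via lctl get_param version:

import subprocess

NODES = ["mds01", "oss01", "oss02", "client01"]  # hypothetical hostnames

for node in NODES:
    result = subprocess.run(
        ["ssh", node, "lctl", "get_param", "-n", "version"],
        capture_output=True, text=True,
    )
    print(f"{node}: {result.stdout.strip() or result.stderr.strip()}")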
You can do a testbed upgrade to test taking an ldiskfs volume from
2.5.34 to 2.10.4, just to be conservative.
--Jeff
On Mon, Jul 23, 2018 at 10:05 AM, Paul Edmon <ped...@cfa.harvard.edu> wrote:
My apologies, I meant 2.5.34, not 2.6.34. We'd like to get up
to 2.10.4, which is what our clients are running. Recently we
upgraded our cluster to CentOS 7, which necessitated the client
upgrade. Our storage servers, though, stayed behind on 2.5.34.
-Paul Edmon-
On 07/23/2018 01:00 PM, Jeff Johnson wrote:
Paul,
2.6.34 is a kernel version. What version of Lustre are you
at now? Some updates are easier than others.
--Jeff
On Mon, Jul 23, 2018 at 8:59 AM, Paul Edmon <ped...@cfa.harvard.edu> wrote:
We have some old large-scale Lustre installs that are
running 2.6.34, and we want to get these up to the latest
version of Lustre. I was curious whether people in this
group have any experience with doing this and could share
it. How do you handle upgrades like this? How much time
does it take? What are the pitfalls? How do you manage it
with minimal customer interruption? Should we just write off
upgrading and stand up new servers that are on the correct
version (in which case we would need to transfer the several
PBs' worth of data over to the new system)?
Thanks for your wisdom.
-Paul Edmon-
jeff.john...@aeoncomputing.com
www.aeoncomputing.com
t: 858-412-3810 x1001  f: 858-412-3845
m: 619-204-9061
4170 Morena Boulevard, Suite C - San Diego, CA 92117
High-Performance Computing / Lustre Filesystems / Scale-out Storage
_______________________________________________
Beowulf mailing list, Beowulf@beowulf.org sponsored by Penguin Computing
To change your subscription (digest mode or unsubscribe) visit
http://www.beowulf.org/mailman/listinfo/beowulf