Re: [Beowulf] RAID5 rebuild, remount with write without reboot?

Joe Landman Tue, 05 Sep 2017 11:07:59 -0700


On 09/05/2017 01:28 PM, mathog wrote:

Short form:
An 8 disk (all 2 TB SATA) RAID5 on an LSI MR-USAS2 SuperMicro controller (lspci shows "LSI Logic / Symbios Logic MegaRAID SAS 2008 [Falcon]") system was long ago configured with a small partition of one disk as /boot and logical volumes for / (root) and /home on a single large virtual drive on the RAID. Due to disk problems and an own goal (see below) the array went into a degraded=1 state (as reported by megacli) and write locked both root and home. When the failed disk was replaced and the rebuild completed, both were still write locked. "mount -a" didn't help in either case. A reboot brought them up normally, but ideally that should not have been necessary. Is there a method to remount the logical volumes writable that does not require a reboot?


Generally the controller firmware (FW) would write lock it.  A

    mount -o remount,rw $path

may not clear this.  I've found that I often need to do something akin to

    echo "- - -" > /sys/class/scsi_host/host0/scan

for each SCSI host bus. Another thing to try is to remove the driver and modprobe it again. However, as your /boot and / are on it, this probably won't work well.
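
The per-host rescan above can be looped over every host in one go. This is only a sketch: the function name rescan_scsi_hosts and its optional directory argument are made up for illustration (the argument exists so the loop can be dry-run against a scratch directory), and writing to the real sysfs files needs root.

```shell
#!/bin/sh
# Ask every SCSI host to rescan its bus.
# $1 is a hypothetical override for the sysfs directory (handy for dry runs);
# it defaults to the real location used in the command above.
rescan_scsi_hosts() {
    root="${1:-/sys/class/scsi_host}"
    for scan in "$root"/host*/scan; do
        # If no hosts exist the glob stays unexpanded, so skip it.
        [ -e "$scan" ] || continue
        # "- - -" is the wildcard for channel, target and LUN.
        echo "- - -" > "$scan" && echo "rescanned ${scan%/scan}"
    done
}
```

Run as root with no argument, this touches host0, host1, and so on in turn. Whether the rescan actually clears the write lock depends on the controller and driver, as noted above.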


Reboot has this same effect though, so you did this sort of by default.

Regards,

Joe

Long form:
Periodic testing of the disks inside this array turned up pending sectors with this command:

   smartctl -a  /dev/sda -d sat+megaraid,7

A replacement disk was obtained and the usual replacement method applied:

    megacli -pdoffline -physdrv[64:7] -a0
    megacli -pdmarkmissing -physdrv[64:7] -a0
    megacli -pdprprmv -physdrv[64:7] -a0
    megacli -pdlocate -start -physdrv[64:7] -a0

The disk with the flashing light was physically swapped. smartctl was run again and, unfortunately, its values were unchanged. I had always assumed that the "7" in that smartctl command was a physical slot; it turns out that it is actually the "Device ID". In my defense, the smartctl man page does a very poor job of describing this:

  megaraid,N - [Linux only] the device consists of one or more SCSI/SAS disks
  connected to a MegaRAID controller. The non-negative integer N (in
  the range of 0 to 127 inclusive) denotes which disk on the controller
  is monitored. Use syntax such as:

In this system, unlike the others I had worked on previously, Device IDs and
slots were not 1:1.
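
One way to avoid the slot versus Device ID mix-up is to print the mapping before touching any disk. The sketch below assumes the physical drive listing (for example from "megacli -PDList -a0") contains lines labelled "Slot Number:", "Device Id:" and "Firmware state:" in that order for each drive; the function name slot_device_map is made up for illustration, and the labels should be checked against the output of the installed megacli version.

```shell
#!/bin/sh
# Read a MegaCli physical drive listing on stdin and print one line per drive.
# Assumed input labels: "Slot Number:", "Device Id:", "Firmware state:".
slot_device_map() {
    awk -F': *' '
        /^Slot Number/    { slot = $2 }
        /^Device Id/      { dev = $2 }
        /^Firmware state/ { printf "slot %s -> device ID %s (%s)\n", slot, dev, $2 }
    '
}
```

Typical use would be "megacli -PDList -a0 | slot_device_map". The slot number is what goes into -physdrv[enclosure:slot], while the Device ID is the N in "smartctl -d sat+megaraid,N".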
Anyway, about a nanosecond after this was discovered, the disk at Device ID 7 was marked as Failed by the controller, whereas previously it had been "Online, Spun Up". Ugh. At that point the logical volumes were all set read only and the OS became barely usable, with commands like "more" no longer functioning. Megacli and sshd, thankfully, still worked. Figuring that I had nothing to lose, I removed the replacement disk from slot 7 and put back the original, hopefully still good, disk. That put the system into this state:
slot 4 (device ID 7) failed.
slot 7 (device ID 5) is Offline.

and

megacli -PDOnline -physdrv[64:7] -a0

put it at

slot 4 (device ID 7) failed.
slot 7 (device ID 5) Online, Spun Up
The logical volumes were still read only, but "more" and most other commands now worked again. Megacli still showed the "degraded" value as 1. I'm still not clear how the two "read only" states differed to cause this change.

At that point the failed disk in slot 4 (not 7!) was replaced with the new disk (which had been briefly in slot 7) and it immediately began to rebuild. Something on the order of 48 hours later that rebuild completed, and the controller set "degraded" back to 0. However, the logical volumes were still read only. "mount -a" didn't fix it, so the system was rebooted, which worked.
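
To see which filesystems the kernel still considers read only after a rebuild, remount or rescan, the mount table can be filtered. A minimal sketch, assuming the usual /proc/mounts layout (device, mount point, type, comma-separated options); the function name list_ro_mounts and its optional file argument are made up for illustration. Note that this only shows the filesystem-level flag, not any write lock held by the controller.

```shell
#!/bin/sh
# Print "mountpoint device" for every mount whose option list contains "ro".
# $1 is a hypothetical override for the mounts file; default is /proc/mounts.
list_ro_mounts() {
    awk '{
        n = split($4, opts, ",")
        for (i = 1; i <= n; i++)
            # Match the exact option so "errors=remount-ro" is not counted.
            if (opts[i] == "ro") { print $2, $1; break }
    }' "${1:-/proc/mounts}"
}
```

Running it before and after "mount -o remount,rw" on a given path shows whether the remount took effect.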
We have two of these backup systems. They are supposed to have identical contents but do not. Fixing that is another item on a long todo list. RAID 6 would have been a better choice for this much storage, but it does not look like this card supports it:
RAID0, RAID1, RAID5, RAID00, RAID10, RAID50, PRL 11, PRL 11 with spanning,
  SRL 3 supported, PRL11-RLQ0 DDF layout with no span,
  PRL11-RLQ0 DDF layout with span
That rebuild is far too long for comfort. Had another disk failed in those two days, that would have been it. Neither controller has battery backup, and the one in question is not even on a UPS, so a power glitch could be fatal too. Not a happy thought while record SoCal temperatures persisted throughout the entire rebuild! The systems are in different buildings on the same campus, sharing the same power grid. There are no other backups for most of this data.
Even though the controller shows this system as no longer degraded, should I believe that there was no data loss? I can run checksums on all the files (even though it will take forever) and compare the two systems. But as I said previously, the files were not entirely 1:1, so there are certainly going to be some files on this system which have no match on the other.
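
The checksum comparison described above could be sketched as follows. This assumes GNU find and sha256sum are available on both systems; the function names make_manifest and compare_manifests are made up for illustration. Each system writes a manifest of "hash  path" lines, and the comparison separates files whose contents differ from files that exist on only one side, which matters here because the two systems are known not to be 1:1.

```shell
#!/bin/sh
# Write "hash  ./relative/path" for every regular file under directory $1,
# sorted by path so manifests from two systems line up.
make_manifest() {
    ( cd "$1" && find . -type f -exec sha256sum {} + | LC_ALL=C sort -k 2 )
}

# Compare two manifest files ($1, $2). Files missing from one side are
# reported separately from files whose hashes differ.
compare_manifests() {
    awk '
        {
            hash = $1
            sub(/^[^ ]+  /, "")      # strip "hash  ", leaving the path in $0
        }
        FILENAME == ARGV[1] { first[$0] = hash; next }
        !($0 in first)      { print "only-in-second " $0; next }
        {
            if (first[$0] != hash) print "MISMATCH " $0
            seen[$0] = 1
        }
        END { for (p in first) if (!(p in seen)) print "only-in-first " p }
    ' "$1" "$2" | LC_ALL=C sort
}
```

For example, run "make_manifest /home > a.sum" on one system and the same into b.sum on the other, copy one manifest across, then run "compare_manifests a.sum b.sum". Only the MISMATCH lines point at possible corruption; the only-in lines are the expected drift between the two systems. A mismatch does not say which copy is the good one.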
Regards,

David Mathog
mat...@caltech.edu
Manager, Sequence Analysis Facility, Biology Division, Caltech
_______________________________________________
Beowulf mailing list, Beowulf@beowulf.org sponsored by Penguin Computing
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf


--
Joe Landman
e: joe.land...@gmail.com
t: @hpcjoe
w: https://scalability.org
g: https://github.com/joelandman
l: https://www.linkedin.com/in/joelandman

