On Friday, 03-01-2025 at 09:57 Michael Stone wrote:
> On Thu, Jan 02, 2025 at 06:29:08PM +1100, George at Clug wrote:
> >I once used/tested with RAID 6 and was amazed how long it took to rebuild a
> >swapped-in 3TB hard drive (about 8 hours, if I recall).
(This was using an Intel 8-port SAS/SATA card; sadly I cannot get a battery
backup unit for it anymore, and running without battery backup is not worthwhile.)
>
> Spinning disks are slow; 8 hours for 3TB is about 100MB/s and 8h is
> about how much time I'd expect it to take to write a commodity disk that
> size. It becomes really painful on a 24TB drive (8 times as large, 8
> times as long;
24TB drive - WOW - just thinking about that hurts.
> a fast spinning disk might be able to cut that down to a
> full day). If you want significant speed increases you need an nvme
> drive--but beware: there's a default rate-limit to minimize impact to
> other work. (see /proc/sys/dev/raid/speed_limit_max, probably 200MB/s,
> and you'd want to set it much higher to fully utilize nvme storage.)
I assume the speed_limit* settings are for mdadm? When I get time, I would like
to do some testing with mdadm. I think I did one test run about 10 or 15 years
ago, but now I just work (er, play) with desktop computers, where single drives
and rsync backups are sufficient for my needs. I am too lazy for my own good, I guess.
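For anyone else wanting a safe way to experiment: as I understand it, you can
build a throwaway array out of loop devices instead of real disks. A rough
sketch, with device names and sizes being just examples:

  # create two 1GB backing files and attach them as loop devices
  truncate -s 1G /tmp/disk0.img /tmp/disk1.img
  losetup /dev/loop0 /tmp/disk0.img
  losetup /dev/loop1 /tmp/disk1.img

  # build a RAID 1 array from them and watch the initial sync
  mdadm --create /dev/md0 --level=1 --raid-devices=2 /dev/loop0 /dev/loop1
  cat /proc/mdstat

  # simulate a drive failure and replacement, then watch the rebuild
  mdadm /dev/md0 --fail /dev/loop1
  mdadm /dev/md0 --remove /dev/loop1
  mdadm /dev/md0 --add /dev/loop1
  watch cat /proc/mdstat

  # tear it all down afterwards
  mdadm --stop /dev/md0
  losetup -d /dev/loop0 /dev/loop1

I have not run this recently myself, so treat it as a starting point rather
than a recipe.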
I think one issue with RAID is that most people do not keep watch on drive health,
and it is only when a second drive fails and the whole RAID dies that they
notice there were issues long ago. (Maybe that is just me?)
Does anyone have experience with mdadm and keeping an eye on the health of
your drives and RAID? I am curious whether you get timely notifications that a
drive is unwell (starting to log too many errors), so you can replace it before
it fails outright.
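From the reading linked at the end, it looks like mdadm has a monitor mode and
smartmontools watches the drives themselves. I have not verified this setup end
to end, and the email address is obviously a placeholder:

  # one-off health checks of the array and a member drive
  mdadm --detail /dev/md0
  smartctl -H /dev/sda

  # in /etc/mdadm/mdadm.conf - the distro's mdmonitor/mdadm service
  # runs "mdadm --monitor" and mails this address on events such as
  # Fail, DegradedArray and SpareActive
  MAILADDR admin@example.com

  # in /etc/smartd.conf - smartd scans all drives and mails on SMART
  # trouble (reallocated/pending sectors, failed self-tests)
  DEVICESCAN -a -m admin@example.com

  # one-shot test that the alert mail path actually works
  mdadm --monitor --scan --test --oneshot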
I get yearly power outages due to storms, and though I do have a UPS, sometimes
the servers lose power before a clean shutdown. I need to pay more attention to
battery health and testing, which I do not do; my fault. My recommendations to people:
1) If you use a UPS, test it: put your servers into read-only mode (see the
sketch after this list), then remove mains power and see whether the system
powers down automatically before the batteries fail.
2) If you do backups, occasionally try a full disaster restore from them.
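For step 1, the read-only part can be as simple as remounting the filesystems,
so that a dirty shutdown during the test cannot corrupt anything (a sketch;
adjust the mount points to suit your system):

  # flush pending writes, then remount read-only
  sync
  mount -o remount,ro /
  # now pull mains power from the UPS and watch whether the server
  # shuts itself down cleanly before the batteries give out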
I wonder how well mdadm survives unexpected power outages (or lockups)?
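From what I have read, md re-syncs a dirty array after an unclean shutdown, and
a write-intent bitmap turns that full resync into a quick catch-up of only the
recently written regions. Something like this, which I have not verified myself:

  # add an internal write-intent bitmap to an existing array
  mdadm --grow --bitmap=internal /dev/md0

  # confirm the bitmap is present
  mdadm --detail /dev/md0 | grep -i bitmap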
Having worked with Microsoft operating systems, I was quite concerned about
the possibility of lockups when I started working with Linux servers around
2009. As I think about it now, I have never had a lockup on a Linux server, and
this very much holds true for my home KVM hypervisors and the servers that I
test with. The KVM hypervisors have run 24x7 for about the past 15 years and
have never locked up. Amazing and pleasing.
Some reading for me:
https://www.cyberciti.biz/tips/linux-raid-increase-resync-rebuild-speed.html
The /proc/sys/dev/raid/speed_limit_max file reflects the current “goal” rebuild
speed (in KB/s) for times when there is no non-rebuild activity on an array.
The default is 200,000, i.e. the roughly 200MB/s Michael mentioned above.
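The article shows the limits being read and raised via /proc (or the equivalent
sysctl); roughly:

  # current limits, in KB/s
  cat /proc/sys/dev/raid/speed_limit_min
  cat /proc/sys/dev/raid/speed_limit_max

  # raise the ceiling while a rebuild is running, e.g. for nvme
  echo 500000 > /proc/sys/dev/raid/speed_limit_max

  # equivalent sysctl form (add to /etc/sysctl.conf to persist)
  sysctl -w dev.raid.speed_limit_max=500000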
https://tldp.org/HOWTO/Software-RAID-HOWTO-6.html
6.5 Monitoring RAID arrays
https://docs.redhat.com/en/documentation/red_hat_enterprise_linux/8/html/managing_storage_devices/managing-raid_managing-storage-devices#setting-up-email-notifications-to-monitor-a-raid_managing-raid
21.18. Setting up email notifications to monitor a RAID
https://medium.com/@kahalekar.sunil/how-to-configure-raid-and-monitor-disk-usage-ensuring-timely-notification-via-email-in-case-of-facf4c4d7656
https://www.ionos.com/help/server-cloud-infrastructure/dedicated-server-for-servers-purchased-before-102818/rescue-and-recovery/software-raid-status-monitoring-linux/
George.