On Friday, 03-01-2025 at 09:57 Michael Stone wrote:
> On Thu, Jan 02, 2025 at 06:29:08PM +1100, George at Clug wrote:
> >I once used/tested with RAID 6 and was amazed how long it took to rebuild a 
> >3TB swapped hard drive (about 8 hours, if I recall).

(This was using an Intel 8-port SAS/SATA card; sadly, I can no longer get a
battery backup unit for it, and running without battery backup is not worthwhile.)


> 
> Spinning disks are slow; 8 hours for 3TB is about 100MB/s and 8h is 
> about how much time I'd expect it to take to write a commodity disk that 
> size. It becomes really painful on a 24TB drive (8 times as large, 8 
> times as long; 

 24TB drive - WOW - just thinking about that hurts.

> a fast spinning disk might be able to cut that down to a 
> full day). If you want significant speed increases you need an nvme
> drive--but beware: there's a default rate-limit to minimize impact to 
> other work. (see /proc/sys/dev/raid/speed_limit_max, probably 200MB/s, 
> and you'd want to set it much higher to fully utilize nvme storage.) 

I assume speed_limit* is for mdadm? When I get time, I would like to do some 
testing with mdadm. I think I did one test run about 10 or 15 years ago, but 
now I just work (er, play) with desktop computers, where single drives and rsync 
backups are sufficient for my needs. I am too lazy for my own good, I guess.
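
For anyone else wanting to experiment without spare disks, here is a minimal 
sketch using loop devices; the file names, sizes and md device number are just 
examples:

    # create two 1 GB backing files and attach them as loop devices
    truncate -s 1G /tmp/d0.img /tmp/d1.img
    losetup /dev/loop0 /tmp/d0.img
    losetup /dev/loop1 /tmp/d1.img
    # build a RAID 1 array over the loop devices
    mdadm --create /dev/md0 --level=1 --raid-devices=2 /dev/loop0 /dev/loop1
    # watch the initial sync progress
    cat /proc/mdstat

You can then fail, remove and re-add a device with mdadm --fail / --remove / 
--add to watch a rebuild without risking real data.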

I think one issue with RAID is that most people do not keep watch on drive 
health, and it is only when a second drive fails and the whole array dies that 
they notice there were issues long ago. (Maybe that is just me?)

Does anyone have experience with mdadm and keeping an eye on the health of 
your drives and RAID? I am curious whether you get timely notification that a 
drive is unwell (starting to log too many errors), so you can replace it 
before it fails outright.
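
For what it is worth, my understanding is that the usual approach is to run 
smartd (from smartmontools) alongside mdadm's own monitor, so you hear about 
reallocated/pending sectors before md kicks the drive. A sketch of an 
smartd.conf line; the test schedule and the email address are only placeholder 
examples:

    # /etc/smartd.conf - watch all drives, run a short self-test
    # daily at 02:00, and mail warnings about SMART changes
    DEVICESCAN -a -o on -S on -s (S/../.././02) -m admin@example.com

    # ad-hoc health check of a single drive
    smartctl -H -A /dev/sda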

I get yearly power outages due to storms, and though I do have UPSes, 
sometimes they lose power before a clean shutdown. I need to pay more 
attention to battery health and do testing, which I do not do; my fault. My 
recommendations to people:
1) If you use a UPS, test it: put your servers into read-only mode, then 
remove mains power and see whether the system powers down automatically before 
the batteries fail (a sketch of step 1 follows below).
2) If you do backups, occasionally attempt a full disaster restore from those 
backups.
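
For step 1, something like the following is what I have in mind; a rough 
sketch only:

    # remount the root filesystem read-only before pulling mains power
    mount -o remount,ro /
    # or, more drastically, ask the kernel to emergency-remount
    # all mounted filesystems read-only
    echo u > /proc/sysrq-trigger

Either way, an abrupt battery failure then cannot leave dirty filesystems 
behind.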

I wonder how well mdadm survives unexpected power outages (or lockups)?
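
My understanding (happy to be corrected) is that md itself copes, but after an 
unclean shutdown it must resync the whole array unless a write-intent bitmap 
is present; with an internal bitmap only the regions that were dirty at the 
crash get resynced. A sketch, assuming an existing array at /dev/md0:

    # check whether the array already has a bitmap
    mdadm --detail /dev/md0 | grep -i bitmap
    # add an internal write-intent bitmap to an existing array
    mdadm --grow --bitmap=internal /dev/md0

(Recent mdadm versions add an internal bitmap by default on larger arrays, I 
believe.)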

Having worked with Microsoft operating systems, I was quite concerned about 
the possibility of lockups when I started working with Linux servers around 
2009. Thinking about it now, I have never had a lockup on a Linux server, and 
that very much holds true for the home KVM hypervisors and servers I test 
with. The KVM hypervisors have run 24x7 for about the past 15 years and have 
never locked up. Amazing and pleasing.

Some reading for me:
https://www.cyberciti.biz/tips/linux-raid-increase-resync-rebuild-speed.html
/proc/sys/dev/raid/speed_limit_max is a config file that reflects the current 
“goal” rebuild speed (in KiB/s) for times when no non-rebuild activity is 
happening on an array. The default on current kernels is 200,000, i.e. about 
200 MB/s, which matches Michael's figure above.
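
So per that article, raising the ceiling is just a sysctl; the figure below is 
only an example (values are in KiB/s):

    # show the current values
    sysctl dev.raid.speed_limit_min dev.raid.speed_limit_max
    # raise the ceiling to let a fast (e.g. NVMe) rebuild run flat out
    sysctl -w dev.raid.speed_limit_max=2000000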

https://tldp.org/HOWTO/Software-RAID-HOWTO-6.html
6.5 Monitoring RAID arrays 

https://docs.redhat.com/en/documentation/red_hat_enterprise_linux/8/html/managing_storage_devices/managing-raid_managing-storage-devices#setting-up-email-notifications-to-monitor-a-raid_managing-raid
21.18. Setting up email notifications to monitor a RAID
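
If I read that page correctly, the short version is one line in mdadm.conf 
plus a test alert; the address is a placeholder:

    # /etc/mdadm.conf (Debian: /etc/mdadm/mdadm.conf)
    MAILADDR admin@example.com

    # send a test alert for each array to verify mail delivery
    mdadm --monitor --scan --test --oneshot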

https://medium.com/@kahalekar.sunil/how-to-configure-raid-and-monitor-disk-usage-ensuring-timely-notification-via-email-in-case-of-facf4c4d7656

https://www.ionos.com/help/server-cloud-infrastructure/dedicated-server-for-servers-purchased-before-102818/rescue-and-recovery/software-raid-status-monitoring-linux/


George.
