Hello, I have 4 HDDs in software RAID10 in my backup server. I got help on this list when I first configured it, and everything was working great for a year.
Over the last few days I've noticed some load issues. During rsnapshot's backup rotation the load would go very high. I watched iotop, and jbd2 was at the top most often. That was strange, since jbd2 is the kernel thread that handles the ext4 journal, and I'm using ext4 only for the root partition. I created a RAID1 array for /boot, a RAID10 array for root, and another, larger RAID10 device for backups. The backup partition uses XFS; boot and root use ext4 (I don't know why I used ext4 for /boot; I know a journal isn't necessary there). The Debian installer set up LVM by default, so I left it that way for boot and root.

Here is a graph showing the sudden jump in iowait: http://img163.imageshack.us/img163/8453/jef4.png

Since the backup partition is where most of the work is done, I suspected LVM was to blame for the high load. I moved MySQL to the XFS backup partition and that improved things, but the load is still much higher than it was 10 days ago: idle, it used to be 0.1-0.3; now it's 1.5-3. iotop shows mysqld and jbd2 processes even when the server is idle and very little data is being read or written. Why the load, then? I was thinking of reinstalling Debian without LVM on boot and root.

Then I remembered the atop command, and that it also shows disk usage. Here is the relevant part:

DSK | sda | busy 80% | read 2 | write 204 | KiB/r 4 | KiB/w 16 | MBr/s 0.00 | MBw/s 0.33 | avq 6.24 | avio 38.5 ms |
DSK | sdd | busy 12% | read 0 | write 215 | KiB/r 0 | KiB/w 16 | MBr/s 0.00 | MBw/s 0.36 | avq 5.25 | avio 5.51 ms |
DSK | sdb | busy  9% | read 0 | write 203 | KiB/r 0 | KiB/w 16 | MBr/s 0.00 | MBw/s 0.33 | avq 7.45 | avio 4.49 ms |
DSK | sdc | busy  8% | read 0 | write 215 | KiB/r 0 | KiB/w 16 | MBr/s 0.00 | MBw/s 0.36 | avq 8.91 | avio 3.89 ms |

Although all four disks handle essentially the same amount of reads and writes, sda is much busier and its average time per request ('avio') is far higher. So my next assumption is that sda is malfunctioning.

I used smartctl to see if I could get any useful information about that. After running "smartctl -t short /dev/sda", the output shows:

SMART overall-health self-assessment test result: PASSED

I have now started "smartctl -t long /dev/sda", but it will take four hours to finish. Until I have those results, I thought I'd ask for your opinion. Can I assume the drive is failing, or could there be some other cause for this strange sda behavior?

Regards,
Veljko
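
P.S. While the long test runs, I figured I could also dump the raw SMART attributes. If I'm reading the smartctl man page right, something like this should show the reallocated and pending sector counts, which I understand can reveal a dying disk even while the overall assessment still says PASSED:

  # raw SMART attributes for sda; the egrep filter just picks out the
  # counters I think matter most here (my own guess at what's relevant)
  smartctl -A /dev/sda | egrep 'Reallocated|Pending|Offline_Uncorrectable|CRC'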
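
P.P.S. I also plan to double-check that the arrays themselves are clean and to watch per-device latency for a while. Assuming the sysstat package is installed, I'd try something like:

  cat /proc/mdstat                   # quick overview of all md arrays
  mdadm --detail /dev/md2            # the big backup RAID10; md2 is just a guess at the device name
  iostat -x sda sdb sdc sdd 5        # extended per-disk stats every 5 seconds

If sda's await/%util stays far above the other three even with clean arrays, I suppose that would point at the disk rather than LVM or the filesystem.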