On 31/12/11 08:55, Stan Hoeppner wrote:
On 12/30/2011 4:37 AM, Tony van der Hoff wrote:
I'm getting irregular reports on a Squeeze software raid1 array with
two 500GB disks like this:
Dec 30 09:53:39 tony-lx kernel: [143496.872684] ata3.00: exception Emask 0x10 SAct 0x3 SErr 0x400000 action 0x6 frozen
Dec 30 09:53:39 tony-lx kernel: [143496.872694] ata3.00: irq_stat 0x08000000, interface fatal error
Dec 30 09:53:39 tony-lx kernel: [143496.872703] ata3: SError: { Handshk }
Dec 30 09:53:39 tony-lx kernel: [143496.872710] ata3.00: failed command: WRITE FPDMA QUEUED
Dec 30 09:53:39 tony-lx kernel: [143496.872725] ata3.00: cmd 61/60:00:80:d0:56/00:00:13:00:00/40 tag 0 ncq 49152 out
Dec 30 09:53:39 tony-lx kernel: [143496.872729]          res 40/00:08:20:9b:5a/00:00:13:00:00/40 Emask 0x10 (ATA bus error)
Dec 30 09:53:39 tony-lx kernel: [143496.872736] ata3.00: status: { DRDY }
Dec 30 09:53:39 tony-lx kernel: [143496.872742] ata3.00: failed command: WRITE FPDMA QUEUED
Dec 30 09:53:39 tony-lx kernel: [143496.872755] ata3.00: cmd 61/20:08:20:9b:5a/00:00:13:00:00/40 tag 1 ncq 16384 out
Dec 30 09:53:39 tony-lx kernel: [143496.872758]          res 40/00:08:20:9b:5a/00:00:13:00:00/40 Emask 0x10 (ATA bus error)
Dec 30 09:53:39 tony-lx kernel: [143496.872765] ata3.00: status: { DRDY }
Dec 30 09:53:39 tony-lx kernel: [143496.872776] ata3: hard resetting link
Dec 30 09:53:40 tony-lx kernel: [143497.356148] ata3: SATA link up 6.0 Gbps (SStatus 133 SControl 300)
Dec 30 09:53:40 tony-lx kernel: [143497.358983] ata3.00: configured for UDMA/133
Dec 30 09:53:40 tony-lx kernel: [143497.359005] ata3: EH complete
Can anyone please enlighten me as to what it means? Am I about to lose a disk?
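For anyone reading the log above, the "SErr 0x400000" value and the "SError: { Handshk }" line are two views of the same register. The following is only a sketch of how the hex value maps to the name: the bit-to-name table is an assumption reproduced from memory of the Linux libata error handler and should be checked against the kernel source, and decode_serror is a hypothetical helper, not an existing tool.

```python
# Bit positions of the SATA SError register and the short names the
# Linux libata error handler prints for them (assumed, from memory of
# the kernel source; verify before relying on it).
SERR_BITS = {
    0: "RecovData",
    1: "RecovComm",
    8: "UnrecovData",
    9: "Persist",
    10: "Proto",
    11: "HostInt",
    16: "PHYRdyChg",
    17: "PHYInt",
    18: "CommWake",
    19: "10B8B",
    20: "Dispar",
    21: "BadCRC",
    22: "Handshk",
    23: "LinkSeq",
    24: "TrStaTrns",
    25: "UnrecFIS",
    26: "DevExch",
}


def decode_serror(value):
    """Return the names of all SError bits set in value, lowest bit first."""
    return [name for bit, name in sorted(SERR_BITS.items()) if value & (1 << bit)]


if __name__ == "__main__":
    # SErr value taken from the log above.
    print(decode_serror(0x400000))
```

With the value from the log, only bit 22 is set, which matches the single "Handshk" entry the kernel printed.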
Why haven't you looked at the SMART data for the disk on ATA3? Normally
that will answer your question directly above.
Well, Stan, I did. Unfortunately I didn't understand the reports. They
contain a plethora of information for which I haven't been able to
locate an authoritative explanation, but they seem to indicate that the
drives are OK:
root@tony-lx:~# smartctl -a /dev/sda
smartctl 5.40 2010-07-12 r3124 [x86_64-unknown-linux-gnu] (local build)
Copyright (C) 2002-10 by Bruce Allen, http://smartmontools.sourceforge.net
=== START OF INFORMATION SECTION ===
Device Model: ST3500413AS
Serial Number: 6VMS3YG7
Firmware Version: JC45
User Capacity: 500,107,862,016 bytes
Device is: Not in smartctl database [for details use: -P showall]
ATA Version is: 8
ATA Standard is: ATA-8-ACS revision 4
Local Time is: Sat Dec 31 09:20:49 2011 GMT
SMART support is: Available - device has SMART capability.
SMART support is: Enabled
=== START OF READ SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED
General SMART Values:
Offline data collection status:  (0x82) Offline data collection activity
                                        was completed without error.
                                        Auto Offline Data Collection: Enabled.
Self-test execution status:      (   0) The previous self-test routine completed
                                        without error or no self-test has ever
                                        been run.
Total time to complete Offline
data collection:                 ( 600) seconds.
Offline data collection
capabilities:                    (0x7b) SMART execute Offline immediate.
                                        Auto Offline data collection on/off support.
                                        Suspend Offline collection upon new
                                        command.
                                        Offline surface scan supported.
                                        Self-test supported.
                                        Conveyance Self-test supported.
                                        Selective Self-test supported.
SMART capabilities:            (0x0003) Saves SMART data before entering
                                        power-saving mode.
                                        Supports SMART auto save timer.
Error logging capability:        (0x01) Error logging supported.
                                        General Purpose Logging supported.
Short self-test routine
recommended polling time:        (   1) minutes.
Extended self-test routine
recommended polling time:        (  83) minutes.
Conveyance self-test routine
recommended polling time:        (   2) minutes.
SCT capabilities:              (0x103f) SCT Status supported.
                                        SCT Error Recovery Control supported.
                                        SCT Feature Control supported.
                                        SCT Data Table supported.
SMART Attributes Data Structure revision number: 10
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  1 Raw_Read_Error_Rate     0x000f   120   099   006    Pre-fail  Always       -       243530983
  3 Spin_Up_Time            0x0003   100   100   000    Pre-fail  Always       -       0
  4 Start_Stop_Count        0x0032   100   100   020    Old_age   Always       -       30
  5 Reallocated_Sector_Ct   0x0033   100   100   036    Pre-fail  Always       -       0
  7 Seek_Error_Rate         0x000f   072   060   030    Pre-fail  Always       -       18363743
  9 Power_On_Hours          0x0032   098   098   000    Old_age   Always       -       1893
 10 Spin_Retry_Count        0x0013   100   100   097    Pre-fail  Always       -       0
 12 Power_Cycle_Count       0x0032   100   100   020    Old_age   Always       -       30
183 Runtime_Bad_Block       0x0032   075   075   000    Old_age   Always       -       25
184 End-to-End_Error        0x0032   100   100   099    Old_age   Always       -       0
187 Reported_Uncorrect      0x0032   100   100   000    Old_age   Always       -       0
188 Command_Timeout         0x0032   100   096   000    Old_age   Always       -       74
189 High_Fly_Writes         0x003a   100   100   000    Old_age   Always       -       0
190 Airflow_Temperature_Cel 0x0022   071   067   045    Old_age   Always       -       29 (Lifetime Min/Max 27/30)
194 Temperature_Celsius     0x0022   029   040   000    Old_age   Always       -       29 (0 16 0 0)
195 Hardware_ECC_Recovered  0x001a   033   024   000    Old_age   Always       -       243530983
197 Current_Pending_Sector  0x0012   100   100   000    Old_age   Always       -       0
198 Offline_Uncorrectable   0x0010   100   100   000    Old_age   Offline      -       0
199 UDMA_CRC_Error_Count    0x003e   200   199   000    Old_age   Always       -       455
240 Head_Flying_Hours       0x0000   100   253   000    Old_age   Offline      -       158690451654579
241 Total_LBAs_Written      0x0000   100   253   000    Old_age   Offline      -       2626825492
242 Total_LBAs_Read         0x0000   100   253   000    Old_age   Offline      -       377807731
SMART Error Log Version: 1
No Errors Logged
SMART Self-test log structure revision number 1
No self-tests have been logged. [To run self-tests, use: smartctl -t]
SMART Selective self-test log data structure revision number 1
SPAN MIN_LBA MAX_LBA CURRENT_TEST_STATUS
1 0 0 Not_testing
2 0 0 Not_testing
3 0 0 Not_testing
4 0 0 Not_testing
5 0 0 Not_testing
Selective self-test flags (0x0):
After scanning selected spans, do NOT read-scan remainder of disk.
If Selective self-test is pending on power-up, resume after 0 minute delay.
root@tony-lx:~# smartctl -a /dev/sdb
smartctl 5.40 2010-07-12 r3124 [x86_64-unknown-linux-gnu] (local build)
Copyright (C) 2002-10 by Bruce Allen, http://smartmontools.sourceforge.net
=== START OF INFORMATION SECTION ===
Device Model: ST3500413AS
Serial Number: 6VMS41GW
Firmware Version: JC45
User Capacity: 500,107,862,016 bytes
Device is: Not in smartctl database [for details use: -P showall]
ATA Version is: 8
ATA Standard is: ATA-8-ACS revision 4
Local Time is: Sat Dec 31 09:24:25 2011 GMT
SMART support is: Available - device has SMART capability.
SMART support is: Enabled
=== START OF READ SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED
General SMART Values:
Offline data collection status:  (0x82) Offline data collection activity
                                        was completed without error.
                                        Auto Offline Data Collection: Enabled.
Self-test execution status:      (   0) The previous self-test routine completed
                                        without error or no self-test has ever
                                        been run.
Total time to complete Offline
data collection:                 ( 600) seconds.
Offline data collection
capabilities:                    (0x7b) SMART execute Offline immediate.
                                        Auto Offline data collection on/off support.
                                        Suspend Offline collection upon new
                                        command.
                                        Offline surface scan supported.
                                        Self-test supported.
                                        Conveyance Self-test supported.
                                        Selective Self-test supported.
SMART capabilities:            (0x0003) Saves SMART data before entering
                                        power-saving mode.
                                        Supports SMART auto save timer.
Error logging capability:        (0x01) Error logging supported.
                                        General Purpose Logging supported.
Short self-test routine
recommended polling time:        (   1) minutes.
Extended self-test routine
recommended polling time:        (  81) minutes.
Conveyance self-test routine
recommended polling time:        (   2) minutes.
SCT capabilities:              (0x103f) SCT Status supported.
                                        SCT Error Recovery Control supported.
                                        SCT Feature Control supported.
                                        SCT Data Table supported.
SMART Attributes Data Structure revision number: 10
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  1 Raw_Read_Error_Rate     0x000f   117   099   006    Pre-fail  Always       -       138763088
  3 Spin_Up_Time            0x0003   100   100   000    Pre-fail  Always       -       0
  4 Start_Stop_Count        0x0032   100   100   020    Old_age   Always       -       30
  5 Reallocated_Sector_Ct   0x0033   100   100   036    Pre-fail  Always       -       0
  7 Seek_Error_Rate         0x000f   061   060   030    Pre-fail  Always       -       1374378
  9 Power_On_Hours          0x0032   098   098   000    Old_age   Always       -       1893
 10 Spin_Retry_Count        0x0013   100   100   097    Pre-fail  Always       -       0
 12 Power_Cycle_Count       0x0032   100   100   020    Old_age   Always       -       30
183 Runtime_Bad_Block       0x0032   100   100   000    Old_age   Always       -       0
184 End-to-End_Error        0x0032   100   100   099    Old_age   Always       -       0
187 Reported_Uncorrect      0x0032   100   100   000    Old_age   Always       -       0
188 Command_Timeout         0x0032   100   100   000    Old_age   Always       -       0
189 High_Fly_Writes         0x003a   100   100   000    Old_age   Always       -       0
190 Airflow_Temperature_Cel 0x0022   070   065   045    Old_age   Always       -       30 (Lifetime Min/Max 27/31)
194 Temperature_Celsius     0x0022   030   040   000    Old_age   Always       -       30 (0 17 0 0)
195 Hardware_ECC_Recovered  0x001a   036   005   000    Old_age   Always       -       138763088
197 Current_Pending_Sector  0x0012   100   100   000    Old_age   Always       -       0
198 Offline_Uncorrectable   0x0010   100   100   000    Old_age   Offline      -       0
199 UDMA_CRC_Error_Count    0x003e   200   200   000    Old_age   Always       -       0
240 Head_Flying_Hours       0x0000   100   253   000    Old_age   Offline      -       213915141146536
241 Total_LBAs_Written      0x0000   100   253   000    Old_age   Offline      -       2186563909
242 Total_LBAs_Read         0x0000   100   253   000    Old_age   Offline      -       2014986862
SMART Error Log Version: 1
No Errors Logged
SMART Self-test log structure revision number 1
No self-tests have been logged. [To run self-tests, use: smartctl -t]
SMART Selective self-test log data structure revision number 1
SPAN MIN_LBA MAX_LBA CURRENT_TEST_STATUS
1 0 0 Not_testing
2 0 0 Not_testing
3 0 0 Not_testing
4 0 0 Not_testing
5 0 0 Not_testing
Selective self-test flags (0x0):
After scanning selected spans, do NOT read-scan remainder of disk.
If Selective self-test is pending on power-up, resume after 0 minute delay.
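Since the two drives are the same model with the same power-on hours, one rough way to read the reports is to compare their raw counters side by side. The sketch below assumes each attribute row sits on a single line, as smartctl prints it. The names parse_attributes, raw_differences and WATCH are made up for illustration, and the sample rows are copied from the two reports above. It only lists counters that differ; it does not say what a difference means.

```python
import re

# One row of the "Vendor Specific SMART Attributes" table:
# ID, name, flag, VALUE, WORST, THRESH, type, updated, when_failed, raw.
ROW = re.compile(
    r"^\s*(?P<id>\d+)\s+(?P<name>\S+)\s+0x[0-9a-fA-F]+\s+"
    r"(?P<value>\d+)\s+(?P<worst>\d+)\s+(?P<thresh>\d+)\s+"
    r"\S+\s+\S+\s+\S+\s+(?P<raw>\d+)"
)


def parse_attributes(text):
    """Map attribute name -> dict of integer value/worst/thresh/raw."""
    attrs = {}
    for line in text.splitlines():
        m = ROW.match(line)
        if m:
            attrs[m.group("name")] = {
                key: int(m.group(key))
                for key in ("value", "worst", "thresh", "raw")
            }
    return attrs


def raw_differences(a, b, names):
    """Return {name: (raw_a, raw_b)} for watched names whose raw values differ."""
    return {
        n: (a[n]["raw"], b[n]["raw"])
        for n in names
        if n in a and n in b and a[n]["raw"] != b[n]["raw"]
    }


# Rows copied from the /dev/sda and /dev/sdb reports above.
SDA = """\
5 Reallocated_Sector_Ct 0x0033 100 100 036 Pre-fail Always - 0
183 Runtime_Bad_Block 0x0032 075 075 000 Old_age Always - 25
188 Command_Timeout 0x0032 100 096 000 Old_age Always - 74
199 UDMA_CRC_Error_Count 0x003e 200 199 000 Old_age Always - 455
"""
SDB = """\
5 Reallocated_Sector_Ct 0x0033 100 100 036 Pre-fail Always - 0
183 Runtime_Bad_Block 0x0032 100 100 000 Old_age Always - 0
188 Command_Timeout 0x0032 100 100 000 Old_age Always - 0
199 UDMA_CRC_Error_Count 0x003e 200 200 000 Old_age Always - 0
"""

# Hypothetical watch list; which counters matter is a judgment call.
WATCH = (
    "Reallocated_Sector_Ct",
    "Runtime_Bad_Block",
    "Command_Timeout",
    "UDMA_CRC_Error_Count",
)

if __name__ == "__main__":
    diffs = raw_differences(parse_attributes(SDA), parse_attributes(SDB), WATCH)
    for name, (sda_raw, sdb_raw) in diffs.items():
        print(f"{name}: sda={sda_raw} sdb={sdb_raw}")
```

On the rows above this lists Runtime_Bad_Block, Command_Timeout and UDMA_CRC_Error_Count as the counters where sda is non-zero and sdb is zero.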
--
Tony van der Hoff | mailto:t...@vanderhoff.org
Buckinghamshire, England |