On Mon, 28 Apr 2014 18:43:17 -0400
KS <list...@fastmail.fm> wrote:

> Hi,
>
> I was checking one of my systems and the SMART data for /dev/sda came
> out as below. Should I change it to avoid loosing data? If not, which
> information in SMART data indicates that it is time to do it?
>
> Thanks,
> KS
> -------------------
> smartctl 5.42 2011-10-20 r3458 [x86_64-linux-3.2.34-std312-amd64] (local build)
> Copyright (C) 2002-11 by Bruce Allen, http://smartmontools.sourceforge.net
>
> === START OF INFORMATION SECTION ===
> Model Family:     Western Digital Caviar Black
> Device Model:     WDC WD5001AALS-00J7B1
> Serial Number:    WD-WMATV6698581
> LU WWN Device Id: 5 0014ee 0577e9964
> Firmware Version: 05.00K05
> User Capacity:    500,107,862,016 bytes [500 GB]
> Sector Size:      512 bytes logical/physical
> Device is:        In smartctl database [for details use: -P show]
> ATA Version is:   8
> ATA Standard is:  Exact ATA specification draft version not indicated
> Local Time is:    Mon Apr 28 18:38:59 2014 UTC
> SMART support is: Available - device has SMART capability.
> SMART support is: Enabled
>
> === START OF READ SMART DATA SECTION ===
> SMART overall-health self-assessment test result: PASSED
>
> General SMART Values:
> Offline data collection status:  (0x84) Offline data collection activity
>                                         was suspended by an interrupting
>                                         command from host.
>                                         Auto Offline Data Collection: Enabled.
> Self-test execution status:      (   0) The previous self-test routine completed
>                                         without error or no self-test has ever
>                                         been run.
> Total time to complete Offline
> data collection:                (11160) seconds.
> Offline data collection
> capabilities:                    (0x7b) SMART execute Offline immediate.
>                                         Auto Offline data collection on/off support.
>                                         Suspend Offline collection upon new
>                                         command.
>                                         Offline surface scan supported.
>                                         Self-test supported.
>                                         Conveyance Self-test supported.
>                                         Selective Self-test supported.
> SMART capabilities:            (0x0003) Saves SMART data before entering
>                                         power-saving mode.
>                                         Supports SMART auto save timer.
> Error logging capability:        (0x01) Error logging supported.
>                                         General Purpose Logging supported.
> Short self-test routine
> recommended polling time:        (   2) minutes.
> Extended self-test routine
> recommended polling time:        ( 131) minutes.
> Conveyance self-test routine
> recommended polling time:        (   5) minutes.
> SCT capabilities:              (0x3037) SCT Status supported.
>                                         SCT Feature Control supported.
>                                         SCT Data Table supported.
>
> SMART Attributes Data Structure revision number: 16
> Vendor Specific SMART Attributes with Thresholds:
> ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
>   1 Raw_Read_Error_Rate     0x002f   200   200   051    Pre-fail  Always       -       0
>   3 Spin_Up_Time            0x0027   229   221   021    Pre-fail  Always       -       8525
>   4 Start_Stop_Count        0x0032   099   099   000    Old_age   Always       -       1124
>   5 Reallocated_Sector_Ct   0x0033   200   200   140    Pre-fail  Always       -       0
>   7 Seek_Error_Rate         0x002e   200   200   000    Old_age   Always       -       0
>   9 Power_On_Hours          0x0032   091   091   000    Old_age   Always       -       7208
>  10 Spin_Retry_Count        0x0032   100   100   000    Old_age   Always       -       0
>  11 Calibration_Retry_Count 0x0032   100   100   000    Old_age   Always       -       0
>  12 Power_Cycle_Count       0x0032   099   099   000    Old_age   Always       -       1123
> 192 Power-Off_Retract_Count 0x0032   200   200   000    Old_age   Always       -       31
> 193 Load_Cycle_Count        0x0032   200   200   000    Old_age   Always       -       1124
> 194 Temperature_Celsius     0x0022   108   101   000    Old_age   Always       -       42
> 196 Reallocated_Event_Count 0x0032   200   200   000    Old_age   Always       -       0
> 197 Current_Pending_Sector  0x0032   200   200   000    Old_age   Always       -       0
> 198 Offline_Uncorrectable   0x0030   100   253   000    Old_age   Offline      -       0
> 199 UDMA_CRC_Error_Count    0x0032   200   200   000    Old_age   Always       -       0
> 200 Multi_Zone_Error_Rate   0x0008   100   253   000    Old_age   Offline      -       0
>
> SMART Error Log Version: 1
> No Errors Logged
>
> SMART Self-test log structure revision number 1
> No self-tests have been logged.  [To run self-tests, use: smartctl -t]
>
>
> SMART Selective self-test log data structure revision number 1
>  SPAN  MIN_LBA  MAX_LBA  CURRENT_TEST_STATUS
>     1        0        0  Not_testing
>     2        0        0  Not_testing
>     3        0        0  Not_testing
>     4        0        0  Not_testing
>     5        0        0  Not_testing
> Selective self-test flags (0x0):
>   After scanning selected spans, do NOT read-scan remainder of disk.
> If Selective self-test is pending on power-up, resume after 0 minute delay.

Looks good to me, always assuming you're not seeing read and write
errors while running your applications. I'm assuming here that you also
ran the offline tests; otherwise this isn't enough information.

When Offline_Uncorrectable (attribute 198) gets above 0, I replace the
drive, because new bad sectors can be expected. When
Current_Pending_Sector (attribute 197) gets above 0, I worry hard,
because that's a bad sign.

You mention in another email concern about a drive temperature of 42C.
We'd all like our components to be 32C all the time, but that's more of
a hope than a reality. Depending on your processor, video card, and
ventilation, it's possible that the ambient temperature in your box is
42 degrees, and the box is actually heating up your drive.

Right now I'm testing my new build of my rsync backup server with a new
3TB WD Green drive, running this command on my whole backup history:

    find /backupserver/stevebup -exec ls -lh {} +

The WD green is running at 37C, and the WD blue system disk is running
at 40C. The CPU is 45C and the mobo temp is 36C. This is in a box with a
top mounted 200mm fan pushing out in excess of 100 cubic feet per
minute, along with a back mounted outbound 120mm, two inbound front
mounted 120's, and a side mounted inbound 140. Can you imagine what
would be happening in there if I had fewer fans? A modern processor can
dissipate on the order of 100 watts: that's like putting a 100 watt
incandescent lightbulb in there. Without adequate ventilation, it could
turn into an oven and bake your hard drives.

Hang on a second: I just stopped my test process, let's see what happens
to the temperatures after a couple minutes...

Well, it's been about 15 minutes since I killed my test program, leaving
my backup server idle. My CPU is now 42C, the WD black system disk is
40C, and the WD green data disk is now 36C.

SteveT

Steve Litt                *  http://www.troubleshooters.com/
Troubleshooting Training  *  Human Performance

--
To UNSUBSCRIBE, email to debian-user-requ...@lists.debian.org
with a subject of "unsubscribe".
Trouble? Contact listmas...@lists.debian.org Archive: https://lists.debian.org/20140428204735.29d8705d@mydesk
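
For anyone who wants to automate the rule of thumb from the reply above (worry when Current_Pending_Sector or Offline_Uncorrectable shows a raw value above 0), here is a minimal sketch in Python. It is only an illustration, not something from the original post: the function names `parse_raw_values` and `drive_warnings` are hypothetical, the parser assumes the ten-column attribute table layout shown in the quoted smartctl output, and the sample rows are copied from that output.

```python
# Attribute IDs named in the reply. Treating any raw value above 0 as a
# warning is the poster's rule of thumb, not an official threshold.
WATCHED = {
    197: "Current_Pending_Sector",
    198: "Offline_Uncorrectable",
}


def parse_raw_values(smartctl_text):
    """Return {attribute_id: raw_value} from a smartctl attribute table.

    Assumes the ten-column layout: ID#, name, flag, value, worst,
    thresh, type, updated, when_failed, raw_value.
    """
    raw = {}
    for line in smartctl_text.splitlines():
        fields = line.split()
        if len(fields) < 10:
            continue
        if not fields[0].isdigit() or not fields[2].startswith("0x"):
            continue
        try:
            raw[int(fields[0])] = int(fields[9])
        except ValueError:
            # Skip rows whose raw value is not a plain integer.
            continue
    return raw


def drive_warnings(raw):
    """List a message for each watched attribute whose raw value is above 0."""
    messages = []
    for attr_id, name in WATCHED.items():
        count = raw.get(attr_id)
        if count is not None and count > 0:
            messages.append(f"{name} (ID {attr_id}) raw value is {count}")
    return messages


# Three rows taken from the smartctl output quoted in this thread.
SAMPLE = """\
  5 Reallocated_Sector_Ct   0x0033   200   200   140    Pre-fail  Always       -       0
197 Current_Pending_Sector  0x0032   200   200   000    Old_age   Always       -       0
198 Offline_Uncorrectable   0x0030   100   253   000    Old_age   Offline      -       0
"""

if __name__ == "__main__":
    for message in drive_warnings(parse_raw_values(SAMPLE)):
        print("WARNING:", message)
```

Run against the sample rows, the script prints nothing, which matches the "looks good to me" verdict. In practice the text would come from saving the drive's attribute table to a file and reading it in; clean numbers here still say nothing about errors seen by applications or about self-tests that were never run.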