[Expired for linux (Ubuntu) because there has been no activity for 60 days.]
** Changed in: linux (Ubuntu)
       Status: Incomplete => Expired

-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux in Ubuntu.
https://bugs.launchpad.net/bugs/593635

Title:
  HDD freezes caused by ata exception that results in soft resetting of link

Status in “linux” package in Ubuntu:
  Expired

Bug description:
  Under even moderately heavy disk writes, I am seeing exceptions like the below in my kern.log

  -----------------------------------------------
  Jun 13 13:33:03 cellar kernel: [66188.434868] ata4.01: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x6
  Jun 13 13:33:03 cellar kernel: [66188.434874] ata4.01: BMDMA stat 0x46
  Jun 13 13:33:03 cellar kernel: [66188.434879] ata4.01: failed command: WRITE DMA EXT
  Jun 13 13:33:03 cellar kernel: [66188.434886] ata4.01: cmd 35/00:00:00:94:b2/00:04:13:00:00/f0 tag 0 dma 524288 out
  Jun 13 13:33:03 cellar kernel: [66188.434888] res 51/84:01:ff:95:b2/84:02:13:00:00/f0 Emask 0x30 (host bus error)
  Jun 13 13:33:03 cellar kernel: [66188.434892] ata4.01: status: { DRDY ERR }
  Jun 13 13:33:03 cellar kernel: [66188.434895] ata4.01: error: { ICRC ABRT }
  Jun 13 13:33:03 cellar kernel: [66188.434907] ata4: soft resetting link
  Jun 13 13:33:03 cellar kernel: [66188.622000] ata4.01: configured for UDMA/100
  Jun 13 13:33:03 cellar kernel: [66188.622013] ata4: EH complete
  ----------------------------------------------

  This is with the latest stable lucid kernel (2.6.32-22-generic #36-Ubuntu).
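For anyone comparing this report with similar ones, it may help to summarise which commands fail on which port rather than reading kern.log by eye. The following is a minimal sketch, not part of the original report: the function name `tally_failed_commands` is hypothetical, and the regular expression assumes the "failed command:" line format shown in the excerpt above.

```python
import re
from collections import Counter

# Matches libata error-handler lines such as
#   "... ata4.01: failed command: WRITE DMA EXT"
# (format assumed from the log excerpt in this report).
FAILED_CMD = re.compile(r"\b(ata\d+(?:\.\d+)?): failed command: (.+?)\s*$")


def tally_failed_commands(lines):
    """Count (port, command) pairs over an iterable of kernel log lines."""
    counts = Counter()
    for line in lines:
        match = FAILED_CMD.search(line)
        if match:
            counts[(match.group(1), match.group(2))] += 1
    return counts


# Two lines copied from the excerpt above; only the first is a failure line.
sample = [
    "Jun 13 13:33:03 cellar kernel: [66188.434879] ata4.01: failed command: WRITE DMA EXT",
    "Jun 13 13:33:03 cellar kernel: [66188.434907] ata4: soft resetting link",
]

for (port, command), count in tally_failed_commands(sample).most_common():
    print(port, command, count)
```

To run it over a whole log, pass an open file object (for example `open("/var/log/kern.log", errors="replace")`) instead of `sample`.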
  I've also tried a mainline kernel (2.6.35-020635rc1) & still get the same errors except that there's an additional stack trace:

  -----------------------------------------------
  Jun 14 18:55:40 cellar kernel: [ 152.874172] irq 19: nobody cared (try booting with the "irqpoll" option)
  Jun 14 18:55:40 cellar kernel: [ 152.874182] Pid: 0, comm: swapper Tainted: P 2.6.35-020635rc1-generic #020635rc1
  Jun 14 18:55:40 cellar kernel: [ 152.874185] Call Trace:
  Jun 14 18:55:40 cellar kernel: [ 152.874198] [<c01a58cc>] __report_bad_irq+0x2c/0x90
  Jun 14 18:55:40 cellar kernel: [ 152.874204] [<c016fee3>] ? sched_clock_tick+0x73/0xa0
  Jun 14 18:55:40 cellar kernel: [ 152.874209] [<c01a5a44>] note_interrupt+0xe4/0x120
  Jun 14 18:55:40 cellar kernel: [ 152.874214] [<c0179da0>] ? tick_nohz_update_jiffies+0x60/0x70
  Jun 14 18:55:40 cellar kernel: [ 152.874219] [<c01a6364>] handle_fasteoi_irq+0x84/0xe0
  Jun 14 18:55:40 cellar kernel: [ 152.874224] [<c0104abf>] handle_irq+0x1f/0x30
  Jun 14 18:55:40 cellar kernel: [ 152.874230] [<c05afefb>] do_IRQ+0x4b/0xc0
  Jun 14 18:55:40 cellar kernel: [ 152.874234] [<c01032f0>] common_interrupt+0x30/0x40
  Jun 14 18:55:40 cellar kernel: [ 152.874239] [<c010a3a7>] ? mwait_idle+0x57/0xa0
  Jun 14 18:55:40 cellar kernel: [ 152.874243] [<c010189c>] cpu_idle+0x8c/0xc0
  Jun 14 18:55:40 cellar kernel: [ 152.874249] [<c05a4337>] start_secondary+0xf7/0x130
  Jun 14 18:55:40 cellar kernel: [ 152.874252] handlers:
  Jun 14 18:55:40 cellar kernel: [ 152.874254] [<c0431060>] (ata_bmdma_interrupt+0x0/0x190)
  Jun 14 18:55:40 cellar kernel: [ 152.874261] [<c044fb10>] (usb_hcd_irq+0x0/0x90)
  Jun 14 18:55:40 cellar kernel: [ 152.874268] Disabling IRQ #19
  Jun 14 18:56:09 cellar kernel: [ 181.856015] ata4: lost interrupt (Status 0x51)
  Jun 14 18:56:09 cellar kernel: [ 181.856034] ata4.01: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x6 frozen
  Jun 14 18:56:09 cellar kernel: [ 181.856039] ata4.01: BMDMA stat 0x46, BMDMA stat 0x0, BMDMA stat 0x0, BMDMA stat 0x0, BMDMA stat 0x0
  Jun 14 18:56:09 cellar kernel: [ 181.856045] ata4.01: failed command: WRITE DMA EXT
  Jun 14 18:56:09 cellar kernel: [ 181.856053] ata4.01: cmd 35/00:00:00:84:08/00:04:3b:00:00/f0 tag 0 dma 524288 out
  Jun 14 18:56:09 cellar kernel: [ 181.856054] res 40/00:00:00:4f:c2/00:00:00:00:00/50 Emask 0x24 (host bus error)
  Jun 14 18:56:09 cellar kernel: [ 181.856058] ata4.01: status: { DRDY }
  Jun 14 18:56:09 cellar kernel: [ 181.856072] ata4: soft resetting link
  Jun 14 18:56:09 cellar kernel: [ 182.160065] ata4.01: configured for UDMA/133
  Jun 14 18:56:09 cellar kernel: [ 182.160072] ata4.01: device reported invalid CHS sector 0
  Jun 14 18:56:09 cellar kernel: [ 182.160080] ata4: EH complete
  --------------------------------------------------------------------

  I've tried booting with "libata.force=noncq" on both kernels (lucid stable & 2.6.35 mainline) but makes no difference.

  I didn't see these errors in Jaunty. I think they started sometime in Karmic. I upgraded to Lucid in the hopes that the newer release fixed it but no difference.

  I think I've ruled out HDD failure. I get these errors on 2 old (3+ years) Seagate 7200.10 disks as well as a brand new Seagate 7200.12 disk.
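The "irq 19: nobody cared" report in the mainline log lists the handlers registered on that interrupt line (here `ata_bmdma_interrupt` and `usb_hcd_irq`, so the ATA controller and a USB host controller appear to share IRQ 19). A rough sketch for pulling that list out of a log follows. It is an illustration only: the helper name `handlers_for_unhandled_irqs` is hypothetical, and the patterns assume the exact "handlers:" layout printed by the 2.6.35 kernel in this report; other kernel versions may format it differently.

```python
import re

# Start of a spurious-interrupt report, e.g. "irq 19: nobody cared (...)".
IRQ_REPORT = re.compile(r"irq (\d+): nobody cared")
# A handler entry as printed in this report, e.g.
#   "[<c0431060>] (ata_bmdma_interrupt+0x0/0x190)"
HANDLER = re.compile(r"\[<[0-9a-f]+>\] \((\w+)\+0x[0-9a-f]+/0x[0-9a-f]+\)")


def handlers_for_unhandled_irqs(lines):
    """Map IRQ number -> handler names listed in its 'nobody cared' report."""
    result = {}
    irq = None           # IRQ of the report currently being read, if any
    in_handlers = False  # True once the "handlers:" marker has been seen
    for line in lines:
        report = IRQ_REPORT.search(line)
        if report:
            irq = int(report.group(1))
            result.setdefault(irq, [])
            in_handlers = False
            continue
        if irq is None:
            continue
        if line.rstrip().endswith("handlers:"):
            in_handlers = True
            continue
        if in_handlers:
            handler = HANDLER.search(line)
            if handler:
                result[irq].append(handler.group(1))
            else:
                # First non-handler line (e.g. "Disabling IRQ #19") ends the list.
                irq, in_handlers = None, False
    return result


# Lines copied from the mainline log above (call trace shortened to one frame).
sample = [
    'Jun 14 18:55:40 cellar kernel: [ 152.874172] irq 19: nobody cared (try booting with the "irqpoll" option)',
    "Jun 14 18:55:40 cellar kernel: [ 152.874198] [<c01a58cc>] __report_bad_irq+0x2c/0x90",
    "Jun 14 18:55:40 cellar kernel: [ 152.874252] handlers:",
    "Jun 14 18:55:40 cellar kernel: [ 152.874254] [<c0431060>] (ata_bmdma_interrupt+0x0/0x190)",
    "Jun 14 18:55:40 cellar kernel: [ 152.874261] [<c044fb10>] (usb_hcd_irq+0x0/0x90)",
    "Jun 14 18:55:40 cellar kernel: [ 152.874268] Disabling IRQ #19",
]

print(handlers_for_unhandled_irqs(sample))
```

Call-trace frames are ignored because they appear before the "handlers:" marker; only entries between that marker and the next non-matching line are collected.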
  There are similar bug reports in launchpad but one difference that I noticed is that I consistently see the message "failed command: WRITE DMA EXT" while the other reports fail during a read or some other command.

  I can very reliably reproduce the errors by running a rdiff-backup 'restore' operation from an external USB HDD.

  == Steps to reproduce ==
  1. Boot into Gnome & login
  2. Run 'tail -f /var/log/kern.log' in one terminal window
  3. Run 'rdiff-backup --force -r now /media/freeagent/share /share/' in another terminal

  Within a few seconds, I can see the errors show up in the kernel logs. Running a fast torrent download will do the trick too.

  Since I can reproduce the problem so easily, I'll be very willing to try any special kernel builds to help solve this one.

  ProblemType: Bug
  DistroRelease: Ubuntu 10.04
  Package: linux-image-2.6.32-22-generic 2.6.32-22.36
  Regression: Yes
  Reproducible: Yes
  ProcVersionSignature: Ubuntu 2.6.32-22.36-generic 2.6.32.11+drm33.2
  Uname: Linux 2.6.32-22-generic i686
  NonfreeKernelModules: nvidia
  AlsaVersion: Advanced Linux Sound Architecture Driver Version 1.0.21.
  Architecture: i386
  AudioDevicesInUse:
   USER PID ACCESS COMMAND
   /dev/snd/controlC0: antrix 1387 F.... pulseaudio
  CRDA: Error: [Errno 2] No such file or directory
  Card0.Amixer.info:
   Card hw:0 'Intel'/'HDA Intel at 0xf9ffc000 irq 16'
   Mixer name : 'Realtek ALC662 rev1'
   Components : 'HDA:10ec0662,15650000,00100101'
   Controls : 36
   Simple ctrls : 19
  Date: Mon Jun 14 19:23:00 2010
  HibernationDevice: RESUME=UUID=c6dab799-13a8-443e-b2a3-4b93f3bbb42e
  IwConfig:
   lo no wireless extensions.
   eth0 no wireless extensions.
  MachineType: BIOSTAR Group G31-M7 TE
  ProcCmdLine: BOOT_IMAGE=/vmlinuz-2.6.32-22-generic root=UUID=466535ad-0b59-4fd0-b18b-ba486150f91a ro quiet splash
  ProcEnviron:
   PATH=(custom, user)
   LANG=en_SG.utf8
   SHELL=/bin/bash
  RelatedPackageVersions: linux-firmware 1.34
  RfKill:
  SourcePackage: linux
  dmi.bios.date: 04/10/2009
  dmi.bios.vendor: American Megatrends Inc.
  dmi.bios.version: 080014
  dmi.board.asset.tag: To Be Filled By O.E.M.
  dmi.board.name: G31-M7 TE
  dmi.board.vendor: BIOSTAR Group
  dmi.chassis.asset.tag: None
  dmi.chassis.type: 3
  dmi.chassis.vendor: BIOSTAR Group
  dmi.modalias: dmi:bvnAmericanMegatrendsInc.:bvr080014:bd04/10/2009:svnBIOSTARGroup:pnG31-M7TE:pvr:rvnBIOSTARGroup:rnG31-M7TE:rvr:cvnBIOSTARGroup:ct3:cvr:
  dmi.product.name: G31-M7 TE
  dmi.sys.vendor: BIOSTAR Group

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/593635/+subscriptions

-- 
Mailing list: https://launchpad.net/~kernel-packages
Post to     : kernel-packages@lists.launchpad.net
Unsubscribe : https://launchpad.net/~kernel-packages
More help   : https://help.launchpad.net/ListHelp