> since your bios doesnt reliably detect drives on reboot i would start by
> looking for firmware upgrades.
>
> smc likes to say dont upgrade your firmware unless you have a problem, i bet
> you meet that requirement.
>
> j

Thanks, I'm in the process of trying to do this but I'm having some trouble
getting it to recognize any of the bootable disks I've created. I'll dig out
a USB stick and try that since I'm having no joy with the virtual CD/floppy
options.

> In your BIOS you should have a setting for "Interrupt 19 Capture" and, I
> believe, the default setting is "Enable". Change it to "Disable". This
> will disable your ability to boot off your SAS controller but you don't do
> that right now anyway.
>
> Good luck!
>
> -Russ

I've done this but it gives the same errors.

> After some further digging, there appears to be a similar issue on the
> FreeBSD side of things with the Tylersburg chipset (found on the X8ST3-F).
>
> http://lists.freebsd.org/pipermail/freebsd-current/2009-July/009946.html
>
> The USB devices and the SATA devices all contend with IRQ19 (as seen by
> uhci and pci-ide all piled together). Might it be possible to switch your
> SATA mode to AHCI rather than IDE? That will use a different driver and
> subsequently might use a different interrupt.
>
> -Russ

I believe it is possible to change the SATA port types in the BIOS, but from
memory, when I did this previously it prevented the OS (OpenSolaris at the
time) from booting because the rpool physical path changed, and I had to go
in and modify something with a Live disk to make it work. I will see if I can
dig out my notes or find an article on this.

Thanks for the advice so far guys. I'll let you know when I make some
progress.

Cheers,

Daniel

On 29 September 2011 17:36, Russell Hansen <[email protected]> wrote:

> After some further digging, there appears to be a similar issue on the
> FreeBSD side of things with the Tylersburg chipset (found on the X8ST3-F).
>
> http://lists.freebsd.org/pipermail/freebsd-current/2009-July/009946.html
>
> The USB devices and the SATA devices all contend with IRQ19 (as seen by
> uhci and pci-ide all piled together). Might it be possible to switch your
> SATA mode to AHCI rather than IDE? That will use a different driver and
> subsequently might use a different interrupt.
>
> -Russ
>
> ________________________________
>
> From: Russell Hansen [mailto:[email protected]]
> Sent: Thu 9/29/2011 8:59 AM
> To: Discussion list for OpenIndiana
> Subject: Re: [OpenIndiana-discuss] SATA device errors,possibly due to IRQ conflict
>
> In your BIOS you should have a setting for "Interrupt 19 Capture" and, I
> believe, the default setting is "Enable". Change it to "Disable". This
> will disable your ability to boot off your SAS controller but you don't do
> that right now anyway.
>
> Good luck!
>
> -Russ
>
> ________________________________
>
> From: Daniel [mailto:[email protected]]
> Sent: Thu 9/29/2011 2:31 AM
> To: [email protected]
> Subject: [OpenIndiana-discuss] SATA device errors,possibly due to IRQ conflict
>
> Hi,
>
> I've got a server running OpenIndiana 148 on a Supermicro *X8ST3-F* that
> has been working perfectly for months right up until I added some more
> storage.
>
> The board has 6 * SATA ports and 8 * SAS ports. Previously all the drives
> in my storage pool were attached to the 8 SAS ports and only my rpool
> drive was using one of the SATA ports.
>
> Now that I have added another 4 drives I've had to connect them to the
> SATA ports - this is when the system started to become unstable.
>
> I have had periods of very heavy usage that have caused no problems
> whatsoever (for example, I copied 4 TB of data on to the pool, most of
> which would have had to go on the new drives, then did several scrubs over
> the next few days). The system seems perfectly happy to sustain a 350mb+
> read or write (or a bit of both) for hours on end with no errors at all.
> Then other times, typically overnight or early morning when it's just
> ticking over with < 500k read/write, it will fall apart.
>
> There are three kinds of failure I'm experiencing, seemingly randomly:
>
> 1. Errors about failed read/write on 2 or 4 SATA drives in
>    /var/adm/messages and system io hung - system has to have the power cut
>    to recover - ssh won't connect, can't get past the username prompt on
>    the terminal. No ZFS errors reported.
> 2. Errors about failed read/write, system io NOT hung, ZFS reporting
>    faulted drives (2 or 4) and hundreds of thousands of errors. In this
>    scenario, the machine can be rebooted cleanly BUT the failed drives
>    don't get detected by BIOS. Usually a full power down, wait 30 seconds,
>    power back up will allow the drives to be detected again. When it
>    powers back up ZFS will report lots of errors but sort itself out after
>    a resilver - I haven't actually had any permanent data loss yet, zfs
>    has always recovered.
> 3. No errors at all in either /var/adm/messages or zpool status but hung
>    io.
>
> I've swapped the drive connections around to prove it isn't the new disks
> that are at fault and this has confirmed that it's whichever devices are
> connected to the SATA controller that are having the problem.
>
> When I rebooted the machine after the latest failure I checked
> /var/adm/messages and there are thousands (9995 in total but that may be
> from several reboots) of messages identical to the following:
>
> "[ID 954099 kern.info] NOTICE: IRQ19 is being shared by drivers with
> different interrupt levels."
>
> In case it's useful:
>
> cs2dsb@chronos:~$ echo ::interrupts -d | pfexec mdb -k
> IRQ  Vect IPL Bus Trg Type  CPU Share APIC/INT# Driver Name(s)
> 9    0x80 9   PCI Lvl Fixed 1   1     0x0/0x9   acpi_wrapper_isr
> 11   0xd1 14  PCI Lvl Fixed 2   1     0x0/0xb   hpet_isr
> 16   0x84 9   PCI Lvl Fixed 7   1     0x0/0x10  uhci#0
> 18   0x82 9   PCI Lvl Fixed 5   2     0x0/0x12  uhci#5, ehci#0
> 19   0x86 9   PCI Lvl Fixed 3   6     0x0/0x13  uhci#4, uhci#2, pci-ide#0,
>                                                 pci-ide#1, pci-ide#1, pci-ide#0
> 21   0x85 9   PCI Lvl Fixed 0   1     0x0/0x15  uhci#1
> 23   0x83 9   PCI Lvl Fixed 6   2     0x0/0x17  uhci#3, ehci#1
> 24   0x81 7   PCI Edg MSI   4   1     -         pcieb#4
> 25   0x60 6   PCI Edg MSI   1   1     -         e1000g#0
> 26   0x61 6   PCI Edg MSI   2   1     -         e1000g#1
> 27   0x40 5   PCI Edg MSI   3   1     -         mpt#0
> 32   0x20 2       Edg IPI   all 1     -         cmi_cmci_trap
> 160  0xa0 0       Edg IPI   all 0     -         poke_cpu
> 208  0xd0 14      Edg IPI   all 1     -         kcpc_hw_overflow_intr
> 209  0xd3 14      Edg IPI   all 1     -         cbe_fire
> 210  0xd4 14      Edg IPI   all 1     -         cbe_fire
> 240  0xe0 15      Edg IPI   all 1     -         xc_serv
> 241  0xe1 15      Edg IPI   all 1     -         apic_error_intr
>
> cs2dsb@chronos:~$ echo ::interrupts | pfexec mdb -k
> IRQ  Vect IPL Bus Trg Type  CPU Share APIC/INT# ISR(s)
> 9    0x80 9   PCI Lvl Fixed 1   1     0x0/0x9   acpi_wrapper_isr
> 11   0xd1 14  PCI Lvl Fixed 2   1     0x0/0xb   hpet_isr
> 16   0x84 9   PCI Lvl Fixed 7   1     0x0/0x10  uhci_intr
> 18   0x82 9   PCI Lvl Fixed 5   2     0x0/0x12  uhci_intr, ehci_intr
> 19   0x86 9   PCI Lvl Fixed 3   6     0x0/0x13  uhci_intr, uhci_intr,
>                                                 ata_intr, ata_intr, ata_intr, ata_intr
> 21   0x85 9   PCI Lvl Fixed 0   1     0x0/0x15  uhci_intr
> 23   0x83 9   PCI Lvl Fixed 6   2     0x0/0x17  uhci_intr, ehci_intr
> 24   0x81 7   PCI Edg MSI   4   1     -         pcieb_intr_handler
> 25   0x60 6   PCI Edg MSI   1   1     -         e1000g_intr_pciexpress
> 26   0x61 6   PCI Edg MSI   2   1     -         e1000g_intr_pciexpress
> 27   0x40 5   PCI Edg MSI   3   1     -         mpt_intr
> 32   0x20 2       Edg IPI   all 1     -         cmi_cmci_trap
> 160  0xa0 0       Edg IPI   all 0     -         poke_cpu
> 208  0xd0 14      Edg IPI   all 1     -         kcpc_hw_overflow_intr
> 209  0xd3 14      Edg IPI   all 1     -         cbe_fire
> 210  0xd4 14      Edg IPI   all 1     -         cbe_fire
> 240  0xe0 15      Edg IPI   all 1     -         xc_serv
> 241  0xe1 15      Edg IPI   all 1     -         apic_error_intr
>
> So, basically two questions:
>
> 1. How do I fix this IRQ issue so that I don't get those warnings during
>    boot up?
> 2. Is this likely to be the cause of the drive problems described above?
>
> Any advice would be much appreciated.
>
> Thanks,
>
> Daniel
> _______________________________________________
> OpenIndiana-discuss mailing list
> [email protected]
> http://openindiana.org/mailman/listinfo/openindiana-discuss
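
For anyone wanting to repeat the interrupt check after a BIOS change (for example after trying AHCI mode), here is a rough sketch that picks out the IRQ lines shared by more than one distinct driver in `::interrupts -d` output. The function name `shared_irqs` is made up for illustration, and the parsing assumes drivers are printed as `name#instance` with each IRQ on a single line, as in the listing quoted in this thread. It knows nothing about each driver's interrupt level, so it only narrows down the candidates; it does not prove a conflict.

```sh
#!/bin/sh
# Hypothetical helper (not part of mdb or OpenIndiana): reads `::interrupts -d`
# style output on stdin and prints every IRQ that has more than one distinct
# driver attached to it.
# Intended use on a live system (assumption, untested here):
#   echo ::interrupts -d | pfexec mdb -k | shared_irqs
shared_irqs() {
    awk '
    # Only data rows start with a numeric IRQ; the header row is skipped.
    $1 ~ /^[0-9]+$/ {
        n = 0
        list = ""
        split("", seen)                        # reset the per-row driver set
        for (i = 2; i <= NF; i++) {
            tok = $i
            sub(/,$/, "", tok)                 # drop the list separator
            if (tok !~ /#[0-9]+$/) continue    # keep only name#instance fields
            sub(/#[0-9]+$/, "", tok)           # uhci#4 becomes uhci
            if (!(tok in seen)) {
                seen[tok] = 1
                if (n > 0) list = list ","
                list = list tok
                n++
            }
        }
        if (n > 1) print "IRQ " $1 ": " list
    }'
}
```

Fed the `-d` listing above, this should report IRQ 18 and IRQ 23 (uhci with ehci) as well as IRQ 19 (uhci with pci-ide); only IRQ 19 is the one the kernel notice complains about, so the other two are a reminder that sharing by itself is not necessarily a problem.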
