On Mon, 13 Jun 2011, Hans Petter Selasky wrote:
On Sunday 12 June 2011 23:50:24 Charles Sprickman wrote:
On Sun, 12 Jun 2011, Hans Petter Selasky wrote:
On Saturday 11 June 2011 23:43:11 Charles Sprickman wrote:
Hello,
We ran into an odd problem last week with our serial consoles after
moving the USB to serial adapters from an old 4.11 box to a box running
8.1. We have two boxes that incorporate (I assume) hubs and a bunch of
FTDI serial interfaces. One has 16 ports, the other 8. Each is
plugged directly into a USB port on the rear of the mainboard. We run
conserver[1] to handle access to the serial ports. From what I've
observed, this application opens the ports when the daemon starts - it
logs any output (handy for panics, or anything else that might spit
interesting info to the console) and waits for clients to connect to
it.
Everything had been working fine for a few weeks. The box was rebooted
recently to enable PostgreSQL to start normally (bumped SHM stuff in
loader.conf). After six days, we found that the consoles were
unresponsive. Restarting conserver brought us this each time we
connected to a console for full read/write access:
[Thu Jun 9 10:04:59 2011] conserver (50113): ERROR: [h22]
open(/dev/ttyU4): Interrupted system call: forcing down
[Thu Jun 9 10:04:59 2011] conserver (50112): ERROR: [h21]
open(/dev/ttyU11): Interrupted system call: forcing down
All devices still appeared in /dev. Stopping conserver and confirming
it and all child processes were gone and then using picocom and cu
yielded no response on the serial ports.
We also found (after the fact) that around the time the consoles became
unresponsive, cpu usage went to nearly 90% and was mostly in the kernel
process "intr":
root 12 70.5 0.0 0 136 ?? WL Fri12AM 120:01.47 [intr]
A graph showing cpu usage (red is "system"):
http://i.imgur.com/0yO5l.png
I should note that we know the cpu spike and devices becoming
unresponsive can be correlated because one of the serial ports runs a
temperature monitor which is tied into our monitoring. When the data
goes stale, we get notified.
Issuing a "usbconfig -u 0 reset" caused all devices except for the root
hub to disappear and not come back. CPU usage also dipped a bit after
that. Rebooting was the only way to resolve the issue - perhaps
plugging and unplugging would have worked, but that's a bit too complex
for our remote hands.
I can supply full dmesg and more, but for now, here's a summary of the
usb info from dmesg:
FreeBSD 8.1-RELEASE #7: Wed Dec 22 00:49:50 EST 2010
ohci0: <OHCI (generic) USB controller> mem 0xfe9fc000-0xfe9fcfff irq 10
at device 15.2 on pci0
ohci0: [ITHREAD]
...
usbus0: <OHCI (generic) USB controller> on ohci0
usbus0: 12Mbps Full Speed USB v1.0
ugen0.1: <(0x1166)> at usbus0
...
uhub0: <(0x1166) OHCI root HUB, class 9/0, rev 1.00/1.00, addr 1> on
usbus0
uhub0: 4 ports with 4 removable, self powered
ugen0.2: <vendor 0x05e3> at usbus0
uhub1: <vendor 0x05e3 USB Hub, class 9/0, rev 1.01/0.11, addr 2> on
usbus0 uhub1: 7 ports with 7 removable, self powered
ugen0.3: <FTDI> at usbus0
uftdi0: <USB FAST SERIAL ADAPTER> on usbus0
uftdi1: <USB FAST SERIAL ADAPTER> on usbus0
ugen0.4: <FTDI> at usbus0
uftdi2: <USB FAST SERIAL ADAPTER> on usbus0
uftdi3: <USB FAST SERIAL ADAPTER> on usbus0
ugen0.5: <FTDI> at usbus0
uftdi4: <USB FAST SERIAL ADAPTER> on usbus0
uftdi5: <USB FAST SERIAL ADAPTER> on usbus0
ugen0.6: <FTDI> at usbus0
uftdi6: <USB FAST SERIAL ADAPTER> on usbus0
uftdi7: <USB FAST SERIAL ADAPTER> on usbus0
ugen0.7: <FTDI> at usbus0
uftdi8: <USB FAST SERIAL ADAPTER> on usbus0
uftdi9: <USB FAST SERIAL ADAPTER> on usbus0
ugen0.8: <FTDI> at usbus0
uftdi10: <USB FAST SERIAL ADAPTER> on usbus0
uftdi11: <USB FAST SERIAL ADAPTER> on usbus0
ugen0.9: <vendor 0x05e3> at usbus0
uhub2: <vendor 0x05e3 USB Hub, class 9/0, rev 1.01/0.12, addr 9> on
usbus0 uhub2: 4 ports with 4 removable, self powered
ugen0.10: <FTDI> at usbus0
uftdi12: <USB FAST SERIAL ADAPTER> on usbus0
uftdi13: <USB FAST SERIAL ADAPTER> on usbus0
ugen0.11: <FTDI> at usbus0
... (mangling below is as it appears in dmesg)
da1 at sym0 bus 0 scbus0 target 1 lun 0uftdi14: <USB FAST SERIAL
ADAPTER> on usbus0
da1: <SEAGATE ST336807LW 0C01> Fixed Direct Access SCSI-3 device
uftdi15: <USB FAST SERIAL ADAPTER> on usbus0
...
Root mount waiting for: usbus0
ugen0.12: <vendor 0x0409> at usbus0
uhub3: <vendor 0x0409 product 0x0050, class 9/0, rev 2.00/1.00, addr 12>
on usbus0
Root mount waiting for: usbus0
uhub3: 7 ports with 7 removable, self powered
ugen0.13: <FTDI> at usbus0
uftdi16: <FT232R USB UART> on usbus0
Root mount waiting for: usbus0
ugen0.14: <FTDI> at usbus0
uftdi17: <FT232R USB UART> on usbus0
Root mount waiting for: usbus0
ugen0.15: <FTDI> at usbus0
uftdi18: <FT232R USB UART> on usbus0
ugen0.16: <FTDI> at usbus0Root mount waiting for:
usbus0
uftdi19: <FT232R USB UART> on usbus0
ugen0.17: <FTDI> at usbus0
uftdi20: <FT232R USB UART> on usbus0
Root mount waiting for: usbus0
ugen0.18: <FTDI> at usbus0
uftdi21: <FT232R USB UART> on usbus0
Root mount waiting for: usbus0
ugen0.19: <vendor 0x0409> at usbus0
uhub4: <vendor 0x0409 product 0x005a, class 9/0, rev 2.00/1.00, addr 19>
on usbus0
uhub4: 4 ports with 4 removable, self powered
Root mount waiting for: usbus0
ugen0.20: <FTDI> at usbus0
uftdi22: <FT232R USB UART> on usbus0
Root mount waiting for: usbus0
ugen0.21: <FTDI> at usbus0
uftdi23: <FT232R USB UART> on usbus0
Trying to mount root from zfs:zroot
Thanks,
Charles
Hi,
Try to get output from vmstat -i.
Also try to set the:
hw.usb.ehci.iaadbug=1
and
hw.usb.ehci.lostintrbug=1
in /boot/loader.conf
Quick question - this host only has USB 1.1 - so I don't think I have
ehci, do I? To be honest, I'm not totally clear on what's what as far as
uhci, ehci, and ohci. This is what's seen in sysctl output related to
USB:
[spork@h12 ~]$ sysctl -a|grep usb|more
hw.pci.usb_early_takeover: 1
hw.usb.no_boot_wait: 0
hw.usb.debug: 0
hw.usb.usb_lang_mask: 255
hw.usb.usb_lang_id: 9
hw.usb.template: 0
hw.usb.power_timeout: 30
hw.usb.ucom.cons_baud: 9600
hw.usb.ucom.cons_unit: -1
dev.usbus.0.%desc: OHCI (generic) USB controller
dev.usbus.0.%driver: usbus
dev.usbus.0.%parent: ohci0
dev.uhub.0.%parent: usbus0
Ok, then those quirks won't help.
For OHCI, I guess you should check vmstat -i.
Oddly enough, the box paniced today, but it appeared to be related to fxp.
However in the coredump summary, I have "vmstat -i" output, and ohci seems
fairly high in comparison to everything else:
vmstat -i
interrupt total rate
irq4: uart0 106 0
irq10: ohci0 142322001 968176
irq14: ata0 1178 8
irq20: fxp0 3008691 20467
irq21: fxp1 1733357 11791
irq28: sym1 30 0
irq29: sym0 2624749 17855
cpu0: timer 728063100 4952810
cpu1: timer 728044684 4952684
Total 1605797896 10923795
Also, just a brief summary of the panic, since it mentions the interrupt
process again:
#7 0x8059139b in fxp_new_rfabuf (sc=0x8564c000, rxp=0x8564c1c0)
at /usr/src/sys/dev/fxp/if_fxp.c:2611
#8 0x8059285b in fxp_intr (xsc=0x8564c000)
at /usr/src/sys/dev/fxp/if_fxp.c:1931
#9 0x8067b1db in intr_event_execute_handlers (p=0x8553d7f8,
ie=0x8557d080)
at /usr/src/sys/kern/kern_intr.c:1220
#10 0x8067c8eb in ithread_loop (arg=0x856525d0)
at /usr/src/sys/kern/kern_intr.c:1233
#11 0x80678f11 in fork_exit (callout=0x8067c880 <ithread_loop>,
arg=0x856525d0, frame=0xd80e7d38) at /usr/src/sys/kern/kern_fork.c:844
#12 0x80931de0 in fork_trampoline () at
/usr/src/sys/i386/i386/exception.s:270
And also unrelated to usb, but fairly bizarre "netstat -m" output:
18446744073709550887/1355/626/25600 mbuf clusters in use
(current/cache/total/max)
18014398509480560K/3497K/2073K bytes allocated to network
(current/cache/total)
Sorry for all the extra noise, but I'm not adept enough at determining
whether this panic was usb related or fxp related.
Thanks,
Charles
--HPS
_______________________________________________
[email protected] mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-usb
To unsubscribe, send any mail to "[email protected]"