On Mon, Oct 11, 2010 at 03:53:31PM -0700, Steve Kargl wrote:
> It seems recent changes to the bge driver are causing
> some problems with my hardware where the watchdog is
> now timing out.
>
> /var/log/messages contains
>
> 14:23:14 kernel: SMP: AP CPU #1 Launched!
> 14:23:14 kernel: Trying to mount root from ufs:/dev/ad6s1a
> 14:23:15 kernel: bge1: link state changed to UP
> 14:23:15 lpd[1190]: lpd startup: logging=0
> 14:23:15 ntpd[1224]: ntpd 4.2.4p5-a (1)
> 14:23:15 kernel: bge0: link state changed to UP
> 14:23:24 ntpd[1225]: time reset -0.677316 s
> 14:23:24 ntpd[1225]: kernel time sync status change 2001
> 14:31:01 kernel: bge0: watchdog timeout -- resetting
> 14:31:01 kernel: bge0: link state changed to DOWN
> 14:31:02 kernel: Limiting icmp unreach response from 613 to 200 packets/sec
> 14:31:04 ntpd[1225]: sendto(140.142.2.8) (fd=22): No route to host
> 14:31:04 kernel: bge0: link state changed to UP
> 14:31:30 kernel: Limiting icmp unreach response from 205 to 200 packets/sec
> 14:31:31 kernel: Limiting icmp unreach response from 203 to 200 packets/sec
> 15:40:11 su: kargl to root on /dev/pts/0
> 15:40:35 kernel: bge0: link state changed to DOWN
> 15:40:38 kernel: bge0: link state changed to UP
>
> The last 2 bge messages are from me manually using
> ifconfig to bring my net connect back to life.
>
> troutmask:kargl[206] sysctl -a | grep bge.0
> dev.bge.0.%desc: Broadcom Gigabit Ethernet Controller, ASIC rev. 0x002100
> dev.bge.0.%driver: bge
> dev.bge.0.%location: slot=9 function=0 handle=\_SB_.PCI0.GOLA.GLAN
> dev.bge.0.%pnpinfo: vendor=0x14e4 device=0x1648 subvendor=0x14e4 subdevice=0x1644 class=0x020000
> dev.bge.0.%parent: pci2
> dev.bge.0.forced_collapse: 0
> dev.bge.0.forced_udpcsum: 0
> dev.bge.0.stats.FramesDroppedDueToFilters: 0
> dev.bge.0.stats.DmaWriteQueueFull: 0
> dev.bge.0.stats.DmaWriteHighPriQueueFull: 0
> dev.bge.0.stats.NoMoreRxBDs: 0
> dev.bge.0.stats.InputDiscards: 0
> dev.bge.0.stats.InputErrors: 0
> dev.bge.0.stats.RecvThresholdHit: 325
> dev.bge.0.stats.DmaReadQueueFull: 0
> dev.bge.0.stats.DmaReadHighPriQueueFull: 0
> dev.bge.0.stats.SendDataCompQueueFull: 0
> dev.bge.0.stats.RingSetSendProdIndex: 469
> dev.bge.0.stats.RingStatusUpdate: 330
> dev.bge.0.stats.Interrupts: 330
> dev.bge.0.stats.AvoidedInterrupts: 0
> dev.bge.0.stats.SendThresholdHit: 0
> dev.bge.0.stats.rx.ifHCInOctets: 569452
> dev.bge.0.stats.rx.Fragments: 0
> dev.bge.0.stats.rx.UnicastPkts: 497
> dev.bge.0.stats.rx.MulticastPkts: 1
> dev.bge.0.stats.rx.FCSErrors: 0
> dev.bge.0.stats.rx.AlignmentErrors: 0
> dev.bge.0.stats.rx.xonPauseFramesReceived: 0
> dev.bge.0.stats.rx.xoffPauseFramesReceived: 0
> dev.bge.0.stats.rx.ControlFramesReceived: 0
> dev.bge.0.stats.rx.xoffStateEntered: 0
> dev.bge.0.stats.rx.FramesTooLong: 0
> dev.bge.0.stats.rx.Jabbers: 0
> dev.bge.0.stats.rx.UndersizePkts: 0
> dev.bge.0.stats.rx.inRangeLengthError: 0
> dev.bge.0.stats.rx.outRangeLengthError: 0
> dev.bge.0.stats.tx.ifHCOutOctets: 39056
> dev.bge.0.stats.tx.Collisions: 0
> dev.bge.0.stats.tx.XonSent: 0
> dev.bge.0.stats.tx.XoffSent: 0
> dev.bge.0.stats.tx.flowControlDone: 0
> dev.bge.0.stats.tx.InternalMacTransmitErrors: 0
> dev.bge.0.stats.tx.SingleCollisionFrames: 0
> dev.bge.0.stats.tx.MultipleCollisionFrames: 0
> dev.bge.0.stats.tx.DeferredTransmissions: 0
> dev.bge.0.stats.tx.ExcessiveCollisions: 0
> dev.bge.0.stats.tx.LateCollisions: 0
> dev.bge.0.stats.tx.UnicastPkts: 468
> dev.bge.0.stats.tx.MulticastPkts: 0
> dev.bge.0.stats.tx.BroadcastPkts: 1
> dev.bge.0.stats.tx.CarrierSenseErrors: 0
> dev.bge.0.stats.tx.Discards: 0
> dev.bge.0.stats.tx.Errors: 0
> dev.bge.0.wake: 0
>
> In the time that it's taken me to compose this message
> the timeout has fire again.
>
> 15:47:01 kernel: Limiting icmp unreach response from 662 to 200 packets/sec
> 15:47:02 kernel: Limiting icmp unreach response from 446 to 200 packets/sec
> 15:47:03 kernel: Limiting icmp unreach response from 436 to 200 packets/sec
> 15:47:04 kernel: Limiting icmp unreach response from 464 to 200 packets/sec
> 15:47:05 kernel: Limiting icmp unreach response from 438 to 200 packets/sec
> 15:47:06 kernel: Limiting icmp unreach response from 445 to 200 packets/sec
> 15:47:07 kernel: bge0: watchdog timeout -- resetting
> 15:47:07 kernel: bge0: link state changed to DOWN
> 15:47:07 kernel: Limiting icmp unreach response from 439 to 200 packets/sec
> 15:47:08 kernel: Limiting icmp unreach response from 330 to 200 packets/sec
> 15:47:11 kernel: bge0: link state changed to UP
> 15:47:12 kernel: Limiting icmp unreach response from 214 to 200 packets/sec
> 15:47:13 kernel: Limiting icmp unreach response from 202 to 200 packets/sec
> 15:47:14 kernel: Limiting icmp unreach response from 238 to 200 packets/sec
> 15:49:42 kernel: bge0: link state changed to DOWN
> 15:49:44 kernel: bge0: link state changed to UP
>
> I not seen these icmp unreach response messages.
>

The icmp unreach messages have nothing to do with bge(4). Check whether
a server that listens on a UDP port is still alive on your box.

What worries me is the bge(4) watchdog timeouts. It looks like your
controller is a BCM5704. I also have a bge(4) regression report from
marius on sparc64. He said r213945 seemed to cause the issue, and I'm
working on it. Could you also try the attached patch?

Index: sys/dev/bge/if_bge.c
===================================================================
--- sys/dev/bge/if_bge.c	(revision 213695)
+++ sys/dev/bge/if_bge.c	(working copy)
@@ -1619,9 +1619,6 @@
 	CSR_WRITE_4(sc, BGE_RX_STD_RCB_MAXLEN_FLAGS, rcb->bge_maxlen_flags);
 	CSR_WRITE_4(sc, BGE_RX_STD_RCB_NICADDR, rcb->bge_nicaddr);
 
-	/* Reset the standard receive producer ring producer index. */
-	bge_writembx(sc, BGE_MBX_RX_STD_PROD_LO, 0);
-
 	/*
 	 * Initialize the jumbo RX producer ring control
 	 * block. We set the 'ring disabled' bit in the
@@ -1665,6 +1662,9 @@
 		bge_writembx(sc, BGE_MBX_RX_MINI_PROD_LO, 0);
 	}
 
+	/* Reset the standard receive producer ring producer index. */
+	bge_writembx(sc, BGE_MBX_RX_STD_PROD_LO, 0);
+
 	/*
 	 * The BD ring replenish thresholds control how often the
 	 * hardware fetches new BD's from the producer rings in host
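
For the UDP check mentioned above, here is a rough sketch of one way to
probe whether anything still holds a given UDP port on the local box. It
is only an illustration: the helper name udp_port_in_use and the port
numbers in the example loop are assumptions, not something taken from
the report. The idea is simply that bind() fails with EADDRINUSE when
another process already owns the port.

```python
import errno
import socket


def udp_port_in_use(port, host="0.0.0.0"):
    """Return True if the UDP port is already bound, False if it is free.

    Hypothetical helper for illustration. Other errors (for example
    EACCES on privileged ports when not root) are re-raised.
    """
    probe = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    try:
        probe.bind((host, port))
    except OSError as exc:
        if exc.errno == errno.EADDRINUSE:
            return True
        raise
    else:
        return False
    finally:
        probe.close()


if __name__ == "__main__":
    # Example ports only (assumed): 123 is NTP, 514 is syslog.
    for port in (123, 514):
        try:
            state = "in use" if udp_port_in_use(port) else "free"
        except PermissionError:
            state = "unknown (need root to probe a privileged port)"
        print(f"udp/{port}: {state}")
```

A port that reports "free" while remote hosts keep sending to it would
be consistent with the kernel generating (and rate limiting) ICMP
unreachable replies, but that is an inference to verify, not a
diagnosis.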
