On Tue, Aug 21, 2018 at 9:59 AM Nikita V. Shirokov <tehn...@tehnerd.com> wrote:
>
> On Tue, Aug 21, 2018 at 08:58:15AM -0700, Alexander Duyck wrote:
> > On Mon, Aug 20, 2018 at 12:32 PM Nikita V. Shirokov <tehn...@tehnerd.com> wrote:
> > >
> > > we are getting such errors:
> > >
> > > [ 408.737313] ixgbe 0000:03:00.0 eth0: Detected Tx Unit Hang (XDP)
> > >   Tx Queue             <46>
> > >   TDH, TDT             <0>, <2>
> > >   next_to_use          <2>
> > >   next_to_clean        <0>
> > > tx_buffer_info[next_to_clean]
> > >   time_stamp           <0>
> > >   jiffies              <1000197c0>
> > > [ 408.804438] ixgbe 0000:03:00.0 eth0: tx hang 1 detected on queue 46, resetting adapter
> > > [ 408.804440] ixgbe 0000:03:00.0 eth0: initiating reset due to tx timeout
> > > [ 408.817679] ixgbe 0000:03:00.0 eth0: Reset adapter
> > > [ 408.866091] ixgbe 0000:03:00.0 eth0: TXDCTL.ENABLE for one or more queues not cleared within the polling period
> > > [ 409.345289] ixgbe 0000:03:00.0 eth0: detected SFP+: 3
> > > [ 409.497232] ixgbe 0000:03:00.0 eth0: NIC Link is Up 10 Gbps, Flow Control: RX/TX
> > >
> > > while running XDP prog on ixgbe nic.
> > > right now i'm seeing this on bpf-next kernel
> > > (latest commit from Wed Aug 15 15:04:25 2018 -0700;
> > > 9a76aba02a37718242d7cdc294f0a3901928aa57)
> > >
> > > looks like this is the same issue as reported by Brenden in
> > > https://www.spinics.net/lists/netdev/msg439438.html
> > >
> > > --
> > > Nikita V. Shirokov
> >
> > Could you provide some additional information about your setup?
> > Specifically useful would be "ethtool -i", "ethtool -l", and
> > lspci -vvv info for your device. The total number of CPUs on the
> > system would be useful to know as well. In addition, could you try
> > reproducing
>
> sure:
>
> ethtool -l eth0
> Channel parameters for eth0:
> Pre-set maximums:
> RX:             0
> TX:             0
> Other:          1
> Combined:       63
> Current hardware settings:
> RX:             0
> TX:             0
> Other:          1
> Combined:       48
>
> # ethtool -i eth0
> driver: ixgbe
> version: 5.1.0-k
> firmware-version: 0x800006f1
> expansion-rom-version:
> bus-info: 0000:03:00.0
> supports-statistics: yes
> supports-test: yes
> supports-eeprom-access: yes
> supports-register-dump: yes
> supports-priv-flags: yes
>
> # nproc
> 48
>
> lspci:
>
> 03:00.0 Ethernet controller: Intel Corporation 82599ES 10-Gigabit SFI/SFP+ Network Connection (rev 01)
>         Subsystem: Intel Corporation Device 000d
>         Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr+ Stepping- SERR+ FastB2B- DisINTx+
>         Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
>         Latency: 0, Cache Line Size: 32 bytes
>         Interrupt: pin A routed to IRQ 30
>         NUMA node: 0
>         Region 0: Memory at c7d00000 (64-bit, non-prefetchable) [size=1M]
>         Region 2: I/O ports at 6000 [size=32]
>         Region 4: Memory at c7e80000 (64-bit, non-prefetchable) [size=16K]
>         Expansion ROM at c7e00000 [disabled] [size=512K]
>         Capabilities: [40] Power Management version 3
>                 Flags: PMEClk- DSI+ D1- D2- AuxCurrent=0mA PME(D0+,D1-,D2-,D3hot+,D3cold+)
>                 Status: D0 NoSoftRst- PME-Enable- DSel=0 DScale=1 PME-
>         Capabilities: [50] MSI: Enable- Count=1/1 Maskable+ 64bit+
>                 Address: 0000000000000000  Data: 0000
>                 Masking: 00000000  Pending: 00000000
>         Capabilities: [70] MSI-X: Enable+ Count=64 Masked-
>                 Vector table: BAR=4 offset=00000000
>                 PBA: BAR=4 offset=00002000
>         Capabilities: [a0] Express (v2) Endpoint, MSI 00
>                 DevCap: MaxPayload 512 bytes, PhantFunc 0, Latency L0s <512ns, L1 <64us
>                         ExtTag- AttnBtn- AttnInd- PwrInd- RBE+ FLReset+ SlotPowerLimit 0.000W
>                 DevCtl: Report errors: Correctable+ Non-Fatal+ Fatal+ Unsupported+
>                         RlxdOrd- ExtTag- PhantFunc- AuxPwr- NoSnoop+ FLReset-
>                         MaxPayload 256 bytes, MaxReadReq 512 bytes
>                 DevSta: CorrErr+ UncorrErr- FatalErr- UnsuppReq+ AuxPwr+ TransPend+
>                 LnkCap: Port #2, Speed 5GT/s, Width x8, ASPM L0s, Exit Latency L0s unlimited, L1 <8us
>                         ClockPM- Surprise- LLActRep- BwNot- ASPMOptComp-
>                 LnkCtl: ASPM Disabled; RCB 64 bytes Disabled- CommClk+
>                         ExtSynch- ClockPM- AutWidDis- BWInt- AutBWInt-
>                 LnkSta: Speed 5GT/s, Width x8, TrErr- Train- SlotClk+ DLActive- BWMgmt- ABWMgmt-
>                 DevCap2: Completion Timeout: Range ABCD, TimeoutDis+, LTR-, OBFF Not Supported
>                 DevCtl2: Completion Timeout: 50us to 50ms, TimeoutDis-, LTR-, OBFF Disabled
>                 LnkCtl2: Target Link Speed: 5GT/s, EnterCompliance- SpeedDis-
>                          Transmit Margin: Normal Operating Range, EnterModifiedCompliance- ComplianceSOS-
>                          Compliance De-emphasis: -6dB
>                 LnkSta2: Current De-emphasis Level: -6dB, EqualizationComplete-, EqualizationPhase1-
>                          EqualizationPhase2-, EqualizationPhase3-, LinkEqualizationRequest-
>         Capabilities: [100 v1] Advanced Error Reporting
>                 UESta:  DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt- UnxCmplt- RxOF- MalfTLP- ECRC- UnsupReq- ACSViol-
>                 UEMsk:  DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt- UnxCmplt- RxOF- MalfTLP- ECRC- UnsupReq- ACSViol-
>                 UESvrt: DLP+ SDES- TLP- FCP+ CmpltTO- CmpltAbrt- UnxCmplt- RxOF+ MalfTLP+ ECRC- UnsupReq- ACSViol-
>                 CESta:  RxErr- BadTLP- BadDLLP- Rollover- Timeout- NonFatalErr+
>                 CEMsk:  RxErr- BadTLP- BadDLLP- Rollover- Timeout- NonFatalErr+
>                 AERCap: First Error Pointer: 00, GenCap+ CGenEn- ChkCap+ ChkEn-
>         Capabilities: [140 v1] Device Serial Number 90-e2-ba-ff-ff-b6-b2-60
>         Capabilities: [150 v1] Alternative Routing-ID Interpretation (ARI)
>                 ARICap: MFVC- ACS-, Next Function: 0
>                 ARICtl: MFVC- ACS-, Function Group: 0
>         Capabilities: [160 v1] Single Root I/O Virtualization (SR-IOV)
>                 IOVCap: Migration-, Interrupt Message Number: 000
>                 IOVCtl: Enable- Migration- Interrupt- MSE- ARIHierarchy+
>                 IOVSta: Migration-
>                 Initial VFs: 64, Total VFs: 64, Number of VFs: 0, Function Dependency Link: 00
>                 VF offset: 128, stride: 2, Device ID: 10ed
>                 Supported Page Size: 00000553, System Page Size: 00000001
>                 Region 0: Memory at 00000000c7c00000 (64-bit, prefetchable)
>                 Region 3: Memory at 00000000c7b00000 (64-bit, prefetchable)
>                 VF Migration: offset: 00000000, BIR: 0
>         Kernel driver in use: ixgbe
>
> workaround for now is to do the same, as Brenden did in his original
> finding: make sure that combined + xdp queues < max_tx_queues
> (e.g. w/ combined == 14 the issue goes away).
>
> > the issue with one of the sample XDP programs provided with the kernel
> > such as the xdp2 which I believe uses the XDP_TX function. We need to
> > try and create a similar setup in our own environment for
> > reproduction and debugging.
>
> will try but this could take a while, because i'm not sure that we have
> ixgbe in our test lab (and it would be hard to run such test in prod)
>
> >
> > Thanks.
> >
> > - Alex
>
> --
> Nikita V. Shirokov
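Before I get to the datasheet, here is Nikita's workaround spelled out in command form. This is a sketch only: it uses the eth0 name and the combined == 14 value from this thread, I have not re-tested it myself, and you may need to detach and re-attach the XDP program around the channel change with whatever loader you normally use.

  # Shrink the regular channels so combined + one XDP_TX ring per CPU
  # stays under the Tx queue limit of the device (48 CPUs on this box).
  ethtool -L eth0 combined 14

  # Confirm the new channel layout and the CPU count.
  ethtool -l eth0
  nproc
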
So I have been reading the datasheet
(https://www.intel.com/content/dam/www/public/us/en/documents/datasheets/82599-10-gbe-controller-datasheet.pdf)
and it looks like the assumption Brenden came to in the link referenced earlier is probably correct. From what I can tell there is a limit of 64 queues in the base RSS mode of the device, so while it supports more than 64 queues you can only make use of 64, as per table 7-25.

For now I think the workaround you are using is probably the only viable solution. I myself don't have time to work on resolving this, but I am sure one of the maintainers for ixgbe will be responding shortly.

One possible solution we may want to look at would be to make use of the 32 pool/VF mode in the MTQC register. That should enable us to make use of all 128 queues, but I am sure there would be other side effects, such as having to set the bits in the PFVFTE register in order to enable the extra Tx queues.

Thanks.

- Alex
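P.S. To make the queue accounting concrete, here is a rough pre-flight check. It is only a sketch: it hard-codes eth0, takes the 64-queue limit from table 7-25, and assumes XDP_TX claims one extra Tx ring per CPU, so the box above needs 48 combined + 48 XDP rings = 96 queues, while combined == 14 needs 62 and fits.

  cpus=$(nproc)
  combined=$(ethtool -l eth0 | awk '/^Combined:/ { v = $2 } END { print v }')
  if [ "$((combined + cpus))" -ge 64 ]; then
          echo "combined ($combined) + per-CPU XDP Tx rings ($cpus) will not fit in 64 Tx queues"
          echo "expect Tx hangs with XDP_TX; shrink the channel count first, e.g.:"
          echo "  ethtool -L eth0 combined <N>    # pick N so that N + $cpus < 64"
  fi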