Hi,

I am doing some XDP testing on a dual socket, combined 40 core machine
with ixgbe. I have found that with the default settings, depending on
which core a packet is received on, the xdp tx queue will hang with:

  ixgbe 0000:01:00.0 eno1: Detected Tx Unit Hang (XDP)
    Tx Queue             <38>
    TDH, TDT             <0>, <8>
    next_to_use          <8>
    next_to_clean        <0>
  tx_buffer_info[next_to_clean]
    time_stamp           <0>
    jiffies              <101f21bb8>
  ixgbe 0000:01:00.0 eno1: tx hang 1 detected on queue 38, resetting adapter
  ixgbe 0000:01:00.0 eno1: initiating reset due to tx timeout
  ixgbe 0000:01:00.0 eno1: Reset adapter

When the received core is such that the xdp queue falls beyond
MAX_TX_QUEUES, then the hang results. In other words, if I leave
`ethtool -L eno1 combined 40` (the default), and a packet is received on
core 24 or greater, it hangs. However, if I lower the tx queue count to
24 (since XDP is forced to nr_cpu_ids), or if I force the incoming
packets onto core < 24 with an ntuple filter, then no hang occurs.

I imagine that some limits on the number of queues is in order here, or
some error reporting when loading the xdp program/allocating queues.

For now I am working around by lowering the rx queue count to leave
space for the xdp queues.

Thanks,
Brenden

Reply via email to