Hi,
I am doing some XDP testing on a dual-socket machine with 40 cores in
total and an ixgbe NIC. I have found that with the default settings,
depending on which core a packet is received on, the XDP Tx queue hangs
with:

ixgbe 0000:01:00.0 eno1: Detected Tx Unit Hang (XDP)
  Tx Queue             <38>
  TDH, TDT             <0>, <8>
  next_to_use          <8>
  next_to_clean        <0>
tx_buffer_info[next_to_clean]
  time_stamp           <0>
  jiffies              <101f21bb8>
ixgbe 0000:01:00.0 eno1: tx hang 1 detected on queue 38, resetting adapter
ixgbe 0000:01:00.0 eno1: initiating reset due to tx timeout
ixgbe 0000:01:00.0 eno1: Reset adapter

The hang occurs when the receiving core is such that the XDP queue index
falls beyond MAX_TX_QUEUES. In other words, if I leave
`ethtool -L eno1 combined 40` (the default) and a packet is received on
core 24 or higher, it hangs. However, if I lower the Tx queue count to
24 (the XDP queue count is forced to nr_cpu_ids), or if I steer the
incoming packets onto a core < 24 with an ntuple filter, no hang occurs.
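
The arithmetic behind the core-24 threshold can be sketched as follows.
This is only a rough model, not driver code: it assumes MAX_TX_QUEUES is
64 and that the XDP rings are indexed directly after the regular Tx
rings. Both assumptions are inferred from the numbers above
(40 + 24 = 64), not confirmed against the driver source. The helper
names are hypothetical.

```python
# Assumption: hardware Tx queue limit, inferred from 40 + 24 = 64.
MAX_TX_QUEUES = 64
# From the report: 40 cores, so 40 XDP queues (forced to nr_cpu_ids).
NR_CPU_IDS = 40


def xdp_queue_slot(core, tx_queues):
    """Hypothetical model: the XDP ring for `core` sits after all Tx rings."""
    return tx_queues + core


def hanging_cores(tx_queues, nr_cpus=NR_CPU_IDS, limit=MAX_TX_QUEUES):
    """Cores whose XDP ring would land at or beyond the assumed limit."""
    return [c for c in range(nr_cpus) if xdp_queue_slot(c, tx_queues) >= limit]


if __name__ == "__main__":
    # Default `combined 40`: the first affected core under this model.
    print(hanging_cores(40)[0])
    # Tx queue count lowered to 24: no core is affected under this model.
    print(hanging_cores(24))
```

Under these assumptions, 40 regular queues plus 40 XDP queues need 80
slots, so only cores 0 through 23 get an XDP ring below the limit.
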
I imagine that some limit on the number of queues is in order here, or
at least some error reporting when loading the XDP program or allocating
the queues. For now I am working around this by lowering the rx queue
count to leave space for the XDP queues.
Thanks,
Brenden