Currently, txq/qdisc selection is based on the flow hash, so packets
from the same flow keep the order in which they entered the qdisc/txq,
which avoids out-of-order delivery.
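
As a rough illustration of why hash-based selection keeps a flow in
order, here is a minimal userspace sketch. It is not the kernel's
selection code: struct flow_key, flow_hash() and pick_txq() are
hypothetical names, and FNV-1a stands in for whatever hash the stack
really uses. The point is only that a fixed flow key always maps to
the same queue index.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical flow key for this sketch (not a kernel structure). */
struct flow_key {
    uint32_t saddr;
    uint32_t daddr;
    uint16_t sport;
    uint16_t dport;
    uint8_t  proto;
};

/* Fold the low nbytes bytes of v into an FNV-1a hash state. */
static uint32_t fnv1a_mix(uint32_t h, uint32_t v, unsigned int nbytes)
{
    for (unsigned int i = 0; i < nbytes; i++) {
        h ^= (v >> (8 * i)) & 0xffu;
        h *= 16777619u;
    }
    return h;
}

/* Stand-in flow hash; fields are hashed one by one to avoid padding. */
static uint32_t flow_hash(const struct flow_key *k)
{
    uint32_t h = 2166136261u;

    h = fnv1a_mix(h, k->saddr, 4);
    h = fnv1a_mix(h, k->daddr, 4);
    h = fnv1a_mix(h, k->sport, 2);
    h = fnv1a_mix(h, k->dport, 2);
    h = fnv1a_mix(h, k->proto, 1);
    return h;
}

/* Same flow key -> same hash -> same queue, so a flow stays in order. */
static unsigned int pick_txq(const struct flow_key *k, unsigned int num_txq)
{
    assert(num_txq > 0);
    return flow_hash(k) % num_txq;
}
```

Calling pick_txq() twice with the same key and the same num_txq
returns the same index, which is the property the current scheme
relies on.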

To improve the concurrency of the QoS algorithm, we plan to have
multiple per-cpu queues for a single TC class and to drain these
queues by busy polling from a per-class thread. If the thread polls
frequently enough, the reordering it introduces should not be too
bad.

In more detail: in the send path we introduce per-cpu, per-class
queues, so that packets of the same class sent from the same core are
enqueued to the same queue. A per-class thread then polls the queues
belonging to its class on all the cpus and aggregates their packets
into another per-class queue. This can effectively reduce contention,
but it inevitably introduces potential out-of-order issues, for
example when packets of one flow are sent from different cores.
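
The scheme described above can be sketched in plain C as follows.
This is a single-threaded userspace model under stated assumptions,
not the planned implementation: NR_CPUS, QUEUE_CAP, struct tc_class,
class_enqueue() and class_poll_once() are all hypothetical names, and
real per-cpu queues would need proper synchronization between the
send path and the polling thread.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define NR_CPUS   4   /* hypothetical CPU count for this sketch */
#define QUEUE_CAP 64  /* hypothetical per-queue capacity */

struct pkt {
    uint32_t flow;  /* flow identifier */
    uint32_t seq;   /* per-flow sequence number */
};

/* Simple FIFO ring; head and tail only ever increase. */
struct pkt_queue {
    struct pkt slots[QUEUE_CAP];
    size_t head;    /* next slot to dequeue */
    size_t tail;    /* next slot to fill */
};

/* One TC class: a queue per cpu plus the aggregated per-class queue. */
struct tc_class {
    struct pkt_queue percpu[NR_CPUS];
    struct pkt_queue agg;
};

static int queue_push(struct pkt_queue *q, struct pkt p)
{
    if (q->tail - q->head == QUEUE_CAP)
        return -1;                      /* queue full */
    q->slots[q->tail % QUEUE_CAP] = p;
    q->tail++;
    return 0;
}

static int queue_pop(struct pkt_queue *q, struct pkt *out)
{
    if (q->head == q->tail)
        return -1;                      /* queue empty */
    *out = q->slots[q->head % QUEUE_CAP];
    q->head++;
    return 0;
}

/* Send path: the packet goes to the queue of the cpu it was sent from. */
static int class_enqueue(struct tc_class *c, unsigned int cpu, struct pkt p)
{
    return queue_push(&c->percpu[cpu % NR_CPUS], p);
}

/*
 * One pass of the per-class polling thread: visit the cpus in index
 * order and move their packets into the aggregated queue. Order within
 * one cpu's queue is preserved; order across cpus is not.
 */
static size_t class_poll_once(struct tc_class *c)
{
    size_t moved = 0;
    struct pkt p;

    for (unsigned int cpu = 0; cpu < NR_CPUS; cpu++) {
        while (c->agg.tail - c->agg.head < QUEUE_CAP &&
               queue_pop(&c->percpu[cpu], &p) == 0) {
            int rc = queue_push(&c->agg, p);

            assert(rc == 0);            /* space was checked above */
            (void)rc;
            moved++;
        }
    }
    return moved;
}
```

In this model, if a flow sends seq 0 from cpu 2 and then seq 1 from
cpu 0 before the next polling pass, the pass visits cpu 0 first and
the aggregated queue ends up holding seq 1 ahead of seq 0. Polling
more often shrinks the window in which this can happen but does not
close it.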

Any concerns or suggestions about working in this direction?
