The discussion about tc action reminded me of something else I wanted
to take care of soon. Some of the non-work-conserving
qdiscs (HFSC, TBF, netem) need to peek at the next packet when
throttling to calculate the wake-up timeout. This is currently
done by dequeueing a packet, looking at it (usually the size), and
requeueing it. This only works properly when the inner qdisc doesn't
reorder packets, otherwise it might hand out a different packet when
dequeued again after waking up, which results in either waking
up too early (when the packet is larger) or underutilization (when the
packet is smaller). To correctly deal with this, we need a peek
operation that guarantees that the next packet dequeued will be the
one peeked at, even if a higher priority packet arrives. This will
increase the worst-case latency by the transmission time of one
full-sized packet for reordering qdiscs, but the same can happen
today, and this way at least there is no underutilization.
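
To make the intended semantics concrete, here is a rough userspace
sketch in C. It is not kernel code: struct pkt, struct toyq, the
toyq_*() helpers and wait_usecs() are all made-up names for
illustration, and the rate is assumed to be in bytes per second. The
toy queue reorders by priority, and toyq_peek() pins the packet it
returns so that the next toyq_dequeue() hands out exactly that packet.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical toy types; NOT the kernel's struct sk_buff / struct Qdisc. */
struct pkt {
	unsigned int len;	/* packet size in bytes */
	int prio;		/* lower value = higher priority */
};

#define TOYQ_MAX 16

struct toyq {
	struct pkt *slots[TOYQ_MAX];	/* queued packets in arrival order */
	size_t n;			/* number of packets in slots[] */
	struct pkt *peeked;		/* packet pinned by toyq_peek() */
};

static int toyq_enqueue(struct toyq *q, struct pkt *p)
{
	if (q->n == TOYQ_MAX)
		return -1;
	q->slots[q->n++] = p;
	return 0;
}

/* Remove and return the highest priority packet (oldest first on ties). */
static struct pkt *toyq_pick(struct toyq *q)
{
	size_t best = 0, i;
	struct pkt *p;

	if (q->n == 0)
		return NULL;
	for (i = 1; i < q->n; i++)
		if (q->slots[i]->prio < q->slots[best]->prio)
			best = i;
	p = q->slots[best];
	for (i = best; i + 1 < q->n; i++)	/* close the gap */
		q->slots[i] = q->slots[i + 1];
	q->n--;
	return p;
}

/* Peek: choose the next packet once and pin it until it is dequeued. */
static struct pkt *toyq_peek(struct toyq *q)
{
	if (!q->peeked)
		q->peeked = toyq_pick(q);
	return q->peeked;
}

/* Dequeue: a pinned packet always wins, even over later arrivals. */
static struct pkt *toyq_dequeue(struct toyq *q)
{
	struct pkt *p = q->peeked;

	if (p) {
		q->peeked = NULL;
		return p;
	}
	return toyq_pick(q);
}

/* Microseconds (rounded up) until 'tokens' bytes of credit cover 'len'. */
static unsigned long long wait_usecs(unsigned int len,
				     unsigned long long tokens,
				     unsigned long long rate)
{
	assert(rate > 0);	/* assumption: rate in bytes/s, nonzero */
	if (tokens >= len)
		return 0;
	return ((len - tokens) * 1000000ULL + rate - 1) / rate;
}
```

A throttling qdisc would call toyq_peek(), feed the packet length to
something like wait_usecs() to arm its timer, and call toyq_dequeue()
after waking up. If a higher priority packet is enqueued in between,
it goes out one packet later, which is the latency cost mentioned
above.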

There are basically two ways to implement this. The less intrusive,
but IMO more hackish, one is to handle it inside the qdiscs that
need the operation: instead of requeueing the packet to the inner
qdisc, they keep a private reference to it. The disadvantage is that
this distorts statistics and estimators; the classful qdisc would,
for example, have more packets queued than the sum of all its inner
qdiscs. The other possibility is to introduce a ->peek operation or
a flag to ->dequeue and handle it within the reordering qdiscs. I
think we only need to implement it for non-classful (or single-class)
qdiscs. I can't imagine why anyone would add a scheduler as inner
qdisc to a different scheduler, at least with the current ones.
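
The statistics difference between the two options can be shown with
another small userspace sketch in C. Again, nothing here is an
existing kernel interface: struct inner, struct parent and the
functions below are hypothetical stand-ins. The inner queue is a plain
FIFO to keep the example short, so its peek is trivial; a reordering
qdisc would additionally have to pin the packet it returns.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical toy types; NOT the kernel's qdisc structures. */
struct pkt {
	unsigned int len;
};

#define INNER_MAX 8

struct inner {
	struct pkt *ring[INNER_MAX];
	size_t head;
	size_t qlen;		/* what the inner qdisc reports as queued */
};

struct parent {
	struct inner *child;
	size_t qlen;		/* what the parent reports as queued */
	struct pkt *held;	/* private reference (first option only) */
};

static int inner_enqueue(struct inner *q, struct pkt *p)
{
	if (q->qlen == INNER_MAX)
		return -1;
	q->ring[(q->head + q->qlen) % INNER_MAX] = p;
	q->qlen++;
	return 0;
}

static struct pkt *inner_dequeue(struct inner *q)
{
	struct pkt *p;

	if (q->qlen == 0)
		return NULL;
	p = q->ring[q->head];
	q->head = (q->head + 1) % INNER_MAX;
	q->qlen--;
	return p;
}

/* Second option: the inner qdisc exposes its head without removing it. */
static struct pkt *inner_peek(struct inner *q)
{
	return q->qlen ? q->ring[q->head] : NULL;
}

static int parent_enqueue(struct parent *s, struct pkt *p)
{
	if (inner_enqueue(s->child, p) < 0)
		return -1;
	s->qlen++;
	return 0;
}

/* First option: dequeue from the child and keep a private reference. */
static struct pkt *parent_peek_private(struct parent *s)
{
	if (!s->held)
		s->held = inner_dequeue(s->child);
	return s->held;
}

/* Second option: delegate to the child, which keeps counting the packet. */
static struct pkt *parent_peek_delegated(struct parent *s)
{
	return inner_peek(s->child);
}

static struct pkt *parent_dequeue(struct parent *s)
{
	struct pkt *p = s->held;

	if (p)
		s->held = NULL;
	else
		p = inner_dequeue(s->child);
	if (p)
		s->qlen--;
	return p;
}
```

With two packets queued, parent_peek_delegated() leaves both counters
at 2, while parent_peek_private() leaves the parent at 2 and the child
at 1 until the held packet is finally dequeued. That mismatch is the
distortion described above.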

Any preferences or suggestions? Otherwise I'll go with the second
possibility.
