On Mon, 2007-24-09 at 16:47 -0700, Waskiewicz Jr, Peter P wrote:
> We should make sure we're symmetric with the locking on enqueue to
> dequeue. If we use the single device queue lock on enqueue, then
> dequeue will also need to check that lock in addition to the individual
> queue lock. The details of this are more trivial than the actual
> dequeue to make it efficient though.
It would be interesting to observe the performance implications.

> The dequeue locking would be pushed into the qdisc itself. This is how
> I had it originally, and it did make the code more complex, but it was
> successful at breaking the heavily-contended queue_lock apart. I have a
> subqueue structure right now in netdev, which only has queue_state (for
> netif_{start|stop}_subqueue). This state is checked in sch_prio right
> now in the dequeue for both prio and rr. My approach is to add a
> queue_lock in that struct, so each queue allocated by the driver would
> have a lock per queue. Then in dequeue, that lock would be taken when
> the skb is about to be dequeued.

More locks imply degraded performance. If only one processor can enter
that region, presumably after acquiring the outer lock, why this
secondary lock per queue?

> The skb->queue_mapping field also maps directly to the queue index
> itself, so it can be unlocked easily outside of the context of the
> dequeue function. The policy would be to use a spin_trylock() in
> dequeue, so that dequeue can still do work if enqueue or another dequeue
> is busy.

So there could be a parallel CPU dequeuing at the same time? Wouldn't
this have implications depending on the scheduling algorithm used? If,
for example, I was doing priority queueing, I would want to make sure
the highest priority is being dequeued first AND by all means goes out
first to the driver; I don't want a parallel CPU dequeuing a
lower-priority packet at the same time.

> And the allocation of qdisc queues to device queues is assumed
> to be one-to-one (that's how the qdisc behaves now).

OK, that brings back the discussion we had: my thinking was that
something like dev->hard_prep_xmit() would select the ring, and I think
you already map the ring to a qdisc queue statically. So I don't think
dev->hard_prep_xmit() is useful to you.

In any case, there is nothing the batching patches do that interferes
with or prevents you from going down the path you intend to: instead of
dequeuing one packet, you dequeue several, and instead of sending one
packet to the driver, you send several. And using the xmit_win, you
should never ever have to requeue.

cheers,
jamal
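
P.S. So we are sure we are reading the trylock idea the same way, here
is a rough sketch of what I understand a per-queue-lock dequeue for a
prio-style qdisc to look like. The struct and field names
(band_lock_sketch, prio_sketch, qlist, nbands) are invented for
illustration and are not from your patch:

#include <linux/skbuff.h>
#include <linux/spinlock.h>

struct band_lock_sketch {
	spinlock_t		lock;	/* the per-queue lock under discussion */
	struct sk_buff_head	qlist;	/* packets destined for this ring */
};

struct prio_sketch {
	int			nbands;
	struct band_lock_sketch	bands[16];
};

static struct sk_buff *prio_dequeue_sketch(struct prio_sketch *q)
{
	int band;

	for (band = 0; band < q->nbands; band++) {
		struct band_lock_sketch *b = &q->bands[band];
		struct sk_buff *skb;

		/* if another cpu holds this queue's lock, skip the band
		 * rather than spin; this is the spin_trylock() policy */
		if (!spin_trylock(&b->lock))
			continue;

		skb = __skb_dequeue(&b->qlist);
		spin_unlock(&b->lock);
		if (skb)
			return skb;
	}
	return NULL;
}

The spin_trylock() skip is exactly where the ordering question above
bites: if band 0's lock happens to be busy, this path will hand the
driver a band 1 packet first.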
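
And on the batching side, an equally rough sketch of the dequeue loop I
have in mind, with xmit_win passed in as a plain parameter; the function
and argument names here are placeholders, not the actual patch
interfaces:

#include <linux/skbuff.h>
#include <net/sch_generic.h>

/* Pull up to xmit_win packets off the qdisc onto a batch list, which
 * the caller then hands to the driver in a single call. */
static int batch_dequeue_sketch(struct Qdisc *q, struct sk_buff_head *blist,
				int xmit_win)
{
	struct sk_buff *skb;
	int count = 0;

	/* never pull more than the driver said it can take, so there is
	 * nothing left over to requeue */
	while (count < xmit_win && (skb = q->dequeue(q)) != NULL) {
		__skb_queue_tail(blist, skb);
		count++;
	}

	return count;
}

The driver-advertised window bounds the loop up front, which is why the
requeue path should never be hit.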