Herbert Xu wrote:
On Wed, Nov 28, 2007 at 01:53:36PM -0500, Hideo AOKI wrote:
+/**
+ *     __skb_queue_purge_and_sub_memory_allocated
+ *             - empty a list and subtract memory allocation counter
+ *     @sk:   sk
+ *     @list: list to empty
+ *     Delete all buffers on an &sk_buff list and subtract the
+ *     truesize of the sk_buff for memory accounting. Each buffer
+ *     is removed from the list and one reference dropped. This
+ *     function does not take the list lock and the caller must
+ *     hold the relevant locks to use it.
+ */
+static inline void __skb_queue_purge_and_sub_memory_allocated(struct sock *sk,
+                                       struct sk_buff_head *list)
+{
+       struct sk_buff *skb;
+       int purged_skb_size = 0;
+       while ((skb = __skb_dequeue(list)) != NULL) {
+               purged_skb_size += sk_datagram_pages(skb->truesize);
+               kfree_skb(skb);
+       }
+       atomic_sub(purged_skb_size, sk->sk_prot->memory_allocated);
+}

Thanks, this is a lot better than before!

However, I'm still a little concerned about the effect of the two more
atomic ops per packet that we're adding here.  Hang on a sec, that
should've been Dave's line since atomic ops are cheap on x86 :)

But seriously, it's not so much that we have two more atomic ops
per packet, but that we have two more writes to a single global counter
for each packet.  This is going to really suck on SMP.

So what I'd like to see is a scheme that's similar to sk_forward_alloc.
The idea is that each socket allocates memory using mem_schedule and
then stores it in sk_forward_alloc.  Each packet then only has to
add to/subtract from sk_forward_alloc.

There is one big problem with this though: UDP is not serialised like
TCP.  So you can't just use sk_forward_alloc since it's not an atomic_t.
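
To make the forward-allocation idea concrete, here is a minimal userspace sketch in C11. It is not kernel code and not the patch under review: mock_sock, mock_charge(), mock_uncharge(), mock_reclaim() and QUANTUM are hypothetical names invented for this illustration. The socket reserves memory from the global counter in whole quanta and then charges packets against its own reserve, so most packets never touch the shared counter. Note that forward_alloc is a plain long here, which is exactly the problem raised above: this is only safe if every access to one socket is serialised.

```c
#include <stdatomic.h>
#include <stdbool.h>

#define QUANTUM 4096  /* hypothetical reservation unit, in bytes */

/* Global counter, in quanta. Shared by all sockets, so writes are costly. */
static atomic_long memory_allocated;

struct mock_sock {
	long forward_alloc;  /* bytes reserved globally but not yet charged */
};

/* Charge one packet of `size` bytes. Touches the global counter only
 * when the socket's own reserve is too small. */
static bool mock_charge(struct mock_sock *sk, long size, long limit)
{
	if (sk->forward_alloc < size) {
		long need = size - sk->forward_alloc;
		long quanta = (need + QUANTUM - 1) / QUANTUM;
		long now = atomic_fetch_add(&memory_allocated, quanta) + quanta;

		if (now > limit) {
			/* Over the limit: undo the reservation and refuse. */
			atomic_fetch_sub(&memory_allocated, quanta);
			return false;
		}
		sk->forward_alloc += quanta * QUANTUM;
	}
	sk->forward_alloc -= size;
	return true;
}

/* Return a packet's bytes to the socket's reserve (no global write). */
static void mock_uncharge(struct mock_sock *sk, long size)
{
	sk->forward_alloc += size;
}

/* Give whole unused quanta back to the global counter. */
static void mock_reclaim(struct mock_sock *sk)
{
	long quanta = sk->forward_alloc / QUANTUM;

	if (quanta > 0) {
		atomic_fetch_sub(&memory_allocated, quanta);
		sk->forward_alloc -= quanta * QUANTUM;
	}
}
```

In this sketch, two 1000-byte packets charged back to back cost one write to the global counter instead of two, and freeing them costs none until mock_reclaim() runs.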

We'll need to think about this one a bit more.

I agree that adding yet more atomic ops is a big problem.

Another idea, coupled with the recent percpu storage work by Christoph Lameter, would be to use a kind of percpu_counter:

We don't really need strong, precise memory accounting (for UDP, but for TCP as well), just some kind of limit to keep memory use from growing too large.

That is, update a percpu variable, and only update the global counter when this percpu variable leaves a given range.
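
A rough userspace sketch of that batching idea, in C11, follows. It is an illustration under stated assumptions, not the kernel's percpu_counter implementation: approx_counter, NR_SLOTS and BATCH are hypothetical names, and the per-CPU storage is modelled as a plain array indexed by a CPU number the caller passes in. Each slot is assumed to be touched only by its own CPU, which is why the slot update needs no atomic op.

```c
#include <stdatomic.h>

#define NR_SLOTS 4   /* hypothetical number of CPUs */
#define BATCH    32  /* hypothetical range a local delta may stay within */

struct approx_counter {
	atomic_long global;    /* shared total, only approximately current */
	long local[NR_SLOTS];  /* per-CPU deltas not yet folded into global */
};

static void approx_counter_init(struct approx_counter *c)
{
	atomic_init(&c->global, 0);
	for (int i = 0; i < NR_SLOTS; i++)
		c->local[i] = 0;
}

/* Add `amount` (may be negative) on behalf of CPU `cpu`. The shared
 * counter is written only when the local delta leaves (-BATCH, BATCH). */
static void approx_counter_add(struct approx_counter *c, int cpu, long amount)
{
	long delta = c->local[cpu] + amount;

	if (delta >= BATCH || delta <= -BATCH) {
		atomic_fetch_add(&c->global, delta);
		delta = 0;
	}
	c->local[cpu] = delta;
}

/* Cheap read: may differ from the exact total by less than
 * NR_SLOTS * BATCH, which is acceptable for enforcing a rough limit. */
static long approx_counter_read(struct approx_counter *c)
{
	return atomic_load(&c->global);
}
```

With BATCH set to 32, thirty-one single-unit additions on one CPU leave the shared counter untouched, and the thirty-second folds all 32 units in with a single atomic write.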

A lot of contended cache lines could benefit from this relaxation (the socket count, for example).

I would first wait until Christoph's work is done, so that we don't need atomic ops on local CPU storage (and don't need to disable preemption either).
