Herbert Xu wrote:
On Wed, Nov 28, 2007 at 01:53:36PM -0500, Hideo AOKI wrote:
+/**
+ * __skb_queue_purge_and_sub_memory_allocated - empty a list and
+ *	subtract from the memory allocation counter
+ * @sk: socket whose protocol memory accounting is updated
+ * @list: list to empty
+ *
+ * Delete all buffers on an &sk_buff list and subtract the truesize of
+ * each sk_buff from the protocol's memory accounting. Each buffer is
+ * removed from the list and one reference dropped. This function does
+ * not take the list lock and the caller must hold the relevant locks
+ * to use it.
+ */
+static inline void __skb_queue_purge_and_sub_memory_allocated(struct sock *sk,
+					struct sk_buff_head *list)
+{
+	struct sk_buff *skb;
+	int purged_skb_size = 0;
+
+	while ((skb = __skb_dequeue(list)) != NULL) {
+		purged_skb_size += sk_datagram_pages(skb->truesize);
+		kfree_skb(skb);
+	}
+	atomic_sub(purged_skb_size, sk->sk_prot->memory_allocated);
+}
Thanks, this is a lot better than before!
However, I'm still a little concerned about the effect of the two extra
atomic ops per packet that we're adding here. Hang on a sec, that
should've been Dave's line, since atomic ops are cheap on x86 :)
But seriously, it's not so much that we have two more atomic ops per
packet; it's that we have two more writes to a single global counter
for each packet. This is going to really suck on SMP.
So what I'd like to see is a scheme that's similar to sk_forward_alloc.
The idea is that each socket allocates memory using mem_schedule and
then stores it in sk_forward_alloc. Each packet then only has to
add to/subtract from sk_forward_alloc.
There is one big problem with this though, UDP is not serialised like
TCP. So you can't just use sk_forward_alloc since it's not an atomic_t.
We'll need to think about this one a bit more.
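To make that scheme concrete, here is a rough userspace sketch of the
per-socket forward-allocation idea; it is not the actual sk_forward_alloc
or mem_schedule code, and all names (sock_quota, quota_charge, CHUNK_PAGES,
global_pages_allocated) are invented for illustration:

/*
 * Userspace sketch only, not kernel code: each socket charges the shared
 * pool in CHUNK_PAGES-sized chunks and then accounts individual packets
 * against its private quota, so the global counter is written far less
 * often.  Limit checking against the global value is omitted for brevity.
 */
#include <stdatomic.h>

#define CHUNK_PAGES 16				/* pages charged to the pool at once */

static atomic_long global_pages_allocated;	/* the contended global counter */

struct sock_quota {
	long forward_alloc;			/* pages charged but not yet consumed */
};

/* Per-packet charge: normally touches only the per-socket field. */
static void quota_charge(struct sock_quota *q, long pages)
{
	while (q->forward_alloc < pages) {
		atomic_fetch_add(&global_pages_allocated, CHUNK_PAGES);
		q->forward_alloc += CHUNK_PAGES;
	}
	q->forward_alloc -= pages;
}

/* Per-packet release: refill the quota, and return a chunk to the
 * global counter only when the quota grows beyond two chunks. */
static void quota_uncharge(struct sock_quota *q, long pages)
{
	q->forward_alloc += pages;
	if (q->forward_alloc > 2 * CHUNK_PAGES) {
		atomic_fetch_sub(&global_pages_allocated, CHUNK_PAGES);
		q->forward_alloc -= CHUNK_PAGES;
	}
}

As noted above, this only works as-is when updates to forward_alloc are
serialised (as for TCP); UDP would need a lock or an atomic there.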
I agree that adding yet more atomic ops is a big problem.
Another idea, coupled with the recent work on percpu storage done by Christoph
Lameter, would be to use a kind of percpu_counter:
We don't really need strong, precise memory accounting (for UDP, but for TCP as
well); we just need some kind of limit to keep memory usage from growing too large.
That is, update a percpu variable, and only update the global counter when
this percpu variable escapes from a given range.
Lots of contended cache lines could benefit from this relaxation (the count of
sockets, ...).
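For what it's worth, that batching idea could look roughly like the sketch
below, written with portable thread-local C rather than real kernel percpu
primitives; approx_counter, BATCH and the function names are made up for
illustration:

/*
 * Sketch of a percpu_counter-like batched counter: each CPU (here, thread)
 * accumulates a local delta and folds it into the shared counter only when
 * it escapes the [-BATCH, BATCH] range, so the contended cache line is
 * touched rarely.  Userspace illustration only.
 */
#include <stdatomic.h>
#include <threads.h>

#define BATCH 32			/* fold threshold, per thread */

struct approx_counter {
	atomic_long global;		/* written only on batch overflow */
};

/* Stand-in for per-cpu storage; assumes a single counter instance. */
static thread_local long local_delta;

static void approx_counter_add(struct approx_counter *c, long amount)
{
	local_delta += amount;
	if (local_delta > BATCH || local_delta < -BATCH) {
		atomic_fetch_add(&c->global, local_delta);
		local_delta = 0;
	}
}

/*
 * Approximate read: off by at most nr_threads * BATCH, which is good
 * enough for a soft "are we using too much memory?" limit.
 */
static long approx_counter_read(struct approx_counter *c)
{
	return atomic_load(&c->global);
}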
I would first wait for Christoph's work to be finished, so that we don't need
atomic ops on local cpu storage (and no need to disable preemption either).