[EMAIL PROTECTED] wrote:
On Mon, Jan 21, 2008 at 02:40:43AM -0800, David Miller wrote:
From: Joonwoo Park <[EMAIL PROTECTED]>
Date: Tue, 22 Jan 2008 00:08:57 +0900

rt_run_flush() can get stuck if it is called while the netdev is under high load.
This is possible when rtable entries are pushed into rt_hash faster than
they are pulled from it.

Signed-off-by: Joonwoo Park <[EMAIL PROTECTED]>
I agree with the analysis of the problem, however not the solution.

This will absolutely kill software interrupt latency.

In fact, we have moved much of the flush work into a workqueue in
net-2.6.25 because of how important that is.

We need to find some other way to solve this.


Dave, Eric,
Thanks so much for comments.

I ran stress tests and found that the real problem was not the
consumer & supplier issue.
For me, the problem was the countless enabling and disabling of softirqs.
But I still think the 'caching faster than flushing' issue needs to be considered. :)

ifconfig up on a heavily loaded interface:
Before patching:
 time ifconfig eth1 up
 BUG: soft lockup - CPU#0 stuck for 11s! [events/0:9]
 ...

After patching:
 time ifconfig eth1 up
real    0m0.007s
user    0m0.000s
sys     0m0.004s

Thanks!
Joonwoo


From 87c29506de967e811ad5b57cd2e1a002134e878f Mon Sep 17 00:00:00 2001
From: Joonwoo Park <[EMAIL PROTECTED]>
Date: Wed, 23 Jan 2008 15:16:54 +0900
Subject: [PATCH] [IPV4] route: reduce locking/unlocking in rt_run_flush

rt_run_flush() calls spin_lock_bh()/spin_unlock_bh() rt_hash_mask + 1
times.
rt_hash_mask ranges from 32767 to 65535, so this is a big overhead.
In addition, disabling and re-enabling bottom halves that many times in
rt_run_flush() can cause it to get stuck on a machine with many pending softirqs.

This patch reduces the locking/unlocking by jumping across the slots
that share a lock.

ifconfig up on a heavily loaded interface:
Before:
 time ifconfig eth1 up
 BUG: soft lockup - CPU#0 stuck for 11s! [events/0:9]
 ...

After:
 time ifconfig eth1 up
real    0m0.007s
user    0m0.000s
sys     0m0.004s


Unfortunately, your patch doesn't work with CONFIG_SMP=n (softirqs will be disabled for the whole scan of the table).

Also, some machines out there have 2^22 slots in the hash table and NR_CPUS=4, so softirqs will be disabled for far too long.

Please try net-2.6.25 and submit patches on top of it if necessary, since rt_run_flush() has pending changes that are not in net-2.6.

Note: The 'soft lockup' can be avoided by other means.
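
The trade-off discussed in this thread can be sketched in user space. This is only an illustration, not the kernel code: it assumes a striped-lock layout in which bucket i is guarded by lock (i & (nlocks - 1)), and it counts operations instead of taking real locks. The names flush_stats, flush_per_bucket and flush_per_lock are hypothetical, and the lock counts passed in are example values, not the kernel's actual configuration.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical bookkeeping: how many lock/unlock round trips a flush
 * performs, and the most buckets visited inside one critical section
 * (a rough proxy for how long softirqs would stay disabled). */
struct flush_stats {
    size_t lock_round_trips;
    size_t max_buckets_per_hold;
};

/* Strategy A: one lock/unlock pair per bucket, as the patch
 * description says the unpatched flush does. */
static struct flush_stats flush_per_bucket(size_t nbuckets, size_t nlocks)
{
    struct flush_stats s = {0, 0};

    /* Assumption: the lock count is a power of two. */
    assert(nlocks > 0 && (nlocks & (nlocks - 1)) == 0);

    for (size_t i = 0; i < nbuckets; i++) {
        size_t lock = i & (nlocks - 1); /* assumed bucket-to-lock mapping */
        (void)lock;                     /* lock; drain bucket i; unlock */
        s.lock_round_trips++;
        s.max_buckets_per_hold = 1;
    }
    return s;
}

/* Strategy B: take each lock once and visit every bucket it guards
 * by stepping through the table with a stride of nlocks. */
static struct flush_stats flush_per_lock(size_t nbuckets, size_t nlocks)
{
    struct flush_stats s = {0, 0};

    assert(nlocks > 0 && (nlocks & (nlocks - 1)) == 0);

    for (size_t lock = 0; lock < nlocks && lock < nbuckets; lock++) {
        size_t held = 0;

        /* lock taken here */
        for (size_t i = lock; i < nbuckets; i += nlocks)
            held++;                     /* drain bucket i */
        /* lock released here */

        s.lock_round_trips++;
        if (held > s.max_buckets_per_hold)
            s.max_buckets_per_hold = held;
    }
    return s;
}
```

With 65536 buckets and an assumed 256 locks, strategy B drops from 65536 round trips to 256, at the cost of visiting 256 buckets per hold. With a single lock (a stand-in for the CONFIG_SMP=n case) and 2^22 buckets, strategy B visits the whole table in one hold, which is the objection raised above.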
