It seems prettier to just draw graphs, and since this one is a small file, here it is attached. The graph compares a patched net-2.6.24 kernel against a plain net-2.6.24 kernel, using a UDP app that sends on 4 CPUs as fast as the lower layers allow. Refer to my earlier description of the test setup etc. As I noted earlier, on this hardware we approach wire speed at about 200B or so, so above that the app is mostly idle because the link becomes the bottleneck; for example, it is > 85% idle at 512B and > 90% idle at 1024B. This holds for both batch and non-batch. So the differentiation is really in the smaller packet sizes.
Enjoy! cheers, jamal
Attachment: batch-pps.pdf (Adobe PDF document)