Hey all-
        Sorry for the delay here, but it took me a while to get the perf bits
working to my satisfaction.  As Ingo requested, I added do_csum to the perf
benchmarking utility (as part of the mem suite, since it didn't seem right to
create a separate suite for it).  I've also revamped the do_csum routine to do
some smart prefetching, as it yielded slightly better performance than simple
prefetching at a fixed stride:

Without prefetch:
[root@rdma-dev-02 perf]# ./perf bench mem csum -r x86-64-csum -l 1500B -s 512MB \
    -i 1000000 -c
# Running mem/csum benchmark...
# Copying 1500B Bytes ...

       0.955977 Cycle/Byte

With prefetch:
[root@rdma-dev-02 perf]# ./perf bench mem csum -r x86-64-csum -l 1500B -s 512MB \
    -i 1000000 -c
# Running mem/csum benchmark...
# Copying 1500B Bytes ...

       0.922540 Cycle/Byte


About a 3.5% improvement (0.955977 -> 0.922540 cycles/byte).
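
For anyone who wants to play with the idea outside the kernel, here is a rough
userspace sketch of the simple fixed-stride variant mentioned above.  It is not
the code in this patch and not the kernel's do_csum: the names csum_prefetch,
csum_fold64 and CSUM_PREFETCH_STRIDE are made up for illustration, the 64-byte
stride is an assumed cache-line size, and it relies on the GCC/Clang
__builtin_prefetch builtin.  It computes a 16-bit ones' complement sum over
big-endian words and issues one read prefetch per stride:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical knob: how far ahead of the current offset to prefetch.
 * 64 bytes is assumed to match the cache line size. */
#define CSUM_PREFETCH_STRIDE 64

/* Fold a 64-bit accumulator down to a 16-bit ones' complement sum. */
static uint16_t csum_fold64(uint64_t sum)
{
	while (sum >> 16)
		sum = (sum & 0xffff) + (sum >> 16);
	return (uint16_t)sum;
}

/* Ones' complement sum of buf, read as 16-bit big-endian words.
 * Illustrative only; a trailing odd byte is padded with zero. */
static uint16_t csum_prefetch(const unsigned char *buf, size_t len)
{
	uint64_t sum = 0;
	size_t i = 0;

	assert(buf != NULL || len == 0);

	for (; i + 1 < len; i += 2) {
		/* Once per stride, hint the next line if it is still in range. */
		if ((i % CSUM_PREFETCH_STRIDE) == 0 &&
		    i + CSUM_PREFETCH_STRIDE < len)
			__builtin_prefetch(buf + i + CSUM_PREFETCH_STRIDE, 0, 3);
		sum += ((uint32_t)buf[i] << 8) | buf[i + 1];
	}
	if (i < len)
		sum += (uint32_t)buf[i] << 8;

	return csum_fold64(sum);
}
```

Whether a prefetch like this helps at all depends on the CPU, the buffer size
and whether the data is already cache-hot, so it needs to be measured rather
than assumed.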

Signed-off-by: Neil Horman <[email protected]>
CC: [email protected]
CC: Thomas Gleixner <[email protected]>
CC: Ingo Molnar <[email protected]>
CC: "H. Peter Anvin" <[email protected]>
CC: [email protected]
