I'm wondering if the following code looks reasonable as a benchmark to test
performance of L1D cache in Gem5. Changing the size of the L1 cache in Gem5
(from 32KiB to 1KiB) doesn't seem to show any significant change in the
benchmark performance (~1%). I would have thought that using a 1KiB L1D cache
should show a large decrease in performance. I'm using "RiscvMinorCPU()" with
"system.mem_mode = 'timing'". The benchmark code should reuse memory within a
24KiB block, so the performance difference between a 32KiB cache and a 1KiB
cache should be significant.
The call to the benchmark in the C code would look something like
"ubench_cache(100, buffer, (1<<10)*24);".
#include <stddef.h>
#include <stdint.h>

static uint32_t ubench_cache(const size_t iters,
                             const void *in,
                             const size_t sz) {
    const uint8_t *buf = (const uint8_t*)in;
    const size_t def_blk_sz = ((size_t)(1<<10)) * 24;
    const size_t blk_sz = (sz < def_blk_sz) ? (sz) : (def_blk_sz);
    const size_t cl_sz = 64;
    const size_t in_iters = 100;
    uint8_t h = 0;
    for (size_t j=0; j < iters; j++) {                   // Outer iterations
        for (size_t b=0; b < sz; b+=blk_sz) {            // There will be only 1 block
            for (size_t i=0; i < in_iters; i++) {        // Inner iterations
                for (size_t o=0; o < cl_sz; o++) {       // Offset within a cache line
                    for (size_t c=0; c < blk_sz; c+=cl_sz) { // Cache line within a block
                        const size_t ndx = b + c + o;
                        h += (ndx < sz) ? (buf[ndx]) : (i);
                    }
                }
            }
        }
    }
    return (uint32_t)h;
}
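To make the expectation concrete, here is a minimal sketch of the
back-of-the-envelope arithmetic. It assumes the simulated L1D uses 64-byte
lines (matching cl_sz above; the configured line size is not shown here) and it
ignores associativity. The helper names are hypothetical and only for
illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Number of distinct cache lines touched by one sweep over a block. */
size_t lines_touched(size_t blk_sz, size_t cl_sz) {
    assert(cl_sz > 0);
    return (blk_sz + cl_sz - 1) / cl_sz;   /* round up to whole lines */
}

/* Number of lines a cache of the given capacity can hold. */
size_t cache_lines(size_t cache_sz, size_t cl_sz) {
    assert(cl_sz > 0);
    return cache_sz / cl_sz;
}

/* Nonzero if the whole working set fits in the cache. This ignores
 * associativity conflicts, which is an optimistic simplification. */
int working_set_fits(size_t blk_sz, size_t cache_sz, size_t cl_sz) {
    return lines_touched(blk_sz, cl_sz) <= cache_lines(cache_sz, cl_sz);
}
```

Under these assumptions, one sweep over the 24KiB block touches 384 lines. A
32KiB cache holds 512 lines, so the block fits; a 1KiB cache holds only 16, so
with an LRU-like policy each line would be expected to be evicted before it is
reused.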
To have something else to test with, can anyone recommend a small / simple
cache benchmark written in C? I'm trying to see if my Gem5 configuration is the
problem, or if this test isn't a good one.
Thanks much,
~Aaron Vose