Iozone is a very good tool for understanding the performance of your storage under a variety of access patterns, including the impact of cache and RAM. It prints the raw data and can additionally write it to a spreadsheet file that you can then graph with OpenOffice or Excel to look at the dimensions you are interested in.

The trick is to read the documentation and to specify the correct parameters for the use case you are interested in ;-)

Here is an example run I used on a machine with 2 GB of RAM. I was only interested in the write, read and random read/write tests, so I only specified those. There are a variety of other tests that are more relevant to databases, e.g. backwards read.

I was not interested in file sizes below 1 GB or record sizes below 4 kB, so I specified that as well.
iozone -c -C -g 4g -i 0 -i 1 -i 2 -n 1g -q 1m -y 4k -a -b $runname.xls

# -c     include close() call in timing, relevant for NFSv3
# -C     show bytes transferred by each child
# -g 4g  maximum file size 4g (should be 8g on a machine with 4 GB of RAM)
# -n 1g  minimum file size 1g
# -q 1m  maximum record size 1m
# -y 4k  minimum record size 4k
# -i 0   write/rewrite test
# -i 1   read/reread test
# -i 2   random read/write test
# -a     fully automatic mode
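As a sanity check, the size flags can be derived from the machine's RAM rather than hard-coded. This is a small shell sketch of that idea; the 2x-RAM rule of thumb and the variable names are my own assumptions, not anything iozone prescribes:

```shell
# Sketch: derive the iozone -g value from physical RAM (assumed here to
# be 2 GiB), on the rule of thumb that the maximum file size should be
# at least twice RAM so the large runs fall outside the page cache.
ram_gb=2                        # physical RAM in GiB (assumption)
max_file="$(( ram_gb * 2 ))g"   # -g: maximum file size, 2x RAM
echo "iozone -c -C -g $max_file -i 0 -i 1 -i 2 -n 1g -q 1m -y 4k -a"
```

With ram_gb=2 this prints the same command line as above (-g 4g); bumping ram_gb to 4 gives the -g 8g run mentioned in the comments.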

Michael

Lombard, David N wrote:
On Tue, Sep 11, 2007 at 11:54:01AM -0400, Joe Landman wrote:
Loic Tortay wrote:

My pointless result was of course mostly due to cache, with 4 threads
each writing 1 Gbyte to 4 existing 2 GBytes files (one file per
thread).  The block size used was 128 kBytes, all (random) accesses were
block aligned, and the value is the average aggregate throughput of all
threads over a 20-minute run.
I seem to remember being told in a matter-of-fact manner by someone some
time ago that only 2 GB of IO mattered to them (which was entirely
cached, BTW), so that's how they measured.  Caused me some head
scratching, but, well, ok.

If that was *truly* their usage, it *is* ok.  But...

My (large) concern on iozone and related is that it spends most of its
time *in cache*.  It's funny, if you go look at the disks during the
smaller tests, the blinkenlights don't blinken all that often ...
(certainly not below 2GB or so).

Agreed--it all depends on the phys memory; failing to consider that
can lead to invalid results.

Then again, maybe IOzone should be renamed "cache-zone" :)  More
seriously, I made some quick source changes to be able to do IOzone far
outside cache sizes (and main memory sizes) so I could see what impact
this has on the system.  It does have a noticeable impact, and I report
on it in the benchmark report.

I've found specifying the appropriate file size works well.  For a while
now, filesize > 2*memorysize has yielded uncached performance for
directly connected devices.  Complications can clearly occur when
there's additional caching on the other end of the wire.

What have you needed beyond the various sync options, e.g., -e, -G, -k,
-o, -r?

_______________________________________________
Beowulf mailing list, [email protected]
To change your subscription (digest mode or unsubscribe) visit 
http://www.beowulf.org/mailman/listinfo/beowulf
