Jeffrey B. Layton wrote:
Here comes the $64 question - how do you benchmark the IO portion of your
code so you can understand whether you need a parallel file system, what
kind of connection do you need from a client to the storage, etc. This is a
difficult problem and one in which I have an interest.
Yeah, it is hard. My own view is that it takes a hard look at the code
if you can, and if you can't, we use dstat, atop, vmstat, and other
tools to see if the IO channel is full.
What we do here (if atop/dstat/... suggest that IO is an issue) is to
replicate the runs, and provide ever larger pipes to IO to see if this
ameliorates problems.
We have found it does for some codes (specific CFD and others).
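
As a rough illustration of the "is the IO channel full" check, here is a minimal Python sketch that does roughly what dstat and atop report for block devices: it takes two snapshots of Linux's /proc/diskstats and turns the counter differences into read/write rates and a busy percentage. It assumes a Linux kernel exposing the usual /proc/diskstats layout; the function names are made up for this example and are not taken from any of the tools mentioned above.

```python
import time

SECTOR_BYTES = 512  # /proc/diskstats reports sectors in 512-byte units


def parse_diskstats(text):
    """Map device name -> (sectors_read, sectors_written, ms_doing_io)."""
    stats = {}
    for line in text.splitlines():
        f = line.split()
        if len(f) < 14:
            continue  # not a full stats line
        # f[2] = device name, f[5] = sectors read,
        # f[9] = sectors written, f[12] = milliseconds spent doing IO
        stats[f[2]] = (int(f[5]), int(f[9]), int(f[12]))
    return stats


def io_rates(before, after, interval_s):
    """Return name -> (read_MB_s, write_MB_s, util_percent) between snapshots."""
    rates = {}
    for name, (r1, w1, t1) in after.items():
        if name not in before:
            continue
        r0, w0, t0 = before[name]
        read_mb = (r1 - r0) * SECTOR_BYTES / 1e6 / interval_s
        write_mb = (w1 - w0) * SECTOR_BYTES / 1e6 / interval_s
        # Fraction of the interval during which the device had IO in flight.
        util = 100.0 * (t1 - t0) / (interval_s * 1000.0)
        rates[name] = (read_mb, write_mb, util)
    return rates


def sample(interval_s=1.0, path="/proc/diskstats"):
    """Take two snapshots interval_s apart and return the per-device rates."""
    with open(path) as fh:
        before = parse_diskstats(fh.read())
    time.sleep(interval_s)
    with open(path) as fh:
        after = parse_diskstats(fh.read())
    return io_rates(before, after, interval_s)
```

Calling sample() while the application runs and watching for devices whose busy percentage sits near 100 is one crude sign that the IO path is saturated. It says nothing about network file systems, which need their own counters.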
It is hard in general to do a good IO benchmark, which is why we have
bonnie++ and IOzone. They aren't great, but they have their domains of
applicability.
I wrote something called IO-bm to help a customer evaluate multiple
streams (reading/writing) to file system(s). I still have to get it
working with MPI-IO (it is an MPI code), but it seems to reflect
specific threaded IO workloads reasonably well. It's good enough to use
for some tuning effort on the underlying system.
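
To give a feel for what a multiple-stream test looks like, here is a small Python sketch. It is not IO-bm and makes no claim to resemble its internals; it is just an assumed, simplified stand-in that starts several threads, has each write its own file in fixed-size blocks, and reports the aggregate rate. The names write_stream and run_streams, the file naming, and the default sizes are all invented for the example.

```python
import os
import threading
import time


def write_stream(path, total_bytes, block_bytes, results, idx):
    """Write total_bytes to path in block_bytes chunks, then fsync."""
    block = b"\0" * block_bytes  # zeros; a compressing file system may skew this
    written = 0
    with open(path, "wb", buffering=0) as fh:
        while written < total_bytes:
            n = min(block_bytes, total_bytes - written)
            written += fh.write(block[:n])
        os.fsync(fh.fileno())  # make sure the data actually left the page cache
    results[idx] = written


def run_streams(directory, n_streams=4, total_bytes=64 << 20, block_bytes=1 << 20):
    """Run n_streams concurrent writers; return (bytes, seconds, MB/s)."""
    results = [0] * n_streams
    threads = [
        threading.Thread(
            target=write_stream,
            args=(os.path.join(directory, f"stream_{i}.dat"),
                  total_bytes, block_bytes, results, i),
        )
        for i in range(n_streams)
    ]
    start = time.perf_counter()
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    elapsed = time.perf_counter() - start
    total = sum(results)
    return total, elapsed, total / elapsed / 1e6
```

Running run_streams() against the target file system with increasing n_streams, and with file sizes well beyond RAM, gives a first-order picture of how aggregate write bandwidth scales with the number of streams.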
The best way I've found is to look at the IO pattern of your code(s). The
best
Yup. For this you either need source or a way to profile the IOs ...
I've found to do this is to run an strace against the code. I've written
an strace
This does help.
Empirical data is better than none at all; even reconstructed data is
quite helpful.
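
For readers who want to try the strace route, here is a minimal Python sketch of the kind of post-processing involved. It is not the strace analysis tool mentioned in the quoted text, whose details are not given here; it simply assumes a trace captured with something like `strace -f -e trace=read,write,pread64,pwrite64 -o trace.txt ./app` (where ./app is a placeholder for your code) and buckets the returned transfer sizes per syscall. The name summarize and the power-of-two bucketing are choices made for this example.

```python
import re
from collections import Counter, defaultdict

# Matches lines such as:  read(3, "..."..., 4096) = 4096
CALL_RE = re.compile(r"\b(read|write|pread64|pwrite64)\((\d+),.*\)\s+=\s+(-?\d+)")


def summarize(lines):
    """Return (size histogram per syscall, total bytes per syscall)."""
    sizes = defaultdict(Counter)
    totals = Counter()
    for line in lines:
        m = CALL_RE.search(line)
        if not m:
            continue  # other syscalls, unfinished calls, signals, ...
        call, ret = m.group(1), int(m.group(3))
        if ret <= 0:
            continue  # skip errors (-1) and end-of-file (0)
        bucket = 1 << (ret - 1).bit_length()  # round up to a power of two
        sizes[call][bucket] += 1
        totals[call] += ret
    return sizes, totals
```

Feeding it the lines of the trace file (for example `summarize(open("trace.txt"))`) shows at a glance whether the code does many small transfers or a few large ones, which is usually the first thing to know before choosing a file system or interconnect.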
--
Joseph Landman, Ph.D
Founder and CEO
Scalable Informatics LLC,
email: [EMAIL PROTECTED]
web : http://www.scalableinformatics.com
http://jackrabbit.scalableinformatics.com
phone: +1 734 786 8423
fax : +1 866 888 3112
cell : +1 734 612 4615