Jeffrey B. Layton wrote:

Here comes the $64 question: how do you benchmark the IO portion of your
code so you can understand whether you need a parallel file system, what kind of connection you need from a client to the storage, etc.? This is a difficult
problem and one in which I have an interest.

Yeah, it is hard. My own view is that it takes a hard look at the code if you can; if you can't, we use dstat, atop, vmstat, and other tools to see whether the IO channel is full.
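
As a rough illustration of what those tools report, here is a minimal Python sketch that takes two samples of Linux's /proc/diskstats and derives per-device throughput and percent-busy figures. It assumes a Linux host and the standard diskstats field layout; the function names (parse_diskstats, io_rates, sample) are made up for this example and are not part of dstat, atop, or vmstat.

```python
import time

SECTOR_BYTES = 512  # /proc/diskstats reports sectors in 512-byte units


def parse_diskstats(text):
    """Return {device: (sectors_read, sectors_written, ms_doing_io)}."""
    stats = {}
    for line in text.splitlines():
        fields = line.split()
        if len(fields) < 14:
            continue  # skip blank or unexpectedly short lines
        # fields: major minor name, then the per-device counters
        stats[fields[2]] = (int(fields[5]), int(fields[9]), int(fields[12]))
    return stats


def io_rates(before, after, interval_s):
    """Turn two samples taken interval_s seconds apart into rates."""
    rates = {}
    for dev, (r1, w1, b1) in after.items():
        if dev not in before:
            continue  # device appeared between the two samples
        r0, w0, b0 = before[dev]
        rates[dev] = {
            "read_MBps": (r1 - r0) * SECTOR_BYTES / interval_s / 1e6,
            "write_MBps": (w1 - w0) * SECTOR_BYTES / interval_s / 1e6,
            # share of the interval the device had IO in flight
            "busy_pct": min(100.0, (b1 - b0) / (interval_s * 1000.0) * 100.0),
        }
    return rates


def sample(interval_s=1.0, path="/proc/diskstats"):
    """Read the counters twice, interval_s apart, and return the rates."""
    with open(path) as fh:
        before = parse_diskstats(fh.read())
    time.sleep(interval_s)
    with open(path) as fh:
        after = parse_diskstats(fh.read())
    return io_rates(before, after, interval_s)
```

Calling sample() while the application runs gives a crude picture: a device sitting near 100% busy for the whole run hints that the IO channel is the limit, though this says nothing about network file systems, which do not show up in diskstats.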

What we do here (if atop/dstat/... suggest that IO is an issue) is replicate the runs while providing ever larger pipes to IO, to see whether this ameliorates the problems.

We have found it does for some codes (specific CFD and others).

It is hard in general to do a good IO benchmark, which is why we have bonnie++ and IOzone. They aren't great, but they do have their domains of applicability.

I wrote something called IO-bm to help a customer evaluate multiple streams (reading/writing) to file system(s). I still have to get it working with MPI-IO (it is an MPI code), but it seems to reflect specific threaded IO workloads reasonably well. It's good enough to use for some tuning work on the underlying system.
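
To make the multiple-streams idea concrete, here is a minimal Python sketch of a multi-stream write test. It is not IO-bm and makes no claim about how IO-bm works: it uses plain threads rather than MPI, it only writes, and the names write_stream and run_streams are invented for this example.

```python
import os
import threading
import time


def write_stream(path, total_bytes, block_bytes):
    """Write total_bytes to path in block_bytes chunks, then fsync."""
    block = b"\0" * block_bytes
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o600)
    try:
        remaining = total_bytes
        while remaining > 0:
            # os.write may write fewer bytes than requested, so track its return
            written = os.write(fd, block[:min(block_bytes, remaining)])
            remaining -= written
        os.fsync(fd)  # include the flush to storage in the timed region
    finally:
        os.close(fd)


def run_streams(directory, n_streams, total_bytes, block_bytes):
    """Run n_streams concurrent writers and report aggregate throughput."""
    paths = [os.path.join(directory, f"stream_{i}.dat") for i in range(n_streams)]
    threads = [
        threading.Thread(target=write_stream, args=(p, total_bytes, block_bytes))
        for p in paths
    ]
    start = time.perf_counter()
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    elapsed = time.perf_counter() - start
    total = sum(os.path.getsize(p) for p in paths)
    for p in paths:
        os.remove(p)  # clean up the scratch files
    return {"bytes": total, "seconds": elapsed, "MBps": total / elapsed / 1e6}
```

Running run_streams() against a directory on the file system under test, with the stream count and block size varied, shows how aggregate throughput changes with concurrency. For numbers that mean anything, the per-stream size should be well beyond the client's RAM so the page cache does not dominate.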

The best way I've found is to look at the IO pattern of your code(s). The best

Yup.  For this you either need source or a way to profile the IOs ...

I've found to do this is to run an strace against the code. I've written an strace

This does help.
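
For anyone who wants to try this, here is a minimal Python sketch of summarizing read/write calls from strace output. It is not the strace tool mentioned in the quoted text. It assumes a trace captured with something like `strace -f -e trace=read,write,pread64,pwrite64 -o trace.txt ./app` (where ./app stands in for your code), and the name summarize_strace is invented for this example.

```python
import re
from collections import Counter, defaultdict

# Matches lines such as:  read(3, "..."..., 4096) = 4096
# An optional pid or timestamp prefix and a trailing <time> are tolerated.
IO_CALL = re.compile(r"\b(read|write|pread64|pwrite64)\((\d+),.*\)\s+=\s+(-?\d+)")


def summarize_strace(lines):
    """Return {syscall: {"calls", "bytes", "sizes"}} for successful IO calls."""
    summary = defaultdict(lambda: {"calls": 0, "bytes": 0, "sizes": Counter()})
    for line in lines:
        match = IO_CALL.search(line)
        if not match:
            continue  # not an IO call, or an unfinished/resumed line
        name, returned = match.group(1), int(match.group(3))
        if returned < 0:
            continue  # failed call (e.g. EAGAIN), no bytes transferred
        entry = summary[name]
        entry["calls"] += 1
        entry["bytes"] += returned
        entry["sizes"][returned] += 1  # histogram of transfer sizes
    return dict(summary)
```

Feeding it `open("trace.txt")` gives call counts, total bytes, and a histogram of transfer sizes per syscall, which is often enough to tell many small IOs from a few large ones. Calls that strace splits into "unfinished" and "resumed" lines under -f are skipped by this sketch.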

Empirical data is better than none at all; even reconstructed data is quite helpful.


--
Joseph Landman, Ph.D
Founder and CEO
Scalable Informatics LLC,
email: [EMAIL PROTECTED]
web  : http://www.scalableinformatics.com
       http://jackrabbit.scalableinformatics.com
phone: +1 734 786 8423
fax  : +1 866 888 3112
cell : +1 734 612 4615
_______________________________________________
Beowulf mailing list, [email protected]
To change your subscription (digest mode or unsubscribe) visit 
http://www.beowulf.org/mailman/listinfo/beowulf
