Hi all,

In case someone is interested, we are developing a set of inputs (MiDataSets) for the MiBench benchmark suite.

Iterative optimization is now a popular technique for obtaining performance or code size improvements over a compiler's default settings. However, in most research projects the best configuration is found for one arbitrary dataset, and it is assumed that this configuration will work well with any other dataset the program uses. To evaluate this assumption and to analyze the behavior of various programs across multiple datasets, we created 20 different datasets per program for the free MiBench benchmark suite. We hope that this will enable more realistic benchmarking and practical iterative optimization (iterative compilation), and that it can help to automatically improve GCC's optimization heuristics.
We have just made a pre-release of the first version of MiDataSets, and we made an effort to include only copyright-free inputs from the Internet. However, mistakes are possible; in such cases, please contact me so that we can resolve the issue or remove the input.

More information can be found at the MiDataSets development website:

http://midatasets.sourceforge.net

or in the paper:

Grigori Fursin, John Cavazos, Michael O'Boyle and Olivier Temam.
MiDataSets: Creating The Conditions For A More Realistic Evaluation of Iterative Optimization.
Proceedings of the International Conference on High Performance Embedded Architectures & Compilers (HiPEAC 2007), Ghent, Belgium, January 2007.

Any suggestions and comments are welcome!

Yours,
Grigori Fursin

=====================================================
Grigori Fursin, PhD
INRIA Futurs, France
http://fursin.net/research