I'm pretty much done implementing my ideas for an OpenMP parallelization. In short: I was not able to beat the current implementation.
While a parallel logarithmic buffer reduction sounds nice, it has the huge disadvantage that it needs a barrier: all threads must finish rendering before the reduction can start. With the profiling interface, where all voices just keep playing, this barrier is actually pretty minor. But for real MIDI files, where voices start and stop unpredictably, this barrier dominates everything.

David actually did a very good job here by efficiently using thread idle time to mix in any finished buffers. Only by removing those barriers with the nowait clause (which consequently breaks rendering) did my version become slightly faster. This convinced me that the current parallelization approach is the right way to go, even if it involves a lot of manual thread handling. And the little synchronization done currently is absolutely negligible compared to my approach.

So for now I'm withdrawing the idea of switching to OpenMP. Perhaps I'll investigate whether aligning memory accesses, so that the compiler can auto-vectorize, turns out to be more beneficial.

Tom

_______________________________________________
fluid-dev mailing list
fluid-dev@nongnu.org
https://lists.nongnu.org/mailman/listinfo/fluid-dev
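
For readers who want to see where the barriers sit, here is a minimal sketch of a logarithmic buffer reduction in OpenMP. It is only an illustration under stated assumptions, not the code that was benchmarked and not FluidSynth's actual mixer: render_voice, render_and_reduce and BUF_LEN are made-up names, and the "rendering" just fills each buffer with a constant.

```c
#include <assert.h>
#include <stddef.h>

#define BUF_LEN 64 /* hypothetical block size in samples */

/* Hypothetical stand-in for rendering one voice:
 * fills the buffer with the constant (voice + 1). */
static void render_voice(float *buf, int voice)
{
    for (size_t s = 0; s < BUF_LEN; s++)
        buf[s] = (float)(voice + 1);
}

/* Render n_voices into bufs[0..n_voices-1], then sum them
 * pairwise so that the mix ends up in bufs[0]. */
static void render_and_reduce(float bufs[][BUF_LEN], int n_voices)
{
    assert(n_voices > 0);

    #pragma omp parallel
    {
        /* Implicit barrier at the end of this loop: no thread may start
         * reducing until every voice has been rendered. Threads that
         * finish early simply wait here. */
        #pragma omp for schedule(dynamic)
        for (int v = 0; v < n_voices; v++)
            render_voice(bufs[v], v);

        /* log2(n_voices) reduction passes. Each pass ends in another
         * implicit barrier, because the next pass reads what this one
         * wrote. Adding "nowait" to either loop removes the wait but
         * lets threads read buffers that are not finished yet. */
        for (int stride = 1; stride < n_voices; stride *= 2) {
            #pragma omp for
            for (int i = 0; i < n_voices - stride; i += 2 * stride)
                for (size_t s = 0; s < BUF_LEN; s++)
                    bufs[i][s] += bufs[i + stride][s];
        }
    }
}
```

Calling render_and_reduce(bufs, 5) on a float bufs[5][BUF_LEN] array leaves the sum of all five buffers in bufs[0]. Built without OpenMP support the pragmas are ignored and the same code runs serially, which makes it easy to compare the result against the parallel build.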