On Wed, Sep 26, 2012 at 7:32 AM, Jakub Jelinek <ja...@redhat.com> wrote:
> Have you considered also an __AVX__ version handling 4 elements at a time?
> Without __AVX2__ one would need to cast __m256i to __m256d for and/or, as
> AVX1 doesn't have _mm256_and_si256 or _mm256_or_si256, but _mm256_and_pd
> or _mm256_or_pd could be used instead.

There is one step to do first. The current random number engine interface is inefficient because each call returns only a single number. What we need is an additional interface that returns vectors. I'd love to use the gcc vector extensions for this; for engines like sfmt that is natural. There are a few issues with C++ support for the vector extensions, though: some operations available in C are not supported in C++ yet.
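
To make the idea concrete, here is a rough sketch of what a vector-returning call could look like with the gcc vector extensions. The names v4u32, next_vector and check_lanes are made up for illustration and are not part of libstdc++ or of any patch. std::mt19937 only stands in for an engine here, and the lanes are filled one scalar draw at a time; an sfmt-style engine would presumably produce the whole vector in one step. It assumes a compiler that allows subscripting vector types in C++.

```cpp
#include <cassert>
#include <cstdint>
#include <random>

// GCC vector extension type: four 32-bit lanes in one 16-byte value.
typedef std::uint32_t v4u32 __attribute__((vector_size(16)));

// Hypothetical helper (an assumption for illustration, not a libstdc++ API):
// returns four consecutive engine outputs packed into one vector.
template <typename Engine>
v4u32 next_vector(Engine& eng)
{
    v4u32 out;
    for (int i = 0; i < 4; ++i)
        out[i] = static_cast<std::uint32_t>(eng());  // one scalar draw per lane
    return out;
}

// Hypothetical check: the lanes must equal the next four scalar draws
// of an identically seeded engine, in order.
inline void check_lanes(unsigned seed)
{
    std::mt19937 vec_eng(seed);
    std::mt19937 ref_eng(seed);
    v4u32 v = next_vector(vec_eng);
    for (int i = 0; i < 4; ++i)
        assert(v[i] == static_cast<std::uint32_t>(ref_eng()));
}
```

The point of returning a vector type rather than filling a caller-supplied array is that the result can stay in a SIMD register and feed directly into vector arithmetic in the distribution code.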