On Wed, 26 Sep 2012, Ulrich Drepper wrote:

On Wed, Sep 26, 2012 at 7:32 AM, Jakub Jelinek <ja...@redhat.com> wrote:
Have you considered also an __AVX__ version handling 4 elements at a time?
Without __AVX2__ one would need to cast __m256i to __m256d for and/or, as
AVX1 doesn't have _mm256_and_si256 or _mm256_or_si256, but _mm256_and_pd
or _mm256_or_pd could be used instead.

Or we can just use '|' on __m256i vectors, which generates a suitable instruction depending on -mavx/-mavx2.
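
As a rough illustration of the '|' suggestion, here is a minimal sketch using the GCC generic vector extension. The type name v4di and the helper or4 are made up for this example. It assumes that a 32-byte vector_size type behaves like __m256i for this purpose, and it uses the generic type so that it also compiles without -mavx, in which case the compiler lowers the operation to narrower instructions.

```cpp
#include <cassert>
#include <cstring>

// Generic GCC/Clang vector type: four 64-bit lanes, 32 bytes in total.
// Assumption: this is laid out like __m256i from <immintrin.h>.
typedef long long v4di __attribute__((vector_size(32)));

// Hypothetical helper: lane-wise OR of two 4-element arrays.
// '|' is applied to whole vectors; which instruction is emitted depends
// on the target flags (-mavx, -mavx2, or neither).
inline void or4(const long long* a, const long long* b, long long* out)
{
  assert(a && b && out);
  v4di va, vb;
  std::memcpy(&va, a, sizeof va);   // load without alignment requirements
  std::memcpy(&vb, b, sizeof vb);
  v4di vr = va | vb;                // the vector-extension operator
  std::memcpy(out, &vr, sizeof vr); // store the four result lanes
}
```

Comparing the output of -S with -mavx and with -mavx2 should show which instruction the compiler selects in each case.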

One step to do first.

:-)

Currently the random number engine interface is
inefficient since it returns a single number.  What we need is an
additional interface to return vectors.

Isn't the __generate interface good enough?
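
For reference, a small sketch of how __generate can be called on a distribution to fill a whole range at once. The helper fill_uniform is hypothetical. __generate is a libstdc++ extension, so the sketch guards it with __GLIBCXX__ and falls back to a per-element loop elsewhere. Whether the range version is actually vectorized internally is not something this example shows.

```cpp
#include <cassert>
#include <random>
#include <vector>

// Hypothetical helper: fill a buffer with uniform doubles in [lo, hi).
inline std::vector<double>
fill_uniform(std::size_t n, double lo, double hi, unsigned seed)
{
  assert(lo < hi);
  std::mt19937 engine(seed);
  std::uniform_real_distribution<double> dist(lo, hi);
  std::vector<double> out(n);
#ifdef __GLIBCXX__
  // libstdc++ extension: generate the whole range in one call.
  dist.__generate(out.begin(), out.end(), engine);
#else
  // Portable fallback: one number per call, as in the standard interface.
  for (double& x : out)
    x = dist(engine);
#endif
  return out;
}
```

A caller would then write something like fill_uniform(1024, 0.0, 1.0, 42u) instead of invoking the distribution 1024 times.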

--
Marc Glisse

PS: the patch for mixed scalar-vector operations is posted here:
http://gcc.gnu.org/ml/gcc-patches/2012-09/msg01557.html

After that, C and C++ should be at the same level.
