Thanks Uros. I incorporated the suggested changes in the attached patch. --Sunil Pandey

i386: Add AVX512 unaligned intrinsics

__m512i _mm512_loadu_epi32( void * sa);
__m512i _mm512_loadu_epi64( void * sa);
void _mm512_storeu_epi32(void * d, __m512i a);
void _mm256_storeu_epi32(void * d, __m256i a);
void _mm_storeu_epi32(void * d, __m128i a);
void _mm512_storeu_epi64(void * d, __m512i a);
void _mm256_storeu_epi64(void * d, __m256i a);
void _mm_storeu_epi64(void * d, __m128i a);

Tested on x86-64.

gcc/

PR target/90980
* config/i386/avx512fintrin.h (_mm512_loadu_epi32): New.
(_mm512_loadu_epi64): Likewise.
(_mm512_storeu_epi32): Likewise.
(_mm512_storeu_epi64): Likewise.
* config/i386/avx512vlintrin.h (_mm_storeu_epi32): New.
(_mm256_storeu_epi32): Likewise.
(_mm_storeu_epi64): Likewise.
(_mm256_storeu_epi64): Likewise.

gcc/testsuite/

PR target/90980
* gcc.target/i386/pr90980-1.c: New test.
* gcc.target/i386/pr90980-2.c: Likewise.
* gcc.target/i386/pr90980-3.c: Likewise.

On Tue, Jul 9, 2019 at 11:39 PM Uros Bizjak <[email protected]> wrote:
>
> On Tue, Jul 9, 2019 at 11:44 PM Sunil Pandey <[email protected]> wrote:
> >
> > __m512i _mm512_loadu_epi32( void * sa);
> > __m512i _mm512_loadu_epi64( void * sa);
> > void _mm512_storeu_epi32(void * d, __m512i a);
> > void _mm256_storeu_epi32(void * d, __m256i a);
> > void _mm_storeu_epi32(void * d, __m128i a);
> > void _mm512_storeu_epi64(void * d, __m512i a);
> > void _mm256_storeu_epi64(void * d, __m256i a);
> > void _mm_storeu_epi64(void * d, __m128i a);
> >
> > Tested on x86-64.
> >
> > OK for trunk?
> >
> > --Sunil Pandey
> >
> >
> > gcc/
> >
> > PR target/90980
> > * config/i386/avx512fintrin.h (__v16si_u): New data type
> > (__v8di_u): Likewise
> > (_mm512_loadu_epi32): New.
> > (_mm512_loadu_epi64): Likewise.
> > (_mm512_storeu_epi32): Likewise.
> > (_mm512_storeu_epi64): Likewise.
> > * config/i386/avx512vlintrin.h (_mm_storeu_epi32): New.
> > (_mm256_storeu_epi32): Likewise.
> > (_mm_storeu_epi64): Likewise.
> > (_mm256_storeu_epi64): Likewise.
> >
> > gcc/testsuite/
> >
> > PR target/90980
> > * gcc.target/i386/avx512f-vmovdqu32-3.c: New test.
> > * gcc.target/i386/avx512f-vmovdqu64-3.c: Likewise.
> > * gcc.target/i386/pr90980-1.c: Likewise.
> > * gcc.target/i386/pr90980-2.c: Likewise.
>
> +/* Internal data types for implementing unaligned version of intrinsics. */
> +typedef int __v16si_u __attribute__ ((__vector_size__ (64),
> + __aligned__ (1)));
> +typedef long long __v8di_u __attribute__ ((__vector_size__ (64),
> + __aligned__ (1)));
>
> You should define only one generic __m512i_u type, something like:
>
> typedef long long __m512i_u __attribute__ ((__vector_size__ (64),
> __may_alias__, __aligned__ (1)));
>
> Please see avxintrin.h how __m256i_u is defined and used.
>
> Uros.
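
For reference, the review suggestion above can be illustrated with a minimal sketch. It is not the code from the attached patch: the names vec512i_u, copy512_unaligned and roundtrip_ok are hypothetical, chosen so they do not clash with the real header types, and the example relies only on the GCC vector extension attributes quoted in the review. It shows a single 64-byte vector type that carries both __may_alias__ and __aligned__ (1), so that it can be used for loads and stores through pointers with no alignment guarantee.

```c
#include <string.h>

/* Hypothetical stand-in for a generic unaligned 512-bit integer type.
   __may_alias__ lets it alias objects of any type; __aligned__ (1)
   tells the compiler not to assume 64-byte alignment.  */
typedef long long vec512i_u __attribute__ ((__vector_size__ (64),
                                            __may_alias__,
                                            __aligned__ (1)));

/* Copy 64 bytes from SRC to DST through the unaligned vector type.
   Neither pointer needs any particular alignment.  */
static inline void
copy512_unaligned (void *dst, const void *src)
{
  *(vec512i_u *) dst = *(const vec512i_u *) src;
}

/* Return 1 if a copy between deliberately misaligned offsets moves
   exactly 64 bytes and leaves the neighbouring bytes untouched.  */
int
roundtrip_ok (void)
{
  unsigned char in[80], out[80];

  for (int i = 0; i < 80; i++)
    {
      in[i] = (unsigned char) i;
      out[i] = 0xff;
    }

  copy512_unaligned (out + 3, in + 1);

  return memcmp (out + 3, in + 1, 64) == 0
         && out[2] == 0xff
         && out[67] == 0xff;
}
```

Because one type covers every element width, the epi32 and epi64 variants would not need separate __v16si_u and __v8di_u typedefs, which appears to be the point of the review comment.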
0001-i386-Add-AVX512-unaligned-intrinsics.patch
