On Wed, Jul 10, 2019 at 9:11 PM Sunil Pandey <skpg...@gmail.com> wrote:
>
> Thanks Uros. I incorporated suggested changes in attached patch.
>
> --Sunil Pandey
>
> i386: Add AVX512 unaligned intrinsics
>
> __m512i _mm512_loadu_epi32( void * sa);
> __m512i _mm512_loadu_epi64( void * sa);
> void _mm512_storeu_epi32(void * d, __m512i a);
> void _mm256_storeu_epi32(void * d, __m256i a);
> void _mm_storeu_epi32(void * d, __m128i a);
> void _mm512_storeu_epi64(void * d, __m512i a);
> void _mm256_storeu_epi64(void * d, __m256i a);
> void _mm_storeu_epi64(void * d, __m128i a);
>
> Tested on x86-64.
>
> gcc/
>
> PR target/90980
> * config/i386/avx512fintrin.h (_mm512_loadu_epi32): New.
> (_mm512_loadu_epi64): Likewise.
> (_mm512_storeu_epi32): Likewise.
> (_mm512_storeu_epi64): Likewise.
> * config/i386/avx512vlintrin.h (_mm_storeu_epi32): New.
> (_mm256_storeu_epi32): Likewise.
> (_mm_storeu_epi64): Likewise.
> (_mm256_storeu_epi64): Likewise.
>
> gcc/testsuite/
>
> PR target/90980
> * gcc.target/i386/pr90980-1.c: New test.
> * gcc.target/i386/pr90980-2.c: Likewise.
> * gcc.target/i386/pr90980-3.c: Likewise.

Looks good, but please put new intrinsics nearby existing intrinsics,
so we will have e.g.:

_mm512_loadu_epi32
_mm512_mask_loadu_epi32
_mm512_maskz_loadu_epi32

and in similar way for other loads and stores.

Uros.

>
> On Tue, Jul 9, 2019 at 11:39 PM Uros Bizjak <ubiz...@gmail.com> wrote:
> >
> > On Tue, Jul 9, 2019 at 11:44 PM Sunil Pandey <skpg...@gmail.com> wrote:
> > >
> > > __m512i _mm512_loadu_epi32( void * sa);
> > > __m512i _mm512_loadu_epi64( void * sa);
> > > void _mm512_storeu_epi32(void * d, __m512i a);
> > > void _mm256_storeu_epi32(void * d, __m256i a);
> > > void _mm_storeu_epi32(void * d, __m128i a);
> > > void _mm512_storeu_epi64(void * d, __m512i a);
> > > void _mm256_storeu_epi64(void * d, __m256i a);
> > > void _mm_storeu_epi64(void * d, __m128i a);
> > >
> > > Tested on x86-64.
> > >
> > > OK for trunk?
> > >
> > > --Sunil Pandey
> > >
> > >
> > > gcc/
> > >
> > > PR target/90980
> > > * config/i386/avx512fintrin.h (__v16si_u): New data type
> > > (__v8di_u): Likewise
> > > (_mm512_loadu_epi32): New.
> > > (_mm512_loadu_epi64): Likewise.
> > > (_mm512_storeu_epi32): Likewise.
> > > (_mm512_storeu_epi64): Likewise.
> > > * config/i386/avx512vlintrin.h (_mm_storeu_epi32): New.
> > > (_mm256_storeu_epi32): Likewise.
> > > (_mm_storeu_epi64): Likewise.
> > > (_mm256_storeu_epi64): Likewise.
> > >
> > > gcc/testsuite/
> > >
> > > PR target/90980
> > > * gcc.target/i386/avx512f-vmovdqu32-3.c: New test.
> > > * gcc.target/i386/avx512f-vmovdqu64-3.c: Likewise.
> > > * gcc.target/i386/pr90980-1.c: Likewise.
> > > * gcc.target/i386/pr90980-2.c: Likewise.
> >
> > +/* Internal data types for implementing unaligned version of intrinsics.
> > */
> > +typedef int __v16si_u __attribute__ ((__vector_size__ (64),
> > + __aligned__ (1)));
> > +typedef long long __v8di_u __attribute__ ((__vector_size__ (64),
> > + __aligned__ (1)));
> >
> > You should define only one generic __m512i_u type, something like:
> >
> > typedef long long __m512i_u __attribute__ ((__vector_size__ (64),
> > __may_alias__, __aligned__ (1)));
> >
> > Please see avxintrin.h how __m256i_u is defined and used.
> >
> > Uros.
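
For illustration only, here is a minimal sketch of the single generic unaligned type suggested in the review. It relies on GCC vector extensions alone, so it should build without AVX512 enabled. The names my_m512i_u, my_copy64_unaligned and my_copy64_demo are hypothetical and are not part of the patch; the intrinsics actually committed to avx512fintrin.h may be implemented differently.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical name, modelled on the typedef proposed in the review:
   a 64-byte integer vector whose alignment is lowered to 1, so the
   compiler must not assume 64-byte alignment when dereferencing it.
   __may_alias__ allows it to access memory of any declared type.  */
typedef long long my_m512i_u __attribute__ ((__vector_size__ (64),
                                             __may_alias__,
                                             __aligned__ (1)));

/* Hypothetical helper: copy 64 bytes between possibly unaligned
   addresses through one vector-typed load and one vector-typed store.  */
static void
my_copy64_unaligned (void *dst, const void *src)
{
  *(my_m512i_u *) dst = *(const my_m512i_u *) src;
}

/* Hypothetical self-check: copies between deliberately misaligned
   buffers and returns 1 if all 64 bytes arrived and the byte just
   before the destination was left untouched.  */
static int
my_copy64_demo (void)
{
  unsigned char src[65], dst[65];

  assert (sizeof (my_m512i_u) == 64);

  for (int i = 0; i < 65; i++)
    {
      src[i] = (unsigned char) i;
      dst[i] = 0;
    }

  /* Offset by one byte so neither pointer is 64-byte aligned.  */
  my_copy64_unaligned (dst + 1, src + 1);

  return memcmp (dst + 1, src + 1, 64) == 0 && dst[0] == 0;
}
```

One type with __may_alias__ can serve both the epi32 and epi64 variants, since the element width does not affect a plain unaligned load or store; this is presumably why a single __m512i_u was requested instead of separate __v16si_u and __v8di_u types.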