https://gcc.gnu.org/bugzilla/show_bug.cgi?id=68484
--- Comment #5 from H.J. Lu <hjl.tools at gmail dot com> ---
(In reply to Richard Biener from comment #4)
> As the summary mentions 'volatile' I'll also point to the implementation of
> the intrinsics which have
>
> /* Store four SPFP values.  The address must be 16-byte aligned.  */
> extern __inline void __attribute__((__gnu_inline__, __always_inline__,
> __artificial__))
> _mm_store_ps (float *__P, __m128 __A)
> {
>   *(__v4sf *)__P = (__v4sf)__A;
> }
>
> so they are not using a volatile qualified type to access *__P which means
> the stores are not considered volatile by GCC.
>
> The arguments about strict-aliasing requirements still hold, only __m128 is
> declared as __may_alias__:
>
> /* The Intel API is flexible enough that we must allow aliasing with other
>    vector types, and their scalar components.  */
> typedef float __m128 __attribute__ ((__vector_size__ (16), __may_alias__));
>
> /* Internal data types for implementing the intrinsics.  */
> typedef float __v4sf __attribute__ ((__vector_size__ (16)));
>
> so the v4sf store has regular TBAA rules applied (and the __may_alias__ on
> the by value passed __A has no effect).
>
> -> target "bug", but I'd say an INVALID one.
>
> HJ, I remember the "master" copy of the intrinsics documentation is
> somewhere at Intel - what does that say to the two above issues?
>
> Thus all of this boils down to the question whether the intrinsics are
> implemented correctly (as documented).  The volatile part of it would
> mean to either pessimize all users or that we can't implement the
> intrinsics as C functions.

_mm_store_ps is documented in Intel SDM for movaps.  It doesn't say
anything about aliasing.
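
For illustration only, here is a minimal sketch of two ways a store like the
one quoted above could avoid regular TBAA rules. It is not the xmmintrin.h
implementation and not a proposed patch. The names v4sf_plain, v4sf_alias,
store_ps_alias and store_ps_memcpy are hypothetical; the only things assumed
are GCC's vector_size and may_alias type attributes.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-ins for the header's types: 16-byte float vectors. */
typedef float v4sf_plain __attribute__ ((vector_size (16)));
typedef float v4sf_alias __attribute__ ((vector_size (16), may_alias));

/* Store through a may_alias vector type.  Accesses made through such a
   type are exempt from type-based alias analysis, so the destination
   may have any declared type.  The address must be 16-byte aligned.  */
static inline void
store_ps_alias (void *p, v4sf_plain a)
{
  assert ((uintptr_t) p % 16 == 0);
  *(v4sf_alias *) p = (v4sf_alias) a;
}

/* Alternative: copy the bytes with memcpy.  This has no aliasing or
   alignment requirement, so it does not model the aligned-store
   contract of the original intrinsic.  */
static inline void
store_ps_memcpy (void *p, v4sf_plain a)
{
  memcpy (p, &a, sizeof a);
}
```

With either helper the caller can store into a buffer declared as, say,
unsigned int and read it back through that type. Whether the real intrinsics
should behave this way is exactly the open question in this report.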