http://gcc.gnu.org/bugzilla/show_bug.cgi?id=43725
--- Comment #3 from joseph at codesourcery dot com <joseph at codesourcery dot com> 2010-10-04 23:45:57 UTC ---
On Mon, 4 Oct 2010, siarhei.siamashka at gmail dot com wrote:

> http://gcc.gnu.org/bugzilla/show_bug.cgi?id=43725
>
> --- Comment #2 from Siarhei Siamashka <siarhei.siamashka at gmail dot com> 2010-10-04 22:59:56 UTC ---
> (In reply to comment #1)
> > So the compiler is correct not to be using vld1 for this code. The memory
> > format of int32x4_t is defined to be the format of a neon register that has
> > been filled from an array of int32 values and then stored to memory using VSTM
> > (or equivalent sequence). The implication of all this is that int32x4_t does
> > not (necessarily) have the same memory layout as int32_t[4].
>
> Could you elaborate on this? Specifically about the case when memory format for
> VSTM and VST1 may differ.

Big-endian. I previously explained the issues with big-endian NEON vectors
in GCC at length:

http://gcc.gnu.org/ml/gcc-patches/2010-06/msg00409.html
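
For readers who want the big-endian case spelled out, here is a minimal sketch in portable C (not NEON code) of why the two store forms can disagree. It assumes that VLD1.32/VST1.32 map array element i to lane i, and that VSTM stores each D register as a single 64-bit doubleword, so on big-endian the upper 32 bits (lane 1) land at the lower address. The type dreg_model and the model_* functions are hypothetical names made up for this illustration, and byte order within each 32-bit element is ignored.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical model of one 64-bit D register holding two 32-bit lanes.
   lane[0] stands for bits 31:0, lane[1] for bits 63:32. */
typedef struct {
    uint32_t lane[2];
} dreg_model;

/* Model of VLD1.32 into a Q register (two D registers):
   array element i goes to lane i, independent of endianness. */
void model_vld1_32(dreg_model q[2], const uint32_t src[4])
{
    for (size_t i = 0; i < 4; i++)
        q[i / 2].lane[i % 2] = src[i];
}

/* Model of VST1.32: lane i is written back as array element i. */
void model_vst1_32(uint32_t dst[4], const dreg_model q[2])
{
    for (size_t i = 0; i < 4; i++)
        dst[i] = q[i / 2].lane[i % 2];
}

/* Model of VSTM on D registers: each register is stored as one 64-bit
   doubleword. Assumption: on big-endian the most significant word
   (lane 1) is stored first; on little-endian lane 0 is stored first. */
void model_vstm(uint32_t dst[4], const dreg_model q[2], int big_endian)
{
    assert(dst != NULL && q != NULL);
    for (size_t r = 0; r < 2; r++) {
        dst[2 * r + 0] = q[r].lane[big_endian ? 1 : 0];
        dst[2 * r + 1] = q[r].lane[big_endian ? 0 : 1];
    }
}
```

Under these assumptions, loading {10, 11, 12, 13} with model_vld1_32 and storing with model_vstm in big-endian mode yields the word order {11, 10, 13, 12}, whereas model_vst1_32 (and model_vstm in little-endian mode) reproduce the original array order. That difference is the sense in which int32x4_t need not match int32_t[4] in memory.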