http://gcc.gnu.org/bugzilla/show_bug.cgi?id=57271

            Bug ID: 57271
           Summary: ARM: gcc generates insufficient alignment for memory
                    passed as extra argument for function return large
                    composite type
           Product: gcc
           Version: 4.8.1
            Status: UNCONFIRMED
          Severity: major
          Priority: P3
         Component: c++
          Assignee: unassigned at gcc dot gnu.org
          Reporter: java4ada at yahoo dot com

Created attachment 30109
  --> http://gcc.gnu.org/bugzilla/attachment.cgi?id=30109&action=edit
Testcase and output

Please find enclosed input Vector4.ii and Vector4.s compiled with "./xgcc -fpic
 -mfloat-abi=softfp -mthumb -Os -march=armv7-a -mfpu=neon -S Vector4.ii".

Because initVector4() returns a Vector4 instance, which is 16 bytes in size,
GCC passes an internal memory buffer as an extra first argument to hold the
return value.  This is shown at line 54 of Vector4.s, "add r0,sp,#8", and the
buffer is filled at line 33, "vst1.64 {d16-d17}, [r0:128]".  The 128-bit
alignment hint is there because class Vector4 is declared to be 16-byte
aligned.  The problem is that r0 = sp + 8 is not 16-byte aligned whenever sp
itself is 16-byte aligned, which results in a crash at the vst1.64 [:128]
store.  It seems that GCC does not honor the required alignment when it
allocates the internal memory buffer.
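
To make the arithmetic concrete, here is a minimal sketch.  The Vector4
declaration below is a hypothetical stand-in for the attached testcase (the
real one is in Vector4.ii), and is_aligned() and return_buffer() are helper
names made up for illustration; only the "sp + 8" offset is taken from the
generated assembly.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for the class in the attachment: 16 bytes in size,
// declared 16-byte aligned.
struct alignas(16) Vector4 {
    float x, y, z, w;
};

static_assert(sizeof(Vector4) == 16, "Vector4 is 16 bytes");
static_assert(alignof(Vector4) == 16, "Vector4 requires 16-byte alignment");

// Hypothetical helper: true if addr is a multiple of align (a power of two).
constexpr bool is_aligned(std::uintptr_t addr, std::uintptr_t align) {
    return (addr & (align - 1)) == 0;
}

// Models "add r0,sp,#8": the return buffer is placed 8 bytes above sp.
constexpr std::uintptr_t return_buffer(std::uintptr_t sp) {
    return sp + 8;
}

// Example stack pointer values, chosen arbitrarily for illustration.
inline void check_report_scenario() {
    // sp is 16-byte aligned: the buffer address violates the [r0:128] hint.
    assert(!is_aligned(return_buffer(0x7ffe0000u), alignof(Vector4)));
    // sp is only 8-byte aligned: the buffer happens to be 16-byte aligned.
    assert(is_aligned(return_buffer(0x7ffe0008u), alignof(Vector4)));
}
```

So whether the store faults depends on the value of sp at the call, which is
consistent with the crash described above.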

If Vector4 is declared to be 32-byte aligned, GCC generates extra code to
ensure r0 is properly aligned.  I would expect GCC to do the same for 16-byte
alignment.
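
For reference, one common way to align a buffer address at run time is to
over-allocate and round the address up to the next multiple of the alignment.
The sketch below shows that rounding step only; it is an assumption about the
general technique, not a transcription of the code GCC emits in the 32-byte
case, and align_up() is a name made up for illustration.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: round addr up to the next multiple of align.
// align must be a power of two.
constexpr std::uintptr_t align_up(std::uintptr_t addr, std::uintptr_t align) {
    return (addr + (align - 1)) & ~(align - 1);
}

// Example addresses, chosen arbitrarily for illustration.
inline void check_align_up() {
    // An 8-byte aligned address is bumped to the next 16-byte boundary.
    assert(align_up(0x1008u, 16) == 0x1010u);
    // An already aligned address is left unchanged.
    assert(align_up(0x1010u, 16) == 0x1010u);
    // The same rounding applies to a 32-byte requirement.
    assert(align_up(0x1008u, 32) == 0x1020u);
}
```

Applying this kind of rounding to the return buffer would require reserving up
to align - 1 extra bytes of stack so that the rounded address still lies
inside the allocation.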
