This fixes a long-standing problem with GCC's implementation of the PPC64 ELF ABI. If a structure contains a member requiring 128-bit alignment, and that structure is passed as a parameter, the parameter currently receives only 64-bit alignment. This is an error, and is incompatible with correct code generated by the IBM XL compilers.
This situation should be rare, so we feel that it is ok to correct GCC to generate compliant code. However, it is possible that new code may need to link with existing libraries that expect the incorrect parameter passing. For this case, a new option mcompat-align-parm is provided to cause the noncompliant code to be generated. (David, any better choice of name for this option?) Correcting the initial bug exposed another problem with varargs handling. When receiving a parameter having size zero, the alignment specified for the parameter must be honored. Several variations in the compatibility test suite detected that this is not being done. For example, a parameter of type struct { struct { } a[4]; } __attribute__((aligned)) must use the default alignment for the target, which for PowerPC is 128 bits. This was not being done. When I corrected the first problem to generate 128-bit alignment for structs that require it, the un-honored alignment for zero-size arguments in varargs was exposed. This patch corrects that also. In both cases, existing code for MachO/darwin can be used for AIX and Linux. I've provided two tests to verify correct parameter offsets for the first problem, one with the mcompat-align-parm option set, and one without. I did not add tests for the second problem, as the compatibility test suite is already sufficient. Bootstrapped and tested on powerpc64-unknown-linux-gnu with no regressions. Ok for trunk? Thanks, Bill gcc: 2013-08-14 Bill Schmidt <wschm...@linux.vnet.ibm.com> PR target/57949 * doc/invoke.texi: Add documentation of mcompat-align-parm option. * config/rs6000/rs6000.opt: Add mcompat-align-parm option. * config/rs6000/rs6000.c (rs6000_function_arg_boundary): For AIX and Linux, correct BLKmode alignment when 128-bit alignment is required and compatibility flag is not set. (rs6000_gimplify_va_arg): For AIX and Linux, honor specified alignment for zero-size arguments when compatibility flag is not set. gcc/testsuite: 2013-08-14 Bill Schmidt <wschm...@linux.vnet.ibm.com> PR target/57949 * gcc.target/powerpc/pr57949-1.c: New. * gcc.target/powerpc/pr57949-2.c: New. Index: gcc/doc/invoke.texi =================================================================== --- gcc/doc/invoke.texi (revision 201492) +++ gcc/doc/invoke.texi (working copy) @@ -866,7 +866,8 @@ See RS/6000 and PowerPC Options. -msave-toc-indirect -mno-save-toc-indirect @gol -mpower8-fusion -mno-mpower8-fusion -mpower8-vector -mno-power8-vector @gol -mcrypto -mno-crypto -mdirect-move -mno-direct-move @gol --mquad-memory -mno-quad-memory} +-mquad-memory -mno-quad-memory @gol +-mcompat-align-parm -mno-compat-align-parm} @emph{RX Options} @gccoptlist{-m64bit-doubles -m32bit-doubles -fpu -nofpu@gol @@ -18323,6 +18324,22 @@ stack location in the function prologue if the fun a pointer on AIX and 64-bit Linux systems. If the TOC value is not saved in the prologue, it is saved just before the call through the pointer. The @option{-mno-save-toc-indirect} option is the default. + +@item -mcompat-align-parm +@itemx -mno-compat-align-parm +@opindex mcompat-align-parm +Generate (do not generate) code to pass structure parameters with a +maximum alignment of 64 bits, for compatibility with older versions +of GCC. + +Older versions of GCC (prior to 4.9.0) incorrectly did not align a +structure parameter on a 128-bit boundary when that structure contained +a member requiring 128-bit alignment. This is corrected in more +recent versions of GCC. This option may be used to generate code +that is compatible with functions compiled with older versions of +GCC. + +The @option{-mno-compat-align-parm} option is the default. @end table @node RX Options Index: gcc/testsuite/gcc.target/powerpc/pr57949-1.c =================================================================== --- gcc/testsuite/gcc.target/powerpc/pr57949-1.c (revision 0) +++ gcc/testsuite/gcc.target/powerpc/pr57949-1.c (revision 0) @@ -0,0 +1,19 @@ +/* { dg-do compile { target { powerpc64*-*-* && lp64 } } } */ +/* { dg-skip-if "" { powerpc*-*-darwin* } { "*" } { "" } } */ +/* { dg-options "-O2 -mcpu=power7" } */ + +/* Verify that vs is 16-byte aligned in the absence of -mcompat-align-parm. */ + +typedef float v4sf __attribute__ ((vector_size (16))); +struct s { long m; v4sf v; }; +long n; +v4sf ve; + +void pr57949 (long d1, long d2, long d3, long d4, long d5, long d6, + long d7, long d8, long d9, struct s vs) { + n = vs.m; + ve = vs.v; +} + +/* { dg-final { scan-assembler "li \.\*,144" } } */ +/* { dg-final { scan-assembler "ld \.\*,128\\(1\\)" } } */ Index: gcc/testsuite/gcc.target/powerpc/pr57949-2.c =================================================================== --- gcc/testsuite/gcc.target/powerpc/pr57949-2.c (revision 0) +++ gcc/testsuite/gcc.target/powerpc/pr57949-2.c (revision 0) @@ -0,0 +1,19 @@ +/* { dg-do compile { target { powerpc64*-*-* && lp64 } } } */ +/* { dg-skip-if "" { powerpc*-*-darwin* } { "*" } { "" } } */ +/* { dg-options "-O2 -mcpu=power7 -mcompat-align-parm" } */ + +/* Verify that vs is not 16-byte aligned with -mcompat-align-parm. */ + +typedef float v4sf __attribute__ ((vector_size (16))); +struct s { long m; v4sf v; }; +long n; +v4sf ve; + +void pr57949 (long d1, long d2, long d3, long d4, long d5, long d6, + long d7, long d8, long d9, struct s vs) { + n = vs.m; + ve = vs.v; +} + +/* { dg-final { scan-assembler "ld .\*,136\\(1\\)" } } */ +/* { dg-final { scan-assembler "ld .\*,120\\(1\\)" } } */ Index: gcc/config/rs6000/rs6000.opt =================================================================== --- gcc/config/rs6000/rs6000.opt (revision 201492) +++ gcc/config/rs6000/rs6000.opt (working copy) @@ -546,3 +546,7 @@ Use ISA 2.07 transactional memory (HTM) instructio mquad-memory Target Report Mask(QUAD_MEMORY) Var(rs6000_isa_flags) Generate the quad word memory instructions (lq/stq/lqarx/stqcx). + +mcompat-align-parm +Target Report Var(rs6000_compat_align_parm) Init(0) Save +Generate aggregate parameter passing code with at most 64-bit alignment. Index: gcc/config/rs6000/rs6000.c =================================================================== --- gcc/config/rs6000/rs6000.c (revision 201492) +++ gcc/config/rs6000/rs6000.c (working copy) @@ -8344,8 +8344,8 @@ rs6000_function_arg_boundary (enum machine_mode mo || (type && TREE_CODE (type) == VECTOR_TYPE && int_size_in_bytes (type) >= 16)) return 128; - else if (TARGET_MACHO - && rs6000_darwin64_abi + else if (((TARGET_MACHO && rs6000_darwin64_abi) + || (DEFAULT_ABI == ABI_AIX && !rs6000_compat_align_parm)) && mode == BLKmode && type && TYPE_ALIGN (type) > 64) return 128; @@ -9893,8 +9893,9 @@ rs6000_gimplify_va_arg (tree valist, tree type, gi We don't need to check for pass-by-reference because of the test above. We can return a simplifed answer, since we know there's no offset to add. */ - if (TARGET_MACHO - && rs6000_darwin64_abi + if (((TARGET_MACHO + && rs6000_darwin64_abi) + || (DEFAULT_ABI == ABI_AIX && !rs6000_compat_align_parm)) && integer_zerop (TYPE_SIZE (type))) { unsigned HOST_WIDE_INT align, boundary;