[PATCH], PR 64019, fix power7/power8 regression

Michael Meissner Mon, 01 Dec 2014 14:48:59 -0800

In my change on November 24th (adding support for scalar floating point
values in Altivec registers), there was a regression when Spec 2000 was
compiled for 32-bit big endian power7 systems.  The
rs6000_legitimize_reload_address function generated a reg+offset address for
scalar values.  This resulted in the compiler generating these addresses to
load a constant in some cases in 32-bit mode.  This patch does not give an
optimized address for scalar types if they can go in Altivec registers.  By
not generating an 'optimized' address, reload falls back to trying other
options, and it eventually generates the lfd instruction with an offset
instead of an lxsdx.
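
To illustrate the check this patch adds, here is a minimal standalone sketch
in C.  It is not the GCC source: the sketch_ names, the reduced mode enum, and
the one-field table are hypothetical stand-ins for the real machine modes and
the reg_addr[] array, and it assumes DFmode is the mode flagged as able to go
in Altivec registers.  The real condition in rs6000_legitimize_reload_address
has further tests that are left out here.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins for a few GCC machine modes (not the real enum).  */
enum sketch_mode
{
  SK_SImode,
  SK_DFmode,
  SK_TFmode,
  SK_TDmode,
  SK_NUM_MODES
};

/* Reduced model of a reg_addr[] entry: only the flag the patch tests.  */
struct sketch_reg_addr
{
  bool scalar_in_vmx_p;
};

/* Assumption for this sketch: only DFmode may live in Altivec registers.  */
const struct sketch_reg_addr sketch_reg_addr[SK_NUM_MODES] = {
  [SK_DFmode] = { true }
};

/* Return true if the 'optimized' reg+offset (LO_SUM) address may be offered
   to reload for MODE.  Modes that can go in Altivec registers are rejected,
   because those registers have no d-form (reg+offset) addressing.  */
bool
sketch_use_lo_sum_p (enum sketch_mode mode)
{
  return (!sketch_reg_addr[mode].scalar_in_vmx_p
          && mode != SK_TFmode
          && mode != SK_TDmode);
}

/* Small usage example: DFmode is now refused, SImode is unaffected.  */
void
sketch_self_check (void)
{
  assert (!sketch_use_lo_sum_p (SK_DFmode));
  assert (sketch_use_lo_sum_p (SK_SImode));
}
```

In this model, refusing the address for DFmode corresponds to letting reload
fall back to its other options, as described above.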


I have bootstrapped these patches on big endian power7, big endian power8, and
little endian power8 systems, and there were no regressions.  Is the patch OK
to install?

[gcc]
2014-12-01  Michael Meissner  <meiss...@linux.vnet.ibm.com>

        PR target/64019
        * config/rs6000/rs6000.c (rs6000_legitimize_reload_address): Do
        not create LO_SUM address for constant addresses if the type can
        go in Altivec registers.

[gcc/testsuite]
2014-12-01  Michael Meissner  <meiss...@linux.vnet.ibm.com>

        PR target/64019
        * gcc.target/powerpc/pr64019.c: New file.

-- 
Michael Meissner, IBM
IBM, M/S 2506R, 550 King Street, Littleton, MA 01460-6245, USA
email: meiss...@linux.vnet.ibm.com, phone: +1 (978) 899-4797

Index: gcc/config/rs6000/rs6000.c
===================================================================
--- gcc/config/rs6000/rs6000.c  (revision 218090)
+++ gcc/config/rs6000/rs6000.c  (working copy)
@@ -7593,7 +7593,11 @@ rs6000_legitimize_reload_address (rtx x,
         naturally aligned.  Since we say the address is good here, we
         can't disable offsets from LO_SUMs in mem_operand_gpr.
         FIXME: Allow offset from lo_sum for other modes too, when
-        mem is sufficiently aligned.  */
+        mem is sufficiently aligned.
+
+        Also disallow this if the type can go in VMX/Altivec registers, since
+        those registers do not have d-form (reg+offset) address modes.  */
+      && !reg_addr[mode].scalar_in_vmx_p
       && mode != TFmode
       && mode != TDmode
       && (mode != TImode || !TARGET_VSX_TIMODE)
Index: gcc/testsuite/gcc.target/powerpc/pr64019.c
===================================================================
--- gcc/testsuite/gcc.target/powerpc/pr64019.c  (revision 0)
+++ gcc/testsuite/gcc.target/powerpc/pr64019.c  (revision 0)
@@ -0,0 +1,71 @@
+/* { dg-do compile { target { powerpc*-*-* } } } */
+/* { dg-skip-if "" { powerpc*-*-darwin* } { "*" } { "" } } */
+/* { dg-require-effective-target powerpc_vsx_ok } */
+/* { dg-skip-if "do not override -mcpu" { powerpc*-*-* } { "-mcpu=*" } { "-mcpu=power7" } } */
+/* { dg-options "-O2 -ffast-math -mcpu=power7" } */
+
+#include <math.h>
+
+typedef struct
+{
+  double x, y, z;
+  double q, a, b, mass;
+  double vx, vy, vz, vw, dx, dy, dz;
+}
+ATOM;
+int
+u_f_nonbon (lambda)
+     double lambda;
+{
+  double r, r0, xt, yt, zt;
+  double lcutoff, cutoff, get_f_variable ();
+  double rdebye;
+  int inbond, inangle, i;
+  ATOM *a1, *a2, *bonded[10], *angled[10];
+  ATOM *(*use)[];
+  int uselist (), nuse, used;
+  ATOM *cp, *bp;
+  int a_number (), inbuffer;
+  double (*buffer)[], xx, yy, zz, k;
+  int invector, atomsused, ii, jj, imax;
+  double (*vector)[];
+  ATOM *(*atms)[];
+  double dielectric;
+  rdebye = cutoff / 2.;
+  dielectric = get_f_variable ("dielec");
+  imax = a_number ();
+  for (jj = 1; jj < imax; jj++, a1 = bp)
+    {
+      if ((*use)[used] == a1)
+       {
+         used += 1;
+       }
+      while ((*use)[used] != a1)
+       {
+         for (i = 0; i < inbuffer; i++)
+           {
+           }
+         xx = a1->x + lambda * a1->dx;
+         yy = a1->y + lambda * a1->dy;
+         zz = a1->z + lambda * a1->dz;
+         for (i = 0; i < inbuffer; i++)
+           {
+             xt = xx - (*buffer)[3 * i];
+             yt = yy - (*buffer)[3 * i + 1];
+             zt = zz - (*buffer)[3 * i + 2];
+             r = xt * xt + yt * yt + zt * zt;
+             r0 = sqrt (r);
+             xt = xt / r0;
+             zt = zt / r0;
+             k =
+               -a1->q * (*atms)[i]->q * dielectric * exp (-r0 / rdebye) *
+               (1. / (rdebye * r0) + 1. / r);
+             k += a1->a * (*atms)[i]->a / r / r0 * 6;
+             k -= a1->b * (*atms)[i]->b / r / r / r0 * 12;
+             (*vector)[3 * i] = xt * k;
+             (*vector)[3 * i + 1] = yt * k;
+             (*vector)[3 * i + 2] = zt * k;
+           }
+       }
+    }
+}
