[PATCH, rs6000] Correct optimization of VSX extract-load for little endian

Bill Schmidt Wed, 03 Sep 2014 21:08:58 -0700

Hi,

The *vsx_extract_<mode>_load pattern performs a scalar load of memory
when possible, rather than a vector load followed by an extract.  The
assembly for the pattern always loads the 0th memory doubleword element,
but the pattern match selects the 0th for big-endian and the 1st for
little-endian, leading to wrong results for LE.  This patch changes the
pattern match to look for the 0th element regardless of endianness.
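
To illustrate why matching element 0 is correct on both endiannesses, here
is a minimal sketch. It is not part of the patch. It assumes the GCC generic
vector extension as a portable stand-in for "vector double" (so it also
builds off PowerPC), and the names v2df, extract_elt0, scalar_load_dword0 and
check_equivalence are hypothetical. Element 0 of such a vector sits at the
lowest address in memory, so extracting it after a full vector load should
give the same value as a scalar load from offset 0:

```c
#include <assert.h>
#include <string.h>

/* Assumption: generic GCC/Clang vector type standing in for "vector double". */
typedef double v2df __attribute__ ((vector_size (16)));

/* What the source expresses: load the whole vector, then extract element 0.  */
double extract_elt0 (const v2df *p)
{
  v2df v = *p;
  return v[0];
}

/* What the optimized form amounts to: a scalar load of the doubleword at
   offset 0 from the vector's address.  */
double scalar_load_dword0 (const v2df *p)
{
  double d;
  memcpy (&d, p, sizeof d);
  return d;
}

/* Sanity check: both forms agree on an in-memory vector.  */
void check_equivalence (void)
{
  v2df v = { 1.5, 2.5 };
  assert (extract_elt0 (&v) == scalar_load_dword0 (&v));
}
```

Extracting element 1 would instead correspond to a load from offset 8, which
the pattern's assembly output never performs.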


I ran across this when working on another patch, which provides more
test coverage for this scenario and will be submitted shortly.  For this
patch, I'm just correcting the now-failing vsx-extract-1.c test.

Bootstrapped and tested on powerpc64le-unknown-linux-gnu with no
regressions.  Ok for trunk?  (This should eventually be backported to
4.8 and 4.9 as well...)

Thanks,
Bill


[gcc]

2014-09-03  Bill Schmidt  <wschm...@linux.vnet.ibm.com>

        * config/rs6000/vsx.md (*vsx_extract_<mode>_load): Always match
        selection of 0th memory doubleword, regardless of endianness.

[gcc/testsuite]

2014-09-03  Bill Schmidt  <wschm...@linux.vnet.ibm.com>

        * gcc.target/powerpc/vsx-extract-1.c: Test 0th doubleword
        regardless of endianness.


Index: gcc/config/rs6000/vsx.md
===================================================================
--- gcc/config/rs6000/vsx.md    (revision 214897)
+++ gcc/config/rs6000/vsx.md    (working copy)
@@ -1835,7 +1835,7 @@
   [(set (match_operand:<VS_scalar> 0 "register_operand" "=d,wv,wr")
        (vec_select:<VS_scalar>
         (match_operand:VSX_D 1 "memory_operand" "m,Z,m")
-        (parallel [(match_operand:QI 2 "vsx_scalar_64bit" "wD,wD,wD")])))]
+        (parallel [(const_int 0)])))]
   "VECTOR_MEM_VSX_P (<MODE>mode)"
   "@
    lfd%U1%X1 %0,%1
Index: gcc/testsuite/gcc.target/powerpc/vsx-extract-1.c
===================================================================
--- gcc/testsuite/gcc.target/powerpc/vsx-extract-1.c    (revision 214897)
+++ gcc/testsuite/gcc.target/powerpc/vsx-extract-1.c    (working copy)
@@ -7,10 +7,4 @@
 
 #include <altivec.h>
 
-#if __LITTLE_ENDIAN__
-#define OFFSET 1
-#else
-#define OFFSET 0
-#endif
-
-double get_value (vector double *p) { return vec_extract (*p, OFFSET); }
+double get_value (vector double *p) { return vec_extract (*p, 0); }
