Hi,

The GCC vectorizer can't vectorize the following loop even though the target
supports 2-lane SIMD left shift.

    short a[256], b[256];
    void foo (void)
    {
      int i;
      for (i=0; i<256; i++)
        { a[i] = b[i] << 4; }
    }

The reason seems to be that GCC promotes the source from short to int,
performs the left shift on the int type, and finally demotes the result to
convert it back to short. Below is the related tree dump:

     _2 = (intD.1) _1;
     # RANGE [-524288, 524272] NONZERO 4294967280
     _3 = _2 << 4;
     # RANGE [-32768, 32767] NONZERO 65520
     _4 = (short intD.10) _3;
     # .MEM_8 = VDEF <.MEM_14>
     aD.1888[i_13] = _4;

I checked tree-vect-patterns.c and found that there is already a pattern
recognizer, "vect_recog_over_widening_pattern", for such sequences.

But in vect_operation_fits_smaller_type, it only recognizes the sequence
when the promoted type is at least 4 times as wide as the original type. The
reason seems to be that the original proposal at:

      https://gcc.gnu.org/ml/gcc-patches/2011-07/msg01472.html

was meant to handle the following sequence, where three types are involved
and the widths satisfy T_PROMOTED = 2 * T_INTER = 4 * T_ORIG:

      T_ORIG a;
      T_PROMOTED b, c;
      T_INTER d;

      b = (T_PROMOTED) a;
      c = b << 2;
      d = (T_INTER) c;

However, we could also handle the following sequence, where only two types
are involved and T_PROMOTED = 2 * T_ORIG:

      T_ORIG a, d;
      T_PROMOTED b, c;

      b = (T_PROMOTED) a;
      c = b << 2;
      d = (T_ORIG) c;

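Performing the left shift on T_ORIG directly should give the same result as
performing it on T_PROMOTED and then converting back to T_ORIG: the
conversion keeps only the low bits, and the low bits of a left shift do not
depend on the upper bits of the promoted value.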
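
The equivalence claimed above can be checked exhaustively for 16-bit inputs.
The following is only a sketch and not part of the patch: the function names
(shift_wide, shift_narrow, count_mismatches) are made up for illustration, and
it uses unsigned fixed-width types so that the C code itself avoids undefined
behaviour on signed left shifts. It assumes a shift count in the range 0..15.

```c
#include <stdint.h>

/* Wide path, mirroring the promote / shift / demote sequence:
   b = (T_PROMOTED) a; c = b << n; d = (T_ORIG) c.  */
static uint16_t
shift_wide (uint16_t a, unsigned n)
{
  uint32_t b = (uint32_t) a;
  uint32_t c = b << n;
  return (uint16_t) c;
}

/* Narrow path: drop the bits that would be shifted out first, so the
   intermediate value never needs more than 16 significant bits.  */
static uint16_t
shift_narrow (uint16_t a, unsigned n)
{
  uint16_t kept = a & (uint16_t) (0xffffu >> n);
  return (uint16_t) (kept << n);
}

/* Return the number of 16-bit inputs for which the two paths disagree
   (hypothetical helper, assumes 0 <= n <= 15).  */
static unsigned
count_mismatches (unsigned n)
{
  unsigned bad = 0;
  for (uint32_t a = 0; a <= 0xffffu; a++)
    if (shift_wide ((uint16_t) a, n) != shift_narrow ((uint16_t) a, n))
      bad++;
  return bad;
}
```

With the shift count of 4 from the loop above, count_mismatches (4) is
expected to report no disagreement between the two paths.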

x86-64/AArch64/PPC64 bootstrap OK (finished on gcc farms) and no regressions
on check-gcc/g++.

gcc/
2017-09-21  Jon Beniston <j...@beniston.com>

        * tree-vect-patterns.c (vect_operation_fits_smaller_type): Allow
        half_type for LSHIFT_EXPR.

diff --git a/gcc/tree-vect-patterns.c b/gcc/tree-vect-patterns.c
index cdad261..0abf37c 100644
--- a/gcc/tree-vect-patterns.c
+++ b/gcc/tree-vect-patterns.c
@@ -1318,7 +1318,12 @@ vect_operation_fits_smaller_type (gimple *stmt, tree def, tree *new_type,
         break;
 
       case LSHIFT_EXPR:
-        /* Try intermediate type - HALF_TYPE is not enough for sure.  */
+        /* Try half_type.  */
+        if (TYPE_PRECISION (type) == TYPE_PRECISION (half_type) * 2
+           && vect_supportable_shift (code, half_type))
+          break;
+
+        /* Try intermediate type.  */
         if (TYPE_PRECISION (type) < (TYPE_PRECISION (half_type) * 4))
           return false;

