Re: [PATCH 2/3] Refactor widen_plus as internal_fn

Andre Vieira (lists) via Gcc-patches Mon, 15 May 2023 04:54:11 -0700


On 15/05/2023 12:01, Richard Biener wrote:

On Mon, 15 May 2023, Richard Sandiford wrote:

Richard Biener <rguent...@suse.de> writes:

On Fri, 12 May 2023, Richard Sandiford wrote:

Richard Biener <rguent...@suse.de> writes:

On Fri, 12 May 2023, Andre Vieira (lists) wrote:

I think I have dealt with most of your comments. There are quite a few
changes, and I think it's all a bit simpler now. I made some other changes to
the costing in tree-inline.cc and gimple-range-op.cc in which I try to preserve
the same behaviour as we had with the tree codes before. I also added some
extra checks to tree-cfg.cc that made sense to me.

I am still regression testing the gimple-range-op change, as that was a last
minute change, but the rest survived a bootstrap and regression test on
aarch64-unknown-linux-gnu.

cover letter:

This patch replaces the existing tree_code widen_plus and widen_minus
patterns with internal_fn versions.

DEF_INTERNAL_OPTAB_WIDENING_HILO_FN and DEF_INTERNAL_OPTAB_NARROWING_HILO_FN
are like DEF_INTERNAL_SIGNED_OPTAB_FN and DEF_INTERNAL_OPTAB_FN respectively
except they provide convenience wrappers for defining conversions that require
a hi/lo split.  Each definition for <NAME> will require optabs for _hi and _lo
and each of those will also require a signed and unsigned version in the case
of widening. The hi/lo pair is necessary because the widening and narrowing
operations take n narrow elements as inputs and return n/2 wide elements as
outputs. The 'lo' operation operates on the first n/2 elements of input. The
'hi' operation operates on the second n/2 elements of input. Defining an
internal_fn along with hi/lo variations allows a single internal function to
be returned from a vect_recog function that will later be expanded to hi/lo.


  For example:
  IFN_VEC_WIDEN_PLUS -> IFN_VEC_WIDEN_PLUS_HI, IFN_VEC_WIDEN_PLUS_LO
  for aarch64:
    IFN_VEC_WIDEN_PLUS_HI -> vec_widen_<su>add_hi_<mode> -> (u/s)addl2
    IFN_VEC_WIDEN_PLUS_LO -> vec_widen_<su>add_lo_<mode> -> (u/s)addl

This gives the same functionality as the previous WIDEN_PLUS/WIDEN_MINUS tree
codes which are expanded into VEC_WIDEN_PLUS_LO, VEC_WIDEN_PLUS_HI.
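
As a rough illustration of the hi/lo split described in the cover letter, the
following scalar model treats a vector as a fixed-size array. It is a sketch
only, not GCC code: the names widen_plus_lo, widen_plus_hi, narrow_vec and
wide_vec are made up for this example, the element count is fixed at 8, and
any endianness-dependent lane ordering is ignored.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical model: N narrow inputs, N/2 wide outputs per half.
constexpr std::size_t N = 8;
using narrow_vec = std::array<std::uint8_t, N>;
using wide_vec = std::array<std::uint16_t, N / 2>;

// "lo" reads input elements [0, N/2), as in the cover letter.
wide_vec widen_plus_lo (const narrow_vec &a, const narrow_vec &b)
{
  wide_vec r{};
  for (std::size_t i = 0; i < N / 2; ++i)
    // Integer promotion makes the sum exact before it is stored.
    r[i] = static_cast<std::uint16_t> (a[i] + b[i]);
  return r;
}

// "hi" reads input elements [N/2, N).
wide_vec widen_plus_hi (const narrow_vec &a, const narrow_vec &b)
{
  wide_vec r{};
  for (std::size_t i = 0; i < N / 2; ++i)
    r[i] = static_cast<std::uint16_t> (a[i + N / 2] + b[i + N / 2]);
  return r;
}
```

Calling both functions on the same pair of inputs yields the two wide halves
that together cover all N input lanes, which is why a single combined
operation needs a pair of instructions on a target that only has hi/lo forms.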


What I still don't understand is why we are so narrowly focused on
HI/LO.  We need a combined scalar IFN for pattern selection (not
sure why that's now called _HILO, I expected no suffix).  Then there
are three ways the target can implement this:

  1) with a widen_[su]add<mode> instruction - I _think_ that's what
     RISCV is going to offer since it is a target where vector modes
     have "padding" (aka you cannot subreg a V2SI to get V4HI).  Instead
     RVV can do a V4HI to V4SI widening and widening add/subtract
     using vwadd[u] and vwsub[u] (the HI->SI widening is actually
     done with a widening add of zero - eh).
     IIRC GCN is the same here.


SVE currently does this too, but the addition and widening are
separate operations.  E.g. in principle there's no reason why
you can't sign-extend one operand, zero-extend the other, and
then add the result together.  Or you could extend them from
different sizes (QI and HI).  All of those are supported
(if the costing allows them).


I see.  So why does the target then expose widen_[su]add<mode> at all?


It shouldn't (need to) do that.  I don't think we should have an optab
for the unsplit operation.

At least on SVE, we really want the extensions to be fused with loads
(where possible) rather than with arithmetic.

We can still do the widening arithmetic in one go.  It's just that
fusing with the loads works for the mixed-sign and mixed-size cases,
and can handle more than just doubling the element size.

If the target has operations to do combined extending and adding (or
whatever), then at the moment we rely on combine to generate them.

So I think this case is separate from Andre's work.  The addition
itself is just an ordinary addition, and any widening happens by
vectorising a CONVERT/NOP_EXPR.
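
The "extend separately, then add normally" scheme described for SVE can be
modelled in scalar code roughly as below. This is only a sketch under the
assumption of one signed QI-sized operand and one unsigned HI-sized operand;
the function name extend_then_add is invented for the example and does not
correspond to anything in GCC.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical model: extension and addition are separate steps, so each
// operand can have its own signedness and its own source width.
std::vector<std::int32_t>
extend_then_add (const std::vector<std::int8_t> &a,    // signed, QI-sized
                 const std::vector<std::uint16_t> &b)  // unsigned, HI-sized
{
  assert (a.size () == b.size ());
  std::vector<std::int32_t> result (a.size ());
  for (std::size_t i = 0; i < a.size (); ++i)
    {
      std::int32_t wide_a = a[i];   // sign extension
      std::int32_t wide_b = b[i];   // zero extension
      result[i] = wide_a + wide_b;  // ordinary addition on the wide type
    }
  return result;
}
```

Because the addition only ever sees already-widened values, no dedicated
widening-add operation is involved in this model.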

  2) with a widen_[su]add{_lo,_hi}<mode> combo - that's what the tree
     codes currently support (exclusively)
  3) similar, but widen_[su]add{_even,_odd}<mode>

that said, things like decomposes_to_hilo_fn_p look to paint us into
a 2) corner without good reason.
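
To make the difference between options 2) and 3) concrete, here is a scalar
sketch of which input lanes each half reads. The names split_kind and
widen_add_half are hypothetical and exist only for this illustration; the
element types are an arbitrary choice.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical names; this only models the lane selection of each scheme.
enum class split_kind { lo_hi, even_odd };

// Return the widened sums for one half: half_index 0 means lo (or even),
// half_index 1 means hi (or odd).
std::vector<std::int32_t>
widen_add_half (const std::vector<std::int16_t> &a,
                const std::vector<std::int16_t> &b,
                split_kind kind, unsigned half_index)
{
  assert (a.size () == b.size () && a.size () % 2 == 0 && half_index < 2);
  const std::size_t half = a.size () / 2;
  std::vector<std::int32_t> result (half);
  for (std::size_t i = 0; i < half; ++i)
    {
      // lo/hi reads a contiguous half; even/odd reads every other lane.
      std::size_t src = (kind == split_kind::lo_hi
                         ? half_index * half + i
                         : 2 * i + half_index);
      result[i] = static_cast<std::int32_t> (a[src]) + b[src];
    }
  return result;
}
```

In both schemes the two halves together cover every input lane exactly once;
only the order in which the wide results come out differs.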


I suppose one question is: how much of the patch is really specific
to HI/LO, and how much is just grouping two halves together?


Yep, that I don't know for sure.

  The nice
thing about the internal-fn grouping macros is that, if (3) is
implemented in future, the structure will strongly encourage even/odd
pairs to be supported for all operations that support hi/lo.  That is,
I would expect the grouping macros to be extended to define even/odd
ifns alongside hi/lo ones, rather than adding separate definitions
for even/odd functions.

If so, at least from the internal-fn.* side of things, I think the question
is whether it's OK to stick with hilo names for now, or whether we should
use more forward-looking names.


I think for parts that are independent we could use a more
forward-looking name.  Maybe _halves?


Using _halves for the ifn macros sounds good to me FWIW.

But I'm also not sure
how much of that is really needed (it seems to be tied around
optimizing optabs space?)


Not sure what you mean by "this".  Optabs space shouldn't be a problem
though.  The optab encoding gives us a full int to play with, and it
could easily go up to 64 bits if necessary/convenient.

At least on the internal-fn.* side, the aim is really just to establish
a regular structure, so that we don't have arbitrary differences between
different widening operations, or too much cut-&-paste.
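
The "regular structure" argument can be illustrated with a small X-macro
sketch. Everything here (WIDENING_OPS, DEF_GROUP, toy_fn, is_combined_fn,
lookup_halves) is a made-up stand-in for internal-fn.def and its users, not
the real GCC macros; it only shows how one grouping macro gives every
operation the same combined/LO/HI layout.

```cpp
#include <cassert>

// Hypothetical stand-in for a .def file: one line per widening operation.
#define WIDENING_OPS(X) \
  X (WIDEN_PLUS)        \
  X (WIDEN_MINUS)

// Each entry expands to a combined code followed by its LO and HI variants,
// so the layout is identical for every operation.
#define DEF_GROUP(NAME) FN_##NAME, FN_##NAME##_LO, FN_##NAME##_HI,
enum toy_fn { WIDENING_OPS (DEF_GROUP) FN_LAST };
#undef DEF_GROUP

// True only for the combined codes; generated from the same list.
inline bool is_combined_fn (toy_fn fn)
{
  switch (fn)
    {
#define CASE_GROUP(NAME) case FN_##NAME:
    WIDENING_OPS (CASE_GROUP)
#undef CASE_GROUP
      return true;
    default:
      return false;
    }
}

// Relies on the fixed combined/LO/HI ordering produced by DEF_GROUP.
inline void lookup_halves (toy_fn fn, toy_fn *lo, toy_fn *hi)
{
  assert (is_combined_fn (fn));
  *lo = static_cast<toy_fn> (fn + 1);
  *hi = static_cast<toy_fn> (fn + 2);
}
```

Adding a third operation to the list would automatically give it all three
codes and make both helpers handle it, which is the kind of consistency the
grouping macros are meant to encourage.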


Hmm, I'm looking at the need for the std::map and
internal_fn_hilo_keys_array and internal_fn_hilo_values_array.
The vectorizer pieces contain

+  if (code.is_fn_code ())
+     {
+      internal_fn ifn = as_internal_fn ((combined_fn) code);
+      gcc_assert (decomposes_to_hilo_fn_p (ifn));
+
+      internal_fn lo, hi;
+      lookup_hilo_internal_fn (ifn, &lo, &hi);
+      *code1 = as_combined_fn (lo);
+      *code2 = as_combined_fn (hi);
+      optab1 = lookup_hilo_ifn_optab (lo, !TYPE_UNSIGNED (vectype));
+      optab2 = lookup_hilo_ifn_optab (hi, !TYPE_UNSIGNED (vectype));

so that tries to automatically associate the scalar widening IFN
with the set(s) of IFN pairs we can split to.  But then this
list should be static and there's no need to create a std::map?
Maybe gencfn-macros.cc can be enhanced to output these static
cases?  Or the vectorizer could (as it did previously) simply
open-code the handled cases (I guess since we deal with two
cases only now I'd prefer that).
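
For comparison, the open-coded alternative suggested above might look
something like the following. This is a sketch with invented names (toy_ifn,
split_to_lo_hi) standing in for the real internal_fn codes, and it assumes
only the two operations mentioned in the thread need handling.

```cpp
#include <cassert>

// Hypothetical stand-ins for the internal_fn codes discussed in the thread.
enum toy_ifn
{
  TOY_VEC_WIDEN_PLUS, TOY_VEC_WIDEN_PLUS_LO, TOY_VEC_WIDEN_PLUS_HI,
  TOY_VEC_WIDEN_MINUS, TOY_VEC_WIDEN_MINUS_LO, TOY_VEC_WIDEN_MINUS_HI,
  TOY_OTHER
};

// Open-coded lookup: each handled case is spelled out explicitly, and codes
// without a lo/hi pair are rejected by returning false.
bool split_to_lo_hi (toy_ifn ifn, toy_ifn *lo, toy_ifn *hi)
{
  assert (lo && hi);
  switch (ifn)
    {
    case TOY_VEC_WIDEN_PLUS:
      *lo = TOY_VEC_WIDEN_PLUS_LO;
      *hi = TOY_VEC_WIDEN_PLUS_HI;
      return true;
    case TOY_VEC_WIDEN_MINUS:
      *lo = TOY_VEC_WIDEN_MINUS_LO;
      *hi = TOY_VEC_WIDEN_MINUS_HI;
      return true;
    default:
      return false;
    }
}
```

The trade-off is that this form does not depend on enum ordering or on any
lookup table, but every new operation has to be added by hand.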

Thanks,
Richard.

Thanks,
Richard

The patch I uploaded last no longer has std::map nor
internal_fn_hilo_keys_array and internal_fn_hilo_values_array. (I've attached
it again)

I'm not sure I understand the _halves suggestion. Do you mean that for the
cases where I had _hilo or _HILO before we rename that to _halves/_HALVES, such
that it later represents both the _hi/_lo separation and _even/_odd?

And am I correct to assume we are just giving up on the idea of having an
INTERNAL_OPTAB_FN for 1)?


Kind regards,
Andre

diff --git a/gcc/config/aarch64/aarch64-simd.md b/gcc/config/aarch64/aarch64-simd.md
index bfc98a8d943467b33390defab9682f44efab5907..ffbbecb9409e1c2835d658c2a8855cd0e955c0f2 100644
--- a/gcc/config/aarch64/aarch64-simd.md
+++ b/gcc/config/aarch64/aarch64-simd.md
@@ -4626,7 +4626,7 @@
   [(set_attr "type" "neon_<ADDSUB:optab>_long")]
 )
 
-(define_expand "vec_widen_<su>addl_lo_<mode>"
+(define_expand "vec_widen_<su>add_lo_<mode>"
   [(match_operand:<VWIDE> 0 "register_operand")
    (ANY_EXTEND:<VWIDE> (match_operand:VQW 1 "register_operand"))
    (ANY_EXTEND:<VWIDE> (match_operand:VQW 2 "register_operand"))]
@@ -4638,7 +4638,7 @@
   DONE;
 })
 
-(define_expand "vec_widen_<su>addl_hi_<mode>"
+(define_expand "vec_widen_<su>add_hi_<mode>"
   [(match_operand:<VWIDE> 0 "register_operand")
    (ANY_EXTEND:<VWIDE> (match_operand:VQW 1 "register_operand"))
    (ANY_EXTEND:<VWIDE> (match_operand:VQW 2 "register_operand"))]
@@ -4650,7 +4650,7 @@
   DONE;
 })
 
-(define_expand "vec_widen_<su>subl_lo_<mode>"
+(define_expand "vec_widen_<su>sub_lo_<mode>"
   [(match_operand:<VWIDE> 0 "register_operand")
    (ANY_EXTEND:<VWIDE> (match_operand:VQW 1 "register_operand"))
    (ANY_EXTEND:<VWIDE> (match_operand:VQW 2 "register_operand"))]
@@ -4662,7 +4662,7 @@
   DONE;
 })
 
-(define_expand "vec_widen_<su>subl_hi_<mode>"
+(define_expand "vec_widen_<su>sub_hi_<mode>"
   [(match_operand:<VWIDE> 0 "register_operand")
    (ANY_EXTEND:<VWIDE> (match_operand:VQW 1 "register_operand"))
    (ANY_EXTEND:<VWIDE> (match_operand:VQW 2 "register_operand"))]
diff --git a/gcc/doc/generic.texi b/gcc/doc/generic.texi
index 8b2882da4fe7da07d22b4e5384d049ba7d3907bf..0fd7e6cce8bbd4ecb8027b702722adcf6c32eb55 100644
--- a/gcc/doc/generic.texi
+++ b/gcc/doc/generic.texi
@@ -1811,6 +1811,10 @@ a value from @code{enum annot_expr_kind}, the third is an @code{INTEGER_CST}.
 @tindex VEC_RSHIFT_EXPR
 @tindex VEC_WIDEN_MULT_HI_EXPR
 @tindex VEC_WIDEN_MULT_LO_EXPR
+@tindex IFN_VEC_WIDEN_PLUS_HI
+@tindex IFN_VEC_WIDEN_PLUS_LO
+@tindex IFN_VEC_WIDEN_MINUS_HI
+@tindex IFN_VEC_WIDEN_MINUS_LO
 @tindex VEC_WIDEN_PLUS_HI_EXPR
 @tindex VEC_WIDEN_PLUS_LO_EXPR
 @tindex VEC_WIDEN_MINUS_HI_EXPR
@@ -1861,6 +1865,33 @@ vector of @code{N/2} products. In the case of @code{VEC_WIDEN_MULT_LO_EXPR} the
 low @code{N/2} elements of the two vector are multiplied to produce the
 vector of @code{N/2} products.
 
+@item IFN_VEC_WIDEN_PLUS_HI
+@itemx IFN_VEC_WIDEN_PLUS_LO
+These internal functions represent widening vector addition of the high and low
+parts of the two input vectors, respectively.  Their operands are vectors that
+contain the same number of elements (@code{N}) of the same integral type. The
+result is a vector that contains half as many elements, of an integral type
+whose size is twice as wide.  In the case of @code{IFN_VEC_WIDEN_PLUS_HI} the
+high @code{N/2} elements of the two vectors are added to produce the vector of
+@code{N/2} sums.  In the case of @code{IFN_VEC_WIDEN_PLUS_LO} the low
+@code{N/2} elements of the two vectors are added to produce the vector of
+@code{N/2} sums.
+
+@item IFN_VEC_WIDEN_MINUS_HI
+@itemx IFN_VEC_WIDEN_MINUS_LO
+These internal functions represent widening vector subtraction of the high and
+low parts of the two input vectors, respectively.  Their operands are vectors
+that contain the same number of elements (@code{N}) of the same integral type.
+The high/low elements of the second vector are subtracted from the high/low
+elements of the first. The result is a vector that contains half as many
+elements, of an integral type whose size is twice as wide.  In the case of
+@code{IFN_VEC_WIDEN_MINUS_HI} the high @code{N/2} elements of the second
+vector are subtracted from the high @code{N/2} of the first to produce the
+vector of @code{N/2} differences.  In the case of
+@code{IFN_VEC_WIDEN_MINUS_LO} the low @code{N/2} elements of the second
+vector are subtracted from the low @code{N/2} of the first to produce the
+vector of @code{N/2} differences.
+
 @item VEC_WIDEN_PLUS_HI_EXPR
 @itemx VEC_WIDEN_PLUS_LO_EXPR
 These nodes represent widening vector addition of the high and low parts of
diff --git a/gcc/gimple-range-op.cc b/gcc/gimple-range-op.cc
index 594bd3043f0e944299ddfff219f757ef15a3dd61..66636d82df27626e7911efd0cb8526921b39633f 100644
--- a/gcc/gimple-range-op.cc
+++ b/gcc/gimple-range-op.cc
@@ -1187,6 +1187,7 @@ gimple_range_op_handler::maybe_non_standard ()
 {
   range_operator *signed_op = ptr_op_widen_mult_signed;
   range_operator *unsigned_op = ptr_op_widen_mult_unsigned;
+  bool signed1, signed2, signed_ret;
   if (gimple_code (m_stmt) == GIMPLE_ASSIGN)
     switch (gimple_assign_rhs_code (m_stmt))
       {
@@ -1202,32 +1203,55 @@ gimple_range_op_handler::maybe_non_standard ()
          m_op1 = gimple_assign_rhs1 (m_stmt);
          m_op2 = gimple_assign_rhs2 (m_stmt);
          tree ret = gimple_assign_lhs (m_stmt);
-         bool signed1 = TYPE_SIGN (TREE_TYPE (m_op1)) == SIGNED;
-         bool signed2 = TYPE_SIGN (TREE_TYPE (m_op2)) == SIGNED;
-         bool signed_ret = TYPE_SIGN (TREE_TYPE (ret)) == SIGNED;
-
-         /* Normally these operands should all have the same sign, but
-            some passes and violate this by taking mismatched sign args.  At
-            the moment the only one that's possible is mismatch inputs and
-            unsigned output.  Once ranger supports signs for the operands we
-            can properly fix it,  for now only accept the case we can do
-            correctly.  */
-         if ((signed1 ^ signed2) && signed_ret)
-           return;
-
-         m_valid = true;
-         if (signed2 && !signed1)
-           std::swap (m_op1, m_op2);
-
-         if (signed1 || signed2)
-           m_int = signed_op;
-         else
-           m_int = unsigned_op;
+         signed1 = TYPE_SIGN (TREE_TYPE (m_op1)) == SIGNED;
+         signed2 = TYPE_SIGN (TREE_TYPE (m_op2)) == SIGNED;
+         signed_ret = TYPE_SIGN (TREE_TYPE (ret)) == SIGNED;
          break;
        }
        default:
-         break;
+         return;
       }
+  else if (gimple_code (m_stmt) == GIMPLE_CALL
+      && gimple_call_internal_p (m_stmt)
+      && gimple_get_lhs (m_stmt) != NULL_TREE)
+    switch (gimple_call_internal_fn (m_stmt))
+      {
+      case IFN_VEC_WIDEN_PLUS_LO:
+      case IFN_VEC_WIDEN_PLUS_HI:
+         {
+           signed_op = ptr_op_widen_plus_signed;
+           unsigned_op = ptr_op_widen_plus_unsigned;
+           m_valid = false;
+           m_op1 = gimple_call_arg (m_stmt, 0);
+           m_op2 = gimple_call_arg (m_stmt, 1);
+           tree ret = gimple_get_lhs (m_stmt);
+           signed1 = TYPE_SIGN (TREE_TYPE (m_op1)) == SIGNED;
+           signed2 = TYPE_SIGN (TREE_TYPE (m_op2)) == SIGNED;
+           signed_ret = TYPE_SIGN (TREE_TYPE (ret)) == SIGNED;
+           break;
+         }
+      default:
+       return;
+      }
+  else
+    return;
+
+    /* Normally these operands should all have the same sign, but some passes
+       violate this by taking mismatched sign args.  At the moment the only
+       one that's possible is mismatched inputs and unsigned output.  Once
+       ranger supports signs for the operands we can properly fix it; for now
+       only accept the case we can do correctly.  */
+    if ((signed1 ^ signed2) && signed_ret)
+      return;
+
+    m_valid = true;
+    if (signed2 && !signed1)
+      std::swap (m_op1, m_op2);
+
+    if (signed1 || signed2)
+      m_int = signed_op;
+    else
+      m_int = unsigned_op;
 }
 
 // Set up a gimple_range_op_handler for any built in function which can be
diff --git a/gcc/internal-fn.cc b/gcc/internal-fn.cc
index 5c9da73ea11f8060b18dcf513599c9694fa4f2ad..1acea5ae33046b70de247b1688aea874d9956abc 100644
--- a/gcc/internal-fn.cc
+++ b/gcc/internal-fn.cc
@@ -90,6 +90,19 @@ lookup_internal_fn (const char *name)
   return entry ? *entry : IFN_LAST;
 }
 
+/*  Given an internal_fn IFN that is a HILO function, return its corresponding
+    LO and HI internal_fns.  */
+
+extern void
+lookup_hilo_internal_fn (internal_fn ifn, internal_fn *lo, internal_fn *hi)
+{
+  gcc_assert (decomposes_to_hilo_fn_p (ifn));
+
+  *lo = internal_fn (ifn + 1);
+  *hi = internal_fn (ifn + 2);
+}
+
+
 /* Fnspec of each internal function, indexed by function number.  */
 const_tree internal_fn_fnspec_array[IFN_LAST + 1];
 
@@ -137,7 +150,16 @@ const direct_internal_fn_info direct_internal_fn_array[IFN_LAST + 1] = {
 #define DEF_INTERNAL_OPTAB_FN(CODE, FLAGS, OPTAB, TYPE) TYPE##_direct,
 #define DEF_INTERNAL_SIGNED_OPTAB_FN(CODE, FLAGS, SELECTOR, SIGNED_OPTAB, \
                                     UNSIGNED_OPTAB, TYPE) TYPE##_direct,
+#undef DEF_INTERNAL_OPTAB_WIDENING_HILO_FN
+#undef DEF_INTERNAL_OPTAB_NARROWING_HILO_FN
+#define DEF_INTERNAL_OPTAB_WIDENING_HILO_FN(CODE, FLAGS, SELECTOR, SIGNED_OPTAB, \
+                                           UNSIGNED_OPTAB, TYPE)               \
+TYPE##_direct, TYPE##_direct, TYPE##_direct,
+#define DEF_INTERNAL_OPTAB_NARROWING_HILO_FN(CODE, FLAGS, OPTAB, TYPE) \
+TYPE##_direct, TYPE##_direct, TYPE##_direct,
 #include "internal-fn.def"
+#undef DEF_INTERNAL_OPTAB_WIDENING_HILO_FN
+#undef DEF_INTERNAL_OPTAB_NARROWING_HILO_FN
   not_direct
 };
 
@@ -3852,7 +3874,7 @@ multi_vector_optab_supported_p (convert_optab optab, tree_pair types,
 
 /* Return the optab used by internal function FN.  */
 
-static optab
+optab
 direct_internal_fn_optab (internal_fn fn, tree_pair types)
 {
   switch (fn)
@@ -3971,6 +3993,9 @@ commutative_binary_fn_p (internal_fn fn)
     case IFN_UBSAN_CHECK_MUL:
     case IFN_ADD_OVERFLOW:
     case IFN_MUL_OVERFLOW:
+    case IFN_VEC_WIDEN_PLUS_HILO:
+    case IFN_VEC_WIDEN_PLUS_LO:
+    case IFN_VEC_WIDEN_PLUS_HI:
       return true;
 
     default:
@@ -4044,6 +4069,88 @@ first_commutative_argument (internal_fn fn)
     }
 }
 
+/* Return true if this CODE describes an internal_fn that returns a vector with
+   elements twice as wide as the element size of the input vectors.  */
+
+bool
+widening_fn_p (code_helper code)
+{
+  if (!code.is_fn_code ())
+    return false;
+
+  if (!internal_fn_p ((combined_fn) code))
+    return false;
+
+  internal_fn fn = as_internal_fn ((combined_fn) code);
+  switch (fn)
+    {
+    #undef DEF_INTERNAL_OPTAB_WIDENING_HILO_FN
+    #define DEF_INTERNAL_OPTAB_WIDENING_HILO_FN(NAME, F, S, SO, UO, T) \
+    case IFN_##NAME##_HILO:\
+    case IFN_##NAME##_HI: \
+    case IFN_##NAME##_LO: \
+      return true;
+    #include "internal-fn.def"
+    #undef DEF_INTERNAL_OPTAB_WIDENING_HILO_FN
+
+    default:
+      return false;
+    }
+}
+
+/* Return true if this CODE describes an internal_fn that returns a vector with
+   elements twice as narrow as the element size of the input vectors.  */
+
+bool
+narrowing_fn_p (code_helper code)
+{
+  if (!code.is_fn_code ())
+    return false;
+
+  if (!internal_fn_p ((combined_fn) code))
+    return false;
+
+  internal_fn fn = as_internal_fn ((combined_fn) code);
+  switch (fn)
+    {
+    #undef DEF_INTERNAL_OPTAB_NARROWING_HILO_FN
+    #define DEF_INTERNAL_OPTAB_NARROWING_HILO_FN(NAME, F, O, T) \
+    case IFN_##NAME##_HILO:\
+    case IFN_##NAME##_HI: \
+    case IFN_##NAME##_LO: \
+      return true;
+    #include "internal-fn.def"
+    #undef DEF_INTERNAL_OPTAB_NARROWING_HILO_FN
+
+    default:
+      return false;
+    }
+}
+
+/* Return true if FN decomposes to _hi and _lo IFN.  */
+
+bool
+decomposes_to_hilo_fn_p (internal_fn fn)
+{
+  switch (fn)
+    {
+    #undef DEF_INTERNAL_OPTAB_WIDENING_HILO_FN
+    #define DEF_INTERNAL_OPTAB_WIDENING_HILO_FN(NAME, F, S, SO, UO, T) \
+    case IFN_##NAME##_HILO:\
+      return true;
+    #undef DEF_INTERNAL_OPTAB_NARROWING_HILO_FN
+    #define DEF_INTERNAL_OPTAB_NARROWING_HILO_FN(NAME, F, O, T) \
+    case IFN_##NAME##_HILO:\
+      return true;
+    #include "internal-fn.def"
+    #undef DEF_INTERNAL_OPTAB_WIDENING_HILO_FN
+    #undef DEF_INTERNAL_OPTAB_NARROWING_HILO_FN
+
+    default:
+      return false;
+    }
+}
+
 /* Return true if IFN_SET_EDOM is supported.  */
 
 bool
@@ -4071,7 +4178,33 @@ set_edom_supported_p (void)
     optab which_optab = direct_internal_fn_optab (fn, types);          \
     expand_##TYPE##_optab_fn (fn, stmt, which_optab);                  \
   }
+#define DEF_INTERNAL_OPTAB_WIDENING_HILO_FN(CODE, FLAGS, SELECTOR,         \
+                                           SIGNED_OPTAB, UNSIGNED_OPTAB,   \
+                                           TYPE)                           \
+  static void                                                              \
+  expand_##CODE##_HILO (internal_fn fn ATTRIBUTE_UNUSED,                   \
+                       gcall *stmt ATTRIBUTE_UNUSED)                       \
+  {                                                                        \
+    gcc_unreachable ();                                                    \
+  }                                                                        \
+  DEF_INTERNAL_SIGNED_OPTAB_FN(CODE##_HI, FLAGS, SELECTOR, SIGNED_OPTAB,    \
+                              UNSIGNED_OPTAB, TYPE)                        \
+  DEF_INTERNAL_SIGNED_OPTAB_FN(CODE##_LO, FLAGS, SELECTOR, SIGNED_OPTAB,    \
+                              UNSIGNED_OPTAB, TYPE)
+#define DEF_INTERNAL_OPTAB_NARROWING_HILO_FN(CODE, FLAGS, OPTAB, TYPE) \
+  static void                                                          \
+  expand_##CODE##_HILO (internal_fn fn ATTRIBUTE_UNUSED,               \
+                       gcall *stmt ATTRIBUTE_UNUSED)                   \
+  {                                                                    \
+    gcc_unreachable ();                                                \
+  }                                                                    \
+  DEF_INTERNAL_OPTAB_FN(CODE##_LO, FLAGS, OPTAB, TYPE)                 \
+  DEF_INTERNAL_OPTAB_FN(CODE##_HI, FLAGS, OPTAB, TYPE)
 #include "internal-fn.def"
+#undef DEF_INTERNAL_OPTAB_FN
+#undef DEF_INTERNAL_SIGNED_OPTAB_FN
+#undef DEF_INTERNAL_OPTAB_WIDENING_HILO_FN
+#undef DEF_INTERNAL_OPTAB_NARROWING_HILO_FN
 
 /* Routines to expand each internal function, indexed by function number.
    Each routine has the prototype:
@@ -4080,6 +4213,7 @@ set_edom_supported_p (void)
 
    where STMT is the statement that performs the call. */
 static void (*const internal_fn_expanders[]) (internal_fn, gcall *) = {
+
 #define DEF_INTERNAL_FN(CODE, FLAGS, FNSPEC) expand_##CODE,
 #include "internal-fn.def"
   0
diff --git a/gcc/internal-fn.def b/gcc/internal-fn.def
index 7fe742c2ae713e7152ab05cfdfba86e4e0aa3456..012dd323b86dd7cfcc5c13d3a2bb2a453937155d 100644
--- a/gcc/internal-fn.def
+++ b/gcc/internal-fn.def
@@ -85,6 +85,13 @@ along with GCC; see the file COPYING3.  If not see
    says that the function extends the C-level BUILT_IN_<NAME>{,L,LL,IMAX}
    group of functions to any integral mode (including vector modes).
 
+   DEF_INTERNAL_SIGNED_OPTAB_HILO_FN is like DEF_INTERNAL_OPTAB_FN except it
+   provides convenience wrappers for defining conversions that require a
+   hi/lo split, like widening and narrowing operations.  Each definition
+   for <NAME> will require an optab named <OPTAB> and two other optabs that
+   you specify for signed and unsigned.
+
+
    Each entry must have a corresponding expander of the form:
 
      void expand_NAME (gimple_call stmt)
@@ -123,6 +130,20 @@ along with GCC; see the file COPYING3.  If not see
   DEF_INTERNAL_OPTAB_FN (NAME, FLAGS, OPTAB, TYPE)
 #endif
 
+#ifndef DEF_INTERNAL_OPTAB_WIDENING_HILO_FN
+#define DEF_INTERNAL_OPTAB_WIDENING_HILO_FN(NAME, FLAGS, SELECTOR, SOPTAB, UOPTAB, TYPE) \
+  DEF_INTERNAL_FN (NAME##_HILO, FLAGS | ECF_LEAF, NULL) \
+  DEF_INTERNAL_SIGNED_OPTAB_FN (NAME ## _LO, FLAGS, SELECTOR, SOPTAB##_lo, UOPTAB##_lo, TYPE) \
+  DEF_INTERNAL_SIGNED_OPTAB_FN (NAME ## _HI, FLAGS, SELECTOR, SOPTAB##_hi, UOPTAB##_hi, TYPE)
+#endif
+
+#ifndef DEF_INTERNAL_OPTAB_NARROWING_HILO_FN
+#define DEF_INTERNAL_OPTAB_NARROWING_HILO_FN(NAME, FLAGS, OPTAB, TYPE) \
+  DEF_INTERNAL_FN (NAME##_HILO, FLAGS | ECF_LEAF, NULL) \
+  DEF_INTERNAL_OPTAB_FN (NAME ## _LO, FLAGS, OPTAB##_lo, TYPE) \
+  DEF_INTERNAL_OPTAB_FN (NAME ## _HI, FLAGS, OPTAB##_hi, TYPE)
+#endif
+
 DEF_INTERNAL_OPTAB_FN (MASK_LOAD, ECF_PURE, maskload, mask_load)
 DEF_INTERNAL_OPTAB_FN (LOAD_LANES, ECF_CONST, vec_load_lanes, load_lanes)
 DEF_INTERNAL_OPTAB_FN (MASK_LOAD_LANES, ECF_PURE,
@@ -315,6 +336,16 @@ DEF_INTERNAL_OPTAB_FN (COMPLEX_ADD_ROT270, ECF_CONST, cadd270, binary)
 DEF_INTERNAL_OPTAB_FN (COMPLEX_MUL, ECF_CONST, cmul, binary)
 DEF_INTERNAL_OPTAB_FN (COMPLEX_MUL_CONJ, ECF_CONST, cmul_conj, binary)
 DEF_INTERNAL_OPTAB_FN (VEC_ADDSUB, ECF_CONST, vec_addsub, binary)
+DEF_INTERNAL_OPTAB_WIDENING_HILO_FN (VEC_WIDEN_PLUS,
+                                    ECF_CONST | ECF_NOTHROW,
+                                    first,
+                                    vec_widen_sadd, vec_widen_uadd,
+                                    binary)
+DEF_INTERNAL_OPTAB_WIDENING_HILO_FN (VEC_WIDEN_MINUS,
+                                    ECF_CONST | ECF_NOTHROW,
+                                    first,
+                                    vec_widen_ssub, vec_widen_usub,
+                                    binary)
 DEF_INTERNAL_OPTAB_FN (VEC_FMADDSUB, ECF_CONST, vec_fmaddsub, ternary)
 DEF_INTERNAL_OPTAB_FN (VEC_FMSUBADD, ECF_CONST, vec_fmsubadd, ternary)
 
diff --git a/gcc/internal-fn.h b/gcc/internal-fn.h
index 08922ed4254898f5fffca3f33973e96ed9ce772f..8ba07d6d1338e75bc5a451d9e403112a608f3ea2 100644
--- a/gcc/internal-fn.h
+++ b/gcc/internal-fn.h
@@ -20,6 +20,10 @@ along with GCC; see the file COPYING3.  If not see
 #ifndef GCC_INTERNAL_FN_H
 #define GCC_INTERNAL_FN_H
 
+#include "insn-codes.h"
+#include "insn-opinit.h"
+
+
 /* INTEGER_CST values for IFN_UNIQUE function arg-0.
 
    UNSPEC: Undifferentiated UNIQUE.
@@ -112,6 +116,8 @@ internal_fn_name (enum internal_fn fn)
 }
 
 extern internal_fn lookup_internal_fn (const char *);
+extern void lookup_hilo_internal_fn (internal_fn, internal_fn *, internal_fn *);
+extern optab direct_internal_fn_optab (internal_fn, tree_pair);
 
 /* Return the ECF_* flags for function FN.  */
 
@@ -210,6 +216,9 @@ extern bool commutative_binary_fn_p (internal_fn);
 extern bool commutative_ternary_fn_p (internal_fn);
 extern int first_commutative_argument (internal_fn);
 extern bool associative_binary_fn_p (internal_fn);
+extern bool widening_fn_p (code_helper);
+extern bool narrowing_fn_p (code_helper);
+extern bool decomposes_to_hilo_fn_p (internal_fn);
 
 extern bool set_edom_supported_p (void);
 
diff --git a/gcc/optabs.cc b/gcc/optabs.cc
index c8e39c82d57a7d726e7da33d247b80f32ec9236c..5a08d91e550b2d92e9572211f811fdba99a33a38 100644
--- a/gcc/optabs.cc
+++ b/gcc/optabs.cc
@@ -1314,7 +1314,15 @@ commutative_optab_p (optab binoptab)
          || binoptab == smul_widen_optab
          || binoptab == umul_widen_optab
          || binoptab == smul_highpart_optab
-         || binoptab == umul_highpart_optab);
+         || binoptab == umul_highpart_optab
+         || binoptab == vec_widen_saddl_hi_optab
+         || binoptab == vec_widen_saddl_lo_optab
+         || binoptab == vec_widen_uaddl_hi_optab
+         || binoptab == vec_widen_uaddl_lo_optab
+         || binoptab == vec_widen_sadd_hi_optab
+         || binoptab == vec_widen_sadd_lo_optab
+         || binoptab == vec_widen_uadd_hi_optab
+         || binoptab == vec_widen_uadd_lo_optab);
 }
 
 /* X is to be used in mode MODE as operand OPN to BINOPTAB.  If we're
diff --git a/gcc/optabs.def b/gcc/optabs.def
index 695f5911b300c9ca5737de9be809fa01aabe5e01..16d121722c8c5723d9b164f5a2c616dc7ec143de 100644
--- a/gcc/optabs.def
+++ b/gcc/optabs.def
@@ -410,6 +410,10 @@ OPTAB_D (vec_widen_ssubl_hi_optab, "vec_widen_ssubl_hi_$a")
 OPTAB_D (vec_widen_ssubl_lo_optab, "vec_widen_ssubl_lo_$a")
 OPTAB_D (vec_widen_saddl_hi_optab, "vec_widen_saddl_hi_$a")
 OPTAB_D (vec_widen_saddl_lo_optab, "vec_widen_saddl_lo_$a")
+OPTAB_D (vec_widen_ssub_hi_optab, "vec_widen_ssub_hi_$a")
+OPTAB_D (vec_widen_ssub_lo_optab, "vec_widen_ssub_lo_$a")
+OPTAB_D (vec_widen_sadd_hi_optab, "vec_widen_sadd_hi_$a")
+OPTAB_D (vec_widen_sadd_lo_optab, "vec_widen_sadd_lo_$a")
 OPTAB_D (vec_widen_sshiftl_hi_optab, "vec_widen_sshiftl_hi_$a")
 OPTAB_D (vec_widen_sshiftl_lo_optab, "vec_widen_sshiftl_lo_$a")
 OPTAB_D (vec_widen_umult_even_optab, "vec_widen_umult_even_$a")
@@ -422,6 +426,10 @@ OPTAB_D (vec_widen_usubl_hi_optab, "vec_widen_usubl_hi_$a")
 OPTAB_D (vec_widen_usubl_lo_optab, "vec_widen_usubl_lo_$a")
 OPTAB_D (vec_widen_uaddl_hi_optab, "vec_widen_uaddl_hi_$a")
 OPTAB_D (vec_widen_uaddl_lo_optab, "vec_widen_uaddl_lo_$a")
+OPTAB_D (vec_widen_usub_hi_optab, "vec_widen_usub_hi_$a")
+OPTAB_D (vec_widen_usub_lo_optab, "vec_widen_usub_lo_$a")
+OPTAB_D (vec_widen_uadd_hi_optab, "vec_widen_uadd_hi_$a")
+OPTAB_D (vec_widen_uadd_lo_optab, "vec_widen_uadd_lo_$a")
 OPTAB_D (vec_addsub_optab, "vec_addsub$a3")
 OPTAB_D (vec_fmaddsub_optab, "vec_fmaddsub$a4")
 OPTAB_D (vec_fmsubadd_optab, "vec_fmsubadd$a4")
diff --git a/gcc/testsuite/gcc.target/aarch64/vect-widen-add.c b/gcc/testsuite/gcc.target/aarch64/vect-widen-add.c
index 220bd9352a4c7acd2e3713e441d74898d3e92b30..7037673d32bd780e1c9b58a51e58e2bac3b30b7e 100644
--- a/gcc/testsuite/gcc.target/aarch64/vect-widen-add.c
+++ b/gcc/testsuite/gcc.target/aarch64/vect-widen-add.c
@@ -1,5 +1,5 @@
 /* { dg-do run } */
-/* { dg-options "-O3 -save-temps" } */
+/* { dg-options "-O3 -save-temps -fdump-tree-vect-all" } */
 #include <stdint.h>
 #include <string.h>
 
@@ -86,6 +86,8 @@ main()
     return 0;
 }
 
+/* { dg-final { scan-tree-dump "add new stmt.*VEC_WIDEN_PLUS_LO" "vect"   } } */
+/* { dg-final { scan-tree-dump "add new stmt.*VEC_WIDEN_PLUS_HI" "vect"   } } */
 /* { dg-final { scan-assembler-times {\tuaddl\t} 1} } */
 /* { dg-final { scan-assembler-times {\tuaddl2\t} 1} } */
 /* { dg-final { scan-assembler-times {\tsaddl\t} 1} } */
diff --git a/gcc/testsuite/gcc.target/aarch64/vect-widen-sub.c b/gcc/testsuite/gcc.target/aarch64/vect-widen-sub.c
index a2bed63affbd091977df95a126da1f5b8c1d41d2..83bc1edb6105f47114b665e24a13e6194b2179a2 100644
--- a/gcc/testsuite/gcc.target/aarch64/vect-widen-sub.c
+++ b/gcc/testsuite/gcc.target/aarch64/vect-widen-sub.c
@@ -1,5 +1,5 @@
 /* { dg-do run } */
-/* { dg-options "-O3 -save-temps" } */
+/* { dg-options "-O3 -save-temps -fdump-tree-vect-all" } */
 #include <stdint.h>
 #include <string.h>
 
@@ -86,6 +86,8 @@ main()
     return 0;
 }
 
+/* { dg-final { scan-tree-dump "add new stmt.*VEC_WIDEN_MINUS_LO" "vect"   } } */
+/* { dg-final { scan-tree-dump "add new stmt.*VEC_WIDEN_MINUS_HI" "vect"   } } */
 /* { dg-final { scan-assembler-times {\tusubl\t} 1} } */
 /* { dg-final { scan-assembler-times {\tusubl2\t} 1} } */
 /* { dg-final { scan-assembler-times {\tssubl\t} 1} } */
diff --git a/gcc/tree-cfg.cc b/gcc/tree-cfg.cc
index 0aeebb67fac864db284985f4a6f0653af281d62b..28464ad9e3a7ea25557ffebcdbdbc1340f9e0d8b 100644
--- a/gcc/tree-cfg.cc
+++ b/gcc/tree-cfg.cc
@@ -65,6 +65,7 @@ along with GCC; see the file COPYING3.  If not see
 #include "asan.h"
 #include "profile.h"
 #include "sreal.h"
+#include "internal-fn.h"
 
 /* This file contains functions for building the Control Flow Graph (CFG)
    for a function tree.  */
@@ -3411,6 +3412,52 @@ verify_gimple_call (gcall *stmt)
          debug_generic_stmt (fn);
          return true;
        }
+      internal_fn ifn = gimple_call_internal_fn (stmt);
+      if (ifn == IFN_LAST)
+       {
+         error ("gimple call has an invalid IFN");
+         debug_generic_stmt (fn);
+         return true;
+       }
+      else if (decomposes_to_hilo_fn_p (ifn))
+       {
+         /* Non decomposed HILO stmts should not appear in IL, these are
+            merely used as an internal representation to the auto-vectorizer
+            pass and should have been expanded to their _LO _HI variants.  */
+         error ("gimple call has a non-decomposed HILO IFN");
+         debug_generic_stmt (fn);
+         return true;
+       }
+      else if (ifn == IFN_VEC_WIDEN_PLUS_LO
+              || ifn == IFN_VEC_WIDEN_PLUS_HI
+              || ifn == IFN_VEC_WIDEN_MINUS_LO
+              || ifn == IFN_VEC_WIDEN_MINUS_HI)
+       {
+         tree rhs1_type = TREE_TYPE (gimple_call_arg (stmt, 0));
+         tree rhs2_type = TREE_TYPE (gimple_call_arg (stmt, 1));
+         tree lhs_type = TREE_TYPE (gimple_get_lhs (stmt));
+         if (TREE_CODE (lhs_type) == VECTOR_TYPE)
+           {
+             if (TREE_CODE (rhs1_type) != VECTOR_TYPE
+                 || TREE_CODE (rhs2_type) != VECTOR_TYPE)
+               {
+                 error ("invalid non-vector operands in vector IFN call");
+                 debug_generic_stmt (fn);
+                 return true;
+               }
+             lhs_type = TREE_TYPE (lhs_type);
+             rhs1_type = TREE_TYPE (rhs1_type);
+             rhs2_type = TREE_TYPE (rhs2_type);
+           }
+         if (POINTER_TYPE_P (lhs_type)
+             || POINTER_TYPE_P (rhs1_type)
+             || POINTER_TYPE_P (rhs2_type))
+           {
+             error ("invalid (pointer) operands in vector IFN call");
+             debug_generic_stmt (fn);
+             return true;
+           }
+       }
     }
   else
     {
diff --git a/gcc/tree-inline.cc b/gcc/tree-inline.cc
index 63a19f8d1d89c6bd5d8e55a299cbffaa324b4b84..d74d8db2173b1ab117250fea89de5212d5e354ec 100644
--- a/gcc/tree-inline.cc
+++ b/gcc/tree-inline.cc
@@ -4433,7 +4433,20 @@ estimate_num_insns (gimple *stmt, eni_weights *weights)
        tree decl;
 
        if (gimple_call_internal_p (stmt))
-         return 0;
+         {
+           internal_fn fn = gimple_call_internal_fn (stmt);
+           switch (fn)
+             {
+             case IFN_VEC_WIDEN_PLUS_HI:
+             case IFN_VEC_WIDEN_PLUS_LO:
+             case IFN_VEC_WIDEN_MINUS_HI:
+             case IFN_VEC_WIDEN_MINUS_LO:
+               return 1;
+
+             default:
+               return 0;
+             }
+         }
        else if ((decl = gimple_call_fndecl (stmt))
                 && fndecl_built_in_p (decl))
          {
diff --git a/gcc/tree-vect-patterns.cc b/gcc/tree-vect-patterns.cc
index 1778af0242898e3dc73d94d22a5b8505628a53b5..93cebc72beb4f65249a69b2665dfeb8a0991c1d1 100644
--- a/gcc/tree-vect-patterns.cc
+++ b/gcc/tree-vect-patterns.cc
@@ -562,21 +562,30 @@ vect_joust_widened_type (tree type, tree new_type, tree *common_type)
 
 static unsigned int
 vect_widened_op_tree (vec_info *vinfo, stmt_vec_info stmt_info, tree_code code,
-                     tree_code widened_code, bool shift_p,
+                     code_helper widened_code, bool shift_p,
                      unsigned int max_nops,
                      vect_unpromoted_value *unprom, tree *common_type,
                      enum optab_subtype *subtype = NULL)
 {
   /* Check for an integer operation with the right code.  */
-  gassign *assign = dyn_cast <gassign *> (stmt_info->stmt);
-  if (!assign)
+  gimple* stmt = stmt_info->stmt;
+  if (!(is_gimple_assign (stmt) || is_gimple_call (stmt)))
+    return 0;
+
+  code_helper rhs_code;
+  if (is_gimple_assign (stmt))
+    rhs_code = gimple_assign_rhs_code (stmt);
+  else if (is_gimple_call (stmt))
+    rhs_code = gimple_call_combined_fn (stmt);
+  else
     return 0;
 
-  tree_code rhs_code = gimple_assign_rhs_code (assign);
-  if (rhs_code != code && rhs_code != widened_code)
+  if (rhs_code != code
+      && rhs_code != widened_code)
     return 0;
 
-  tree type = TREE_TYPE (gimple_assign_lhs (assign));
+  tree lhs = gimple_get_lhs (stmt);
+  tree type = TREE_TYPE (lhs);
   if (!INTEGRAL_TYPE_P (type))
     return 0;
 
@@ -589,7 +598,7 @@ vect_widened_op_tree (vec_info *vinfo, stmt_vec_info stmt_info, tree_code code,
     {
       vect_unpromoted_value *this_unprom = &unprom[next_op];
       unsigned int nops = 1;
-      tree op = gimple_op (assign, i + 1);
+      tree op = gimple_arg (stmt, i);
       if (i == 1 && TREE_CODE (op) == INTEGER_CST)
        {
          /* We already have a common type from earlier operands.
@@ -1343,7 +1352,8 @@ vect_recog_sad_pattern (vec_info *vinfo,
   /* FORNOW.  Can continue analyzing the def-use chain when this stmt in a phi
      inside the loop (in case we are analyzing an outer-loop).  */
   vect_unpromoted_value unprom[2];
-  if (!vect_widened_op_tree (vinfo, diff_stmt_vinfo, MINUS_EXPR, WIDEN_MINUS_EXPR,
+  if (!vect_widened_op_tree (vinfo, diff_stmt_vinfo, MINUS_EXPR,
+                            IFN_VEC_WIDEN_MINUS_HILO,
                             false, 2, unprom, &half_type))
     return NULL;
 
@@ -1395,14 +1405,16 @@ static gimple *
 vect_recog_widen_op_pattern (vec_info *vinfo,
                             stmt_vec_info last_stmt_info, tree *type_out,
                             tree_code orig_code, code_helper wide_code,
-                            bool shift_p, const char *name)
+                            bool shift_p, const char *name,
+                            optab_subtype *subtype = NULL)
 {
   gimple *last_stmt = last_stmt_info->stmt;
 
   vect_unpromoted_value unprom[2];
   tree half_type;
   if (!vect_widened_op_tree (vinfo, last_stmt_info, orig_code, orig_code,
-                            shift_p, 2, unprom, &half_type))
+                            shift_p, 2, unprom, &half_type, subtype))
+
     return NULL;
 
   /* Pattern detected.  */
@@ -1468,6 +1480,20 @@ vect_recog_widen_op_pattern (vec_info *vinfo,
                              type, pattern_stmt, vecctype);
 }
 
+static gimple *
+vect_recog_widen_op_pattern (vec_info *vinfo,
+                            stmt_vec_info last_stmt_info, tree *type_out,
+                            tree_code orig_code, internal_fn wide_ifn,
+                            bool shift_p, const char *name,
+                            optab_subtype *subtype = NULL)
+{
+  combined_fn ifn = as_combined_fn (wide_ifn);
+  return vect_recog_widen_op_pattern (vinfo, last_stmt_info, type_out,
+                                     orig_code, ifn, shift_p, name,
+                                     subtype);
+}
+
+
 /* Try to detect multiplication on widened inputs, converting MULT_EXPR
    to WIDEN_MULT_EXPR.  See vect_recog_widen_op_pattern for details.  */
 
@@ -1481,26 +1507,30 @@ vect_recog_widen_mult_pattern (vec_info *vinfo, stmt_vec_info last_stmt_info,
 }
 
 /* Try to detect addition on widened inputs, converting PLUS_EXPR
-   to WIDEN_PLUS_EXPR.  See vect_recog_widen_op_pattern for details.  */
+   to IFN_VEC_WIDEN_PLUS_HILO.  See vect_recog_widen_op_pattern for details.  */
 
 static gimple *
 vect_recog_widen_plus_pattern (vec_info *vinfo, stmt_vec_info last_stmt_info,
                               tree *type_out)
 {
+  optab_subtype subtype;
   return vect_recog_widen_op_pattern (vinfo, last_stmt_info, type_out,
-                                     PLUS_EXPR, WIDEN_PLUS_EXPR, false,
-                                     "vect_recog_widen_plus_pattern");
+                                     PLUS_EXPR, IFN_VEC_WIDEN_PLUS_HILO,
+                                     false, "vect_recog_widen_plus_pattern",
+                                     &subtype);
 }
 
 /* Try to detect subtraction on widened inputs, converting MINUS_EXPR
-   to WIDEN_MINUS_EXPR.  See vect_recog_widen_op_pattern for details.  */
+   to IFN_VEC_WIDEN_MINUS_HILO.  See vect_recog_widen_op_pattern for details.  */
 static gimple *
 vect_recog_widen_minus_pattern (vec_info *vinfo, stmt_vec_info last_stmt_info,
                               tree *type_out)
 {
+  optab_subtype subtype;
   return vect_recog_widen_op_pattern (vinfo, last_stmt_info, type_out,
-                                     MINUS_EXPR, WIDEN_MINUS_EXPR, false,
-                                     "vect_recog_widen_minus_pattern");
+                                     MINUS_EXPR, IFN_VEC_WIDEN_MINUS_HILO,
+                                     false, "vect_recog_widen_minus_pattern",
+                                     &subtype);
 }
 
 /* Function vect_recog_ctz_ffs_pattern
@@ -3078,7 +3108,7 @@ vect_recog_average_pattern (vec_info *vinfo,
   vect_unpromoted_value unprom[3];
   tree new_type;
   unsigned int nops = vect_widened_op_tree (vinfo, plus_stmt_info, PLUS_EXPR,
-                                           WIDEN_PLUS_EXPR, false, 3,
+                                           IFN_VEC_WIDEN_PLUS_HILO, false, 3,
                                            unprom, &new_type);
   if (nops == 0)
     return NULL;
@@ -6469,6 +6499,7 @@ static vect_recog_func vect_vect_recog_func_ptrs[] = {
   { vect_recog_mask_conversion_pattern, "mask_conversion" },
   { vect_recog_widen_plus_pattern, "widen_plus" },
   { vect_recog_widen_minus_pattern, "widen_minus" },
+  /* These must come after the double widening ones.  */
 };
 
 const unsigned int NUM_PATTERNS = ARRAY_SIZE (vect_vect_recog_func_ptrs);
diff --git a/gcc/tree-vect-stmts.cc b/gcc/tree-vect-stmts.cc
index d152ae9ab10b361b88c0f839d6951c43b954750a..24c811ebe01fb8b003100dea494cf64fea72a975 100644
--- a/gcc/tree-vect-stmts.cc
+++ b/gcc/tree-vect-stmts.cc
@@ -5038,7 +5038,9 @@ vectorizable_conversion (vec_info *vinfo,
   bool widen_arith = (code == WIDEN_PLUS_EXPR
                 || code == WIDEN_MINUS_EXPR
                 || code == WIDEN_MULT_EXPR
-                || code == WIDEN_LSHIFT_EXPR);
+                || code == WIDEN_LSHIFT_EXPR
+                || code == IFN_VEC_WIDEN_PLUS_HILO
+                || code == IFN_VEC_WIDEN_MINUS_HILO);
 
   if (!widen_arith
       && !CONVERT_EXPR_CODE_P (code)
@@ -5088,7 +5090,9 @@ vectorizable_conversion (vec_info *vinfo,
       gcc_assert (code == WIDEN_MULT_EXPR
                  || code == WIDEN_LSHIFT_EXPR
                  || code == WIDEN_PLUS_EXPR
-                 || code == WIDEN_MINUS_EXPR);
+                 || code == WIDEN_MINUS_EXPR
+                 || code == IFN_VEC_WIDEN_PLUS_HILO
+                 || code == IFN_VEC_WIDEN_MINUS_HILO);
 
 
       op1 = is_gimple_assign (stmt) ? gimple_assign_rhs2 (stmt) :
@@ -12478,10 +12482,43 @@ supportable_widening_operation (vec_info *vinfo,
       optab1 = vec_unpacks_sbool_lo_optab;
       optab2 = vec_unpacks_sbool_hi_optab;
     }
-  else
+
+  if (code.is_fn_code ())
+    {
+      internal_fn ifn = as_internal_fn ((combined_fn) code);
+      gcc_assert (decomposes_to_hilo_fn_p (ifn));
+
+      internal_fn lo, hi;
+      lookup_hilo_internal_fn (ifn, &lo, &hi);
+      *code1 = as_combined_fn (lo);
+      *code2 = as_combined_fn (hi);
+      optab1 = direct_internal_fn_optab (lo, {vectype, vectype});
+      optab2 = direct_internal_fn_optab (hi, {vectype, vectype});
+    }
+  else if (code.is_tree_code ())
     {
-      optab1 = optab_for_tree_code (c1, vectype, optab_default);
-      optab2 = optab_for_tree_code (c2, vectype, optab_default);
+      if (code == FIX_TRUNC_EXPR)
+       {
+         /* The signedness is determined from output operand.  */
+         optab1 = optab_for_tree_code (c1, vectype_out, optab_default);
+         optab2 = optab_for_tree_code (c2, vectype_out, optab_default);
+       }
+      else if (CONVERT_EXPR_CODE_P ((tree_code) code.safe_as_tree_code ())
+              && VECTOR_BOOLEAN_TYPE_P (wide_vectype)
+              && VECTOR_BOOLEAN_TYPE_P (vectype)
+              && TYPE_MODE (wide_vectype) == TYPE_MODE (vectype)
+              && SCALAR_INT_MODE_P (TYPE_MODE (vectype)))
+       {
+         /* If the input and result modes are the same, a different optab
+            is needed where we pass in the number of units in vectype.  */
+         optab1 = vec_unpacks_sbool_lo_optab;
+         optab2 = vec_unpacks_sbool_hi_optab;
+       }
+      else
+       {
+         optab1 = optab_for_tree_code (c1, vectype, optab_default);
+         optab2 = optab_for_tree_code (c2, vectype, optab_default);
+       }
     }
 
   if (!optab1 || !optab2)
diff --git a/gcc/tree.def b/gcc/tree.def
index 90ceeec0b512bfa5f983359c0af03cc71de32007..b37b0b35927b92a6536e5c2d9805ffce8319a240 100644
--- a/gcc/tree.def
+++ b/gcc/tree.def
@@ -1374,15 +1374,16 @@ DEFTREECODE (DOT_PROD_EXPR, "dot_prod_expr", tcc_expression, 3)
 DEFTREECODE (WIDEN_SUM_EXPR, "widen_sum_expr", tcc_binary, 2)
 
 /* Widening sad (sum of absolute differences).
-   The first two arguments are of type t1 which should be integer.
-   The third argument and the result are of type t2, such that t2 is at least
-   twice the size of t1.  Like DOT_PROD_EXPR, SAD_EXPR (arg1,arg2,arg3) is
+   The first two arguments are of type t1 which should be a vector of integers.
+   The third argument and the result are of type t2, such that the size of
+   the elements of t2 is at least twice the size of the elements of t1.
+   Like DOT_PROD_EXPR, SAD_EXPR (arg1,arg2,arg3) is
    equivalent to:
-       tmp = WIDEN_MINUS_EXPR (arg1, arg2)
+       tmp = IFN_VEC_WIDEN_MINUS_EXPR (arg1, arg2)
        tmp2 = ABS_EXPR (tmp)
        arg3 = PLUS_EXPR (tmp2, arg3)
   or:
-       tmp = WIDEN_MINUS_EXPR (arg1, arg2)
+       tmp = IFN_VEC_WIDEN_MINUS_EXPR (arg1, arg2)
        tmp2 = ABS_EXPR (tmp)
        arg3 = WIDEN_SUM_EXPR (tmp2, arg3)
  */
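
For readers following the hi/lo decomposition, here is a minimal scalar sketch of what the cover letter describes: a widening add over n narrow elements, where the 'lo' half consumes the first n/2 inputs and the 'hi' half consumes the second n/2. The function names widen_plus_lo and widen_plus_hi are hypothetical and only model the semantics. They are not GCC functions, and which half a target treats as 'lo' or 'hi' is not modelled here.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical scalar model of the 'lo' half of a widening add:
   widen and add the first n/2 elements of a and b.  */
static void
widen_plus_lo (const uint8_t *a, const uint8_t *b, uint16_t *out, size_t n)
{
  assert (n % 2 == 0);
  for (size_t i = 0; i < n / 2; i++)
    out[i] = (uint16_t) a[i] + (uint16_t) b[i];
}

/* Hypothetical scalar model of the 'hi' half of a widening add:
   widen and add the second n/2 elements of a and b.  */
static void
widen_plus_hi (const uint8_t *a, const uint8_t *b, uint16_t *out, size_t n)
{
  assert (n % 2 == 0);
  for (size_t i = 0; i < n / 2; i++)
    out[i] = (uint16_t) a[n / 2 + i] + (uint16_t) b[n / 2 + i];
}
```

Each call yields n/2 wide results, so the pair together covers all n inputs without overflow in the narrow type. That is why a single undecomposed HILO function has to be expanded into two statements before it reaches the IL.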
