RE: [PATCH 1/5]middle-end: Add scaffolding to support narrowing IFNs

Richard Biener Fri, 22 Aug 2025 00:21:26 -0700

On Thu, 21 Aug 2025, Tamar Christina wrote:

> > -----Original Message-----
> > From: Richard Biener <rguent...@suse.de>
> > Sent: Thursday, August 21, 2025 1:17 PM
> > To: Tamar Christina <tamar.christ...@arm.com>
> > Cc: gcc-patches@gcc.gnu.org; rdsandif...@googlemail.com; nd <n...@arm.com>
> > Subject: RE: [PATCH 1/5]middle-end: Add scaffolding to support narrowing IFNs
> > 
> > On Thu, 21 Aug 2025, Tamar Christina wrote:
> > 
> > > > -----Original Message-----
> > > > From: Richard Biener <rguent...@suse.de>
> > > > Sent: Thursday, August 21, 2025 11:51 AM
> > > > To: Tamar Christina <tamar.christ...@arm.com>
> > > > Cc: gcc-patches@gcc.gnu.org; rdsandif...@googlemail.com; nd <n...@arm.com>
> > > > Subject: RE: [PATCH 1/5]middle-end: Add scaffolding to support narrowing IFNs
> > > >
> > > > On Wed, 20 Aug 2025, Tamar Christina wrote:
> > > >
> > > > > > -----Original Message-----
> > > > > > From: Richard Biener <rguent...@suse.de>
> > > > > > Sent: Wednesday, August 20, 2025 1:48 PM
> > > > > > To: Tamar Christina <tamar.christ...@arm.com>
> > > > > > Cc: gcc-patches@gcc.gnu.org; rdsandif...@googlemail.com; nd <n...@arm.com>
> > > > > > Subject: Re: [PATCH 1/5]middle-end: Add scaffolding to support narrowing IFNs
> > > > > >
> > > > > > On Tue, 19 Aug 2025, Tamar Christina wrote:
> > > > > >
> > > > > > > > This adds scaffolding for supporting narrowing IFNs inside the
> > > > > > > > vectorizer in a similar way as how widening is supported.  However,
> > > > > > > > because narrowing operations always have the same number of elements
> > > > > > > > in the output as in the input, we need to be able to combine the
> > > > > > > > results.  One way this could have been done is by using a
> > > > > > > > vec_perm_expr, but this can then become tricky to recognize as low/hi
> > > > > > > > pairs in backends.
> > > > > > >
> > > > > > > > As such I've chosen the design where the _hi and _odd variants of the
> > > > > > > > instructions must always be RMW.  This simplifies the implementation,
> > > > > > > > and targets that don't want this can use the direct conversion variant.
> > > > > >
> > > > > > > the canonical way for "narrowing" would be to have a
> > > > > >
> > > > > > vec_pack_saddh_optab
> > > > > >
> > > > > > > that takes two input vectors for each operand (we currently have such
> > > > > > > for conversions, aka the single operand case).  There's no hi/lo
> > > > > > > involved, that's only for widening as we can't have two outputs.
> > > > > >
> > > > > > > So - no, we don't want this new odd way of doing it.  Either only go
> > > > > > > with vec_saddh_narrow, aka the result mode is of half size, if that
> > > > > > > suits you, or please add the first "pack" variant of a binary operation.
> > > > > >
> > > > > > "pack" would imply narrow here.  Alternatively vec_pack_narrow_saddh
> > > > > > and vec_narrow_saddh as the two variants.
> > > > > >
> > > > > > Note that for composition I'd use a CTOR.  Note that in your scheme
> > > > > > the even/odd variant would interleave one result into the other?
> > > > > > Would the binary optab then fill only every 2nd output lane?  The
> > > > > > documentation in 2/n isn't exactly clear here.
> > > > >
> > > > > > We've discussed a lot of this on IRC and I believe a lot of this is a
> > > > > > misunderstanding related to how the current optabs are documented.  I
> > > > > > have, like our current documentation, put the detailed documentation
> > > > > > with the IFN rather than the optab.
> > > > > >
> > > > > > This was following the convention seemingly established by the widening
> > > > > > variants.
> > > > > But to come back to what we ended up with on IRC.
> > > > >
> > > > > I proposed
> > > > >
> > > > > > FOO -> {V4SI, V4SI} -> {V4HI} and FOO_LO -> {V4SI, V4SI} -> V4HI,
> > > > > > FOO_MERGE_HI -> {V4SI, V4SI, V4HI} -> V8HI
> > > > >
> > > > > And you proposed back
> > > > >
> > > > > FOO_LO -> {V4SI, V4SI} -> V8HI as well
> > > > > FOO_EVEN -> {V4SI, V4SI} -> V8HI
> > > > >
> > > > > > Because the x86 variants of these instructions return registers of the
> > > > > > same size as the inputs.  However, the documentation of these
> > > > > > instructions [1] states that for the AVX variants the upper half is
> > > > > > zeroed, and for SSE the upper half is undefined.
> > > > >
> > > > > > This to me seems like it means that x86 does not have _HI/_LO variants
> > > > > > of these instructions and that we're making a change to accommodate an
> > > > > > ISA that can't support these instructions.  I believe x86 only has FOO.
> > > > >
> > > > > > And for FOO I don't think we should return V8HI because on SSE the top
> > > > > > bits are undefined.  You proposed I introduce an explicit zeroing first,
> > > > > > but SSE and AVX are not consistent here.  I believe this should return
> > > > > > V4HI because those are the only bits whose contents *all* ISAs specify,
> > > > > > and it won't have an issue with endianness.
> > > > >
> > > > > > We discussed EVEN and ODD as well.  I think again there is a fundamental
> > > > > > issue with EVEN/ODD, one that wasn't encountered because, well, nothing
> > > > > > uses the code.  EVEN/ODD, unlike HI/LO, cannot be detected as a pattern,
> > > > > > because the lanes to permute around just may not be there *unless* you
> > > > > > unroll.  And patterns can't force an unroll since unroll factors are
> > > > > > determined after all pattern matching.
> > > > >
> > > > > > That means they can only be detected later on.  This means EVEN/ODD
> > > > > > detection fundamentally cannot rely on, or relate to, the generic FOO.
> > > > > > This patch does not ascribe any definition or implementation to EVEN/ODD
> > > > > > aside from requiring that ODD must be RMW.
> > > > >
> > > > > > I don't think this can be described any differently, because widening
> > > > > > *reads* and narrowing *writes*.  So FOO_EVEN -> {V4SI, V4SI} -> V8HI
> > > > > > is already true in this patch because, again, it's not used, so the
> > > > > > modes can be anything.  I only added it because for some reason EVEN/ODD
> > > > > > was required before even though there isn't an implementation for it.
> > > > > > I didn't add it, I'm just following that convention.
> > > > >
> > > > > > So again I propose
> > > > > >
> > > > > > FOO -> {V4SI, V4SI} -> {V4HI} and FOO_LO -> {V4SI, V4SI} -> V4HI,
> > > > > > FOO_MERGE_HI -> {V4SI, V4SI, V4HI} -> V8HI
> > > > >
> > > > > > Because FOO_LO -> {V4SI, V4SI} -> V8HI is not a native operation for
> > > > > > AArch64, x86 maps naturally to FOO, and it doesn't have endianness
> > > > > > issues, I don't see a good reason to complicate the implementation for
> > > > > > the target that can actually support it.
> > > >
> > > > You said that on aarch64 foo_lo zeroes the high part of V8HI.  We also
> > > > raised the issue of endianness, where for big-endian the meaning of
> > > > _hi and _lo swap.
> > >
> > > I think this is strictly only true if we return V8HI.  If we return V4HI
> > > for _lo the behavior doesn't change because the register doesn't say which
> > > part of the full one it is.  Similarly, this is the same for _hi if the RMW
> > > input is V4HI.  All we're saying is merge them together without needing to
> > > worry about which part is what.
> > 
> > But we _do_ need to worry!  We need the V8HI output in the same lane
> > order as the V4SI input pair.
> > 
> > I was originally suggesting to use FOO_pack with
> > { {V4SI,V4SI}, {V4SI,V4SI} } -> V8HI exactly because that leaves these
> > details to the backend.
> > 
> > You have FOO doing { V4SI, V4SI } -> V4HI - if you add "fake"
> > V8SI you can use packing there as well.
> > 
> > > This is why I was saying that V4HI avoids endianness issues altogether,
> > > because strictly speaking, in GIMPLE we don't need to know.
> > 
> > Once we split the operation we have to.  Unless you introduce new
> > nomenclature like
> 
> Ok, I think I see the discrepancy.  When I originally made the scaffolding
> I always envisioned them being used in pairs.  So you can't use _lo/_hi
> individually from the vectorizer.  As such the intermediate result before
> the merge operation of the _hi is never usable.
> 
> But I can see your concern if you take the view that you can use them
> independently. Note that the use of _LO in early break doesn't actually
> care about endianness.  It's just after bits.
> 
> > 
> >  foo_narrow_first, foo_narrow_merge_second
> > 
> > aka, abstract away lo vs hi as "first" and "second".
> > 
> > But existing use says we have _lo and _hi which, as existing use shows,
> > _are_ endianness dependent (even though I don't like that very much).
> > 
> > > > So for big-endian the _lo would be the merging
> > > > operation.  So I was proposing to have FOO_MERGE_LO and FOO_MERGE_HI,
> > > > necessarily both {V4SI, V4SI} -> V8HI then, and define_insns that
> > > > would only allow all-zero to-merge-into for aarch64 merge_lo.
> > >
> > > The LO variant wouldn't be easily implementable though.  It means that for
> > > AArch64 we need a paradoxical subreg here, or a vec_merge with zero, which
> > > would then complicate the _hi pattern, as in the non-expansion pattern.
> > 
> > So again, why then bother at all with _hi/lo and _even/odd when all
> > you care about is a separate "_lo" (but with small mode), aka FOO,
> > and the combined operation, aka FOO_PACK?
> > 
> 
> Honestly, it'll sound stupid, because I thought that's what you would like 😊
> But more practically I did so because I had trouble with expand (nothing I
> can't solve, but hitting the roadblock made me think it wasn't the right
> approach) and because with 4 operands you don't get any associativity.  I
> don't think that matters much, but it just *fit* more naturally in the
> existing framework.


You mean commutativity?  I guess it's still there, just not easily
accessible.

> > When _LO produces V4HI, how do you make it an input to _MERGE_HI?
> > Pass in the V4HI?  Or "extend" it to V8HI?  IMO the optab interface
> > should be in a way that the naming is consistent and extends to
> > other possible narrowing operations (are there others?).
> 
> So I'm fine with having FOO_MERGE take {{V4SI, V4SI},{V4SI, V4SI}} -> V8HI

FOO_PACK

> and FOO to {V4SI, V4SI} -> V4HI.
> 
> Would that align us?

Apart from the name for the former, yes.
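
To make the agreed shapes concrete, here is a minimal sketch in plain C++
(not GCC internals), assuming FOO is an add-and-keep-the-high-half operation
in the spirit of ADDHN.  The type aliases V4SI, V4HI and V8HI and the
functions foo, foo_pack and check_lane_order are hypothetical stand-ins for
the modes and optabs discussed above, not existing GCC names.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical lane containers standing in for the V4SI, V4HI and V8HI modes.
using V4SI = std::array<std::uint32_t, 4>;
using V4HI = std::array<std::uint16_t, 4>;
using V8HI = std::array<std::uint16_t, 8>;

// FOO: {V4SI, V4SI} -> V4HI.  Add lane-wise (wrapping modulo 2^32) and keep
// the high 16 bits of each 32-bit sum.
inline V4HI foo (const V4SI &a, const V4SI &b)
{
  V4HI r{};
  for (std::size_t i = 0; i < 4; ++i)
    r[i] = static_cast<std::uint16_t> ((a[i] + b[i]) >> 16);
  return r;
}

// FOO_PACK: {{V4SI, V4SI}, {V4SI, V4SI}} -> V8HI.  Each operand is a pair of
// vectors; result lanes 0-3 come from the first vector of each pair and
// lanes 4-7 from the second, so the output follows the input lane order.
inline V8HI foo_pack (const V4SI &a0, const V4SI &a1,
                      const V4SI &b0, const V4SI &b1)
{
  const V4HI first = foo (a0, b0);
  const V4HI second = foo (a1, b1);
  V8HI r{};
  for (std::size_t i = 0; i < 4; ++i)
    {
      r[i] = first[i];
      r[i + 4] = second[i];
    }
  return r;
}

// Small self-check of the lane order described above.
inline void check_lane_order ()
{
  V4SI a0{0x00010000u, 0, 0, 0}, a1{0x00050000u, 0, 0, 0}, zero{};
  V8HI r = foo_pack (a0, a1, zero, zero);
  assert (r[0] == 1 && r[4] == 5);
}
```

In this model there is no lo/hi naming at all: the result lanes simply follow
the order of the input pair, and where each half lives in a register is left
to the backend.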

Richard.

> Thanks,
> Tamar
> > 
> > > >
> > > > Note I did not check at all whether x86 actually has instructions doing
> > > > addh - you apparently did, which one is it?
> > >
> > > It's https://www.felixcloutier.com/x86/addpd note the behavior difference
> > > between SSE and AVX.
> > 
> > That's just packed add, not add high.
> > Yes, legacy SSE (non-VEX) encoding preserves
> > upper register contents for AVX or AVX512 registers, [E]VEX encoding
> > will zero.  But that's just an ugly detail.  Note that x86 exposes
> > V2SI (and V1SI), for those there's nothing automagic from the ISA.
> > 
> > > >
> > > > Your FOO -> {V4SI, V4SI} -> {V4HI} is what you can use for convenience
> > > > in place of FOO_LO with V4HI input.  That's also conveniently endianness
> > > > invariant.
> > > >
> > > > Can you provide links to the documentation on the aarch64 ISA for addh?
> > >
> > > https://developer.arm.com/documentation/100069/0606/SIMD-Vector-Instructions/ADDHN--ADDHN2--vector-
> > >
> > > note the size of Tb for size == 00 (Q == -) is always 64 bits
> > 
> > So you can't swap addhn, addhn2 because addhn will clobber the upper
> > parts.  So it does _not_ just produce V4HI.  The specification does not
> > talk about big-endian at all, so is "upper" and "lower" part here
> > referring to some logical lane numbering that's independent of endianness?
> > 
> > > I could see you maybe wanting FOO to be {V4SI, V4SI} -> {V8HI}, especially
> > > for ease of implementation
> > 
> > FOO should be {V4SI, V4SI} -> {V4HI}, it has to double-down as scalar
> > operation as well.
> > 
> > > on x86 (though I think you'll struggle with SSE), but I think _LO/_HI should
> > > use V4HI ☹
> > 
> > x86 doesn't have addh I think, but we obviously can do vaddpd +
> > pack_trunc, but the vectorizer can do this itself hopefully, no need
> > for a pattern.  There is no "problem" with SSE.  Once you have AVX
> > register width you get vex encoding.  Without vex you only have SSE width
> > anyway.
> > 
> > Richard.
> > 
> > >
> > > Thanks,
> > > Tamar
> > >
> > > >
> > > > Richard.
> > > >
> > > >
> > > > > Thanks,
> > > > > Tamar
> > > > >
> > > > > [1] https://www.felixcloutier.com/x86/addpd
> > > > > >
> > > > > > Thanks,
> > > > > > Richard.
> > > > > >
> > > > > > > Bootstrapped Regtested on aarch64-none-linux-gnu,
> > > > > > > arm-none-linux-gnueabihf, x86_64-pc-linux-gnu
> > > > > > > -m32, -m64 and no issues.
> > > > > > >
> > > > > > > Ok for master?
> > > > > > >
> > > > > > > Thanks,
> > > > > > > Tamar
> > > > > > >
> > > > > > > gcc/ChangeLog:
> > > > > > >
> > > > > > > >   * internal-fn.cc (lookup_hilo_internal_fn,
> > > > > > > >   DEF_INTERNAL_NARROWING_OPTAB_FN, lookup_evenodd_internal_fn,
> > > > > > > >   narrowing_fn_p, narrowing_evenodd_fn_p): New.
> > > > > > >   * internal-fn.def (DEF_INTERNAL_NARROWING_OPTAB_FN): New.
> > > > > > >   * internal-fn.h (narrowing_fn_p, narrowing_evenodd_fn_p): New.
> > > > > > > >   * tree-vect-stmts.cc (simple_integer_narrowing, vectorizable_call,
> > > > > > > >   vectorizable_conversion, supportable_widening_operation,
> > > > > > > >   supportable_narrowing_operation): Use it.
> > > > > > >   * tree-vectorizer.h (supportable_narrowing_operation): Modify
> > > > > > >   signature.
> > > > > > >
> > > > > > > ---
> > > > > > > diff --git a/gcc/internal-fn.cc b/gcc/internal-fn.cc
> > > > > > > > index bf2fac8180706ec418de7eb97cd1260f1d078c03..83438dd2ff57474cec999adaeabe92c0540e2a51 100644
> > > > > > > --- a/gcc/internal-fn.cc
> > > > > > > +++ b/gcc/internal-fn.cc
> > > > > > > @@ -101,7 +101,7 @@ lookup_internal_fn (const char *name)
> > > > > > >  extern void
> > > > > > >  lookup_hilo_internal_fn (internal_fn ifn, internal_fn *lo, 
> > > > > > > internal_fn *hi)
> > > > > > >  {
> > > > > > > -  gcc_assert (widening_fn_p (ifn));
> > > > > > > +  gcc_assert (widening_fn_p (ifn) || narrowing_fn_p (ifn));
> > > > > > >
> > > > > > >    switch (ifn)
> > > > > > >      {
> > > > > > > @@ -113,6 +113,11 @@ lookup_hilo_internal_fn (internal_fn ifn,
> > internal_fn
> > > > > > *lo, internal_fn *hi)
> > > > > > >        *lo = internal_fn (IFN_##NAME##_LO);                       
> > > > > > > \
> > > > > > >        *hi = internal_fn (IFN_##NAME##_HI);                       
> > > > > > > \
> > > > > > >        break;
> > > > > > > +#define DEF_INTERNAL_NARROWING_OPTAB_FN(NAME, F, S, SO, UO,
> > T1,
> > > > T2) \
> > > > > > > +    case IFN_##NAME:                                             
> > > > > > >     \
> > > > > > > +      *lo = internal_fn (IFN_##NAME##_LO);                       
> > > > > > >     \
> > > > > > > +      *hi = internal_fn (IFN_##NAME##_HI);                       
> > > > > > >     \
> > > > > > > +      break;
> > > > > > >  #include "internal-fn.def"
> > > > > > >      }
> > > > > > >  }
> > > > > > > @@ -124,7 +129,7 @@ extern void
> > > > > > >  lookup_evenodd_internal_fn (internal_fn ifn, internal_fn *even,
> > > > > > >                       internal_fn *odd)
> > > > > > >  {
> > > > > > > -  gcc_assert (widening_fn_p (ifn));
> > > > > > > +  gcc_assert (widening_fn_p (ifn) || narrowing_fn_p (ifn));
> > > > > > >
> > > > > > >    switch (ifn)
> > > > > > >      {
> > > > > > > @@ -136,6 +141,11 @@ lookup_evenodd_internal_fn (internal_fn ifn,
> > > > > > internal_fn *even,
> > > > > > >        *even = internal_fn (IFN_##NAME##_EVEN);                   
> > > > > > > \
> > > > > > >        *odd = internal_fn (IFN_##NAME##_ODD);                     
> > > > > > > \
> > > > > > >        break;
> > > > > > > +#define DEF_INTERNAL_NARROWING_OPTAB_FN(NAME, F, S, SO, UO,
> > T1,
> > > > T2) \
> > > > > > > +    case IFN_##NAME:                                             
> > > > > > >     \
> > > > > > > +      *even = internal_fn (IFN_##NAME##_EVEN);
> > \
> > > > > > > +      *odd = internal_fn (IFN_##NAME##_ODD);                     
> > > > > > >     \
> > > > > > > +      break;
> > > > > > >  #include "internal-fn.def"
> > > > > > >      }
> > > > > > >  }
> > > > > > > @@ -4548,6 +4558,35 @@ widening_fn_p (code_helper code)
> > > > > > >      }
> > > > > > >  }
> > > > > > >
> > > > > > > +/* Return true if this CODE describes an internal_fn that 
> > > > > > > returns a vector
> > > > with
> > > > > > > +   elements twice as narrow as the element size of the input 
> > > > > > > vectors.  */
> > > > > > > +
> > > > > > > +bool
> > > > > > > +narrowing_fn_p (code_helper code)
> > > > > > > +{
> > > > > > > +  if (!code.is_fn_code ())
> > > > > > > +    return false;
> > > > > > > +
> > > > > > > +  if (!internal_fn_p ((combined_fn) code))
> > > > > > > +    return false;
> > > > > > > +
> > > > > > > +  internal_fn fn = as_internal_fn ((combined_fn) code);
> > > > > > > +  switch (fn)
> > > > > > > +    {
> > > > > > > > +    #define DEF_INTERNAL_NARROWING_OPTAB_FN(NAME, F, S, SO, UO, T1, T2) \
> > > > > > > > +    case IFN_##NAME:                                                     \
> > > > > > > > +    case IFN_##NAME##_HI:                                                \
> > > > > > > > +    case IFN_##NAME##_LO:                                                \
> > > > > > > > +    case IFN_##NAME##_EVEN:                                              \
> > > > > > > > +    case IFN_##NAME##_ODD:                                               \
> > > > > > > > +      return true;
> > > > > > > +    #include "internal-fn.def"
> > > > > > > +
> > > > > > > +    default:
> > > > > > > +      return false;
> > > > > > > +    }
> > > > > > > +}
> > > > > > > +
> > > > > > >  /* Return true if this CODE describes an internal_fn that 
> > > > > > > returns a vector
> > with
> > > > > > >     elements twice as wide as the element size of the input 
> > > > > > > vectors and
> > > > operates
> > > > > > >     on even/odd parts of the input.  */
> > > > > > > @@ -4575,6 +4614,33 @@ widening_evenodd_fn_p (code_helper code)
> > > > > > >      }
> > > > > > >  }
> > > > > > >
> > > > > > > +/* Return true if this CODE describes an internal_fn that 
> > > > > > > returns a vector
> > > > with
> > > > > > > +   elements twice as narrow as the element size of the input 
> > > > > > > vectors and
> > > > > > > +   operates on even/odd parts of the input.  */
> > > > > > > +
> > > > > > > +bool
> > > > > > > +narrowing_evenodd_fn_p (code_helper code)
> > > > > > > +{
> > > > > > > +  if (!code.is_fn_code ())
> > > > > > > +    return false;
> > > > > > > +
> > > > > > > +  if (!internal_fn_p ((combined_fn) code))
> > > > > > > +    return false;
> > > > > > > +
> > > > > > > +  internal_fn fn = as_internal_fn ((combined_fn) code);
> > > > > > > +  switch (fn)
> > > > > > > +    {
> > > > > > > > +    #define DEF_INTERNAL_NARROWING_OPTAB_FN(NAME, F, S, SO, UO, T1, T2) \
> > > > > > > > +    case IFN_##NAME##_EVEN:                                              \
> > > > > > > > +    case IFN_##NAME##_ODD:                                               \
> > > > > > > > +      return true;
> > > > > > > +    #include "internal-fn.def"
> > > > > > > +
> > > > > > > +    default:
> > > > > > > +      return false;
> > > > > > > +    }
> > > > > > > +}
> > > > > > > +
> > > > > > >  /* Return true if IFN_SET_EDOM is supported.  */
> > > > > > >
> > > > > > >  bool
> > > > > > > diff --git a/gcc/internal-fn.def b/gcc/internal-fn.def
> > > > > > > > index d2480a1bf7927476215bc7bb99c0b74197d2b7e9..69677dd10b980c83dec36487b1214ff066f4789b 100644
> > > > > > > --- a/gcc/internal-fn.def
> > > > > > > +++ b/gcc/internal-fn.def
> > > > > > > @@ -40,6 +40,8 @@ along with GCC; see the file COPYING3.  If not 
> > > > > > > see
> > > > > > >       DEF_INTERNAL_SIGNED_COND_FN (NAME, FLAGS, OPTAB, TYPE)
> > > > > > >       DEF_INTERNAL_WIDENING_OPTAB_FN (NAME, FLAGS, SELECTOR,
> > > > SOPTAB,
> > > > > > UOPTAB,
> > > > > > >                                TYPE)
> > > > > > > +     DEF_INTERNAL_NARROWING_OPTAB_FN (NAME, FLAGS, SELECTOR,
> > > > > > SOPTAB, UOPTAB,
> > > > > > > +                              TYPE_LO, TYPE_HI)
> > > > > > >
> > > > > > >     where NAME is the name of the function, FLAGS is a set of
> > > > > > >     ECF_* flags and FNSPEC is a string describing functions 
> > > > > > > fnspec.
> > > > > > > @@ -122,6 +124,21 @@ along with GCC; see the file COPYING3.  If 
> > > > > > > not
> > see
> > > > > > >     These five internal functions will require two optabs each, a
> > SIGNED_OPTAB
> > > > > > >     and an UNSIGNED_OTPAB.
> > > > > > >
> > > > > > > > +   DEF_INTERNAL_NARROWING_OPTAB_FN is a wrapper that defines five internal
> > > > > > > > +   functions with DEF_INTERNAL_SIGNED_OPTAB_FN:
> > > > > > > > +   - one that describes a narrowing operation with the same number of elements
> > > > > > > > +   in the output and input vectors,
> > > > > > > > +   - two that describe a pair of high-low narrowing operations where the output
> > > > > > > > +   vectors each have half the number of elements of the input vectors,
> > > > > > > > +   corresponding to the result of the narrowing operation on the top half and
> > > > > > > > +   bottom half, these have the suffixes _HI and _LO,
> > > > > > > > +   - and two that describe a pair of even-odd narrowing operations where the
> > > > > > > > +   output vectors each have half the number of elements of the input vectors,
> > > > > > > > +   corresponding to the result of the narrowing operation on the even and odd
> > > > > > > > +   elements, these have the suffixes _EVEN and _ODD.
> > > > > > > > +   These five internal functions will require two optabs each, a SIGNED_OPTAB
> > > > > > > > +   and an UNSIGNED_OTPAB.
> > > > > > > +
> > > > > > >     DEF_INTERNAL_COND_FN is a wrapper that defines 2 internal 
> > > > > > > functions
> > > > with
> > > > > > >     DEF_INTERNAL_OPTAB_FN:
> > > > > > >     - One is COND_* operations that are predicated by mask only. 
> > > > > > > Such
> > > > operations
> > > > > > > @@ -184,6 +201,15 @@ along with GCC; see the file COPYING3.  If 
> > > > > > > not
> > see
> > > > > > >    DEF_INTERNAL_SIGNED_OPTAB_FN (NAME ## _ODD, FLAGS, SELECTOR,
> > > > > > SOPTAB##_odd, UOPTAB##_odd, TYPE)
> > > > > > >  #endif
> > > > > > >
> > > > > > > > +#ifndef DEF_INTERNAL_NARROWING_OPTAB_FN
> > > > > > > > +#define DEF_INTERNAL_NARROWING_OPTAB_FN(NAME, FLAGS, SELECTOR, SOPTAB, UOPTAB, TYPE_LO, TYPE_HI) \
> > > > > > > > +  DEF_INTERNAL_SIGNED_OPTAB_FN (NAME, FLAGS, SELECTOR, SOPTAB, UOPTAB, TYPE_LO) \
> > > > > > > > +  DEF_INTERNAL_SIGNED_OPTAB_FN (NAME ## _LO, FLAGS, SELECTOR, SOPTAB##_lo, UOPTAB##_lo, TYPE_LO) \
> > > > > > > > +  DEF_INTERNAL_SIGNED_OPTAB_FN (NAME ## _HI, FLAGS, SELECTOR, SOPTAB##_hi, UOPTAB##_hi, TYPE_HI) \
> > > > > > > > +  DEF_INTERNAL_SIGNED_OPTAB_FN (NAME ## _EVEN, FLAGS, SELECTOR, SOPTAB##_even, UOPTAB##_even, TYPE_LO) \
> > > > > > > > +  DEF_INTERNAL_SIGNED_OPTAB_FN (NAME ## _ODD, FLAGS, SELECTOR, SOPTAB##_odd, UOPTAB##_odd, TYPE_HI)
> > > > > > > > +#endif
> > > > > > > +
> > > > > > >  #ifndef DEF_INTERNAL_COND_FN
> > > > > > >  #define DEF_INTERNAL_COND_FN(NAME, FLAGS, OPTAB, TYPE)
> > > > \
> > > > > > >    DEF_INTERNAL_OPTAB_FN (COND_##NAME, FLAGS, cond_##OPTAB,
> > > > > > cond_##TYPE)        \
> > > > > > > @@ -608,6 +634,7 @@ DEF_INTERNAL_OPTAB_FN (BIT_ANDN,
> > ECF_CONST,
> > > > > > andn, binary)
> > > > > > >  DEF_INTERNAL_OPTAB_FN (BIT_IORN, ECF_CONST, iorn, binary)
> > > > > > >
> > > > > > >  #undef DEF_INTERNAL_WIDENING_OPTAB_FN
> > > > > > > +#undef DEF_INTERNAL_NARROWING_OPTAB_FN
> > > > > > >  #undef DEF_INTERNAL_SIGNED_COND_FN
> > > > > > >  #undef DEF_INTERNAL_COND_FN
> > > > > > >  #undef DEF_INTERNAL_INT_EXT_FN
> > > > > > > diff --git a/gcc/internal-fn.h b/gcc/internal-fn.h
> > > > > > > > index fd21694dfebfb8518810fd85f7aa8c45dd4c362e..8c6ad218e4412716ba7b79b24af708920e11e3be 100644
> > > > > > > --- a/gcc/internal-fn.h
> > > > > > > +++ b/gcc/internal-fn.h
> > > > > > > @@ -220,6 +220,8 @@ extern int first_commutative_argument
> > > > (internal_fn);
> > > > > > >  extern bool associative_binary_fn_p (internal_fn);
> > > > > > >  extern bool widening_fn_p (code_helper);
> > > > > > >  extern bool widening_evenodd_fn_p (code_helper);
> > > > > > > +extern bool narrowing_fn_p (code_helper);
> > > > > > > +extern bool narrowing_evenodd_fn_p (code_helper);
> > > > > > >
> > > > > > >  extern bool set_edom_supported_p (void);
> > > > > > >
> > > > > > > diff --git a/gcc/tree-vect-stmts.cc b/gcc/tree-vect-stmts.cc
> > > > > > > > index 675c6e2e683c59df44d5d7d65b87900a70506f50..97b3d4801d19f3168b91c91271e882bad3f99f13 100644
> > > > > > > --- a/gcc/tree-vect-stmts.cc
> > > > > > > +++ b/gcc/tree-vect-stmts.cc
> > > > > > > @@ -3157,15 +3157,20 @@ simple_integer_narrowing (tree
> > vectype_out,
> > > > tree
> > > > > > vectype_in,
> > > > > > >        || !INTEGRAL_TYPE_P (TREE_TYPE (vectype_in)))
> > > > > > >      return false;
> > > > > > >
> > > > > > > -  code_helper code;
> > > > > > > +  code_helper code1 = ERROR_MARK, code2 = ERROR_MARK;
> > > > > > >    int multi_step_cvt = 0;
> > > > > > >    auto_vec <tree, 8> interm_types;
> > > > > > > >    if (!supportable_narrowing_operation (NOP_EXPR, vectype_out, vectype_in,
> > > > > > > > -                                 &code, &multi_step_cvt, &interm_types)
> > > > > > > > +                                 &code1, &code2, &multi_step_cvt,
> > > > > > > > +                                 &interm_types)
> > > > > > >        || multi_step_cvt)
> > > > > > >      return false;
> > > > > > >
> > > > > > > -  *convert_code = code;
> > > > > > > +  /* Simple narrowing never have hi/lo splits.  */
> > > > > > > +  if (code2 != ERROR_MARK)
> > > > > > > +    return false;
> > > > > > > +
> > > > > > > +  *convert_code = code1;
> > > > > > >    return true;
> > > > > > >  }
> > > > > > >
> > > > > > > @@ -3375,6 +3380,7 @@ vectorizable_call (vec_info *vinfo,
> > > > > > >    if (cfn != CFN_LAST
> > > > > > >        && (modifier == NONE
> > > > > > >     || (modifier == NARROW
> > > > > > > +       && !narrowing_fn_p (cfn)
> > > > > > >         && simple_integer_narrowing (vectype_out, vectype_in,
> > > > > > >                                      &convert_code))))
> > > > > > >      ifn = vectorizable_internal_function (cfn, callee, 
> > > > > > > vectype_out,
> > > > > > > @@ -3511,7 +3517,7 @@ vectorizable_call (vec_info *vinfo,
> > > > > > >    if (clz_ctz_arg1)
> > > > > > >      ++vect_nargs;
> > > > > > >
> > > > > > > -  if (modifier == NONE || ifn != IFN_LAST)
> > > > > > > +  if (modifier == NONE || (ifn != IFN_LAST && !narrowing_fn_p 
> > > > > > > (ifn)))
> > > > > > >      {
> > > > > > >        tree prev_res = NULL_TREE;
> > > > > > >        vargs.safe_grow (vect_nargs, true);
> > > > > > > @@ -5058,7 +5064,8 @@ vectorizable_conversion (vec_info *vinfo,
> > > > > > >    if (!widen_arith
> > > > > > >        && !CONVERT_EXPR_CODE_P (code)
> > > > > > >        && code != FIX_TRUNC_EXPR
> > > > > > > -      && code != FLOAT_EXPR)
> > > > > > > +      && code != FLOAT_EXPR
> > > > > > > +      && !narrowing_fn_p (code))
> > > > > > >      return false;
> > > > > > >
> > > > > > >    /* Check types of lhs and rhs.  */
> > > > > > > @@ -5102,7 +5109,8 @@ vectorizable_conversion (vec_info *vinfo,
> > > > > > >      {
> > > > > > >        gcc_assert (code == WIDEN_MULT_EXPR
> > > > > > >             || code == WIDEN_LSHIFT_EXPR
> > > > > > > -           || widening_fn_p (code));
> > > > > > > +           || widening_fn_p (code)
> > > > > > > +           || narrowing_fn_p (code));
> > > > > > >
> > > > > > >        op1 = is_gimple_assign (stmt) ? gimple_assign_rhs2 (stmt) :
> > > > > > >                                gimple_call_arg (stmt, 0);
> > > > > > > @@ -5285,9 +5293,9 @@ vectorizable_conversion (vec_info *vinfo,
> > > > > > >        break;
> > > > > > >
> > > > > > >      case NARROW_DST:
> > > > > > > -      gcc_assert (op_type == unary_op);
> > > > > > > +      gcc_assert (op_type == unary_op || op_type == binary_op);
> > > > > > > >        if (supportable_narrowing_operation (code, vectype_out, vectype_in,
> > > > > > > > -                                    &code1, &multi_step_cvt,
> > > > > > > > +                                    &code1, &code2, &multi_step_cvt,
> > > > > > > >                                      &interm_types))
> > > > > > >   break;
> > > > > > >
> > > > > > > @@ -5307,7 +5315,7 @@ vectorizable_conversion (vec_info *vinfo,
> > > > > > >     else
> > > > > > >       goto unsupported;
> > > > > > > >     if (supportable_narrowing_operation (NOP_EXPR, vectype_out, cvt_type,
> > > > > > > > -                                        &code1, &multi_step_cvt,
> > > > > > > > +                                        &code1, &code2, &multi_step_cvt,
> > > > > > > >                                          &interm_types))
> > > > > > >       break;
> > > > > > >   }
> > > > > > > @@ -5336,7 +5344,7 @@ vectorizable_conversion (vec_info *vinfo,
> > > > > > >     if (cvt_type == NULL_TREE)
> > > > > > >       goto unsupported;
> > > > > > > >     if (!supportable_narrowing_operation (NOP_EXPR, cvt_type, vectype_in,
> > > > > > > > -                                         &code1, &multi_step_cvt,
> > > > > > > > +                                         &code1, &code2, &multi_step_cvt,
> > > > > > > >                                           &interm_types))
> > > > > > > >       goto unsupported;
> > > > > > > >     if (supportable_convert_operation ((tree_code) code, vectype_out,
> > > > > > > @@ -5553,11 +5561,44 @@ vectorizable_conversion (vec_info *vinfo,
> > > > > > >       vec_oprnds0[i] = new_temp;
> > > > > > >     }
> > > > > > >
> > > > > > > -      vect_create_vectorized_demotion_stmts (vinfo, &vec_oprnds0,
> > > > > > > -                                      multi_step_cvt,
> > > > > > > -                                      stmt_info, vec_dsts, gsi,
> > > > > > > -                                      slp_node, code1,
> > > > > > > -                                      modifier == NARROW_SRC);
> > > > > > > +      if (modifier == NARROW_DST && narrowing_fn_p (code))
> > > > > > > + {
> > > > > > > +   gcc_assert (op_type == binary_op);
> > > > > > > +   vect_get_vec_defs (vinfo, slp_node, op0, &vec_oprnds0,
> > > > > > > +                      op1, &vec_oprnds1);
> > > > > > > +   tree vop0, vop1;
> > > > > > > +   internal_fn ifn1 = as_internal_fn ((combined_fn)code1);
> > > > > > > +   internal_fn ifn2 = as_internal_fn ((combined_fn)code2);
> > > > > > > > +   tree small_type
> > > > > > > > +     = get_related_vectype_for_scalar_type (TYPE_MODE (vectype_out),
> > > > > > > > +                                            TREE_TYPE (vectype_out),
> > > > > > > > +                                            exact_div (TYPE_VECTOR_SUBPARTS (vectype_out), 2));
> > > > > > > +   for (unsigned i = 0; i < vec_oprnds0.length (); i += 2)
> > > > > > > +     {
> > > > > > > +       vop0 = vec_oprnds0[i];
> > > > > > > +       vop1 = vec_oprnds1[i];
> > > > > > > +       gimple *new_stmt
> > > > > > > +         = gimple_build_call_internal (ifn1, 2, vop0, vop1);
> > > > > > > +       tree new_tmp = make_ssa_name (small_type);
> > > > > > > +       gimple_call_set_lhs (new_stmt, new_tmp);
> > > > > > > +       vect_finish_stmt_generation (vinfo, stmt_info, new_stmt, gsi);
> > > > > > > +
> > > > > > > +       vop0 = vec_oprnds0[i + 1];
> > > > > > > +       vop1 = vec_oprnds1[i + 1];
> > > > > > > +       new_stmt
> > > > > > > +         = gimple_build_call_internal (ifn2, 3, vop0, vop1, new_tmp);
> > > > > > > +       new_tmp = make_ssa_name (vec_dest, new_stmt);
> > > > > > > +       gimple_call_set_lhs (new_stmt, new_tmp);
> > > > > > > +       vect_finish_stmt_generation (vinfo, stmt_info, new_stmt, gsi);
> > > > > > > +       slp_node->push_vec_def (new_stmt);
> > > > > > > +     }
> > > > > > > + }
> > > > > > > +      else
> > > > > > > +        vect_create_vectorized_demotion_stmts (vinfo, &vec_oprnds0,
> > > > > > > +                                        multi_step_cvt,
> > > > > > > +                                        stmt_info, vec_dsts, gsi,
> > > > > > > +                                        slp_node, code1,
> > > > > > > +                                        modifier == NARROW_SRC);
> > > > > > >        /* After demoting op0 to cvt_type, convert it to dest.  */
> > > > > > >        if (cvt_type && code == FLOAT_EXPR)
> > > > > > >   {
> > > > > > > @@ -13616,6 +13657,8 @@ supportable_widening_operation (vec_info *vinfo,
> > > > > > >     Output:
> > > > > > >     - CODE1 is the code of a vector operation to be used when
> > > > > > >     vectorizing the operation, if available.
> > > > > > > +   - CODE2 is the code of a vector operation for the high part to be used when
> > > > > > > +   vectorizing the operation, if available.
> > > > > > >     - MULTI_STEP_CVT determines the number of required intermediate steps in
> > > > > > >     case of multi-step conversion (like int->short->char - in that case
> > > > > > >     MULTI_STEP_CVT will be 1).
> > > > > > > @@ -13625,64 +13668,117 @@ supportable_widening_operation (vec_info *vinfo,
> > > > > > >  bool
> > > > > > >  supportable_narrowing_operation (code_helper code,
> > > > > > >                            tree vectype_out, tree vectype_in,
> > > > > > > -                          code_helper *code1, int *multi_step_cvt,
> > > > > > > -                                 vec<tree> *interm_types)
> > > > > > > +                          code_helper *code1, code_helper *code2,
> > > > > > > +                          int *multi_step_cvt, vec<tree> *interm_types)
> > > > > > >  {
> > > > > > >    machine_mode vec_mode;
> > > > > > > -  enum insn_code icode1;
> > > > > > > -  optab optab1, interm_optab;
> > > > > > > +  enum insn_code icode1 = CODE_FOR_nothing, icode2 = CODE_FOR_nothing;
> > > > > > > +  optab optab1 = unknown_optab, optab2 = unknown_optab, interm_optab;
> > > > > > >    tree vectype = vectype_in;
> > > > > > >    tree narrow_vectype = vectype_out;
> > > > > > > -  enum tree_code c1;
> > > > > > > +  code_helper c1 = ERROR_MARK;
> > > > > > >    tree intermediate_type, prev_type;
> > > > > > >    machine_mode intermediate_mode, prev_mode;
> > > > > > >    int i;
> > > > > > >    unsigned HOST_WIDE_INT n_elts;
> > > > > > >    bool uns;
> > > > > > >
> > > > > > > -  if (!code.is_tree_code ())
> > > > > > > -    return false;
> > > > > > > -
> > > > > > > +  vec_mode = TYPE_MODE (vectype);
> > > > > > >    *multi_step_cvt = 0;
> > > > > > > -  switch ((tree_code) code)
> > > > > > > +  if (narrowing_fn_p (code))
> > > > > > > +     {
> > > > > > > +       /* If this is an internal fn then we must check whether the target
> > > > > > > +   supports the narrowing in one go.  */
> > > > > > > +      internal_fn ifn = as_internal_fn ((combined_fn) code);
> > > > > > > +
> > > > > > > +      internal_fn lo, hi, even, odd;
> > > > > > > +      lookup_hilo_internal_fn (ifn, &lo, &hi);
> > > > > > > +      if (BYTES_BIG_ENDIAN)
> > > > > > > + std::swap (lo, hi);
> > > > > > > +      *code1 = as_combined_fn (lo);
> > > > > > > +      *code2 = as_combined_fn (hi);
> > > > > > > +      optab1 = direct_internal_fn_optab (lo, {vectype, vectype});
> > > > > > > +      optab2 = direct_internal_fn_optab (hi, {vectype, vectype});
> > > > > > > +
> > > > > > > +      /* If we don't support low-high, then check for even-odd.  */
> > > > > > > +      if (!optab1
> > > > > > > +   || (icode1 = optab_handler (optab1, vec_mode)) == CODE_FOR_nothing
> > > > > > > +   || !optab2
> > > > > > > +   || (icode2 = optab_handler (optab2, vec_mode)) == CODE_FOR_nothing)
> > > > > > > + {
> > > > > > > +   lookup_evenodd_internal_fn (ifn, &even, &odd);
> > > > > > > +   *code1 = as_combined_fn (even);
> > > > > > > +   *code2 = as_combined_fn (odd);
> > > > > > > +   optab1 = direct_internal_fn_optab (even, {vectype, vectype});
> > > > > > > +   optab2 = direct_internal_fn_optab (odd, {vectype, vectype});
> > > > > > > + }
> > > > > > > +    }
> > > > > > > +  else if (code.is_tree_code ())
> > > > > > >      {
> > > > > > > -    CASE_CONVERT:
> > > > > > > -      c1 = VEC_PACK_TRUNC_EXPR;
> > > > > > > -      if (VECTOR_BOOLEAN_TYPE_P (narrow_vectype)
> > > > > > > -   && VECTOR_BOOLEAN_TYPE_P (vectype)
> > > > > > > -   && SCALAR_INT_MODE_P (TYPE_MODE (vectype))
> > > > > > > -   && TYPE_VECTOR_SUBPARTS (vectype).is_constant (&n_elts)
> > > > > > > -   && n_elts < BITS_PER_UNIT)
> > > > > > > - optab1 = vec_pack_sbool_trunc_optab;
> > > > > > > -      else
> > > > > > > - optab1 = optab_for_tree_code (c1, vectype, optab_default);
> > > > > > > -      break;
> > > > > > > +      switch ((tree_code) code)
> > > > > > > + {
> > > > > > > + CASE_CONVERT:
> > > > > > > +   c1 = VEC_PACK_TRUNC_EXPR;
> > > > > > > +   if (VECTOR_BOOLEAN_TYPE_P (narrow_vectype)
> > > > > > > +       && VECTOR_BOOLEAN_TYPE_P (vectype)
> > > > > > > +       && SCALAR_INT_MODE_P (TYPE_MODE (vectype))
> > > > > > > +       && TYPE_VECTOR_SUBPARTS (vectype).is_constant (&n_elts)
> > > > > > > +       && n_elts < BITS_PER_UNIT)
> > > > > > > +     optab1 = vec_pack_sbool_trunc_optab;
> > > > > > > +   else
> > > > > > > +     optab1 = optab_for_tree_code ((tree_code)c1, vectype,
> > > > > > > +                                   optab_default);
> > > > > > > +   break;
> > > > > > >
> > > > > > > -    case FIX_TRUNC_EXPR:
> > > > > > > -      c1 = VEC_PACK_FIX_TRUNC_EXPR;
> > > > > > > -      /* The signedness is determined from output operand.  */
> > > > > > > -      optab1 = optab_for_tree_code (c1, vectype_out, optab_default);
> > > > > > > -      break;
> > > > > > > + case FIX_TRUNC_EXPR:
> > > > > > > +   c1 = VEC_PACK_FIX_TRUNC_EXPR;
> > > > > > > +   /* The signedness is determined from output operand.  */
> > > > > > > +   optab1 = optab_for_tree_code ((tree_code)c1, vectype_out,
> > > > > > > +                                 optab_default);
> > > > > > > +   break;
> > > > > > >
> > > > > > > -    case FLOAT_EXPR:
> > > > > > > -      c1 = VEC_PACK_FLOAT_EXPR;
> > > > > > > -      optab1 = optab_for_tree_code (c1, vectype, optab_default);
> > > > > > > -      break;
> > > > > > > + case FLOAT_EXPR:
> > > > > > > +   c1 = VEC_PACK_FLOAT_EXPR;
> > > > > > > +   optab1 = optab_for_tree_code ((tree_code)c1, vectype_out,
> > > > > > > +                                 optab_default);
> > > > > > > +   break;
> > > > > > >
> > > > > > > -    default:
> > > > > > > -      gcc_unreachable ();
> > > > > > > + default:
> > > > > > > +   gcc_unreachable ();
> > > > > > > + }
> > > > > > >      }
> > > > > > > +  else
> > > > > > > +    return false;
> > > > > > >
> > > > > > >    if (!optab1)
> > > > > > >      return false;
> > > > > > >
> > > > > > > -  vec_mode = TYPE_MODE (vectype);
> > > > > > > -  if ((icode1 = optab_handler (optab1, vec_mode)) == CODE_FOR_nothing)
> > > > > > > -    return false;
> > > > > > > +  if (narrowing_fn_p (code))
> > > > > > > +    {
> > > > > > > +      if (!optab2)
> > > > > > > + return false;
> > > > > > > +      if ((icode1 = optab_handler (optab1, vec_mode)) == CODE_FOR_nothing
> > > > > > > +   || (icode2 = optab_handler (optab2, vec_mode)) == CODE_FOR_nothing)
> > > > > > > + return false;
> > > > > > > +    }
> > > > > > > +  else
> > > > > > > +    {
> > > > > > > +      if ((icode1 = optab_handler (optab1, vec_mode)) == CODE_FOR_nothing)
> > > > > > > + return false;
> > > > > > >
> > > > > > > -  *code1 = c1;
> > > > > > > +      *code1 = c1;
> > > > > > > +    }
> > > > > > >
> > > > > > > -  if (insn_data[icode1].operand[0].mode == TYPE_MODE (narrow_vectype))
> > > > > > > +  machine_mode nmode;
> > > > > > > +  machine_mode vmode = TYPE_MODE (narrow_vectype);
> > > > > > > +  scalar_mode emode = GET_MODE_INNER (vmode);
> > > > > > > +  poly_uint64 hnunits;
> > > > > > > +  if (insn_data[icode1].operand[0].mode == vmode
> > > > > > > +      || (narrowing_fn_p (code)
> > > > > > > +   && known_ne (hnunits = exact_div (GET_MODE_NUNITS (vmode), 2U), 0U)
> > > > > > > +   && related_vector_mode (vmode, emode, hnunits).exists (&nmode)
> > > > > > > +   && insn_data[icode1].operand[0].mode == nmode
> > > > > > > +   && insn_data[icode2].operand[0].mode == vmode))
> > > > > > >      {
> > > > > > >        if (!VECTOR_BOOLEAN_TYPE_P (vectype))
> > > > > > >   return true;
> > > > > > > @@ -13716,7 +13812,7 @@ supportable_narrowing_operation (code_helper code,
> > > > > > >        intermediate_type
> > > > > > >   = lang_hooks.types.type_for_mode (TYPE_MODE (vectype_out), 0);
> > > > > > >        interm_optab
> > > > > > > - = optab_for_tree_code (c1, intermediate_type, optab_default);
> > > > > > > + = optab_for_tree_code ((tree_code)c1, intermediate_type, optab_default);
> > > > > > >        if (interm_optab != unknown_optab
> > > > > > >     && (icode2 = optab_handler (optab1, vec_mode)) != CODE_FOR_nothing
> > > > > > >     && insn_data[icode1].operand[0].mode
> > > > > > > diff --git a/gcc/tree-vectorizer.h b/gcc/tree-vectorizer.h
> > > > > > > index 3d8a9466982a0c29099e60ed7a84e0f5ed207fa9..026dfb131b4c2808290fdbd015b63dab5918c7f2 100644
> > > > > > > --- a/gcc/tree-vectorizer.h
> > > > > > > +++ b/gcc/tree-vectorizer.h
> > > > > > > @@ -2463,8 +2463,8 @@ extern bool supportable_widening_operation (vec_info*, code_helper,
> > > > > > >                                       code_helper*, code_helper*,
> > > > > > >                                       int*, vec<tree> *);
> > > > > > >  extern bool supportable_narrowing_operation (code_helper, tree, tree,
> > > > > > > -                                      code_helper *, int *,
> > > > > > > -                                      vec<tree> *);
> > > > > > > +                                      code_helper *, code_helper *,
> > > > > > > +                                      int *, vec<tree> *);
> > > > > > >  extern bool supportable_indirect_convert_operation (code_helper,
> > > > > > >                                               tree, tree,
> > > > > > >                                               vec<std::pair<tree, tree_code> > &,
> > > > > > >
> > > > > > >
> > > > > > >
> > > > > >

-- 
Richard Biener <rguent...@suse.de>
SUSE Software Solutions Germany GmbH,
Frankenstrasse 146, 90461 Nuernberg, Germany;
GF: Ivo Totev, Andrew McDonald, Werner Knoblich; (HRB 36809, AG Nuernberg)
