Richard Biener <rguent...@suse.de> writes:
> On Fri, 12 May 2023, Andre Vieira (lists) wrote:
>
>> I have dealt with, I think..., most of your comments. There's quite a few
>> changes, I think it's all a bit simpler now. I made some other changes to the
>> costing in tree-inline.cc and gimple-range-op.cc in which I try to preserve
>> the same behaviour as we had with the tree codes before. Also added some 
>> extra
>> checks to tree-cfg.cc that made sense to me.
>> 
>> I am still regression testing the gimple-range-op change, as that was a last
>> minute change, but the rest survived a bootstrap and regression test on
>> aarch64-unknown-linux-gnu.
>> 
>> cover letter:
>> 
>> This patch replaces the existing tree_code widen_plus and widen_minus
>> patterns with internal_fn versions.
>> 
>> DEF_INTERNAL_OPTAB_WIDENING_HILO_FN and DEF_INTERNAL_OPTAB_NARROWING_HILO_FN
>> are like DEF_INTERNAL_SIGNED_OPTAB_FN and DEF_INTERNAL_OPTAB_FN respectively
>> except they provide convenience wrappers for defining conversions that 
>> require
>> a hi/lo split.  Each definition for <NAME> will require optabs for _hi and 
>> _lo
>> and each of those will also require a signed and unsigned version in the case
>> of widening. The hi/lo pair is necessary because the widening and narrowing
>> operations take n narrow elements as inputs and return n/2 wide elements as
>> outputs. The 'lo' operation operates on the first n/2 elements of input. The
>> 'hi' operation operates on the second n/2 elements of input. Defining an
>> internal_fn along with hi/lo variations allows a single internal function to
>> be returned from a vect_recog function that will later be expanded to hi/lo.
>> 
>> 
>>  For example:
>>  IFN_VEC_WIDEN_PLUS -> IFN_VEC_WIDEN_PLUS_HI, IFN_VEC_WIDEN_PLUS_LO
>> for aarch64: IFN_VEC_WIDEN_PLUS_HI   -> vec_widen_<su>add_hi_<mode> ->
>> (u/s)addl2
>>                        IFN_VEC_WIDEN_PLUS_LO  -> vec_widen_<su>add_lo_<mode>
>> -> (u/s)addl
>> 
>> This gives the same functionality as the previous WIDEN_PLUS/WIDEN_MINUS tree
>> codes which are expanded into VEC_WIDEN_PLUS_LO, VEC_WIDEN_PLUS_HI.
>
> What I still don't understand is how we are so narrowly focused on
> HI/LO?  We need a combined scalar IFN for pattern selection (not
> sure why that's now called _HILO, I expected no suffix).  Then there's
> three possibilities the target can implement this:
>
>  1) with a widen_[su]add<mode> instruction - I _think_ that's what
>     RISCV is going to offer since it is a target where vector modes
>     have "padding" (aka you cannot subreg a V2SI to get V4HI).  Instead
>     RVV can do a V4HI to V4SI widening and widening add/subtract
>     using vwadd[u] and vwsub[u] (the HI->SI widening is actually
>     done with a widening add of zero - eh).
>     IIRC GCN is the same here.

SVE currently does this too, but the addition and widening are
separate operations.  E.g. in principle there's no reason why
you can't sign-extend one operand, zero-extend the other, and
then add the result together.  Or you could extend them from
different sizes (QI and HI).  All of those are supported
(if the costing allows them).

If the target has operations to do combined extending and adding (or
whatever), then at the moment we rely on combine to generate them.

So I think this case is separate from Andre's work.  The addition
itself is just an ordinary addition, and any widening happens by
vectorising a CONVERT/NOP_EXPR.

>  2) with a widen_[su]add{_lo,_hi}<mode> combo - that's what the tree
>     codes currently support (exclusively)
>  3) similar, but widen_[su]add{_even,_odd}<mode>
>
> that said, things like decomposes_to_hilo_fn_p look to paint us into
> a 2) corner without good reason.

I suppose one question is: how much of the patch is really specific
to HI/LO, and how much is just grouping two halves together?  The nice
thing about the internal-fn grouping macros is that, if (3) is
implemented in future, the structure will strongly encourage even/odd
pairs to be supported for all operations that support hi/lo.  That is,
I would expect the grouping macros to be extended to define even/odd
ifns alongside hi/lo ones, rather than adding separate definitions
for even/odd functions.

If so, at least from the internal-fn.* side of things, I think the question
is whether it's OK to stick with hilo names for now, or whether we should
use more forward-looking names.

Thanks,
Richard

>
> Richard.
>
>> gcc/ChangeLog:
>> 
>> 2023-05-12  Andre Vieira  <andre.simoesdiasvie...@arm.com>
>>             Joel Hutton  <joel.hut...@arm.com>
>>             Tamar Christina  <tamar.christ...@arm.com>
>> 
>>         * config/aarch64/aarch64-simd.md (vec_widen_<su>addl_lo_<mode>):
>> Rename
>>         this ...
>>         (vec_widen_<su>add_lo_<mode>): ... to this.
>>         (vec_widen_<su>addl_hi_<mode>): Rename this ...
>>         (vec_widen_<su>add_hi_<mode>): ... to this.
>>         (vec_widen_<su>subl_lo_<mode>): Rename this ...
>>         (vec_widen_<su>sub_lo_<mode>): ... to this.
>>         (vec_widen_<su>subl_hi_<mode>): Rename this ...
>>         (vec_widen_<su>sub_hi_<mode>): ...to this.
>>         * doc/generic.texi: Document new IFN codes.
>>      * internal-fn.cc (DEF_INTERNAL_OPTAB_WIDENING_HILO_FN): Macro to
>>      define an
>>         internal_fn that expands into multiple internal_fns for widening.
>>         (DEF_INTERNAL_OPTAB_NARROWING_HILO_FN): Likewise but for narrowing.
>>      (ifn_cmp): Function to compare ifn's for sorting/searching.
>>      (lookup_hilo_internal_fn): Add lookup function.
>>      (commutative_binary_fn_p): Add widen_plus fn's.
>>      (widening_fn_p): New function.
>>      (narrowing_fn_p): New function.
>>      (decomposes_to_hilo_fn_p): New function.
>>               (direct_internal_fn_optab): Change visibility.
>>      * internal-fn.def (DEF_INTERNAL_OPTAB_WIDENING_HILO_FN): Define
>>     widening
>>     plus,minus functions.
>>      (VEC_WIDEN_PLUS): Replacement for VEC_WIDEN_PLUS_EXPR tree code.
>>      (VEC_WIDEN_MINUS): Replacement for VEC_WIDEN_MINUS_EXPR tree code.
>>      * internal-fn.h (GCC_INTERNAL_FN_H): Add headers.
>>               (direct_internal_fn_optab): Declare new prototype.
>>      (lookup_hilo_internal_fn): Likewise.
>>      (widening_fn_p): Likewise.
>>      (Narrowing_fn_p): Likewise.
>>      (decomposes_to_hilo_fn_p): Likewise.
>>      * optabs.cc (commutative_optab_p): Add widening plus optabs.
>>      * optabs.def (OPTAB_D): Define widen add, sub optabs.
>>         * tree-cfg.cc (verify_gimple_call): Add checks for new widen
>>         add and sub IFNs.
>>         * tree-inline.cc (estimate_num_insns): Return same
>>         cost for widen add and sub IFNs as previous tree_codes.
>>      * tree-vect-patterns.cc (vect_recog_widen_op_pattern): Support
>>     patterns with a hi/lo split.
>>         (vect_recog_sad_pattern): Refactor to use new IFN codes.
>>         (vect_recog_widen_plus_pattern): Likewise.
>>         (vect_recog_widen_minus_pattern): Likewise.
>>         (vect_recog_average_pattern): Likewise.
>>      * tree-vect-stmts.cc (vectorizable_conversion): Add support for
>>               _HILO IFNs.
>>      (supportable_widening_operation): Likewise.
>>         * tree.def (WIDEN_SUM_EXPR): Update example to use new IFNs.
>> 
>> gcc/testsuite/ChangeLog:
>> 
>>      * gcc.target/aarch64/vect-widen-add.c: Test that new
>>     IFN_VEC_WIDEN_PLUS is being used.
>>      * gcc.target/aarch64/vect-widen-sub.c: Test that new
>>     IFN_VEC_WIDEN_MINUS is being used.
>> 

Reply via email to