On Fri, 12 May 2023, Andre Vieira (lists) wrote:
> I have dealt with, I think..., most of your comments. There's quite a few
> changes, I think it's all a bit simpler now. I made some other changes to the
> costing in tree-inline.cc and gimple-range-op.cc in which I try to preserve
> the same behaviour as we had with the tree codes before. Also added some extra
> checks to tree-cfg.cc that made sense to me.
>
> I am still regression testing the gimple-range-op change, as that was a last
> minute change, but the rest survived a bootstrap and regression test on
> aarch64-unknown-linux-gnu.
>
> cover letter:
>
> This patch replaces the existing tree_code widen_plus and widen_minus
> patterns with internal_fn versions.
>
> DEF_INTERNAL_OPTAB_WIDENING_HILO_FN and DEF_INTERNAL_OPTAB_NARROWING_HILO_FN
> are like DEF_INTERNAL_SIGNED_OPTAB_FN and DEF_INTERNAL_OPTAB_FN respectively
> except they provide convenience wrappers for defining conversions that require
> a hi/lo split. Each definition for <NAME> will require optabs for _hi and _lo
> and each of those will also require a signed and unsigned version in the case
> of widening. The hi/lo pair is necessary because the widening and narrowing
> operations take n narrow elements as inputs and return n/2 wide elements as
> outputs. The 'lo' operation operates on the first n/2 elements of input. The
> 'hi' operation operates on the second n/2 elements of input. Defining an
> internal_fn along with hi/lo variations allows a single internal function to
> be returned from a vect_recog function that will later be expanded to hi/lo.
>
>
> For example:
> IFN_VEC_WIDEN_PLUS -> IFN_VEC_WIDEN_PLUS_HI, IFN_VEC_WIDEN_PLUS_LO
> for aarch64: IFN_VEC_WIDEN_PLUS_HI -> vec_widen_<su>add_hi_<mode> ->
> (u/s)addl2
> IFN_VEC_WIDEN_PLUS_LO -> vec_widen_<su>add_lo_<mode>
> -> (u/s)addl
>
> This gives the same functionality as the previous WIDEN_PLUS/WIDEN_MINUS tree
> codes which are expanded into VEC_WIDEN_PLUS_LO, VEC_WIDEN_PLUS_HI.
What I still don't understand is how we are so narrowly focused on
HI/LO? We need a combined scalar IFN for pattern selection (not
sure why that's now called _HILO, I expected no suffix). Then there's
three possibilities the target can implement this:
1) with a widen_[su]add<mode> instruction - I _think_ that's what
RISCV is going to offer since it is a target where vector modes
have "padding" (aka you cannot subreg a V2SI to get V4HI). Instead
RVV can do a V4HI to V4SI widening and widening add/subtract
using vwadd[u] and vwsub[u] (the HI->SI widening is actually
done with a widening add of zero - eh).
IIRC GCN is the same here.
2) with a widen_[su]add{_lo,_hi}<mode> combo - that's what the tree
codes currently support (exclusively)
3) similar, but widen_[su]add{_even,_odd}<mode>
that said, things like decomposes_to_hilo_fn_p look to paint us into
a 2) corner without good reason.
Richard.
> gcc/ChangeLog:
>
> 2023-05-12 Andre Vieira <[email protected]>
> Joel Hutton <[email protected]>
> Tamar Christina <[email protected]>
>
> * config/aarch64/aarch64-simd.md (vec_widen_<su>addl_lo_<mode>):
> Rename
> this ...
> (vec_widen_<su>add_lo_<mode>): ... to this.
> (vec_widen_<su>addl_hi_<mode>): Rename this ...
> (vec_widen_<su>add_hi_<mode>): ... to this.
> (vec_widen_<su>subl_lo_<mode>): Rename this ...
> (vec_widen_<su>sub_lo_<mode>): ... to this.
> (vec_widen_<su>subl_hi_<mode>): Rename this ...
> (vec_widen_<su>sub_hi_<mode>): ...to this.
> * doc/generic.texi: Document new IFN codes.
> * internal-fn.cc (DEF_INTERNAL_OPTAB_WIDENING_HILO_FN): Macro to
> define an
> internal_fn that expands into multiple internal_fns for widening.
> (DEF_INTERNAL_OPTAB_NARROWING_HILO_FN): Likewise but for narrowing.
> (ifn_cmp): Function to compare ifn's for sorting/searching.
> (lookup_hilo_internal_fn): Add lookup function.
> (commutative_binary_fn_p): Add widen_plus fn's.
> (widening_fn_p): New function.
> (narrowing_fn_p): New function.
> (decomposes_to_hilo_fn_p): New function.
> (direct_internal_fn_optab): Change visibility.
> * internal-fn.def (DEF_INTERNAL_OPTAB_WIDENING_HILO_FN): Define
> widening
> plus,minus functions.
> (VEC_WIDEN_PLUS): Replacement for VEC_WIDEN_PLUS_EXPR tree code.
> (VEC_WIDEN_MINUS): Replacement for VEC_WIDEN_MINUS_EXPR tree code.
> * internal-fn.h (GCC_INTERNAL_FN_H): Add headers.
> (direct_internal_fn_optab): Declare new prototype.
> (lookup_hilo_internal_fn): Likewise.
> (widening_fn_p): Likewise.
> (Narrowing_fn_p): Likewise.
> (decomposes_to_hilo_fn_p): Likewise.
> * optabs.cc (commutative_optab_p): Add widening plus optabs.
> * optabs.def (OPTAB_D): Define widen add, sub optabs.
> * tree-cfg.cc (verify_gimple_call): Add checks for new widen
> add and sub IFNs.
> * tree-inline.cc (estimate_num_insns): Return same
> cost for widen add and sub IFNs as previous tree_codes.
> * tree-vect-patterns.cc (vect_recog_widen_op_pattern): Support
> patterns with a hi/lo split.
> (vect_recog_sad_pattern): Refactor to use new IFN codes.
> (vect_recog_widen_plus_pattern): Likewise.
> (vect_recog_widen_minus_pattern): Likewise.
> (vect_recog_average_pattern): Likewise.
> * tree-vect-stmts.cc (vectorizable_conversion): Add support for
> _HILO IFNs.
> (supportable_widening_operation): Likewise.
> * tree.def (WIDEN_SUM_EXPR): Update example to use new IFNs.
>
> gcc/testsuite/ChangeLog:
>
> * gcc.target/aarch64/vect-widen-add.c: Test that new
> IFN_VEC_WIDEN_PLUS is being used.
> * gcc.target/aarch64/vect-widen-sub.c: Test that new
> IFN_VEC_WIDEN_MINUS is being used.
>
--
Richard Biener <[email protected]>
SUSE Software Solutions Germany GmbH, Frankenstrasse 146, 90461 Nuernberg,
Germany; GF: Ivo Totev, Andrew Myers, Andrew McDonald, Boudien Moerman;
HRB 36809 (AG Nuernberg)