Re: [PATCH 2/3] Refactor widen_plus as internal_fn

Andre Vieira (lists) via Gcc-patches Fri, 12 May 2023 06:56:10 -0700



On 12/05/2023 14:28, Richard Biener wrote:

On Fri, 12 May 2023, Andre Vieira (lists) wrote:

I have dealt with, I think..., most of your comments. There's quite a few
changes, I think it's all a bit simpler now. I made some other changes to the
costing in tree-inline.cc and gimple-range-op.cc in which I try to preserve
the same behaviour as we had with the tree codes before. Also added some extra
checks to tree-cfg.cc that made sense to me.

I am still regression testing the gimple-range-op change, as that was a last
minute change, but the rest survived a bootstrap and regression test on
aarch64-unknown-linux-gnu.

cover letter:

This patch replaces the existing tree_code widen_plus and widen_minus
patterns with internal_fn versions.

DEF_INTERNAL_OPTAB_WIDENING_HILO_FN and DEF_INTERNAL_OPTAB_NARROWING_HILO_FN
are like DEF_INTERNAL_SIGNED_OPTAB_FN and DEF_INTERNAL_OPTAB_FN respectively
except they provide convenience wrappers for defining conversions that require
a hi/lo split.  Each definition for <NAME> will require optabs for _hi and _lo
and each of those will also require a signed and unsigned version in the case
of widening. The hi/lo pair is necessary because the widening and narrowing
operations take n narrow elements as inputs and return n/2 wide elements as
outputs. The 'lo' operation operates on the first n/2 elements of input. The
'hi' operation operates on the second n/2 elements of input. Defining an
internal_fn along with hi/lo variations allows a single internal function to
be returned from a vect_recog function that will later be expanded to hi/lo.


  For example:
  IFN_VEC_WIDEN_PLUS -> IFN_VEC_WIDEN_PLUS_HI, IFN_VEC_WIDEN_PLUS_LO
for aarch64: IFN_VEC_WIDEN_PLUS_HI   -> vec_widen_<su>add_hi_<mode> ->
(u/s)addl2
                        IFN_VEC_WIDEN_PLUS_LO  -> vec_widen_<su>add_lo_<mode>
-> (u/s)addl

This gives the same functionality as the previous WIDEN_PLUS/WIDEN_MINUS tree
codes which are expanded into VEC_WIDEN_PLUS_LO, VEC_WIDEN_PLUS_HI.


What I still don't understand is how we are so narrowly focused on
HI/LO?  We need a combined scalar IFN for pattern selection (not
sure why that's now called _HILO, I expected no suffix).  Then there's
three possibilities the target can implement this:

  1) with a widen_[su]add<mode> instruction - I _think_ that's what
     RISCV is going to offer since it is a target where vector modes
     have "padding" (aka you cannot subreg a V2SI to get V4HI).  Instead
     RVV can do a V4HI to V4SI widening and widening add/subtract
     using vwadd[u] and vwsub[u] (the HI->SI widening is actually
     done with a widening add of zero - eh).
     IIRC GCN is the same here.
  2) with a widen_[su]add{_lo,_hi}<mode> combo - that's what the tree
     codes currently support (exclusively)
  3) similar, but widen_[su]add{_even,_odd}<mode>

that said, things like decomposes_to_hilo_fn_p look to paint us into
a 2) corner without good reason.

I was kind of just keeping the naming, I had forgotten to mention I wasalso going to add _EVENODD but you are right, the pattern selection IFNdoes not need to be restrictive.

And then at supportable_widening_operation we could check what thetarget offers support for (either 1, 2 or 3). We can then actually justget rid of decomposes_to_hilo_fn_p and just assume that for allnarrowing or widening IFN's there are optabs (that may or may not beimplemented by a target) for all three variants

Having said that, that means we should have an optab to cover 1, whichshould probably just have the original name. Let me write it out...

Say we have a IFN_VEC_WIDEN_PLUS pattern and assume its signed,supportable_widening_operation would then first check if the targetsupported vec_widen_sadd_optab for say V8HI -> V8SI? Risc-V would takethis path I guess?


If the target doesn't then it could check for support for:
vec_widen_sadd_lo_optab V4HI -> V4SI
vec_widen_sadd_hi_optab V4HI -> V4SI

AArch64 Advanced SIMD would implement this.

If the target still didn't support this it would check for (not sureabout the modes here):

vec_widen_sadd_even_optab VNx8HI -> VNx4SI
vec_widen_sadd_odd_optab VNx8HI -> VNx4SI

This is one SVE would implement.


So that would mean that I'd probably end up rewriting

#define DEF_INTERNAL_OPTAB_WIDENING_FN (NAME, FLAGS, SELECTOR, SOPTAB,UOPTAB, TYPE)

as:
for1)

DEF_INTERNAL_SIGNED_OPTAB_FN (NAME, FLAGS, SELECTOR, SOPTAB, UOPTAB,TYPE)


for 2)

DEF_INTERNAL_SIGNED_OPTAB_FN (NAME##_LO, FLAGS, SELECTOR, SOPTAB,UOPTAB, TYPE)DEF_INTERNAL_SIGNED_OPTAB_FN (NAME##_HI, FLAGS, SELECTOR, SOPTAB,UOPTAB, TYPE)


for 3)

DEF_INTERNAL_SIGNED_OPTAB_FN (NAME##_EVEN, FLAGS, SELECTOR, SOPTAB,UOPTAB, TYPE)DEF_INTERNAL_SIGNED_OPTAB_FN (NAME##_ODD, FLAGS, SELECTOR, SOPTAB,UOPTAB, TYPE)

And the same for narrowing (but with DEF_INTERNAL_OPTAB_FN instead ofSIGNED_OPTAB).

So each widening and narrowing IFN would have optabs for all itsvariants and each target implements the ones it supports.

I'm happy to do this, but implementing support to handle the 1 and 3variants without having optabs for them right now seems a bit odd and itwould delay this patch, so I suggest I add the framework and the optabsbut leave adding the vectorizer support for later? I can add comments towhere I think that should go.

Richard.

gcc/ChangeLog:

2023-05-12  Andre Vieira  <[email protected]>
             Joel Hutton  <[email protected]>
             Tamar Christina  <[email protected]>

         * config/aarch64/aarch64-simd.md (vec_widen_<su>addl_lo_<mode>):
Rename
         this ...
         (vec_widen_<su>add_lo_<mode>): ... to this.
         (vec_widen_<su>addl_hi_<mode>): Rename this ...
         (vec_widen_<su>add_hi_<mode>): ... to this.
         (vec_widen_<su>subl_lo_<mode>): Rename this ...
         (vec_widen_<su>sub_lo_<mode>): ... to this.
         (vec_widen_<su>subl_hi_<mode>): Rename this ...
         (vec_widen_<su>sub_hi_<mode>): ...to this.
         * doc/generic.texi: Document new IFN codes.
        * internal-fn.cc (DEF_INTERNAL_OPTAB_WIDENING_HILO_FN): Macro to
        define an
         internal_fn that expands into multiple internal_fns for widening.
         (DEF_INTERNAL_OPTAB_NARROWING_HILO_FN): Likewise but for narrowing.
        (ifn_cmp): Function to compare ifn's for sorting/searching.
        (lookup_hilo_internal_fn): Add lookup function.
        (commutative_binary_fn_p): Add widen_plus fn's.
        (widening_fn_p): New function.
        (narrowing_fn_p): New function.
        (decomposes_to_hilo_fn_p): New function.
                 (direct_internal_fn_optab): Change visibility.
        * internal-fn.def (DEF_INTERNAL_OPTAB_WIDENING_HILO_FN): Define
     widening
     plus,minus functions.
        (VEC_WIDEN_PLUS): Replacement for VEC_WIDEN_PLUS_EXPR tree code.
        (VEC_WIDEN_MINUS): Replacement for VEC_WIDEN_MINUS_EXPR tree code.
        * internal-fn.h (GCC_INTERNAL_FN_H): Add headers.
                 (direct_internal_fn_optab): Declare new prototype.
        (lookup_hilo_internal_fn): Likewise.
        (widening_fn_p): Likewise.
        (Narrowing_fn_p): Likewise.
        (decomposes_to_hilo_fn_p): Likewise.
        * optabs.cc (commutative_optab_p): Add widening plus optabs.
        * optabs.def (OPTAB_D): Define widen add, sub optabs.
         * tree-cfg.cc (verify_gimple_call): Add checks for new widen
         add and sub IFNs.
         * tree-inline.cc (estimate_num_insns): Return same
         cost for widen add and sub IFNs as previous tree_codes.
        * tree-vect-patterns.cc (vect_recog_widen_op_pattern): Support
     patterns with a hi/lo split.
         (vect_recog_sad_pattern): Refactor to use new IFN codes.
         (vect_recog_widen_plus_pattern): Likewise.
         (vect_recog_widen_minus_pattern): Likewise.
         (vect_recog_average_pattern): Likewise.
        * tree-vect-stmts.cc (vectorizable_conversion): Add support for
                 _HILO IFNs.
        (supportable_widening_operation): Likewise.
         * tree.def (WIDEN_SUM_EXPR): Update example to use new IFNs.

gcc/testsuite/ChangeLog:

        * gcc.target/aarch64/vect-widen-add.c: Test that new
     IFN_VEC_WIDEN_PLUS is being used.
        * gcc.target/aarch64/vect-widen-sub.c: Test that new
     IFN_VEC_WIDEN_MINUS is being used.

Re: [PATCH 2/3] Refactor widen_plus as internal_fn

Reply via email to