RE: [ping][vect-patterns] Refactor widen_plus/widen_minus as internal_fns

2022-06-30 Thread Joel Hutton via Gcc-patches
> We can go with a private vect_gimple_build function until we sort out the API > issue to unblock Tamar (I'll reply to Richard's reply with further thoughts on > this) > Done. > > Similarly are you ok with the use of gimple_extract_op? I would lean > towards using it as it is cleaner, but I don'

[ping][vect-patterns] Refactor widen_plus/widen_minus as internal_fns

2022-06-09 Thread Joel Hutton via Gcc-patches
> Before I make any changes, I'd like to check we're all on the same page. > > richi, are you ok with the gimple_build function, perhaps with a different > name if you are concerned with overloading? we could use gimple_ch_build > or gimple_code_helper_build? > > Similarly are you ok with the use

RE: [ping][vect-patterns] Refactor widen_plus/widen_minus as internal_fns

2022-06-07 Thread Joel Hutton via Gcc-patches
Thanks Richard, > I thought the potential problem with the above is that gimple_build is a > folding interface, so in principle it's allowed to return an existing SSA_NAME > set by an existing statement (or even a constant). > I think in this context we do need to force a new statement to be creat

RE: [ping][vect-patterns] Refactor widen_plus/widen_minus as internal_fns

2022-06-06 Thread Joel Hutton via Gcc-patches
> > Patches attached. They already incorporated the .cc rename, now > > rebased to be after the change to tree.h > > @@ -1412,8 +1412,7 @@ vect_recog_widen_op_pattern (vec_info *vinfo, >2, oprnd, half_type, unprom, vectype); > >tree var = vect_recog_temp_ssa_var (itype

RE: [ping][vect-patterns] Refactor widen_plus/widen_minus as internal_fns

2022-05-31 Thread Joel Hutton via Gcc-patches
> Can you post an updated patch (after the .cc renaming, and code_helper > now already moved to tree.h). > > Thanks, > Richard. Patches attached. They already incorporated the .cc rename, now rebased to be after the change to tree.h Joel 0001-Refactor-to-allow-internal_fn-s.patch Description:

[ping][vect-patterns] Refactor widen_plus/widen_minus as internal_fns

2022-05-25 Thread Joel Hutton via Gcc-patches
Ping! Just checking there is still interest in this. I'm assuming you've been busy with release. Joel > -Original Message- > From: Joel Hutton > Sent: 13 April 2022 16:53 > To: Richard Sandiford > Cc: Richard Biener ; gcc-patches@gcc.gnu.org > Subject: [vect-patterns] Refactor widen_pl

[vect-patterns] Refactor widen_plus/widen_minus as internal_fns

2022-04-13 Thread Joel Hutton via Gcc-patches
Hi all, These patches refactor the widening patterns in vect-patterns to use internal_fn instead of tree_codes. Sorry about the delay, some changes to master made it a bit messier. Bootstrapped and regression tested on aarch64. Joel > > diff --git a/gcc/tree-vect-patterns.c b/gcc/tree-vect-pa

RE: GCC 11 backport does not build (no "directly_supported_p") - was: Re: pr103523: Check for PLUS/MINUS support

2021-12-14 Thread Joel Hutton via Gcc-patches
> + if (ot_plus == unknown_optab > + || ot_minus == unknown_optab > + || optab_handler (ot_minus, TYPE_MODE (step_vectype)) == > CODE_FOR_nothing > + || optab_handler (ot_plus, TYPE_MODE (step_vectype)) == > + CODE_FOR_nothing) > return false; > > Won't optab_handler just retu

RE: GCC 11 backport does not build (no "directly_supported_p") - was: Re: pr103523: Check for PLUS/MINUS support

2021-12-14 Thread Joel Hutton via Gcc-patches
this is only present in: > >gcc/tree-vect-loop.c: if (!directly_supported_p (PLUS_EXPR, step_vectype) >gcc/tree-vect-loop.c: || !directly_supported_p (MINUS_EXPR, >step_vectype)) > >That's different on mainline, which offers that function. Just as a reminder, backports need regu

Re: GCC 11 backport does not build (no "directly_supported_p") - was: Re: pr103523: Check for PLUS/MINUS support

2021-12-13 Thread Joel Hutton via Gcc-patches
pe) >gcc/tree-vect-loop.c: || !directly_supported_p (MINUS_EXPR, >step_vectype)) > >That's different on mainline, which offers that function. Just as a reminder, backports need regular bootstrap and regtest validation on the respective branches. Richard. >Tobias >

Re: pr103523: Check for PLUS/MINUS support

2021-12-10 Thread Joel Hutton via Gcc-patches
ok for backport to 11? From: Richard Sandiford Sent: 10 December 2021 10:22 To: Joel Hutton Cc: GCC Patches ; Richard Biener Subject: Re: pr103523: Check for PLUS/MINUS support Joel Hutton writes: > Hi all, > > This is to address pr103523. > > bootstrapped and

pr103523: Check for PLUS/MINUS support

2021-12-10 Thread Joel Hutton via Gcc-patches
Hi all, This is to address pr103523. bootstrapped and regression tested on aarch64. Check for PLUS_EXPR/MINUS_EXPR support in vectorizable_induction. PR103523 is an ICE on valid code: void d(float *a, float b, int c) {     float e;     for (; c; c--, e += b)       a[c] = e; } This is due to no
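
For readers who want to execute the loop rather than only compile it, the following is a minimal sketch of the same loop shape. It is not the testcase from the patch: the reproducer above leaves `e` uninitialized, which is acceptable for a compile-only ICE test but not for a run, so this version takes a starting value. The function name `fill_induction` and the `init` parameter are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Same loop shape as the PR103523 reproducer: a float induction
   variable `e` is stepped by `b` while the counter `c` runs down to
   zero.  Unlike the reproducer, `e` starts from a caller-supplied
   value so that the stored results are well defined.  */
void fill_induction(float *a, float b, int c, float init)
{
    assert(c >= 0);          /* a negative count would index below a[0] */
    float e = init;
    for (; c; c--, e += b)
        a[c] = e;            /* writes a[c] .. a[1]; a[0] is never touched */
}
```

Vectorizing this loop needs a vector add (or subtract) on the step type to advance `e`, which is presumably why the fix checks for PLUS_EXPR/MINUS_EXPR support before accepting the induction.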

[ping][vect-patterns] Refactor widen_plus/widen_minus as internal_fns

2021-11-25 Thread Joel Hutton via Gcc-patches
Just a quick ping to check this hasn't been forgotten. > -Original Message- > From: Joel Hutton > Sent: 12 November 2021 11:42 > To: Richard Biener > Cc: gcc-patches@gcc.gnu.org; Richard Sandiford > > Subject: RE: [vect-patterns] Refactor widen_plus/widen_minus as > internal_fns > > > p

RE: [vect-patterns] Refactor widen_plus/widen_minus as internal_fns

2021-11-16 Thread Joel Hutton via Gcc-patches
Updated patch 2 with explanation included in commit message and changes requested. Bootstrapped and regression tested on aarch64 > -Original Message- > From: Joel Hutton > Sent: 12 November 2021 11:42 > To: Richard Biener > Cc: gcc-patches@gcc.gnu.org; Richard Sandiford > > Subject: RE:

RE: [vect-patterns] Refactor widen_plus/widen_minus as internal_fns

2021-11-12 Thread Joel Hutton via Gcc-patches
> please use #define INCLUDE_MAP before the system.h include instead. > Is it really necessary to build a new std::map for each optab lookup?! > That looks quite ugly and inefficient. We'd usually - if necessary at all - > build > a auto_vec > and .sort () and .bsearch () it. Ok, I'll rework this
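
The review comment above suggests a sorted vector plus binary search in place of a freshly built std::map. As a rough illustration of that idea only, here is a plain C sketch using `qsort` and `bsearch` from the standard library. It does not use GCC's `auto_vec`, and `struct code_entry`, `sort_table` and `lookup` are hypothetical names, not anything from the patch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical key/value pair: an operation code mapped to some value.  */
struct code_entry {
    int code;
    int value;
};

/* Order entries by code; the same comparator serves qsort and bsearch.  */
static int cmp_code(const void *a, const void *b)
{
    const struct code_entry *x = (const struct code_entry *)a;
    const struct code_entry *y = (const struct code_entry *)b;
    return (x->code > y->code) - (x->code < y->code);
}

/* Sort the table once, up front.  */
void sort_table(struct code_entry *table, size_t n)
{
    qsort(table, n, sizeof *table, cmp_code);
}

/* Look up a code in the sorted table; returns NULL when it is absent.  */
const struct code_entry *lookup(const struct code_entry *table, size_t n,
                                int code)
{
    struct code_entry key = { code, 0 };
    return (const struct code_entry *)bsearch(&key, table, n,
                                              sizeof *table, cmp_code);
}
```

The table is sorted once and each lookup is then a binary search with no allocation, which is the cost profile the reviewer appears to be asking for.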

[vect-patterns] Refactor widen_plus/widen_minus as internal_fns

2021-11-11 Thread Joel Hutton via Gcc-patches
Hi all, This refactor allows widening vect patterns (such as widen_plus/widen_minus) to be represented as either internal_fns or tree_codes and replaces the current widen_plus/widen_minus with internal_fn versions. This refactor is split into 3 patches. Bootstrapped and regression tested on aar

[ping][vect-patterns][RFC] Refactor widening patterns to allow internal_fn's

2021-08-17 Thread Joel Hutton via Gcc-patches
Ping. Is there still interest in refactoring vect-patterns to internal_fn's? > -Original Message- > From: Joel Hutton > Sent: 07 June 2021 14:30 > To: gcc-patches@gcc.gnu.org > Cc: Richard Biener ; Richard Sandiford > > Subject: [vect-patterns][RFC] Refactor widening patterns to allow >

[vect-patterns][RFC] Refactor widening patterns to allow internal_fn's

2021-06-07 Thread Joel Hutton via Gcc-patches
Hi all, This refactor allows widening patterns (such as widen_plus/widen_minus) to be represented as either internal_fns or tree_codes. The widening patterns were originally added as tree codes with the expectation that they would be refactored later. [vect-patterns] Refactor as internal_fn's

[vect] Support min/max + index pattern

2021-05-05 Thread Joel Hutton via Gcc-patches
Hi all, looking for some feedback on this, one thing I would like to draw attention to is the fact that this pattern requires 2 separate dependent reductions in the epilogue. The accumulator vector for the maximum/minimum elements can be reduced to a scalar result trivially with a min/max, but
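
As a point of reference, the scalar form of a max + index loop might look like the sketch below. This is an assumed shape, not the testcase from the patch; the name `max_with_index` and the choice of keeping the first occurrence on ties are illustrative.

```c
#include <assert.h>
#include <stddef.h>

/* Return the largest element of a[0..n-1] and store its position in
   *index_out.  On ties the earliest position is kept, because the
   comparison is strict.  */
int max_with_index(const int *a, size_t n, size_t *index_out)
{
    assert(n > 0);
    int best = a[0];
    size_t best_i = 0;
    for (size_t i = 1; i < n; i++) {
        if (a[i] > best) {
            best = a[i];
            best_i = i;
        }
    }
    *index_out = best_i;
    return best;
}
```

In a vectorized version each lane would carry its own running maximum and index, so the final index can only be chosen once the lane maxima have been reduced. That ordering is one way to read the "2 separate dependent reductions" mentioned above.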

RE: [Vect] Fix mask check on Scatter loads/stores

2021-03-10 Thread Joel Hutton via Gcc-patches
>> gcc/testsuite/ChangeLog: >> >> PR target/99102 >> * gcc.target/aarch64/sve/pr99102.c: New test. > >(The filename is out of date, but the git hook would catch that.) Fixed and committed. > >> >> diff --git a/gcc/testsuite/gcc.dg/vect/pr99102.c >> b/gcc/testsuite/gcc.dg/vect/pr99102.c >> new

[Vect] Fix mask check on Scatter loads/stores

2021-03-10 Thread Joel Hutton via Gcc-patches
Hi all, This patch fixes PR99102. For masked gather loads/scatter stores the 'loop_masks' variable was checked to be non-null, but the 'final_mask' was the actual mask used. bootstrapped and regression tested on aarch64. Regression tested on aarch64_sve under qemu. [Vect] Fix mask check on Scat

Re: [aarch64][vect] Support V8QI->V8HI WIDEN_ patterns

2021-02-11 Thread Joel Hutton via Gcc-patches
Hi Richard, I've revised the patch, sorry about sloppy formatting in the previous one. Full bootstrap/regression tests are still running, but the changes are pretty trivial. Ok for trunk assuming tests finish clean? >Joel Hutton writes: >> @@ -277,6 +277,81 @@ optab_for_tree_code (enum tree_c

Re: [aarch64][vect] Support V8QI->V8HI WIDEN_ patterns

2021-02-10 Thread Joel Hutton via Gcc-patches
Thanks for the quick review. Updated patch attached. I've addressed your comments below. Tests are still running, OK for trunk assuming tests come out clean? [aarch64][vect] Support V8QI->V8HI WIDEN_ patterns In the case where 8 out of every 16 elements are widened using a widening pattern and

[aarch64][vect] Support V8QI->V8HI WIDEN_ patterns

2021-02-09 Thread Joel Hutton via Gcc-patches
Hi Richards, This patch adds support for the V8QI->V8HI case from widening vect patterns as discussed to target PR98772. Bootstrapped and regression tested on aarch64. [aarch64][vect] Support V8QI->V8HI WIDEN_ patterns In the case where 8 out of every 16 elements are widened using a widening

Re: [RFC] Feedback on approach for adding support for V8QI->V8HI widening patterns

2021-02-03 Thread Joel Hutton via Gcc-patches
> Do you mean a v8qi->v8hi widening subtract or a v16qi->v8hi widening > subtract? I mean the latter, that seemed to be what richi was suggesting previously. > The problem with the latter is that we need to fill the > extra unused elements with something and remove them later. That's fair eno

Re: [RFC] Feedback on approach for adding support for V8QI->V8HI widening patterns

2021-02-03 Thread Joel Hutton via Gcc-patches
>> So emit a v4qi->v8qi gimple conversion >> then a regular widen_lo/hi using the existing backend patterns/optabs? > >I was thinking of using a v8qi->v8hi convert on each operand followed >by a normal v8hi subtraction. That's what we'd generate if the target >didn't define the widening patterns.

Re: [RFC] Feedback on approach for adding support for V8QI->V8HI widening patterns

2021-02-03 Thread Joel Hutton via Gcc-patches
>>> In practice this will only affect targets that choose to use mixed >>> vector sizes, and I think it's reasonable to optimise only for the >>> case in which such targets support widening conversions. So what >>> do you think about the idea of emitting separate conversions and >>> a normal subtr

[RFC] Feedback on approach for adding support for V8QI->V8HI widening patterns

2021-02-01 Thread Joel Hutton via Gcc-patches
Hi Richard(s), I'm just looking to see if I'm going about this the right way, based on the discussion we had on IRC. I've managed to hack something together, I've attached a (very) WIP patch which gives the correct codegen for the testcase in question (https://gcc.gnu.org/bugzilla/show_bug.cgi?

Re: [AArch64] Remove backend support for widen-sub

2021-01-21 Thread Joel Hutton via Gcc-patches
From: Richard Sandiford Sent: 21 January 2021 13:40 To: Richard Biener Cc: Joel Hutton via Gcc-patches ; Joel Hutton Subject: Re: [AArch64] Remove backend support for widen-sub Richard Biener writes: > On Thu, 21 Jan 2021, Richard Sandiford wrote: > >> Joe

[AArch64] Remove backend support for widen-sub

2021-01-21 Thread Joel Hutton via Gcc-patches
Hi all, This patch removes support for the widening subtract operation in the aarch64 backend as it is causing a performance regression. In the following example: #include extern void wdiff( int16_t d[16], uint8_t *restrict pix1, uint8_t *restrict pix2) {    for( int y = 0; y < 4; y++ )   {
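
To make the operation concrete, here is a scalar sketch of what a widening subtract computes: each 8-bit operand is widened to 16 bits before the subtraction, so the difference cannot wrap. This is not the body of the truncated `wdiff` example above; `widen_sub_u8` and its parameters are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* d[i] = (int16_t) p1[i] - (int16_t) p2[i] for i in [0, n).
   The operands are widened first, so the result range is
   -255 .. 255 and always fits in int16_t.  */
void widen_sub_u8(int16_t *restrict d, const uint8_t *restrict p1,
                  const uint8_t *restrict p2, size_t n)
{
    assert(d != NULL && p1 != NULL && p2 != NULL);
    for (size_t i = 0; i < n; i++)
        d[i] = (int16_t)((int16_t)p1[i] - (int16_t)p2[i]);
}
```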

[2/2][VECT] pr97929 fix

2020-12-10 Thread Joel Hutton via Gcc-patches
Hi all, This patch addresses PR97929 by adding a missing case for WIDEN_PLUS/MINUS in vect_get_smallest_scalar_type. It also introduces a test to check for regression. One thing to note is that I saw a failure on c-c++-common/builtins.c which disappeared when I ran the test again. I assume th

[1/2][TREE] Add WIDEN_PLUS, WIDEN_MINUS pretty print

2020-12-10 Thread Joel Hutton via Gcc-patches
Hi all, This adds missing pretty print for WIDEN_PLUS/MINUS and VEC_WIDEN_PLUS/MINUS_HI/LO Bootstrapped and regression tested all together on aarch64. Ok for trunk? Add WIDEN_PLUS, WIDEN_MINUS pretty print Add 'w+'/'w-' as WIDEN_PLUS/WIDEN_MINUS respectively. Add 'VEC_WIDEN_PLUS/MINUS_HI/LO<.

Re: [3/3][aarch64] Add support for vec_widen_shift pattern

2020-11-13 Thread Joel Hutton via Gcc-patches
Tests are still running, but I believe I've addressed all the comments. > > +#include > > + > > SVE targets will need a: > > #pragma GCC target "+nosve" > > here, since we'll generate different code for SVE. Fixed. > > +/* { dg-final { scan-assembler-times "shll\t" 1} } */ > > +/* { dg-final

Re: [2/3][vect] Add widening add, subtract vect patterns

2020-11-13 Thread Joel Hutton via Gcc-patches
Tests are still running, but I believe I've addressed all the comments. > Like Richard said, the new patterns need to be documented in md.texi > and the new tree codes need to be documented in generic.texi. Done. > While we're using tree codes, I think we need to make the naming > consistent wit

Re: [1/3][aarch64] Add aarch64 support for vec_widen_add, vec_widen_sub patterns

2020-11-13 Thread Joel Hutton via Gcc-patches
Tests are still running, but I believe I've addressed the comment. > There are ways in which we could reduce the amount of cut-&-paste here, > but I guess everything is a trade-off between clarity and compactness. > One extreme is to write them all out explicitly, another extreme would > be to hav

[1/3][aarch64] Add aarch64 support for vec_widen_add, vec_widen_sub patterns

2020-11-12 Thread Joel Hutton via Gcc-patches
Hi all, This patch adds backend patterns for vec_widen_add, vec_widen_sub on aarch64. All 3 patches together bootstrapped and regression tested on aarch64. Ok for stage 1? gcc/ChangeLog: 2020-11-12  Joel Hutton           * config/aarch64/aarch64-simd.md: New patterns vec_widen_saddl_lo/hi_ F

[3/3][aarch64] Add support for vec_widen_shift pattern

2020-11-12 Thread Joel Hutton via Gcc-patches
Hi all, This patch adds support in the aarch64 backend for the vec_widen_shift vect-pattern and makes a minor mid-end fix to support it. All 3 patches together bootstrapped and regression tested on aarch64. Ok for stage 1? gcc/ChangeLog: 2020-11-12  Joel Hutton           * config/aarch64/aar

[2/3][vect] Add widening add, subtract vect patterns

2020-11-12 Thread Joel Hutton via Gcc-patches
Hi all, This patch adds widening add and widening subtract patterns to tree-vect-patterns. All 3 patches together bootstrapped and regression tested on aarch64. gcc/ChangeLog: 2020-11-12  Joel Hutton           * expr.c (expand_expr_real_2): add widen_add,widen_subtract cases         * optabs-

[SLP][VECT] Add check to fix 96837

2020-09-29 Thread Joel Hutton via Gcc-patches
Hi All, The following patch adds a simple check to prevent slp stmts from vector constructors being rearranged. vect_attempt_slp_rearrange_stmts tries to rearrange to avoid a load permutation. This fixes PR target/96837 https://gcc.gnu.org/bugzilla/show_bug.cgi?id=96837 gcc/ChangeLog: 2020-09