from:"xionghu luo via Gcc-patches"

[PATCH] rs6000: Fix incorrect RTL for Power LE when removing the UNSPECS [PR106069]

2022-08-07 Thread Xionghu Luo via Gcc-patches

The native RTL expression for vec_mrghw should be the same for BE and LE, as it is register- and endian-independent. So both BE and LE need to generate exactly the same RTL with index [0 4 1 5] when expanding vec_mrghw with vec_select and vec_concat. (set (reg:V4SI 141) (vec_select:V4SI (vec_concat:V8SI
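To make the index list [0 4 1 5] concrete, here is a minimal C sketch of what a vec_select over a vec_concat computes. It is only a model of the RTL semantics described in the snippet, not code from the patch; the function names `select_from_concat` and `merge_high_words` are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Model of (vec_select:V4SI (vec_concat:V8SI a b) (parallel [sel...])).
   Elements are numbered in RTL order: a is 0..3, b is 4..7.  */
static void select_from_concat(const int32_t a[4], const int32_t b[4],
                               const size_t sel[4], int32_t out[4])
{
    int32_t concat[8];

    for (size_t i = 0; i < 4; i++) {
        concat[i] = a[i];
        concat[i + 4] = b[i];
    }
    for (size_t i = 0; i < 4; i++) {
        assert(sel[i] < 8);          /* selector must index the concat */
        out[i] = concat[sel[i]];
    }
}

/* Hypothetical helper: the selector [0 4 1 5] quoted in the snippet.  */
static void merge_high_words(const int32_t a[4], const int32_t b[4],
                             int32_t out[4])
{
    static const size_t sel[4] = {0, 4, 1, 5};
    select_from_concat(a, b, sel, out);
}
```

With a = {10, 11, 12, 13} and b = {20, 21, 22, 23}, the model interleaves the first two elements of each input. Nothing in it depends on byte order, which is the point the snippet makes about the RTL being the same for BE and LE.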

Re: [PATCH v2] rs6000: Fix incorrect RTL for Power LE when removing the UNSPECS [PR106069]

2022-08-09 Thread Xionghu Luo via Gcc-patches

On 2022/8/9 11:01, Kewen.Lin wrote: Hi Xionghu, Thanks for the fix. on 2022/8/8 11:42, Xionghu Luo wrote: The native RTL expression for vec_mrghw should be the same for BE and LE, as it is register- and endian-independent. So both BE and LE need to generate exactly the same RTL with index [0 4 1 5]

Re: [PATCH v2] rs6000: Fix incorrect RTL for Power LE when removing the UNSPECS [PR106069]

2022-08-10 Thread Xionghu Luo via Gcc-patches

On 2022/8/11 01:07, Segher Boessenkool wrote: On Wed, Aug 10, 2022 at 02:39:02PM +0800, Xionghu Luo wrote: On 2022/8/9 11:01, Kewen.Lin wrote: I have some concern on those changed "altivec_*_direct", IMHO the suffix "_direct" is normally to indicate the define_insn is mapped to the correspon

Ping: [PATCH v4] rs6000: Fix incorrect RTL for Power LE when removing the UNSPECS [PR106069]

2023-02-27 Thread Xionghu Luo via Gcc-patches

Hi Segher, Ping this for stage 4... On 2023/2/10 10:59, Xionghu Luo via Gcc-patches wrote: Resend this patch... v4: Update per comments. v3: rename altivec_vmrghb_direct_le to altivec_vmrglb_direct_le to match the actual output ASM vmrglb. Likewise for all similar xxx_direct_le patterns. v2

[PATCH 1/2] gcov: Fix "do-while" structure in case statement leads to incorrect code coverage [PR93680]

2023-03-01 Thread Xionghu Luo via Gcc-patches

When splitting an edge with a self loop, the split edge should be placed right next to edge_in->src; otherwise it may generate latch bbs at different positions for two consecutive self loops. For details, please refer to: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=93680#c4 Regression tested pass on x8

[PATCH 2/2] gcov: Fix incorrect gimple line LOCATION [PR97923]

2023-03-01 Thread Xionghu Luo via Gcc-patches

For a case like the one below, test.c: 1:int foo(char c) 2:{ 3: return ((c >= 'A' && c <= 'Z') 4: || (c >= 'a' && c <= 'z') 5: || (c >= '0' && c <='0'));} the generated line number is incorrect for condition c>='A' of block 2: Thus correct the condition op0 location. gcno diff before and with

Re: [PATCH 2/2] gcov: Fix incorrect gimple line LOCATION [PR97923]

2023-03-02 Thread Xionghu Luo via Gcc-patches

On 2023/3/2 16:16, Richard Biener wrote: On Thu, Mar 2, 2023 at 3:31 AM Xionghu Luo via Gcc-patches wrote: For a case like the one below, test.c: 1:int foo(char c) 2:{ 3: return ((c >= 'A' && c <= 'Z') 4: || (c >= 'a' && c <=

Re: [PATCH 1/2] gcov: Fix "do-while" structure in case statement leads to incorrect code coverage [PR93680]

2023-03-02 Thread Xionghu Luo via Gcc-patches

On 2023/3/2 16:41, Richard Biener wrote: On Thu, Mar 2, 2023 at 3:31 AM Xionghu Luo via Gcc-patches wrote: When splitting an edge with a self loop, the split edge should be placed right next to edge_in->src; otherwise it may generate latch bbs at different positions for two consecutive self lo

Re: [PATCH 1/2] gcov: Fix "do-while" structure in case statement leads to incorrect code coverage [PR93680]

2023-03-05 Thread Xionghu Luo via Gcc-patches

On 2023/3/2 18:45, Richard Biener wrote: small.gcno: 648: block 2:`small.c':1, 3, 4, 6 small.gcno: 688:0145: 36:LINES small.gcno: 700: block 3:`small.c':8, 9 small.gcno: 732:0145: 32:LINES small.gcno: 744:

Re: [PATCH 1/2] gcov: Fix "do-while" structure in case statement leads to incorrect code coverage [PR93680]

2023-03-06 Thread Xionghu Luo via Gcc-patches

On 2023/3/6 16:11, Richard Biener wrote: On Mon, Mar 6, 2023 at 8:22 AM Xionghu Luo wrote: On 2023/3/2 18:45, Richard Biener wrote: small.gcno: 648: block 2:`small.c':1, 3, 4, 6 small.gcno: 688:0145: 36:LINES small.gcno: 700: blo

Re: [PATCH 1/2] gcov: Fix "do-while" structure in case statement leads to incorrect code coverage [PR93680]

2023-03-07 Thread Xionghu Luo via Gcc-patches

On 2023/3/7 16:53, Richard Biener wrote: On Tue, 7 Mar 2023, Xionghu Luo wrote: Unfortunately this change (flag_test_coverage -> !optimize ) caused hundreds of gfortran cases to fail execution with O0. Take gfortran.dg/index.f90 for example: .gimple: __attribute__((fn spec (". "))) void p

[PATCH v3] gcov: Fix "do-while" structure in case statement leads to incorrect code coverage [PR93680]

2023-03-08 Thread Xionghu Luo via Gcc-patches

On 2023/3/7 19:25, Richard Biener wrote: It would be nice to avoid creating blocks / preserving labels we'll immediately remove again. For that we do need some analysis before creating basic-blocks that determines whether a label is possibly reached by a non-fallthru edge. : p = 0; switch

Re: [PATCH v4] gcov: Fix "do-while" structure in case statement leads to incorrect code coverage [PR93680]

2023-03-13 Thread Xionghu Luo via Gcc-patches

On 2023/3/9 20:02, Richard Biener wrote: On Wed, 8 Mar 2023, Xionghu Luo wrote: On 2023/3/7 19:25, Richard Biener wrote: It would be nice to avoid creating blocks / preserving labels we'll immediately remove again. For that we do need some analysis before creating basic-blocks that determ

[PATCH v4] gcov: Fix "do-while" structure in case statement leads to incorrect code coverage [PR93680]

2023-03-15 Thread Xionghu Luo via Gcc-patches

On 2023/3/9 20:02, Richard Biener wrote: On Wed, 8 Mar 2023, Xionghu Luo wrote: On 2023/3/7 19:25, Richard Biener wrote: It would be nice to avoid creating blocks / preserving labels we'll immediately remove again. For that we do need some analysis before creating basic-blocks that determ

[PATCH] rs6000: Fix vec insert ilp32 ICE and test failures [PR98799]

2021-01-25 Thread Xionghu Luo via Gcc-patches

From: "luo...@cn.ibm.com" UNSPEC_SI_FROM_SF is not supported when TARGET_DIRECT_MOVE_64BIT is false for -m32, don't generate VIEW_CONVERT_EXPR(ARRAY_REF) for variable vector insert. Remove rs6000_expand_vector_set_var helper function, adjust the p8 and p9 definitions position and make them stati

Re: [PATCH] rs6000: Fix vec insert ilp32 ICE and test failures [PR98799]

2021-01-26 Thread Xionghu Luo via Gcc-patches

Hi, On 2021/1/27 03:00, David Edelsohn wrote: > On Tue, Jan 26, 2021 at 2:46 AM Xionghu Luo wrote: >> >> From: "luo...@cn.ibm.com" >> >> UNSPEC_SI_FROM_SF is not supported when TARGET_DIRECT_MOVE_64BIT >> is false for -m32, don't generate VIEW_CONVERT_EXPR(ARRAY_REF) for >> variable vector inser

[PATCH] testsuite: Run vec_insert case on P8 and P9 with option specified

2021-01-28 Thread Xionghu Luo via Gcc-patches

Move common functions to header file for cleanup. gcc/testsuite/ChangeLog: 2021-01-27 Xionghu Luo * gcc.target/powerpc/pr79251.p8.c: Move definition to ... * gcc.target/powerpc/pr79251.h: ...this. * gcc.target/powerpc/pr79251.p9.c: Likewise. * gcc.target/powerp

[PATCH] testsuite: Update pr79251 ilp32 store regex.

2021-01-31 Thread Xionghu Luo via Gcc-patches

BE ilp32 Linux generates extra stack stwu instructions which shouldn't be counted in, \m … \M is needed around each instruction, not just the beginning and end of the entire pattern. Pre-approved, committing. gcc/testsuite/ChangeLog: 2021-02-01 Xionghu Luo * gcc.target/powerpc/pr79251

[PATCH] rs6000: Convert the vector element register to SImode [PR98914]

2021-02-03 Thread Xionghu Luo via Gcc-patches

v[k] will also be expanded to IFN VEC_SET if k is long type when built with -Og. -O0 didn't expose the issue because v is TREE_ADDRESSABLE; -O1 and above also didn't capture it because v[k] is not optimized to VIEW_CONVERT_EXPR(v)[k_1]. vec_insert defines the element argument type to be signed

Ping: [PATCH] rs6000: Convert the vector element register to SImode [PR98914]

2021-02-17 Thread Xionghu Luo via Gcc-patches

Gentle ping, thanks. On 2021/2/3 17:01, Xionghu Luo wrote: v[k] will also be expanded to IFN VEC_SET if k is long type when built with -Og. -O0 didn't expose the issue because v is TREE_ADDRESSABLE; -O1 and above also didn't capture it because v[k] is not optimized to VIEW_CONVERT_EXPR(v)[k

[PATCH v2] rs6000: Convert the vector element register to SImode [PR98914]

2021-02-23 Thread Xionghu Luo via Gcc-patches

vec_insert defines the element argument type to be signed int by ELFv2 ABI, When expanding a vector with a variable rtx, convert the rtx type SImode. gcc/ChangeLog: 2021-02-24 Xionghu Luo PR target/98914 * config/rs6000/rs6000.c (rs6000_expand_vector_set): Convert elt_

Re: [PATCH v2] rs6000: Convert the vector element register to SImode [PR98914]

2021-02-24 Thread Xionghu Luo via Gcc-patches

On 2021/2/25 00:57, Segher Boessenkool wrote: > Hi! > > On Wed, Feb 24, 2021 at 09:06:24AM +0800, Xionghu Luo wrote: >> vec_insert defines the element argument type to be signed int by ELFv2 >> ABI, When expanding a vector with a variable rtx, convert the rtx type >> SImode. > > But that is tr

Ping: [PATCH v2] rs6000: Convert the vector element register to SImode [PR98914]

2021-03-02 Thread Xionghu Luo via Gcc-patches

On 2021/2/25 14:33, Xionghu Luo via Gcc-patches wrote: > > > On 2021/2/25 00:57, Segher Boessenkool wrote: >> Hi! >> >> On Wed, Feb 24, 2021 at 09:06:24AM +0800, Xionghu Luo wrote: >>> vec_insert defines the element argument type to be signed int by ELFv2

[PATCH] Fix loop split incorrect count and probability

2021-08-03 Thread Xionghu Luo via Gcc-patches

The loop split condition is moved between loop1 and loop2, so the split bb's count and probability should also be duplicated instead of (100% vs INV); secondly, the counts of the original loop1 and loop2 need to be proportional to the original loop. Regression tested pass, OK for master? diff base/loop-cond-split

Re: [PATCH] Fix loop split incorrect count and probability

2021-08-03 Thread Xionghu Luo via Gcc-patches

I'd like to split this patch: https://gcc.gnu.org/pipermail/gcc-patches/2021-August/576488.html into two patches: 0001-Fix-loop-split-incorrect-count-and-probability.patch 0002-Don-t-move-cold-code-out-of-loop-by-checking-bb-coun.patch since they are solving two different things, please help to r

Re: [PATCH] Fix loop split incorrect count and probability

2021-08-08 Thread Xionghu Luo via Gcc-patches

Thanks, On 2021/8/6 19:46, Richard Biener wrote: > On Tue, 3 Aug 2021, Xionghu Luo wrote: > >> loop split condition is moved between loop1 and loop2, the split bb's >> count and probability should also be duplicated instead of (100% vs INV), >> secondly, the original loop1 and loop2 count need be

Re: [RFC] Don't move cold code out of loop by checking bb count

2021-08-09 Thread Xionghu Luo via Gcc-patches

Hi, On 2021/8/6 20:15, Richard Biener wrote: > On Mon, Aug 2, 2021 at 7:05 AM Xiong Hu Luo wrote: >> >> There was a patch trying to avoid move cold block out of loop: >> >> https://gcc.gnu.org/pipermail/gcc/2014-November/215551.html >> >> Richard suggested to "never hoist anything from a bb with

Re: [PATCH] Fix loop split incorrect count and probability

2021-08-11 Thread Xionghu Luo via Gcc-patches

On 2021/8/10 22:47, Richard Biener wrote: > On Mon, 9 Aug 2021, Xionghu Luo wrote: > >> Thanks, >> >> On 2021/8/6 19:46, Richard Biener wrote: >>> On Tue, 3 Aug 2021, Xionghu Luo wrote: >>> loop split condition is moved between loop1 and loop2, the split bb's count and probability sho

Re: [PATCH] Fix loop split incorrect count and probability

2021-08-11 Thread Xionghu Luo via Gcc-patches

On 2021/8/11 17:16, Richard Biener wrote: On Wed, 11 Aug 2021, Xionghu Luo wrote: On 2021/8/10 22:47, Richard Biener wrote: On Mon, 9 Aug 2021, Xionghu Luo wrote: Thanks, On 2021/8/6 19:46, Richard Biener wrote: On Tue, 3 Aug 2021, Xionghu Luo wrote: loop split condition is moved be

Re: [PATCH] Fix incorrect computation in fill_always_executed_in_1

2021-08-16 Thread Xionghu Luo via Gcc-patches

Hi, On 2021/8/16 19:46, Richard Biener wrote: On Mon, 16 Aug 2021, Xiong Hu Luo wrote: It seems to me that ALWAYS_EXECUTED_IN is not computed correctly for nested loops. inn_loop is updated to inner loop, so it need be restored when exiting from innermost loop. With this patch, the store inst

Re: [PATCH] Fix incorrect computation in fill_always_executed_in_1

2021-08-16 Thread Xionghu Luo via Gcc-patches

On 2021/8/17 13:17, Xionghu Luo via Gcc-patches wrote: Hi, On 2021/8/16 19:46, Richard Biener wrote: On Mon, 16 Aug 2021, Xiong Hu Luo wrote: It seems to me that ALWAYS_EXECUTED_IN is not computed correctly for nested loops. inn_loop is updated to inner loop, so it need be restored when

[PATCH v2] Fix incomplete computation in fill_always_executed_in_1

2021-08-17 Thread Xionghu Luo via Gcc-patches

On 2021/8/17 15:12, Richard Biener wrote: > On Tue, 17 Aug 2021, Xionghu Luo wrote: > >> Hi, >> >> On 2021/8/16 19:46, Richard Biener wrote: >>> On Mon, 16 Aug 2021, Xiong Hu Luo wrote: >>> It seems to me that ALWAYS_EXECUTED_IN is not computed correctly for nested loops. inn_loop is

Re: [PATCH v2] Fix incomplete computation in fill_always_executed_in_1

2021-08-18 Thread Xionghu Luo via Gcc-patches

On 2021/8/17 17:10, Xionghu Luo via Gcc-patches wrote: > > > On 2021/8/17 15:12, Richard Biener wrote: >> On Tue, 17 Aug 2021, Xionghu Luo wrote: >> >>> Hi, >>> >>> On 2021/8/16 19:46, Richard Biener wrote: >>>> On Mon, 16

[PATCH v2] Don't move cold code out of loop by checking bb count

2021-08-18 Thread Xionghu Luo via Gcc-patches

On 2021/8/10 12:25, Ulrich Drepper wrote: > On Tue, Aug 10, 2021 at 4:03 AM Xionghu Luo via Gcc-patches > wrote: >> For this case, theoretically I think the master GCC will optimize it to: >> >>invariant; >>for (;;) >>

Re: [PATCH v2] Fix incomplete computation in fill_always_executed_in_1

2021-08-24 Thread Xionghu Luo via Gcc-patches

On 2021/8/19 20:11, Richard Biener wrote: >> - class loop *inn_loop = loop; >> >> if (ALWAYS_EXECUTED_IN (loop->header) == NULL) >> { >> @@ -3232,19 +3231,6 @@ fill_always_executed_in_1 (class loop *loop, sbitmap >> contains_call) >> to disprove this if possible). */ >>

Re: [PATCH v3] Fix incomplete computation in fill_always_executed_in_1

2021-08-25 Thread Xionghu Luo via Gcc-patches

On 2021/8/24 16:20, Richard Biener wrote: > On Tue, 24 Aug 2021, Xionghu Luo wrote: > >> >> >> On 2021/8/19 20:11, Richard Biener wrote: - class loop *inn_loop = loop; if (ALWAYS_EXECUTED_IN (loop->header) == NULL) { @@ -3232,19 +3231,6 @@ fill_always_e

Re: [PATCH v3] Fix incomplete computation in fill_always_executed_in_1

2021-08-30 Thread Xionghu Luo via Gcc-patches

On 2021/8/27 15:45, Richard Biener wrote: On Thu, 26 Aug 2021, Xionghu Luo wrote: On 2021/8/24 16:20, Richard Biener wrote: On Tue, 24 Aug 2021, Xionghu Luo wrote: On 2021/8/19 20:11, Richard Biener wrote: - class loop *inn_loop = loop; if (ALWAYS_EXECUTED_IN (loop->header

Re: [PATCH v3] Fix incomplete computation in fill_always_executed_in_1

2021-08-31 Thread Xionghu Luo via Gcc-patches

On 2021/8/30 17:19, Richard Biener wrote: bitmap_set_bit (work_set, loop->header->index); + unsigned bb_index; - for (i = 0; i < loop->num_nodes; i++) - { - edge_iterator ei; - bb = bbs[i]; + unsigned array_size = last_basic_block_for_fn (cfun) + 1;

[PATCH v3 1/2] IFN: Implement IFN_VEC_SET for ARRAY_REF with VIEW_CONVERT_EXPR

2020-09-21 Thread xionghu luo via Gcc-patches

Thanks for the review, On 2020/9/21 16:31, Richard Biener wrote: + +static gimple * +gimple_expand_vec_set_expr (gimple_stmt_iterator *gsi) +{ + enum tree_code code; + gcall *new_stmt = NULL; + gassign *ass_stmt = NULL; + + /* Only consider code == GIMPLE_ASSIGN. */ + gassign *stmt = dyn_

Re: [PATCH v3 1/2] IFN: Implement IFN_VEC_SET for ARRAY_REF with VIEW_CONVERT_EXPR

2020-09-23 Thread xionghu luo via Gcc-patches

Hi, On 2020/9/23 19:33, Richard Biener wrote: >> The first loop is for rhs stmt process, this loop is for lhs stmt process. >> I thought vec_extract also need to generate IFN before, but seems not >> necessary now? And that the first loop needs to update the lhs stmt while >> then second doesn't.

Re: [PATCH v2 2/2] rs6000: Expand vec_insert in expander instead of gimple [PR79251]

2020-09-24 Thread xionghu luo via Gcc-patches

Hi Segher, The attached two patches are updated and split from "[PATCH v2 2/2] rs6000: Expand vec_insert in expander instead of gimple [PR79251]" as your comments. [PATCH v3 2/3] rs6000: Fix lvsl&lvsr mode and change rs6000_expand_vector_set param This one is preparation work of fix lvsl&lvs

Re: [PATCH v2 2/2] rs6000: Expand vec_insert in expander instead of gimple [PR79251]

2020-09-24 Thread xionghu luo via Gcc-patches

Hi, On 2020/9/24 21:27, Richard Biener wrote: > On Thu, Sep 24, 2020 at 10:21 AM xionghu luo wrote: > > I'll just comment that > > xxperm 34,34,33 > xxinsertw 34,0,12 > xxperm 34,34,32 > > doesn't look like a variable-position insert instruction but > this is a varia

[PATCH v4 1/3] IFN: Implement IFN_VEC_SET for ARRAY_REF with VIEW_CONVERT_EXPR

2020-09-24 Thread xionghu luo via Gcc-patches

Hi, On 2020/9/24 20:39, Richard Sandiford wrote: > xionghu luo writes: >> @@ -2658,6 +2659,43 @@ expand_vect_cond_mask_optab_fn (internal_fn, gcall >> *stmt, convert_optab optab) >> >> #define expand_vec_cond_mask_optab_fn expand_vect_cond_mask_optab_fn >> >> +/* Expand VEC_SET internal

Re: [PATCH v4 1/3] IFN: Implement IFN_VEC_SET for ARRAY_REF with VIEW_CONVERT_EXPR

2020-09-26 Thread xionghu luo via Gcc-patches

On 2020/9/25 21:28, Richard Sandiford wrote: > xionghu luo writes: >> @@ -2658,6 +2659,45 @@ expand_vect_cond_mask_optab_fn (internal_fn, gcall >> *stmt, convert_optab optab) >> >> #define expand_vec_cond_mask_optab_fn expand_vect_cond_mask_optab_fn >> >> +/* Expand VEC_SET internal fu

[PATCH 1/4] rs6000: Change rs6000_expand_vector_set param

2020-10-10 Thread Xionghu Luo via Gcc-patches

rs6000_expand_vector_set could accept insert either to constant position or variable position, so change the operand to reg_or_cint_operand. gcc/ChangeLog: 2020-10-10 Xionghu Luo * config/rs6000/rs6000-call.c (altivec_expand_vec_set_builtin): Change call param 2 from type int

[PATCH 4/4] rs6000: Update testcases' instruction count

2020-10-10 Thread Xionghu Luo via Gcc-patches

gcc/testsuite/ChangeLog: 2020-10-10 Xionghu Luo * gcc.target/powerpc/fold-vec-insert-char-p8.c: Adjust instruction counts. * gcc.target/powerpc/fold-vec-insert-char-p9.c: Likewise. * gcc.target/powerpc/fold-vec-insert-double.c: Likewise. * gcc.target/pow

[PATCH 3/4] rs6000: Enable vec_insert for P8 with rs6000_expand_vector_set_var_p8

2020-10-10 Thread Xionghu Luo via Gcc-patches

gcc/ChangeLog: 2020-10-10 Xionghu Luo * config/rs6000/rs6000-c.c (altivec_resolve_overloaded_builtin): Generate ARRAY_REF(VIEW_CONVERT_EXPR) for P8 and later platforms. * config/rs6000/rs6000.c (rs6000_expand_vector_set_var): Update to call different pat

[PATCH 2/4] rs6000: Support variable insert and Expand vec_insert in expander [PR79251]

2020-10-10 Thread Xionghu Luo via Gcc-patches

vec_insert accepts 3 arguments: arg0 is the input vector, arg1 is the value to be inserted, and arg2 is the position in arg0 at which to insert arg1. The current expander generates stxv+stwx+lxv if arg2 is variable instead of constant, which causes a serious store-hit-load performance issue on Power. This patch tries 1) Bu
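As a rough illustration of the operation being expanded, the following C sketch models a variable-position insert on a four-element vector, using the argument roles named in the snippet. The function name `vec_insert_model` is hypothetical, and reducing the position modulo the element count is an assumption about the intended semantics rather than something stated in the snippet.

```c
#include <stddef.h>
#include <stdint.h>

/* Scalar model of a variable-position vector insert.
   vec   : the input vector (arg0 in the snippet), updated in place
   value : the value to insert (arg1)
   pos   : the run-time position (arg2); assumed to wrap modulo 4  */
static void vec_insert_model(int32_t vec[4], int32_t value, size_t pos)
{
    vec[pos % 4] = value;
}
```

Written this way, a variable `pos` looks like a store into an in-memory array followed by a reload of the whole vector, which corresponds to the store-then-load sequence the snippet describes as costly.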

[PATCH 0/4] rs6000: Enable variable vec_insert with IFN VEC_SET

2020-10-10 Thread Xionghu Luo via Gcc-patches

Originated from https://gcc.gnu.org/pipermail/gcc-patches/2020-September/554240.html with patch split and some refinement per review comments. Patch of IFN VEC_SET for ARRAY_REF(VIEW_CONVERT_EXPR) is committed, this patch set enables expanding IFN VEC_SET for Power9 and Power8 with specific instruc

[PATCH] Fix incorrect loop exit edge probability [PR103270]

2021-11-22 Thread Xionghu Luo via Gcc-patches

r12-4526 cancelled jump thread path rotates loop. It exposes an issue in profile-estimate when predict_extra_loop_exits, outer loop's exit edge is marked as inner loop's extra loop exit and set with incorrect prediction, then a hot inner loop will become cold loop finally through optimizations, this

Re: [PATCH] Fix incorrect loop exit edge probability [PR103270]

2021-11-22 Thread Xionghu Luo via Gcc-patches

On 2021/11/23 13:51, Xionghu Luo wrote: > r12-4526 cancelled jump thread path rotates loop. It exposes an issue in > profile-estimate when predict_extra_loop_exits, outer loop's exit edge > is marked as inner loop's extra loop exit and set with incorrect > prediction, then a hot inner loop will b

[PATCH v2] Fix incorrect loop exit edge probability [PR103270]

2021-11-23 Thread Xionghu Luo via Gcc-patches

On 2021/11/23 17:50, Jan Hubicka wrote: >> On Tue, Nov 23, 2021 at 6:52 AM Xionghu Luo wrote: >>> >>> r12-4526 cancelled jump thread path rotates loop. It exposes an issue in >>> profile-estimate when predict_extra_loop_exits, outer loop's exit edge >>> is marked as inner loop's extra loop exit and

Re: [PATCH v3 1/4] Fix loop split incorrect count and probability

2021-11-23 Thread Xionghu Luo via Gcc-patches

Gentle ping, thanks. [PATCH v3] Fix loop split incorrect count and probability https://gcc.gnu.org/pipermail/gcc-patches/2021-November/583626.html On 2021/11/8 14:09, Xionghu Luo via Gcc-patches wrote: > > > On 2021/10/27 15:44, Jan Hubicka wrote: >>> On Wed, 27 Oct 2021,

Ping: [PATCH v7 2/2] Don't move cold code out of loop by checking bb count

2021-11-23 Thread Xionghu Luo via Gcc-patches

Gentle ping and is this patch still suitable for stage 3? Thanks. [PATCH v7 2/2] Don't move cold code out of loop by checking bb count https://gcc.gnu.org/pipermail/gcc-patches/2021-November/583911.html On 2021/11/10 11:08, Xionghu Luo via Gcc-patches wrote: > > > On 2

Re: [PATCH v8 2/2] Don't move cold code out of loop by checking bb count

2021-12-05 Thread Xionghu Luo via Gcc-patches

On 2021/12/1 18:09, Richard Biener wrote: > On Wed, Nov 10, 2021 at 4:08 AM Xionghu Luo wrote: >> >> >> >> On 2021/11/4 21:00, Richard Biener wrote: >>> On Wed, Nov 3, 2021 at 2:29 PM Xionghu Luo wrote: > + while (outmost_loop != loop) > +{ > + if (bb_colder_tha

Ping: [PATCH v2] Fix incorrect loop exit edge probability [PR103270]

2021-12-05 Thread Xionghu Luo via Gcc-patches

Hi Honza, Gentle ping for this :), thanks. https://gcc.gnu.org/pipermail/gcc-patches/2021-November/585289.html On 2021/11/24 13:03, Xionghu Luo via Gcc-patches wrote: > On 2021/11/23 17:50, Jan Hubicka wrote: >>> On Tue, Nov 23, 2021 at 6:52 AM Xionghu Luo wrote: >>>>

Re: [PATCH v8 2/2] Don't move cold code out of loop by checking bb count

2021-12-05 Thread Xionghu Luo via Gcc-patches

On 2021/12/6 13:09, Xionghu Luo via Gcc-patches wrote: > > > On 2021/12/1 18:09, Richard Biener wrote: >> On Wed, Nov 10, 2021 at 4:08 AM Xionghu Luo wrote: >>> >>> >>> >>> On 2021/11/4 21:00, Richard Biener wrote: >

[PATCH 0/3] Dependency patches for hoist LIM code to cold loop

2021-12-07 Thread Xionghu Luo via Gcc-patches

This patchset collects previously sent patches. Thanks to Richard, the "Don't move cold code out of loop by checking bb count" patch is approved[1], but there are still 3 prerequisite patches to supplement it or avoid regressions. 1) Patch [1/3] is the RTL part of not hoisting LIM code out of col

[PATCH 1/3] loop-invariant: Don't move cold bb instructions to preheader in RTL

2021-12-07 Thread Xionghu Luo via Gcc-patches

gcc/ChangeLog: * loop-invariant.c (find_invariants_bb): Check profile count before motion. (find_invariants_body): Add argument. --- gcc/loop-invariant.c | 10 +++--- 1 file changed, 7 insertions(+), 3 deletions(-) diff --git a/gcc/loop-invariant.c b/gcc/loop-invarian

[PATCH 2/3] Fix incorrect loop exit edge probability [PR103270]

2021-12-07 Thread Xionghu Luo via Gcc-patches

r12-4526 cancelled jump thread path rotates loop. It exposes an issue in profile-estimate when predict_extra_loop_exits, outer loop's exit edge is marked as inner loop's extra loop exit and set with incorrect prediction, then a hot inner loop will become cold loop finally through optimizations, this

[PATCH 3/3] Fix loop split incorrect count and probability

2021-12-07 Thread Xionghu Luo via Gcc-patches

In tree-ssa-loop-split.c, split_loop and split_loop_on_cond do two kinds of split. split_loop only works for a single loop and inserts an edge at the exit when splitting, while split_loop_on_cond is not limited to a single loop and inserts an edge at the latch when splitting. Both split behaviors should consider loop count a

Re: [PATCH v8 2/2] Don't move cold code out of loop by checking bb count

2021-12-07 Thread Xionghu Luo via Gcc-patches

On 2021/12/7 20:17, Richard Biener wrote: >>> + class loop *coldest_loop = coldest_outermost_loop[loop->num]; >>> + if (loop_depth (coldest_loop) < loop_depth (outermost_loop)) >>> +{ >>> + class loop *hotter_loop = hotter_than_inner_loop[loop->num]; >>> + if (!hotter_loop >>> +

[PATCH] rs6000: powerpc suboptimal boolean test of contiguous bits [PR102239]

2021-12-12 Thread Xionghu Luo via Gcc-patches

Add specialized version to combine two instructions from 9: {r123:CC=cmp(r124:DI&0x6,0);clobber scratch;} REG_DEAD r124:DI 10: pc={(r123:CC==0)?L15:pc} REG_DEAD r123:CC to: 10: {pc={(r123:DI&0x6==0)?L15:pc};clobber scratch;clobber %0:CC;} then split2 will split i

Re: [PATCH 3/3] Fix loop split incorrect count and probability

2021-12-13 Thread Xionghu Luo via Gcc-patches

On 2021/12/9 07:47, Jeff Law wrote: >> diff --git a/gcc/tree-ssa-loop-split.c b/gcc/tree-ssa-loop-split.c >> index 3f6ad046623..33128061aab 100644 >> --- a/gcc/tree-ssa-loop-split.c >> +++ b/gcc/tree-ssa-loop-split.c >> >> @@ -607,6 +610,38 @@ split_loop (class loop *loop1) >> tree guard_n

Re: [PATCH v4] rs6000: Fix incorrect RTL for Power LE when removing the UNSPECS [PR106069]

2022-08-16 Thread Xionghu Luo via Gcc-patches

On 2022/8/16 14:53, Kewen.Lin wrote: Hi Xionghu, Thanks for the updated version of patch, some comments are inlined. on 2022/8/11 14:15, Xionghu Luo wrote: On 2022/8/11 01:07, Segher Boessenkool wrote: On Wed, Aug 10, 2022 at 02:39:02PM +0800, Xionghu Luo wrote: On 2022/8/9 11:01, Kewen

Ping: [PATCH v4] rs6000: Fix incorrect RTL for Power LE when removing the UNSPECS [PR106069]

2022-08-23 Thread Xionghu Luo via Gcc-patches

Hi Segher, I'd like to resend and ping for this patch. Thanks. From 23bffdacdf0eb1140c7a3571e6158797f4818d57 Mon Sep 17 00:00:00 2001 From: Xionghu Luo Date: Thu, 4 Aug 2022 03:44:58 + Subject: [PATCH v4] rs6000: Fix incorrect RTL for Power LE when removing the UNSPECS [PR106069] v4: Update

Ping: [PATCH 0/4] rs6000: Enable variable vec_insert with IFN VEC_SET

2020-11-04 Thread Xionghu Luo via Gcc-patches

Ping. On 2020/10/10 16:08, Xionghu Luo wrote: Originated from https://gcc.gnu.org/pipermail/gcc-patches/2020-September/554240.html with patch split and some refinement per review comments. Patch of IFN VEC_SET for ARRAY_REF(VIEW_CONVERT_EXPR) is committed, this patch set enables expanding IFN V

Ping^2: [PATCH 0/4] rs6000: Enable variable vec_insert with IFN VEC_SET

2020-11-12 Thread Xionghu Luo via Gcc-patches

Ping^2, thanks. On 2020/11/5 09:34, Xionghu Luo via Gcc-patches wrote: Ping. On 2020/10/10 16:08, Xionghu Luo wrote: Originated from https://gcc.gnu.org/pipermail/gcc-patches/2020-September/554240.html with patch split and some refinement per review comments. Patch of IFN VEC_SET for

Re: [PATCH] rs6000: Don't split constant operator add before reload, move to temp register for future optimization

2020-11-13 Thread Xionghu Luo via Gcc-patches

Hi, On 2020/10/27 05:10, Segher Boessenkool wrote: > On Wed, Oct 21, 2020 at 03:25:29AM -0500, Xionghu Luo wrote: >> Don't split code from add3 for SDI to allow a later pass to split. > > This is very problematic. > >> This allows later logic to hoist out constant load in add instructions. > >

Re: [PATCH] tree-optimization/102155 - fix LIM fill_always_executed_in CFG walk

2021-09-01 Thread Xionghu Luo via Gcc-patches

On 2021/9/1 17:58, Richard Biener wrote: This fixes the CFG walk order of fill_always_executed_in to use RPO order rather than the dominator based order computed by get_loop_body_in_dom_order. That fixes correctness issues with unordered dominator children. The RPO order computed by rev_post_

Re: [PATCH] tree-optimization/102155 - fix LIM fill_always_executed_in CFG walk

2021-09-02 Thread Xionghu Luo via Gcc-patches

On 2021/9/2 16:50, Richard Biener wrote: > On Thu, 2 Sep 2021, Richard Biener wrote: > >> On Thu, 2 Sep 2021, Xionghu Luo wrote: >> >>> >>> >>> On 2021/9/1 17:58, Richard Biener wrote: This fixes the CFG walk order of fill_always_executed_in to use RPO order rather than the dominator b

Re: Ping ^ 2: [PATCH] rs6000: Expand fmod and remainder when built with fast-math [PR97142]

2021-09-02 Thread Xionghu Luo via Gcc-patches

Resend the patch that addressed Will's comments. fmod/fmodf and remainder/remainderf can be expanded inline instead of calling the library when built with fast-math, which is much faster. fmodf: fdivs f0,f1,f2 friz f0,f0 fnmsubs f1,f2,f0,f1 remainderf: fdivs f0,f1,f2 frin f0,f
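The instruction sequences quoted above amount to "divide, round the quotient to an integer, then subtract quotient times divisor". A minimal scalar sketch in C follows, under several assumptions: the helper names are hypothetical, the integer casts stand in for friz and frin only while the quotient fits in int64_t, the final multiply-subtract is not fused as fnmsubs would be, and the remainderf tail is assumed to mirror the fmodf one because the snippet is cut off there.

```c
#include <stdint.h>

/* Stand-in for friz: round toward zero.  Only valid while the value
   fits in int64_t; this limit belongs to the sketch, not the hardware.  */
static float trunc_toward_zero(float v)
{
    return (float)(int64_t)v;
}

/* Stand-in for frin: round to nearest, ties away from zero.  */
static float round_ties_away(float v)
{
    return (float)(int64_t)(v < 0.0f ? v - 0.5f : v + 0.5f);
}

/* fmodf model: x - trunc(x / y) * y  */
static float fmodf_model(float x, float y)
{
    float q = trunc_toward_zero(x / y);   /* fdivs, then friz */
    return x - q * y;                     /* role of fnmsubs */
}

/* remainderf model: x - round(x / y) * y  */
static float remainderf_model(float x, float y)
{
    float q = round_ties_away(x / y);     /* fdivs, then frin */
    return x - q * y;                     /* assumed to match the fmodf tail */
}
```

For example, with x = 7.5 and y = 2.0 the quotient 3.75 truncates to 3 and rounds to 4, so the two models give different results for the same inputs. The sketch ignores infinities, NaNs and signed zeros, which is the kind of corner case a fast-math build is allowed to skip.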

Ping ^ 2: [PATCH] rs6000: Fix wrong code generation for vec_sel [PR94613]

2021-09-05 Thread Xionghu Luo via Gcc-patches

Ping^2, thanks. https://gcc.gnu.org/pipermail/gcc-patches/2021-May/570333.html On 2021/6/30 09:42, Xionghu Luo via Gcc-patches wrote: Gentle ping, thanks. https://gcc.gnu.org/pipermail/gcc-patches/2021-May/570333.html On 2021/5/14 14:57, Xionghu Luo via Gcc-patches wrote: Hi, On 2021/5/13

Ping ^ 2: [PATCH] rs6000: Remove unspecs for vec_mrghl[bhw]

2021-09-05 Thread Xionghu Luo via Gcc-patches

Ping^2, thanks. https://gcc.gnu.org/pipermail/gcc-patches/2021-June/572330.html On 2021/6/30 09:47, Xionghu Luo via Gcc-patches wrote: Gentle ping, thanks. https://gcc.gnu.org/pipermail/gcc-patches/2021-June/572330.html On 2021/6/9 16:03, Xionghu Luo via Gcc-patches wrote: Hi, On 2021/6

Re: Ping ^ 2: [PATCH] rs6000: Expand fmod and remainder when built with fast-math [PR97142]

2021-09-06 Thread Xionghu Luo via Gcc-patches

On 2021/9/4 05:44, Segher Boessenkool wrote: Hi! On Fri, Sep 03, 2021 at 10:31:24AM +0800, Xionghu Luo wrote: fmod/fmodf and remainder/remainderf could be expanded instead of library call when fast-math build, which is much faster. Thank you very much for this patch. Some trivial comments

Re: [RFC] Don't move cold code out of loop by checking bb count

2021-09-08 Thread Xionghu Luo via Gcc-patches

On 2021/8/26 19:33, Richard Biener wrote: On Tue, Aug 10, 2021 at 4:03 AM Xionghu Luo wrote: Hi, On 2021/8/6 20:15, Richard Biener wrote: On Mon, Aug 2, 2021 at 7:05 AM Xiong Hu Luo wrote: There was a patch trying to avoid move cold block out of loop: https://gcc.gnu.org/pipermail/gcc

Re: [PATCH] tree-optimization/102155 - fix LIM fill_always_executed_in CFG walk

2021-09-08 Thread Xionghu Luo via Gcc-patches

On 2021/9/2 18:37, Richard Biener wrote: On Thu, 2 Sep 2021, Xionghu Luo wrote: On 2021/9/2 16:50, Richard Biener wrote: On Thu, 2 Sep 2021, Richard Biener wrote: On Thu, 2 Sep 2021, Xionghu Luo wrote: On 2021/9/1 17:58, Richard Biener wrote: This fixes the CFG walk order of fill_a

Re: [PATCH] tree-optimization/102155 - fix LIM fill_always_executed_in CFG walk

2021-09-10 Thread Xionghu Luo via Gcc-patches

On 2021/9/9 18:55, Richard Biener wrote: diff --git a/gcc/tree-ssa-loop-im.c b/gcc/tree-ssa-loop-im.c index 5d6845478e7..4b187c2cdaf 100644 --- a/gcc/tree-ssa-loop-im.c +++ b/gcc/tree-ssa-loop-im.c @@ -3074,15 +3074,13 @@ fill_always_executed_in_1 (class loop *loop, sbitmap contains_call)

Re: [PATCH] tree-optimization/102155 - fix LIM fill_always_executed_in CFG walk

2021-09-12 Thread Xionghu Luo via Gcc-patches

On 2021/9/10 21:54, Xionghu Luo via Gcc-patches wrote: On 2021/9/9 18:55, Richard Biener wrote: diff --git a/gcc/tree-ssa-loop-im.c b/gcc/tree-ssa-loop-im.c index 5d6845478e7..4b187c2cdaf 100644 --- a/gcc/tree-ssa-loop-im.c +++ b/gcc/tree-ssa-loop-im.c @@ -3074,15 +3074,13

Re: [PATCH] tree-optimization/102155 - fix LIM fill_always_executed_in CFG walk

2021-09-13 Thread Xionghu Luo via Gcc-patches

On 2021/9/13 16:17, Richard Biener wrote: On Mon, 13 Sep 2021, Xionghu Luo wrote: On 2021/9/10 21:54, Xionghu Luo via Gcc-patches wrote: On 2021/9/9 18:55, Richard Biener wrote: diff --git a/gcc/tree-ssa-loop-im.c b/gcc/tree-ssa-loop-im.c index 5d6845478e7..4b187c2cdaf 100644 --- a

Re: Ping ^ 3: [PATCH] rs6000: Fix wrong code generation for vec_sel [PR94613]

2021-09-15 Thread Xionghu Luo via Gcc-patches

Ping^3, thanks. https://gcc.gnu.org/pipermail/gcc-patches/2021-May/570333.html On 2021/9/6 08:52, Xionghu Luo via Gcc-patches wrote: Ping^2, thanks. https://gcc.gnu.org/pipermail/gcc-patches/2021-May/570333.html On 2021/6/30 09:42, Xionghu Luo via Gcc-patches wrote: Gentle ping, thanks

[PATCH v2 2/2] rs6000: Fold xxsel to vsel since they have same semantics

2021-09-16 Thread Xionghu Luo via Gcc-patches

Fold xxsel to vsel like xxperm/vperm to avoid duplicate code. gcc/ChangeLog: 2021-09-17 Xionghu Luo * config/rs6000/altivec.md: Add vsx register constraints. * config/rs6000/vsx.md (vsx_xxsel): Delete. (vsx_xxsel2): Likewise. (vsx_xxsel3): Likewise. (vs

[PATCH v2 0/2] Fix vec_sel code generation and merge xxsel to vsel

2021-09-16 Thread Xionghu Luo via Gcc-patches

These two patches are an updated version of: https://gcc.gnu.org/pipermail/gcc-patches/2021-September/579490.html Changes: 1. Fix alignment error in md files. 2. Replace rtx_equal_p with match_dup. 3. Use register_operand instead of gpc_reg_operand to align with vperm/xxperm. 4. Regression teste

[PATCH v2 1/2] rs6000: Fix wrong code generation for vec_sel [PR94613]

2021-09-16 Thread Xionghu Luo via Gcc-patches

The vsel instruction is a bit-wise select instruction. Using an IF_THEN_ELSE to express it in RTL is wrong and leads to wrong code being generated in the combine pass. Per-element selection is a subset of bit-wise selection; with the patch, the pattern is written using bit operations. But ther
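The difference between bit-wise and per-element selection described in this snippet can be sketched in a few lines. This is only an illustration on plain integers, not the RTL pattern from the patch; the names `bit_select` and `element_select` and the operand order (mask bit 1 picks from `b`) are assumptions made for the example.

```python
def bit_select(a, b, mask, width=32):
    """Take each bit from b where mask is 1 and from a where mask is 0."""
    full = (1 << width) - 1  # keep the result inside `width` bits
    return ((a & ~mask) | (b & mask)) & full


def element_select(a_elems, b_elems, conds, elem_bits=8):
    """Per-element select, built on bit_select with all-ones or all-zeros masks."""
    ones = (1 << elem_bits) - 1
    return [
        bit_select(a, b, ones if c else 0, elem_bits)
        for a, b, c in zip(a_elems, b_elems, conds)
    ]


# A mask that is neither all ones nor all zeros mixes bits inside one element,
# which a whole-element "if then else" cannot describe.
print(hex(bit_select(0xF0, 0x0F, 0x3C, width=8)))
print(element_select([1, 2, 3], [7, 8, 9], [True, False, True]))
```

Per-element selection falls out as the special case where every element's mask is all ones or all zeros, which is why it is a subset of the bit-wise form.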

Re: Ping ^ 3: [PATCH] rs6000: Fix wrong code generation for vec_sel [PR94613]

2021-09-16 Thread Xionghu Luo via Gcc-patches

ns? Other than that question / suggestion, this patch is okay. Please coordinate with Bill and his builtin patches. OK. Thanks, David On Wed, Sep 15, 2021 at 3:50 AM Xionghu Luo wrote: Ping^3, thanks. https://gcc.gnu.org/pipermail/gcc-patches/2021-May/570333.html On 2021/9/6 08:52,

Re: [PATCH] Fix loop split incorrect count and probability

2021-09-22 Thread Xionghu Luo via Gcc-patches

On 2021/8/11 17:16, Richard Biener wrote: On Wed, 11 Aug 2021, Xionghu Luo wrote: On 2021/8/10 22:47, Richard Biener wrote: On Mon, 9 Aug 2021, Xionghu Luo wrote: Thanks, On 2021/8/6 19:46, Richard Biener wrote: On Tue, 3 Aug 2021, Xionghu Luo wrote: loop split condition is moved be

Re: [RFC] Don't move cold code out of loop by checking bb count

2021-09-22 Thread Xionghu Luo via Gcc-patches

On 2021/9/22 17:14, Richard Biener wrote: On Thu, Sep 9, 2021 at 3:56 AM Xionghu Luo wrote: On 2021/8/26 19:33, Richard Biener wrote: On Tue, Aug 10, 2021 at 4:03 AM Xionghu Luo wrote: Hi, On 2021/8/6 20:15, Richard Biener wrote: On Mon, Aug 2, 2021 at 7:05 AM Xiong Hu Luo wrote:

Re: [RFC] Don't move cold code out of loop by checking bb count

2021-09-22 Thread Xionghu Luo via Gcc-patches

On 2021/9/23 10:13, Xionghu Luo via Gcc-patches wrote: On 2021/9/22 17:14, Richard Biener wrote: On Thu, Sep 9, 2021 at 3:56 AM Xionghu Luo wrote: On 2021/8/26 19:33, Richard Biener wrote: On Tue, Aug 10, 2021 at 4:03 AM Xionghu Luo wrote: Hi, On 2021/8/6 20:15, Richard Biener

Re: [RFC] Don't move cold code out of loop by checking bb count

2021-09-23 Thread Xionghu Luo via Gcc-patches

Update the patch to v3, not sure whether you prefer the paste style and continue to link the previous thread as Segher dislikes this... [PATCH v3] Don't move cold code out of loop by checking bb count Changes: 1. Handle max_loop in determine_max_movement instead of outermost_invariant_loop. 2.

Re: [PATCH v2 2/4] Refactor loop_version

2021-10-31 Thread Xionghu Luo via Gcc-patches

On 2021/10/29 19:52, Richard Biener wrote: > On Wed, 27 Oct 2021, Xionghu Luo wrote: > >> loop_version currently does lv_adjust_loop_entry_edge >> before it loopifys the copy inserted on the header. This patch moves >> the condition generation later and thus we have four pieces to help >> unde

Re: [RFC] Don't move cold code out of loop by checking bb count

2021-11-03 Thread Xionghu Luo via Gcc-patches

On 2021/10/29 19:48, Richard Biener wrote: > I'm talking about the can_sm_ref_p call, in that context 'loop' will > be the outermost loop of > interest, and we are calling this for all stores in a loop. We're doing > > +bool > +ref_in_loop_hot_body::operator () (mem_ref_loc *loc) > +{ > + bas

[PATCH] rs6000: Fix incorrect fusion constraint [PR102991]

2021-11-03 Thread Xionghu Luo via Gcc-patches

The clobber constraint should match the operand's constraint. fusion.md was generated by genfusion.pl, but that generation is disabled now, so update both places with the correct clobber constraint. gcc/ChangeLog: * config/rs6000/fusion.md: Fix incorrect clobber constraint. * config/rs6000/genfusion.pl:

Re: [RFC] Don't move cold code out of loop by checking bb count

2021-11-03 Thread Xionghu Luo via Gcc-patches

On 2021/10/29 19:48, Richard Biener wrote: > I'm talking about the can_sm_ref_p call, in that context 'loop' will > be the outermost loop of > interest, and we are calling this for all stores in a loop. We're doing > > +bool > +ref_in_loop_hot_body::operator () (mem_ref_loc *loc) > +{ > + bas

Re: [PATCH] rs6000: Fix incorrect fusion constraint [PR102991]

2021-11-03 Thread Xionghu Luo via Gcc-patches

On 2021/11/3 23:13, David Edelsohn wrote: > Did you manually change fusion.md or did you regenerate it after > fixing genfusion.pl? > > If you regenerated it, the ChangeLog entry should be "Regenerated" and > the "Fix incorrect clobber constraint." should refer to the > genfusion.pl change. >

Re: [PATCH] rs6000: Fix incorrect fusion constraint [PR102991]

2021-11-04 Thread Xionghu Luo via Gcc-patches

On 2021/11/4 09:59, David Edelsohn wrote: > On Wed, Nov 3, 2021 at 9:46 PM Xionghu Luo wrote: >> >> On 2021/11/3 23:13, David Edelsohn wrote: >>> Did you manually change fusion.md or did you regenerate it after >>> fixing genfusion.pl? >>> >>> If you regenerated it, the ChangeLog entry should b

Re: [PATCH] rs6000: Fix incorrect fusion constraint [PR102991]

2021-11-04 Thread Xionghu Luo via Gcc-patches

On 2021/11/5 08:58, David Edelsohn wrote: > On Thu, Nov 4, 2021 at 8:50 PM Xionghu Luo wrote: > >> [PATCH] rs6000: Fix incorrect fusion constraint [PR102991] >> >> gcc/ChangeLog: >> >> * config/rs6000/fusion.md: Regenerate. >> * config/rs6000/genfusion.pl: Fix incorrect clobber

Re: [PATCH v3 1/4] Fix loop split incorrect count and probability

2021-11-07 Thread Xionghu Luo via Gcc-patches

On 2021/10/27 15:44, Jan Hubicka wrote: >> On Wed, 27 Oct 2021, Jan Hubicka wrote: >> gcc/ChangeLog: * tree-ssa-loop-split.c (split_loop): Fix incorrect probability. (do_split_loop_on_cond): Likewise. --- gcc/tree-ssa-loop-split.c | 25 --

[PATCH v7 2/2] Don't move cold code out of loop by checking bb count

2021-11-09 Thread Xionghu Luo via Gcc-patches

On 2021/11/4 21:00, Richard Biener wrote: > On Wed, Nov 3, 2021 at 2:29 PM Xionghu Luo wrote: >> >> >>> + while (outmost_loop != loop) >>> +{ >>> + if (bb_colder_than_loop_preheader (loop_preheader_edge >>> (outmost_loop)->src, >>> +loop_prehead

[PATCH] rs6000: Remove unspecs for vec_mrghl[bhw]

2021-05-24 Thread Xionghu Luo via Gcc-patches

From: Xiong Hu Luo vmrghb only accepts permute index {0, 16, 1, 17, 2, 18, 3, 19, 4, 20, 5, 21, 6, 22, 7, 23} in the ISA, regardless of BE or LE; similarly for vmrglb. Remove the UNSPEC_VMRGH_DIRECT/UNSPEC_VMRGL_DIRECT patterns and express them with vec_select + vec_concat as normal RTL. Tested pass on P8LE, P9LE and P8BE{
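To make the permute index in this snippet concrete, here is a small model of a vec_select over a vec_concat of two 16-byte vectors, using plain Python lists. It is a sketch only: `merge_high_bytes` is a hypothetical name, element 0 is simply the first list entry, and nothing here models how elements map to register bytes on BE or LE.

```python
def merge_high_bytes(a, b):
    """Interleave the first eight elements of a and b via a select over a concat."""
    assert len(a) == 16 and len(b) == 16
    concat = list(a) + list(b)  # models vec_concat: 32 elements
    # Selector {0, 16, 1, 17, ..., 7, 23}: even slots from a, odd slots from b.
    index = [i // 2 + 16 * (i % 2) for i in range(16)]
    return [concat[i] for i in index]  # models vec_select


a = list(range(16))      # elements 0..15
b = list(range(16, 32))  # elements 16..31
print(merge_high_bytes(a, b))
```

With these inputs the result equals the selector itself, which makes it easy to compare against the index list quoted in the message.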

[PATCH] rs6000: Support doubleword swaps removal in rot64 load store [PR100085]

2021-06-02 Thread Xionghu Luo via Gcc-patches

On P8LE, extra rot64+rot64 load or store instructions are generated in float128 to vector __int128 conversion. This patch teaches the swaps pass to also handle such patterns and remove the extra swap instructions. (insn 7 6 8 2 (set (subreg:V1TI (reg:KF 123) 0) (rotate:V1TI (mem/u/c:V1TI (reg/f:DI
