Re: [PATCH] rs6000: Define movsf_from_si2 to extract high part SF element from DImode[PR89310]

2020-07-20 Thread luoxhu via Gcc-patches
On 2020/7/20 23:31, Segher Boessenkool wrote: On Mon, Jul 13, 2020 at 02:30:28PM +0800, luoxhu wrote: For extracting high part element from DImode register like: {%1:SF=unspec[r122:DI>>0x20#0] 86;clobber scratch;} split it before reload with "and mask" to avoid generating sh

Re: [PATCH v2] dse: Remove partial load after full store for high part access[PR71309]

2020-07-22 Thread luoxhu via Gcc-patches
Hi, On 2020/7/21 23:30, Richard Sandiford wrote: > Xiong Hu Luo writes:>> @@ -1872,9 +1872,27 @@ > get_stored_val (store_info *store_info, machine_mode read_mode, >> { >> poly_int64 shift = gap * BITS_PER_UNIT; >> poly_int64 access_size = GET_MODE_SIZE (read_mode) + gap; >>

Re: [PATCH v3] dse: Remove partial load after full store for high part access[PR71309]

2020-07-22 Thread luoxhu via Gcc-patches
Hi, On 2020/7/22 19:05, Richard Sandiford wrote: > This wasn't really what I meant. Using subregs is fine, but I was > thinking of: > >/* Also try a wider mode if the necessary punning is either not >desirable or not possible. */ >if (!CONSTANT_P (store_info->rhs) >

Re: [PATCH v3] dse: Remove partial load after full store for high part access[PR71309]

2020-07-23 Thread luoxhu via Gcc-patches
On 2020/7/23 04:30, Richard Sandiford wrote: > > I now realise the reason is that the starting mode is too wide. > I think we should fix that by doing: > >FOR_EACH_MODE_IN_CLASS (new_mode_iter, MODE_INT) > { >… > > and then add: > >if (maybe_lt (GET_MODE_SIZE (new_mo

[PATCH v4] dse: Remove partial load after full store for high part access[PR71309]

2020-07-24 Thread luoxhu via Gcc-patches
Hi Richard, This is the updated version that could pass all regression test on Power9-LE. Just need another "maybe_lt (GET_MODE_SIZE (new_mode), access_size)" before generating shift for store_info->const_rhs to ensure correct constant is generated, take testsuite/gfortran1/equiv_2.x for example

Re: [PATCH v4] dse: Remove partial load after full store for high part access[PR71309]

2020-07-28 Thread luoxhu via Gcc-patches
Gentle ping in case this mail is missed, Thanks :) https://gcc.gnu.org/pipermail/gcc-patches/2020-July/550602.html Xionghu On 2020/7/24 18:47, luoxhu via Gcc-patches wrote: Hi Richard, This is the updated version that could pass all regression test on Power9-LE. Just need another "may

[PATCH v5] dse: Remove partial load after full store for high part access[PR71309]

2020-08-02 Thread luoxhu via Gcc-patches
Thanks, the v5 update as comments: 1. Move const_rhs shift out of loop; 2. Iterate from int size for read_mode. This patch could optimize(works for char/short/int/void*): 6: r119:TI=[r118:DI+0x10] 7: [r118:DI]=r119:TI 8: r121:DI=[r118:DI+0x8] => 6: r119:TI=[r118:DI+0x10] 16: r122:DI=r119:TI#
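
The idea behind the optimization above (a narrower load that reads back part of a wider value just stored can be replaced by a shift of the stored register) can be modeled in plain C. This is only a sketch under stated assumptions: `read_from_stored`, `host_is_big_endian` and `partial_loads_match` are hypothetical helpers written for illustration, not code from the patch or from GCC's dse.c, and the model uses a 64-bit store with 32-bit loads instead of the TImode/DImode pair in the RTL dump.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical helper (not GCC code): return the 32-bit value that a load
   at byte offset OFFSET (0 or 4) would read from memory that was just
   written by a 64-bit store of STORED.  */
static uint32_t
read_from_stored (uint64_t stored, unsigned offset, int big_endian)
{
  /* On little-endian the byte at offset 0 is the least significant one,
     so the shift grows with the offset; on big-endian it is the reverse.  */
  unsigned gap = big_endian ? 8 - 4 - offset : offset;
  return (uint32_t) (stored >> (gap * 8));
}

/* Detect the byte order of the machine running this model.  */
static int
host_is_big_endian (void)
{
  const uint16_t probe = 1;
  unsigned char first;
  memcpy (&first, &probe, 1);
  return first == 0;
}

/* Return 1 if both partial loads from memory equal the value computed
   from the stored register alone, 0 otherwise.  */
static int
partial_loads_match (uint64_t stored)
{
  unsigned char mem[8];
  memcpy (mem, &stored, sizeof stored);              /* the full store */

  for (unsigned offset = 0; offset <= 4; offset += 4)
    {
      uint32_t loaded;
      memcpy (&loaded, mem + offset, sizeof loaded); /* the partial load */
      if (loaded != read_from_stored (stored, offset, host_is_big_endian ()))
        return 0;
    }
  return 1;
}
```

Calling `partial_loads_match (0x1122334455667788ULL)` compares real memory round trips against the shift-only computation; the memory access is redundant exactly because the two always agree.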

Re: [PATCH v5] dse: Remove partial load after full store for high part access[PR71309]

2020-08-03 Thread luoxhu via Gcc-patches
On 2020/8/3 22:01, Richard Sandiford wrote: /* Try a wider mode if truncating the store mode to NEW_MODE requires a real instruction. */ if (maybe_lt (GET_MODE_SIZE (new_mode), GET_MODE_SIZE (store_mode)) @@ -1779,6 +1780,25 @@ find_shift_sequence (poly_int64 access_s

Re: [PATCH v5] dse: Remove partial load after full store for high part access[PR71309]

2020-08-05 Thread luoxhu via Gcc-patches
Hi Richard, On 2020/8/3 22:01, Richard Sandiford wrote: /* Try a wider mode if truncating the store mode to NEW_MODE requires a real instruction. */ if (maybe_lt (GET_MODE_SIZE (new_mode), GET_MODE_SIZE (store_mode)) @@ -1779,6 +1780,25 @@ find_shift_sequence (poly_int6

Re: [PATCH] rs6000: Save/restore r31 if frame_pointer_needed is true

2020-03-26 Thread luoxhu via Gcc-patches
On 2020/3/27 07:59, Segher Boessenkool wrote: > Hi! > > On Wed, Mar 25, 2020 at 11:15:22PM -0500, luo...@linux.ibm.com wrote: >> frame_pointer_needed is set to true in reload pass setup_can_eliminate, >> but regs_ever_live[31] is false, so pro_and_epilogue doesn't save/restore >> r31 even it is

Re: [PATCH] rs6000: Don't split constant oprator when add, move to temp register for future optimization

2020-03-29 Thread luoxhu via Gcc-patches
On 2020/3/27 22:33, Segher Boessenkool wrote: > Hi! > > On Thu, Mar 26, 2020 at 05:06:43AM -0500, luo...@linux.ibm.com wrote: >> Remove split code from add3 to allow a later pass to split. >> This allows later logic to hoist out constant load in add instructions. >> In loop, lis+ori could be ho

Re: [PATCH] rs6000: Save/restore r31 if frame_pointer_needed is true

2020-03-29 Thread luoxhu via Gcc-patches
On 2020/3/28 00:04, Segher Boessenkool wrote: Hi! On Fri, Mar 27, 2020 at 09:34:00AM +0800, luoxhu wrote: On 2020/3/27 07:59, Segher Boessenkool wrote: On Wed, Mar 25, 2020 at 11:15:22PM -0500, luo...@linux.ibm.com wrote: frame_pointer_needed is set to true in reload pass

Re: [PATCH] rs6000: Don't split constant oprator when add, move to temp register for future optimization

2020-04-02 Thread luoxhu via Gcc-patches
On 2020/4/3 06:16, Segher Boessenkool wrote: > Hi! > > On Mon, Mar 30, 2020 at 11:59:57AM +0800, luoxhu wrote: >>> Do we want something later in the RTL pipeline to make "addi"s etc. again? > > (This would be a good thing to consider -- maybe a define_in

[PATCH v2] rs6000: Don't use HARD_FRAME_POINTER_REGNUM if it's not live in pro_and_epilogue (PR91518)

2020-04-12 Thread luoxhu via Gcc-patches
This bug is exposed by the FRE refactor of r263875. Comparing the fre dump files shows no obvious change in the function that segfaults, which shows it is a target issue. frame_pointer_needed is set to true in reload pass setup_can_eliminate, but regs_ever_live[31] is false, pro_and_epilogue uses it withou

[PATCH] Fold (add -1; zero_ext; add +1) operations to zero_ext when not zero (PR37451, PR61837)

2020-04-15 Thread luoxhu--- via Gcc-patches
From: Xionghu Luo This "subtract/extend/add" has existed for a long time and still annoys us (PR37451, PR61837) when converting from 32bits to 64bits, as the ctr register is used as 64bits on powerpc64, Andrew Pinski had a patch but caused some issue and reverted by Joseph S. Myers(PR37451, PR37782
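
Why the fold in the subject line is only valid "when not zero" can be checked with a small standalone model. The sketch below is not taken from the patch: `sub_ext_add`, `zero_ext_only` and `forms_agree` are hypothetical names, and fixed-width C integer types stand in for the 32-bit and 64-bit register modes.

```c
#include <stdint.h>

/* The original sequence: subtract 1 in 32 bits, zero-extend to 64 bits,
   then add 1 in 64 bits.  */
static uint64_t
sub_ext_add (uint64_t x)
{
  uint32_t low = (uint32_t) x - 1u;   /* add -1 in the 32-bit mode */
  uint64_t ext = (uint64_t) low;      /* zero_extend to the 64-bit mode */
  return ext + 1u;                    /* add +1 in the 64-bit mode */
}

/* The folded form: a plain zero-extension of the low 32 bits.  */
static uint64_t
zero_ext_only (uint64_t x)
{
  return (uint64_t) (uint32_t) x;
}

/* Return 1 when both forms produce the same value for X.  */
static int
forms_agree (uint64_t x)
{
  return sub_ext_add (x) == zero_ext_only (x);
}
```

When the low 32 bits of `x` are zero, the 32-bit subtraction wraps to 0xFFFFFFFF, so the three-step sequence yields 0x100000000 while the plain zero-extension yields 0. For every other input the two forms agree, which is the condition the fold has to establish first.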

Re: [PATCH v2] rs6000: Don't use HARD_FRAME_POINTER_REGNUM if it's not live in pro_and_epilogue (PR91518)

2020-04-16 Thread luoxhu via Gcc-patches
On 2020/4/17 08:52, Segher Boessenkool wrote: > Hi! > > On Mon, Apr 13, 2020 at 10:11:43AM +0800, luoxhu wrote: >> frame_pointer_needed is set to true in reload pass setup_can_eliminate, >> but regs_ever_live[31] is false, pro_and_epilogue uses it without live >>

Re: [PATCH] Fold (add -1; zero_ext; add +1) operations to zero_ext when not zero (PR37451, PR61837)

2020-04-20 Thread luoxhu via Gcc-patches
Hi, On 2020/4/18 00:32, Segher Boessenkool wrote: > On Thu, Apr 16, 2020 at 08:21:40PM -0500, Segher Boessenkool wrote: >> On Wed, Apr 15, 2020 at 10:18:16AM +0100, Richard Sandiford wrote: >>> luoxhu--- via Gcc-patches writes: >>>> -count = simplify_gen_binary

Re: [PATCH] Fold (add -1; zero_ext; add +1) operations to zero_ext when not zero (PR37451, PR61837)

2020-04-20 Thread luoxhu via Gcc-patches
Tiny update to accommodate unsigned int compare. On 2020/4/20 16:21, luoxhu via Gcc-patches wrote: Hi, On 2020/4/18 00:32, Segher Boessenkool wrote: On Thu, Apr 16, 2020 at 08:21:40PM -0500, Segher Boessenkool wrote: On Wed, Apr 15, 2020 at 10:18:16AM +0100, Richard Sandiford wrote: luoxhu

Re: [PATCH] Add value range info for affine combination to improve store motion (PR83403)

2020-04-28 Thread luoxhu via Gcc-patches
On 2020/4/28 15:01, Richard Biener wrote: > On Tue, 28 Apr 2020, Xionghu Luo wrote: > >> From: Xionghu Luo >> >> Get and propagate value range info to convert expressions with convert >> operation on PLUS_EXPR/MINUS_EXPR/MULT_EXPR when not overflow. i.e.: >> >> (long unsigned int)((unsigned

Re: [PATCH] Add value range info for affine combination to improve store motion (PR83403)

2020-04-29 Thread luoxhu via Gcc-patches
On 2020/4/28 18:30, Richard Biener wrote: > > OK, I guess instead of get_range_info expr_to_aff_combination could > simply use determine_value_range (op0, &minv, &maxv) == VR_RANGE > (the && TREE_CODE (op0) == SSA_NAME check can then be removed)? > Tried with determine_value_range, it works

[PATCH v2] Add handling of MULT_EXPR/PLUS_EXPR for wrapping overflow in affine combination(PR83403)

2020-04-30 Thread luoxhu via Gcc-patches
Update the patch with overflow check. Bootstrap and regression tested PASS on Power8-LE. Use determine_value_range to get value range info for fold convert expressions with internal operation PLUS_EXPR/MINUS_EXPR/MULT_EXPR when not overflow on wrapping overflow inner type. i.e.: (long unsigne
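
The overflow condition mentioned above can be illustrated outside the compiler. The following is a minimal sketch, assuming a 32-bit wrapping inner type and a 64-bit outer type; the function names and the scale values used with them are hypothetical and are not GCC's `determine_value_range` interface or anything from the patch.

```c
#include <stdint.h>

/* Multiply in the narrow wrapping type, then widen the result.  */
static uint64_t
widen_after_mul (uint32_t n, uint32_t scale)
{
  return (uint64_t) (uint32_t) (n * scale);
}

/* Widen first, then multiply in the wide type.  */
static uint64_t
mul_after_widen (uint32_t n, uint32_t scale)
{
  return (uint64_t) n * (uint64_t) scale;
}

/* Return 1 if no value in [0, max_n] can wrap in 32 bits when multiplied
   by SCALE, i.e. if the known range makes the two forms interchangeable.  */
static int
range_allows_widening (uint32_t max_n, uint32_t scale)
{
  return scale == 0 || max_n <= UINT32_MAX / scale;
}
```

With a known upper bound of 1000 and a scale of 4 the product cannot wrap, so the conversion can be distributed over the multiplication. With an upper bound of 0x80000000 the narrow product wraps to 0 while the wide product is 0x200000000, so the rewrite would change the result.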

Re: [PATCH v2] Add handling of MULT_EXPR/PLUS_EXPR for wrapping overflow in affine combination(PR83403)

2020-05-11 Thread luoxhu via Gcc-patches
On 2020-05-06 20:09, Richard Biener wrote: On Thu, 30 Apr 2020, luoxhu wrote: Update the patch with overflow check. Bootstrap and regression tested PASS on Power8-LE. Use determine_value_range to get value range info for fold convert expressions with internal operation PLUS_EXPR/MINUS_EXPR

[PATCH v2] Fold (add -1; zero_ext; add +1) operations to zero_ext when not overflow (PR37451, part of PR61837)

2020-05-11 Thread luoxhu via Gcc-patches
Minor refine of checking iterations nonoverflow and a testcase for stage 1. This "subtract/extend/add" has existed for a long time and still annoys us (PR37451, part of PR61837) when converting from 32bits to 64bits, as the ctr register is used as 64bits on powerpc64, Andrew Pinski had a patch but

Re: [PATCH v2] Fold (add -1; zero_ext; add +1) operations to zero_ext when not overflow (PR37451, part of PR61837)

2020-05-13 Thread luoxhu via Gcc-patches
On 2020/5/13 02:24, Richard Sandiford wrote: > luoxhu writes: >> + /* Fold (add -1; zero_ext; add +1) operations to zero_ext. i.e: >> + >> + 73: r145:SI=r123:DI#0-0x1 >> + 74: r144:DI=zero_extend (r145:SI) >> + 75: r143:DI=r144:DI+0x1 >> +

Re: [PATCH] rs6000: Use REAL_TYPE to copy when block move array in structure[PR65421]

2020-06-07 Thread luoxhu via Gcc-patches
Hi, On 2020/6/3 04:32, Segher Boessenkool wrote: > Hi Xiong Hu, > > On Tue, Jun 02, 2020 at 04:41:50AM -0500, Xionghu Luo wrote: >> Double array in structure as function arguments or return value is accessed >> by BLKmode, they are stored to stack and load from stack with redundant >> conversion

Ping^1 : [PATCH] [stage1] ipa-cp: Fix PGO regression caused by r278808

2020-06-15 Thread luoxhu via Gcc-patches
Gentle ping... On 2020/6/1 09:45, Xionghu Luo wrote: resend the patch for stage1: https://gcc.gnu.org/pipermail/gcc-patches/2020-January/538186.html The performance of exchange2 built with PGO will decrease ~28% by r278808 due to profile count set incorrectly. The cloned nodes are updated to
