On 2020/9/15 14:51, Richard Biener wrote:
>> I only see VAR_DECL and PARM_DECL, is there any function to check the tree
>> variable is global? I added DECL_REGISTER, but the RTL still expands to
>> stack:
>
> is_global_var () or alternatively !auto_var_in_fn_p (), I think doing
> IFN_SET onl
On 2020/9/14 17:47, Richard Biener wrote:
On Mon, Sep 14, 2020 at 10:05 AM luoxhu wrote:
Not sure whether this reflects the issues you discussed above.
I constructed below test cases and tested with and without this patch,
only if "a+c"(which means store only), the performance
On 2020/9/10 18:08, Richard Biener wrote:
> On Wed, Sep 9, 2020 at 6:03 PM Segher Boessenkool
> wrote:
>>
>> On Wed, Sep 09, 2020 at 04:28:19PM +0200, Richard Biener wrote:
>>> On Wed, Sep 9, 2020 at 3:49 PM Segher Boessenkool
>>> wrote:
Hi!
On Tue, Sep 08, 2020 at 10:26:51
On 2020/9/8 16:26, Richard Biener wrote:
>> Seems not only pseudo, for example "v = vec_insert (i, v, n);"
>> the vector variable will be stored to the stack first, then [r112:DI] is a
>> memory here to be processed. So the patch loads it from stack(insn #10) to
>> temp vector register first, and st
Hi Richi,
On 2020/9/7 19:57, Richard Biener wrote:
> + if (TREE_CODE (to) == ARRAY_REF)
> + {
> + tree op0 = TREE_OPERAND (to, 0);
> + if (TREE_CODE (op0) == VIEW_CONVERT_EXPR
> + && expand_view_convert_to_vec_set (to, from, to_rtx))
> + {
> +
Hi,
On 2020/9/4 18:23, Segher Boessenkool wrote:
diff --git a/gcc/config/rs6000/rs6000-c.c b/gcc/config/rs6000/rs6000-c.c
index 03b00738a5e..00c65311f76 100644
--- a/gcc/config/rs6000/rs6000-c.c
+++ b/gcc/config/rs6000/rs6000-c.c
/* Build *(((arg1_inner_type*)&(vector type){arg1})+arg2)
On 2020/9/4 15:23, Richard Biener wrote:
> On Fri, Sep 4, 2020 at 9:19 AM Richard Biener
> wrote:
>>
>> On Fri, Sep 4, 2020 at 8:38 AM luoxhu wrote:
>>>
>>>
>>>
>>> On 2020/9/4 14:16, luoxhu via Gcc-patches wrote:
>>>>
On 2020/9/4 14:16, luoxhu via Gcc-patches wrote:
Hi,
Yes, I checked and found that neither vec_set nor vec_extract supports a
variable index on most targets; store_bit_field_1 and extract_bit_field_1
only consider using optabs when the index is an integer value. Anyway, it
shouldn
Hi,
On 2020/9/3 18:29, Richard Biener wrote:
> On Thu, Sep 3, 2020 at 11:20 AM luoxhu wrote:
>>
>>
>>
>> On 2020/9/2 17:30, Richard Biener wrote:
>>>> so maybe bypass convert_vector_to_array_for_subscript for special
>>>> circumstance
>>>
On 2020/9/2 17:30, Richard Biener wrote:
>> so maybe bypass convert_vector_to_array_for_subscript for special
>> circumstance
>> like "i = v[n%4]" or "v[n&3]=i" to generate vec_extract or vec_insert builtin
>> call a relative simpler method?
> I think you have it backward. You need to work wit
Hi,
On 2020/9/1 21:07, Richard Biener wrote:
> On Tue, Sep 1, 2020 at 10:11 AM luoxhu via Gcc-patches
> wrote:
>>
>> Hi,
>>
>> On 2020/9/1 01:04, Segher Boessenkool wrote:
>>> Hi!
>>>
>>> On Mon, Aug 31, 2020 at 04:06:47AM -0500, Xiong H
Hi,
On 2020/9/1 00:47, will schmidt wrote:
>> + tmode = TYPE_MODE (TREE_TYPE (arg0));
>> + mode1 = TYPE_MODE (TREE_TYPE (TREE_TYPE (arg0)));
>> + mode2 = TYPE_MODE ((TREE_TYPE (arg2)));
>> + gcc_assert (VECTOR_MODE_P (tmode));
>> +
>> + op0 = expand_expr (arg0, NULL_RTX, tmode, EXPAND_NORMAL)
Hi,
On 2020/9/1 01:04, Segher Boessenkool wrote:
> Hi!
>
> On Mon, Aug 31, 2020 at 04:06:47AM -0500, Xiong Hu Luo wrote:
>> vec_insert accepts 3 arguments, arg0 is input vector, arg1 is the value
>> to be inserted, arg2 is the place to insert arg1 to arg0. This patch adds
>> __builtin_vec_insert_v
Hi,
On 2020/8/13 20:52, Jan Hubicka wrote:
>> Since there are no other callers outside of these specialized nodes, the
>> guessed profile count should be equal? The perf tool shows that even though
>> each specialized node is called only once, none of them takes the same time for
>> each call:
>>
>>40.6
Hi,
On 2020/8/13 01:53, Jan Hubicka wrote:
> Hello,
> with Martin we spent some time looking into exchange2 and my
> understanding of the problem is the following:
>
> There is the self recursive function digits_2 with the property that it
> has 10 nested loops and calls itself from the innermost
Hi Richard,
On 2020/8/3 22:01, Richard Sandiford wrote:
/* Try a wider mode if truncating the store mode to NEW_MODE
requires a real instruction. */
if (maybe_lt (GET_MODE_SIZE (new_mode), GET_MODE_SIZE (store_mode))
@@ -1779,6 +1780,25 @@ find_shift_sequence (poly_int6
On 2020/8/3 22:01, Richard Sandiford wrote:
/* Try a wider mode if truncating the store mode to NEW_MODE
requires a real instruction. */
if (maybe_lt (GET_MODE_SIZE (new_mode), GET_MODE_SIZE (store_mode))
@@ -1779,6 +1780,25 @@ find_shift_sequence (poly_int64 access_s
Thanks, the v5 update addresses the comments:
1. Move the const_rhs shift out of the loop;
2. Iterate from int size for read_mode.
This patch can optimize the following (works for char/short/int/void*):
6: r119:TI=[r118:DI+0x10]
7: [r118:DI]=r119:TI
8: r121:DI=[r118:DI+0x8]
=>
6: r119:TI=[r118:DI+0x10]
16: r122:DI=r119:TI#
Gentle ping in case this mail is missed, Thanks :)
https://gcc.gnu.org/pipermail/gcc-patches/2020-July/550602.html
Xionghu
On 2020/7/24 18:47, luoxhu via Gcc-patches wrote:
Hi Richard,
This is the updated version, which passes all regression tests on
Power9-LE.
Just need another "may
Hi Richard,
This is the updated version, which passes all regression tests on
Power9-LE.
It just needs another "maybe_lt (GET_MODE_SIZE (new_mode), access_size)"
check before generating the shift for store_info->const_rhs to ensure the
correct constant is generated; take testsuite/gfortran1/equiv_2.x for example
On 2020/7/23 04:30, Richard Sandiford wrote:
>
> I now realise the reason is that the starting mode is too wide.
> I think we should fix that by doing:
>
>FOR_EACH_MODE_IN_CLASS (new_mode_iter, MODE_INT)
> {
>…
>
> and then add:
>
>if (maybe_lt (GET_MODE_SIZE (new_mo
Hi,
On 2020/7/22 19:05, Richard Sandiford wrote:
> This wasn't really what I meant. Using subregs is fine, but I was
> thinking of:
>
>/* Also try a wider mode if the necessary punning is either not
>desirable or not possible. */
>if (!CONSTANT_P (store_info->rhs)
>
Hi,
On 2020/7/21 23:30, Richard Sandiford wrote:
> Xiong Hu Luo writes:
>> @@ -1872,9 +1872,27 @@ get_stored_val (store_info *store_info, machine_mode read_mode,
>> {
>> poly_int64 shift = gap * BITS_PER_UNIT;
>> poly_int64 access_size = GET_MODE_SIZE (read_mode) + gap;
>>
On 2020/7/20 23:31, Segher Boessenkool wrote:
On Mon, Jul 13, 2020 at 02:30:28PM +0800, luoxhu wrote:
For extracting high part element from DImode register like:
{%1:SF=unspec[r122:DI>>0x20#0] 86;clobber scratch;}
split it before reload with "and mask" to avoid generating sh
Power8-LE, I re-run these cases on
Power8-LE, and confirmed these could pass, what is your platform please?
BTW, TARGET_NO_SF_SUBREG ensured TARGET_POWERPC64 for this
define_insn_and_split.
Thanks.
Xionghu
>
> Thanks, David
>
> On Mon, Jul 13, 2020 at 2:30 AM luoxhu wrote:
>>
>
Hi,
On 2020/7/11 08:54, Segher Boessenkool wrote:
> Hi!
>
> On Fri, Jul 10, 2020 at 09:39:40AM +0800, luoxhu wrote:
>> OK, seems the md file needs a format tool too...
>
> Heh. Just make sure it looks good (that is, does what it looks like),
> looks like the res
On 2020/7/11 08:28, Segher Boessenkool wrote:
Hi!
On Thu, Jul 09, 2020 at 09:14:45PM -0500, Xiong Hu Luo wrote:
* config/rs6000/rs6000.md (rotl_unspec): New
define_insn_and_split.
+; rldimi with UNSPEC_SI_FROM_SF.
+(define_insn_and_split "*rotl_unspec"
Please have rotldi
On 2020/7/10 03:25, Segher Boessenkool wrote:
>
>> + "TARGET_NO_SF_SUBREG"
>> + "#"
>> + "&& vsx_reg_sfsubreg_ok (operands[0], SFmode)"
>
> Put this in the insn condition? And since this is just a predicate,
> you can just use it instead of gpc_reg_operand.
>
> (The split condition become
Update patch to keep the logic for non TARGET_P8_VECTOR targets.
Please ignore the previous [PATCH 1/2], Sorry!
Move V4SF to V4SI, init vector like V4SI and move to V4SF back.
Better instruction sequence could be generated on Power9:
lfs + xxpermdi + xvcvdpsp + vmrgew
=>
lwz + (sldi + or) + mtvs
Hi,
On 2020/7/10 03:25, Segher Boessenkool wrote:
> Hi!
>
> On Thu, Jul 09, 2020 at 11:09:42AM +0800, luoxhu wrote:
>>> Maybe change it back to just SI? It won't match often at all for QI or
>>> HI anyway, it seems. Sorry for that detour. Should be go
On 2020/7/9 06:43, Segher Boessenkool wrote:
> Hi!
>
> On Wed, Jul 08, 2020 at 11:19:21AM +0800, luoxhu wrote:
>> For extracting high part element from DImode register like:
>>
>> {%1:SF=unspec[r122:DI>>0x20#0] 86;clobber scratch;}
>>
>> sp
On 2020/7/8 05:31, Segher Boessenkool wrote:
> Hi!
>
> On Tue, Jul 07, 2020 at 04:39:58PM +0800, luoxhu wrote:
>>> Lots of questions, sorry!
>>
>> Thanks for the nice suggestions of the initial patch contains many issues:),
>
> Pretty much all of it should
On 2020/7/7 08:18, Segher Boessenkool wrote:
> Hi!
>
> On Sun, Jul 05, 2020 at 09:17:57PM -0500, Xionghu Luo wrote:
>> For extracting high part element from DImode register like:
>>
>> {%1:SF=unspec[r122:DI>>0x20#0] 86;clobber scratch;}
>>
>> split it before reload with "and mask" to avoid gene
Gentle ping...
On 2020/6/1 09:45, Xionghu Luo wrote:
resend the patch for stage1:
https://gcc.gnu.org/pipermail/gcc-patches/2020-January/538186.html
The performance of exchange2 built with PGO decreases by ~28% since r278808
because the profile count is set incorrectly. The cloned nodes are updated to
Hi,
On 2020/6/3 04:32, Segher Boessenkool wrote:
> Hi Xiong Hu,
>
> On Tue, Jun 02, 2020 at 04:41:50AM -0500, Xionghu Luo wrote:
>> Double array in structure as function arguments or return value is accessed
>> by BLKmode, they are stored to stack and load from stack with redundant
>> conversion
On 2020/5/13 02:24, Richard Sandiford wrote:
> luoxhu writes:
>> + /* Fold (add -1; zero_ext; add +1) operations to zero_ext. i.e:
>> +
>> + 73: r145:SI=r123:DI#0-0x1
>> + 74: r144:DI=zero_extend (r145:SI)
>> + 75: r143:DI=r144:DI+0x1
>> +
Minor refinement of the iteration non-overflow check, plus a testcase for stage 1.
This "subtract/extend/add" sequence has existed for a long time and still annoys us
(PR37451, part of PR61837) when converting from 32 bits to 64 bits, as the ctr
register is used as 64 bits on powerpc64. Andrew Pinski had a patch but
On 2020-05-06 20:09, Richard Biener wrote:
On Thu, 30 Apr 2020, luoxhu wrote:
Update the patch with overflow check. Bootstrap and regression tested
PASS on Power8-LE.
Use determine_value_range to get value range info for fold convert
expressions
with internal operation PLUS_EXPR/MINUS_EXPR
Update the patch with overflow check. Bootstrap and regression tested PASS on
Power8-LE.
Use determine_value_range to get value range info for fold convert expressions
with internal operation PLUS_EXPR/MINUS_EXPR/MULT_EXPR when not overflow on
wrapping overflow inner type. i.e.:
(long unsigne
On 2020/4/28 18:30, Richard Biener wrote:
>
> OK, I guess instead of get_range_info expr_to_aff_combination could
> simply use determine_value_range (op0, &minv, &maxv) == VR_RANGE
> (the && TREE_CODE (op0) == SSA_NAME check can then be removed)?
>
Tried with determine_value_range, it works
On 2020/4/28 15:01, Richard Biener wrote:
> On Tue, 28 Apr 2020, Xionghu Luo wrote:
>
>> From: Xionghu Luo
>>
>> Get and propagate value range info to convert expressions with convert
>> operation on PLUS_EXPR/MINUS_EXPR/MULT_EXPR when not overflow. i.e.:
>>
>> (long unsigned int)((unsigned
Tiny update to accommodate unsigned int compare.
On 2020/4/20 16:21, luoxhu via Gcc-patches wrote:
Hi,
On 2020/4/18 00:32, Segher Boessenkool wrote:
On Thu, Apr 16, 2020 at 08:21:40PM -0500, Segher Boessenkool wrote:
On Wed, Apr 15, 2020 at 10:18:16AM +0100, Richard Sandiford wrote:
luoxhu
Hi,
On 2020/4/18 00:32, Segher Boessenkool wrote:
> On Thu, Apr 16, 2020 at 08:21:40PM -0500, Segher Boessenkool wrote:
>> On Wed, Apr 15, 2020 at 10:18:16AM +0100, Richard Sandiford wrote:
>>> luoxhu--- via Gcc-patches writes:
>>>> -count = simplify_gen_binary
On 2020/4/17 08:52, Segher Boessenkool wrote:
> Hi!
>
> On Mon, Apr 13, 2020 at 10:11:43AM +0800, luoxhu wrote:
>> frame_pointer_needed is set to true in reload pass setup_can_eliminate,
>> but regs_ever_live[31] is false, pro_and_epilogue uses it without live
>>
From: Xionghu Luo
This "subtract/extend/add" sequence has existed for a long time and still annoys us
(PR37451, PR61837) when converting from 32 bits to 64 bits, as the ctr
register is used as 64 bits on powerpc64. Andrew Pinski had a patch but
it caused some issues and was reverted by Joseph S. Myers (PR37451, PR37782
This bug is exposed by the FRE refactoring in r263875. Comparing the fre
dump files shows no obvious change in the function that segfaults, which
proves it to be a target issue.
frame_pointer_needed is set to true in reload pass setup_can_eliminate,
but regs_ever_live[31] is false, pro_and_epilogue uses it withou
On 2020/4/3 06:16, Segher Boessenkool wrote:
> Hi!
>
> On Mon, Mar 30, 2020 at 11:59:57AM +0800, luoxhu wrote:
>>> Do we want something later in the RTL pipeline to make "addi"s etc. again?
>
> (This would be a good thing to consider -- maybe a define_in
On 2020/3/28 00:04, Segher Boessenkool wrote:
Hi!
On Fri, Mar 27, 2020 at 09:34:00AM +0800, luoxhu wrote:
On 2020/3/27 07:59, Segher Boessenkool wrote:
On Wed, Mar 25, 2020 at 11:15:22PM -0500, luo...@linux.ibm.com wrote:
frame_pointer_needed is set to true in reload pass
On 2020/3/27 22:33, Segher Boessenkool wrote:
> Hi!
>
> On Thu, Mar 26, 2020 at 05:06:43AM -0500, luo...@linux.ibm.com wrote:
>> Remove split code from add3 to allow a later pass to split.
>> This allows later logic to hoist out constant load in add instructions.
>> In loop, lis+ori could be ho
On 2020/3/27 07:59, Segher Boessenkool wrote:
> Hi!
>
> On Wed, Mar 25, 2020 at 11:15:22PM -0500, luo...@linux.ibm.com wrote:
>> frame_pointer_needed is set to true in reload pass setup_can_eliminate,
>> but regs_ever_live[31] is false, so pro_and_epilogue doesn't save/restore
>> r31 even it is
From: Xionghu Luo
Remove split code from add3 to allow a later pass to split.
This allows later logic to hoist out constant load in add instructions.
In loop, lis+ori could be hoisted out to improve performance compared with
previous addis+addi (About 15% on typical case), weak point is
one more
From: Xionghu Luo
This P1 bug is exposed by the FRE refactoring in r263875. Comparing the fre
dump files shows no obvious change in the function that segfaults, which
proves it to be a target issue.
frame_pointer_needed is set to true in reload pass setup_can_eliminate,
but regs_ever_live[31] is false, so pro_a
On 2020/3/10 05:28, Segher Boessenkool wrote:
On Thu, Mar 05, 2020 at 02:21:58AM -0600, luo...@linux.ibm.com wrote:
From: Xionghu Luo
Backport the patch to fix failures on P9 and P8BE, P7LE for PR94036.
No changes were needed?
Yes, no conflicts of the patch and instruction counts are sam
From: Xionghu Luo
Backport the patch to fix failures on P9 and P8BE, P7LE for PR94036.
Tested pass on P9/P8/P7, ok to commit?
(gcc-8 is not needed as the test doesn't exist.)
P9LE generated instruction is not worse than P8LE.
mtvsrdd;xxlnot;stxv vs. not;not;std;std.
It can have longer latency,
341
Author: hubicka
Date: Thu Nov 28 14:16:29 2019 +
* ipa-cp.c (update_profiling_info): Fix scaling.
Fix v3 patch and logs are here:
https://gcc.gnu.org/ml/gcc-patches/2020-01/msg00764.html
Thanks
Xionghu
On 2020/1/14 14:45, luoxhu wrote:
> Hi,
>
> On 2020/1/3 00:58
On 2020/2/18 17:57, Richard Biener wrote:
> On Tue, 18 Feb 2020, Xionghu Luo wrote:
>
>> Store-merging pass should run twice, the reason is pass fre/pre will do
>> some kind of optimizations to instructions by:
>>1. Converting the load from address to load from function arguments
>>(store_
Ping,
attachment of
https://gcc.gnu.org/ml/gcc-patches/2020-01/msg00764/exchange2.tar.gz
shows the profile count difference on cloned nodes digits_2.constprop.[0...8]
without/with this patch. Thanks!
Xiong Hu
On 2020/1/14 14:45, luoxhu wrote:
> Hi,
>
> On 2020/1/3 00:58, Jan Hubi
On 2020/1/11 20:20, Tamar Christina wrote:
Hi Martin,
This change (r280099) is causing a major performance regression on exchange2 in
SPEC2017 dropping the benchmark by more than 30%.
It seems the parameters no longer do anything. i.e. -flto --param
ipa-cp-eval-threshold=1 --param ipa-cp-u
On 2020/1/10 19:08, Jan Hubicka wrote:
> OK. You will need to do the obvious updates for Martin's patch
> which turned some member functions into static functions.
>
> Honza
Thanks a lot! Rebased & updated, will commit below patch shortly when git push
is ready.
v8:
1. Rebase to master with
On 2020/1/8 22:54, Martin Liška wrote:
diff --git a/gcc/cgraphclones.c b/gcc/cgraphclones.c
index bd44063a1ac..789564ba335 100644
--- a/gcc/cgraphclones.c
+++ b/gcc/cgraphclones.c
@@ -1148,8 +1148,7 @@ symbol_table::materialize_all_clones (void)
if (symtab->dump_file)
On 2020/1/7 23:40, Jan Hubicka wrote:
>>
>>
>> On 2020/1/7 16:40, Jan Hubicka wrote:
On Mon, 2020-01-06 at 01:03 -0600, Xiong Hu Luo wrote:
> Inline should return failure either (newsize > param_large_function_insns)
> OR (newsize > limit). Sometimes newsize is larger than
> par
On 2020/1/7 16:40, Jan Hubicka wrote:
>> On Mon, 2020-01-06 at 01:03 -0600, Xiong Hu Luo wrote:
>>> Inline should return failure either (newsize > param_large_function_insns)
>>> OR (newsize > limit). Sometimes newsize is larger than
>>> param_large_function_insns, but smaller than limit, inlin
On 2020/1/7 02:01, Jeff Law wrote:
On Mon, 2020-01-06 at 01:03 -0600, Xiong Hu Luo wrote:
Inline should return failure either (newsize > param_large_function_insns)
OR (newsize > limit). Sometimes newsize is larger than
param_large_function_insns, but smaller than limit, inline doesn't return
f
gcc_checking_assert (src_val);
}
}
XiongHu
Feng
____
From: luoxhu
Sent: Monday, December 30, 2019 4:11 PM
To: Jan Hubicka; Martin Jambor
Cc: Martin Liška; gcc-patches@gcc.gnu.org; seg...@kernel.crashing.org;
wschm...@linux.ibm.com; g
v2 Changes:
1. Enable proportion orig_sum to the new nodes for self recursive node:
new_sum = (orig_sum + new_sum) \
* self_recursive_probability * (1 / param_ipa_cp_max_recursive_depth).
2. Add value range for param_ipa_cp_max_recursive_depth.
The performance of exchange2 built with PGO wil
>>> profile_count indir_cnt = indirect->count;
>>> indirect = indirect->clone (id->dst_node, call_stmt,
>>> gimple_uid (stmt),
>>> num, den,
>>>
On 2019/12/18 23:48, Jan Hubicka wrote:
>> The size_info of ipa_size_summary are created by r277424. It should be
>> duplicated for cloned nodes, otherwise self_size and
>> estimated_self_stack_size
>> would be 0, causing param large-function-insns and large-function-growth
>> working
>> inaccur
Ping :)
Patch is here:
https://gcc.gnu.org/ml/gcc-patches/2019-12/msg00099.html
On 2019/12/3 10:31, luoxhu wrote:
Hi Martin and Honza,
On 2019/11/18 21:02, Martin Liška wrote:
On 11/16/19 10:59 AM, luoxhu wrote:
Sorry that I don't quite understand your meaning here. I didn'
Thanks Honza,
On 2019/12/10 19:06, Jan Hubicka wrote:
>> Hi,
>>
>> On Tue, Dec 10 2019, Jan Hubicka wrote:
>>> Hi,
>>> I think the updating should treat self recursive edges as loops: that is
>>> calculate the SUM of counts of incoming edges which are not self recursive,
>>> calculate probability of sel
Hi Martin and Honza,
On 2019/11/18 21:02, Martin Liška wrote:
> On 11/16/19 10:59 AM, luoxhu wrote:
>> Sorry that I don't quite understand your meaning here. I didn't grep the
>> word "cgraph_edge_summary" in source code, do you mean add new structure
On 2019/11/4 11:42, luoxhu wrote:
On 2019/11/2 00:23, Joseph Myers wrote:
On Thu, 31 Oct 2019, Xiong Hu Luo wrote:
+@code{-finline} enables inlining of function declared \"inline\".
+@code{-finline} is enabled at levels -O1, -O2, -O3 and -Os, but not -Og.
Use @option{} to mark
Thanks,
On 2019/11/26 18:15, Jan Hubicka wrote:
>> Hi,
>>
>> On 2019/11/26 16:04, Jan Hubicka wrote:
Summary variables should be deleted at the end of write_summary.
It's first newed in generate_summary, and second newed in read_summary.
Therefore, delete the first in write_summary,
Hi,
On 2019/11/26 16:04, Jan Hubicka wrote:
Summary variables should be deleted at the end of write_summary.
It's first newed in generate_summary, and second newed in read_summary.
Therefore, delete the first in write_summary, delete the second in
execute.
gcc/ChangeLog:
2019-11-26 Lu
Summary variables should be deleted at the end of write_summary.
It's first newed in generate_summary, and second newed in read_summary.
Therefore, delete the first in write_summary, delete the second in
execute.
gcc/ChangeLog:
2019-11-26 Luo Xiong Hu
* ipa-pure-const.c (pure_
Hi,
>> +++ b/gcc/testsuite/gcc.target/powerpc/pr72804-1.c
>
>> +/* store generates difference instructions as below:
>> + P9: mtvsrdd;xxlnot;stxv.
>> + P8/P7/P6 LE: not;not;std;std.
>> + P8 BE: mtvsrd;mtvsrd;xxpermdi;xxlnor;stxvd2x.
>> + P7/P6 BE: std;std;addi;lxvd2x;xxlnor;stxvd2x. */
>
Hi Segher,
Update the code as you wish, Thanks:
P9LE generated instruction is not worse than P8LE.
mtvsrdd;xxlnot;stxv vs. not;not;std;std.
Update the test case to fix failures.
v4:
Define and use check_effective_target_xxx etc.
power9+: power9, power10 ...
power8: power8 only.
gcc/testsuite/Cha
P9LE generated instruction is not worse than P8LE.
mtvsrdd;xxlnot;stxv vs. not;not;std;std.
Update the test case to fix failures.
v3:
Define and use check_effective_target_xxx etc.
pre_power8: ... power6, power7.
power8: power8 only.
post_power8: power8, power9 ...
post_power9: power9, power10 ...
Hi,
On 2019/11/15 18:17, Segher Boessenkool wrote:
> Hi!
>
> On Thu, Nov 14, 2019 at 09:12:32PM -0600, Xiong Hu Luo wrote:
>> P9LE generated instruction is not worse than P8LE.
>> mtvsrdd;xxlnot;stxv vs. not;not;std;std.
>> Update the test case to fix failures.
>
> So this no longer runs it for
Hi Thanks,
On 2019/11/14 17:04, Jan Hubicka wrote:
>> PR ipa/69678
>> * cgraph.c (symbol_table::create_edge): Init speculative_id.
>> (cgraph_edge::make_speculative): Add param for setting speculative_id.
>> (cgraph_edge::speculative_call_info): Find reference by
>> specul
On 2019/11/15 17:19, Jan Hubicka wrote:
>> On Fri, Nov 15, 2019 at 9:10 AM Jan Hubicka wrote:
>>>
next is initialized only in the loop before, it is never updated
in its own loop.
gcc/ChangeLog
2019-11-15 Xiong Hu Luo
* ipa-inline.c (inl
On 2019/11/15 11:12, Xiong Hu Luo wrote:
P9LE generated instruction is not worse than P8LE.
mtvsrdd;xxlnot;stxv vs. not;not;std;std.
Update the test case to fix failures.
gcc/testsuite/ChangeLog:
2019-11-15 Luo Xiong Hu
testsuite/pr92398
* gcc.target/powerpc/pr7280
Rebase to trunk including void gimple_ic_transform.
This patch aims to fix PR69678 caused by PGO indirect call profiling
performance issues.
The bug that profiling data is never working was fixed by Martin's pull
back of topN patches, performance got GEOMEAN ~1% improvement(+24% for
511.povray_r
Tested pass and committed to r277904.
gcc/testsuite/ChangeLog:
2019-11-07 Xiong Hu Luo
* gcc.target/powerpc/pr72804.c: Move inline options from
dg-require-effective-target to dg-options.
---
gcc/testsuite/gcc.target/powerpc/pr72804.c | 4 ++--
1 file changed, 2 inser
On 2019/11/6 02:20, Joseph Myers wrote:
> On Tue, 5 Nov 2019, Kewen.Lin wrote:
>
>> Very good point! Since gcc doesn't pursue 100% testsuite pass rate, I
>> noticed
>> there are a few failures exposed/caused by some PRs all the time. Could we
>> just leave the test case there without any pre wo
On 2019/10/22 22:07, Martin Liška wrote:
On 9/27/19 9:13 AM, luoxhu wrote:
Thanks for your time over so many rounds of review.
You're welcome. One last request would be please to make
gimple_ic_transform a void function. See attached patch.
I'll remind the patch today to Honza
Hi,
On 2019/11/5 06:57, Joseph Myers wrote:
> On Mon, 4 Nov 2019, luoxhu wrote:
>
>> -finline-functions is enabled by default for O2 since r276469, update the
>> test cases with -fno-inline-functions.
>>
>> v2: disable inlining for the failed cases. Add two more fa
On 2019/11/2 00:23, Joseph Myers wrote:
> On Thu, 31 Oct 2019, Xiong Hu Luo wrote:
>
>> +@code{-finline} enables inlining of function declared \"inline\".
>> +@code{-finline} is enabled at levels -O1, -O2, -O3 and -Os, but not -Og.
>
> Use @option{} to mark up option names (both -finline and all
-finline-functions is enabled by default for O2 since r276469, update the
test cases with -fno-inline-functions.
v2: disable inlining for the failed cases. Add two more failed cases
not listed in BZ. Tested on P8LE, P8BE and P9LE.
gcc/testsuite/ChangeLog:
2019-10-30 Xiong Hu Luo
Hi,
On 2019/10/17 16:23, Feng Xue OS wrote:
> IPA does not allow constant propagation on parameter that is used to control
> function recursion.
>
> recur_fn (i)
> {
>if ( !terminate_recursion (i))
> {
>...
>recur_fn (i + 1);
>...
> }
>...
> }
>
> This
Hi Feng,
Thanks for the patch. It works for me as expected.
I am not a reviewer, just tiny comment after tried.
This is quite a good case for newbies to go through the ipa-cp pass.
Is it necessary to update the test case a bit as attached to include more
circumstances for callee's aggregate in
Hi Feng,
On 2019/10/17 16:23, Feng Xue OS wrote:
> IPA does not allow constant propagation on parameter that is used to control
> function recursion.
>
> recur_fn (i)
> {
>if ( !terminate_recursion (i))
> {
>...
>recur_fn (i + 1);
>...
> }
>...
> }
>
> T
Ping:
Attachment: v5-0001-Missed-function-specialization-partial-devirtuali.patch:
https://gcc.gnu.org/ml/gcc-patches/2019-09/txtuTT17jV7n5.txt
Thanks,
Xiong Hu
On 2019/9/27 15:13, luoxhu wrote:
Hi Martin,
Thanks for your time over so many rounds of review.
It really helped me a lot
On 2019/10/14 00:32, Jeff Law wrote:
> On 10/8/19 4:45 AM, Martin Jambor wrote:
>> Hi,
>>
>> On Tue, Oct 08 2019, luoxhu wrote:
>>> '}' is missing at the end.
>>
>> heh, yeah, I wonder for how long.
>>
>> If it irritates you, I
'}' is missing at the end.
gcc/ChangeLog:
tree-sra.c (dump_access): Add missing braces.
---
gcc/tree-sra.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/gcc/tree-sra.c b/gcc/tree-sra.c
index 48589323a1e..cb59b91f20e 100644
--- a/gcc/tree-sra.c
+++ b/gcc/tree-sra.c
Hi,
This is the formal documentation patch for IPA passes. Thanks.
None of the IPA passes are documented in passes.texi. This patch adds
a section IPA passes just before GIMPLE passes and RTL passes in
Chapter 9 "Passes and Files of the Compiler". Also, a short description
for each IPA pass i
Hi Segher,
On 2019/9/30 00:17, Segher Boessenkool wrote:
> Hi!
>
> Just some editorial comments... The idea of the patch is fine IMHO.
> (I am not maintainer of this, take all my comments for what they are).
>
> On Sun, Sep 29, 2019 at 02:56:37AM -0500, Xiong Hu Luo wrote:
>> To simplify deve
Hi Martin,
Thanks for your time over so many rounds of review.
It really helped me a lot.
Updated with your comments and attached for Honza's review and approval. :)
Xiong Hu
BR
On 2019/9/26 16:36, Martin Liška wrote:
On 9/26/19 7:23 AM, luoxhu wrote:
Thanks Martin,
On 2019/9/25
Thanks Martin,
On 2019/9/25 18:57, Martin Liška wrote:
On 9/25/19 5:45 AM, luoxhu wrote:
Hi,
Sorry for replying so late due to cauldron conference and other LTO issues
I was working on.
Hello.
That's fine, we still have plenty of time for patch review.
Not fixed issues which I rep
Hi,
Sorry for replying so late due to cauldron conference and other LTO issues
I was working on.
v4 Changes:
1. Rebase to trunk.
2. Remove num_of_ics and use vector's length to avoid redundancy.
3. Update the code in ipa-profile.c to improve review feasibility.
4. Add function has_indirect_ca
This is the backport patch to gcc-9-branch, please ignore the previous
mail.
Backport r274411 of "Enable math functions linking with static library
for LTO" from mainline to gcc-9-branch.
Bootstrapped/Regression-tested on Linux POWER8 LE.
gcc/ChangeLog
2019-08-26 Xiong Hu Luo
Backpo