Re: Ping: [PATCH V4] Extend IPA-CP to support arithmetically-computed value-passing on by-ref argument (PR ipa/91682)

2019-10-23 Thread luoxhu
Hi Feng, Thanks for the patch. It works for me as expected. I am not a reviewer, just tiny comment after tried. This is quite a good case for newbies to go through the ipa-cp pass. Is it necessary to update the test case a bit as attached to include more circumstances for callee's aggregate in

Re: [PATCH] Support multi-versioning on self-recursive function (ipa/92133)

2019-10-23 Thread luoxhu
Hi, On 2019/10/17 16:23, Feng Xue OS wrote: > IPA does not allow constant propagation on parameter that is used to control > function recursion. > > recur_fn (i) > { >if ( !terminate_recursion (i)) > { >... >recur_fn (i + 1); >... > } >... > } > > This
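
For readers who want to experiment with the pattern quoted above, the pseudocode can be fleshed out into compilable C roughly as follows. This is only a sketch: the termination bound of 8 and the returned call count are assumptions made for illustration and do not come from the patch or its test case.

```c
/* Hypothetical termination test; the bound 8 is an assumption.  */
static int
terminate_recursion (int i)
{
  return i >= 8;
}

/* Self-recursive function whose parameter i controls the recursion,
   mirroring the shape of recur_fn in the quoted message.  Returns the
   number of calls made, counting the current one.  */
int
recur_fn (int i)
{
  int calls = 1;
  if (!terminate_recursion (i))
    calls += recur_fn (i + 1);
  return calls;
}
```

Called as recur_fn (0), every level passes a known constant (i + 1) to the next level, which is the situation the thread discusses.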

[PATCH v2] PR92090: Fix testcase failures by r276469

2019-11-03 Thread luoxhu
-finline-functions is enabled by default for O2 since r276469, update the test cases with -fno-inline-functions. v2: disable inlining for the failed cases. Add two more failed cases not listed in BZ. Tested on P8LE, P8BE and P9LE. gcc/testsuite/ChangeLog: 2019-10-30 Xiong Hu Luo

Re: [PATCH] Add explicit description for -finline

2019-11-03 Thread luoxhu
On 2019/11/2 00:23, Joseph Myers wrote: > On Thu, 31 Oct 2019, Xiong Hu Luo wrote: > >> +@code{-finline} enables inlining of function declared "inline". >> +@code{-finline} is enabled at levels -O1, -O2, -O3 and -Os, but not -Og. > > Use @option{} to mark up option names (both -finline and all

Re: [PATCH v3] PR92090: Fix testcase failures by r276469

2019-11-04 Thread luoxhu
Hi, On 2019/11/5 06:57, Joseph Myers wrote: > On Mon, 4 Nov 2019, luoxhu wrote: > >> -finline-functions is enabled by default for O2 since r276469, update the >> test cases with -fno-inline-functions. >> >> v2: disable inlining for the failed cases. Add two more fa

Ping: [PATCH v5] Missed function specialization + partial devirtualization

2019-11-05 Thread luoxhu
On 2019/10/22 22:07, Martin Liška wrote: On 9/27/19 9:13 AM, luoxhu wrote: Thanks for your time of so many round of reviews. You're welcome. One last request would be please to make gimple_ic_transform a void function. See attached patch. I'll remind the patch today to Honza

Re: [PATCH v2] PR92090: Fix testcase failures by r276469

2019-11-05 Thread luoxhu
On 2019/11/6 02:20, Joseph Myers wrote: > On Tue, 5 Nov 2019, Kewen.Lin wrote: > >> Very good point! Since gcc doesn't pursue 100% testsuite pass rate, I >> noticed >> there are a few failures exposed/caused by some PRs all the time. Could we >> just leave the test case there without any pre wo

[PATCH] Fix copy-paste typo syntax error by r277872

2019-11-06 Thread luoxhu
Tested pass and committed to r277904. gcc/testsuite/ChangeLog: 2019-11-07 Xiong Hu Luo * gcc.target/powerpc/pr72804.c: Move inline options from dg-require-effective-target to dg-options. --- gcc/testsuite/gcc.target/powerpc/pr72804.c | 4 ++-- 1 file changed, 2 inser

Re: [PATCH v7] Missed function specialization + partial devirtualization

2020-01-12 Thread luoxhu
On 2020/1/10 19:08, Jan Hubicka wrote: > OK. You will need to do the obvious updates for Martin's patch > which turned some member functions into static functions. > > Honza Thanks a lot! Rebased & updated, will commit below patch shortly when git push is ready. v8: 1. Rebase to master with

Re: [PATCH] Add Optimization for various IPA parameters.

2020-01-13 Thread luoxhu
On 2020/1/11 20:20, Tamar Christina wrote: Hi Martin, This change (r280099) is causing a major performance regression on exchange2 in SPEC2017 dropping the benchmark by more than 30%. It seems the parameters no longer do anything. i.e. -flto --param ipa-cp-eval-threshold=1 --param ipa-cp-u

Ping^1: [PATCH v3] ipa-cp: Fix PGO regression caused by r278808

2020-02-09 Thread luoxhu
Ping, attachment of https://gcc.gnu.org/ml/gcc-patches/2020-01/msg00764/exchange2.tar.gz shows the profile count difference on cloned nodes digits_2.constprop.[0...8] without/with this patch. Thanks! Xiong Hu On 2020/1/14 14:45, luoxhu wrote: > Hi, > > On 2020/1/3 00:58, Jan Hubi

Re: [RFC] ipa-cp: Fix PGO regression caused by r278808

2019-12-13 Thread luoxhu
Thanks Honza, On 2019/12/10 19:06, Jan Hubicka wrote: >> Hi, >> >> On Tue, Dec 10 2019, Jan Hubicka wrote: >>> Hi, >>> I think the updating should treat self recursive edges as loops: that is >>> calculate SUM of counts incoming edges which are not self recursive, >>> calculate probability of sel

*Ping* Re: [PATCH v6] Missed function specialization + partial devirtualization

2019-12-17 Thread luoxhu
Ping :) Patch is here: https://gcc.gnu.org/ml/gcc-patches/2019-12/msg00099.html On 2019/12/3 10:31, luoxhu wrote: Hi Martin and Honza, On 2019/11/18 21:02, Martin Liška wrote: On 11/16/19 10:59 AM, luoxhu wrote: Sorry that I don't quite understand your meaning here. I didn'

Re: [PATCH] [RFC] ipa: duplicate ipa_size_summary for cloned nodes

2019-12-18 Thread luoxhu
On 2019/12/18 23:48, Jan Hubicka wrote: >> The size_info of ipa_size_summary are created by r277424. It should be >> duplicated for cloned nodes, otherwise self_size and >> estimated_self_stack_size >> would be 0, causing param large-function-insns and large-function-growth >> working >> inaccur

[PATCH v7] Missed function specialization + partial devirtualization

2019-12-25 Thread luoxhu
>>> profile_count indir_cnt = indirect->count; >>> indirect = indirect->clone (id->dst_node, call_stmt, >>> gimple_uid (stmt), >>> num, den, >>>

[PATCH v2] ipa-cp: Fix PGO regression caused by r278808

2019-12-30 Thread luoxhu
v2 Changes: 1. Enable proportion orig_sum to the new nodes for self recursive node: new_sum = (orig_sum + new_sum) * self_recursive_probability * (1 / param_ipa_cp_max_recursive_depth). 2. Add value range for param_ipa_cp_max_recursive_depth. The performance of exchange2 built with PGO wil
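
The scaling rule stated in item 1 can be illustrated with a small C sketch. It only reproduces the arithmetic from the message: the function name scaled_new_sum and the use of plain doubles are assumptions, and GCC's real profile_count type is not modelled here.

```c
/* Hypothetical helper illustrating the v2 formula from the message:
   new_sum = (orig_sum + new_sum) * self_recursive_probability
             * (1 / param_ipa_cp_max_recursive_depth).
   Plain doubles stand in for GCC's profile_count; this is not patch code.  */
double
scaled_new_sum (double orig_sum, double new_sum,
                double self_recursive_probability,
                int max_recursive_depth)
{
  /* The message adds a value range for the parameter; guard against a
     non-positive depth here so the division is always defined.  */
  if (max_recursive_depth < 1)
    max_recursive_depth = 1;

  return (orig_sum + new_sum) * self_recursive_probability
         / (double) max_recursive_depth;
}
```

For example, with orig_sum = 600, new_sum = 200, a self-recursive probability of 0.5 and a maximum depth of 8, the sketch yields 800 * 0.5 / 8 = 50.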

Re: [PATCH v2] ipa-cp: Fix PGO regression caused by r278808

2019-12-30 Thread luoxhu
gcc_checking_assert (src_val); } } XiongHu Feng ____ From: luoxhu Sent: Monday, December 30, 2019 4:11 PM To: Jan Hubicka; Martin Jambor Cc: Martin Liška; gcc-patches@gcc.gnu.org; seg...@kernel.crashing.org; wschm...@linux.ibm.com; g

Re: [PATCH] ipa-inline: Adjust condition for caller_growth_limits

2020-01-06 Thread luoxhu
On 2020/1/7 02:01, Jeff Law wrote: On Mon, 2020-01-06 at 01:03 -0600, Xiong Hu Luo wrote: Inline should return failure either (newsize > param_large_function_insns) OR (newsize > limit). Sometimes newsize is larger than param_large_function_insns, but smaller than limit, inline doesn't return f
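
The condition as worded in the message (fail when newsize exceeds either bound) can be written as a small predicate. This is a sketch of that wording only: the function name caller_growth_exceeded is hypothetical, the parameter names simply mirror the message, and it is not the code in ipa-inline.c.

```c
#include <stdbool.h>

/* Hypothetical predicate for the condition as worded in the message:
   report failure when newsize exceeds either bound.  */
bool
caller_growth_exceeded (int newsize, int param_large_function_insns,
                        int limit)
{
  return newsize > param_large_function_insns || newsize > limit;
}
```

With this form, a newsize that is above param_large_function_insns but still below limit is reported as a failure, which is the case the message singles out.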

Re: [PATCH] ipa-inline: Adjust condition for caller_growth_limits

2020-01-07 Thread luoxhu
On 2020/1/7 16:40, Jan Hubicka wrote: >> On Mon, 2020-01-06 at 01:03 -0600, Xiong Hu Luo wrote: >>> Inline should return failure either (newsize > param_large_function_insns) >>> OR (newsize > limit). Sometimes newsize is larger than >>> param_large_function_insns, but smaller than limit, inlin

Re: [PATCH] ipa-inline: Adjust condition for caller_growth_limits

2020-01-07 Thread luoxhu
On 2020/1/7 23:40, Jan Hubicka wrote: >> >> >> On 2020/1/7 16:40, Jan Hubicka wrote: On Mon, 2020-01-06 at 01:03 -0600, Xiong Hu Luo wrote: > Inline should return failure either (newsize > param_large_function_insns) > OR (newsize > limit). Sometimes newsize is larger than > par

Re: [PATCH] Use cgraph_node::dump_{asm_},name where possible.

2020-01-08 Thread luoxhu
On 2020/1/8 22:54, Martin Liška wrote: diff --git a/gcc/cgraphclones.c b/gcc/cgraphclones.c index bd44063a1ac..789564ba335 100644 --- a/gcc/cgraphclones.c +++ b/gcc/cgraphclones.c @@ -1148,8 +1148,7 @@ symbol_table::materialize_all_clones (void) if (symtab->dump_file)

Re: [RFC] Run store-merging pass once more before pass fre/pre

2020-02-26 Thread luoxhu
On 2020/2/18 17:57, Richard Biener wrote: > On Tue, 18 Feb 2020, Xionghu Luo wrote: > >> Store-merging pass should run twice, the reason is pass fre/pre will do >> some kind of optimizations to instructions by: >>1. Converting the load from address to load from function arguments >>(store_

*Ping^1* [PATCH v3] ipa-cp: Fix PGO regression caused by r278808

2020-03-03 Thread luoxhu
341 Author: hubicka Date: Thu Nov 28 14:16:29 2019 + * ipa-cp.c (update_profiling_info): Fix scaling. Fix v3 patch and logs are here: https://gcc.gnu.org/ml/gcc-patches/2020-01/msg00764.html Thanks Xionghu On 2020/1/14 14:45, luoxhu wrote: > Hi, > > On 2020/1/3 00:58

[PATCH] Backport to gcc-9: PR92398: Fix testcase failure of pr72804.c

2020-03-05 Thread luoxhu
From: Xionghu Luo Backport the patch to fix failures on P9 and P8BE, P7LE for PR94036. Tested pass on P9/P8/P7, ok to commit? (gcc-8 is not needed as the test doesn't exists.) P9LE generated instruction is not worse than P8LE. mtvsrdd;xxlnot;stxv vs. not;not;std;std. It can have longer latency,

Re: [PATCH] Backport to gcc-9: PR92398: Fix testcase failure of pr72804.c

2020-03-09 Thread luoxhu
On 2020/3/10 05:28, Segher Boessenkool wrote: On Thu, Mar 05, 2020 at 02:21:58AM -0600, luo...@linux.ibm.com wrote: From: Xionghu Luo Backport the patch to fix failures on P9 and P8BE, P7LE for PR94036. No changes were needed? Yes, no conflicts of the patch and instruction counts are sam

[PATCH v4] Missed function specialization + partial devirtualization

2019-09-24 Thread luoxhu
Hi, Sorry for replying so late due to cauldron conference and other LTO issues I was working on. v4 Changes: 1. Rebase to trunk. 2. Remove num_of_ics and use vector's length to avoid redundancy. 3. Update the code in ipa-profile.c to improve review feasibility. 4. Add function has_indirect_ca

Re: [PATCH v4] Missed function specialization + partial devirtualization

2019-09-25 Thread luoxhu
Thanks Martin, On 2019/9/25 18:57, Martin Liška wrote: On 9/25/19 5:45 AM, luoxhu wrote: Hi, Sorry for replying so late due to cauldron conference and other LTO issues I was working on. Hello. That's fine, we still have plenty of time for patch review. Not fixed issues which I rep

Re: [PATCH v5] Missed function specialization + partial devirtualization

2019-09-27 Thread luoxhu
Hi Martin, Thanks for your time of so many round of reviews. It really helped me a lot. Updated with your comments and attached for Honza's review and approve. :) Xiong Hu BR On 2019/9/26 16:36, Martin Liška wrote: On 9/26/19 7:23 AM, luoxhu wrote: Thanks Martin, On 2019/9/25

Re: [RFC] Come up with ipa passes introduction in gccint documentation

2019-09-29 Thread luoxhu
Hi Segher, On 2019/9/30 00:17, Segher Boessenkool wrote: > Hi! > > Just some editorial comments... The idea of the patch is fine IMHO. > (I am not maintainer of this, take all my comments for what they are). > > On Sun, Sep 29, 2019 at 02:56:37AM -0500, Xiong Hu Luo wrote: >> To simplify deve

Re: [PATCH] Come up with ipa passes introduction in gccint documentation

2019-10-08 Thread luoxhu
Hi, This is the formal documentation patch for IPA passes. Thanks. None of the IPA passes are documented in passes.texi. This patch adds a section IPA passes just before GIMPLE passes and RTL passes in Chapter 9 "Passes and Files of the Compiler". Also, a short description for each IPA pass i

[PATCH] Fix dump message issue

2019-10-08 Thread luoxhu
'}' is missed at the end. gcc/ChangeLog: tree-sra.c (dump_access): Add missing braces. --- gcc/tree-sra.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/gcc/tree-sra.c b/gcc/tree-sra.c index 48589323a1e..cb59b91f20e 100644 --- a/gcc/tree-sra.c +++ b/gcc/tree-sra.c

Re: [PATCH] Fix dump message issue

2019-10-13 Thread luoxhu
On 2019/10/14 00:32, Jeff Law wrote: > On 10/8/19 4:45 AM, Martin Jambor wrote: >> Hi, >> >> On Tue, Oct 08 2019, luoxhu wrote: >>> '}' is missed at the end. >> >> heh, yeah, I wonder for how long. >> >> If it irritates you, I

Ping: [PATCH v5] Missed function specialization + partial devirtualization

2019-10-15 Thread luoxhu
Ping: Attachment: v5-0001-Missed-function-specialization-partial-devirtuali.patch: https://gcc.gnu.org/ml/gcc-patches/2019-09/txtuTT17jV7n5.txt Thanks, Xiong Hu On 2019/9/27 15:13, luoxhu wrote: Hi Martin, Thanks for your time of so many round of reviews. It really helped me a lot

Re: [PATCH] Support multi-versioning on self-recursive function (ipa/92133)

2019-10-17 Thread luoxhu
Hi Feng, On 2019/10/17 16:23, Feng Xue OS wrote: > IPA does not allow constant propagation on parameter that is used to control > function recursion. > > recur_fn (i) > { >if ( !terminate_recursion (i)) > { >... >recur_fn (i + 1); >... > } >... > } > > T

Ping*2: [PATCH v5] Missed function specialization + partial devirtualization

2019-11-13 Thread luoxhu
Rebase to trunk including void gimple_ic_transform. This patch aims to fix PR69678 caused by PGO indirect call profiling performance issues. The bug that profiling data is never working was fixed by Martin's pull back of topN patches, performance got GEOMEAN ~1% improvement(+24% for 511.povray_r

Re: [PATCH] PR92398: Fix testcase failure of pr72804.c

2019-11-14 Thread luoxhu
On 2019/11/15 11:12, Xiong Hu Luo wrote: P9LE generated instruction is not worse than P8LE. mtvsrdd;xxlnot;stxv vs. not;not;std;std. Update the test case to fix failures. gcc/testsuite/ChangeLog: 2019-11-15 Luo Xiong Hu testsuite/pr92398 * gcc.target/powerpc/pr7280

Re: [PATCH 1/2] Update iterator of next

2019-11-15 Thread luoxhu
On 2019/11/15 17:19, Jan Hubicka wrote: >> On Fri, Nov 15, 2019 at 9:10 AM Jan Hubicka wrote: >>> next is initialized only in the loop before, it is never updated in it's own loop. gcc/ChangeLog 2019-11-15 Xiong Hu Luo * ipa-inline.c (inl

Re: Ping*2: [PATCH v5] Missed function specialization + partial devirtualization

2019-11-16 Thread luoxhu
Hi Thanks, On 2019/11/14 17:04, Jan Hubicka wrote: >> PR ipa/69678 >> * cgraph.c (symbol_table::create_edge): Init speculative_id. >> (cgraph_edge::make_speculative): Add param for setting speculative_id. >> (cgraph_edge::speculative_call_info): Find reference by >> specul

Re: [PATCH v2] PR92398: Fix testcase failure of pr72804.c

2019-11-18 Thread luoxhu
Hi, On 2019/11/15 18:17, Segher Boessenkool wrote: > Hi! > > On Thu, Nov 14, 2019 at 09:12:32PM -0600, Xiong Hu Luo wrote: >> P9LE generated instruction is not worse than P8LE. >> mtvsrdd;xxlnot;stxv vs. not;not;std;std. >> Update the test case to fix failures. > > So this no longer runs it for

Re: [PATCH v3] PR92398: Fix testcase failure of pr72804.c

2019-11-19 Thread luoxhu
P9LE generated instruction is not worse than P8LE. mtvsrdd;xxlnot;stxv vs. not;not;std;std. Update the test case to fix failures. v3: Define and use check_effective_target_xxx etc. pre_power8: ... power6, power7. power8: power8 only. post_power8: power8, power9 ... post_power9: power9, power10 ...

Re: [PATCH v3] PR92398: Fix testcase failure of pr72804.c

2019-11-21 Thread luoxhu
Hi Segher, Update the code as you wish, Thanks: P9LE generated instruction is not worse than P8LE. mtvsrdd;xxlnot;stxv vs. not;not;std;std. Update the test case to fix failures. v4: Define and use check_effective_target_xxx etc. power9+: power9, power10 ... power8: power8 only. gcc/testsuite/Cha

Re: [PATCH v3] PR92398: Fix testcase failure of pr72804.c

2019-11-24 Thread luoxhu
Hi, >> +++ b/gcc/testsuite/gcc.target/powerpc/pr72804-1.c > >> +/* store generates difference instructions as below: >> + P9: mtvsrdd;xxlnot;stxv. >> + P8/P7/P6 LE: not;not;std;std. >> + P8 BE: mtvsrd;mtvsrd;xxpermdi;xxlnor;stxvd2x. >> + P7/P6 BE: std;std;addi;lxvd2x;xxlnor;stxvd2x. */ >

[PATCH] Fix two potential memory leak

2019-11-25 Thread luoxhu
Summary variables should be deleted at the end of write_summary. It's first newed in generate_summary, and second newed in read_summary. Therefore, delete the first in write_summary, delete the second in execute. gcc/ChangeLog: 2019-11-26 Luo Xiong Hu * ipa-pure-const.c (pure_

Re: [PATCH] Fix two potential memory leak

2019-11-26 Thread luoxhu
Hi, On 2019/11/26 16:04, Jan Hubicka wrote: Summary variables should be deleted at the end of write_summary. It's first newed in generate_summary, and second newed in read_summary. Therefore, delete the first in write_summary, delete the second in execute. gcc/ChangeLog: 2019-11-26 Lu

Re: [PATCH] Fix two potential memory leak

2019-11-26 Thread luoxhu
Thanks, On 2019/11/26 18:15, Jan Hubicka wrote: >> Hi, >> >> On 2019/11/26 16:04, Jan Hubicka wrote: Summary variables should be deleted at the end of write_summary. It's first newed in generate_summary, and second newed in read_summary. Therefore, delete the first in write_summary,

Ping: [PATCH] Add explicit description for -finline

2019-11-27 Thread luoxhu
On 2019/11/4 11:42, luoxhu wrote: On 2019/11/2 00:23, Joseph Myers wrote: On Thu, 31 Oct 2019, Xiong Hu Luo wrote: +@code{-finline} enables inlining of function declared "inline". +@code{-finline} is enabled at levels -O1, -O2, -O3 and -Os, but not -Og. Use @option{} to mark

[PATCH v6] Missed function specialization + partial devirtualization

2019-12-02 Thread luoxhu
Hi Martin and Honza, On 2019/11/18 21:02, Martin Liška wrote: > On 11/16/19 10:59 AM, luoxhu wrote: >> Sorry that I don't quite understand your meaning here. I didn't grep the >> word "cgraph_edge_summary" in source code, do you mean add new structure

Re: [PATCH] [RFC, PGO+LTO] Missed function specialization + partial devirtualization

2019-06-18 Thread luoxhu
Hi, On 2019/6/18 13:51, Martin Liška wrote: On 6/18/19 3:45 AM, Xiong Hu Luo wrote: Hello. Thank you for the interest in the area. This patch aims to fix PR69678 caused by PGO indirect call profiling bugs. Currently the default instrument function can only find the indirect function that cal

Re: [PATCH] [RFC, PGO+LTO] Missed function specialization + partial devirtualization

2019-06-18 Thread luoxhu
Hi Martin, On 2019/6/18 17:34, Martin Liška wrote: On 6/18/19 11:02 AM, luoxhu wrote: Hi, On 2019/6/18 13:51, Martin Liška wrote: On 6/18/19 3:45 AM, Xiong Hu Luo wrote: Hello. Thank you for the interest in the area. This patch aims to fix PR69678 caused by PGO indirect call profiling

Re: [PATCH] [RFC, PGO+LTO] Missed function specialization + partial devirtualization

2019-06-19 Thread luoxhu
Hi Martin, On 2019/6/18 18:21, Martin Liška wrote: On 6/18/19 3:45 AM, Xiong Hu Luo wrote: 6.2. SPEC2017 peakrate: 523.xalancbmk_r (+4.87%); 538.imagick_r (+4.59%); 511.povray_r (+13.33%); 525.x264_r (-5.29%). Can you please elaborate what are the key indirect call pr

Re: [PATCH] [RFC, PGO+LTO] Missed function specialization + partial devirtualization

2019-06-19 Thread luoxhu
On 2019/6/19 20:18, Martin Liška wrote: On 6/19/19 10:56 AM, Martin Liška wrote: Thank you very much for the numbers. Today, I'm going to prepare the generalization of single-value counter to track N values. Ok, here's a patch candidate that does tracking of most common N values. For your

Re: [PATCH] [RFC, PGO+LTO] Missed function specialization + partial devirtualization

2019-06-19 Thread luoxhu
Hi Martin, On 2019/6/20 09:59, luoxhu wrote: On 2019/6/19 20:18, Martin Liška wrote: On 6/19/19 10:56 AM, Martin Liška wrote: Thank you very much for the numbers. Today, I'm going to prepare the generalization of single-value counter to track N values. Ok, here's a patch cand

Re: [PATCH] [RFC, PGO+LTO] Missed function specialization + partial devirtualization

2019-06-23 Thread luoxhu
Hi Honza, Thanks very much to get so many useful comments from you. As a newbie to GCC, not sure whether my questions are described clearly enough. Thanks for your patience in advance. :) On 2019/6/20 21:47, Jan Hubicka wrote: Hi, some comments on the ipa part of the patch (and thanks for wor

Re: [PATCH] [RFC, PGO+LTO] Missed function specialization + partial devirtualization

2019-06-24 Thread luoxhu
On 2019/6/24 10:34, luoxhu wrote: Hi Honza, Thanks very much to get so many useful comments from you. As a newbie to GCC, not sure whether my questions are described clearly enough.  Thanks for your patience in advance.  :) On 2019/6/20 21:47, Jan Hubicka wrote: Hi, some comments on the

[PATCH v3] Missed function specialization + partial devirtualization

2019-07-30 Thread luoxhu
This patch aims to fix PR69678 caused by PGO indirect call profiling performance issues. The bug that profiling data is never working was fixed by Martin's pull back of topN patches, performance got GEOMEAN ~1% improvement. Still, currently the default profile only generates SINGLE indirect target

[Patch v2] Enable math functions linking with static library for LTO

2019-08-11 Thread luoxhu
Hi Richard, Thanks for your comments, updated the v2 patch as below: 1. Define and use builtin_with_linkage_p. 2. Add comments. 3. Add a testcase. In LTO mode, if static library and dynamic library contains same function and both libraries are passed as arguments, linker will link the function in

Re: [Patch v2] Enable math functions linking with static library for LTO

2019-08-12 Thread luoxhu
Hi Richard, On 2019/8/12 16:51, Richard Biener wrote: On Mon, Aug 12, 2019 at 8:50 AM luoxhu wrote: Hi Richard, Thanks for your comments, updated the v2 patch as below: 1. Define and use builtin_with_linkage_p. 2. Add comments. 3. Add a testcase. In LTO mode, if static library and dynamic

Re: [Patch v2] Enable math functions linking with static library for LTO

2019-08-12 Thread luoxhu
On 2019/8/13 10:22, luoxhu wrote: diff --git a/gcc/testsuite/gcc.dg/pr91287.c b/gcc/testsuite/gcc.dg/pr91287.c new file mode 100644 index 000..c816e0537aa --- /dev/null +++ b/gcc/testsuite/gcc.dg/pr91287.c @@ -0,0 +1,40 @@ +/* { dg-do assemble } */ +/* { dg-options "-O2"

Re: [PATCH] Add MD Function type check for builtin_md vectorize

2019-08-21 Thread luoxhu
On 2019/8/21 15:40, Richard Biener wrote: On Tue, 20 Aug 2019, Xiong Hu Luo wrote: The DECL_MD_FUNCTION_CODE added in r274404(PR 91421) by rsandifo requires that DECL to be a BUILTIN_IN_MD class built-in, asserts will happen when lto as the patch r274411(PR 91287) outputs some math function sym

Re: [Patch v2] Enable math functions linking with static library for LTO

2019-08-22 Thread luoxhu
Hi Richard, On 2019/8/13 17:10, Richard Biener wrote: On Tue, Aug 13, 2019 at 4:22 AM luoxhu wrote: Hi Richard, On 2019/8/12 16:51, Richard Biener wrote: On Mon, Aug 12, 2019 at 8:50 AM luoxhu wrote: Hi Richard, Thanks for your comments, updated the v2 patch as below: 1. Define and use

[PATCH] Backport r274411 from trunk to gcc-9-branch

2019-08-25 Thread luoxhu
This is the backport patch to gcc-9-branch, please ignore the previous mail. Backport r274411 of "Enable math functions linking with static library for LTO" from mainline to gcc-9-branch. Bootstrapped/Regression-tested on Linux POWER8 LE. gcc/ChangeLog 2019-08-26 Xiong Hu Luo Backpo

Re: [PATCH v3] Generalize get_most_common_single_value to return k_th value & count

2019-07-15 Thread luoxhu
Currently get_most_common_single_value could only return the max hist , add qsort to enable this function return nth value. Rename it to get_nth_most_common_value. v3 Changes: 1. Move sort to profile.c after loading values from disk. Simplify get_nth_most_common_value. 2. Make qsort stable

Re: [PATCH v4] Generalize get_most_common_single_value to return k_th value & count

2019-07-16 Thread luoxhu
Currently get_most_common_single_value could only return the max hist , add sort after reading from disk, then it return nth value in later use. Rename it to get_nth_most_common_value. Hi Martin, Thanks for your review, v4 Changes as below: 1. Use decrease bubble sort. BTW, I have a question abo
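
The idea described above (sort the recorded values by decreasing count once, then read off the n-th entry) can be sketched in C as follows. Everything here is an assumption made for illustration: the struct value_count pair, the function names sort_by_count_desc and nth_most_common, and the 0-based index are hypothetical, and GCC's actual histogram counter layout is not reproduced.

```c
#include <stddef.h>

/* Hypothetical (value, count) pair; GCC's histogram stores these in a
   flat counters array whose layout is not reproduced here.  */
struct value_count
{
  long value;
  long count;
};

/* Bubble sort by decreasing count.  Only strictly smaller neighbours are
   swapped, so entries with equal counts keep their relative order.  */
void
sort_by_count_desc (struct value_count *v, size_t n)
{
  for (size_t pass = 0; pass + 1 < n; pass++)
    for (size_t i = 0; i + 1 < n - pass; i++)
      if (v[i].count < v[i + 1].count)
        {
          struct value_count tmp = v[i];
          v[i] = v[i + 1];
          v[i + 1] = tmp;
        }
}

/* Return 1 and fill *value / *count with the n-th most common entry
   (n is 0-based) of an already sorted array, or 0 if out of range.  */
int
nth_most_common (const struct value_count *v, size_t len, size_t n,
                 long *value, long *count)
{
  if (n >= len)
    return 0;
  *value = v[n].value;
  *count = v[n].count;
  return 1;
}
```

Sorting once after loading keeps the lookup itself trivial, and swapping only on a strict comparison gives the stable ordering that the v3 notes ask for.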

Re: [PATCH v4] Generalize get_most_common_single_value to return k_th value & count

2019-07-17 Thread luoxhu
Hi Martin, On 2019/7/17 15:55, Martin Liška wrote: On 7/17/19 7:44 AM, luoxhu wrote: Hi Martin, Thanks for your review, v4 Changes as below: 1. Use decrease bubble sort. BTW, I have a question about hist->hvalue.counters[2], when will it become -1, please? Thanks. Currently, if it is

[PATCH] luoxhu - backport from trunk r255555, r257253 and r258137

2019-02-18 Thread luoxhu
From: Xiong Hu Luo This is a backport of r255555, r257253 and r258137 of trunk to gcc-7-branch. The patches were on trunk before GCC 8 forked already. Totally 5 files need manual resolve due to code changes for r255555. r257253 and r258137 are dependent testcases require vsx support need merge t

[PATCH] PR c/43673 - Incorrect warning in dfp printf.

2019-02-25 Thread luoxhu
From: Xiong Hu Luo dfp printf/scanf of Ha/HA, Da/DA and DDa/DDA is not set properly, cause incorrect warning happens: "use of 'D' length modifier with 'a' type character". Regression-tested on powerpc64le-linux, OK for trunk and gcc-8? gcc/c-family/ChangeLog: 2019-02-25 Xiong Hu Luo

[PATCH v3] luoxhu - backport r250477, r255555, r257253 and r258137

2019-03-04 Thread luoxhu
From: Xiong Hu Luo This is a backport of r250477, r255555, r257253 and r258137 from trunk to gcc-7-branch to support built-in functions: vec_extract_fp_from_shorth, vec_extract_fp_from_shortl, vec_extract_fp32_from_shorth and vec_extract_fp32_from_shortl, etc. The patches were on trunk before GCC

[PATCH] backport r268834 from mainline to gcc-7-branch

2019-03-05 Thread luoxhu
From: Xiong Hu Luo Backport r268834 of "Add support for the vec_sbox_be, vec_cipher_be etc." from mainline to gcc-8-branch. Regression-tested on Linux POWER8 LE. Backport patch for gcc-8-branch already got approved and committed. OK for gcc-7-branch? gcc/ChangeLog: 2019-03-05 Xiong Hu Luo

[PATCH] backport r257541, r259936, r260294, r260623, r261098, r261333, r268585.

2019-04-04 Thread luoxhu
From: Xiong Hu Luo These patches are followed changes for r255555 on testcases vsx-vector-6*.c. backport them to update file names and fix regressions for GCC7 on power9. Regression tested on power7-be, power8-be, power8-le, power9. gcc/ChangeLog: 2019-04-03 Xiong Hu Luo backport f

Re: *Ping* Re: [PATCH] PR c/43673 - Incorrect warning in dfp printf.

2019-05-20 Thread luoxhu
Ping for GCC-10. Thanks Xionghu On 2019/3/4 09:13, Xiong Hu Luo wrote: Ping: https://gcc.gnu.org/ml/gcc-patches/2019-02/msg01949.html Thanks Xionghu On 2019/2/26 AM9:13, luo...@linux.ibm.com wrote: From: Xiong Hu Luo dfp printf/scanf of Ha/HA, Da/DA and DDa/DDA is not set properly, cause

[PATCH] luoxhu - backport from trunk r255555:

2019-01-22 Thread luoxhu
From: carll backport from trunk to gcc-7-branch. gcc/ChangeLog: 2017-12-11 Carl Love * config/rs6000/altivec.h (vec_extract_fp32_from_shorth, vec_extract_fp32_from_shortl]): Add #defines. * config/rs6000/rs6000-builtin.def (VSLDOI_2DI): Add macro expansion. *

[PATCH] rs6000: Add support for the vec_sbox_be, vec_cipher_be etc. builtins.

2019-01-23 Thread luoxhu
From: Xiong Hu Luo The 5 new builtins vec_sbox_be, vec_cipher_be, vec_cipherlast_be, vec_ncipher_be and vec_ncipherlast_be only support vector unsigned char type parameters. Add new instruction crypto_vsbox_ and crypto__ to handle them accordingly, where the new mode CR_vqdi can be expanded to ve

[PATCH 1/2] fix tab alignment issue.

2019-01-23 Thread luoxhu
From: Xiong Hu Luo committed in r268228. --- ChangeLog 2019-01-24 Xiong Hu Luo * ChangeLog: replace space with tab. * MAINTAINERS: delete 1 tab to keep alignment. --- ChangeLog | 4 ++-- MAINTAINERS | 2 +- 2 files changed, 3 insertions(+), 3 deletions(-) diff --git

[PATCH 2/2] fix comments typo.

2019-01-23 Thread luoxhu
From: Xiong Hu Luo committed in 268229. --- gcc/ChangeLog 2019-01-24 Xiong Hu Luo * tree-ssa-dom.c (test_for_singularity): fix a comment typo. * vr-values.c (find_case_label_ranges): fix a comment typo. --- gcc/tree-ssa-dom.c | 2 +- gcc/vr-values.c| 2 +- 2 files changed

Re: [PATCH v2] rs6000: Expand vec_insert in expander instead of gimple [PR79251]

2020-09-14 Thread luoxhu via Gcc-patches
On 2020/9/10 18:08, Richard Biener wrote: > On Wed, Sep 9, 2020 at 6:03 PM Segher Boessenkool > wrote: >> >> On Wed, Sep 09, 2020 at 04:28:19PM +0200, Richard Biener wrote: >>> On Wed, Sep 9, 2020 at 3:49 PM Segher Boessenkool >>> wrote: Hi! On Tue, Sep 08, 2020 at 10:26:51

Re: [PATCH v2] rs6000: Expand vec_insert in expander instead of gimple [PR79251]

2020-09-14 Thread luoxhu via Gcc-patches
On 2020/9/14 17:47, Richard Biener wrote: On Mon, Sep 14, 2020 at 10:05 AM luoxhu wrote: Not sure whether this reflects the issues you discussed above. I constructed below test cases and tested with and without this patch, only if "a+c"(which means store only), the performance

Re: [PATCH v2] rs6000: Expand vec_insert in expander instead of gimple [PR79251]

2020-09-15 Thread luoxhu via Gcc-patches
On 2020/9/15 14:51, Richard Biener wrote: >> I only see VAR_DECL and PARM_DECL, is there any function to check the tree >> variable is global? I added DECL_REGISTER, but the RTL still expands to >> stack: > > is_global_var () or alternatively !auto_var_in_fn_p (), I think doing > IFN_SET onl

Re: [PATCH] ipa-inline: Improve growth accumulation for recursive calls

2020-08-12 Thread luoxhu via Gcc-patches
Hi, On 2020/8/13 01:53, Jan Hubicka wrote: > Hello, > with Martin we spent some time looking into exchange2 and my > understanding of the problem is the following: > > There is the self recursive function digits_2 with the property that it > has 10 nested loops and calls itself from the innermost

Re: [PATCH] ipa-inline: Improve growth accumulation for recursive calls

2020-08-14 Thread luoxhu via Gcc-patches
Hi, On 2020/8/13 20:52, Jan Hubicka wrote: >> Since there are no other callers outside of these specialized nodes, the >> guessed profile count should be same equal? Perf tool shows that even >> each specialized node is called only once, none of them take same time for >> each call: >> >>40.6

Re: [PATCH] rs6000: Expand vec_insert in expander instead of gimple [PR79251]

2020-09-01 Thread luoxhu via Gcc-patches
Hi, On 2020/9/1 01:04, Segher Boessenkool wrote: > Hi! > > On Mon, Aug 31, 2020 at 04:06:47AM -0500, Xiong Hu Luo wrote: >> vec_insert accepts 3 arguments, arg0 is input vector, arg1 is the value >> to be insert, arg2 is the place to insert arg1 to arg0. This patch adds >> __builtin_vec_insert_v

Re: [PATCH] rs6000: Expand vec_insert in expander instead of gimple [PR79251]

2020-09-01 Thread luoxhu via Gcc-patches
Hi, On 2020/9/1 00:47, will schmidt wrote: >> + tmode = TYPE_MODE (TREE_TYPE (arg0)); >> + mode1 = TYPE_MODE (TREE_TYPE (TREE_TYPE (arg0))); >> + mode2 = TYPE_MODE ((TREE_TYPE (arg2))); >> + gcc_assert (VECTOR_MODE_P (tmode)); >> + >> + op0 = expand_expr (arg0, NULL_RTX, tmode, EXPAND_NORMAL)

Re: [PATCH] rs6000: Expand vec_insert in expander instead of gimple [PR79251]

2020-09-02 Thread luoxhu via Gcc-patches
Hi, On 2020/9/1 21:07, Richard Biener wrote: > On Tue, Sep 1, 2020 at 10:11 AM luoxhu via Gcc-patches > wrote: >> >> Hi, >> >> On 2020/9/1 01:04, Segher Boessenkool wrote: >>> Hi! >>> >>> On Mon, Aug 31, 2020 at 04:06:47AM -0500, Xiong H

Re: [PATCH] rs6000: Expand vec_insert in expander instead of gimple [PR79251]

2020-09-03 Thread luoxhu via Gcc-patches
On 2020/9/2 17:30, Richard Biener wrote: >> so maybe bypass convert_vector_to_array_for_subscript for special >> circumstance >> like "i = v[n%4]" or "v[n&3]=i" to generate vec_extract or vec_insert builtin >> call a relative simpler method? > I think you have it backward. You need to work wit

Re: [PATCH] rs6000: Expand vec_insert in expander instead of gimple [PR79251]

2020-09-03 Thread luoxhu via Gcc-patches
Hi, On 2020/9/3 18:29, Richard Biener wrote: > On Thu, Sep 3, 2020 at 11:20 AM luoxhu wrote: >> >> >> >> On 2020/9/2 17:30, Richard Biener wrote: >>>> so maybe bypass convert_vector_to_array_for_subscript for special >>>> circumstance >>&

Re: [PATCH] rs6000: Expand vec_insert in expander instead of gimple [PR79251]

2020-09-03 Thread luoxhu via Gcc-patches
On 2020/9/4 14:16, luoxhu via Gcc-patches wrote: Hi, Yes, I checked and found that both vec_set and vec_extract doesn't support variable index for most targets, store_bit_field_1 and extract_bit_field_1 would only consider use optabs when index is integer value. Anyway, it shouldn

Re: [PATCH] rs6000: Expand vec_insert in expander instead of gimple [PR79251]

2020-09-04 Thread luoxhu via Gcc-patches
On 2020/9/4 15:23, Richard Biener wrote: > On Fri, Sep 4, 2020 at 9:19 AM Richard Biener > wrote: >> >> On Fri, Sep 4, 2020 at 8:38 AM luoxhu wrote: >>> >>> >>> >>> On 2020/9/4 14:16, luoxhu via Gcc-patches wrote: >>>>

[PATCH v2] rs6000: Expand vec_insert in expander instead of gimple [PR79251]

2020-09-06 Thread luoxhu via Gcc-patches
Hi, On 2020/9/4 18:23, Segher Boessenkool wrote: diff --git a/gcc/config/rs6000/rs6000-c.c b/gcc/config/rs6000/rs6000-c.c index 03b00738a5e..00c65311f76 100644 --- a/gcc/config/rs6000/rs6000-c.c +++ b/gcc/config/rs6000/rs6000-c.c /* Build *(((arg1_inner_type*)&(vector type){arg1})+arg2)

Re: [PATCH v2] rs6000: Expand vec_insert in expander instead of gimple [PR79251]

2020-09-08 Thread luoxhu via Gcc-patches
Hi Richi, On 2020/9/7 19:57, Richard Biener wrote: > + if (TREE_CODE (to) == ARRAY_REF) > + { > + tree op0 = TREE_OPERAND (to, 0); > + if (TREE_CODE (op0) == VIEW_CONVERT_EXPR > + && expand_view_convert_to_vec_set (to, from, to_rtx)) > + { > +

Re: [PATCH v2] rs6000: Expand vec_insert in expander instead of gimple [PR79251]

2020-09-08 Thread luoxhu via Gcc-patches
On 2020/9/8 16:26, Richard Biener wrote: >> Seems not only pseudo, for example "v = vec_insert (i, v, n);" >> the vector variable will be store to stack first, then [r112:DI] is a >> memory here to be processed. So the patch loads it from stack(insn #10) to >> temp vector register first, and st

[PATCH] rs6000: Save/restore r31 if frame_pointer_needed is true

2020-03-25 Thread luoxhu--- via Gcc-patches
From: Xionghu Luo This P1 bug is exposed by the FRE refactor of r263875. Comparing the fre dump files shows no obvious change in the function that segfaults, which indicates a target issue. frame_pointer_needed is set to true in reload pass setup_can_eliminate, but regs_ever_live[31] is false, so pro_a

[PATCH] rs6000: Don't split constant oprator when add, move to temp register for future optimization

2020-03-26 Thread luoxhu--- via Gcc-patches
From: Xionghu Luo Remove split code from add3 to allow a later pass to split. This allows later logic to hoist out constant load in add instructions. In loop, lis+ori could be hoisted out to improve performance compared with previous addis+addi (About 15% on typical case), weak point is one more

Re: [PATCH] rs6000: Split movsf_from_si from high word before reload[PR89310]

2020-07-07 Thread luoxhu via Gcc-patches
On 2020/7/7 08:18, Segher Boessenkool wrote: > Hi! > > On Sun, Jul 05, 2020 at 09:17:57PM -0500, Xionghu Luo wrote: >> For extracting high part element from DImode register like: >> >> {%1:SF=unspec[r122:DI>>0x20#0] 86;clobber scratch;} >> >> split it before reload with "and mask" to avoid gene

Re: [PATCH] rs6000: Split movsf_from_si from high word before reload[PR89310]

2020-07-07 Thread luoxhu via Gcc-patches
On 2020/7/8 05:31, Segher Boessenkool wrote: > Hi! > > On Tue, Jul 07, 2020 at 04:39:58PM +0800, luoxhu wrote: >>> Lots of questions, sorry! >> >> Thanks for the nice suggestions of the initial patch contains many issues:), > > Pretty much all of it should

Re: [PATCH] rs6000: Split movsf_from_si from high word before reload[PR89310]

2020-07-08 Thread luoxhu via Gcc-patches
On 2020/7/9 06:43, Segher Boessenkool wrote: > Hi! > > On Wed, Jul 08, 2020 at 11:19:21AM +0800, luoxhu wrote: >> For extracting high part element from DImode register like: >> >> {%1:SF=unspec[r122:DI>>0x20#0] 86;clobber scratch;} >> >> sp

Re: [PATCH] rs6000: Define movsf_from_si2 to extract high part SF element from DImode[PR89310]

2020-07-09 Thread luoxhu via Gcc-patches
Hi, On 2020/7/10 03:25, Segher Boessenkool wrote: > Hi! > > On Thu, Jul 09, 2020 at 11:09:42AM +0800, luoxhu wrote: >>> Maybe change it back to just SI? It won't match often at all for QI or >>> HI anyway, it seems. Sorry for that detour. Should be go

Re: [PATCH 1/2] rs6000: Init V4SF vector without converting SP to DP

2020-07-09 Thread luoxhu via Gcc-patches
Update patch to keep the logic for non TARGET_P8_VECTOR targets. Please ignore the previous [PATCH 1/2], Sorry! Move V4SF to V4SI, init vector like V4SI and move to V4SF back. Better instruction sequence could be generated on Power9: lfs + xxpermdi + xvcvdpsp + vmrgew => lwz + (sldi + or) + mtvs

Re: [PATCH] rs6000: Split movsf_from_si from high word before reload[PR89310]

2020-07-10 Thread luoxhu via Gcc-patches
On 2020/7/10 03:25, Segher Boessenkool wrote: > >> + "TARGET_NO_SF_SUBREG" >> + "#" >> + "&& vsx_reg_sfsubreg_ok (operands[0], SFmode)" > > Put this in the insn condition? And since this is just a predicate, > you can just use it instead of gpc_reg_operand. > > (The split condition become

Re: [PATCH 2/2] rs6000: Define define_insn_and_split to split unspec sldi+or to rldimi

2020-07-12 Thread luoxhu via Gcc-patches
On 2020/7/11 08:28, Segher Boessenkool wrote: Hi! On Thu, Jul 09, 2020 at 09:14:45PM -0500, Xiong Hu Luo wrote: * config/rs6000/rs6000.md (rotl_unspec): New define_insn_and_split. +; rldimi with UNSPEC_SI_FROM_SF. +(define_insn_and_split "*rotl_unspec" Please have rotldi

Re: [PATCH] rs6000: Define movsf_from_si2 to extract high part SF element from DImode[PR89310]

2020-07-12 Thread luoxhu via Gcc-patches
Hi, On 2020/7/11 08:54, Segher Boessenkool wrote: > Hi! > > On Fri, Jul 10, 2020 at 09:39:40AM +0800, luoxhu wrote: >> OK, seems the md file needs a format tool too... > > Heh. Just make sure it looks good (that is, does what it looks like), > looks like the res

Re: [PATCH] rs6000: Define movsf_from_si2 to extract high part SF element from DImode[PR89310]

2020-07-14 Thread luoxhu via Gcc-patches
Power8-LE, I re-run these cases on Power8-LE, and confirmed these could pass, what is your platform please? BTW, TARGET_NO_SF_SUBREG ensured TARGET_POWERPC64 for this define_insn_and_split. Thanks. Xionghu > > Thanks, David > > On Mon, Jul 13, 2020 at 2:30 AM luoxhu wrote:
