Hi Feng,
Thanks for the patch. It works for me as expected.
I am not a reviewer, just a tiny comment after trying it.
This is quite a good case for newbies to go through the ipa-cp pass.
Is it necessary to update the test case a bit as attached to include more
circumstances for callee's aggregate in
Hi,
On 2019/10/17 16:23, Feng Xue OS wrote:
> IPA does not allow constant propagation on parameter that is used to control
> function recursion.
>
> recur_fn (i)
> {
>   if (!terminate_recursion (i))
>     {
>       ...
>       recur_fn (i + 1);
>       ...
>     }
>   ...
> }
>
> This
-finline-functions is enabled by default for O2 since r276469, update the
test cases with -fno-inline-functions.
v2: disable inlining for the failed cases. Add two more failed cases
not listed in BZ. Tested on P8LE, P8BE and P9LE.
gcc/testsuite/ChangeLog:
2019-10-30 Xiong Hu Luo
On 2019/11/2 00:23, Joseph Myers wrote:
> On Thu, 31 Oct 2019, Xiong Hu Luo wrote:
>
>> +@code{-finline} enables inlining of function declared "inline".
>> +@code{-finline} is enabled at levels -O1, -O2, -O3 and -Os, but not -Og.
>
> Use @option{} to mark up option names (both -finline and all
Hi,
On 2019/11/5 06:57, Joseph Myers wrote:
> On Mon, 4 Nov 2019, luoxhu wrote:
>
>> -finline-functions is enabled by default for O2 since r276469, update the
>> test cases with -fno-inline-functions.
>>
>> v2: disable inlining for the failed cases. Add two more fa
On 2019/10/22 22:07, Martin Liška wrote:
On 9/27/19 9:13 AM, luoxhu wrote:
Thanks for your time over so many rounds of review.
You're welcome. One last request would be please to make
gimple_ic_transform a void function. See attached patch.
I'll remind the patch today to Honza
On 2019/11/6 02:20, Joseph Myers wrote:
> On Tue, 5 Nov 2019, Kewen.Lin wrote:
>
>> Very good point! Since gcc doesn't pursue 100% testsuite pass rate, I
>> noticed
>> there are a few failures exposed/caused by some PRs all the time. Could we
>> just leave the test case there without any pre wo
Tested pass and committed to r277904.
gcc/testsuite/ChangeLog:
2019-11-07 Xiong Hu Luo
* gcc.target/powerpc/pr72804.c: Move inline options from
dg-require-effective-target to dg-options.
---
gcc/testsuite/gcc.target/powerpc/pr72804.c | 4 ++--
1 file changed, 2 inser
On 2020/1/10 19:08, Jan Hubicka wrote:
> OK. You will need to do the obvious updates for Martin's patch
> which turned some member functions into static functions.
>
> Honza
Thanks a lot! Rebased & updated, will commit below patch shortly when git push
is ready.
v8:
1. Rebase to master with
On 2020/1/11 20:20, Tamar Christina wrote:
Hi Martin,
This change (r280099) is causing a major performance regression on exchange2 in
SPEC2017 dropping the benchmark by more than 30%.
It seems the parameters no longer do anything. i.e. -flto --param
ipa-cp-eval-threshold=1 --param ipa-cp-u
Ping,
attachment of
https://gcc.gnu.org/ml/gcc-patches/2020-01/msg00764/exchange2.tar.gz
shows the profile count difference on cloned nodes digits_2.constprop.[0...8]
without/with this patch. Thanks!
Xiong Hu
On 2020/1/14 14:45, luoxhu wrote:
> Hi,
>
> On 2020/1/3 00:58, Jan Hubi
Thanks Honza,
On 2019/12/10 19:06, Jan Hubicka wrote:
>> Hi,
>>
>> On Tue, Dec 10 2019, Jan Hubicka wrote:
>>> Hi,
>>> I think the updating should treat self recursive edges as loops: that is
>>> calculate SUM of counts of incoming edges which are not self recursive,
>>> calculate probability of sel
Ping :)
Patch is here:
https://gcc.gnu.org/ml/gcc-patches/2019-12/msg00099.html
On 2019/12/3 10:31, luoxhu wrote:
Hi Martin and Honza,
On 2019/11/18 21:02, Martin Liška wrote:
On 11/16/19 10:59 AM, luoxhu wrote:
Sorry that I don't quite understand your meaning here. I didn'
On 2019/12/18 23:48, Jan Hubicka wrote:
>> The size_info of ipa_size_summary are created by r277424. It should be
>> duplicated for cloned nodes, otherwise self_size and
>> estimated_self_stack_size
>> would be 0, causing param large-function-insns and large-function-growth
>> working
>> inaccur
>>> profile_count indir_cnt = indirect->count;
>>> indirect = indirect->clone (id->dst_node, call_stmt,
>>> gimple_uid (stmt),
>>> num, den,
>>>
v2 Changes:
1. Enable proportion orig_sum to the new nodes for self recursive node:
new_sum = (orig_sum + new_sum) \
* self_recursive_probability * (1 / param_ipa_cp_max_recursive_depth).
2. Add value range for param_ipa_cp_max_recursive_depth.
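To make the v2 scaling rule above concrete, here is a minimal sketch in plain C. It is not the patch itself: the function name is hypothetical, and `double` stands in for GCC's `profile_count` type purely for illustration.

```c
#include <assert.h>

/* Hypothetical helper, not GCC code.  Mirrors the formula quoted above:
     new_sum = (orig_sum + new_sum)
               * self_recursive_probability
               * (1 / param_ipa_cp_max_recursive_depth)
   Plain doubles are an assumption; the real code works on profile_count.  */
static double
scale_recursive_clone_count (double orig_sum, double new_sum,
                             double self_recursive_probability,
                             int max_recursive_depth)
{
  /* A depth of zero or less would make the division meaningless.  */
  assert (max_recursive_depth > 0);
  return (orig_sum + new_sum) * self_recursive_probability
         / (double) max_recursive_depth;
}
```

For example, with orig_sum = 900, new_sum = 100, a self-recursive probability of 0.5 and a maximum depth of 8, the sketch gives 62.5.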
The performance of exchange2 built with PGO wil
gcc_checking_assert (src_val);
}
}
XiongHu
Feng
____
From: luoxhu
Sent: Monday, December 30, 2019 4:11 PM
To: Jan Hubicka; Martin Jambor
Cc: Martin Liška; gcc-patches@gcc.gnu.org; seg...@kernel.crashing.org;
wschm...@linux.ibm.com; g
On 2020/1/7 02:01, Jeff Law wrote:
On Mon, 2020-01-06 at 01:03 -0600, Xiong Hu Luo wrote:
Inline should return failure either (newsize > param_large_function_insns)
OR (newsize > limit). Sometimes newsize is larger than
param_large_function_insns, but smaller than limit, inline doesn't return
f
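The difference between the two conditions discussed above can be sketched as follows. This is only an illustration with hypothetical function names; it assumes, from the description in the mail, that the existing check requires both comparisons to hold, and it is not the code in ipa-inline.c.

```c
#include <stdbool.h>

/* Hypothetical: fail only when newsize exceeds BOTH thresholds
   (assumed to be the behaviour the mail describes as the problem).  */
static bool
growth_fails_and (int newsize, int large_function_insns, int limit)
{
  return newsize > large_function_insns && newsize > limit;
}

/* Hypothetical: fail when newsize exceeds EITHER threshold
   (the behaviour proposed in the mail).  */
static bool
growth_fails_or (int newsize, int large_function_insns, int limit)
{
  return newsize > large_function_insns || newsize > limit;
}
```

With newsize = 3000, a large-function threshold of 2700 and a limit of 4000, the first predicate does not report failure while the second one does, which is the case the mail describes.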
On 2020/1/7 16:40, Jan Hubicka wrote:
>> On Mon, 2020-01-06 at 01:03 -0600, Xiong Hu Luo wrote:
>>> Inline should return failure either (newsize > param_large_function_insns)
>>> OR (newsize > limit). Sometimes newsize is larger than
>>> param_large_function_insns, but smaller than limit, inlin
On 2020/1/7 23:40, Jan Hubicka wrote:
>>
>>
>> On 2020/1/7 16:40, Jan Hubicka wrote:
On Mon, 2020-01-06 at 01:03 -0600, Xiong Hu Luo wrote:
> Inline should return failure either (newsize > param_large_function_insns)
> OR (newsize > limit). Sometimes newsize is larger than
> par
On 2020/1/8 22:54, Martin Liška wrote:
diff --git a/gcc/cgraphclones.c b/gcc/cgraphclones.c
index bd44063a1ac..789564ba335 100644
--- a/gcc/cgraphclones.c
+++ b/gcc/cgraphclones.c
@@ -1148,8 +1148,7 @@ symbol_table::materialize_all_clones (void)
if (symtab->dump_file)
On 2020/2/18 17:57, Richard Biener wrote:
> On Tue, 18 Feb 2020, Xionghu Luo wrote:
>
>> Store-merging pass should run twice, the reason is pass fre/pre will do
>> some kind of optimizations to instructions by:
>>1. Converting the load from address to load from function arguments
>>(store_
341
Author: hubicka
Date: Thu Nov 28 14:16:29 2019 +
* ipa-cp.c (update_profiling_info): Fix scaling.
Fix v3 patch and logs are here:
https://gcc.gnu.org/ml/gcc-patches/2020-01/msg00764.html
Thanks
Xionghu
On 2020/1/14 14:45, luoxhu wrote:
> Hi,
>
> On 2020/1/3 00:58
From: Xionghu Luo
Backport the patch to fix failures on P9 and P8BE, P7LE for PR94036.
Tested pass on P9/P8/P7, ok to commit?
(gcc-8 is not needed as the test doesn't exist.)
P9LE generated instruction is not worse than P8LE.
mtvsrdd;xxlnot;stxv vs. not;not;std;std.
It can have longer latency,
On 2020/3/10 05:28, Segher Boessenkool wrote:
On Thu, Mar 05, 2020 at 02:21:58AM -0600, luo...@linux.ibm.com wrote:
From: Xionghu Luo
Backport the patch to fix failures on P9 and P8BE, P7LE for PR94036.
No changes were needed?
Yes, no conflicts of the patch and instruction counts are sam
Hi,
Sorry for replying so late due to cauldron conference and other LTO issues
I was working on.
v4 Changes:
1. Rebase to trunk.
2. Remove num_of_ics and use vector's length to avoid redundancy.
3. Update the code in ipa-profile.c to improve review feasibility.
4. Add function has_indirect_ca
Thanks Martin,
On 2019/9/25 18:57, Martin Liška wrote:
On 9/25/19 5:45 AM, luoxhu wrote:
Hi,
Sorry for replying so late due to cauldron conference and other LTO issues
I was working on.
Hello.
That's fine, we still have plenty of time for patch review.
Not fixed issues which I rep
Hi Martin,
Thanks for your time over so many rounds of review.
It really helped me a lot.
Updated with your comments and attached for Honza's review and approval. :)
Xiong Hu
BR
On 2019/9/26 16:36, Martin Liška wrote:
On 9/26/19 7:23 AM, luoxhu wrote:
Thanks Martin,
On 2019/9/25
Hi Segher,
On 2019/9/30 00:17, Segher Boessenkool wrote:
> Hi!
>
> Just some editorial comments... The idea of the patch is fine IMHO.
> (I am not maintainer of this, take all my comments for what they are).
>
> On Sun, Sep 29, 2019 at 02:56:37AM -0500, Xiong Hu Luo wrote:
>> To simplify deve
Hi,
This is the formal documentation patch for IPA passes. Thanks.
None of the IPA passes are documented in passes.texi. This patch adds
a section IPA passes just before GIMPLE passes and RTL passes in
Chapter 9 "Passes and Files of the Compiler". Also, a short description
for each IPA pass i
'}' is missing at the end.
gcc/ChangeLog:
tree-sra.c (dump_access): Add missing braces.
---
gcc/tree-sra.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/gcc/tree-sra.c b/gcc/tree-sra.c
index 48589323a1e..cb59b91f20e 100644
--- a/gcc/tree-sra.c
+++ b/gcc/tree-sra.c
On 2019/10/14 00:32, Jeff Law wrote:
> On 10/8/19 4:45 AM, Martin Jambor wrote:
>> Hi,
>>
>> On Tue, Oct 08 2019, luoxhu wrote:
>>> '}' is missing at the end.
>>
>> heh, yeah, I wonder for how long.
>>
>> If it irritates you, I
Ping:
Attachment: v5-0001-Missed-function-specialization-partial-devirtuali.patch:
https://gcc.gnu.org/ml/gcc-patches/2019-09/txtuTT17jV7n5.txt
Thanks,
Xiong Hu
On 2019/9/27 15:13, luoxhu wrote:
Hi Martin,
Thanks for your time over so many rounds of review.
It really helped me a lot
Hi Feng,
On 2019/10/17 16:23, Feng Xue OS wrote:
> IPA does not allow constant propagation on parameter that is used to control
> function recursion.
>
> recur_fn (i)
> {
>   if (!terminate_recursion (i))
>     {
>       ...
>       recur_fn (i + 1);
>       ...
>     }
>   ...
> }
>
> T
Rebase to trunk including void gimple_ic_transform.
This patch aims to fix PR69678 caused by PGO indirect call profiling
performance issues.
The bug that profiling data is never working was fixed by Martin's pull
back of topN patches, performance got GEOMEAN ~1% improvement(+24% for
511.povray_r
On 2019/11/15 11:12, Xiong Hu Luo wrote:
P9LE generated instruction is not worse than P8LE.
mtvsrdd;xxlnot;stxv vs. not;not;std;std.
Update the test case to fix failures.
gcc/testsuite/ChangeLog:
2019-11-15 Luo Xiong Hu
testsuite/pr92398
* gcc.target/powerpc/pr7280
On 2019/11/15 17:19, Jan Hubicka wrote:
>> On Fri, Nov 15, 2019 at 9:10 AM Jan Hubicka wrote:
>>>
next is initialized only in the loop before, it is never updated
in its own loop.
gcc/ChangeLog
2019-11-15 Xiong Hu Luo
* ipa-inline.c (inl
Hi Thanks,
On 2019/11/14 17:04, Jan Hubicka wrote:
>> PR ipa/69678
>> * cgraph.c (symbol_table::create_edge): Init speculative_id.
>> (cgraph_edge::make_speculative): Add param for setting speculative_id.
>> (cgraph_edge::speculative_call_info): Find reference by
>> specul
Hi,
On 2019/11/15 18:17, Segher Boessenkool wrote:
> Hi!
>
> On Thu, Nov 14, 2019 at 09:12:32PM -0600, Xiong Hu Luo wrote:
>> P9LE generated instruction is not worse than P8LE.
>> mtvsrdd;xxlnot;stxv vs. not;not;std;std.
>> Update the test case to fix failures.
>
> So this no longer runs it for
P9LE generated instruction is not worse than P8LE.
mtvsrdd;xxlnot;stxv vs. not;not;std;std.
Update the test case to fix failures.
v3:
Define and use check_effective_target_xxx etc.
pre_power8: ... power6, power7.
power8: power8 only.
post_power8: power8, power9 ...
post_power9: power9, power10 ...
Hi Segher,
Update the code as you wish, Thanks:
P9LE generated instruction is not worse than P8LE.
mtvsrdd;xxlnot;stxv vs. not;not;std;std.
Update the test case to fix failures.
v4:
Define and use check_effective_target_xxx etc.
power9+: power9, power10 ...
power8: power8 only.
gcc/testsuite/Cha
Hi,
>> +++ b/gcc/testsuite/gcc.target/powerpc/pr72804-1.c
>
>> +/* store generates difference instructions as below:
>> + P9: mtvsrdd;xxlnot;stxv.
>> + P8/P7/P6 LE: not;not;std;std.
>> + P8 BE: mtvsrd;mtvsrd;xxpermdi;xxlnor;stxvd2x.
>> + P7/P6 BE: std;std;addi;lxvd2x;xxlnor;stxvd2x. */
>
Summary variables should be deleted at the end of write_summary.
It's first newed in generate_summary, and second newed in read_summary.
Therefore, delete the first in write_summary, delete the second in
execute.
gcc/ChangeLog:
2019-11-26 Luo Xiong Hu
* ipa-pure-const.c (pure_
Hi,
On 2019/11/26 16:04, Jan Hubicka wrote:
Summary variables should be deleted at the end of write_summary.
It's first newed in generate_summary, and second newed in read_summary.
Therefore, delete the first in write_summary, delete the second in
execute.
gcc/ChangeLog:
2019-11-26 Lu
Thanks,
On 2019/11/26 18:15, Jan Hubicka wrote:
>> Hi,
>>
>> On 2019/11/26 16:04, Jan Hubicka wrote:
Summary variables should be deleted at the end of write_summary.
It's first newed in generate_summary, and second newed in read_summary.
Therefore, delete the first in write_summary,
On 2019/11/4 11:42, luoxhu wrote:
On 2019/11/2 00:23, Joseph Myers wrote:
On Thu, 31 Oct 2019, Xiong Hu Luo wrote:
+@code{-finline} enables inlining of function declared "inline".
+@code{-finline} is enabled at levels -O1, -O2, -O3 and -Os, but not -Og.
Use @option{} to mark
Hi Martin and Honza,
On 2019/11/18 21:02, Martin Liška wrote:
> On 11/16/19 10:59 AM, luoxhu wrote:
>> Sorry that I don't quite understand your meaning here. I didn't grep the
>> word "cgraph_edge_summary" in source code, do you mean add new structure
Hi,
On 2019/6/18 13:51, Martin Liška wrote:
On 6/18/19 3:45 AM, Xiong Hu Luo wrote:
Hello.
Thank you for the interest in the area.
This patch aims to fix PR69678 caused by PGO indirect call profiling bugs.
Currently the default instrument function can only find the indirect function
that cal
Hi Martin,
On 2019/6/18 17:34, Martin Liška wrote:
On 6/18/19 11:02 AM, luoxhu wrote:
Hi,
On 2019/6/18 13:51, Martin Liška wrote:
On 6/18/19 3:45 AM, Xiong Hu Luo wrote:
Hello.
Thank you for the interest in the area.
This patch aims to fix PR69678 caused by PGO indirect call profiling
Hi Martin,
On 2019/6/18 18:21, Martin Liška wrote:
On 6/18/19 3:45 AM, Xiong Hu Luo wrote:
6.2. SPEC2017 peakrate:
523.xalancbmk_r (+4.87%); 538.imagick_r (+4.59%); 511.povray_r
(+13.33%);
525.x264_r (-5.29%).
Can you please elaborate what are the key indirect call pr
On 2019/6/19 20:18, Martin Liška wrote:
On 6/19/19 10:56 AM, Martin Liška wrote:
Thank you very much for the numbers. Today, I'm going to prepare the
generalization of single-value counter to track N values.
Ok, here's a patch candidate that does tracking of most common N values. For
your
Hi Martin,
On 2019/6/20 09:59, luoxhu wrote:
On 2019/6/19 20:18, Martin Liška wrote:
On 6/19/19 10:56 AM, Martin Liška wrote:
Thank you very much for the numbers. Today, I'm going to prepare the
generalization of single-value counter to track N values.
Ok, here's a patch cand
Hi Honza,
Thank you very much for so many useful comments.
As a newbie to GCC, not sure whether my questions are described clearly
enough. Thanks for your patience in advance. :)
On 2019/6/20 21:47, Jan Hubicka wrote:
Hi,
some comments on the ipa part of the patch
(and thanks for wor
On 2019/6/24 10:34, luoxhu wrote:
Hi Honza,
Thank you very much for so many useful comments.
As a newbie to GCC, not sure whether my questions are described clearly
enough. Thanks for your patience in advance. :)
On 2019/6/20 21:47, Jan Hubicka wrote:
Hi,
some comments on the
This patch aims to fix PR69678 caused by PGO indirect call profiling
performance issues.
The bug that profiling data never worked was fixed by Martin's pull
back of the topN patches; performance improved by ~1% GEOMEAN.
Still, currently the default profile only generates SINGLE indirect target
Hi Richard,
Thanks for your comments, updated the v2 patch as below:
1. Define and use builtin_with_linkage_p.
2. Add comments.
3. Add a testcase.
In LTO mode, if a static library and a dynamic library contain the same
function and both libraries are passed as arguments, the linker will link
the function in
Hi Richard,
On 2019/8/12 16:51, Richard Biener wrote:
On Mon, Aug 12, 2019 at 8:50 AM luoxhu wrote:
Hi Richard,
Thanks for your comments, updated the v2 patch as below:
1. Define and use builtin_with_linkage_p.
2. Add comments.
3. Add a testcase.
In LTO mode, if static library and dynamic
On 2019/8/13 10:22, luoxhu wrote:
diff --git a/gcc/testsuite/gcc.dg/pr91287.c
b/gcc/testsuite/gcc.dg/pr91287.c
new file mode 100644
index 000..c816e0537aa
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/pr91287.c
@@ -0,0 +1,40 @@
+/* { dg-do assemble } */
+/* { dg-options "-O2"
On 2019/8/21 15:40, Richard Biener wrote:
On Tue, 20 Aug 2019, Xiong Hu Luo wrote:
The DECL_MD_FUNCTION_CODE added in r274404 (PR 91421) by rsandifo requires that
DECL to be a BUILT_IN_MD class built-in, asserts will happen when lto
as the patch r274411(PR 91287) outputs some math function sym
Hi Richard,
On 2019/8/13 17:10, Richard Biener wrote:
On Tue, Aug 13, 2019 at 4:22 AM luoxhu wrote:
Hi Richard,
On 2019/8/12 16:51, Richard Biener wrote:
On Mon, Aug 12, 2019 at 8:50 AM luoxhu wrote:
Hi Richard,
Thanks for your comments, updated the v2 patch as below:
1. Define and use
This is the backport patch to gcc-9-branch, please ignore the previous
mail.
Backport r274411 of "Enable math functions linking with static library
for LTO" from mainline to gcc-9-branch.
Bootstrapped/Regression-tested on Linux POWER8 LE.
gcc/ChangeLog
2019-08-26 Xiong Hu Luo
Backpo
Currently get_most_common_single_value can only return the max hist;
add qsort to let this function return the nth value.
Rename it to get_nth_most_common_value.
v3 Changes:
1. Move sort to profile.c after loading values from disk. Simplify
get_nth_most_common_value.
2. Make qsort stable
Currently get_most_common_single_value can only return the max hist;
add a sort after reading from disk, so that it can return the nth value
in later use. Rename it to get_nth_most_common_value.
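The idea of sorting once after loading and then picking the nth entry can be sketched as below. This is an illustration only: the struct and both function names are hypothetical and do not reflect the layout of GCC's histogram counters. A descending bubble sort is used because it keeps equal counts in their original order.

```c
#include <stddef.h>

/* Hypothetical (value, count) pair; not GCC's histogram layout.  */
struct hist_entry
{
  long value;
  long count;
};

/* Stable descending bubble sort by count: neighbours are swapped only
   when the left count is strictly smaller, so ties keep their order.  */
static void
sort_by_count_desc (struct hist_entry *e, size_t n)
{
  for (size_t pass = 0; pass + 1 < n; pass++)
    for (size_t i = 0; i + 1 < n - pass; i++)
      if (e[i].count < e[i + 1].count)
        {
          struct hist_entry tmp = e[i];
          e[i] = e[i + 1];
          e[i + 1] = tmp;
        }
}

/* Return 1 and fill *value / *count with the nth most common entry
   (0-based) of an already sorted array; return 0 if nth is out of range.  */
static int
nth_most_common (const struct hist_entry *e, size_t n, size_t nth,
                 long *value, long *count)
{
  if (nth >= n)
    return 0;
  *value = e[nth].value;
  *count = e[nth].count;
  return 1;
}
```

Sorting once up front means every later lookup is a plain array access, which is the simplification the snippet above describes.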
Hi Martin,
Thanks for your review, v4 Changes as below:
1. Use a decreasing bubble sort.
BTW, I have a question abo
Hi Martin,
On 2019/7/17 15:55, Martin Liška wrote:
On 7/17/19 7:44 AM, luoxhu wrote:
Hi Martin,
Thanks for your review, v4 Changes as below:
1. Use a decreasing bubble sort.
BTW, I have a question about hist->hvalue.counters[2], when will it become
-1, please? Thanks. Currently, if it is
From: Xiong Hu Luo
This is a backport of r25, r257253 and r258137 of trunk to gcc-7-branch.
The patches were on trunk before GCC 8 forked already. In total, 5 files need
manual resolution due to code changes for r25. r257253 and r258137 are
dependent testcases require vsx support need merge t
From: Xiong Hu Luo
dfp printf/scanf of Ha/HA, Da/DA and DDa/DDA is not set properly, causing
an incorrect warning:
"use of 'D' length modifier with 'a' type character".
Regression-tested on powerpc64le-linux, OK for trunk and gcc-8?
gcc/c-family/ChangeLog:
2019-02-25 Xiong Hu Luo
From: Xiong Hu Luo
This is a backport of r250477, r25, r257253 and r258137 from trunk to
gcc-7-branch to support built-in functions:
vec_extract_fp_from_shorth, vec_extract_fp_from_shortl,
vec_extract_fp32_from_shorth and vec_extract_fp32_from_shortl, etc.
The patches were on trunk before GCC
From: Xiong Hu Luo
Backport r268834 of "Add support for the vec_sbox_be, vec_cipher_be etc."
from mainline to gcc-8-branch.
Regression-tested on Linux POWER8 LE. Backport patch for gcc-8-branch
already got approved and committed. OK for gcc-7-branch?
gcc/ChangeLog:
2019-03-05 Xiong Hu Luo
From: Xiong Hu Luo
These patches are follow-up changes for r25 on testcases
vsx-vector-6*.c. Backport them to update file names and fix regressions
for GCC7 on power9.
Regression tested on power7-be, power8-be, power8-le, power9.
gcc/ChangeLog:
2019-04-03 Xiong Hu Luo
backport f
Ping for GCC-10.
Thanks
Xionghu
On 2019/3/4 09:13, Xiong Hu Luo wrote:
Ping:
https://gcc.gnu.org/ml/gcc-patches/2019-02/msg01949.html
Thanks
Xionghu
On 2019/2/26 AM9:13, luo...@linux.ibm.com wrote:
From: Xiong Hu Luo
dfp printf/scanf of Ha/HA, Da/DA and DDa/DDA is not set properly, cause
From: carll
backport from trunk to gcc-7-branch.
gcc/ChangeLog:
2017-12-11 Carl Love
* config/rs6000/altivec.h (vec_extract_fp32_from_shorth,
vec_extract_fp32_from_shortl): Add #defines.
* config/rs6000/rs6000-builtin.def (VSLDOI_2DI): Add macro expansion.
*
From: Xiong Hu Luo
The 5 new builtins vec_sbox_be, vec_cipher_be, vec_cipherlast_be, vec_ncipher_be
and vec_ncipherlast_be only support vector unsigned char type parameters.
Add new instruction crypto_vsbox_ and crypto__ to handle
them accordingly, where the new mode CR_vqdi can be expanded to ve
From: Xiong Hu Luo
committed in r268228.
---
ChangeLog
2019-01-24 Xiong Hu Luo
* ChangeLog: replace space with tab.
* MAINTAINERS: delete 1 tab to keep alignment.
---
ChangeLog | 4 ++--
MAINTAINERS | 2 +-
2 files changed, 3 insertions(+), 3 deletions(-)
diff --git
From: Xiong Hu Luo
committed in 268229.
---
gcc/ChangeLog
2019-01-24 Xiong Hu Luo
* tree-ssa-dom.c (test_for_singularity): fix a comment typo.
* vr-values.c (find_case_label_ranges): fix a comment typo.
---
gcc/tree-ssa-dom.c | 2 +-
gcc/vr-values.c| 2 +-
2 files changed
On 2020/9/10 18:08, Richard Biener wrote:
> On Wed, Sep 9, 2020 at 6:03 PM Segher Boessenkool
> wrote:
>>
>> On Wed, Sep 09, 2020 at 04:28:19PM +0200, Richard Biener wrote:
>>> On Wed, Sep 9, 2020 at 3:49 PM Segher Boessenkool
>>> wrote:
Hi!
On Tue, Sep 08, 2020 at 10:26:51
On 2020/9/14 17:47, Richard Biener wrote:
On Mon, Sep 14, 2020 at 10:05 AM luoxhu wrote:
Not sure whether this reflects the issues you discussed above.
I constructed below test cases and tested with and without this patch,
only if "a+c"(which means store only), the performance
On 2020/9/15 14:51, Richard Biener wrote:
>> I only see VAR_DECL and PARM_DECL, is there any function to check the tree
>> variable is global? I added DECL_REGISTER, but the RTL still expands to
>> stack:
>
> is_global_var () or alternatively !auto_var_in_fn_p (), I think doing
> IFN_SET onl
Hi,
On 2020/8/13 01:53, Jan Hubicka wrote:
> Hello,
> with Martin we spent some time looking into exchange2 and my
> understanding of the problem is the following:
>
> There is the self recursive function digits_2 with the property that it
> has 10 nested loops and calls itself from the innermost
Hi,
On 2020/8/13 20:52, Jan Hubicka wrote:
>> Since there are no other callers outside of these specialized nodes, the
>> guessed profile count should be same equal? Perf tool shows that even
>> each specialized node is called only once, none of them take same time for
>> each call:
>>
>>40.6
Hi,
On 2020/9/1 01:04, Segher Boessenkool wrote:
> Hi!
>
> On Mon, Aug 31, 2020 at 04:06:47AM -0500, Xiong Hu Luo wrote:
>> vec_insert accepts 3 arguments, arg0 is input vector, arg1 is the value
>> to be inserted, arg2 is the place to insert arg1 to arg0. This patch adds
>> __builtin_vec_insert_v
Hi,
On 2020/9/1 00:47, will schmidt wrote:
>> + tmode = TYPE_MODE (TREE_TYPE (arg0));
>> + mode1 = TYPE_MODE (TREE_TYPE (TREE_TYPE (arg0)));
>> + mode2 = TYPE_MODE ((TREE_TYPE (arg2)));
>> + gcc_assert (VECTOR_MODE_P (tmode));
>> +
>> + op0 = expand_expr (arg0, NULL_RTX, tmode, EXPAND_NORMAL)
Hi,
On 2020/9/1 21:07, Richard Biener wrote:
> On Tue, Sep 1, 2020 at 10:11 AM luoxhu via Gcc-patches
> wrote:
>>
>> Hi,
>>
>> On 2020/9/1 01:04, Segher Boessenkool wrote:
>>> Hi!
>>>
>>> On Mon, Aug 31, 2020 at 04:06:47AM -0500, Xiong H
On 2020/9/2 17:30, Richard Biener wrote:
>> so maybe bypass convert_vector_to_array_for_subscript for special
>> circumstances
>> like "i = v[n%4]" or "v[n&3]=i" to generate vec_extract or vec_insert builtin
>> call, a relatively simpler method?
> I think you have it backward. You need to work wit
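For readers following the thread, the two source forms quoted above ("i = v[n%4]" and "v[n&3]=i") can be written with the GCC vector extension as in the sketch below. The typedef and function names are chosen for illustration only, and the sketch says nothing about which instructions a given target emits for them.

```c
/* Assumes GCC (or a compatible compiler) with the vector extension.
   v4si and both function names are illustrative, not from the patch.  */
typedef int v4si __attribute__ ((vector_size (16)));

/* Variable-index insert: the "v[n&3]=i" form from the discussion.  */
static v4si
insert_at (v4si v, int i, unsigned n)
{
  v[n & 3] = i;
  return v;
}

/* Variable-index extract: the "i = v[n%4]" form from the discussion.  */
static int
extract_at (v4si v, unsigned n)
{
  return v[n % 4];
}
```

Masking or reducing the index keeps it within the four lanes, so the access is always in bounds regardless of the value of n.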
Hi,
On 2020/9/3 18:29, Richard Biener wrote:
> On Thu, Sep 3, 2020 at 11:20 AM luoxhu wrote:
>>
>>
>>
>> On 2020/9/2 17:30, Richard Biener wrote:
>>>> so maybe bypass convert_vector_to_array_for_subscript for special
>>>> circumstance
>>>
On 2020/9/4 14:16, luoxhu via Gcc-patches wrote:
Hi,
Yes, I checked and found that both vec_set and vec_extract don't support
variable index for most targets, store_bit_field_1 and extract_bit_field_1
would only consider using optabs when the index is an integer value. Anyway, it
shouldn
On 2020/9/4 15:23, Richard Biener wrote:
> On Fri, Sep 4, 2020 at 9:19 AM Richard Biener
> wrote:
>>
>> On Fri, Sep 4, 2020 at 8:38 AM luoxhu wrote:
>>>
>>>
>>>
>>> On 2020/9/4 14:16, luoxhu via Gcc-patches wrote:
>>>>
Hi,
On 2020/9/4 18:23, Segher Boessenkool wrote:
diff --git a/gcc/config/rs6000/rs6000-c.c b/gcc/config/rs6000/rs6000-c.c
index 03b00738a5e..00c65311f76 100644
--- a/gcc/config/rs6000/rs6000-c.c
+++ b/gcc/config/rs6000/rs6000-c.c
/* Build *(((arg1_inner_type*)&(vector type){arg1})+arg2)
Hi Richi,
On 2020/9/7 19:57, Richard Biener wrote:
> + if (TREE_CODE (to) == ARRAY_REF)
> + {
> + tree op0 = TREE_OPERAND (to, 0);
> + if (TREE_CODE (op0) == VIEW_CONVERT_EXPR
> + && expand_view_convert_to_vec_set (to, from, to_rtx))
> + {
> +
On 2020/9/8 16:26, Richard Biener wrote:
>> Seems not only pseudo, for example "v = vec_insert (i, v, n);"
>> the vector variable will be stored to the stack first, then [r112:DI] is a
>> memory here to be processed. So the patch loads it from stack(insn #10) to
>> temp vector register first, and st
From: Xionghu Luo
This P1 bug is exposed by the FRE refactoring in r263875. Comparing the fre
dump files shows no obvious change in the function that segfaults, which
points to a target issue.
frame_pointer_needed is set to true in reload pass setup_can_eliminate,
but regs_ever_live[31] is false, so pro_a
From: Xionghu Luo
Remove split code from add3 to allow a later pass to split.
This allows later logic to hoist out constant load in add instructions.
In loop, lis+ori could be hoisted out to improve performance compared with
previous addis+addi (About 15% on typical case), weak point is
one more
On 2020/7/7 08:18, Segher Boessenkool wrote:
> Hi!
>
> On Sun, Jul 05, 2020 at 09:17:57PM -0500, Xionghu Luo wrote:
>> For extracting high part element from DImode register like:
>>
>> {%1:SF=unspec[r122:DI>>0x20#0] 86;clobber scratch;}
>>
>> split it before reload with "and mask" to avoid gene
On 2020/7/8 05:31, Segher Boessenkool wrote:
> Hi!
>
> On Tue, Jul 07, 2020 at 04:39:58PM +0800, luoxhu wrote:
>>> Lots of questions, sorry!
>>
>> Thanks for the nice suggestions; the initial patch contained many issues :),
>
> Pretty much all of it should
On 2020/7/9 06:43, Segher Boessenkool wrote:
> Hi!
>
> On Wed, Jul 08, 2020 at 11:19:21AM +0800, luoxhu wrote:
>> For extracting high part element from DImode register like:
>>
>> {%1:SF=unspec[r122:DI>>0x20#0] 86;clobber scratch;}
>>
>> sp
Hi,
On 2020/7/10 03:25, Segher Boessenkool wrote:
> Hi!
>
> On Thu, Jul 09, 2020 at 11:09:42AM +0800, luoxhu wrote:
>>> Maybe change it back to just SI? It won't match often at all for QI or
>>> HI anyway, it seems. Sorry for that detour. Should be go
Update patch to keep the logic for non TARGET_P8_VECTOR targets.
Please ignore the previous [PATCH 1/2], Sorry!
Move V4SF to V4SI, init vector like V4SI and move back to V4SF.
Better instruction sequence could be generated on Power9:
lfs + xxpermdi + xvcvdpsp + vmrgew
=>
lwz + (sldi + or) + mtvs
On 2020/7/10 03:25, Segher Boessenkool wrote:
>
>> + "TARGET_NO_SF_SUBREG"
>> + "#"
>> + "&& vsx_reg_sfsubreg_ok (operands[0], SFmode)"
>
> Put this in the insn condition? And since this is just a predicate,
> you can just use it instead of gpc_reg_operand.
>
> (The split condition become
On 2020/7/11 08:28, Segher Boessenkool wrote:
Hi!
On Thu, Jul 09, 2020 at 09:14:45PM -0500, Xiong Hu Luo wrote:
* config/rs6000/rs6000.md (rotl_unspec): New
define_insn_and_split.
+; rldimi with UNSPEC_SI_FROM_SF.
+(define_insn_and_split "*rotl_unspec"
Please have rotldi
Hi,
On 2020/7/11 08:54, Segher Boessenkool wrote:
> Hi!
>
> On Fri, Jul 10, 2020 at 09:39:40AM +0800, luoxhu wrote:
>> OK, seems the md file needs a format tool too...
>
> Heh. Just make sure it looks good (that is, does what it looks like),
> looks like the res
Power8-LE, I re-ran these cases on
Power8-LE and confirmed that they pass. What is your platform, please?
BTW, TARGET_NO_SF_SUBREG ensured TARGET_POWERPC64 for this
define_insn_and_split.
Thanks.
Xionghu
>
> Thanks, David
>
> On Mon, Jul 13, 2020 at 2:30 AM luoxhu wrote:
>>
>