[Bug target/68837] PowerPC switch statement performance

2020-06-03 Thread guihaoc at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=68837 HaoChen Gui changed: What|Removed |Added CC||guihaoc at gcc dot gnu.org --- Comment #2

[Bug target/68837] PowerPC switch statement performance

2020-06-10 Thread guihaoc at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=68837 --- Comment #5 from HaoChen Gui --- I think there are two ways to avoid sign extension for offset loading. a. Make sure all offsets are positive. There are backward jumps, and STC will reorder the basic blocks, so the offset might be neg

[Bug target/100952] [12 regression] several test case failures after r12-1202

2021-06-07 Thread guihaoc at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100952 --- Comment #2 from HaoChen Gui --- For pr81348.c, the lxsihzx and vextsh2d are no longer needed as the HI to DI mode promotion is removed by my patch. The short could be loaded directly. addis 9,2,.LANCHOR0+4@toc@ha lha 9,.LANCHOR0+4@toc@l(9) I

[Bug target/100952] [12 regression] several test case failures after r12-1202

2021-06-07 Thread guihaoc at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100952 --- Comment #3 from HaoChen Gui --- For pr56605.c, mode promotion is not applied to the pseudo. The original combined insn compare:CC (and:DI (reg:DI 206) changes to compare:CC (and:SI (subreg:SI (reg:DI 206) 0), so the dump scan fails. I will change

[Bug target/100952] [12 regression] several test case failures after r12-1202

2021-06-08 Thread guihaoc at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100952 --- Comment #4 from HaoChen Gui --- For fusion-p10-ldcmpi.c and prefix-no-update.c, the optimization doesn't work when mode promotion is disabled. I will do further investigation.

[Bug target/100952] [12 regression] several test case failures after r12-1202

2021-06-24 Thread guihaoc at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100952 --- Comment #6 from HaoChen Gui --- Seurer, could you provide detailed info about the test, such as the config and build options? I tested the build on P10 and saw no failure on parity_1.f90. Thanks. PASS: gfortran.dg/parity_1.f90 -O0 (test for excess err

[Bug target/100952] [12 regression] several test case failures after r12-1202

2021-06-24 Thread guihaoc at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100952 --- Comment #7 from HaoChen Gui --- PASS: gfortran.dg/parity_1.f90 -O0 execution test

[Bug target/103628] ICE: Segmentation fault (in gfc_conv_tree_to_mpfr)

2023-02-22 Thread guihaoc at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103628 HaoChen Gui changed: What|Removed |Added Assignee|unassigned at gcc dot gnu.org |guihaoc at gcc dot gnu.org

[Bug rtl-optimization/98179] gcc.dg/pr97954.c fails on (at least) BE powerpc

2022-02-21 Thread guihaoc at gcc dot gnu.org via Gcc-bugs
|--- |INVALID CC||guihaoc at gcc dot gnu.org --- Comment #1 from HaoChen Gui --- Tested gcc11 on Power8 BE. Unable to reproduce this issue. The issue should be already fixed by r11-5613-g404d0ca7820bbf258e2edfac423403ee31b48a7b.

[Bug target/103316] PowerPC: Gimple folding of int128 comparisons produces suboptimal code

2022-03-06 Thread guihaoc at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103316 HaoChen Gui changed: What|Removed |Added CC||guihaoc at gcc dot gnu.org --- Comment

[Bug tree-optimization/105030] New: store motion if-change flag causes if-conversion optimization can't be taken.

2022-03-22 Thread guihaoc at gcc dot gnu.org via Gcc-bugs
erity: normal Priority: P3 Component: tree-optimization Assignee: unassigned at gcc dot gnu.org Reporter: guihaoc at gcc dot gnu.org Target Milestone: --- // source code extern void bar (double *, int); void foo (double a[], int n) { double atemp = 0.5;

[Bug tree-optimization/105030] store motion if-change flag causes if-conversion optimization can't be taken.

2022-03-22 Thread guihaoc at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105030 HaoChen Gui changed: What|Removed |Added Host||powerpc-*-linux-gnu --- Comment #1 from H

[Bug tree-optimization/105030] store motion if-change flag causes if-conversion optimization can't be taken.

2022-03-23 Thread guihaoc at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105030 --- Comment #3 from HaoChen Gui --- (In reply to Richard Biener from comment #2) > That occured to me as well - I think the answer is maybe. In principle > foo() could launch a thread and make the 'atemp' available to it. As long > as foo() ou

[Bug tree-optimization/105030] store motion if-change flag causes if-conversion optimization can't be taken.

2022-03-24 Thread guihaoc at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105030 --- Comment #5 from HaoChen Gui --- (In reply to Richard Biener from comment #4) > something like > > void *bar (void *x) > { > *(double *)x = 1.; > } > > void foo(int n) > { >double atemp; >pthread_create (..., bar, &atemp); >fo

[Bug tree-optimization/105030] store motion if-change flag causes if-conversion optimization can't be taken.

2022-03-25 Thread guihaoc at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105030 --- Comment #7 from HaoChen Gui --- The original case comes from a Fortran program. I rewrote it in C. As the arguments are passed by reference in Fortran (by default), the problem is common. But I am not sure if it has a large performance imp

[Bug tree-optimization/105030] store motion if-change flag causes if-conversion optimization can't be taken.

2022-03-28 Thread guihaoc at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105030 --- Comment #9 from HaoChen Gui --- The escaped flag for 'atemp' isn't set with the Fortran source code, while it is set with the C source code. 'auto_var_in_fn_p + pt_solution_includes' works for Fortran code. But if the function is a head of the loop in Fo

[Bug target/102146] [11 regression] several test cases fails after r11-8940

2022-04-07 Thread guihaoc at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102146 --- Comment #4 from HaoChen Gui --- (In reply to Richard Biener from comment #2) > What's the status on the remaining failures? For pr56605.c, I already submitted a patch. Waiting for review. https://gcc.gnu.org/pipermail/gcc-patches/2022-Februa

[Bug target/102146] [11 regression] several test cases fails after r11-8940

2022-04-07 Thread guihaoc at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102146 --- Comment #5 from HaoChen Gui --- For prefix-no-update.c, the patch Segher proposed in PR103197 could fix it.

[Bug tree-optimization/105030] store motion if-change flag causes if-conversion optimization can't be taken.

2022-04-07 Thread guihaoc at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105030 --- Comment #11 from HaoChen Gui --- I tested the C source code with -Ofast, which allows store data races. It should do store motion, but it fails. The problem is in the cselim pass. It does conditional store replacement. The 'atemp' is converted to

[Bug tree-optimization/105030] store motion if-change flag causes if-conversion optimization can't be taken.

2022-04-08 Thread guihaoc at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105030 --- Comment #13 from HaoChen Gui --- Could we use the original alias set if the tree code of 'atemp' is var_decl? Is it safe? In which situations should we use alias set zero? Thanks.

[Bug target/102146] [11 regression] several test cases fails after r11-8940

2022-04-13 Thread guihaoc at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102146 --- Comment #9 from HaoChen Gui --- Could you backport the patch to GCC11? Thanks.

[Bug target/102146] [11 regression] several test cases fails after r11-8940

2022-04-13 Thread guihaoc at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102146 --- Comment #10 from HaoChen Gui --- (In reply to HaoChen Gui from comment #9) > Could you backport the patch to GCC11? Thanks. Please ignore it as the patch has a problem. Thanks.

[Bug target/103605] [PowerPC] fmin/fmax should be inlined always with xsmindp/xsmaxdp

2022-04-26 Thread guihaoc at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103605 HaoChen Gui changed: What|Removed |Added CC||guihaoc at gcc dot gnu.org --- Comment

[Bug target/103605] [PowerPC] fmin/fmax should be inlined always with xsmindp/xsmaxdp

2022-04-26 Thread guihaoc at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103605 --- Comment #6 from HaoChen Gui --- gcc -O0 -fsignaling-nans -D_WANT_SNAN -lm -o main main.c && ./main (nan, 3.0), fmin: 3.0, builtin: 3.0, xsmincdp: 3.0, xsmindp: 3.0 (3.0, nan), fmin: 3.0, builtin: nan, xsmincdp: nan, xsmindp: 3.0 (snan, 3.0
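The asymmetry visible in the output above, where (3.0, nan) yields 3.0 for fmin and xsmindp but nan for xsmincdp, is what one would expect from two different definitions of "minimum". The following is a minimal sketch in plain C, assuming quiet NaNs only. The function names c_style_min and ieee_style_min are hypothetical and are not taken from the bug report; signaling-NaN exceptions and signed zeros are deliberately ignored.

```c
#include <math.h>   /* NAN macro only; no libm functions are called */

/* Comparison-based minimum: any ordered comparison with NaN is false,
   so the second operand is returned whenever either input is NaN. */
double c_style_min(double a, double b)
{
    return a < b ? a : b;
}

/* fmin-like minimum for quiet NaNs: a NaN operand is treated as
   missing data and the other operand is returned. */
double ieee_style_min(double a, double b)
{
    if (a != a) return b;   /* a is NaN */
    if (b != b) return a;   /* b is NaN */
    return a < b ? a : b;
}
```

With these definitions, c_style_min depends on operand order when one input is NaN, while ieee_style_min does not, which is why the two cannot be substituted for each other without checking the NaN assumptions in effect.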

[Bug tree-optimization/105414] New: constant folding for fmin/max(snan, snan) is wrong

2022-04-27 Thread guihaoc at gcc dot gnu.org via Gcc-bugs
Component: tree-optimization Assignee: unassigned at gcc dot gnu.org Reporter: guihaoc at gcc dot gnu.org Target Milestone: --- gcc -O0 -fsignaling-nans -D_WANT_SNAN -lm -o fmin fmin.c && ./fmin (snan, snan), fmin: nan gcc -O3 -fsignaling-nans -D_WANT_SNAN -lm -o fmi

[Bug tree-optimization/105414] constant folding for fmin/max(snan, snan) is wrong

2022-04-27 Thread guihaoc at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105414 --- Comment #2 from HaoChen Gui --- (In reply to Andrew Pinski from comment #1) > What target is this on? I tested it on ppc64le. But I think it should be on all targets?

[Bug tree-optimization/105414] constant folding for fmin/max(snan, snan) is wrong

2022-04-27 Thread guihaoc at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105414 --- Comment #3 from HaoChen Gui --- For fmin/max behavior, I referred to this ticket. https://sourceware.org/bugzilla/show_bug.cgi?id=20947

[Bug tree-optimization/105414] constant folding for fmin/max(snan, snan) is wrong

2022-04-28 Thread guihaoc at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105414 --- Comment #6 from HaoChen Gui --- (In reply to Richard Biener from comment #4) > I think you want > > (if (!tree_expr_maybe_signaling_nan_p (@0)) > ... > > instead. Thanks so much for comments. Do we have a way to return a NaN directly in

[Bug tree-optimization/105414] constant folding for fmin/max(snan, snan) is wrong

2022-04-29 Thread guihaoc at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105414 --- Comment #8 from HaoChen Gui --- (In reply to Jakub Jelinek from comment #7) > Sure, but you don't want to do that at least if flag_trapping_math. > Otherwise, the predicate would be tree_expr_signaling_nan_p and real_nan > function with "",

[Bug target/100694] PPC: initialization of __int128 is very inefficient

2022-07-25 Thread guihaoc at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100694 HaoChen Gui changed: What|Removed |Added CC||guihaoc at gcc dot gnu.org --- Comment

[Bug target/100996] rs6000 p10 vector add-add fusion should work with -m32 but doesn't

2022-07-28 Thread guihaoc at gcc dot gnu.org via Gcc-bugs
|RESOLVED CC||guihaoc at gcc dot gnu.org --- Comment #1 from HaoChen Gui --- (In reply to acsawdey from comment #0) > The fusion-p10-addadd.c test case does not get vector add-add fusion when > compiling with -m32: > > /home

[Bug target/100996] rs6000 p10 vector add-add fusion should work with -m32 but doesn't

2022-07-28 Thread guihaoc at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100996 --- Comment #2 from HaoChen Gui --- (In reply to acsawdey from comment #0) > The fusion-p10-addadd.c test case does not get vector add-add fusion when > compiling with -m32: > > /home/sawdey/work/gcc/trunk/build/gcc/xgcc > -B/home/sawdey/work/g

[Bug target/100694] PPC: initialization of __int128 is very inefficient

2022-07-28 Thread guihaoc at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100694 --- Comment #6 from HaoChen Gui --- I made a patch to convert ashift to move when the second operand is const0_rtx. With the patch, the expand dump is just like aarch64's. But the problem is still there. I tested the patch with SPECint. All the

[Bug target/95737] PPC: Unnecessary extsw after negative less than

2022-07-28 Thread guihaoc at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=95737 HaoChen Gui changed: What|Removed |Added Resolution|--- |FIXED Status|ASSIGNED

[Bug target/103498] Spec 2017 imagick_r is 2.62% slower on Power10 with pc-relative addressing compared to not using pc-relative addressing

2022-08-08 Thread guihaoc at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103498 HaoChen Gui changed: What|Removed |Added CC||guihaoc at gcc dot gnu.org --- Comment

[Bug target/102146] [11 regression] several test cases fails after r11-8940

2022-08-16 Thread guihaoc at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102146 --- Comment #20 from HaoChen Gui --- (In reply to Segher Boessenkool from comment #19) > Hi guys, > > What testcases are still failing? I'm a bit lost :-) pr56605.c is still not fixed. +FAIL: gcc.target/powerpc/pr56605.c scan-rtl-dump-times

[Bug target/103109] madd not used for multiply add on POWER9

2022-08-18 Thread guihaoc at gcc dot gnu.org via Gcc-bugs
||guihaoc at gcc dot gnu.org Status|UNCONFIRMED |RESOLVED --- Comment #2 from HaoChen Gui --- Fixed by r13-2107.

[Bug middle-end/102316] Unexpected stringop-overflow Warnings on POWER CPU

2022-08-25 Thread guihaoc at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102316 HaoChen Gui changed: What|Removed |Added CC||guihaoc at gcc dot gnu.org

[Bug rtl-optimization/107013] New: Add fmin/fmax to RTL codes

2022-09-22 Thread guihaoc at gcc dot gnu.org via Gcc-bugs
Assignee: unassigned at gcc dot gnu.org Reporter: guihaoc at gcc dot gnu.org Target Milestone: --- Could we add fmin/fmax to RTL codes so that the C standard fmin/fmax can be represented in RTL without UNSPECs? Currently we only have smin/smax that are not valid for NaNs, or when

[Bug target/103109] madd not used for multiply add on POWER9

2022-10-10 Thread guihaoc at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103109 --- Comment #4 from HaoChen Gui --- (In reply to Peter Bergner from comment #3) > (In reply to HaoChen Gui from comment #2) > > Fixed by r13-2107. > > This is marked version = GCC 12. Were you planning on backporting this? Not sure if the pa

[Bug target/95737] PPC: Unnecessary extsw after negative less than

2022-01-05 Thread guihaoc at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=95737 HaoChen Gui changed: What|Removed |Added CC||guihaoc at gcc dot gnu.org --- Comment #6

[Bug target/93127] PPC altivec vec_promote creates unnecessary xxpermdi instruction

2022-01-14 Thread guihaoc at gcc dot gnu.org via Gcc-bugs
|--- |FIXED CC||guihaoc at gcc dot gnu.org --- Comment #4 from HaoChen Gui --- I committed a patch (r12-4987) which is related to this issue. But it doesn't behave as the ticket expects. With the patch, vec_min/max is bound to xv[min|m

[Bug target/95737] PPC: Unnecessary extsw after negative less than

2022-01-14 Thread guihaoc at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=95737 --- Comment #9 from HaoChen Gui --- Add a pattern to convert the plus mode to DI. +(define_insn_and_split "*my_split" + [(set (match_operand:DI 0 "gpc_reg_operand") + (sign_extend:DI (plus:SI (match_operand:SI 1 "ca_operand") +

[Bug target/103197] ppc inline expansion of memcpy/memmove should not use lxsibzx/stxsibx for a single byte

2022-01-16 Thread guihaoc at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103197 HaoChen Gui changed: What|Removed |Added CC||guihaoc at gcc dot gnu.org --- Comment

[Bug target/100952] [12 regression] several test case failures after r12-1202

2022-01-16 Thread guihaoc at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100952 --- Comment #16 from HaoChen Gui --- prefix-no-update.c should be fixed by the patch Segher proposed in PR103197. pr56605.c got a wrong fix and failed with GCC11. I will submit a patch to fix it.

[Bug target/103124] PPC: "mr" instruction is unnecessary when extending DI to V1TI

2022-01-17 Thread guihaoc at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103124 HaoChen Gui changed: What|Removed |Added Status|NEW |RESOLVED Resolution|---

[Bug target/103124] New: PPC: "mr" instruction is unnecessary when extending DI to V1TI

2021-11-07 Thread guihaoc at gcc dot gnu.org via Gcc-bugs
ty: normal Priority: P3 Component: target Assignee: unassigned at gcc dot gnu.org Reporter: guihaoc at gcc dot gnu.org Target Milestone: --- //test.c vector __int128 init (long long a) { vector __int128 b; b = (vector __int128) {a}; return b; } gcc -O2 -s te

[Bug target/103124] PPC: "mr" instruction is unnecessary when extending DI to V1TI

2021-11-07 Thread guihaoc at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103124 --- Comment #1 from HaoChen Gui --- Build command gcc -O2 -S test.c -mcpu=power9

[Bug target/103124] PPC: "mr" instruction is unnecessary when extending DI to V1TI

2021-11-08 Thread guihaoc at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103124 --- Comment #2 from HaoChen Gui --- //lower-subreg.c /* If this is a cast from one mode to another, where the modes have the same size, and they are not tieable, then mark this register as non-decomposable. I

[Bug target/93453] PPC: rldimi not taken into account to avoid shift+or

2021-11-08 Thread guihaoc at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=93453 HaoChen Gui changed: What|Removed |Added CC||guihaoc at gcc dot gnu.org --- Comment #2

[Bug target/103124] PPC: "mr" instruction is unnecessary when extending DI to V1TI

2021-11-08 Thread guihaoc at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103124 --- Comment #3 from HaoChen Gui --- My solution is to split the move (from TI to V1TI) into one vsx_concat_v2di and one V2DI to V1TI move. Thus, TI register 122 can be decomposed. (insn 12 11 17 2 (set (reg:V1TI 121 [ b ]) (subreg:V1TI

[Bug target/93453] PPC: rldimi not taken into account to avoid shift+or

2021-11-15 Thread guihaoc at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=93453 --- Comment #4 from HaoChen Gui --- For the second issue, I drafted following insn_and_split pattern. It tries to combine the shift and ior when the nonzero_bits of operands[3] matches the condition. (define_insn_and_split "*rotl3_insert_8" [

[Bug target/93453] PPC: rldimi not taken into account to avoid shift+or

2021-11-22 Thread guihaoc at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=93453 --- Comment #6 from HaoChen Gui --- Segher, Yes, I found that nonzero_bits doesn't return an exact value in other passes. So calling nonzero_bits in the md file is bad as it can't be recognized in other passes. Right now I want to convert a single

[Bug target/93453] PPC: rldimi not taken into account to avoid shift+or

2021-11-24 Thread guihaoc at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=93453 --- Comment #8 from HaoChen Gui --- I refined the patch and put all things in a helper - change_pseudo_and_mask. As you mentioned, it's still a band-aid. The perfect solution might be a better version of nonzero_bits. Thanks. diff --git a/gcc/co

[Bug target/103124] PPC: "mr" instruction is unnecessary when extending DI to V1TI

2021-12-01 Thread guihaoc at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103124 --- Comment #5 from HaoChen Gui --- (In reply to Segher Boessenkool from comment #4) > Skipping mode TI for zero_extend lowering. > Splitting mode TI for ashift lowering with shift amounts = > Splitting mode TI for lshiftrt lowering with

[Bug target/100868] PPC: Inefficient code for vec_reve(vector double)

2021-12-06 Thread guihaoc at gcc dot gnu.org via Gcc-bugs
||guihaoc at gcc dot gnu.org Status|NEW |RESOLVED --- Comment #3 from HaoChen Gui --- Fixed on trunk.

[Bug target/100736] ICE: unrecognizable insn

2021-12-07 Thread guihaoc at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100736 HaoChen Gui changed: What|Removed |Added CC||guihaoc at gcc dot gnu.org --- Comment

[Bug target/100736] ICE: unrecognizable insn

2021-12-08 Thread guihaoc at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100736 --- Comment #4 from HaoChen Gui --- Yes, there is a question. With my patch, the test case generates the following assembly. It seems they have the same latency (cror vs. crnot). I wonder why we need to reverse the CR bit comparison when finite-math-only

[Bug target/100952] [12 regression] several test case failures after r12-1202

2021-12-19 Thread guihaoc at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100952 --- Comment #13 from HaoChen Gui --- Issue for fusion-p10-ldcmpi.c was fixed by r12-1655. https://gcc.gnu.org/pipermail/gcc-cvs/2021-June/349357.html

[Bug target/103784] New: suboptimal code for returning bool value on target ppc

2021-12-20 Thread guihaoc at gcc dot gnu.org via Gcc-bugs
Component: target Assignee: unassigned at gcc dot gnu.org Reporter: guihaoc at gcc dot gnu.org Target Milestone: --- //test.c #include bool foo (int a, int b) { if (a > 2) return false; if (b < 10) return true; return true; } //assembly with trunk

[Bug target/103784] suboptimal code for returning bool value on target ppc

2021-12-20 Thread guihaoc at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103784 --- Comment #2 from HaoChen Gui --- Sorry, I pasted the wrong code. Here is the correct version. //test.c #include bool foo (int a, int b) { if (a > 2) return false; if (b < 10) return true; return false; } //assembly with the trunk
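To make clear what the corrected test case computes, here is a minimal sketch that restates it next to a branch-free equivalent. The name foo_branchless is hypothetical, and the use of stdbool.h is an assumption, since the header name is missing from the snippet. This only illustrates the logical value of the function; it makes no claim about the assembly GCC emits or should emit.

```c
#include <stdbool.h>

/* Control flow as written in the corrected test case. */
bool foo(int a, int b)
{
    if (a > 2)
        return false;
    if (b < 10)
        return true;
    return false;
}

/* Hypothetical branch-free form: true only when both conditions hold. */
bool foo_branchless(int a, int b)
{
    return (a <= 2) & (b < 10);
}
```

Both functions return the same value for every pair of inputs, so the two early returns collapse into a single conjunction of two comparisons.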

[Bug target/103784] suboptimal code for returning bool value on target ppc

2021-12-20 Thread guihaoc at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103784 --- Comment #4 from HaoChen Gui --- output with "-fdump-tree-optimized=/dev/stdout" ;; Function foo (foo, funcdef_no=0, decl_uid=3317, cgraph_uid=1, symbol_order=0) Removing basic block 5 _Bool foo (int a, int b) { _Bool _1; _Bool _5; [

[Bug target/102146] [11 regression] several test cases fails after r11-8940

2021-09-02 Thread guihaoc at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102146 --- Comment #1 from HaoChen Gui --- For pr81348.c, it was already fixed by r11-8941. Segher backported it. https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100952#c12 PASS: gcc.target/powerpc/pr81348.c (test for excess errors) PASS: gcc.target/pow

[Bug target/102169] powerpc64 int memory operations using FP instructions

2021-09-29 Thread guihaoc at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102169 HaoChen Gui changed: What|Removed |Added CC||guihaoc at gcc dot gnu.org --- Comment

[Bug target/108004] New: x-form logical operations with dot instructions are not emitted.

2022-12-06 Thread guihaoc at gcc dot gnu.org via Gcc-bugs
Priority: P3 Component: target Assignee: unassigned at gcc dot gnu.org Reporter: guihaoc at gcc dot gnu.org Target Milestone: --- //test case int foo (int a, int b, int c, int d) { return (a & b) > 0 ? c : d; } //assemble on P9 and 3,3,4

[Bug target/108004] x-form logical operations with dot instructions are not emitted.

2022-12-06 Thread guihaoc at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108004 --- Comment #3 from HaoChen Gui --- (In reply to Andrew Pinski from comment #2) > Especially when it comes to signed comparisons. From the ISA, For all fixed-point instructions in which Rc=1, and for addic., andi., and andis., the first three

[Bug target/108004] x-form logical operations with dot instructions are not emitted.

2022-12-07 Thread guihaoc at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108004 --- Comment #4 from HaoChen Gui --- $cat asm_test.c #include unsigned long foo() { unsigned long res; __asm__ ("li 3,0x\n\t" "li 4,0xfff1\n\t" "and. 3,3,4\n\t" "mfcr %0"

[Bug target/100866] PPC: Inefficient code for vec_revb(vector unsigned short) < P9

2022-12-13 Thread guihaoc at gcc dot gnu.org via Gcc-bugs
|RESOLVED Assignee|unassigned at gcc dot gnu.org |guihaoc at gcc dot gnu.org CC||guihaoc at gcc dot gnu.org --- Comment #17 from HaoChen Gui --- Both issues are fixed.

[Bug target/100952] [12/13 regression] several test case failures after r12-1202

2022-12-19 Thread guihaoc at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100952 HaoChen Gui changed: What|Removed |Added Resolution|--- |FIXED Status|UNCONFIRMED

[Bug tree-optimization/105414] constant folding for fmin/max(snan, snan) is wrong

2022-05-11 Thread guihaoc at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105414 HaoChen Gui changed: What|Removed |Added Resolution|--- |FIXED Status|NEW

[Bug target/102146] [11 regression] several test cases fails after r11-8940

2022-05-19 Thread guihaoc at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102146 --- Comment #15 from HaoChen Gui --- As r12-8128 was revoked, failure of pr56605.c is still not fixed.

[Bug target/103316] PowerPC: Gimple folding of int128 comparisons produces suboptimal code

2022-06-16 Thread guihaoc at gcc dot gnu.org via Gcc-bugs
at gcc dot gnu.org |guihaoc at gcc dot gnu.org Resolution|--- |FIXED --- Comment #18 from HaoChen Gui --- Fixed by r13-1131

[Bug target/101865] _ARCH_PWR8 is not defined when using -mcpu=power8

2021-08-24 Thread guihaoc at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101865 --- Comment #4 from HaoChen Gui --- Codes in rs6000-cpus.def, #define ISA_2_7_MASKS_SERVER(ISA_2_6_MASKS_SERVER \ | OPTION_MASK_P8_VECTOR\ |

[Bug target/101865] _ARCH_PWR8 is not defined when using -mcpu=power8

2021-08-24 Thread guihaoc at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101865 --- Comment #6 from HaoChen Gui --- (In reply to Segher Boessenkool from comment #5) > (In reply to HaoChen Gui from comment #4) > > I wonder if it's a Power8 architecture when those 6 options are all > > disabled. Or it is regressed to Power7?

[Bug target/101865] _ARCH_PWR8 is not defined when using -mcpu=power8

2021-08-25 Thread guihaoc at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101865 --- Comment #9 from HaoChen Gui --- (In reply to Tulio Magno Quites Machado Filho from comment #7) > (In reply to HaoChen Gui from comment #6) > > Does _ARCH_PWR8 impact anything during the compiling? > > I can answer this question from an user

[Bug target/93802] gcc generates a rlwinm/or pair instead of a single rlwimi (powerpc)

2024-03-05 Thread guihaoc at gcc dot gnu.org via Gcc-bugs
|RESOLVED CC||guihaoc at gcc dot gnu.org --- Comment #2 from HaoChen Gui --- It's already fixed by f4a3cea3fb02. Now it generates a single rlwimi. rlwimi 3,3,16,0,31-16

[Bug target/103605] [PowerPC] fmin/fmax should be inlined always with xsmindp/xsmaxdp

2023-08-16 Thread guihaoc at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103605 HaoChen Gui changed: What|Removed |Added Resolution|--- |FIXED Status|UNCONFIRMED

[Bug target/110429] Redundant vector extract instruction on P9

2023-08-16 Thread guihaoc at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110429 HaoChen Gui changed: What|Removed |Added Resolution|--- |FIXED Status|UNCONFIRMED

[Bug target/106769] PPCLE: vec_extract(vector unsigned int) unnecessary rldicl after mfvsrwz

2023-08-16 Thread guihaoc at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106769 HaoChen Gui changed: What|Removed |Added Status|ASSIGNED|RESOLVED Resolution|---

[Bug rtl-optimization/110034] The first popped allcono doesn't take precedence over later popped in ira coloring

2023-08-29 Thread guihaoc at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110034 --- Comment #5 from HaoChen Gui --- (In reply to Vladimir Makarov from comment #4) > Thank you for providing the test case. > > To be honest I don't see why assigning to hr3 to r134 is better. > Currently we have the following assignments: > >

[Bug target/108728] gcc.dg/torture/float128-cmp-invalid.c fails on power 9 BE

2023-08-29 Thread guihaoc at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108728 HaoChen Gui changed: What|Removed |Added Status|NEW |RESOLVED Resolution|---

[Bug rtl-optimization/110034] The first popped allcono doesn't take precedence over later popped in ira coloring

2023-08-30 Thread guihaoc at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110034 HaoChen Gui changed: What|Removed |Added Resolution|--- |INVALID Status|UNCONFIRMED

[Bug target/108812] gcc.target/powerpc/p9-sign_extend-runnable.c fails on power 9 BE

2023-09-11 Thread guihaoc at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108812 HaoChen Gui changed: What|Removed |Added Resolution|--- |FIXED Status|UNCONFIRMED

[Bug target/96762] ICE in extract_insn, at recog.c:2294 (error: unrecognizable insn)

2023-09-11 Thread guihaoc at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=96762 HaoChen Gui changed: What|Removed |Added Resolution|--- |FIXED Status|NEW

[Bug target/111449] New: memcmp (p,q,16) == 0 can be optimized better on ppc64 with vector comparison instructions

2023-09-17 Thread guihaoc at gcc dot gnu.org via Gcc-bugs
Severity: normal Priority: P3 Component: target Assignee: unassigned at gcc dot gnu.org Reporter: guihaoc at gcc dot gnu.org Target Milestone: --- int compare (const char* s1, const char* s2) { return __builtin_memcmp (s1, s2, 16) == 0; } trunk
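The reason an equality-only memcmp is a good candidate for wide comparisons is that the result does not depend on which byte differs first, so the 16 bytes can be compared in large chunks. The following is a minimal scalar sketch of that idea using two 64-bit words as a stand-in for a vector compare. The helper name equal16 is hypothetical, and this is not the code GCC generates before or after the fix.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: returns 1 when the 16 bytes at s1 and s2 are
   identical, 0 otherwise. memcpy is used for the loads so that
   unaligned pointers are handled without undefined behavior. */
int equal16(const char *s1, const char *s2)
{
    uint64_t a0, a1, b0, b1;

    memcpy(&a0, s1, 8);
    memcpy(&a1, s1 + 8, 8);
    memcpy(&b0, s2, 8);
    memcpy(&b1, s2 + 8, 8);

    /* XOR is zero only for equal words; OR merges both halves so a
       single test against zero decides equality. */
    return ((a0 ^ b0) | (a1 ^ b1)) == 0;
}
```

Because only equality is tested, byte order within each word does not matter; an ordered memcmp result (less than, greater than) would need additional handling.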

[Bug target/88558] Inline lrint, lrintf

2023-10-08 Thread guihaoc at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88558 HaoChen Gui changed: What|Removed |Added CC||guihaoc at gcc dot gnu.org

[Bug target/116030] ICE "could not split insn" in final_scan_insn_1, at final.cc on power pc

2024-08-18 Thread guihaoc at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=116030 --- Comment #2 from HaoChen Gui --- Peter, the root cause of the issue is that the combine is done in the ira pass. There is no split pass after ira and reload. So the split has to be done after reload, which causes the ICE. Jeff (Jiufu) is working on

[Bug target/54063] [10/11/12/13/14 regression] on powerpc64 gcc 4.9/8 generates larger code for global variable accesses than gcc 4.7

2023-04-20 Thread guihaoc at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=54063 HaoChen Gui changed: What|Removed |Added CC||guihaoc at gcc dot gnu.org --- Comment

[Bug target/114732] New: ge can't be reversed to unlt for bcd compares

2024-04-15 Thread guihaoc at gcc dot gnu.org via Gcc-bugs
onent: target Assignee: unassigned at gcc dot gnu.org Reporter: guihaoc at gcc dot gnu.org Target Milestone: --- //test.c int foo (vector unsigned char a, vector unsigned char b) { return __builtin_vec_bcdsub_ge (a, b, 0) != 1; } //assembly bcdsub. 2,2,3,0

[Bug target/114732] ge can't be reversed to unlt for bcd compares

2024-04-15 Thread guihaoc at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114732 --- Comment #1 from HaoChen Gui --- A straightforward test case. It passes when compiling with O0 and aborts when compiling with O2. //test.c #include #define BCD_POS0 12// 0xC #define BCD_NEG 13// 0xD void abort (void); vecto

[Bug target/114732] ge can't be reversed to unlt for bcd compares

2024-04-16 Thread guihaoc at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114732 --- Comment #6 from HaoChen Gui --- (In reply to Segher Boessenkool from comment #3) > 1001, 0101, 0011 I mean of course. > > In some ways CCmode models this better than CCFPmode, but we do not actually > model > the SO bit (bit 3) at all in CC

[Bug rtl-optimization/112417] New: expand_builtin_return shoud check alignment for the memory reference

2023-11-06 Thread guihaoc at gcc dot gnu.org via Gcc-bugs
Priority: P3 Component: rtl-optimization Assignee: unassigned at gcc dot gnu.org Reporter: guihaoc at gcc dot gnu.org Target Milestone: --- //test.c void * foo (void * p) { if (p) __builtin_return (p); } When compiling it with -mno-vsx on ppc64, it

[Bug rtl-optimization/112417] expand_builtin_return shoud check alignment for the memory reference

2023-11-06 Thread guihaoc at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112417 HaoChen Gui changed: What|Removed |Added Assignee|unassigned at gcc dot gnu.org |guihaoc at gcc dot gnu.org

[Bug target/111449] memcmp (p,q,16) == 0 can be optimized better on ppc64 with vector comparison instructions

2023-11-17 Thread guihaoc at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111449 HaoChen Gui changed: What|Removed |Added Status|UNCONFIRMED |RESOLVED Resolution|---

[Bug target/112707] [14 regression] gcc 14 outputs invalid assembly on ppc: Error: unrecognized opcode: `fctid'

2023-11-26 Thread guihaoc at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112707 HaoChen Gui changed: What|Removed |Added CC||linkw at gcc dot gnu.org --- Comment #4 f

[Bug target/112707] [14 regression] gcc 14 outputs invalid assembly on ppc: Error: unrecognized opcode: `fctid'

2023-11-27 Thread guihaoc at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112707 --- Comment #9 from HaoChen Gui --- (In reply to Segher Boessenkool from comment #8) > Yeah, it tested for ISA 2.04 before. That was an attempt at including 476 > probably? > > We really should have a TARGET_FCTID, on for TARGET_POWERPC64 or f

[Bug target/112707] [14 regression] gcc 14 outputs invalid assembly on ppc: Error: unrecognized opcode: `fctid'

2023-12-10 Thread guihaoc at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112707 HaoChen Gui changed: What|Removed |Added Resolution|--- |FIXED Status|NEW

[Bug rtl-optimization/110034] New: The first popped allcono doesn't take precedence over later popped in ira coloring

2023-05-30 Thread guihaoc at gcc dot gnu.org via Gcc-bugs
IRMED Severity: normal Priority: P3 Component: rtl-optimization Assignee: unassigned at gcc dot gnu.org Reporter: guihaoc at gcc dot gnu.org Target Milestone: --- Followings are ira dumps from a test case. r134 has only one cp(shuffle) with r173. The r173 and

[Bug target/106769] PPCLE: vec_extract(vector unsigned int) unnecessary rldicl after mfvsrwz

2023-05-30 Thread guihaoc at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106769 HaoChen Gui changed: What|Removed |Added Assignee|unassigned at gcc dot gnu.org |guihaoc at gcc dot gnu.org

[Bug rtl-optimization/110034] The first popped allcono doesn't take precedence over later popped in ira coloring

2023-05-30 Thread guihaoc at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110034 HaoChen Gui changed: What|Removed |Added CC||guihaoc at gcc dot gnu.org --- Comment
