[Bug target/93738] [13/14/15/16 regression] test case gcc.target/powerpc/20050603-3.c fails

2025-09-04 Thread segher at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=93738 --- Comment #17 from Segher Boessenkool --- (In reply to Kishan Parmar from comment #16) > Apart from this, i notice that other arch like ia64, aarch64, etc.. lowers > zero_extract to respective bit-field extract insns.. should we do the same > f

[Bug target/113939] Switch m68k to LRA

2025-09-01 Thread segher at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113939 --- Comment #14 from Segher Boessenkool --- (In reply to Mikael Pettersson from comment #12) > Could we perhaps switch the default to improve testing coverage with LRA? You do not have much time left for testing *without* LRA! Old reload will

[Bug tree-optimization/101641] Bogus redundant store removal

2025-09-01 Thread segher at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101641 --- Comment #12 from Segher Boessenkool --- And in RTL there are no types at all. Everything just holds bit patterns, which are interpreted in various modes. You can never translate that to a C type, no matter how hard you try!

[Bug tree-optimization/101641] Bogus redundant store removal

2025-09-01 Thread segher at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101641 --- Comment #10 from Segher Boessenkool --- > > > That is, somehow we must anticipate the removal, > > > I suppose it is via > > > > > > /* Recognize all noop sets, these will be killed by followup pass. */ > > > if (insn_code_number < 0 &

[Bug tree-optimization/101641] Bogus redundant store removal

2025-08-29 Thread segher at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101641 --- Comment #8 from Segher Boessenkool --- Hi! (In reply to Richard Biener from comment #7) > Wow, and this time it's even combine coming into play! But it is just something that happens during the instruction combiner pass, not anything to do

[Bug target/117818] [13/14/15/16 regression] vec_add incorrectly generates vadduwm for vector char const inputs.

2025-08-22 Thread segher at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=117818 --- Comment #12 from Segher Boessenkool --- Hrm, yeah, the ISA says bits 57..63 of VRB. That seems wrong, 121..127 is more logical. Let me test what existing hardware does.

[Bug target/117818] [13/14/15/16 regression] vec_add incorrectly generates vadduwm for vector char const inputs.

2025-08-22 Thread segher at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=117818 --- Comment #10 from Segher Boessenkool --- Btw, from power10 on (arch 3.1 and later) vslq (and vsrq) are preferred :-)

[Bug target/121520] [16 regression] g++.dg/DRs/dr2575.C FAIL

2025-08-17 Thread segher at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=121520 --- Comment #13 from Segher Boessenkool --- (In reply to Kishan Parmar from comment #12) > @Jakub New Tests still fails on Power10, Power9 IEEE128, Power9, Power8 > IEEE128. > > FAIL: g++.dg/DRs/dr2581-1.C -std=c++23 (test for warnings, line

[Bug target/121520] [16 regression] g++.dg/DRs/dr2575.C FAIL

2025-08-16 Thread segher at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=121520 --- Comment #11 from Segher Boessenkool --- (In reply to Sam James from comment #10) > (In reply to Sam James from comment #9) > > All of the changes in > > https://inbox.sourceware.org/gcc-patches/4b00a310-89fd-40b2-a7d1- > > 93cf55d0a...@redha

[Bug target/121520] [16 regression] g++.dg/DRs/dr2575.C FAIL

2025-08-15 Thread segher at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=121520 --- Comment #8 from Segher Boessenkool --- (In reply to Jakub Jelinek from comment #7) > Should be fixed now. But was it approved? Hint: it wasn't.

[Bug target/120528] Optimize zero extend to TImode when value is in a VSX register on power10

2025-08-14 Thread segher at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=120528 --- Comment #2 from Segher Boessenkool --- And you can always use mtvsrdd d,0,s (which uses literal 0 as first source, not GPR0). In RTL that canonically is written as a zero_extend.

[Bug target/121076] PPCLE: Inefficient implementation of __builtin_bswap16

2025-08-14 Thread segher at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=121076 --- Comment #2 from Segher Boessenkool --- Trying 6 -> 7: 6: {r117:HI=bswap(r122:DI#6);clobber scratch;} REG_DEAD r122:DI 7: r121:DI=zero_extend(r117:HI) REG_DEAD r117:HI Failed to match this instruction: (set (reg:DI 121 [ _

[Bug target/121076] PPCLE: Inefficient implementation of __builtin_bswap16

2025-08-14 Thread segher at gcc dot gnu.org via Gcc-bugs
||2025-08-14 CC||segher at gcc dot gnu.org Ever confirmed|0 |1 Target|powerpc-*-*-* |powerpc*-*-* --- Comment #1 from Segher Boessenkool --- Confirmed. Try -mcpu=power9 if you

[Bug target/121525] Add _Float16 support to PowerPC starting with power9

2025-08-13 Thread segher at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=121525 --- Comment #5 from Segher Boessenkool --- (In reply to Joseph S. Myers from comment #4) > If you pass _Float16 in a double precision register, note potential > signaling NaN issues - preferably a signaling NaN of type _Float16 should be > passe

[Bug target/121525] Add _Float16 support to PowerPC starting with power9

2025-08-13 Thread segher at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=121525 --- Comment #3 from Segher Boessenkool --- (In reply to Jakub Jelinek from comment #2) > If at all possible, please support _Float16 regardless of the ISA, Of course. Normally there will be emulation routines in libgcc, but very worst case we

[Bug target/121525] Add _Float16 support to PowerPC starting with power9

2025-08-13 Thread segher at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=121525 --- Comment #1 from Segher Boessenkool --- (In reply to Michael Meissner from comment #0) > Power9 (i.e. ISA 3.0) added support for the xscvdphp, xscvhpdp instructions > that convert scalar IEEE 16-bit floating points to/from SFmode as well as >

[Bug testsuite/121501] [16 regression] gcc.target/powerpc/sse4_1-pblendw-2.c FAIL starting with - r16-3067-g8e3239e3e92f3c

2025-08-11 Thread segher at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=121501 Segher Boessenkool changed: What|Removed |Added CC||segher at gcc dot gnu.org

[Bug rtl-optimization/121503] [16 regression] gcc.c-torture/execute/pr77718.c FAIL starting with - r16-3067-g8e3239e3e92f3c

2025-08-11 Thread segher at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=121503 --- Comment #5 from Segher Boessenkool --- Btw, the -mvsx forces -mcpu=power7. This is a well-known and ancient bug. Please don't rely on this, one day this misbehaviour will go away! :-)

[Bug rtl-optimization/121503] [16 regression] gcc.c-torture/execute/pr77718.c FAIL starting with - r16-3067-g8e3239e3e92f3c

2025-08-11 Thread segher at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=121503 Segher Boessenkool changed: What|Removed |Added CC||segher at gcc dot gnu.org

[Bug middle-end/121470] (unsigned short)0x8000 is handled unexpectedly; due to the way const_int is handled

2025-08-08 Thread segher at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=121470 --- Comment #10 from Segher Boessenkool --- (In reply to Andrew Pinski from comment #8) > (In reply to Segher Boessenkool from comment #7) > > Please stop the vandalism. This is NOT a dup. > > How is it not? > (unsigned char)0x80 vs (unsigned

[Bug middle-end/121470] (unsigned short)0x8000 is handled unexpectedly; due to the way const_int is handled

2025-08-08 Thread segher at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=121470 --- Comment #9 from Segher Boessenkool --- (In reply to Segher Boessenkool from comment #7) > Please stop the vandalism. This is NOT a dup. Of course this is not "how it always worked". We used to have RTL way earlier in the pipeline already.

[Bug middle-end/121470] (unsigned short)0x8000 is handled unexpectedly; due to the way const_int is handled

2025-08-08 Thread segher at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=121470 Segher Boessenkool changed: What|Removed |Added Resolution|DUPLICATE |--- Status|RESOLVED

[Bug middle-end/85344] constants with the sign bit set causes sign extension which is unexpected but not documented in the user documentation

2025-08-08 Thread segher at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=85344 --- Comment #25 from Segher Boessenkool --- The number is an integer constant. 32768 is 32768, not -32768. The value got that way (potentially, but not in this case even) because it was cast to an unsigned short. All of that is done way before

[Bug middle-end/85344] constants with the sign bit set causes sign extension which is unexpected but not documented in the user documentation

2025-08-08 Thread segher at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=85344 --- Comment #20 from Segher Boessenkool --- (In reply to Andrew Pinski from comment #18) > Simple answer: > When the INTEGER_CST (unsigned short) is expanded into a const_int, the sign > extend happens due to the rules of const_int. What does th

[Bug middle-end/85344] constants with the sign bit set causes sign extension which is unexpected but not documented in the user documentation

2025-08-08 Thread segher at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=85344 --- Comment #19 from Segher Boessenkool --- So, apparently force_reg was called here, and it went way down from there. It never should have ended up there, but it is a very common thing, there are tens of ways to get there, no clue what happened

[Bug middle-end/85344] constants with the sign bit set causes sign extension which is unexpected but not documented in the user documentation

2025-08-08 Thread segher at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=85344 --- Comment #17 from Segher Boessenkool --- When you cast to *signed* short instead, you get -32768, at tree level already. And that is correct. This is not the problem here. With the "unsigned short" code, f() here, you get +32768 at tree leve

[Bug middle-end/85344] constants with the sign bit set causes sign extension which is unexpected but not documented in the user documentation

2025-08-08 Thread segher at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=85344 --- Comment #16 from Segher Boessenkool --- There _is_ no const_int there yet, btw. There is no RTL at all yet!

[Bug middle-end/85344] constants with the sign bit set causes sign extension which is unexpected but not documented in the user documentation

2025-08-08 Thread segher at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=85344 --- Comment #15 from Segher Boessenkool --- What is this "TYPE_MODE"? Nothing here has type short_int, HImode: we have an integer constant value, 32768, which is cast to "unsigned short", which is a no-op: that results in an integer constant 327

[Bug middle-end/85344] constants with the sign bit set causes sign extension which is unexpected but not documented in the user documentation

2025-08-08 Thread segher at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=85344 --- Comment #13 from Segher Boessenkool --- (In reply to Andrew Pinski from comment #12) > (In reply to Segher Boessenkool from comment #11) > > (In reply to Andrew Pinski from comment #8) > > > Note this is documented in the internals documentat

[Bug middle-end/85344] constants with the sign bit set causes sign extension which is unexpected but not documented in the user documentation

2025-08-08 Thread segher at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=85344 --- Comment #11 from Segher Boessenkool --- (In reply to Andrew Pinski from comment #8) > Note this is documented in the internals documentation. What is? "We have a bug here"? I doubt it.

[Bug middle-end/121470] (unsigned short)0x8000 is handled unexpectedly; due to the way const_int is handled

2025-08-08 Thread segher at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=121470 Segher Boessenkool changed: What|Removed |Added Status|RESOLVED|REOPENED Resolution|DUPLIC

[Bug middle-end/121470] (unsigned short)0x8000 is handled unexpectedly; due to the way const_int is handled

2025-08-08 Thread segher at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=121470 Segher Boessenkool changed: What|Removed |Added Ever confirmed|0 |1 Resolution|DUPLICATE

[Bug rtl-optimization/121470] New: (unsigned short)0x8000 is expanded incorrectly

2025-08-08 Thread segher at gcc dot gnu.org via Gcc-bugs
: rtl-optimization Assignee: unassigned at gcc dot gnu.org Reporter: segher at gcc dot gnu.org Target Milestone: --- Created attachment 62086 --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=62086&action=edit testcase This testcase: === void f(void) { asm("

[Bug target/119702] PPCLE: Inefficient auto-vectorization for 64-bit shifts on Power9

2025-08-07 Thread segher at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=119702 --- Comment #19 from Segher Boessenkool --- (In reply to Avinash Jayakar from comment #17) > I looked at the slp vectorization pass that converts scalar gimple code to "straight-line parallelisation". Some "scalar" (whatever that means) things

[Bug target/119702] PPCLE: Inefficient auto-vectorization for 64-bit shifts on Power9

2025-08-07 Thread segher at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=119702 --- Comment #18 from Segher Boessenkool --- (In reply to Surya Kumari Jangala from comment #16) > With the testcase in the "Description", we are seeing both a splat and a > shift being generated. Instead, a single add instruction is more efficie

[Bug target/119702] PPCLE: Inefficient auto-vectorization for 64-bit shifts on Power9

2025-08-05 Thread segher at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=119702 --- Comment #15 from Segher Boessenkool --- (In reply to Avinash Jayakar from comment #14) > (In reply to Surya Kumari Jangala from comment #12) > > Ok. We also need to tackle the original issue, which is that a shift left > > can be optimized b

[Bug target/119702] PPCLE: Inefficient auto-vectorization for 64-bit shifts on Power9

2025-08-05 Thread segher at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=119702 --- Comment #13 from Segher Boessenkool --- (In reply to Surya Kumari Jangala from comment #12) > Ok. We also need to tackle the original issue, which is that a shift left > can be optimized by generating a vector add. Perhaps tackle this issue

[Bug target/119702] PPCLE: Inefficient auto-vectorization for 64-bit shifts on Power9

2025-08-04 Thread segher at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=119702 --- Comment #11 from Segher Boessenkool --- > Segher, is this a case of needing to add a combiner pattern to translate that > splat/shift into an add of itself? You only ever do "combiner patterns" to recognise something that combine generates

[Bug target/118480] Power9 target generates poor code for vector char splat immediate.

2025-08-02 Thread segher at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=118480 --- Comment #15 from Segher Boessenkool --- (In reply to Steven Munroe from comment #12) > Also from PowerISA 3.1C > > The result is placed into VSR[VRT+32], except if, for any > byte element in VSR[VRB+32], the low-order 3 bits are not > equal

[Bug target/118480] Power9 target generates poor code for vector char splat immediate.

2025-08-02 Thread segher at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=118480 --- Comment #14 from Segher Boessenkool --- (In reply to Steven Munroe from comment #11) > And as you point out the instructions vslo/vsro/vsl/vsr only care about bits > 121..127. Also older machines needed the byte splat for vsl/vsr. vslq look

[Bug target/118480] Power9 target generates poor code for vector char splat immediate.

2025-08-02 Thread segher at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=118480 --- Comment #13 from Segher Boessenkool --- (In reply to Segher Boessenkool from comment #9) > Both vsl and vslo actually look only at the right-most byte in the shift > amount argument (bits 125..127 resp. bits 121..124). In original AltiVec i

[Bug target/118480] Power9 target generates poor code for vector char splat immediate.

2025-08-01 Thread segher at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=118480 --- Comment #10 from Segher Boessenkool --- (In reply to Steven Munroe from comment #8) > It seems the evolution of the PowerISA and Vector intrinsics has not been > smooth. > > It is not obvious how to generate xxspltib from an intrinsic. > Ve

[Bug target/118480] Power9 target generates poor code for vector char splat immediate.

2025-08-01 Thread segher at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=118480 --- Comment #9 from Segher Boessenkool --- Both vsl and vslo actually look only at the right-most byte in the shift amount argument (bits 125..127 resp. bits 121..124). In original AltiVec it was required to hold the same value in every lane, b

[Bug target/118890] ubsan bootstrap failure for powerpc64le-unknown-linux-gnu

2025-07-31 Thread segher at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=118890 --- Comment #6 from Segher Boessenkool --- So, is this all done now?

[Bug target/119702] PPCLE: Inefficient auto-vectorization for 64-bit shifts on Power9

2025-07-31 Thread segher at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=119702 --- Comment #5 from Segher Boessenkool --- But of course we need -mcpu=power8 or later for that insn.

[Bug target/119702] PPCLE: Inefficient auto-vectorization for 64-bit shifts on Power9

2025-07-31 Thread segher at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=119702 --- Comment #4 from Segher Boessenkool --- No, we should generate code as Peter says in #c1. Doing a shift is worse code.

[Bug libgcc/115242] libgcc unwinder does not handle vector registers, even if the target machine supports them.

2025-07-31 Thread segher at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115242 --- Comment #10 from Segher Boessenkool --- (In reply to Florian Weimer from comment #9) > (In reply to Segher Boessenkool from comment #8) > > Can we have a testcase please? > > The test case in glibc: https://sourceware.org/bugzilla/show_bug.

[Bug libgcc/115242] libgcc unwinder does not handle vector registers, even if the target machine supports them.

2025-07-30 Thread segher at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115242 --- Comment #8 from Segher Boessenkool --- Can we have a testcase please? It sounds like the glibc you used was misconfigured. Of course VSX registers are not restored by the GCC unwinder stuff if you configured GCC to not support VSX register

[Bug target/118480] Power9 target generates poor code for vector char splat immediate.

2025-07-30 Thread segher at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=118480 Segher Boessenkool changed: What|Removed |Added Last reconfirmed||2025-07-30 Ever confirmed|0

[Bug target/93738] [13/14/15/16 regression] test case gcc.target/powerpc/20050603-3.c fails

2025-07-27 Thread segher at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=93738 --- Comment #15 from Segher Boessenkool --- (In reply to Kishan Parmar from comment #13) > Operand of 10 gets converted to below insn > > (and:SI (subreg:SI (lshiftrt:DI (reg:DI 129 [ x+-4 ]) > (const_int 12 [0xc])) 4) > (const_i

[Bug testsuite/119382] [15 Regression] gcc.target/powerpc/vsx-builtin-7.c fail starting with r15-7961-gdc47161c1f32c3

2025-07-25 Thread segher at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=119382 --- Comment #11 from Segher Boessenkool --- The flag will help. But it isn't as permanent as you should like: it's not really more than a side effect. So it won't really vanquish the problem.

[Bug target/115800] PowerPC GCC cannot build a little endian compile if --with-cpu=power5 is used

2025-07-25 Thread segher at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115800 Segher Boessenkool changed: What|Removed |Added Status|RESOLVED|WAITING Resolution|WONTFIX

[Bug target/115800] PowerPC GCC cannot build a little endian compile if --with-cpu=power5 is used

2025-07-25 Thread segher at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115800 --- Comment #10 from Segher Boessenkool --- (In reply to Michael Meissner from comment #8) > Given powerpcle64 requires a minimum of power8, I'm not sure it is worth > making libgfortran and libstdc++ build using --with-cpu=power5. In the past,

[Bug target/115800] PowerPC GCC cannot build a little endian compile if --with-cpu=power5 is used

2025-07-24 Thread segher at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115800 --- Comment #9 from Segher Boessenkool --- (In reply to Andreas Schwab from comment #7) > It is generally assumed that powerpc64le-*-* implies POWER7+ (glibc even > requires POWER8+). This is independent of the older -mlittle support (which > d

[Bug target/93738] [13/14/15/16 regression] test case gcc.target/powerpc/20050603-3.c fails

2025-07-24 Thread segher at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=93738 --- Comment #12 from Segher Boessenkool --- > However, this pattern is failing to match in some cases, > and we end up with two separate instructions: one for rotate and another for > insert. So this is *not* a combine problem at all you say? J

[Bug target/108958] Powerpcle could generate mtvsrdd for zero extend DI to TI mode, when the TImode is in a vector register

2025-07-22 Thread segher at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108958 --- Comment #4 from Segher Boessenkool --- (Btw, the subject says "powerpcle", but this is about something very different: powerpc64le. "powerpcle" is also a valid first component of a target triple! Almost no one used 32-bit PowerPC in wrong-e

[Bug testsuite/120805] [16 Regression] gcc.target/powerpc/p9-vec-length-epil-4.c fail starting with r16-1645-g309dbcea2cabb3

2025-07-22 Thread segher at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=120805 --- Comment #14 from Segher Boessenkool --- Hi! (In reply to Avinash Jayakar from comment #12) > (In reply to Segher Boessenkool from comment #10) > > As a meta-comment: almost everything using scan-assembler-times is > > obfuscated. > > > > It

[Bug testsuite/120805] [16 Regression] gcc.target/powerpc/p9-vec-length-epil-4.c fail starting with r16-1645-g309dbcea2cabb3

2025-07-21 Thread segher at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=120805 Segher Boessenkool changed: What|Removed |Added CC||segher at gcc dot gnu.org

[Bug target/121095] [15 Regression] Possibly unnecessary PRE pass on aarch64 for fpmr

2025-07-18 Thread segher at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=121095 Segher Boessenkool changed: What|Removed |Added CC||segher at gcc dot gnu.org

[Bug target/118480] Power9 target generates poor code for vector char splat immediate.

2025-07-17 Thread segher at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=118480 --- Comment #4 from Segher Boessenkool --- It's the splitter at altivec.md:321

[Bug target/118480] Power9 target generates poor code for vector char splat immediate.

2025-07-17 Thread segher at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=118480 --- Comment #3 from Segher Boessenkool --- Does xxspltib_constant_p return the wrong num_insns, or is the problem something lower, some splitter?

[Bug target/121007] [15 Regression] compiler hangs when building ffpmeg with -mcpu=power9 on ppc64le

2025-07-14 Thread segher at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=121007 Segher Boessenkool changed: What|Removed |Added Status|NEW |RESOLVED Resolution|---

[Bug target/121007] [15/16 Regression] compiler hangs when building ffpmeg with -mcpu=power9 on ppc64le

2025-07-10 Thread segher at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=121007 --- Comment #14 from Segher Boessenkool --- Thanks! If there is anything we (Power people) can do, please let us know!

[Bug target/117007] Poor optimization for small vector constants needed for vector shift/rotate/mask generation

2025-07-10 Thread segher at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=117007 --- Comment #17 from Segher Boessenkool --- Hi! So, why do we not generate xxspltib where it would help? Please send a patch? Improvements will usually be to the xxspltib-generating code itself, not to the legacy code that generates the old (c

[Bug target/87949] PowerPC saves CR registers across calls

2025-07-10 Thread segher at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=87949 Segher Boessenkool changed: What|Removed |Added Status|ASSIGNED|NEW

[Bug target/121007] [15/16 Regression] compiler hangs when building ffpmeg with -mcpu=power9 on ppc64le

2025-07-09 Thread segher at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=121007 --- Comment #12 from Segher Boessenkool --- (In reply to Jakub Jelinek from comment #11) > Though even there is uninitialized read I guess from temp.a. > That said, LRA obviously shouldn't hang even on code which has UB at runtime. Of course.

[Bug target/121007] [15/16 Regression] compiler hangs when building ffpmeg with -mcpu=power9 on ppc64le

2025-07-09 Thread segher at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=121007 --- Comment #9 from Segher Boessenkool --- Hrm, the insn here is just a mulldi instruction, a bog-standard integer multiplication (by a constant, 6 here). But insn 58 (where the problems start, "Changing address in insn 58 r218:DI&0xfff

[Bug target/121007] [15/16 Regression] compiler hangs when building ffpmeg with -mcpu=power9 on ppc64le

2025-07-09 Thread segher at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=121007 --- Comment #8 from Segher Boessenkool --- (Also tested on powerpc-linux (where things just work), and on powerpc64-linux (the older ABI, correct-endian), where it fails just the same as on LE).

[Bug target/121007] [15/16 Regression] compiler hangs when building ffpmeg with -mcpu=power9 on ppc64le

2025-07-09 Thread segher at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=121007 --- Comment #7 from Segher Boessenkool --- Cool, thanks! 121007.c:36:3: warning: 'v4' may be used uninitialized [-Wmaybe-uninitialized] No clue why it says "may be" there, it obviously *is* used uninitialised, this is the first time it is used

[Bug target/121007] [15/16 Regression] compiler hangs when building ffpmeg with -mcpu=power9 on ppc64le

2025-07-09 Thread segher at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=121007 --- Comment #4 from Segher Boessenkool --- (In reply to Andrew Pinski from comment #1) > This is definitely sounding more and more like PR 93658. Yes, and maybe the error / fix / workaround will be similar: replace a VECTOR_MEM_ALTIVEC_P by VEC

[Bug target/121007] [15/16 Regression] compiler hangs when building ffpmeg with -mcpu=power9 on ppc64le

2025-07-09 Thread segher at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=121007 --- Comment #3 from Segher Boessenkool --- Does anyone want to take this? Fame and fortune await! We need a reduced test case btw :-)

[Bug rtl-optimization/101882] [16 Regression] combine vs. insn with earlyclobber and input and output set to a hard register

2025-07-06 Thread segher at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101882 --- Comment #26 from Segher Boessenkool --- (In reply to chenglulu from comment #25) > > And if the input is non-sensical, the compiler output will be as well, or > > the > > compiler can give up in some cases. > > > I also don't quite agree t

[Bug rtl-optimization/120983] recog violates earlyclobber with user-defined hard register before reload (causing ICE on gcc.target/loongarch/bitwise-shift-reassoc-clobber.c)

2025-07-06 Thread segher at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=120983 --- Comment #3 from Segher Boessenkool --- Please attach a testcase, and how to compile the code (-O2 etc.). Oh, and fill in the target field :-)

[Bug rtl-optimization/101882] [16 Regression] combine vs. insn with earlyclobber and input and output set to a hard register

2025-07-06 Thread segher at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101882 --- Comment #24 from Segher Boessenkool --- (In reply to Xi Ruoyao from comment #21) > (In reply to Segher Boessenkool from comment #20) > > (In reply to Peter Bergner from comment #17) > > > The reason operands 0, 1 and 4 all use the register r

[Bug rtl-optimization/101882] [16 Regression] combine vs. insn with earlyclobber and input and output set to a hard register

2025-07-06 Thread segher at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101882 --- Comment #23 from Segher Boessenkool --- It is a different target. Your issue has nothing at all to do with the problem we used to have. The root cause is very likely completely unrelated. Etc. etc. etc.

[Bug rtl-optimization/101882] [16 Regression] combine vs. insn with earlyclobber and input and output set to a hard register

2025-07-06 Thread segher at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101882 --- Comment #20 from Segher Boessenkool --- (In reply to Peter Bergner from comment #17) > The reason operands 0, 1 and 4 all use the register r23, is that each > operand is using the same pseudo, coming from variable "x", which is a user > defi

[Bug rtl-optimization/101882] [16 Regression] combine vs. insn with earlyclobber and input and output set to a hard register

2025-07-06 Thread segher at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101882 --- Comment #19 from Segher Boessenkool --- Hi Peter! (In reply to Peter Bergner from comment #18) > So the error message is coming from this hunk in my patch: > > + /* Both the earlyclobber operand and conflicting operand > +

[Bug rtl-optimization/101882] [16 Regression] combine vs. insn with earlyclobber and input and output set to a hard register

2025-07-05 Thread segher at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101882 --- Comment #16 from Segher Boessenkool --- It is allowed by recog(). Most likely your pattern is incorrect, but it is not completely impossible there is something wrong in genrecog.cc -- but that isn't combine either.

[Bug rtl-optimization/101882] [16 Regression] combine vs. insn with earlyclobber and input and output set to a hard register

2025-07-05 Thread segher at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101882 --- Comment #14 from Segher Boessenkool --- (match_operand:DI 1 "register_operand" "r0") That means either a general register ("r"), or the same thing as operand 0 (that's what "0" means)! So the backend explicitly allows it

[Bug target/113934] Switch avr to LRA

2025-06-27 Thread segher at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113934 --- Comment #14 from Segher Boessenkool --- Congratulations, and thank you!

[Bug tree-optimization/120598] Compiler is unable to vectorise scalar code

2025-06-20 Thread segher at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=120598 Segher Boessenkool changed: What|Removed |Added Status|NEW |RESOLVED Resolution|---

[Bug tree-optimization/53947] [meta-bug] vectorizer missed-optimizations

2025-06-20 Thread segher at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=53947 Bug 53947 depends on bug 120598, which changed state. Bug 120598 Summary: Compiler is unable to vectorise scalar code https://gcc.gnu.org/bugzilla/show_bug.cgi?id=120598 What|Removed |Added -

[Bug tree-optimization/120598] Compiler is unable to vectorise scalar code

2025-06-19 Thread segher at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=120598 --- Comment #8 from Segher Boessenkool --- (In reply to Jeevitha from comment #6) > The following dot_product function gets vectorized with the latest GCC trunk > and gcc 15.1.0: > > #include > #include > extern float dot_product(const int16_

[Bug tree-optimization/120598] Compiler is unable to vectorise scalar code

2025-06-19 Thread segher at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=120598 --- Comment #7 from Segher Boessenkool --- [I cannot read any of the attached code, but...] The proposed manually vectorised code converts 64-bit integers to IEEE SP floats, which is extremely lossy. I don't find it very surprising the compile

[Bug target/120681] PowerPC GCC turns off pc-relative addressing on power10 when -mcmodel=large is used

2025-06-17 Thread segher at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=120681 --- Comment #2 from Segher Boessenkool --- What is this testcase meant to test? The only thing it *does* test is if this trivial piece of code compiles at all (it doesn't test if the code generated is correct, or anything else about it!) It ju

[Bug testsuite/120519] g++.target/powerpc/mvc-symbols1.C fail starting with r16-965-g83eee43e998d0a

2025-06-10 Thread segher at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=120519 --- Comment #10 from Segher Boessenkool --- I was not cc:'ed. And I did not approve it. It should not have been committed. We have (minimal!) process for a reason. It would be chaos without it.

[Bug rtl-optimization/74585] powerpc64: Very poor code generation for homogeneous vector aggregates passed in registers

2025-06-10 Thread segher at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=74585 --- Comment #17 from Segher Boessenkool --- The stack is always in memory, AFAIK :-) Do we have any targets where it is not? Do we have any targets where BLKmode is not always in memory? That is something that should be documented btw :-) Any

[Bug testsuite/120519] g++.target/powerpc/mvc-symbols1.C fail starting with r16-965-g83eee43e998d0a

2025-06-10 Thread segher at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=120519 Segher Boessenkool changed: What|Removed |Added CC||segher at gcc dot gnu.org

[Bug rtl-optimization/74585] powerpc64: Very poor code generation for homogeneous vector aggregates passed in registers

2025-06-08 Thread segher at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=74585 --- Comment #15 from Segher Boessenkool --- The compiler now seems to assume in earlier passes that parameters and return values are passed in memory. This is very sub-optimal, all but the last passes cannot do much useful work this way.

[Bug target/115576] [14/15/16 regression] Worse code generated for simple struct conversion since r14-2386-gbdf2737cda53a8

2025-06-08 Thread segher at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115576 --- Comment #9 from Segher Boessenkool --- This belongs in simplify-rtx, not in combine.

[Bug target/108415] ICE in emit_library_call_value_1 at gcc/calls.cc:4181

2025-06-05 Thread segher at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108415 --- Comment #9 from Segher Boessenkool --- What is the current state here? We should simply not allow -mmodulo at all if we do not generate such insns (we do not have a -mcpu= that allows those). We do not want multiple ways to do things, certa

[Bug rtl-optimization/108273] Inconsistent dfa state between debug and non-debug

2025-06-05 Thread segher at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108273 --- Comment #10 from Segher Boessenkool --- The problem seems to be in generic scheduling code, not in the Power backend. Can someone confirm this, or point out where the problem is, or show the problem no longer exists? Whatever way we can re

[Bug middle-end/119600] HOST_WIDEST_FAST_INT should be used instead of long for BITMAP_WORD in bitmap.h

2025-05-21 Thread segher at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=119600 Segher Boessenkool changed: What|Removed |Added CC||segher at gcc dot gnu.org

[Bug target/108958] Powerpcle could generate mtvsrdd for zero extend DI to TI mode, when the TImode is in a vector register

2025-05-15 Thread segher at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108958 --- Comment #2 from Segher Boessenkool --- (A good patch is like: we currently generate X (because of Y Z A), but we could do B C D instead, and generate E).

[Bug target/108958] Powerpcle could generate mtvsrdd for zero extend DI to TI mode, when the TImode is in a vector register

2025-05-15 Thread segher at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108958 --- Comment #1 from Segher Boessenkool --- Sure. What do we need to improve on this? Please propose a patch :-)

[Bug target/97786] rs6000 isinf etc. are pretty horrible

2025-05-08 Thread segher at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97786 --- Comment #9 from Segher Boessenkool --- (Erm, tdc *is* 3.0, but setbc is 3.1, I can never ever get this right it seems! But setb is 3.0).

[Bug target/97786] rs6000 isinf etc. are pretty horrible

2025-05-08 Thread segher at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97786 --- Comment #8 from Segher Boessenkool --- (In reply to Surya Kumari Jangala from comment #7) > Hi Segher, > > Thanks for the pointers! > We can optimize the code further and remove the branch completely. > > For P10: > > xststdcdp 0,1,48

[Bug target/113939] Switch m68k to LRA

2025-05-08 Thread segher at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113939 --- Comment #11 from Segher Boessenkool --- (In reply to John Paul Adrian Glaubitz from comment #7) > (In reply to John Paul Adrian Glaubitz from comment #6) > > I suggest we switch m68k to LRA, so we can close this bug report. Plus file > > bu

[Bug target/117818] [12/13/14/15/16 regression] vec_add incorrectly generates vadduwm for vector char const inputs.

2025-05-08 Thread segher at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=117818 --- Comment #8 from Segher Boessenkool --- We still support powerpc64-* just fine. And powerpc-linux (the 32-bit target) is tested just fine as well, and the community does support it. No one cares _too_ much about it anymore, but why let it d

[Bug target/97786] rs6000 isinf etc. are pretty horrible

2025-05-07 Thread segher at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97786 --- Comment #6 from Segher Boessenkool --- Hi Surya! Hrm yes, xststdcdp _does_ return a sign bit as well. Do we currently say that in RTL as well? Unfortunately we cannot just follow an xststdcdp by a setb, setb tests bit 1, but the tdp sets b
