[Bug target/117251] SHA3 code for PowerPC has a major slow down

2024-10-23 Thread segher at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=117251 --- Comment #12 from Segher Boessenkool --- (In reply to Richard Biener from comment #7) > Is it possible to define a fused and/xor+xor alternative that's split after > RA, slightly pessimized to prefer the altivec alternative, to allow the XXL

[Bug target/117251] SHA3 code for PowerPC has a major slow down

2024-10-22 Thread meissner at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=117251 --- Comment #11 from Michael Meissner ---
For singlebuff.c, there is a clear improvement when using the XXEVAL instruction:
      XXEVAL  TRUNK  GCC14  GCC13  GCC12  GCC11
      ------  -----  -----  -----  -----  -----
-O3:  4.46    5.40

[Bug target/117251] SHA3 code for PowerPC has a major slow down

2024-10-22 Thread meissner at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=117251 --- Comment #10 from Michael Meissner --- There is an instruction that was added in power10 (XXEVAL) that does provide fusion between VSX vectors that includes ANDC->XOR and XOR->XOR fusion. I have coded up patches to support this and I will be
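To make the two fusion shapes named above concrete, here is a minimal scalar sketch in C. It is not code from the bug report or from the patches: the function names are hypothetical, and the remark that these shapes match the chi and theta steps of SHA-3 is background knowledge about Keccak, not something the comment states. Each function combines three inputs with two dependent logical operations, which is the kind of pair a single three-input instruction could evaluate at once.

```c
#include <assert.h>
#include <stdint.h>

/* ANDC feeding XOR: a ^ (b & ~c).
 * Hypothetical helper name. The Keccak chi step has this shape. */
uint64_t andc_then_xor(uint64_t a, uint64_t b, uint64_t c)
{
    uint64_t t = b & ~c;   /* ANDC: AND with complement */
    return a ^ t;          /* XOR consumes the ANDC result */
}

/* XOR feeding XOR: a ^ b ^ c.
 * Hypothetical helper name. The Keccak theta column parity has this shape. */
uint64_t xor_then_xor(uint64_t a, uint64_t b, uint64_t c)
{
    uint64_t t = a ^ b;    /* first XOR */
    return t ^ c;          /* second XOR consumes the first */
}

/* Small self-check of the two shapes on known bit patterns. */
void fusion_shapes_self_check(void)
{
    assert(andc_then_xor(0xF0, 0xFF, 0x0F) == 0x00);
    assert(xor_then_xor(1, 2, 4) == 7);
}
```

Whether a compiler turns either function into one instruction depends on the target, the register class the values end up in, and the GCC version, so this only illustrates the dataflow, not the generated code.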

[Bug target/117251] SHA3 code for PowerPC has a major slow down

2024-10-22 Thread meissner at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=117251 --- Comment #9 from Michael Meissner ---
I tried several of the options to change the code generation:
-mno-power10-fusion, which disables the fusion pairing.
Combinations of -fno-schedule-insns and -fno-schedule-insns2.
-fno-sched-press

[Bug target/117251] SHA3 code for PowerPC has a major slow down

2024-10-22 Thread meissner at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=117251 --- Comment #8 from Michael Meissner --- I added an option to not do the combiner patterns until after reload, and it does not seem to fire at all.

[Bug target/117251] SHA3 code for PowerPC has a major slow down

2024-10-22 Thread rguenth at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=117251
Richard Biener changed:
What     |Removed |Added
Keywords |        |missed-optimization
--- Comment #7 fro

[Bug target/117251] SHA3 code for PowerPC has a major slow down

2024-10-21 Thread meissner at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=117251 --- Comment #5 from Michael Meissner ---
For the singlebuff.c benchmark, the numbers are:
Trunk (sources checked out October 5th): 5.40 seconds
GCC 14 (sources checked out October 21st): 5.40 seconds
GCC 13 (sources checked out October 21

[Bug target/117251] SHA3 code for PowerPC has a major slow down

2024-10-21 Thread meissner at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=117251 --- Comment #6 from Michael Meissner --- Note: in the first comment I misread the instruction. The instruction being used is vector unsigned long long rotate left, not vector unsigned long long shift left. I.e.:
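The snippet above is cut off before its own example, and the following is not that example. As a separate illustration of why the rotate/shift distinction matters, here is a minimal scalar sketch in C on a single 64-bit lane. The function names are hypothetical, and the sketch says nothing about which vector instruction GCC emits.

```c
#include <assert.h>
#include <stdint.h>

/* Rotate left: bits shifted out of the top re-enter at the bottom.
 * The count is reduced modulo 64 so that n == 0 is well defined. */
uint64_t rotl64(uint64_t x, unsigned n)
{
    n &= 63;
    return (x << n) | (x >> ((64 - n) & 63));
}

/* Shift left: bits shifted out of the top are discarded. */
uint64_t shl64(uint64_t x, unsigned n)
{
    return x << (n & 63);
}

/* Small self-check showing where the two operations differ. */
void rotate_vs_shift_self_check(void)
{
    uint64_t x = 0x8000000000000001ULL;
    assert(rotl64(x, 1) == 3);  /* top bit wraps around to bit 0 */
    assert(shl64(x, 1) == 2);   /* top bit is lost */
}
```

The two results agree only when no set bit crosses the top of the lane, which is why misreading one instruction for the other changes the interpretation of the generated code.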

[Bug target/117251] SHA3 code for PowerPC has a major slow down

2024-10-21 Thread meissner at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=117251 --- Comment #4 from Michael Meissner ---
I tracked down the commit that first made the slowdown visible:
commit 3a61ca1b9256535e1bfb19b2d46cde21f3908a5d (HEAD)
Author: Jan Hubicka
Date: Thu Jul 6 18:56:22 2023 +0200
Improve profile upda

[Bug target/117251] SHA3 code for PowerPC has a major slow down

2024-10-21 Thread meissner at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=117251
Michael Meissner changed:
What     |Removed                       |Added
Assignee |unassigned at gcc dot gnu.org |meissner at gcc dot gnu.org

[Bug target/117251] SHA3 code for PowerPC has a major slow down

2024-10-21 Thread meissner at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=117251 --- Comment #2 from Michael Meissner ---
Created attachment 59406 --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=59406&action=edit
Singlebuff.c test
singlebuff.c is a simpler test case than multibuff.c. However, the numbers quoted an

[Bug target/117251] SHA3 code for PowerPC has a major slow down

2024-10-21 Thread meissner at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=117251
Michael Meissner changed:
What     |Removed |Added
Priority |P3      |P2
Version  |15.0