https://gcc.gnu.org/bugzilla/show_bug.cgi?id=117251
--- Comment #10 from Michael Meissner <meissner at gcc dot gnu.org> ---
There is an instruction that was added in power10 (XXEVAL) that does provide
fusion between VSX vectors that includes ANDC->XOR and XOR->XOR fusion.

I have coded up patches to support this and I will be submitting these patches
shortly.

        XXEVAL  Trunk  GCC14  GCC13  GCC12  GCC11
        ------  -----  -----  -----  -----  -----
-O3:      5.53   6.15   6.28   5.57   5.61   9.56

The latency of XXEVAL is slightly more than the fused VANDC/VXOR or VXOR/VXOR,
so I have written the patch to prefer doing the Altivec instructions if they
don't need a temporary register.

                                         XXEVAL  Trunk  GCC14  GCC13  GCC12
                                         ------  -----  -----  -----  -----
Fuse VANDC -> VXOR                          209    600    600    600    600
Fuse VXOR -> VXOR                           ---    240    240    120    120
XXEVAL to fuse ANDC -> XOR                  391    ---    ---    ---    ---
XXEVAL to fuse XOR -> XOR                   240    ---    ---    ---    ---
Spill vector to stack                        78    364    364    172    184
Load spilled vector from stack              431    962    962    713    723
Vector moves                                 10    100    100     70     72
Vector rotate right                         696    696    696    696    696
XXLANDC or VANDC                            209    600    600    600    600
XXLXOR or VXOR                              953  1,824  1,824  1,824  1,824
XXEVAL                                      631    ---    ---    ---    ---
XXSPLTIB and VEXTSB2D to load constants      24     24     24     24     24
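
For readers unfamiliar with the two patterns named in this comment, here is a minimal portable C sketch of what ANDC->XOR and XOR->XOR compute, together with a software model of a generic three-input bitwise operation selected by an 8-bit truth table. It is not taken from the patches. The helper names (`andc_xor`, `xor_xor`, `eval3`, `make_table`, `check_model`) are hypothetical, and the table bit ordering used here (index `4*a + 2*b + c`, entry 0 in the most-significant bit) is an assumption made for illustration; consult the Power ISA for the actual XXEVAL encoding.

```c
#include <assert.h>
#include <stdint.h>

/* ANDC -> XOR: (a AND NOT b) XOR c. */
static inline uint64_t andc_xor(uint64_t a, uint64_t b, uint64_t c)
{
    return (a & ~b) ^ c;
}

/* XOR -> XOR: (a XOR b) XOR c. */
static inline uint64_t xor_xor(uint64_t a, uint64_t b, uint64_t c)
{
    return (a ^ b) ^ c;
}

/* Software model of a three-input bitwise function chosen by an 8-entry
   truth table.  ASSUMPTION (illustrative only): the entry index is
   4*a + 2*b + c and entry 0 is the most-significant bit of 'table'. */
static uint64_t eval3(uint64_t a, uint64_t b, uint64_t c, uint8_t table)
{
    uint64_t result = 0;
    for (int bit = 0; bit < 64; bit++) {
        unsigned idx = (unsigned)((((a >> bit) & 1) << 2)
                                  | (((b >> bit) & 1) << 1)
                                  | ((c >> bit) & 1));
        uint64_t out = (uint64_t)((table >> (7 - idx)) & 1);
        result |= out << bit;
    }
    return result;
}

/* Derive the truth table for 'fn' by evaluating it on all eight
   single-bit input combinations (same assumed ordering as eval3). */
static uint8_t make_table(uint64_t (*fn)(uint64_t, uint64_t, uint64_t))
{
    uint8_t table = 0;
    for (unsigned idx = 0; idx < 8; idx++) {
        uint64_t out = fn((idx >> 2) & 1, (idx >> 1) & 1, idx & 1) & 1;
        table |= (uint8_t)(out << (7 - idx));
    }
    return table;
}

/* Check that the table-driven model agrees with the two-step forms. */
static void check_model(void)
{
    uint64_t a = 0x0123456789abcdefULL;
    uint64_t b = 0xfedcba9876543210ULL;
    uint64_t c = 0x0f0f0f0f0f0f0f0fULL;
    assert(eval3(a, b, c, make_table(andc_xor)) == andc_xor(a, b, c));
    assert(eval3(a, b, c, make_table(xor_xor)) == xor_xor(a, b, c));
}
```

The point of the sketch is only that both two-instruction sequences are functions of three inputs, so a single table-driven three-input operation can express either one. Whether that is a win depends on the latency and register-pressure trade-off described in the comment.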