on 2023/2/13 21:57, Richard Sandiford wrote: > "Kewen.Lin" <li...@linux.ibm.com> writes: >> Hi Richard, >> >> on 2023/1/27 19:08, Richard Sandiford via Gcc-patches wrote: >>> PR96373 points out that a predicated SVE loop currently converts >>> trapping unconditional ops into unpredicated vector ops. Doing >>> the operation on inactive lanes can then raise an exception. >>> >>> As discussed in the PR trail, we aren't 100% consistent about >>> whether we preserve traps or not. But the direction of travel >>> is clearly to improve that rather than live with it. This patch >>> tries to do that for the SVE case. >>> >>> Doing this regresses gcc.target/aarch64/sve/fabd_1.c. I've added >>> -fno-trapping-math for now and filed PR108571 to track it. >>> A similar problem applies to fsubr_1.d. >>> >>> I think this is likely to regress Power 10, since conditional >>> operations are only available for masked loops. I think we'll >>> need to add -fno-trapping-math to any affected testcases, >>> but I don't have a Power 10 system to test on. Kewen, would you >>> mind giving this a spin and seeing how bad the fallout is? >>> >> >> Sorry for the late reply, I'm just back from vacation. >> >> Thank you for fixing this and caring about Power10! >> >> I tested your proposed patch on one Power10 machine (ppc64le), >> it's bootstrapped but some test failures got exposed as below. >> >> < FAIL: gcc.target/powerpc/p9-vec-length-epil-1.c scan-assembler-times >> \\\\mlxvl\\\\M 14 >> < FAIL: gcc.target/powerpc/p9-vec-length-epil-1.c scan-assembler-times >> \\\\mstxvl\\\\M 7 >> < FAIL: gcc.target/powerpc/p9-vec-length-epil-2.c scan-assembler-times >> \\\\mlxvl\\\\M 20 >> < FAIL: gcc.target/powerpc/p9-vec-length-epil-2.c scan-assembler-times >> \\\\mstxvl\\\\M 10 >> < FAIL: gcc.target/powerpc/p9-vec-length-epil-3.c scan-assembler-times >> \\\\mlxvl\\\\M 14 >> < FAIL: gcc.target/powerpc/p9-vec-length-epil-3.c scan-assembler-times >> \\\\mstxvl\\\\M 7 >> < FAIL: gcc.target/powerpc/p9-vec-length-epil-4.c scan-assembler-times >> \\\\mlxvl\\\\M 70 >> < FAIL: gcc.target/powerpc/p9-vec-length-epil-4.c scan-assembler-times >> \\\\mlxvx?\\\\M 120 >> < FAIL: gcc.target/powerpc/p9-vec-length-epil-4.c scan-assembler-times >> \\\\mstxvl\\\\M 70 >> < FAIL: gcc.target/powerpc/p9-vec-length-epil-4.c scan-assembler-times >> \\\\mstxvx?\\\\M 70 >> < FAIL: gcc.target/powerpc/p9-vec-length-epil-5.c scan-assembler-times >> \\\\mlxvl\\\\M 21 >> < FAIL: gcc.target/powerpc/p9-vec-length-epil-5.c scan-assembler-times >> \\\\mstxvl\\\\M 21 >> < FAIL: gcc.target/powerpc/p9-vec-length-epil-5.c scan-assembler-times >> \\\\mstxvx?\\\\M 21 >> < FAIL: gcc.target/powerpc/p9-vec-length-epil-6.c scan-assembler-times >> \\\\mlxvl\\\\M 10 >> < FAIL: gcc.target/powerpc/p9-vec-length-epil-6.c scan-assembler-times >> \\\\mlxvx?\\\\M 42 >> < FAIL: gcc.target/powerpc/p9-vec-length-epil-6.c scan-assembler-times >> \\\\mstxvl\\\\M 10 >> < FAIL: gcc.target/powerpc/p9-vec-length-epil-8.c scan-assembler-times >> \\\\mlxvl\\\\M 16 >> < FAIL: gcc.target/powerpc/p9-vec-length-epil-8.c scan-assembler-times >> \\\\mstxvl\\\\M 7 >> < FAIL: gcc.target/powerpc/p9-vec-length-full-1.c scan-assembler-not >> \\\\mlxvx\\\\M >> < FAIL: gcc.target/powerpc/p9-vec-length-full-1.c scan-assembler-not >> \\\\mstxvx\\\\M >> < FAIL: gcc.target/powerpc/p9-vec-length-full-1.c scan-assembler-times >> \\\\mlxvl\\\\M 20 >> < FAIL: gcc.target/powerpc/p9-vec-length-full-1.c scan-assembler-times >> \\\\mstxvl\\\\M 10 >> < FAIL: gcc.target/powerpc/p9-vec-length-full-2.c scan-assembler-not >> \\\\mlxvx\\\\M >> < FAIL: gcc.target/powerpc/p9-vec-length-full-2.c scan-assembler-not >> \\\\mstxvx\\\\M >> < FAIL: gcc.target/powerpc/p9-vec-length-full-2.c scan-assembler-times >> \\\\mlxvl\\\\M 20 >> < FAIL: gcc.target/powerpc/p9-vec-length-full-2.c scan-assembler-times >> \\\\mstxvl\\\\M 10 >> < FAIL: gcc.target/powerpc/p9-vec-length-full-3.c scan-assembler-times >> \\\\mlxvl\\\\M 14 >> < FAIL: gcc.target/powerpc/p9-vec-length-full-3.c scan-assembler-times >> \\\\mstxvl\\\\M 7 >> < FAIL: gcc.target/powerpc/p9-vec-length-full-4.c scan-assembler-not >> \\\\mlxvx\\\\M >> < FAIL: gcc.target/powerpc/p9-vec-length-full-4.c scan-assembler-not >> \\\\mstxv\\\\M >> < FAIL: gcc.target/powerpc/p9-vec-length-full-4.c scan-assembler-not >> \\\\mstxvx\\\\M >> < FAIL: gcc.target/powerpc/p9-vec-length-full-4.c scan-assembler-times >> \\\\mlxvl\\\\M 70 >> < FAIL: gcc.target/powerpc/p9-vec-length-full-4.c scan-assembler-times >> \\\\mstxvl\\\\M 70 >> < FAIL: gcc.target/powerpc/p9-vec-length-full-5.c scan-assembler-not >> \\\\mlxvx\\\\M >> < FAIL: gcc.target/powerpc/p9-vec-length-full-5.c scan-assembler-not >> \\\\mstxv\\\\M >> < FAIL: gcc.target/powerpc/p9-vec-length-full-5.c scan-assembler-not >> \\\\mstxvx\\\\M >> < FAIL: gcc.target/powerpc/p9-vec-length-full-5.c scan-assembler-times >> \\\\mlxvl\\\\M 21 >> < FAIL: gcc.target/powerpc/p9-vec-length-full-5.c scan-assembler-times >> \\\\mstxvl\\\\M 21 >> < FAIL: gcc.target/powerpc/p9-vec-length-full-6.c scan-assembler-times >> \\\\mlxvl\\\\M 10 >> < FAIL: gcc.target/powerpc/p9-vec-length-full-6.c scan-assembler-times >> \\\\mstxvl\\\\M 10 >> < FAIL: gcc.target/powerpc/p9-vec-length-full-6.c scan-assembler-times >> \\\\mstxvx?\\\\M 6 >> < FAIL: gcc.target/powerpc/p9-vec-length-full-8.c scan-assembler-times >> \\\\mlxvl\\\\M 30 >> < FAIL: gcc.target/powerpc/p9-vec-length-full-8.c scan-assembler-times >> \\\\mstxvl\\\\M 10 >> >> By checking several of them, it's due to that we don't vectorize >> some loop having float type involved with partial vector any more. >> >> As you suggested above, I fixed them with an extra option >> "-fno-trapping-math" and verified all of them can pass again. >> I also noticed that the original test case in PR96373 fails >> on Power10 too, so I added one constructed case pr96373.c >> into sub bucket gcc.target/powerpc for testing coverage >> on Power. >> >> One re-spin with the attached adjustment shows there is no >> regression failure any more, and the new test case works well >> on both ppc64 (P8) and ppc64le (P10) Linux. > > Thanks for doing this. The patch is OK, if you need approval. > I'll push mine once it's in.
Thanks for the review! Pushed in r13-5978-g4f5a1198065dc0. btw, do we want this to be backported? If yes, I'm going to backport it to gcc-12 and gcc-11 branches soon (for gcc-10 we don't have partial vector support on Power btw). BR, Kewen