https://gcc.gnu.org/bugzilla/show_bug.cgi?id=122103
--- Comment #14 from GCC Commits <cvs-commit at gcc dot gnu.org> ---
The master branch has been updated by Tamar Christina <[email protected]>:

https://gcc.gnu.org/g:7fcd3ed36c68d39b1d51137d5bdf0bd91b99be60

commit r16-6510-g7fcd3ed36c68d39b1d51137d5bdf0bd91b99be60
Author: Tamar Christina <[email protected]>
Date:   Mon Jan 5 20:55:34 2026 +0000

    vect: teach if-convert to predicate __builtin calls [PR122103]

    The following testcase

    void f (float *__restrict c, int *__restrict d, int n)
    {
        for (int i = 0; i < n; i++)
        {
          if (d[i] > 1000)
            c[i] = __builtin_sqrtf (c[i]);
        }
    }

    compiled with -O3 -march=armv9-a -fno-math-errno -ftrapping-math needs to be
    predicated on the conditional.  It's invalid to execute the branch and use a
    select to extract it later unless using -fno-trapping-math.

    This change in if-conversion changes what we used to generate:

      _26 = _4 > 1000;
      _34 = _33 + _2;
      _5 = (float *) _34;
      _6 = .MASK_LOAD (_5, 32B, _26, 0.0);
      _7 = __builtin_sqrtf (_6);
      .MASK_STORE (_5, 32B, _26, _7);

    into

      _26 = _4 > 1000;
      _34 = _33 + _2;
      _5 = (float *) _34;
      _6 = .MASK_LOAD (_5, 32B, _26, 0.0);
      _7 = .COND_SQRT (_26, _6, _6);
      .MASK_STORE (_5, 32B, _26, _7);

    which correctly results in

    .L3:
            ld1w    z0.s, p7/z, [x1, x3, lsl 2]
            cmpgt   p7.s, p7/z, z0.s, z31.s
            ld1w    z30.s, p7/z, [x0, x3, lsl 2]
            fsqrt   z30.s, p7/m, z30.s
            st1w    z30.s, p7, [x0, x3, lsl 2]
            incw    x3
            whilelo p7.s, w3, w2
            b.any   .L3

    instead of

    .L3:
            ld1w    z0.s, p7/z, [x1, x3, lsl 2]
            cmpgt   p7.s, p7/z, z0.s, z31.s
            ld1w    z30.s, p7/z, [x0, x3, lsl 2]
            fsqrt   z30.s, p6/m, z30.s
            st1w    z30.s, p7, [x0, x3, lsl 2]
            incw    x3
            whilelo p7.s, w3, w2
            b.any   .L3

    gcc/ChangeLog:

            PR tree-optimization/122103
            * tree-if-conv.cc (ifcvt_can_predicate): Support
            gimple_call_builtin_p.
            (if_convertible_stmt_p, predicate_rhs_code,
            predicate_statements): Likewise.

    gcc/testsuite/ChangeLog:

            PR tree-optimization/122103
            * gcc.target/aarch64/sve/pr122103_1.c: New test.
            * gcc.target/aarch64/sve/pr122103_2.c: New test.
            * gcc.target/aarch64/sve/pr122103_3.c: New test.
