https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115258
--- Comment #8 from GCC Commits <cvs-commit at gcc dot gnu.org> --- The releases/gcc-14 branch has been updated by Richard Sandiford <rsand...@gcc.gnu.org>: https://gcc.gnu.org/g:a38e9e06d068fb9523b44ba2b4e2f7a7d6784200 commit r14-11637-ga38e9e06d068fb9523b44ba2b4e2f7a7d6784200 Author: Richard Sandiford <richard.sandif...@arm.com> Date: Wed Apr 16 13:20:29 2025 +0100 aarch64: Avoid unnecessary use of 2-input TBLs [PR115258] When using TBL for (say) a V4SI permutation, the aarch64 port first asks target-independent code to lower to a V16QI permutation. Then, during code generation, an input like: (reg:V4SI R) gets converted to: (subreg:V16QI (reg:V4SI R) 0) aarch64_vectorize_vec_perm_const had: d.op0 = op0 ? force_reg (op_mode, op0) : NULL_RTX; if (op0 == op1) d.op1 = d.op0; else d.op1 = op1 ? force_reg (op_mode, op1) : NULL_RTX; But subregs (unlike regs) are not shared, so the op0 == op1 check always failed for this case. We'd then force each subreg into a fresh register, meaning that during the later: aarch64_expand_vec_perm_1 (d->target, d->op0, d->op1, sel); there is no way for aarch64_expand_vec_perm_1 to realise that d->op0 and d->op1 are the same value. It would therefore generate a two-input TBL in the testcase, even though a single-input TBL is enough. I'm not sure forcing subregs to a fresh regiter is a good idea -- it caused problems for copysign & co. -- but that's not something to fiddle with during stage 4. Using op0 == op1 for rtx equality is independently wrong, so we might as well just fix that for now. The patch gets rid of extra MOVs that are a regression from GCC 14. The testcase is based on one from Kugan, itself based on TSVC. gcc/ PR target/115258 * config/aarch64/aarch64.cc (aarch64_vectorize_vec_perm_const): Use d.one_vector_p to decide whether op1 should be a copy of op0. gcc/testsuite/ PR target/115258 * gcc.target/aarch64/pr115258_2.c: New test. Co-authored-by: Kugan Vivekanandarajah <kvivekana...@nvidia.com> (cherry picked from commit 31dcf941ac78c4b1b01dc4b2ce9809f0209153b8)