Hi all,

To handle DImode BCAX operations we want to do them on the SIMD side only if
the incoming arguments don't require a cross-bank move.
This means we need to split the combination back into separate GP BIC+EOR
instructions if the operands are expected to be in GP regs through reload.
The split happens pre-reload if we already know that the destination will be
a GP reg. Otherwise, if reload decides to use the "=r,r" alternative, we ensure
operand 0 is early-clobber.
This scheme is similar to how we handle the BSL operations elsewhere in
aarch64-simd.md.
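For readers less familiar with the operation, here is a minimal sketch in C of
what the split amounts to. The BCAX macro definition is an assumption inferred
from the generated code quoted in this message (it is not copied from the
testsuite), and the names bcax_combined and bcax_split are hypothetical.

```c
#include <stdint.h>

/* Assumed definition: a ^ (b & ~c).  Inferred from the BIC+EOR sequence
   shown in this message, not taken from the patch itself.  */
#define BCAX(x, y, z) ((x) ^ ((y) & ~(z)))

/* Single-expression form, the shape of the test functions.  */
uint64_t
bcax_combined (uint64_t a, uint64_t b, uint64_t c)
{
  return BCAX (a, b, c);
}

/* Two-step form mirroring the GP-side split: one BIC, then one EOR.  */
uint64_t
bcax_split (uint64_t a, uint64_t b, uint64_t c)
{
  uint64_t t = b & ~c;  /* corresponds to: bic  t, b, c  */
  return t ^ a;         /* corresponds to: eor  result, t, a  */
}
```

Both functions compute the same value for every input, which is why the
combined pattern can be split back into two GP instructions without changing
the result.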
Thus, for the functions:

uint64_t bcax_d_gp (uint64_t a, uint64_t b, uint64_t c) { return BCAX (a, b, c); }
uint64x1_t bcax_d (uint64x1_t a, uint64x1_t b, uint64x1_t c) { return BCAX (a, b, c); }

we now generate the desired:

bcax_d_gp:
        bic     x1, x1, x2
        eor     x0, x1, x0
        ret

bcax_d:
        bcax    v0.16b, v0.16b, v1.16b, v2.16b
        ret

When the inputs are in SIMD regs we use BCAX and when they are in GP regs we
don't force them to SIMD with extra moves.

Bootstrapped and tested on aarch64-none-linux-gnu.
Ok for trunk?
Thanks,
Kyrill

Signed-off-by: Kyrylo Tkachov <ktkac...@nvidia.com>

gcc/

        * config/aarch64/aarch64-simd.md (*bcaxqdi4): New
        define_insn_and_split.

gcc/testsuite/

        * gcc.target/aarch64/simd/bcax_d.c: Add tests for DImode arguments.
0003-aarch64-Handle-DImode-BCAX-operations.patch