On Fri, Dec 22, 2023 at 11:14 AM Roger Sayle <ro...@nextmovesoftware.com> wrote: > > > This patch resolves the failure of pr43644-2.c in the testsuite, a code > quality test I added back in July, that started failing as the code GCC > generates for 128-bit values (and their parameter passing) has been in > flux. After a few attempts at tweaking pattern constraints in the hope > of convincing reload to produce a more aggressive (but potentially > unsafe) register allocation, I think the best solution is to use a > peephole2 to catch/clean-up this specific case. > > Specifically, the function: > > unsigned __int128 foo(unsigned __int128 x, unsigned long long y) { > return x+y; > } > > currently generates: > > foo: movq %rdx, %rcx > movq %rdi, %rax > movq %rsi, %rdx > addq %rcx, %rax > adcq $0, %rdx > ret > > and with this patch/peephole2 now generates: > > foo: movq %rdx, %rax > movq %rsi, %rdx > addq %rdi, %rax > adcq $0, %rdx > ret > > which I believe is optimal.
How about simply moving the assignment to the MSB in the split pattern after the LSB calculation: [(set (match_dup 0) (match_dup 4)) - (set (match_dup 5) (match_dup 2)) (parallel [(set (reg:CCC FLAGS_REG) (compare:CCC (plus:DWIH (match_dup 0) (match_dup 1)) (match_dup 0))) (set (match_dup 0) (plus:DWIH (match_dup 0) (match_dup 1)))]) + (set (match_dup 5) (match_dup 2)) (parallel [(set (match_dup 5) (plus:DWIH (plus:DWIH There is an earlyclobber on the output operand, so we are sure that assignments to (op 0) and (op 5) won't clobber anything. cprop_hardreg pass will then do the cleanup for us, resulting in: foo: movq %rdi, %rax addq %rdx, %rax movq %rsi, %rdx adcq $0, %rdx Uros. > > This patch has been tested on x86_64-pc-linux-gnu with make bootstrap > and make -k check, both with and without --target_board=unix{-m32} > with no new failures. Ok for mainline? > > > 2023-12-21 Roger Sayle <ro...@nextmovesoftware.com> > > gcc/ChangeLog > PR target/43644 > * config/i386/i386.md (define_peephole2): Tweak register allocation > of *add<dwi>3_doubleword_concat_zext. > > gcc/testsuite/ChangeLog > PR target/43644 > * gcc.target/i386/pr43644-2.c: Expect 2 movq instructions. > > > Thanks in advance, and for your patience with this testsuite noise. > Roger > -- >
diff --git a/gcc/config/i386/i386.md b/gcc/config/i386/i386.md index 4c6368bf3b7..9f97d407975 100644 --- a/gcc/config/i386/i386.md +++ b/gcc/config/i386/i386.md @@ -6411,13 +6411,13 @@ (define_insn_and_split "*add<dwi>3_doubleword_concat_zext" "#" "&& reload_completed" [(set (match_dup 0) (match_dup 4)) - (set (match_dup 5) (match_dup 2)) (parallel [(set (reg:CCC FLAGS_REG) (compare:CCC (plus:DWIH (match_dup 0) (match_dup 1)) (match_dup 0))) (set (match_dup 0) (plus:DWIH (match_dup 0) (match_dup 1)))]) + (set (match_dup 5) (match_dup 2)) (parallel [(set (match_dup 5) (plus:DWIH (plus:DWIH