On Fri, Dec 22, 2023 at 11:14 AM Roger Sayle <[email protected]> wrote:
>
>
> This patch resolves the failure of pr43644-2.c in the testsuite, a code
> quality test I added back in July, that started failing as the code GCC
> generates for 128-bit values (and their parameter passing) has been in
> flux. After a few attempts at tweaking pattern constraints in the hope
> of convincing reload to produce a more aggressive (but potentially
> unsafe) register allocation, I think the best solution is to use a
> peephole2 to catch/clean-up this specific case.
>
> Specifically, the function:
>
> unsigned __int128 foo(unsigned __int128 x, unsigned long long y) {
> return x+y;
> }
>
> currently generates:
>
> foo: movq %rdx, %rcx
> movq %rdi, %rax
> movq %rsi, %rdx
> addq %rcx, %rax
> adcq $0, %rdx
> ret
>
> and with this patch/peephole2 now generates:
>
> foo: movq %rdx, %rax
> movq %rsi, %rdx
> addq %rdi, %rax
> adcq $0, %rdx
> ret
>
> which I believe is optimal.
How about simply moving the assignment to the MSB in the split pattern
after the LSB calculation:
[(set (match_dup 0) (match_dup 4))
- (set (match_dup 5) (match_dup 2))
(parallel [(set (reg:CCC FLAGS_REG)
(compare:CCC
(plus:DWIH (match_dup 0) (match_dup 1))
(match_dup 0)))
(set (match_dup 0)
(plus:DWIH (match_dup 0) (match_dup 1)))])
+ (set (match_dup 5) (match_dup 2))
(parallel [(set (match_dup 5)
(plus:DWIH
(plus:DWIH
There is an earlyclobber on the output operand, so we are sure that
assignments to (op 0) and (op 5) won't clobber anything.
cprop_hardreg pass will then do the cleanup for us, resulting in:
foo: movq %rdi, %rax
addq %rdx, %rax
movq %rsi, %rdx
adcq $0, %rdx
Uros.
>
> This patch has been tested on x86_64-pc-linux-gnu with make bootstrap
> and make -k check, both with and without --target_board=unix{-m32}
> with no new failures. Ok for mainline?
>
>
> 2023-12-21 Roger Sayle <[email protected]>
>
> gcc/ChangeLog
> PR target/43644
> * config/i386/i386.md (define_peephole2): Tweak register allocation
> of *add<dwi>3_doubleword_concat_zext.
>
> gcc/testsuite/ChangeLog
> PR target/43644
> * gcc.target/i386/pr43644-2.c: Expect 2 movq instructions.
>
>
> Thanks in advance, and for your patience with this testsuite noise.
> Roger
> --
>
diff --git a/gcc/config/i386/i386.md b/gcc/config/i386/i386.md
index 4c6368bf3b7..9f97d407975 100644
--- a/gcc/config/i386/i386.md
+++ b/gcc/config/i386/i386.md
@@ -6411,13 +6411,13 @@ (define_insn_and_split
"*add<dwi>3_doubleword_concat_zext"
"#"
"&& reload_completed"
[(set (match_dup 0) (match_dup 4))
- (set (match_dup 5) (match_dup 2))
(parallel [(set (reg:CCC FLAGS_REG)
(compare:CCC
(plus:DWIH (match_dup 0) (match_dup 1))
(match_dup 0)))
(set (match_dup 0)
(plus:DWIH (match_dup 0) (match_dup 1)))])
+ (set (match_dup 5) (match_dup 2))
(parallel [(set (match_dup 5)
(plus:DWIH
(plus:DWIH