This series attacks case 3 of PR 94174, although I'm no longer using
ccmp for most of the TImode comparisons.
Thanks to Wilco Dijkstra for pulling off my blinders and reminding me
that we can use subs+sbcs for (almost) all compares.
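
As an illustration of the subs+sbcs idea, here is a minimal C sketch that
models the flags left by a low-half compare followed by a high-half
subtract-with-borrow. The struct and function names are hypothetical and
are not part of the patches; the sketch also assumes a compiler that
provides __int128.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical model of the three condition flags used below. */
struct flags { bool n, c, v; };

/* Model "cmp alo, blo ; sbcs xzr, ahi, bhi" on a 128-bit value that is
   held as two 64-bit halves. */
static struct flags cmp128(uint64_t alo, uint64_t ahi,
                           uint64_t blo, uint64_t bhi)
{
    bool carry = alo >= blo;                    /* cmp: C set when no borrow */
    uint64_t res = ahi - bhi - (carry ? 0 : 1); /* sbcs: high half minus borrow */
    struct flags f;
    f.n = (res >> 63) != 0;                     /* sign of the high result */
    f.c = ahi > bhi || (ahi == bhi && carry);   /* no borrow out of the high half */
    f.v = (((ahi ^ bhi) & (ahi ^ res)) >> 63) != 0; /* signed overflow */
    return f;
}

static bool lt_signed(__int128 a, __int128 b)
{
    unsigned __int128 ua = (unsigned __int128)a, ub = (unsigned __int128)b;
    struct flags f = cmp128((uint64_t)ua, (uint64_t)(ua >> 64),
                            (uint64_t)ub, (uint64_t)(ub >> 64));
    return f.n != f.v;                          /* the "lt" condition */
}

static bool lt_unsigned(unsigned __int128 a, unsigned __int128 b)
{
    struct flags f = cmp128((uint64_t)a, (uint64_t)(a >> 64),
                            (uint64_t)b, (uint64_t)(b >> 64));
    return !f.c;                                /* the "lo" (carry clear) condition */
}

/* Small usage example: -1 is below 0 when signed, above it when unsigned. */
static void check_examples(void)
{
    assert(lt_signed(-1, 0));
    assert(!lt_unsigned(-1, 0));
}
```

Note that Z after the sbcs reflects only the high half, so conditions that
depend on full 128-bit equality cannot be read off this sequence directly,
which is presumably what the "(almost)" refers to.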
The first 5 patches clean up or add patterns to support the expansion
without generating extraneous constant loads.
The aarch64_expand_addsubti patch tidies up the existing TImode
arithmetic expansions.
EXAMPLE __subvti3 (context diff is easier to read):
*** 12,27 ****
    10:  b7f800a3  tbnz  x3, #63, 24 <__subvti3+0x24>
!   14:  eb02003f  cmp   x1, x2
!   18:  5400010c  b.gt  38 <__subvti3+0x38>
!   1c:  54000140  b.eq  44 <__subvti3+0x44>  // b.none
    20:  d65f03c0  ret
!   24:  eb01005f  cmp   x2, x1
!   28:  5400008c  b.gt  38 <__subvti3+0x38>
!   2c:  54ffffa1  b.ne  20 <__subvti3+0x20>  // b.any
!   30:  eb00009f  cmp   x4, x0
!   34:  54ffff69  b.ls  20 <__subvti3+0x20>  // b.plast
!   38:  a9bf7bfd  stp   x29, x30, [sp, #-16]!
!   3c:  910003fd  mov   x29, sp
!   40:  94000000  bl    0 <abort>
!   44:  eb04001f  cmp   x0, x4
!   48:  54ffff88  b.hi  38 <__subvti3+0x38>  // b.pmore
!   4c:  d65f03c0  ret
--- 12,22 ----
    10:  b7f800a3  tbnz  x3, #63, 24 <__subvti3+0x24>
!   14:  eb00009f  cmp   x4, x0
!   18:  fa01005f  sbcs  xzr, x2, x1
!   1c:  540000ab  b.lt  30 <__subvti3+0x30>  // b.tstop
    20:  d65f03c0  ret
!   24:  eb04001f  cmp   x0, x4
!   28:  fa02003f  sbcs  xzr, x1, x2
!   2c:  54ffffaa  b.ge  20 <__subvti3+0x20>  // b.tcont
!   30:  a9bf7bfd  stp   x29, x30, [sp, #-16]!
!   34:  910003fd  mov   x29, sp
!   38:  94000000  bl    0 <abort>
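
For orientation, the two arms selected by the tbnz on the sign bit each
end in one 128-bit signed comparison. A rough C sketch of an overflow
test of that shape follows. It assumes the routine compares the wrapped
difference against the minuend; the helper name is hypothetical and the
code is not copied from libgcc.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical helper: does a - b overflow a signed 128-bit integer?
   Assumes __int128 support and modulo conversion from unsigned to
   signed, as GCC provides. */
static bool sub128_overflows(__int128 a, __int128 b)
{
    /* Wrapped difference, computed in unsigned arithmetic to avoid
       undefined behaviour on overflow. */
    __int128 w = (__int128)((unsigned __int128)a - (unsigned __int128)b);

    /* Subtracting a non-negative b must not increase the value;
       subtracting a negative b must not decrease it. */
    return b >= 0 ? w > a : w < a;
}

/* Small usage example. */
static void check_sub_examples(void)
{
    assert(!sub128_overflows(5, 3));
    assert(sub128_overflows((__int128)((unsigned __int128)1 << 127), 1));
}
```

Each of the two comparisons is a signed "less than" or "greater or equal"
once the operands are ordered suitably, which is the form a cmp+sbcs pair
can answer with a single conditional branch.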
EXAMPLE from the PR:
void test3(__int128 a, uint64_t l)
{
  if ((__int128_t)a - l <= 1)
    doit();
}
*** 11,23 ****
          subs    x0, x0, x2
          sbc     x1, x1, xzr
!         cmp     x1, 0
!         ble     .L6
! .L1:
          ret
          .p2align 2,,3
- .L6:
-         bne     .L4
-         cmp     x0, 1
-         bhi     .L1
  .L4:
          b       doit
--- 11,19 ----
          subs    x0, x0, x2
          sbc     x1, x1, xzr
!         cmp     x0, 2
!         sbcs    xzr, x1, xzr
!         blt     .L4
          ret
          .p2align 2,,3
  .L4:
          b       doit
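
The new sequence tests "x <= 1" as "x < 2", so that a single signed
less-than branch can follow the cmp+sbcs pair. A small C model of those
three instructions is sketched here. The function name is hypothetical,
and the flag arithmetic is a hand-written model rather than anything
taken from the patches.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Model "cmp x0, 2 ; sbcs xzr, x1, xzr ; blt" for the signed 128-bit
   value hi:lo.  Returns true when the branch is taken, i.e. when the
   value is less than 2, which is the same as "<= 1". */
static bool le_one(uint64_t lo, uint64_t hi)
{
    bool carry = lo >= 2;                      /* cmp x0, 2: C set when no borrow */
    uint64_t res = hi - (carry ? 0 : 1);       /* sbcs xzr, x1, xzr */
    bool n = (res >> 63) != 0;                 /* sign of the high result */
    bool v = ((hi & (hi ^ res)) >> 63) != 0;   /* overflow: negative hi became positive */
    return n != v;                             /* blt */
}

/* Small usage example. */
static void check_le_one_examples(void)
{
    assert(le_one(1, 0));    /* 1 <= 1 */
    assert(!le_one(2, 0));   /* 2 >  1 */
}
```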
r~
Richard Henderson (9):
  aarch64: Accept 0 as first argument to compares
  aarch64: Accept zeros in add<GPI>3_carryin
  aarch64: Add <su>cmp_*_carryinC patterns
  aarch64: Add <su>cmp<GPI>_carryinC_m2
  aarch64: Provide expander for sub<GPI>3_compare1
  aarch64: Introduce aarch64_expand_addsubti
  aarch64: Adjust result of aarch64_gen_compare_reg
  aarch64: Implement TImode comparisons
  aarch64: Implement absti2

 gcc/config/aarch64/aarch64-protos.h       |  10 +-
 gcc/config/aarch64/aarch64.c              | 292 +++++++++-------
 gcc/config/aarch64/aarch64-simd.md        |  18 +-
 gcc/config/aarch64/aarch64-speculation.cc |   5 +-
 gcc/config/aarch64/aarch64.md             | 389 +++++++++++++---------
 5 files changed, 402 insertions(+), 312 deletions(-)
--
2.20.1