Re: [PATCH 09/14] aarch64: Add support for unpacked SVE FDIV

Richard Sandiford Tue, 10 Jun 2025 11:54:49 -0700

Spencer Abson <spencer.ab...@arm.com> writes:
> On Fri, Jun 06, 2025 at 12:46:32PM +0100, Richard Sandiford wrote:
>> Spencer Abson <spencer.ab...@arm.com> writes:
>> > This patch extends the unpredicated FP division expander to support
>> > partial FP modes.  It extends the existing patterns used to implement
>> > UNSPEC_COND_FDIV and it's approximation as needed.
>> >
>> > gcc/ChangeLog:
>> >
>> >    * config/aarch64/aarch64-sve.md: (@aarch64_sve_<optab><mode>):
>> >    Extend from SVE_FULL_F to SVE_F, use aarch64_predicate_operand.
>> >    (div<mode>3): Extend from SVE_FULL_F to SVE_F.
>> >    (@aarch64_frecpe<mode>): Likewise.
>> >    (@aarch64_frecps<mode>): Likewise.
>> >
>> > gcc/testsuite/ChangeLog:
>> >
>> >    * gcc.target/aarch64/sve/unpacked_fdiv_1.c: New test.
>> >    * gcc.target/aarch64/sve/unpacked_fdiv_2.c: Likewise.
>> >    * gcc.target/aarch64/sve/unpacked_fdiv_3.c: Likewise.
>> > ---
>> >  gcc/config/aarch64/aarch64-sve.md             | 50 +++++++++----------
>> >  .../gcc.target/aarch64/sve/unpacked_fdiv_1.c  | 34 +++++++++++++
>> >  .../gcc.target/aarch64/sve/unpacked_fdiv_2.c  | 11 ++++
>> >  .../gcc.target/aarch64/sve/unpacked_fdiv_3.c  | 11 ++++
>> >  4 files changed, 81 insertions(+), 25 deletions(-)
>> >  create mode 100644 gcc/testsuite/gcc.target/aarch64/sve/unpacked_fdiv_1.c
>> >  create mode 100644 gcc/testsuite/gcc.target/aarch64/sve/unpacked_fdiv_2.c
>> >  create mode 100644 gcc/testsuite/gcc.target/aarch64/sve/unpacked_fdiv_3.c
>> >
>> > diff --git a/gcc/config/aarch64/aarch64-sve.md 
>> > b/gcc/config/aarch64/aarch64-sve.md
>> > index cdad900d9cf..79a087837de 100644
>> > --- a/gcc/config/aarch64/aarch64-sve.md
>> > +++ b/gcc/config/aarch64/aarch64-sve.md
>> > @@ -3752,9 +3752,9 @@
>> >  
>> >  ;; Unpredicated floating-point unary operations.
>> >  (define_insn "@aarch64_sve_<optab><mode>"
>> > -  [(set (match_operand:SVE_FULL_F 0 "register_operand" "=w")
>> > -  (unspec:SVE_FULL_F
>> > -    [(match_operand:SVE_FULL_F 1 "register_operand" "w")]
>> > +  [(set (match_operand:SVE_F 0 "register_operand" "=w")
>> > +  (unspec:SVE_F
>> > +    [(match_operand:SVE_F 1 "register_operand" "w")]
>> >      SVE_FP_UNARY))]
>> >    "TARGET_SVE"
>> >    "<sve_fp_op>\t%0.<Vetype>, %1.<Vetype>"
>> 
>> I agree the patch is correct for the current definitions of SVE_FP_UNARY
>> and SVE_FP_BINARY.  Since the names are generic, though, I think it
>> would be worth adding a comment to iterators.md above the definition
>> of both iterators, saying something like:
>> 
>> ;; This iterator is currently only used for estimation instructions,
>> ;; where lax handling of exceptions is assumed to be acceptable.
>> ;; The iterator is therefore applied unconditionally to partial FP modes.
>> ;; This would need to be revisited if new operations are added in future.
>> 
>> (feel free to reword)
>> 
>> The patch LGTM with that change.
>> 
>> That said, I suppose the documentation of the -mlow-precision-*
>> options doesn't explicit state that they change exception behaviour.
>> Maybe it would be safer to leave the reciprocal stuff out for now,
>> even though wanting low-precision results with strict exception
>> conformance seems like an odd combination.  Does anyone else have
>> any thoughts?
>
> Yeah, I agree that it's not immediately clear whether -mlow-precision-*
> alone justifies this.  I wouldn't have made this change if the low-
> precision expansion wasn't predicated on all of:
>
>   if (!flag_finite_math_only
>       || flag_trapping_math
>       || !flag_unsafe_math_optimizations
>       || optimize_function_for_size_p (cfun)
>       || !use_approx_division_p)
>     return false;
>
> Which, IIUC, reflects the fact that it also requires -ffast-math or
> -funsafe-math-optimizations.


Ah, OK.
>
> I should have placed an assertion (or something similar) to make sure
> that we have !flag_trapping_math when the low-precision expander is
> handling partial vector modes.
>
> Perhaps something for V2?  Happy to drop it for now if not.

I think in that case we can just change the comment I suggested to:

;; This iterator is currently only used for estimation instructions,
;; which are never generated automatically when -ftrapping-math is true.
;; The iterator is therefore applied unconditionally to partial FP modes.
;; This might need to be revisited if new operations are added in future.

It can be tricky for expanders to assert that the caller is sane,
because expanders generally don't have as much context as the caller does.
For example, the instructions could theoretically be protected by a
test for trouble-free inputs, a bit like for -ftree-builtin-call-dce.

Thanks,
Richard

Re: [PATCH 09/14] aarch64: Add support for unpacked SVE FDIV

Reply via email to