On Fri, Jun 06, 2025 at 12:46:32PM +0100, Richard Sandiford wrote:
> Spencer Abson <[email protected]> writes:
> > This patch extends the unpredicated FP division expander to support
> > partial FP modes. It extends the existing patterns used to implement
> > UNSPEC_COND_FDIV and it's approximation as needed.
> >
> > gcc/ChangeLog:
> >
> > * config/aarch64/aarch64-sve.md: (@aarch64_sve_<optab><mode>):
> > Extend from SVE_FULL_F to SVE_F, use aarch64_predicate_operand.
> > (div<mode>3): Extend from SVE_FULL_F to SVE_F.
> > (@aarch64_frecpe<mode>): Likewise.
> > (@aarch64_frecps<mode>): Likewise.
> >
> > gcc/testsuite/ChangeLog:
> >
> > * gcc.target/aarch64/sve/unpacked_fdiv_1.c: New test.
> > * gcc.target/aarch64/sve/unpacked_fdiv_2.c: Likewise.
> > * gcc.target/aarch64/sve/unpacked_fdiv_3.c: Likewise.
> > ---
> > gcc/config/aarch64/aarch64-sve.md | 50 +++++++++----------
> > .../gcc.target/aarch64/sve/unpacked_fdiv_1.c | 34 +++++++++++++
> > .../gcc.target/aarch64/sve/unpacked_fdiv_2.c | 11 ++++
> > .../gcc.target/aarch64/sve/unpacked_fdiv_3.c | 11 ++++
> > 4 files changed, 81 insertions(+), 25 deletions(-)
> > create mode 100644 gcc/testsuite/gcc.target/aarch64/sve/unpacked_fdiv_1.c
> > create mode 100644 gcc/testsuite/gcc.target/aarch64/sve/unpacked_fdiv_2.c
> > create mode 100644 gcc/testsuite/gcc.target/aarch64/sve/unpacked_fdiv_3.c
> >
> > diff --git a/gcc/config/aarch64/aarch64-sve.md
> > b/gcc/config/aarch64/aarch64-sve.md
> > index cdad900d9cf..79a087837de 100644
> > --- a/gcc/config/aarch64/aarch64-sve.md
> > +++ b/gcc/config/aarch64/aarch64-sve.md
> > @@ -3752,9 +3752,9 @@
> >
> > ;; Unpredicated floating-point unary operations.
> > (define_insn "@aarch64_sve_<optab><mode>"
> > - [(set (match_operand:SVE_FULL_F 0 "register_operand" "=w")
> > - (unspec:SVE_FULL_F
> > - [(match_operand:SVE_FULL_F 1 "register_operand" "w")]
> > + [(set (match_operand:SVE_F 0 "register_operand" "=w")
> > + (unspec:SVE_F
> > + [(match_operand:SVE_F 1 "register_operand" "w")]
> > SVE_FP_UNARY))]
> > "TARGET_SVE"
> > "<sve_fp_op>\t%0.<Vetype>, %1.<Vetype>"
>
> I agree the patch is correct for the current definitions of SVE_FP_UNARY
> and SVE_FP_BINARY. Since the names are generic, though, I think it
> would be worth adding a comment to iterators.md above the definition
> of both iterators, saying something like:
>
> ;; This iterator is currently only used for estimation instructions,
> ;; where lax handling of exceptions is assumed to be acceptable.
> ;; The iterator is therefore applied unconditionally to partial FP modes.
> ;; This would need to be revisited if new operations are added in future.
>
> (feel free to reword)
>
> The patch LGTM with that change.
>
> That said, I suppose the documentation of the -mlow-precision-*
> options doesn't explicit state that they change exception behaviour.
> Maybe it would be safer to leave the reciprocal stuff out for now,
> even though wanting low-precision results with strict exception
> conformance seems like an odd combination. Does anyone else have
> any thoughts?
Yeah, I agree that it's not immediately clear whether -mlow-precision-*
alone justifies this. I wouldn't have made this change if the low-
precision expansion wasn't predicated on all of:
if (!flag_finite_math_only
|| flag_trapping_math
|| !flag_unsafe_math_optimizations
|| optimize_function_for_size_p (cfun)
|| !use_approx_division_p)
return false;
Which, IIUC, reflects the fact that it also requires -ffast-math or
-funsafe-math-optimizations.
I should have placed an assertion (or something similar) to make sure
that we have !flag_trapping_math when the low-precision expander is
handling partial vector modes.
Perhaps something for V2? Happy to drop it for now if not.
>
> Richard