On Mon, Dec 8, 2025 at 3:59 PM Richard Biener <[email protected]> wrote:
>
> On Mon, 8 Dec 2025, Uros Bizjak wrote:
>
> > On Mon, Dec 8, 2025 at 2:41 PM Richard Biener <[email protected]> wrote:
> > >
> > > The following adjusts the costing of vector construction from scalars
> > > for FP modes: with 387 math the scalars can reside in x87 FP registers
> > > and need to be spilled to be reloaded into XMM registers.  I've played
> > > it safe with mixed SSE/387 math.
> > >
> > > Bootstrap and regtest running on x86_64-unknown-linux-gnu.
> > >
> > > OK?
> > >
> > > Thanks,
> > > Richard.
> > >
> > >         PR target/121230
> > >         * config/i386/i386.cc (ix86_vector_costs::add_stmt_cost):
> > >         With FP mode and 387 math cost spill/reload.
> > >
> > >         * gcc.target/i386/pr121230.c: New testcase.
> > > ---
> > >  gcc/config/i386/i386.cc                  | 15 ++++++++++++++-
> > >  gcc/testsuite/gcc.target/i386/pr121230.c | 16 ++++++++++++++++
> > >  2 files changed, 30 insertions(+), 1 deletion(-)
> > >  create mode 100644 gcc/testsuite/gcc.target/i386/pr121230.c
> > >
> > > diff --git a/gcc/config/i386/i386.cc b/gcc/config/i386/i386.cc
> > > index db43045753b..ad978d7474d 100644
> > > --- a/gcc/config/i386/i386.cc
> > > +++ b/gcc/config/i386/i386.cc
> > > @@ -26397,7 +26397,20 @@ ix86_vector_costs::add_stmt_cost (int count, vect_cost_for_stmt kind,
> > >                                 (TREE_OPERAND (gimple_assign_rhs1 (def), 0))))))
> > >             {
> > >               if (fp)
> > > -               m_num_sse_needed[where]++;
> > > +               {
> > > +                 /* Scalar FP values residing in x87 registers need to be
> > > +                    spilled and reloaded.  */
> > > +                 if (ix86_fpmath & FPMATH_387)
> >
> > Perhaps you can use the IS_STACK_MODE() macro, it determines more
> > precisely which mode is handled in stack registers.
>
> Sure (though in practice only SFmode and DFmode get vectorized?).

Please note that the macro also accounts for SFmode and DFmode being
handled in SSE registers with TARGET_SSE / TARGET_SSE2.
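
For illustration, here is a simplified standalone model of the decision that
IS_STACK_MODE makes. It is only a sketch, assuming the macro combines
X87_FLOAT_MODE_P, SSE_FLOAT_MODE_P, TARGET_SSE_MATH and TARGET_MIX_SSE_I387
in the way modeled below; gcc/config/i386/i386.h has the authoritative
definition. The enum, the struct and both function names are hypothetical
stand-ins, not GCC identifiers.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins for the scalar FP machine modes.  */
enum fp_mode { MODE_SF, MODE_DF, MODE_XF };

/* Hypothetical stand-ins for the target flags the macro is assumed to read.
   In GCC, -mfpmath=both sets both the SSE-math and the mixed flag; this
   model keeps them as independent booleans for simplicity.  */
struct target_flags {
  bool has_80387;     /* models TARGET_80387 */
  bool has_sse;       /* models TARGET_SSE */
  bool has_sse2;      /* models TARGET_SSE2 */
  bool sse_math;      /* models TARGET_SSE_MATH */
  bool mix_sse_i387;  /* models TARGET_MIX_SSE_I387 */
};

/* True if scalar SSE instructions exist for mode M on target T.  */
static bool sse_float_mode_p(const struct target_flags *t, enum fp_mode m)
{
  return (t->has_sse && m == MODE_SF) || (t->has_sse2 && m == MODE_DF);
}

/* True if values of mode M are assumed to live in x87 stack registers.  */
static bool is_stack_mode(const struct target_flags *t, enum fp_mode m)
{
  assert(m == MODE_SF || m == MODE_DF || m == MODE_XF);
  if (!t->has_80387)
    return false;
  return !(sse_float_mode_p(t, m) && t->sse_math) || t->mix_sse_i387;
}
```

Under this model, a target with SSE but no SSE2 and pure 387 math treats
SFmode as a stack mode, while a target using SSE math treats only XFmode as
one. A plain check of the 387 fpmath bit would not distinguish these modes.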

> Like the following.

Yes.

> Re-testing on x86_64-unknown-linux-gnu, OK?

OK.

Thanks,
Uros.

>
> Thanks,
> Richard.
>
> From 69c63b06daf193fcc5fa2e0093db0d4198b75432 Mon Sep 17 00:00:00 2001
> From: Richard Biener <[email protected]>
> Date: Mon, 8 Dec 2025 14:36:58 +0100
> Subject: [PATCH] target/121230 - x86 vector CTOR cost with 387 math
> To: [email protected]
>
> The following adjusts the costing of vector construction from scalars
> for FP modes: with 387 math the scalars can reside in x87 FP registers
> and need to be spilled to be reloaded into XMM registers.  I've played
> it safe with mixed SSE/387 math.
>
>         PR target/121230
>         * config/i386/i386.cc (ix86_vector_costs::add_stmt_cost):
>         With FP mode and 387 math cost spill/reload.
>
>         * gcc.target/i386/pr121230.c: New testcase.
> ---
>  gcc/config/i386/i386.cc                  | 15 ++++++++++++++-
>  gcc/testsuite/gcc.target/i386/pr121230.c | 16 ++++++++++++++++
>  2 files changed, 30 insertions(+), 1 deletion(-)
>  create mode 100644 gcc/testsuite/gcc.target/i386/pr121230.c
>
> diff --git a/gcc/config/i386/i386.cc b/gcc/config/i386/i386.cc
> index db43045753b..75a9cb6211a 100644
> --- a/gcc/config/i386/i386.cc
> +++ b/gcc/config/i386/i386.cc
> @@ -26397,7 +26397,20 @@ ix86_vector_costs::add_stmt_cost (int count, vect_cost_for_stmt kind,
>                                 (TREE_OPERAND (gimple_assign_rhs1 (def), 0))))))
>             {
>               if (fp)
> -               m_num_sse_needed[where]++;
> +               {
> +                 /* Scalar FP values residing in x87 registers need to be
> +                    spilled and reloaded.  */
> +                 auto mode2 = TYPE_MODE (TREE_TYPE (op));
> +                 if (IS_STACK_MODE (mode2))
> +                   {
> +                     int cost
> +                       = (ix86_cost->hard_register.fp_store[mode2 == SFmode
> +                                                            ? 0 : 1]
> +                          + ix86_cost->sse_load[sse_store_index (mode2)]);
> +                     stmt_cost += COSTS_N_INSNS (cost) / 2;
> +                   }
> +                 m_num_sse_needed[where]++;
> +               }
>               else
>                 {
>                   m_num_gpr_needed[where]++;
> diff --git a/gcc/testsuite/gcc.target/i386/pr121230.c b/gcc/testsuite/gcc.target/i386/pr121230.c
> new file mode 100644
> index 00000000000..67c9c5ccb2d
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/i386/pr121230.c
> @@ -0,0 +1,16 @@
> +/* { dg-do compile { target ia32 } } */
> +/* { dg-options "-O3 -march=athlon-xp -mfpmath=387 -fexcess-precision=standard" } */
> +
> +typedef struct {
> +    float a;
> +    float b;
> +} f32_2;
> +
> +f32_2 add32_2(f32_2 x, f32_2 y) {
> +    return (f32_2){ x.a + y.a, x.b + y.b};
> +}
> +
> +/* We do not want the vectorizer to vectorize the store and/or the
> +   conversion (with IA32 we do not support V2SF add) given that doing
> +   so spills FP regs to reload them to XMM.  */
> +/* { dg-final { scan-assembler-not "movss\[ \\t\]+\[0-9\]*\\\(%esp\\\), %xmm" } } */
> --
> 2.51.0
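
To make the arithmetic of the added cost concrete, here is a minimal
standalone sketch of the adjustment in the hunk above: one x87 store plus
one SSE load, scaled by COSTS_N_INSNS and halved. The cost table layout, the
enum and the function name are hypothetical, and the SSE load index is
simplified to the same 0/1 choice as the x87 store index, whereas the patch
uses sse_store_index (mode2). COSTS_N_INSNS is reproduced as it is defined
in GCC's rtl.h.

```c
#include <assert.h>

/* Same definition as in GCC's rtl.h: one instruction costs 4 units.  */
#define COSTS_N_INSNS(n) ((n) * 4)

/* Hypothetical, cut-down cost table; GCC's processor cost tables have
   many more fields.  */
struct cost_table {
  int fp_store[2];  /* x87 store cost: [0] SFmode, [1] DFmode */
  int sse_load[2];  /* XMM load cost:  [0] 32-bit, [1] 64-bit */
};

/* Hypothetical stand-ins for the two scalar modes of interest.  */
enum fp_mode { MODE_SF, MODE_DF };

/* Extra statement cost for one scalar that has to be stored from an x87
   register and reloaded into an XMM register.  */
static int spill_reload_cost(const struct cost_table *c, enum fp_mode m)
{
  assert(m == MODE_SF || m == MODE_DF);
  int idx = (m == MODE_SF) ? 0 : 1;
  int cost = c->fp_store[idx] + c->sse_load[idx];
  return COSTS_N_INSNS(cost) / 2;
}
```

With made-up costs fp_store = {4, 6} and sse_load = {6, 8}, the SFmode
adjustment is COSTS_N_INSNS (4 + 6) / 2 = 20 and the DFmode adjustment is
COSTS_N_INSNS (6 + 8) / 2 = 28. These numbers are illustrative only and do
not come from any real tuning table.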
