Hi Surya,

Apologies for the late response. This patch has been bootstrapped and
regtested for powerpc64le with no new failures. 

Thanks and regards,
Avinash Jayakar


On Mon, 2025-09-15 at 22:22 +0530, Surya Kumari Jangala wrote:
> Hi Avinash,
> Patches should be posted for review only *after* regression testing.
> 
> Regards,
> Surya
> 
> On 15/09/25 5:09 pm, Avinash Jayakar wrote:
> > Hi,
> > 
> > Below is a patch for fixing PR121700 proposed for master. Kindly review.
> > Regtesting on powerpc64le is still running. Will update the status soon
> > in this thread.
> > 
> > Thank you,
> > Avinash Jayakar
> > 
> > rs6000: Update scalar cost of {TRUNC,FLOOR}_MOD_EXPR[PR121700]
> > 
> > The default cost model at -O2 is the VERY_CHEAP model, which produces
> > sub-optimal code for loops with TRUNC/FLOOR modulo expressions.
> > Currently the vectorized variant of the modulo operator is almost 4 times
> > faster than the scalar variant for 32-bit integers on power10.
> > 
> > In order to fairly compare the scalar and vectorized variants of the loop
> > in function vect_analyze_loop_costing, update the scalar cost for
> > TRUNC_MOD_EXPR and FLOOR_MOD_EXPR. The value 6 is the number of
> > instructions currently generated for these expressions at -O2.
> > 
> > 2025-09-15  Avinash Jayakar <[email protected]>
> > 
> > gcc/ChangeLog:
> >     PR target/121700
> >     * config/rs6000/rs6000.cc (rs6000_adjust_vect_cost_per_stmt): Add
> >     cost for {FLOOR,TRUNC}_MOD_EXPR.
> > ---
> >  gcc/config/rs6000/rs6000.cc | 6 ++++++
> >  1 file changed, 6 insertions(+)
> > 
> > diff --git a/gcc/config/rs6000/rs6000.cc b/gcc/config/rs6000/rs6000.cc
> > index 8dd23f8619c..183e454c5bc 100644
> > --- a/gcc/config/rs6000/rs6000.cc
> > +++ b/gcc/config/rs6000/rs6000.cc
> > @@ -5311,6 +5311,12 @@ rs6000_adjust_vect_cost_per_stmt (enum vect_cost_for_stmt kind,
> >        tree_code subcode = gimple_assign_rhs_code (stmt_info->stmt);
> >        if (subcode == COND_EXPR)
> >     return 2;
> > +/* For {FLOOR,TRUNC}_MOD_EXPR, cost them a bit higher in order to fairly
> > +   compare the scalar and vector costs, since no single instruction can
> > +   evaluate these expressions.  Currently using the number of
> > +   instructions generated for these expressions.  */
> > +      if (subcode == FLOOR_MOD_EXPR || subcode == TRUNC_MOD_EXPR)
> > +  return 6;
> >      }
> >  
> >    return 0;
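
For context, a loop of the kind this cost adjustment is meant to influence
might look like the sketch below. The function name and signature are
hypothetical and are not taken from the PR's testcase; the only point is that
the C `%` operator on signed integers truncates toward zero, so the loop body
is a TRUNC_MOD_EXPR that the vectorizer has to cost against its vector
counterpart.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical kernel (not from PR121700): element-wise remainder.
   In C, % on signed integers truncates toward zero, so the statement
   in the loop body is a TRUNC_MOD_EXPR.
   The caller must ensure b[i] != 0 and must avoid INT32_MIN % -1,
   both of which are undefined behaviour.  */
void
mod_loop (int32_t *restrict dst, const int32_t *restrict a,
          const int32_t *restrict b, size_t n)
{
  for (size_t i = 0; i < n; i++)
    dst[i] = a[i] % b[i];
}
```

Whether such a loop is actually vectorized depends on the target, the
options used (for example -O2 with -mcpu=power10), and the cost comparison
described in the commit message above.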
