On Mon, Mar 31, 2025 at 11:34 PM Robin Dapp <rdapp....@gmail.com> wrote:
>
> >> Yeah...and I also don't like the magic "ceil(AVL / 2) ≤ vl ≤ VLMAX if
> >> AVL < (2 * VLMAX)" rule...
> >
> > +1, spec has some description about this but I am not sure if I really get 
> > the point.
> >
> > From Spec:
> >
> > "For example, this permits an implementation to set vl = ceil(AVL / 2)
> > for VLMAX < AVL < 2*VLMAX in order to evenly distribute work over the
> > last two iterations of a stripmine loop. Requirement 2 ensures that the
> > first stripmine iteration of reduction loops uses the largest vector
> > length of all iterations, even in the case of AVL < 2*VLMAX. This allows
> > software to avoid needing to explicitly calculate a running maximum of
> > vector lengths observed during a stripmined loop. Requirement 2 also
> > allows an implementation to set vl to VLMAX for VLMAX < AVL < 2*VLMAX"
>
> Yeah, that's very unfortunate.
>
> The rule is something like
>
>   if AVL >= 2 * VLMAX
>     vl = vsetvl = min (AVL, VLMAX)
>
>   if VLMAX < AVL < 2 * VLMAX
>     vl = vsetvl = "whatever" ;)

Note it's not quite "whatever" -- there is a constraint that vl be
monotonically nonincreasing, which in some cases is the only important
property.  No denying this is an annoyance, though.

>
>   if AVL <= VLMAX
>     vl = vsetvl = min (AVL, VLMAX)
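
To make the three cases concrete, here is a minimal C sketch, reading the
middle case as VLMAX < AVL < 2 * VLMAX as in the spec text.  The names
vl_min, vl_balanced and vl_is_legal are hypothetical and only for
illustration; none of this is GCC code or a model of any particular core.
It also assumes 2 * VLMAX does not overflow size_t.

```c
#include <stdbool.h>
#include <stddef.h>

/* Minimum-only choice: within the permitted range for every AVL. */
static size_t vl_min(size_t avl, size_t vlmax)
{
    return avl < vlmax ? avl : vlmax;
}

/* Load-balancing choice the spec also permits: ceil(AVL / 2) when
   VLMAX < AVL < 2 * VLMAX, the minimum otherwise. */
static size_t vl_balanced(size_t avl, size_t vlmax)
{
    if (avl > vlmax && avl < 2 * vlmax)
        return avl / 2 + avl % 2;   /* ceil(avl / 2) */
    return vl_min(avl, vlmax);
}

/* Check a candidate vl against the three cases. */
static bool vl_is_legal(size_t avl, size_t vlmax, size_t vl)
{
    if (avl <= vlmax)
        return vl == avl;
    if (avl >= 2 * vlmax)
        return vl == vlmax;
    /* VLMAX < AVL < 2 * VLMAX: anything in [ceil(AVL / 2), VLMAX]. */
    return vl >= avl / 2 + avl % 2 && vl <= vlmax;
}
```

With VLMAX = 8 and AVL = 12, vl_min gives 8 and vl_balanced gives 6, and
vl_is_legal accepts both.  That spread is the part a compiler cannot
predict.
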
>
> The idea of load balancing is alright I guess but it really complicates
> matters in the compiler.
>
> FWIW my plan for GCC 16 is to define a SELECT_VL_SANE (or any better name I
> can come up with) that doesn't have this behavior and always only performs a
> minimum instead.  This will allow us to perform scalar evolution on vsetvl
> rather than giving up as we do right now.  Microarchitectures where vsetvl
> always behaves like a minimum would then enable the corresponding
> expander/insn and others would fall back to the current behavior.
>
> --
> Regards
>  Robin
>
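
As a rough illustration of why the minimum-only behavior is easier to
analyze, the C sketch below runs a stripmine loop under two vl policies and
checks whether iteration i always starts at element offset i * VLMAX.  All
names here (vl_policy, vl_min, vl_balanced, stripmine, offsets_are_affine)
are hypothetical; this is a simplified model of the idea, not how GCC's
scalar evolution is implemented.

```c
#include <stdbool.h>
#include <stddef.h>

typedef size_t (*vl_policy)(size_t avl, size_t vlmax);

/* Minimum-only policy. */
static size_t vl_min(size_t avl, size_t vlmax)
{
    return avl < vlmax ? avl : vlmax;
}

/* Load-balancing policy: ceil(AVL / 2) when VLMAX < AVL < 2 * VLMAX. */
static size_t vl_balanced(size_t avl, size_t vlmax)
{
    if (avl > vlmax && avl < 2 * vlmax)
        return avl / 2 + avl % 2;
    return vl_min(avl, vlmax);
}

/* Run a stripmine loop over n elements, storing each iteration's vl in
   out[] (capacity cap).  Returns the number of iterations recorded. */
static size_t stripmine(vl_policy policy, size_t n, size_t vlmax,
                        size_t *out, size_t cap)
{
    size_t iters = 0;
    while (n > 0 && iters < cap) {
        size_t vl = policy(n, vlmax);
        out[iters++] = vl;
        n -= vl;
    }
    return iters;
}

/* True if iteration i always starts at element offset i * vlmax, i.e.
   the offset is an affine function of the iteration number. */
static bool offsets_are_affine(const size_t *vls, size_t iters, size_t vlmax)
{
    size_t offset = 0;
    for (size_t i = 0; i < iters; i++) {
        if (offset != i * vlmax)
            return false;
        offset += vls[i];
    }
    return true;
}
```

For n = 28 and VLMAX = 8 the minimum policy yields vl = 8, 8, 8, 4 and the
balanced policy yields 8, 8, 6, 6.  Both take four iterations and both
sequences are nonincreasing, but only the first keeps the offset at
i * VLMAX, so offsets_are_affine holds for vl_min and fails for
vl_balanced.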
