Hi Robin,
Sorry, I should have simplified the problem by presenting it in terms of
Zve32x, because Zve32f implies Zve32x.
As the specification states, the requirement is to support LMUL ≥ SEW/ELEN.
Regarding the implementation, I followed this rule to fix the problem.

The generated code at https://godbolt.org/z/j59oTW371 contains
vsetivli zero,2,e32,mf2,ta,ma.
Here, SEW=32, and Zve32x has ELEN=32, which makes LMUL=1/2 illegal.

According to the rule: LMUL ≥ SEW/ELEN => LMUL ≥ 32/32 => LMUL ≥ 1.
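
To make the arithmetic above concrete, here is a minimal sketch of the check. The helper name lmul_satisfies_rule is hypothetical (it is not a GCC function), and it only encodes the rule as I stated it, LMUL ≥ SEW/ELEN, with LMUL passed as a fraction lmul_num/lmul_den:

```c
#include <stdbool.h>

/* Hypothetical helper, not part of GCC: returns true when
   LMUL = lmul_num / lmul_den satisfies LMUL >= SEW / ELEN.
   Both sides are cross-multiplied to stay in integer arithmetic.  */
static bool
lmul_satisfies_rule (int sew, int elen, int lmul_num, int lmul_den)
{
  return lmul_num * elen >= sew * lmul_den;
}
```

With ELEN=32 (Zve32x), lmul_satisfies_rule (32, 32, 1, 2) is false for e32,mf2, while lmul_satisfies_rule (32, 32, 1, 1) is true for e32,m1. With ELEN=64 the same e32,mf2 setting passes.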


On Tue, Feb 4, 2025 at 8:01 PM Robin Dapp <rdapp....@gmail.com> wrote:

> Hi Monk,
>
> could you detail the issue/patch a bit?  Are we generally violating
> LMUL >= SEW/ELEN with zve32f (zve32x as well then)?  And what's
> "implicit zve32f"?  In the test case it's specified explicitly.
>
> > According to Section 3.4.2, Vector Register Grouping, in the RISC-V
> > Vector Specification, the rule for LMUL is LMUL >= SEW/ELEN
>
> > +       /* Follow rule LMUL >= SEW / ELEN.  */
> > +       int elen = TARGET_VECTOR_ELEN_64 ? 1 : 2;
> >         int factor = TARGET_MIN_VLEN / size;
> >         if (inner_size == 8)
> > -         factor = MIN (factor, 8);
> > +         factor = MIN (factor, 8 / elen);
> >         else if (inner_size == 16)
> > -         factor = MIN (factor, 4);
> > +         factor = MIN (factor, 4 / elen);
> >         else if (inner_size == 32)
> > -         factor = MIN (factor, 2);
> > +         factor = MIN (factor, 2 / elen);
> >         else if (inner_size == 64)
> >           factor = MIN (factor, 1);
> >         else
>
> As far as I understand it the minimum LMUL rule applies to the minimum SEW = 8
> and the ELEN of the implementation.  An LMUL = 1/8 is invalid for a VLEN = 32
> because that would mean we'd only have 4 bits per element.
>
> The spec says:
>
> For standard vector extensions with ELEN=32, fractional LMULs of 1/2 and 1/4
> must be supported. For standard vector extensions with ELEN=64, fractional
> LMULs of 1/2, 1/4, and 1/8 must be supported.
>
> So the problem is we assume a "sane" implementation that would implement
> LMUL=1/8 whenever VLEN > 32 but that's too optimistic?
>
> Then the problem would be that we're using TARGET_MIN_VLEN rather than ELEN
> here and there are implementations that could technically support LMUL = 1/8
> but don't?
>

  Yes, it is technically supported, but the hardware may report the
configuration requested by the vsetvli instruction as illegal.
  I think we should use ELEN. Our current implementations are designed with
Zve64x in mind.
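
  As a rough sketch of what deriving the cap from ELEN could look like: the helper max_fractional_factor below is hypothetical and is not the code in the patch. It returns the largest d for which LMUL = 1/d still satisfies LMUL ≥ SEW/ELEN, which is d = ELEN/SEW.

```c
/* Hypothetical helper, not the actual GCC code: largest divisor d such
   that LMUL = 1/d still satisfies LMUL >= SEW / ELEN, i.e. d <= ELEN / SEW.
   Returns 0 when SEW is not positive or exceeds ELEN, meaning the element
   width is not supported at all.  */
static int
max_fractional_factor (int sew_bits, int elen_bits)
{
  if (sew_bits <= 0 || sew_bits > elen_bits)
    return 0;
  return elen_bits / sew_bits;
}
```

  For ELEN=64 this gives 8, 4, 2 and 1 for SEW=8, 16, 32 and 64, which matches the existing caps. For ELEN=32 it gives 4, 2 and 1 for SEW=8, 16 and 32, the same values the patch computes as 8 / elen, 4 / elen and 2 / elen.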


>
> This sounds a bit like vector unaligned access all over again...
> So we'd want a "sane" uarch flag that keeps the current MIN_VLEN behavior but
> needed to make LMUL = 1/4 the minimum by default.  This only applies to
> LMUL = 1/8, though and not all the other cases.
>
> > +/* { dg-options "-march=rv32imafc_zve32f_zvl128b -mabi=ilp32 -O2" } */
>
> > +/* { dg-final { scan-assembler {vsetivli\s+zero,\s*2,\s*e32,\s*m1,\s*t[au],\s*m[au]} } } */
> > +/* { dg-final { scan-assembler {vsetivli\s+zero,\s*4,\s*e32,\s*m1,\s*t[au],\s*m[au]} } } */
>
> From what I can tell the test case uses a V2SImode, so 64 bit.
> When VLEN=128 (zvl128b) isn't the correct LMUL mf2 rather than m1?
>
> In particular, how would the same LMUL for AVL=2 and AVL=4 and the same data
> type be correct?
>
  That's right. The test case just allocates more register space, but the
stores of 2 and 4 elements remain the same.


>
> Maybe it would help to add a run test?  A PR might be useful as well
> to track things as we're late in the release cycle.
>

   The test is taken from ./gcc.dg/tree-ssa/pr80898-2.c. I will add a run test.

>
> --
> Regards
>  Robin
>
>
