Hi Monk,

could you detail the issue/patch a bit?  Are we generally violating
LMUL >= SEW/ELEN with zve32f (zve32x as well then)?  And what's
"implicit zve32f"?  In the test case it's specified explicitly.

> According to Section 3.4.2, Vector Register Grouping, in the RISC-V
> Vector Specification, the rule for LMUL is LMUL >= SEW/ELEN

> +       /* Follow rule LMUL >= SEW / ELEN.  */
> +       int elen = TARGET_VECTOR_ELEN_64 ? 1 : 2;
>         int factor = TARGET_MIN_VLEN / size;
>         if (inner_size == 8)
> -         factor = MIN (factor, 8);
> +         factor = MIN (factor, 8 / elen);
>         else if (inner_size == 16)
> -         factor = MIN (factor, 4);
> +         factor = MIN (factor, 4 / elen);
>         else if (inner_size == 32)
> -         factor = MIN (factor, 2);
> +         factor = MIN (factor, 2 / elen);
>         else if (inner_size == 64)
>           factor = MIN (factor, 1);
>         else

As far as I understand it the minimum LMUL rule applies to the minimum SEW = 8
and the ELEN of the implementation.  An LMUL = 1/8 is invalid for a VLEN = 32
because that would mean we'd only have 4 bits per element.

The spec says:

For standard vector extensions with ELEN=32, fractional LMULs of 1/2 and 1/4
must be supported. For standard vector extensions with ELEN=64, fractional
LMULs of 1/2, 1/4, and 1/8 must be supported.
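
To make the arithmetic explicit, here is a minimal sketch of how I read that
rule.  The helper name and the SEW_MIN macro are made up for illustration and
are not part of the port; the only assumption is that the smallest required
fractional LMUL is SEW_MIN / ELEN with SEW_MIN = 8.

```c
#include <assert.h>

/* Narrowest SEW defined by the vector spec (assumption: 8 bits).  */
#define SEW_MIN 8

/* Hypothetical helper: return D such that 1/D is the smallest fractional
   LMUL an implementation has to support, i.e. LMUL_min = SEW_MIN / ELEN.  */
static int
min_lmul_denominator (int elen)
{
  /* Only the two ELEN values mentioned in the spec quote are handled.  */
  assert (elen == 32 || elen == 64);
  return elen / SEW_MIN;
}
```

With ELEN = 32 this gives 4 (LMUL = 1/4) and with ELEN = 64 it gives 8
(LMUL = 1/8), matching the two sentences quoted from the spec.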

So the problem is we assume a "sane" implementation that would implement
LMUL=1/8 whenever VLEN > 32 but that's too optimistic?

Then the problem would be that we're using TARGET_MIN_VLEN rather than ELEN
here and there are implementations that could technically support LMUL = 1/8
but don't?

This sounds a bit like vector unaligned access all over again...
So we'd want a "sane" uarch flag that keeps the current MIN_VLEN behavior, but
we'd need to make LMUL = 1/4 the minimum by default.  This only applies to
LMUL = 1/8, though, and not to all the other cases.

> +/* { dg-options "-march=rv32imafc_zve32f_zvl128b -mabi=ilp32 -O2" } */

> +/* { dg-final { scan-assembler {vsetivli\s+zero,\s*2,\s*e32,\s*m1,\s*t[au],\s*m[au]} } } */
> +/* { dg-final { scan-assembler {vsetivli\s+zero,\s*4,\s*e32,\s*m1,\s*t[au],\s*m[au]} } } */

From what I can tell the test case uses V2SImode, so 64 bits.
With VLEN=128 (zvl128b), isn't the correct LMUL mf2 rather than m1?
In particular, how would the same LMUL for AVL=2 and AVL=4 and the same data
type be correct?
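
For reference, this is the calculation I have in mind, as a rough sketch.  It
assumes LMUL is simply the mode size divided by VLEN, which may well not be
what the port does for these modes; the struct and function names are
hypothetical.

```c
#include <assert.h>

/* LMUL as a fraction num/den; exactly one of the two fields is > 1,
   or both are 1 for LMUL = 1.  */
struct lmul { int num; int den; };

/* Hypothetical helper: LMUL needed to hold MODE_BITS bits when a single
   vector register has VLEN bits.  Both values must be powers of two.  */
static struct lmul
lmul_for_mode (int mode_bits, int vlen)
{
  struct lmul l = { 1, 1 };
  assert (mode_bits > 0 && vlen > 0);
  assert ((mode_bits & (mode_bits - 1)) == 0);
  assert ((vlen & (vlen - 1)) == 0);
  if (mode_bits >= vlen)
    l.num = mode_bits / vlen;   /* m1, m2, m4, ...  */
  else
    l.den = vlen / mode_bits;   /* mf2, mf4, ...  */
  return l;
}
```

Under that assumption lmul_for_mode (64, 128) yields 1/2 (mf2) for a 64-bit
V2SImode and lmul_for_mode (128, 128) yields 1 (m1) for a 128-bit mode, so I
would not expect the same LMUL for both AVLs.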

Maybe it would help to add a run test?  A PR might be useful as well
to track things as we're late in the release cycle.

-- 
Regards
 Robin
