The layout will be different between VLEN=128 and VLEN=256 (and also
any larger VLEN)

To give a practical example:
if vec1 is allocated into v8 and v9, the register layout will be:

VLEN=128
v8 = [0, 1, 2, 3]
v9 = [4, 5, 6, 7]

VLEN=256
v8 = [0, 1, 2, 3, 4, 5, 6, 7]
v9 = [?, ?, ?, ?, ?, ?, ?, ?]

Then you could imagine that
vsetivli        zero,8,e32,m2,ta,ma
vadd.vv v8, v8, v10
works on any machine with VLEN >= 128.
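
The layouts above can be reproduced with a small model. This is only a sketch, assuming the elements of a register group are packed contiguously and each register is filled completely before the next one is used; the function name and its parameters are made up for illustration.

```python
# Hypothetical model: place the first `vl` elements of a vector into an
# LMUL-sized register group, filling each register completely before
# moving on to the next one. Unused slots are shown as "?".
def group_layout(vlen, sew, lmul, vl, first_reg=8):
    per_reg = vlen // sew  # number of elements that fit in one register
    regs = {f"v{first_reg + r}": ["?"] * per_reg for r in range(lmul)}
    for i in range(vl):
        regs[f"v{first_reg + i // per_reg}"][i % per_reg] = i
    return regs

for vlen in (128, 256):
    print(f"VLEN={vlen}", group_layout(vlen, sew=32, lmul=2, vl=8))
```

With e32, m2 and vl=8 this gives four elements each in v8 and v9 for VLEN=128, and all eight elements in v8 for VLEN=256, matching the two layouts shown.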

OK, so whenever we didn't split a vector into LMUL1-sized chunks (128 bits here) in the first place, we cannot go back to LMUL1 any more.

Doesn't that also mean that _if_ we split into 128-bit chunks (first case above), running on VLEN=256 would look like

v8 = [0, 1, 2, 3, ?, ?, ?, ?]
v9 = [4, 5, 6, 7, ?, ?, ?, ?]

and
vsetivli        zero,8,e32,m2,ta,ma
vadd.vv v8, v8, v10

wouldn't get the right result (from an LMUL2 perspective)? So the layouts are
only compatible if VLEN and LMUL match?
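
To make the question concrete, here is a sketch of the mental model behind it. It assumes an m2 operation reads element i from register i / (VLEN/SEW) of the group; the helper name and the list-of-lists register representation are made up for illustration.

```python
# Hypothetical model: an LMUL=2 operation with `vl` elements reads element i
# from register i // (VLEN/SEW) of the group, at slot i % (VLEN/SEW).
def elements_seen(regs, vlen, sew, vl):
    per_reg = vlen // sew
    return [regs[i // per_reg][i % per_reg] for i in range(vl)]

# The vector stored as two 128-bit chunks, but held in 256-bit registers.
chunked_256 = [[0, 1, 2, 3, "?", "?", "?", "?"],  # v8
               [4, 5, 6, 7, "?", "?", "?", "?"]]  # v9
print(elements_seen(chunked_256, vlen=256, sew=32, vl=8))

# The same chunks on a VLEN=128 machine, where each chunk fills a register.
chunked_128 = [[0, 1, 2, 3],  # v8
               [4, 5, 6, 7]]  # v9
print(elements_seen(chunked_128, vlen=128, sew=32, vl=8))
```

Under that assumption the VLEN=256 case reads only v8, including its four undefined slots, and never touches v9, while the VLEN=128 case sees elements 0 to 7 as intended.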

I'm probably misunderstanding and/or am confused :)

--
Regards
Robin
