The layout differs between VLEN=128 and VLEN=256 (and any larger
VLEN).
To give a practical example:
if vec1 is allocated into v8 and v9, the register layout will be:
VLEN=128
v8 = [0, 1, 2, 3]
v9 = [4, 5, 6, 7]
VLEN=256
v8 = [0, 1, 2, 3, 4, 5, 6, 7]
v9 = [?, ?, ?, ?, ?, ?, ?, ?]
Then you could imagine that
vsetivli zero,8,e32,m2,ta,ma
vadd.vv v8, v8, v10
works on any machine with VLEN >= 128.
OK, so unless we split a vector into LMUL1-sized (128-bit here) chunks in
the first place, we cannot go back to LMUL1 later.
Doesn't that also mean that _if_ we split into 128-bit chunks (first case
above), running on VLEN=256 would look like
v8 = [0, 1, 2, 3, ?, ?, ?, ?]
v9 = [4, 5, 6, 7, ?, ?, ?, ?]
and
vsetivli zero,8,e32,m2,ta,ma
vadd.vv v8, v8, v10
wouldn't get the right result (from an LMUL2 perspective)? So the layouts are
only compatible if VLEN and LMUL match?
I'm probably misunderstanding and/or am confused :)
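
To make the question concrete, here is a small Python model of how I
understand the two layouts. It is only a sketch: the helper names
(group_layout, chunked_layout, active_elements) are made up for
illustration, None stands for the '?' lanes, and I am assuming that element
i of a register group lives in register i / (VLEN/SEW) at lane
i % (VLEN/SEW).

```python
def group_layout(elements, vlen, sew=32, lmul=2):
    """Lay out elements across an LMUL register group (assumed indexing:
    element i goes to register i // (vlen // sew), lane i % (vlen // sew))."""
    per_reg = vlen // sew
    regs = [[None] * per_reg for _ in range(lmul)]
    for i, e in enumerate(elements):
        regs[i // per_reg][i % per_reg] = e
    return regs


def chunked_layout(elements, vlen, chunk_bits=128, sew=32):
    """Put each chunk_bits-sized chunk at the start of its own register."""
    per_reg = vlen // sew
    per_chunk = chunk_bits // sew
    chunks = [list(elements[i:i + per_chunk])
              for i in range(0, len(elements), per_chunk)]
    # Pad the unused upper lanes with None (the '?' in the diagrams).
    return [c + [None] * (per_reg - len(c)) for c in chunks]


def active_elements(regs, vl):
    """Elements that an instruction with the given vl touches, in group order."""
    flat = [e for r in regs for e in r]
    return flat[:vl]


vec1 = list(range(8))

# Native LMUL2 layout: matches the diagrams for VLEN=128 and VLEN=256.
print(group_layout(vec1, 128))  # [[0, 1, 2, 3], [4, 5, 6, 7]]
print(group_layout(vec1, 256))  # all eight elements land in the first register

# 128-bit chunks on a VLEN=256 machine, then an m2 operation with vl=8:
print(active_elements(chunked_layout(vec1, 256), 8))
# [0, 1, 2, 3, None, None, None, None]
```

If that model is right, the m2 operation with vl=8 only sees elements 0..3
of the chunked layout on VLEN=256, while on VLEN=128 the chunked and the
native layouts coincide.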
--
Regards
Robin