Hi Robin:

> I think it's your
>
>   RISC-V: Allow VLS types using up to LMUL 8
>
> that makes the difference.  I don't have that one in my tree.
>
> Your example above gives me
>
> test_256bit_vector:
> .LFB0:
>         .cfi_startproc
>         vsetivli        zero,4,e32,m1,ta,ma
>         addi    a5,a1,16
>         vle32.v v9,0(a5)
>         addi    a5,a2,16
>         vle32.v v11,0(a5)
>         vle32.v v8,0(a1)
>         vle32.v v10,0(a2)
>         addi    a5,a0,16
>         vadd.vv v9,v9,v11
>         vadd.vv v8,v8,v10
>         vse32.v v9,0(a5)
>         vse32.v v8,0(a0)
>
> which is of course not great and kind of defeats the purpose of a vector CC if
> we need pass via stack just because of an LMUL mismatch.
Hmm, passing via the stack just doesn't fit the VLS CC; the arguments should
be passed in vector registers with an LMUL 2 constraint.

I am not sure that passing arguments in m2 while limiting operations to m1 is
worth spending time on, because I think it would be hard to avoid spilling the
value to the stack and reloading it into registers.  Code gen of that quality
would make the VLS CC more or less useless.

>
> --
> Regards
>  Robin
>