On 6/30/23 08:58, Song Gao wrote:
+#define VEXTH(NAME, BIT, E1, E2) \
+void HELPER(NAME)(CPULoongArchState *env, \
+                  uint32_t oprsz, uint32_t vd, uint32_t vj) \
+{ \
+    int i, max; \
+    VReg *Vd = &(env->fpr[vd].vreg); \
+    VReg *Vj = &(env->fpr[vj].vreg); \
+ \
+    max = LSX_LEN / BIT; \
+    for (i = 0; i < max; i++) { \
+        Vd->E1(i) = Vj->E2(i + max); \
+        if (oprsz == 32) { \
+            Vd->E1(i + max) = Vj->E2(i + max * 3); \
+        } \
+    } \
+}
Better with void * and uint32_t desc.
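For example, the declaration for one instantiation would become something like

    DEF_HELPER_FLAGS_3(vexth_h_b, TCG_CALL_NO_RWG, void, ptr, ptr, i32)

(a sketch only; the vexth_h_b name and the TCG_CALL_NO_RWG flag are my assumptions here), with oprsz recovered inside the helper via simd_oprsz(desc) instead of being passed as an argument.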
So this doesn't expand the whole register in element order; it widens the high half of each 128-bit lane, similar to x86 AVX and Arm SVE.
I believe the way I handled it there was:
ofs = 128 / BIT;
for (i = 0; i < oprsz / (BIT / 8); i += ofs) {
    for (j = 0; j < ofs; j++) {
        E1[i + j] = E2[i + j + ofs];
    }
}
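
Filling that in, a minimal sketch of the whole macro with the void * / desc signature (this assumes the LSX_LEN, VReg and E1/E2 accessor definitions from this patch; note that, in source-element units, the high half of lane i starts at ofs * (2 * i + 1), which is what the i + max * 3 term above computes for the second lane):

#define VEXTH(NAME, BIT, E1, E2) \
void HELPER(NAME)(void *vd, void *vj, uint32_t desc) \
{ \
    int i, j, ofs = LSX_LEN / BIT; \
    int oprsz = simd_oprsz(desc); \
    VReg *Vd = (VReg *)vd; \
    VReg *Vj = (VReg *)vj; \
 \
    /* One pass of i per 128-bit lane. */ \
    for (i = 0; i < oprsz / 16; i++) { \
        for (j = 0; j < ofs; j++) { \
            /* High half of lane i widens into dest lane i. */ \
            Vd->E1(j + ofs * i) = Vj->E2(j + ofs * (2 * i + 1)); \
        } \
    } \
}

The oprsz == 32 special case then disappears, and the same loop covers both the 128-bit and 256-bit cases.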
r~