On 6/7/21 9:58 AM, Peter Maydell wrote:
+#define DO_VCADD(OP, ESIZE, TYPE, H, FN0, FN1) \
+ void HELPER(glue(mve_, OP))(CPUARMState *env, void *vd, void *vn, void *vm) \
+ { \
+ TYPE *d = vd, *n = vn, *m = vm; \
+ uint16_t mask = mve_element_mask(env); \
+ unsigned e; \
+ TYPE r[16 / ESIZE]; \
+ /* Calculate all results first to avoid overwriting inputs */ \
+ for (e = 0; e < 16 / ESIZE; e++) { \
+ if (!(e & 1)) { \
+ r[e] = FN0(n[H(e)], m[H(e + 1)]); \
+ } else { \
+ r[e] = FN1(n[H(e)], m[H(e - 1)]); \
+ } \
+ } \
+ for (e = 0; e < 16 / ESIZE; e++, mask >>= ESIZE) { \
+ uint64_t bytemask = mask_to_bytemask##ESIZE(mask); \
+ d[H(e)] &= ~bytemask; \
+ d[H(e)] |= (r[e] & bytemask); \
+ } \
+ mve_advance_vpt(env); \
+ }
I guess this is OK. You could unroll the loop once, so that each iteration
computes only one even+odd pair of results before writing them back.
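A rough standalone sketch of what I mean, for the 32-bit case only. The names vcadd_pairwise, bytemask32, fn_sub and fn_add are made up for illustration and are not the helpers from the patch; the H() host-endian swizzle is left out (little-endian element order assumed), and the subtract/add pairing for FN0/FN1 is an assumption. Each even/odd pair reads only its own two input elements, so computing both results before storing either is enough to stay safe when Qd aliases Qn or Qm.

```c
#include <assert.h>
#include <stdint.h>

#define NELEM 4 /* four 32-bit elements in a 128-bit vector */

/* Hypothetical stand-ins for FN0 and FN1 (assumed subtract/add pairing). */
static uint32_t fn_sub(uint32_t a, uint32_t b) { return a - b; }
static uint32_t fn_add(uint32_t a, uint32_t b) { return a + b; }

/* Expand the low four predicate bits into a per-byte mask for one element. */
static uint32_t bytemask32(uint16_t mask)
{
    uint32_t bm = 0;
    for (int i = 0; i < 4; i++) {
        if (mask & (1u << i)) {
            bm |= (uint32_t)0xff << (8 * i);
        }
    }
    return bm;
}

/*
 * Process one even/odd pair per iteration. Both results are computed
 * before either is written, so no full-vector temporary is needed.
 * mask carries one predicate bit per byte, as in the quoted macro.
 */
static void vcadd_pairwise(uint32_t *d, const uint32_t *n,
                           const uint32_t *m, uint16_t mask)
{
    assert(d && n && m);
    for (unsigned e = 0; e < NELEM; e += 2, mask >>= 8) {
        uint32_t r0 = fn_sub(n[e], m[e + 1]);  /* even element */
        uint32_t r1 = fn_add(n[e + 1], m[e]);  /* odd element */
        uint32_t bm0 = bytemask32(mask);
        uint32_t bm1 = bytemask32(mask >> 4);

        d[e] = (d[e] & ~bm0) | (r0 & bm0);
        d[e + 1] = (d[e + 1] & ~bm1) | (r1 & bm1);
    }
}
```

Whether this is worth doing in the macro depends on how much you care about dropping the temporary; the two-loop version is arguably easier to read.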
+/*
+ * VCADD Qd == Qm at size MO_32 is UNPREDICTABLE; we choose not to diagnose
+ * so we can reuse the DO_2OP macro. (Our implementation calculates the
+ * "expected" results in this case.)
+ */
You've done this elsewhere, though.
Either way,
Reviewed-by: Richard Henderson <[email protected]>
r~