On Wed, 6 Jul 2022 at 10:11, Richard Henderson
<[email protected]> wrote:
>
> We can reuse the SVE functions for implementing moves to/from
> horizontal tile slices, but we need new ones for moves to/from
> vertical tile slices.
>
> Signed-off-by: Richard Henderson <[email protected]>
> +/*
> + * Move Zreg vector to ZArray column.
> + */
> +#define DO_MOVA_C(NAME, TYPE, H) \
> +void HELPER(NAME)(void *za, void *vn, void *vg, uint32_t desc) \
> +{ \
> + int i, oprsz = simd_oprsz(desc); \
> + for (i = 0; i < oprsz; ) { \
> + uint16_t pg = *(uint16_t *)(vg + H1_2(i >> 3)); \
> + do { \
> + if (pg & 1) { \
> + *(TYPE *)(za + tile_vslice_offset(i)) = *(TYPE *)(vn + H(i)); \
> + } \
> + i += sizeof(TYPE); \
> + pg >>= sizeof(TYPE); \
> + } while (i & 15); \
> + } \
> +}
> +
> +DO_MOVA_C(sme_mova_cz_b, uint8_t, H1)
> +DO_MOVA_C(sme_mova_cz_h, uint16_t, H2)
> +DO_MOVA_C(sme_mova_cz_s, uint32_t, H4)
i is a byte offset in this loop, so shouldn't the _h and _s variants be
using H1_2 and H1_4 rather than H2 and H4?
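
To illustrate the concern, here is a minimal sketch of how the two macro
families differ on a big-endian host. The BE_* names and the helper
functions are hypothetical, made up for this example; the macro bodies
assume the big-endian definitions in target/arm/vec_internal.h, written
out unconditionally so the check behaves the same on any host.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Assumed big-endian-host forms of the swizzle macros (hypothetical
 * BE_ prefix; bodies assumed to match target/arm/vec_internal.h).
 */
#define BE_H2(x)   ((x) ^ 3)   /* x is a 16-bit element index */
#define BE_H4(x)   ((x) ^ 1)   /* x is a 32-bit element index */
#define BE_H1_2(x) ((x) ^ 6)   /* x is the byte offset of a 16-bit element */
#define BE_H1_4(x) ((x) ^ 4)   /* x is the byte offset of a 32-bit element */

/* Reference host byte offset, derived from the element index. */
static size_t ref_offset_h(size_t i) { return BE_H2(i / 2) * 2; }
static size_t ref_offset_s(size_t i) { return BE_H4(i / 4) * 4; }

/*
 * Count the byte offsets in one 16-byte segment for which the chosen
 * macro disagrees with the reference offset (16-bit elements).
 */
static int mismatches_h(int use_byte_macro)
{
    int bad = 0;
    for (size_t i = 0; i < 16; i += 2) {
        size_t got = use_byte_macro ? BE_H1_2(i) : BE_H2(i);
        if (got != ref_offset_h(i)) {
            bad++;
        }
    }
    return bad;
}

/* Same check for 32-bit elements. */
static int mismatches_s(int use_byte_macro)
{
    int bad = 0;
    for (size_t i = 0; i < 16; i += 4) {
        size_t got = use_byte_macro ? BE_H1_4(i) : BE_H4(i);
        if (got != ref_offset_s(i)) {
            bad++;
        }
    }
    return bad;
}
```

Under those assumptions, feeding a byte offset to the element-index
macros yields an odd (misaligned) offset for every element, while the
H1_2/H1_4 forms agree with the element-index calculation. On a
little-endian host all of these macros are the identity, so the
difference would not show up in testing there.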
> +/*
> + * Move ZArray column to Zreg vector.
> + */
> +#define DO_MOVA_Z(NAME, TYPE, H) \
> +void HELPER(NAME)(void *vd, void *za, void *vg, uint32_t desc) \
> +{ \
> + int i, oprsz = simd_oprsz(desc); \
> + for (i = 0; i < oprsz; ) { \
> + uint16_t pg = *(uint16_t *)(vg + H1_2(i >> 3)); \
> + do { \
> + if (pg & 1) { \
> + *(TYPE *)(vd + H(i)) = *(TYPE *)(za + tile_vslice_offset(i)); \
> + } \
> + i += sizeof(TYPE); \
> + pg >>= sizeof(TYPE); \
> + } while (i & 15); \
> + } \
> +}
> +
> +DO_MOVA_Z(sme_mova_zc_b, uint8_t, H1)
> +DO_MOVA_Z(sme_mova_zc_h, uint16_t, H2)
> +DO_MOVA_Z(sme_mova_zc_s, uint32_t, H4)
Similarly here: shouldn't the _h and _s variants use H1_2 and H1_4?
Otherwise
Reviewed-by: Peter Maydell <[email protected]>
thanks
-- PMM