05.03.2024 05:31, Richard Henderson wrote:
The while the 8-bit input elements are sequential in the input vector,
Nitpick: "The while the"
/mjt
On 3/4/24 23:19, Philippe Mathieu-Daudé wrote:
+ uint32_t *za_row = &za[H4(tile_vslice_index(row))];
+ uint32_t n = zn[H4(row)];
+
+ for (col = 0; col < oprsz; ++col) {
+ uint8_t pb = pm[H1(col >> 1)] >> ((col & 1) * 4);
+ uint32_t *a = &za_row[col];
Hi Richard,
On 5/3/24 03:31, Richard Henderson wrote:
The while the 8-bit input elements are sequential in the input vector,
the 32-bit output elements are not sequential in the output matrix.
Do not attempt to compute 2 32-bit outputs at the same time.
Cc: qemu-sta...@nongnu.org
Fixes: 23a5e3859f5 ("target/arm: Implement SME integer outer product")
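
For readers following the thread, here is a minimal sketch of the "one 32-bit output at a time" shape described in the commit message. It is an illustration under stated assumptions, not the code from the patch: the names sketch_dot4_s8 and sketch_outer_product are hypothetical, the output matrix is modelled as a flat array with an explicit row stride (standing in for the tile slice indexing in the quoted diff), the host-endian H1/H4 adjustments are left out, and only the per-column predicate nibble from the quoted diff is modelled.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper: accumulate the signed dot product of the four
 * 8-bit lanes packed in n and m into acc. Bit i of pred enables lane i.
 */
static uint32_t sketch_dot4_s8(uint32_t n, uint32_t m, uint8_t pred,
                               uint32_t acc)
{
    for (int i = 0; i < 4; i++) {
        if (pred & (1u << i)) {
            int32_t a = (int8_t)(uint8_t)(n >> (8 * i));
            int32_t b = (int8_t)(uint8_t)(m >> (8 * i));
            acc += (uint32_t)(a * b);
        }
    }
    return acc;
}

/*
 * Hypothetical outer product: each (row, col) pair produces exactly one
 * 32-bit output. Consecutive rows are `stride` elements apart, so outputs
 * for different rows are not adjacent in memory (assumed layout).
 */
static void sketch_outer_product(uint32_t *za, size_t stride,
                                 const uint32_t *zn, const uint32_t *zm,
                                 const uint8_t *pm,
                                 size_t rows, size_t cols)
{
    for (size_t row = 0; row < rows; row++) {
        uint32_t *za_row = &za[row * stride];
        uint32_t n = zn[row];

        for (size_t col = 0; col < cols; col++) {
            /* One predicate bit per byte: a 4-bit nibble per column. */
            uint8_t pb = (pm[col >> 1] >> ((col & 1) * 4)) & 0xf;

            za_row[col] = sketch_dot4_s8(n, zm[col], pb, za_row[col]);
        }
    }
}
```

The input bytes for one output sit together in a single 32-bit word. The outputs do not: those for different rows are a full stride apart, so the sketch never tries to produce two of them from one wider load.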