On 8/11/24 15:43, Joe Hattori wrote:
Arm's matrix multiply-accumulate intrinsics take two vectors of 8-bit
integers and accumulate their products into a vector of 32-bit integers.
The current emulation overflows when large 8-bit integers are used. This
commit fixes the issue by casting the 8-bit integers to 32-bit integers
before the multiplication.
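
To illustrate the overflow described above, here is a minimal sketch in C
of a single lane of a signed multiply-accumulate. It is not the code from
the patch: the helper names mac_narrow() and mac_widened() are
hypothetical, and the narrow variant only models what happens when the
product is kept in 8 bits before being added to the accumulator.

```c
#include <stdint.h>

/*
 * Hypothetical model of the overflowing behaviour: the product is
 * reduced to 8 bits before it reaches the 32-bit accumulator.
 * The truncation goes through uint8_t so the wrap-around is well
 * defined; the final conversion to int8_t is implementation-defined
 * in C for values above 127 (GCC and Clang wrap).
 */
static int32_t mac_narrow(int32_t acc, int8_t a, int8_t b)
{
    uint8_t low = (uint8_t)((uint8_t)a * (uint8_t)b);
    return acc + (int8_t)low;
}

/*
 * Hypothetical model of the fix: both operands are cast to 32 bits
 * first, so the full product is accumulated. The largest magnitude,
 * (-128) * (-128) = 16384, fits easily in an int32_t.
 */
static int32_t mac_widened(int32_t acc, int8_t a, int8_t b)
{
    return acc + (int32_t)a * (int32_t)b;
}
```

With a = b = 100, the widened helper accumulates 10000, while the narrow
one accumulates only the low 8 bits of the product.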
Fixes: 2323c5ffd4b5 ("