On Tue, Sep 3, 2024 at 2:24 PM Haochen Jiang wrote:
>
> Hi all,
>
> The intrinsic for the non-optimized case has a typo in its mask type, which
> causes the high bits of __mmask32 to be unexpectedly zeroed.
>
> The test does not fail under O0 with current 1b since the testcase is
> wrong. We need to include avx512-mask-type.h after SIZE is defined, or
> it will always be __mma