https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91246

d_vampile <d_vampile at 163 dot com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |d_vampile at 163 dot com

--- Comment #6 from d_vampile <d_vampile at 163 dot com> ---
(In reply to Jiangning Liu from comment #3)
> Expect to vectorize the inner loop by generating the code below for x86,
> 
> vpbroadcastd [mem], ymm0
> vpaddd [mem], ymm0, ymm1
> vpbroadcastd reg, ymm2
> vpcmpeqd ymm2, ymm1, k0
> kortestw k0, k0
> cmovne ...
> 
> AArch64 should have vectorization instructions counterpart to implement the
> same functionality.

I see that on x86, the result of a vpcmpeqb comparison can be collected into a
general-purpose register with the vpmovmskb instruction. Is there a similar
instruction on NEON for efficiently extracting the result of a vectorized
comparison?

For example, on x86:
..
vpcmpeqb %ymm0, %ymm1, %ymm0
vpmovmskb %ymm0, %ebx
cmp $0xffffffff, %ebx
..
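
To make clear what I mean by "recording the result", here is a rough scalar C
model of what that three-instruction sequence computes. This is only a sketch
for illustration, not what GCC emits: the function names byte_eq_mask_32 and
all_bytes_equal_32 are made up for this comment, and the 32-byte width simply
matches the ymm registers above.

```c
#include <stdint.h>

/* Scalar model of vpcmpeqb followed by vpmovmskb on 32-byte vectors:
   bit i of the result is set when a[i] == b[i].
   (Hypothetical helper name, for illustration only.) */
static uint32_t byte_eq_mask_32(const uint8_t a[32], const uint8_t b[32])
{
    uint32_t mask = 0;
    for (int i = 0; i < 32; i++) {
        /* vpcmpeqb writes 0xff into a byte lane when the bytes are equal,
           0x00 otherwise. */
        uint8_t lane = (a[i] == b[i]) ? 0xff : 0x00;
        /* vpmovmskb gathers the top bit of each byte lane into one
           bit of a general-purpose register. */
        mask |= (uint32_t)(lane >> 7) << i;
    }
    return mask;
}

/* Models the final compare against 0xffffffff:
   nonzero only when all 32 byte lanes compared equal. */
static int all_bytes_equal_32(const uint8_t a[32], const uint8_t b[32])
{
    return byte_eq_mask_32(a, b) == 0xffffffffu;
}
```

So the question is which NEON instruction (or short sequence) would give the
equivalent of all_bytes_equal_32, or of the full per-lane mask, without
falling back to a per-lane loop like the one above.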
