https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103750

Hongtao.liu <crazylht at gmail dot com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
  Attachment #52031|0                           |1
        is obsolete|                            |

--- Comment #14 from Hongtao.liu <crazylht at gmail dot com> ---
Created attachment 52032
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=52032&action=edit
update patch

Updated the patch. GCC can now generate optimal code.

For #c0

.L4:
        vmovdqu (%rdi), %ymm1
        vmovdqu16       32(%rdi), %ymm2
        vpcmpuw $0, %ymm0, %ymm1, %k1
        vpcmpuw $0, %ymm0, %ymm2, %k0
        kortestw        %k0, %k1
        je      .L10
        kortestw        %k1, %k1
        je      .L5
        kmovd   %k1, %eax



For #c6

.L4:
        vmovdqu (%rdi), %ymm2
        vmovdqu 32(%rdi), %ymm1
        vpcmpuw $0, %ymm0, %ymm2, %k3
        vpcmpuw $0, %ymm0, %ymm1, %k0
        kortestd        %k0, %k3
        je      .L10
        kortestw        %k3, %k3
        je      .L5
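
For readers less familiar with mask registers, here is a minimal scalar sketch in C of the branch logic shared by both listings. It assumes only the documented flag behaviour of kortest (ZF is set when the OR of the two masks is zero) and that predicate $0 of vpcmpuw means "equal". The names kortestw_zf, classify, check_model and the enum values are hypothetical, chosen for illustration; they do not come from the patch or the test cases. The sketch makes no claim about what the code at .L10 or .L5 does.

```c
#include <assert.h>
#include <stdint.h>

/* Models the ZF result of "kortestw a, b":
   ZF = 1 when the low 16 bits of (a | b) are all zero. */
static int kortestw_zf(uint16_t a, uint16_t b)
{
    return (uint16_t)(a | b) == 0;
}

/* Hypothetical names for the three paths out of the branch sequence. */
enum outcome { BOTH_EMPTY, FIRST_EMPTY, FIRST_NONEMPTY };

/* k_first:  mask from comparing the vector loaded at (%rdi)
   k_second: mask from comparing the vector loaded at 32(%rdi) */
static enum outcome classify(uint16_t k_first, uint16_t k_second)
{
    if (kortestw_zf(k_second, k_first))   /* kortest of both masks; je .L10 */
        return BOTH_EMPTY;
    if (kortestw_zf(k_first, k_first))    /* kortest of first mask; je .L5 */
        return FIRST_EMPTY;
    return FIRST_NONEMPTY;                /* falls through to the next instruction */
}

/* Small self-check covering each path. */
static void check_model(void)
{
    assert(classify(0x0000, 0x0000) == BOTH_EMPTY);
    assert(classify(0x0000, 0x0004) == FIRST_EMPTY);
    assert(classify(0x0100, 0x0000) == FIRST_NONEMPTY);
}
```

The point of the sequence is that the common "no element matched" case is decided by a single kortest on mask registers, without first moving either mask to a general-purpose register.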
