Hi,

As the subject says, this patch rewrites the vmull[_high]_p8 Neon intrinsics
to use RTL builtins rather than inline assembly code. The compiler cannot see
inside an inline asm block, so using builtins allows for better scheduling and
optimization.
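
For readers less familiar with these intrinsics, here is a rough scalar
reference model of what they compute: a carry-less (polynomial, GF(2)) multiply
of each pair of 8-bit lanes, widened to 16 bits. It is a sketch for
illustration only and is not part of the patch. The names pmul8,
ref_vmull_p8 and ref_vmull_high_p8 are hypothetical, and plain arrays stand in
for the poly8x8_t/poly8x16_t/poly16x8_t vector types.

```c
#include <assert.h>
#include <stdint.h>

/* Carry-less multiply of two 8-bit polynomials over GF(2).
   Partial products are combined with XOR instead of addition.  */
uint16_t pmul8(uint8_t a, uint8_t b)
{
    uint16_t r = 0;
    for (int i = 0; i < 8; i++)
        if (b & (1u << i))
            r ^= (uint16_t)((uint16_t)a << i);
    return r;
}

/* Hypothetical model of vmull_p8: 8 input lanes, 8 widened output lanes.  */
void ref_vmull_p8(const uint8_t a[8], const uint8_t b[8], uint16_t out[8])
{
    for (int i = 0; i < 8; i++)
        out[i] = pmul8(a[i], b[i]);
}

/* Hypothetical model of vmull_high_p8: same operation, but on the
   upper halves (lanes 8..15) of two 16-lane inputs.  */
void ref_vmull_high_p8(const uint8_t a[16], const uint8_t b[16],
                       uint16_t out[8])
{
    ref_vmull_p8(a + 8, b + 8, out);
}

/* Small sanity check: (x + 1)^2 = x^2 + 1 over GF(2), i.e. 3 * 3 = 5.  */
void ref_self_check(void)
{
    assert(pmul8(3, 3) == 5);
}
```

A model like this can be handy for checking results from the intrinsics on
target, since the output of each lane depends only on the matching input lanes.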

Regression tested and bootstrapped on aarch64-none-linux-gnu and
aarch64_be-none-elf - no issues.

Ok for master?

Thanks,
Jonathan

----

gcc/ChangeLog:

2021-02-05  Jonathan Wright  <joanthan.wri...@arm.com>

        * config/aarch64/aarch64-simd-builtins.def: Add pmull[2]
        builtin generator macros.
        * config/aarch64/aarch64-simd.md (aarch64_pmullv8qi): Define.
        (aarch64_pmull_hiv16qi_insn): Define.
        (aarch64_pmull_hiv16qi): Define.
        * config/aarch64/arm_neon.h (vmull_high_p8): Use RTL builtin
        instead of inline asm.
        (vmull_p8): Likewise.

Attachment: rb14128.patch
