Hi,

As subject, this patch rewrites the vmull[_high]_p8 Neon intrinsics to use RTL builtins rather than inline assembly code, allowing for better scheduling and optimization.
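
For reference, here is a minimal scalar sketch of what the two intrinsics compute per lane: a carry-less (polynomial) multiply over GF(2), widening 8-bit inputs to 16-bit results. It is not code from the patch. The names poly_mul8, vmull_p8_model and vmull_high_p8_model are hypothetical and used only for illustration, and plain arrays stand in for the Neon vector types.

```c
#include <stdint.h>

/* Hypothetical helper (not part of the patch): multiply two 8-bit
   polynomials over GF(2).  Partial products are combined with XOR
   instead of addition, so no carries propagate between bits.  */
static uint16_t poly_mul8(uint8_t a, uint8_t b)
{
    uint16_t result = 0;
    for (int bit = 0; bit < 8; bit++) {
        if (b & (1u << bit))
            result ^= (uint16_t)((uint16_t)a << bit);
    }
    return result;
}

/* Scalar model of vmull_p8: eight 8-bit lanes in, eight 16-bit lanes out.  */
static void vmull_p8_model(const uint8_t a[8], const uint8_t b[8],
                           uint16_t out[8])
{
    for (int lane = 0; lane < 8; lane++)
        out[lane] = poly_mul8(a[lane], b[lane]);
}

/* Scalar model of vmull_high_p8: same operation applied to the upper
   eight lanes (8..15) of two 16-lane inputs.  */
static void vmull_high_p8_model(const uint8_t a[16], const uint8_t b[16],
                                uint16_t out[8])
{
    vmull_p8_model(a + 8, b + 8, out);
}
```

For example, poly_mul8(0x03, 0x03) yields 0x05, since (x + 1)^2 = x^2 + 1 over GF(2), where an ordinary integer multiply would give 0x09.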

Regression tested and bootstrapped on aarch64-none-linux-gnu and aarch64_be-none-elf - no issues.

Ok for master?

Thanks,
Jonathan

----

gcc/ChangeLog:

2021-02-05  Jonathan Wright  <joanthan.wri...@arm.com>

	* config/aarch64/aarch64-simd-builtins.def: Add pmull[2] builtin generator macros.
	* config/aarch64/aarch64-simd.md (aarch64_pmullv8qi): Define.
	(aarch64_pmull_hiv16qi_insn): Define.
	(aarch64_pmull_hiv16qi): Define.
	* config/aarch64/arm_neon.h (vmull_high_p8): Use RTL builtin instead of inline asm.
	(vmull_p8): Likewise.

rb14128.patch