Hi, As subject, this patch rewrites the vpaddq Neon intrinsics to use RTL builtins rather than inline assembly code, allowing for better scheduling and optimization.
Regression tested and bootstrapped on aarch64-none-linux-gnu - no issues. Ok for master? Thanks, Jonathan --- gcc/ChangeLog: 2021-02-08 Jonathan Wright <jonathan.wri...@arm.com> * config/aarch64/aarch64-simd-builtins.def: Use VDQ_I iterator for aarch64_addp<mode> builtin macro generator. * config/aarch64/aarch64-simd.md: Use VDQ_I iterator in aarch64_addp<mode> RTL pattern. * config/aarch64/arm_neon.h (vpaddq_s8): Use RTL builtin instead of inline asm. (vpaddq_s16): Likewise. (vpaddq_s32): Likewise. (vpaddq_s64): Likewise. (vpaddq_u8): Likewise. (vpaddq_u16): Likewise. (vpaddq_u32): Likewise. (vpaddq_u64): Likewise.
rb14136.patch
Description: rb14136.patch