Hi Christophe, > -----Original Message----- > From: Christophe Lyon <christophe.l...@arm.com> > Sent: Tuesday, April 18, 2023 2:46 PM > To: gcc-patches@gcc.gnu.org; Kyrylo Tkachov <kyrylo.tkac...@arm.com>; > Richard Earnshaw <richard.earns...@arm.com>; Richard Sandiford > <richard.sandif...@arm.com> > Cc: Christophe Lyon <christophe.l...@arm.com> > Subject: [PATCH 00/22] arm: New framework for MVE intrinsics > > Hi, > > This is the beginning of a long patch series to change the way Arm MVE > intrinsics are implemented. The goal is to get rid of arm_mve.h, which > takes a long time to parse and compile. >
Thanks for doing this. It is a significant improvement to the MVE intrinsics and should address some of the biggest maintainability and scalability issues we have in that area. I'll be going through the patches one-by-one (I've looked at these offline already before), but the approach looks good to me at a high level. My hope is that we'll move all the intrinsics, including the Neon ones to use this framework in the future, but getting the framework in place first is a good major first step in that direction. Thanks, Kyrill > Roughly speaking, it's about using a framework very similar to what is > implemented for AArch64/SVE intrinsics. I haven't converted all the > intrinsics yet, but I think it would be good to start the conversion > when stage-1 reopens. > > * Factorizing names > One of the main implementation differences I noticed between SVE and > MVE is that mve.md provides only full builtin names at the moment, and > makes almost no use of "parameterized names" > (https://gcc.gnu.org/onlinedocs/gccint/Parameterized- > Names.html#Parameterized-Names). > > Without this, we'd need the builtin expander to use a large > switch/case of the form: > > switch (code) > case VADDQ_S: insn_code = code_for_mve_vaddq_s (...) > case VADDQ_U: insn_code = code_for_mve_vaddq_u (...) > case VSUBQ_S: insn_code = code_for_mve_vsubq_s (...) > case VSUBQ_U: insn_code = code_for_mve_vsubq_u (...) > .... > > so part of the work (which I called "factorize" in the commit > messages) is about replacing > > (define_insn "mve_vaddq_n_<supf><mode>" > with > (define_insn "@mve_<mve_insn>q_n_<supf><mode>" > with the help of a new iterator (mve_insn). > > Doing so makes it more obvious that some patterns are identical, > except for the instruction name. I took this opportunity to merge > them, so for instance I have a patch which merges add, sub and mul > patterns. Although not strictly necessary for the MVE intrinsics > restructuring work, this is a good opportunity to reduce such code > duplication (I did notice a few bugs during that process, which led me > to post a few small patches in the past months). Note that identical > patterns will probably remain after the series, they can be merged > later if we want. > > This factorization also implies the introduction of new iterators, but > also means that several existing ones become useless. These patches do > not remove them because it's a bit painful to reorder patches which > remove lines at some "random" places, leading to merge conflicts. It's > much simpler to write a big cleanup patch at the end of the serie to > remove all such useless iterators at once. > > * Intrinsic re-implementation > After intrinsic names have been factorized, the actual > re-implementation patch is small: > - add 1 line in each of arm-mve-builtins-base.{cc,def,h} describing > the intrinsic shape/signature, types and predicates involved, > RTX/unspec codes > - remove the intrinsic definitions from arm_mve.h > > The full series of ~140 patches is organized like this: > - patches 1 and 2 introduce the new framework > - new implementation of vreinterpretq > - new implementation of vuninitialized > - patch groups of varying size, consisting in: > - add a new "shape" if needed (e.g. unary, binary, ternary, ....) > - add framework support functions if needed > - factorize a set of intrinsics (at minimum, just make use of > parameterized-names) > - actual re-implementation of the intrinsics > > I kept patches small so the incremental progress is easy to follow and > check. I'll submit the patches in small groups, this first one will > make sure we agree on the implementation. > > Tested on arm-eabi with -mthumb/-mfloat-abi=hard/-march=armv8.1- > m.main+mve. > > To help reviewers, I suggest to compare arm-mve-builtins.cc with > aarch64-sve-builtins.cc. > > Christophe Lyon (22): > arm: move builtin function codes into general numberspace > arm: [MVE intrinsics] Add new framework > arm: [MVE intrinsics] Rework vreinterpretq > arm: [MVE intrinsics] Rework vuninitialized > arm: [MVE intrinsics] add binary_opt_n shape > arm: [MVE intrinsics] add unspec_based_mve_function_exact_insn > arm: [MVE intrinsics] factorize vadd vsubq vmulq > arm: [MVE intrinsics] rework vaddq vmulq vsubq > arm: [MVE intrinsics] add binary shape > arm: [MVE intrinsics] factorize vandq veorq vorrq vbicq > arm: [MVE intrinsics] rework vandq veorq > arm: [MVE intrinsics] add binary_orrq shape > arm: [MVE intrinsics] rework vorrq > arm: [MVE intrinsics] add unspec_mve_function_exact_insn > arm: [MVE intrinsics] add create shape > arm: [MVE intrinsics] factorize vcreateq > arm: [MVE intrinsics] rework vcreateq > arm: [MVE intrinsics] factorize several binary_m operations > arm: [MVE intrinsics] factorize several binary _n operations > arm: [MVE intrinsics] factorize several binary _m_n operations > arm: [MVE intrinsics] factorize several binary operations > arm: [MVE intrinsics] rework vhaddq vhsubq vmulhq vqaddq vqsubq > vqdmulhq vrhaddq vrmulhq > > gcc/config.gcc | 2 +- > gcc/config/arm/arm-builtins.cc | 237 +- > gcc/config/arm/arm-builtins.h | 1 + > gcc/config/arm/arm-c.cc | 42 +- > gcc/config/arm/arm-mve-builtins-base.cc | 163 + > gcc/config/arm/arm-mve-builtins-base.def | 50 + > gcc/config/arm/arm-mve-builtins-base.h | 47 + > gcc/config/arm/arm-mve-builtins-functions.h | 387 + > gcc/config/arm/arm-mve-builtins-shapes.cc | 529 ++ > gcc/config/arm/arm-mve-builtins-shapes.h | 47 + > gcc/config/arm/arm-mve-builtins.cc | 2013 ++++- > gcc/config/arm/arm-mve-builtins.def | 40 +- > gcc/config/arm/arm-mve-builtins.h | 672 +- > gcc/config/arm/arm-protos.h | 24 + > gcc/config/arm/arm.cc | 27 + > gcc/config/arm/arm_mve.h | 7581 +---------------- > gcc/config/arm/arm_mve_builtins.def | 6 - > gcc/config/arm/arm_mve_types.h | 1430 ---- > gcc/config/arm/iterators.md | 240 +- > gcc/config/arm/mve.md | 1747 +--- > gcc/config/arm/predicates.md | 4 + > gcc/config/arm/t-arm | 32 +- > gcc/config/arm/unspecs.md | 1 + > gcc/config/arm/vec-common.md | 8 +- > gcc/testsuite/g++.target/arm/mve.exp | 8 +- > .../arm/mve/general-c++/nomve_fp_1.c | 15 + > .../arm/mve/general-c++/vreinterpretq_1.C | 25 + > .../gcc.target/arm/mve/general-c/nomve_fp_1.c | 15 + > .../arm/mve/general-c/vreinterpretq_1.c | 25 + > 29 files changed, 4926 insertions(+), 10492 deletions(-) > create mode 100644 gcc/config/arm/arm-mve-builtins-base.cc > create mode 100644 gcc/config/arm/arm-mve-builtins-base.def > create mode 100644 gcc/config/arm/arm-mve-builtins-base.h > create mode 100644 gcc/config/arm/arm-mve-builtins-functions.h > create mode 100644 gcc/config/arm/arm-mve-builtins-shapes.cc > create mode 100644 gcc/config/arm/arm-mve-builtins-shapes.h > create mode 100644 gcc/testsuite/g++.target/arm/mve/general- > c++/nomve_fp_1.c > create mode 100644 gcc/testsuite/g++.target/arm/mve/general- > c++/vreinterpretq_1.C > create mode 100644 gcc/testsuite/gcc.target/arm/mve/general- > c/nomve_fp_1.c > create mode 100644 gcc/testsuite/gcc.target/arm/mve/general- > c/vreinterpretq_1.c > > -- > 2.34.1