Hi Tamar,
On Thu, 10 Jan 2019 at 16:41, Tamar Christina <tamar.christ...@arm.com> wrote: > > Hi Christoph, > > It was introduced in a small refactoring after which I only retested the > testcases I added,which don't trigger the issue. > > In any case it's a trivial fix and I'll submit a patch in a bit. > > Tamar > > ________________________________________ > From: Christophe Lyon <christophe.l...@linaro.org> > Sent: Thursday, January 10, 2019 3:35:18 PM > To: Tamar Christina > Cc: Kyrill Tkachov; gcc-patches@gcc.gnu.org; nd; Ramana Radhakrishnan; > Richard Earnshaw; ni...@redhat.com > Subject: Re: [PATCH 9/9][GCC][Arm] Add ACLE intrinsics for complex > mutliplication and addition > > Hi Tamar, > > > On Thu, 10 Jan 2019 at 04:44, Tamar Christina <tamar.christ...@arm.com> wrote: > > > > Hi Kyrill, > > > > Committed with a the addition of a few trivial defines and iterators that > > were missing due to > > The patch being split. > > > > Thanks, > > Tamar > > > > -----Original Message----- > > From: Kyrill Tkachov <kyrylo.tkac...@foss.arm.com> > > Sent: Friday, December 21, 2018 11:40 AM > > To: Tamar Christina <tamar.christ...@arm.com>; gcc-patches@gcc.gnu.org > > Cc: nd <n...@arm.com>; Ramana Radhakrishnan <ramana.radhakrish...@arm.com>; > > Richard Earnshaw <richard.earns...@arm.com>; ni...@redhat.com > > Subject: Re: [PATCH 9/9][GCC][Arm] Add ACLE intrinsics for complex > > mutliplication and addition > > > > Hi Tamar, > > > > On 11/12/18 15:46, Tamar Christina wrote: > > > Hi All, > > > > > > This patch adds NEON intrinsics and tests for the Armv8.3-a complex > > > multiplication and add instructions with a rotate along the Argand plane. > > > > > > The instructions are documented in the ArmARM[1] and the intrinsics > > > specification will be published on the Arm website [2]. > > > > > > The Lane versions of these instructions are special in that they always > > > select a pair. > > > using index 0 means selecting lane 0 and 1. Because of this the range > > > check for the intrinsics require special handling. > > > > > > On Arm, in order to implement some of the lane intrinsics we're using > > > the structure of the register file. The lane variant of these > > > instructions always select a D register, but the data itself can be > > > stored in Q registers. This means that for single precision complex > > > numbers you are only allowed to select D[0] but using the register file > > > layout you can get the range 0-1 for lane indices by selecting between > > > Dn[0] and Dn+1[0]. > > > > > > Same reasoning applies for half float complex numbers, except there > > > your D register indexes can be 0 or 1, so you have a total range of 4 > > > elements (for a V8HF). > > > > > > > > > [1] > > > https://developer.arm.com/docs/ddi0487/latest/arm-architecture-referen > > > ce-manual-armv8-for-armv8-a-architecture-profile > > > [2] https://developer.arm.com/docs/101028/latest > > > > > > Bootstrapped Regtested on arm-none-gnueabihf and no issues. > > > > > > Ok for trunk? > > > > > > > Ok. > > Thanks, > > Kyrill > > > > > Thanks, > > > Tamar > > > > > > gcc/ChangeLog: > > > > > > 2018-12-11 Tamar Christina <tamar.christ...@arm.com> > > > > > > * config/arm/arm-builtins.c > > > (enum arm_type_qualifiers): Add qualifier_lane_pair_index. > > > (MAC_LANE_PAIR_QUALIFIERS): New. > > > (arm_expand_builtin_args): Use it. > > > (arm_expand_builtin_1): Likewise. > > > * config/arm/arm-protos.h (neon_vcmla_lane_prepare_operands): New. > > > * config/arm/arm.c (neon_vcmla_lane_prepare_operands): New. > > > * config/arm/arm-c.c (arm_cpu_builtins): Add > > > __ARM_FEATURE_COMPLEX. > > > * config/arm/arm_neon.h: > > > (vcadd_rot90_f16): New. > > > (vcaddq_rot90_f16): New. > > > (vcadd_rot270_f16): New. > > > (vcaddq_rot270_f16): New. > > > (vcmla_f16): New. > > > (vcmlaq_f16): New. > > > (vcmla_lane_f16): New. > > > (vcmla_laneq_f16): New. > > > (vcmlaq_lane_f16): New. > > > (vcmlaq_laneq_f16): New. > > > (vcmla_rot90_f16): New. > > > (vcmlaq_rot90_f16): New. > > > (vcmla_rot90_lane_f16): New. > > > (vcmla_rot90_laneq_f16): New. > > > (vcmlaq_rot90_lane_f16): New. > > > (vcmlaq_rot90_laneq_f16): New. > > > (vcmla_rot180_f16): New. > > > (vcmlaq_rot180_f16): New. > > > (vcmla_rot180_lane_f16): New. > > > (vcmla_rot180_laneq_f16): New. > > > (vcmlaq_rot180_lane_f16): New. > > > (vcmlaq_rot180_laneq_f16): New. > > > (vcmla_rot270_f16): New. > > > (vcmlaq_rot270_f16): New. > > > (vcmla_rot270_lane_f16): New. > > > (vcmla_rot270_laneq_f16): New. > > > (vcmlaq_rot270_lane_f16): New. > > > (vcmlaq_rot270_laneq_f16): New. > > > (vcadd_rot90_f32): New. > > > (vcaddq_rot90_f32): New. > > > (vcadd_rot270_f32): New. > > > (vcaddq_rot270_f32): New. > > > (vcmla_f32): New. > > > (vcmlaq_f32): New. > > > (vcmla_lane_f32): New. > > > (vcmla_laneq_f32): New. > > > (vcmlaq_lane_f32): New. > > > (vcmlaq_laneq_f32): New. > > > (vcmla_rot90_f32): New. > > > (vcmlaq_rot90_f32): New. > > > (vcmla_rot90_lane_f32): New. > > > (vcmla_rot90_laneq_f32): New. > > > (vcmlaq_rot90_lane_f32): New. > > > (vcmlaq_rot90_laneq_f32): New. > > > (vcmla_rot180_f32): New. > > > (vcmlaq_rot180_f32): New. > > > (vcmla_rot180_lane_f32): New. > > > (vcmla_rot180_laneq_f32): New. > > > (vcmlaq_rot180_lane_f32): New. > > > (vcmlaq_rot180_laneq_f32): New. > > > (vcmla_rot270_f32): New. > > > (vcmlaq_rot270_f32): New. > > > (vcmla_rot270_lane_f32): New. > > > (vcmla_rot270_laneq_f32): New. > > > (vcmlaq_rot270_lane_f32): New. > > > (vcmlaq_rot270_laneq_f32): New. > > > * config/arm/arm_neon_builtins.def (vcadd90, vcadd270, vcmla0, > > > vcmla90, > > > vcmla180, vcmla270, vcmla_lane0, vcmla_lane90, vcmla_lane180, > > > vcmla_lane270, > > > vcmla_laneq0, vcmla_laneq90, vcmla_laneq180, vcmla_laneq270, > > > vcmlaq_lane0, vcmlaq_lane90, vcmlaq_lane180, vcmlaq_lane270): New. > > > * config/arm/neon.md (neon_vcmla_lane<rot><mode>, > > > neon_vcmla_laneq<rot><mode>, neon_vcmlaq_lane<rot><mode>): New. > > > > > > gcc/testsuite/ChangeLog: > > > > > > 2018-12-11 Tamar Christina <tamar.christ...@arm.com> > > > > > > * gcc.target/aarch64/advsimd-intrinsics/vector-complex.c: Add > > > AArch32 regexpr. > > > * gcc.target/aarch64/advsimd-intrinsics/vector-complex_f16.c: > > > Likewise. > > > > > > -- > > > > Since r267796, I've noticed a regression on aarch64: > FAIL: gcc.target/aarch64/pr68674.c (test for excess errors) > Excess errors: > /home/tcwg-buildslave/workspace/tcwg-buildfarm_0/_build/builds/aarch64-unknown-linux-gnu/aarch64-unknown-linux-gnu/gcc.git~master_rev_9ccac37030d1cce880d7df7a5716fb56f89a67f6-stage2/gcc/include/arm_neon.h:33361:10: > error: incompatible types when returning type 'int' but 'float16x4_t' > was expected > /home/tcwg-buildslave/workspace/tcwg-buildfarm_0/_build/builds/aarch64-unknown-linux-gnu/aarch64-unknown-linux-gnu/gcc.git~master_rev_9ccac37030d1cce880d7df7a5716fb56f89a67f6-stage2/gcc/include/arm_neon.h:33385:10: > error: incompatible types when returning type 'int' but 'float16x4_t' > was expected > /home/tcwg-buildslave/workspace/tcwg-buildfarm_0/_build/builds/aarch64-unknown-linux-gnu/aarch64-unknown-linux-gnu/gcc.git~master_rev_9ccac37030d1cce880d7df7a5716fb56f89a67f6-stage2/gcc/include/arm_neon.h:33423:10: > error: incompatible types when returning type 'int' but 'float16x4_t' > was expected > /home/tcwg-buildslave/workspace/tcwg-buildfarm_0/_build/builds/aarch64-unknown-linux-gnu/aarch64-unknown-linux-gnu/gcc.git~master_rev_9ccac37030d1cce880d7df7a5716fb56f89a67f6-stage2/gcc/include/arm_neon.h:33477:10: > error: incompatible types when returning type 'int' but 'float16x4_t' > was expected > /home/tcwg-buildslave/workspace/tcwg-buildfarm_0/_build/builds/aarch64-unknown-linux-gnu/aarch64-unknown-linux-gnu/gcc.git~master_rev_9ccac37030d1cce880d7df7a5716fb56f89a67f6-stage2/gcc/include/arm_neon.h:33595:10: > error: incompatible types when returning type 'int' but 'float32x2_t' > was expected > /home/tcwg-buildslave/workspace/tcwg-buildfarm_0/_build/builds/aarch64-unknown-linux-gnu/aarch64-unknown-linux-gnu/gcc.git~master_rev_9ccac37030d1cce880d7df7a5716fb56f89a67f6-stage2/gcc/include/arm_neon.h:33648:10: > error: incompatible types when returning type 'int' but 'float32x2_t' > was expected > /home/tcwg-buildslave/workspace/tcwg-buildfarm_0/_build/builds/aarch64-unknown-linux-gnu/aarch64-unknown-linux-gnu/gcc.git~master_rev_9ccac37030d1cce880d7df7a5716fb56f89a67f6-stage2/gcc/include/arm_neon.h:33701:10: > error: incompatible types when returning type 'int' but 'float32x2_t' > was expected > /home/tcwg-buildslave/workspace/tcwg-buildfarm_0/_build/builds/aarch64-unknown-linux-gnu/aarch64-unknown-linux-gnu/gcc.git~master_rev_9ccac37030d1cce880d7df7a5716fb56f89a67f6-stage2/gcc/include/arm_neon.h:33754:10: > error: incompatible types when returning type 'int' but 'float32x2_t' > was expected > > I'm surprised you didn't see this during validations? I've noticed other problems on arm-none-linux-gnueabihf: FAIL: gcc.target/aarch64/advsimd-intrinsics/vector-complex_f16.c -O0 (test for excess errors) Excess errors: /aci-gcc-fsf/builds/gcc-fsf-gccsrc/obj-arm-none-linux-gnueabihf/gcc3/gcc/include/arm_neon.h:18323:10: error: this builtin is not supported for this target [....] The testcase is compiled with -mfp16-format=ieee -march=armv8.3-a -O2 -march=armv8.3-a+fp16 In addition, guess what, some scan-assembler-times directives fail on big-endian..... on armeb-none-linux-gnueabihf : gcc.target/aarch64/advsimd-intrinsics/vector-complex.c -O0 : vcmla.f32\\td[0-9]+, d[0-9]+, d[0-9]+\\[0\\], #0 found 1 times FAIL: gcc.target/aarch64/advsimd-intrinsics/vector-complex.c -O0 scan-assembler-times vcmla.f32\\td[0-9]+, d[0-9]+, d[0-9]+\\[0\\], #0 2 gcc.target/aarch64/advsimd-intrinsics/vector-complex.c -O0 : vcmla.f32\\td[0-9]+, d[0-9]+, d[0-9]+\\[0\\], #180 found 1 times FAIL: gcc.target/aarch64/advsimd-intrinsics/vector-complex.c -O0 scan-assembler-times vcmla.f32\\td[0-9]+, d[0-9]+, d[0-9]+\\[0\\], #180 2 gcc.target/aarch64/advsimd-intrinsics/vector-complex.c -O0 : vcmla.f32\\td[0-9]+, d[0-9]+, d[0-9]+\\[0\\], #270 found 1 times FAIL: gcc.target/aarch64/advsimd-intrinsics/vector-complex.c -O0 scan-assembler-times vcmla.f32\\td[0-9]+, d[0-9]+, d[0-9]+\\[0\\], #270 2 gcc.target/aarch64/advsimd-intrinsics/vector-complex.c -O0 : vcmla.f32\\td[0-9]+, d[0-9]+, d[0-9]+\\[0\\], #90 found 1 times FAIL: gcc.target/aarch64/advsimd-intrinsics/vector-complex.c -O0 scan-assembler-times vcmla.f32\\td[0-9]+, d[0-9]+, d[0-9]+\\[0\\], #90 2 gcc.target/aarch64/advsimd-intrinsics/vector-complex.c -O0 : vcmla.f32\\tq[0-9]+, q[0-9]+, d[0-9]+\\[0\\], #0 found 0 times FAIL: gcc.target/aarch64/advsimd-intrinsics/vector-complex.c -O0 scan-assembler-times vcmla.f32\\tq[0-9]+, q[0-9]+, d[0-9]+\\[0\\], #0 2 gcc.target/aarch64/advsimd-intrinsics/vector-complex.c -O0 : vcmla.f32\\tq[0-9]+, q[0-9]+, d[0-9]+\\[0\\], #180 found 0 times FAIL: gcc.target/aarch64/advsimd-intrinsics/vector-complex.c -O0 scan-assembler-times vcmla.f32\\tq[0-9]+, q[0-9]+, d[0-9]+\\[0\\], #180 2 gcc.target/aarch64/advsimd-intrinsics/vector-complex.c -O0 : vcmla.f32\\tq[0-9]+, q[0-9]+, d[0-9]+\\[0\\], #270 found 0 times FAIL: gcc.target/aarch64/advsimd-intrinsics/vector-complex.c -O0 scan-assembler-times vcmla.f32\\tq[0-9]+, q[0-9]+, d[0-9]+\\[0\\], #270 2 gcc.target/aarch64/advsimd-intrinsics/vector-complex.c -O0 : vcmla.f32\\tq[0-9]+, q[0-9]+, d[0-9]+\\[0\\], #90 found 0 times FAIL: gcc.target/aarch64/advsimd-intrinsics/vector-complex.c -O0 scan-assembler-times vcmla.f32\\tq[0-9]+, q[0-9]+, d[0-9]+\\[0\\], #90 2 On aarch64_be, I'm see ICEs: /gcc/testsuite/gcc.target/aarch64/advsimd-intrinsics/vector-complex.c: In function 'test_vcmla_laneq_f32': /gcc/testsuite/gcc.target/aarch64/advsimd-intrinsics/vector-complex.c:78:1: internal compiler error: Segmentation fault 0xc3967f crash_signal /gcc/toplev.c:326 0xa70718 mark_jump_label_1 /gcc/jump.c:1087 0xa707fb mark_jump_label_1 /gcc/jump.c:1212 0xa707fb mark_jump_label_1 /gcc/jump.c:1212 0xa70c62 mark_all_labels /gcc/jump.c:332 0xa70c62 rebuild_jump_labels_1 /gcc/jump.c:74 0x78c6af execute /gcc/cfgexpand.c:6549 Please submit a full bug report, with preprocessed source if appropriate. Please include the complete backtrace with any bug report. See <https://gcc.gnu.org/bugs/> for instructions. and similar for vector-complex_f16.c Maybe you've already fixed this later in the series? Happy new year :) Christophe