On Mon, 30 Jul 2018 at 18:09, Segher Boessenkool <seg...@kernel.crashing.org> wrote: > > On Tue, Jul 24, 2018 at 05:18:41PM +0000, Segher Boessenkool wrote: > > This patch allows combine to combine two insns into two. This helps > > in many cases, by reducing instruction path length, and also allowing > > further combinations to happen. PR85160 is a typical example of code > > that it can improve. > > > > This patch does not allow such combinations if either of the original > > instructions was a simple move instruction. In those cases combining > > the two instructions increases register pressure without improving the > > code. With this move test register pressure does no longer increase > > noticably as far as I can tell. > > > > (At first I also didn't allow either of the resulting insns to be a > > move instruction. But that is actually a very good thing to have, as > > should have been obvious). > > > > Tested for many months; tested on about 30 targets. > > > > I'll commit this later this week if there are no objections. > > Done now, with the testcase at > https://gcc.gnu.org/ml/gcc-patches/2018-07/msg01856.html . >
Hi, Since this was committed, I've noticed regressions on aarch64: FAIL: gcc.dg/zero_bits_compound-1.c scan-assembler-not \\(and: on arm-none-linux-gnueabi FAIL: gfortran.dg/actual_array_constructor_1.f90 -O1 execution test On aarch64, I've also noticed a few others regressions but I'm not yet 100% sure it's caused by this patch (bisect running): gcc.target/aarch64/ashltidisi.c scan-assembler-times asr 4 gcc.target/aarch64/sve/var_stride_2.c -march=armv8.2-a+sve scan-assembler-times \\tadd\\tx[0-9]+, x[0-9]+, x[0-9]+, lsl 10\\n 2 gcc.target/aarch64/sve/var_stride_4.c -march=armv8.2-a+sve scan-assembler-times \\tlsl\\tx[0-9]+, x[0-9]+, 10\\n 2 gcc.target/aarch64/sve/vcond_4.c -march=armv8.2-a+sve scan-assembler-times \\tfcmeq\\tp[0-9]+\\.d, p[0-7]/z, z[0-9]+\\.d, #0\\.0\\n 7 gcc.target/aarch64/sve/vcond_4.c -march=armv8.2-a+sve scan-assembler-times \\tfcmeq\\tp[0-9]+\\.d, p[0-7]/z, z[0-9]+\\.d, z[0-9]+\\.d\\n 14 gcc.target/aarch64/sve/vcond_4.c -march=armv8.2-a+sve scan-assembler-times \\tfcmeq\\tp[0-9]+\\.s, p[0-7]/z, z[0-9]+\\.s, #0\\.0\\n 5 gcc.target/aarch64/sve/vcond_4.c -march=armv8.2-a+sve scan-assembler-times \\tfcmeq\\tp[0-9]+\\.s, p[0-7]/z, z[0-9]+\\.s, z[0-9]+\\.s\\n 10 gcc.target/aarch64/sve/vcond_4.c -march=armv8.2-a+sve scan-assembler-times \\tfcmge\\tp[0-9]+\\.d, p[0-7]/z, z[0-9]+\\.d, #0\\.0\\n 21 gcc.target/aarch64/sve/vcond_4.c -march=armv8.2-a+sve scan-assembler-times \\tfcmge\\tp[0-9]+\\.d, p[0-7]/z, z[0-9]+\\.d, z[0-9]+\\.d\\n 42 gcc.target/aarch64/sve/vcond_4.c -march=armv8.2-a+sve scan-assembler-times \\tfcmge\\tp[0-9]+\\.s, p[0-7]/z, z[0-9]+\\.s, #0\\.0\\n 15 gcc.target/aarch64/sve/vcond_4.c -march=armv8.2-a+sve scan-assembler-times \\tfcmge\\tp[0-9]+\\.s, p[0-7]/z, z[0-9]+\\.s, z[0-9]+\\.s\\n 30 gcc.target/aarch64/sve/vcond_4.c -march=armv8.2-a+sve scan-assembler-times \\tfcmgt\\tp[0-9]+\\.d, p[0-7]/z, z[0-9]+\\.d, #0\\.0\\n 21 gcc.target/aarch64/sve/vcond_4.c -march=armv8.2-a+sve scan-assembler-times \\tfcmgt\\tp[0-9]+\\.d, p[0-7]/z, z[0-9]+\\.d, z[0-9]+\\.d\\n 42 gcc.target/aarch64/sve/vcond_4.c -march=armv8.2-a+sve scan-assembler-times \\tfcmgt\\tp[0-9]+\\.s, p[0-7]/z, z[0-9]+\\.s, #0\\.0\\n 15 gcc.target/aarch64/sve/vcond_4.c -march=armv8.2-a+sve scan-assembler-times \\tfcmgt\\tp[0-9]+\\.s, p[0-7]/z, z[0-9]+\\.s, z[0-9]+\\.s\\n 30 gcc.target/aarch64/sve/vcond_4.c -march=armv8.2-a+sve scan-assembler-times \\tfcmle\\tp[0-9]+\\.d, p[0-7]/z, z[0-9]+\\.d, #0\\.0\\n 21 gcc.target/aarch64/sve/vcond_4.c -march=armv8.2-a+sve scan-assembler-times \\tfcmle\\tp[0-9]+\\.d, p[0-7]/z, z[0-9]+\\.d, z[0-9]+\\.d\\n 42 gcc.target/aarch64/sve/vcond_4.c -march=armv8.2-a+sve scan-assembler-times \\tfcmle\\tp[0-9]+\\.s, p[0-7]/z, z[0-9]+\\.s, #0\\.0\\n 15 gcc.target/aarch64/sve/vcond_4.c -march=armv8.2-a+sve scan-assembler-times \\tfcmle\\tp[0-9]+\\.s, p[0-7]/z, z[0-9]+\\.s, z[0-9]+\\.s\\n 30 gcc.target/aarch64/sve/vcond_4.c -march=armv8.2-a+sve scan-assembler-times \\tfcmlt\\tp[0-9]+\\.d, p[0-7]/z, z[0-9]+\\.d, #0\\.0\\n 21 gcc.target/aarch64/sve/vcond_4.c -march=armv8.2-a+sve scan-assembler-times \\tfcmlt\\tp[0-9]+\\.d, p[0-7]/z, z[0-9]+\\.d, z[0-9]+\\.d\\n 42 gcc.target/aarch64/sve/vcond_4.c -march=armv8.2-a+sve scan-assembler-times \\tfcmlt\\tp[0-9]+\\.s, p[0-7]/z, z[0-9]+\\.s, #0\\.0\\n 15 gcc.target/aarch64/sve/vcond_4.c -march=armv8.2-a+sve scan-assembler-times \\tfcmlt\\tp[0-9]+\\.s, p[0-7]/z, z[0-9]+\\.s, z[0-9]+\\.s\\n 30 gcc.target/aarch64/sve/vcond_4.c -march=armv8.2-a+sve scan-assembler-times \\tfcmne\\tp[0-9]+\\.d, p[0-7]/z, z[0-9]+\\.d, #0\\.0\\n 21 gcc.target/aarch64/sve/vcond_4.c -march=armv8.2-a+sve scan-assembler-times \\tfcmne\\tp[0-9]+\\.d, p[0-7]/z, z[0-9]+\\.d, z[0-9]+\\.d\\n 42 gcc.target/aarch64/sve/vcond_4.c -march=armv8.2-a+sve scan-assembler-times \\tfcmne\\tp[0-9]+\\.s, p[0-7]/z, z[0-9]+\\.s, #0\\.0\\n 15 gcc.target/aarch64/sve/vcond_4.c -march=armv8.2-a+sve scan-assembler-times \\tfcmne\\tp[0-9]+\\.s, p[0-7]/z, z[0-9]+\\.s, z[0-9]+\\.s\\n 30 gcc.target/aarch64/sve/vcond_4.c -march=armv8.2-a+sve scan-assembler-times \\tfcmuo\\tp[0-9]+\\.d, p[0-7]/z, z[0-9]+\\.d, z[0-9]+\\.d\\n 252 gcc.target/aarch64/sve/vcond_4.c -march=armv8.2-a+sve scan-assembler-times \\tfcmuo\\tp[0-9]+\\.s, p[0-7]/z, z[0-9]+\\.s, z[0-9]+\\.s\\n 180 gcc.target/aarch64/sve/vcond_5.c -march=armv8.2-a+sve scan-assembler-times \\tfcmeq\\tp[0-9]+\\.d, p[0-7]/z, z[0-9]+\\.d, #0\\.0 14 gcc.target/aarch64/sve/vcond_5.c -march=armv8.2-a+sve scan-assembler-times \\tfcmeq\\tp[0-9]+\\.d, p[0-7]/z, z[0-9]+\\.d, z[0-9]+\\.d 28 gcc.target/aarch64/sve/vcond_5.c -march=armv8.2-a+sve scan-assembler-times \\tfcmeq\\tp[0-9]+\\.s, p[0-7]/z, z[0-9]+\\.s, #0\\.0 10 gcc.target/aarch64/sve/vcond_5.c -march=armv8.2-a+sve scan-assembler-times \\tfcmeq\\tp[0-9]+\\.s, p[0-7]/z, z[0-9]+\\.s, z[0-9]+\\.s 20 gcc.target/aarch64/sve/vcond_5.c -march=armv8.2-a+sve scan-assembler-times \\tfcmge\\tp[0-9]+\\.d, p[0-7]/z, z[0-9]+\\.d, #0\\.0 21 gcc.target/aarch64/sve/vcond_5.c -march=armv8.2-a+sve scan-assembler-times \\tfcmge\\tp[0-9]+\\.d, p[0-7]/z, z[0-9]+\\.d, z[0-9]+\\.d 42 gcc.target/aarch64/sve/vcond_5.c -march=armv8.2-a+sve scan-assembler-times \\tfcmge\\tp[0-9]+\\.s, p[0-7]/z, z[0-9]+\\.s, #0\\.0 15 gcc.target/aarch64/sve/vcond_5.c -march=armv8.2-a+sve scan-assembler-times \\tfcmge\\tp[0-9]+\\.s, p[0-7]/z, z[0-9]+\\.s, z[0-9]+\\.s 30 gcc.target/aarch64/sve/vcond_5.c -march=armv8.2-a+sve scan-assembler-times \\tfcmgt\\tp[0-9]+\\.d, p[0-7]/z, z[0-9]+\\.d, #0\\.0 28 gcc.target/aarch64/sve/vcond_5.c -march=armv8.2-a+sve scan-assembler-times \\tfcmgt\\tp[0-9]+\\.d, p[0-7]/z, z[0-9]+\\.d, z[0-9]+\\.d 56 gcc.target/aarch64/sve/vcond_5.c -march=armv8.2-a+sve scan-assembler-times \\tfcmgt\\tp[0-9]+\\.s, p[0-7]/z, z[0-9]+\\.s, #0\\.0 20 gcc.target/aarch64/sve/vcond_5.c -march=armv8.2-a+sve scan-assembler-times \\tfcmgt\\tp[0-9]+\\.s, p[0-7]/z, z[0-9]+\\.s, z[0-9]+\\.s 40 gcc.target/aarch64/sve/vcond_5.c -march=armv8.2-a+sve scan-assembler-times \\tfcmle\\tp[0-9]+\\.d, p[0-7]/z, z[0-9]+\\.d, #0\\.0 21 gcc.target/aarch64/sve/vcond_5.c -march=armv8.2-a+sve scan-assembler-times \\tfcmle\\tp[0-9]+\\.d, p[0-7]/z, z[0-9]+\\.d, z[0-9]+\\.d 42 gcc.target/aarch64/sve/vcond_5.c -march=armv8.2-a+sve scan-assembler-times \\tfcmle\\tp[0-9]+\\.s, p[0-7]/z, z[0-9]+\\.s, #0\\.0 15 gcc.target/aarch64/sve/vcond_5.c -march=armv8.2-a+sve scan-assembler-times \\tfcmle\\tp[0-9]+\\.s, p[0-7]/z, z[0-9]+\\.s, z[0-9]+\\.s 30 gcc.target/aarch64/sve/vcond_5.c -march=armv8.2-a+sve scan-assembler-times \\tfcmlt\\tp[0-9]+\\.d, p[0-7]/z, z[0-9]+\\.d, #0\\.0 28 gcc.target/aarch64/sve/vcond_5.c -march=armv8.2-a+sve scan-assembler-times \\tfcmlt\\tp[0-9]+\\.d, p[0-7]/z, z[0-9]+\\.d, z[0-9]+\\.d 56 gcc.target/aarch64/sve/vcond_5.c -march=armv8.2-a+sve scan-assembler-times \\tfcmlt\\tp[0-9]+\\.s, p[0-7]/z, z[0-9]+\\.s, #0\\.0 20 gcc.target/aarch64/sve/vcond_5.c -march=armv8.2-a+sve scan-assembler-times \\tfcmlt\\tp[0-9]+\\.s, p[0-7]/z, z[0-9]+\\.s, z[0-9]+\\.s 40 gcc.target/aarch64/sve/vcond_5.c -march=armv8.2-a+sve scan-assembler-times \\tfcmne\\tp[0-9]+\\.d, p[0-7]/z, z[0-9]+\\.d, #0\\.0 7 gcc.target/aarch64/sve/vcond_5.c -march=armv8.2-a+sve scan-assembler-times \\tfcmne\\tp[0-9]+\\.d, p[0-7]/z, z[0-9]+\\.d, z[0-9]+\\.d 14 gcc.target/aarch64/sve/vcond_5.c -march=armv8.2-a+sve scan-assembler-times \\tfcmne\\tp[0-9]+\\.s, p[0-7]/z, z[0-9]+\\.s, #0\\.0 5 gcc.target/aarch64/sve/vcond_5.c -march=armv8.2-a+sve scan-assembler-times \\tfcmne\\tp[0-9]+\\.s, p[0-7]/z, z[0-9]+\\.s, z[0-9]+\\.s 10 gcc.target/aarch64/sve/vcond_5.c -march=armv8.2-a+sve scan-assembler-times \\tfcmuo\\tp[0-9]+\\.d, p[0-7]/z, z[0-9]+\\.d, z[0-9]+\\.d 63 gcc.target/aarch64/sve/vcond_5.c -march=armv8.2-a+sve scan-assembler-times \\tfcmuo\\tp[0-9]+\\.s, p[0-7]/z, z[0-9]+\\.s, z[0-9]+\\.s 45 > > Segher