Hi,

There is no way to perform scalar addition in the vector register file,
but with the RTX costs in place we start rewriting (x << 1) to (x + x)
on almost all cores. The code which makes this decision has no idea
where the operands will end up living (it runs well before reload), so
we get very ugly code generation in the case where addition was selected
but we are operating in vector registers.
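
For illustration, here is a minimal sketch of the kind of code affected.
The function name is hypothetical and this is not the testsuite code; the
empty asm with a "w" constraint simply pins the value to a SIMD/FP
register, so the shift (or the (x + x) it may be rewritten to) operates
on a value living in the vector register file:

  int
  shift_in_simd_reg (int x)
  {
    /* Force x into a SIMD/FP register.  */
    __asm__ ("" : "+w" (x));
    /* The RTX costs may cause this to be rewritten as x + x.  */
    return x << 1;
  }

Before the patch the selected addition typically has to bounce through
the general registers (something like an fmov out, an add, and an fmov
back in); with the new alternative it can stay in the V registers.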

This patch relies on the same gimmick we already use to allow shifts
on 32-bit scalars in the vector register file: use a vector 32x2
operation instead, knowing that we can safely ignore the top bits.
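
To illustrate the trick itself, here is a rough equivalent written with
NEON intrinsics (illustration only; the patch implements this directly
as an extra insn alternative, not via intrinsics):

  #include <arm_neon.h>

  /* Do a 32-bit scalar add as a two-lane vector add.  Lane 0 carries
     the result we care about; lane 1 is computed and then ignored,
     which is safe because it is never read.  */
  static inline int32_t
  add_via_2s (int32_t a, int32_t b)
  {
    int32x2_t va = vdup_n_s32 (a);
    int32x2_t vb = vdup_n_s32 (b);
    return vget_lane_s32 (vadd_s32 (va, vb), 0);
  }

In the insn pattern the operands are already sitting in the low 32 bits
of a V register, so no lane shuffling is needed at all; the .2s add just
computes an extra lane whose value is dead.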

This restores some normality to scalar_shift_1.c. However, the check
that we generate a left shift by one is clearly bogus, so remove it.

This patch is pretty ugly, but it does generate superficially
better-looking code for this testcase.

Tested on aarch64-none-elf with no issues.

OK for stage 1?

Thanks,
James

---
gcc/

2014-03-27  James Greenhalgh  <james.greenha...@arm.com>

        * config/aarch64/aarch64.md (*addsi3_aarch64): Add alternative in
        vector registers.

gcc/testsuite/
2014-03-27  James Greenhalgh  <james.greenha...@arm.com>

        * gcc.target/aarch64/scalar_shift_1.c: Fix expected assembler.
diff --git a/gcc/config/aarch64/aarch64.md b/gcc/config/aarch64/aarch64.md
index c86a29d8e7f8df21f25e14d22df1c3e8c37c907f..9c544a0a473732ebdf9238205db96d0d0c57de9a 100644
--- a/gcc/config/aarch64/aarch64.md
+++ b/gcc/config/aarch64/aarch64.md
@@ -1063,16 +1063,17 @@ (define_expand "add<mode>3"
 
 (define_insn "*addsi3_aarch64"
   [(set
-    (match_operand:SI 0 "register_operand" "=rk,rk,rk")
+    (match_operand:SI 0 "register_operand" "=rk,rk,w,rk")
     (plus:SI
-     (match_operand:SI 1 "register_operand" "%rk,rk,rk")
-     (match_operand:SI 2 "aarch64_plus_operand" "I,r,J")))]
+     (match_operand:SI 1 "register_operand" "%rk,rk,w,rk")
+     (match_operand:SI 2 "aarch64_plus_operand" "I,r,w,J")))]
   ""
   "@
   add\\t%w0, %w1, %2
   add\\t%w0, %w1, %w2
+  add\\t%0.2s, %1.2s, %2.2s
   sub\\t%w0, %w1, #%n2"
-  [(set_attr "type" "alu_imm,alu_reg,alu_imm")]
+  [(set_attr "type" "alu_imm,alu_reg,neon_add,alu_imm")]
 )
 
 ;; zero_extend version of above
diff --git a/gcc/testsuite/gcc.target/aarch64/scalar_shift_1.c b/gcc/testsuite/gcc.target/aarch64/scalar_shift_1.c
index 7cb17f8..826bafc 100644
--- a/gcc/testsuite/gcc.target/aarch64/scalar_shift_1.c
+++ b/gcc/testsuite/gcc.target/aarch64/scalar_shift_1.c
@@ -193,7 +193,6 @@ test_corners_sisd_di (Int64x1 b)
   return b;
 }
 /* { dg-final { scan-assembler "sshr\td\[0-9\]+,\ d\[0-9\]+,\ 63" } } */
-/* { dg-final { scan-assembler "shl\td\[0-9\]+,\ d\[0-9\]+,\ 1" } } */
 
 Int32x1
 test_corners_sisd_si (Int32x1 b)
@@ -207,7 +206,6 @@ test_corners_sisd_si (Int32x1 b)
   return b;
 }
 /* { dg-final { scan-assembler "sshr\tv\[0-9\]+\.2s,\ v\[0-9\]+\.2s,\ 31" } } */
-/* { dg-final { scan-assembler "shl\tv\[0-9\]+\.2s,\ v\[0-9\]+\.2s,\ 1" } } */
 
 
 
