Hi all,
This patch fixes a wrong-code bug in the *aarch64_lshr_sisd_or_int_<mode>3
pattern and its associated splitters. The problem is that, for the 2nd
alternative, the right shift is split into a SISD left shift by the negated
shift amount (the ushl instruction allows such semantics).
The splitter generates this RTL:

  (set (match_dup 2)
       (unspec:QI [(match_dup 2)] UNSPEC_SISD_NEG))
  (set (match_dup 0)
       (unspec:SI [(match_dup 1) (match_dup 2)] UNSPEC_USHL_2S))
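
As a reminder, the ushl semantics the splitter relies on can be expressed
with NEON intrinsics roughly as follows (an illustrative sketch only, not
part of the patch; function and variable names are mine):

  #include <arm_neon.h>

  /* A logical right shift by n can be done as USHL (left shift)
     by the negated amount -n.  */
  unsigned int
  lshr_via_ushl (unsigned int a, int n)
  {
    uint32x2_t va = vdup_n_u32 (a);     /* value in a SIMD register */
    int32x2_t  vn = vdup_n_s32 (-n);    /* negated shift amount */
    return vget_lane_u32 (vshl_u32 (va, vn), 0);   /* == a >> n */
  }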
The problem here is that the shift amount register is negated without the
register allocator being told about it (and it cannot figure that out on its
own). So if the register that operand 2 is assigned to is used later on, it
holds the negated shift amount instead of the original value!
The testcase in the patch demonstrates simple code that can get miscompiled
due to this behaviour.
The solution in this patch is to negate the shift amount into the output
operand (operand 0) and mark it as earlyclobber in that alternative. This is
exactly what the very similar *aarch64_ashr_sisd_or_int_<mode>3 pattern below
already does. I believe this is the safest and simplest fix at this stage.
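
Concretely, after the patch the splitter emits (SI case shown; the DI case is
analogous), with operand 3 set to the QImode lowpart of operand 0:

  (set (match_dup 3)
       (unspec:QI [(match_dup 2)] UNSPEC_SISD_NEG))
  (set (match_dup 0)
       (unspec:SI [(match_dup 1) (match_dup 3)] UNSPEC_USHL_2S))

so the negation now lands in the (earlyclobbered) destination register and
operand 2 is left intact.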
This bug was exposed on the Linaro 4.9 branch, which happened to hit the
perfect storm of costs and register pressure: the generated Linaro compiler
miscompiled the TEST_BIT macro in ira-costs.c during a build of trunk,
generating essentially code like:
.L141:
        neg     d8, d8                  // d8 negated!
        ushl    v0.2s, v11.2s, v8.2s    // shift right => shift left by neg amount
        fmov    w0, s0
        <...irrelevant code...>
        b       .L140
<...>
.L140:
        fmov    w0, s8                  // s8/d8 used and incremented assuming it had not changed at .L141
        add     w0, w0, 1
        fmov    s8, w0
        fmov    w1, s10
        cmp     w0, w1
        bne     .L141
Basically, d8 is negated at .L141 and later used at .L140 as if it had not
been, leading to completely wrong behaviour.
With this patch, that particular part of the assembly at .L141 now contains:

        neg     d0, d8
        ushl    v0.2s, v11.2s, v0.2s
        fmov    w0, s0

leaving the original shift amount in d8 intact.
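
The attached testcase boils down to this essential pattern (a minimal sketch
of the shape of code affected; the names here are mine, not the actual
ira-costs.c code):

  /* The shift amount 'b' is in a SIMD register and is still live
     after the shift; if the splitter negates it in place, the
     final 'c + b' sees the negated value.  */
  unsigned int
  shift_then_use (unsigned int a, unsigned int b)
  {
    unsigned int c = a >> b;   /* may be done as neg + ushl on SISD regs */
    return c + b;              /* relies on b being unchanged */
  }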
This bug occurs on FSF trunk and the 4.9 branch (not on 4.8, as the offending
pattern was introduced for 4.9).
Bootstrapped and tested on trunk and 4.9.
Ok for trunk and 4.9?
Thanks,
Kyrill
2015-02-17  Kyrylo Tkachov  <kyrylo.tkac...@arm.com>

        * config/aarch64/aarch64.md (*aarch64_lshr_sisd_or_int_<mode>3):
        Mark operand 0 as earlyclobber in 2nd alternative.
        (1st define_split below *aarch64_lshr_sisd_or_int_<mode>3):
        Write the negated shift amount into the QImode lowpart of
        operand 0 and use it in the shift step.
        (2nd define_split below *aarch64_lshr_sisd_or_int_<mode>3): Likewise.

2015-02-17  Kyrylo Tkachov  <kyrylo.tkac...@arm.com>

        * gcc.target/aarch64/sisd-shft-neg_1.c: New test.
diff --git a/gcc/config/aarch64/aarch64.md b/gcc/config/aarch64/aarch64.md
index 1f4169e..8f157ce 100644
--- a/gcc/config/aarch64/aarch64.md
+++ b/gcc/config/aarch64/aarch64.md
@@ -3360,7 +3360,7 @@ (define_insn "*aarch64_ashl_sisd_or_int_<mode>3"
;; Logical right shift using SISD or Integer instruction
(define_insn "*aarch64_lshr_sisd_or_int_<mode>3"
- [(set (match_operand:GPI 0 "register_operand" "=w,w,r")
+ [(set (match_operand:GPI 0 "register_operand" "=w,&w,r")
(lshiftrt:GPI
(match_operand:GPI 1 "register_operand" "w,w,r")
(match_operand:QI 2 "aarch64_reg_or_shift_imm_<mode>" "Us<cmode>,w,rUs<cmode>")))]
@@ -3379,11 +3379,13 @@ (define_split
(match_operand:DI 1 "aarch64_simd_register")
(match_operand:QI 2 "aarch64_simd_register")))]
"TARGET_SIMD && reload_completed"
- [(set (match_dup 2)
+ [(set (match_dup 3)
(unspec:QI [(match_dup 2)] UNSPEC_SISD_NEG))
(set (match_dup 0)
- (unspec:DI [(match_dup 1) (match_dup 2)] UNSPEC_SISD_USHL))]
- ""
+ (unspec:DI [(match_dup 1) (match_dup 3)] UNSPEC_SISD_USHL))]
+ {
+ operands[3] = gen_lowpart (QImode, operands[0]);
+ }
)
(define_split
@@ -3392,11 +3394,13 @@ (define_split
(match_operand:SI 1 "aarch64_simd_register")
(match_operand:QI 2 "aarch64_simd_register")))]
"TARGET_SIMD && reload_completed"
- [(set (match_dup 2)
+ [(set (match_dup 3)
(unspec:QI [(match_dup 2)] UNSPEC_SISD_NEG))
(set (match_dup 0)
- (unspec:SI [(match_dup 1) (match_dup 2)] UNSPEC_USHL_2S))]
- ""
+ (unspec:SI [(match_dup 1) (match_dup 3)] UNSPEC_USHL_2S))]
+ {
+ operands[3] = gen_lowpart (QImode, operands[0]);
+ }
)
;; Arithmetic right shift using SISD or Integer instruction
diff --git a/gcc/testsuite/gcc.target/aarch64/sisd-shft-neg_1.c b/gcc/testsuite/gcc.target/aarch64/sisd-shft-neg_1.c
new file mode 100644
index 0000000..c091657
--- /dev/null
+++ b/gcc/testsuite/gcc.target/aarch64/sisd-shft-neg_1.c
@@ -0,0 +1,38 @@
+/* { dg-do run } */
+/* { dg-options "-O2 -fno-inline" } */
+
+extern void abort (void);
+
+#define force_simd_si(v) asm volatile ("mov %s0, %1.s[0]" :"=w" (v) :"w" (v) :)
+
+unsigned int
+shft_add (unsigned int a, unsigned int b)
+{
+ unsigned int c;
+
+ force_simd_si (a);
+ force_simd_si (b);
+ c = a >> b;
+ force_simd_si (c);
+
+ return c + b;
+}
+
+int
+main (void)
+{
+ unsigned int i = 0;
+ unsigned int a = 0xdeadbeef;
+
+ for (i = 0; i < 32; i++)
+ {
+ unsigned int exp = (a / (1 << i) + i);
+ unsigned int got = shft_add (a, i);
+
+ if (exp != got)
+ abort ();
+ }
+
+ return 0;
+}
+