Hi All,

As reported in PR rtl-optimization/53352, CSE currently trips up on a
paradoxical subreg case. When compiling for ARM GNU/Linux with -O3, the
expanded RTL of interest looks like:

(insn 12 11 13 3 (set (reg:SI 140)
        (lshiftrt:SI (reg/v:SI 135 [ tmp1 ])
            (const_int 16 [0x10])))
     (nil))

(insn 13 12 14 3 (set (reg:QI 141)
        (subreg:QI (reg:SI 140) 0))
     (nil))

(insn 14 13 15 3 (set (reg:SI 142)
        (subreg:SI (reg:QI 141) 0))
     (nil))

(insn 15 14 16 3 (set (reg:SI 134 [ tmp1$2 ])
        (and:SI (reg:SI 142)
            (const_int 255 [0xff])))
     (nil))

...

(insn 29 28 30 3 (set (reg:SI 0 r0)
        (const_int 0 [0]))
     (nil))

after "cse1" things look like:

(insn 12 11 13 2 (set (reg:SI 140)
        (const_int 65280 [0xff00]))
     (nil))

(insn 13 12 14 2 (set (reg:QI 141)
        (subreg:QI (reg:SI 140) 0))
     (expr_list:REG_EQUAL (const_int 0 [0])
        (nil)))

;; This is *not* equal to zero because the upper
;; two bytes are undefined.
(insn 14 13 15 2 (set (reg:SI 142)
        (subreg:SI (reg:QI 141) 0))
     (expr_list:REG_EQUAL (const_int 0 [0])
        (nil)))

(insn 15 14 16 2 (set (reg:SI 134 [ tmp1$2 ])
        (reg:SI 142))
     (expr_list:REG_EQUAL (const_int 0 [0])
        (nil)))

...

(insn 29 28 30 2 (set (reg:SI 0 r0)
        (reg:SI 142))
     (expr_list:REG_EQUAL (const_int 0 [0])
        (nil)))

I believe the REG_EQUAL note on the set involving a paradoxical subreg
is incorrect. It eventually causes 0xFF00 to be passed to the function
'foo'.

The attached patch fixes the issue by skipping the paradoxical subreg
in 'equiv_constant'. Compiler bootstrapped for i686-pc-linux-gnu and
full GCC test runs for i686-pc-linux-gnu and arm-none-linux-gnueabi (no
regressions). OK?

(If this is OK, then can someone commit for me. I don't have write
access).

gcc/

2012-05-15  Meador Inge  <mead...@codesourcery.com>

	PR rtl-optimization/53352
	* cse.c (equiv_constant): Ignore paradoxical subregs.

gcc/testsuite/

2012-05-15  Meador Inge  <mead...@codesourcery.com>

	PR rtl-optimization/53352
	* gcc.dg/pr53352.c: New test.

-- 
Meador Inge
CodeSourcery / Mentor Embedded
http://www.mentor.com/embedded-software

Index: gcc/testsuite/gcc.dg/pr53352.c
===================================================================
--- gcc/testsuite/gcc.dg/pr53352.c	(revision 0)
+++ gcc/testsuite/gcc.dg/pr53352.c	(revision 0)
@@ -0,0 +1,41 @@
+/* { dg-do run } */
+/* { dg-options "-O1" } */
+
+#include <stdlib.h>
+
+typedef union
+{
+  struct
+  {
+    unsigned char a;
+    unsigned char b;
+    unsigned char c;
+    unsigned char d;
+  } parts;
+  unsigned long whole;
+} T;
+
+T *g_t;
+
+void bar (unsigned long x)
+{
+  if (x != 0)
+    abort ();
+}
+
+int main ()
+{
+  T one;
+  T two;
+  T tmp1, tmp2;
+
+  one.whole = 0xFFE0E0E0;
+  two.whole = 0xFF000000;
+  tmp1.parts = two.parts;
+  tmp2.parts = one.parts;
+  tmp2.parts.c = tmp1.parts.c;
+  one.parts = tmp2.parts;
+
+  g_t = &one;
+  bar (0);
+}
Index: gcc/cse.c
===================================================================
--- gcc/cse.c	(revision 187470)
+++ gcc/cse.c	(working copy)
@@ -3786,8 +3786,11 @@ equiv_constant (rtx x)
 	}
     }
 
-  /* Otherwise see if we already have a constant for the inner REG.  */
+  /* Otherwise see if we already have a constant for the inner REG.
+     Don't bother with paradoxical subregs because we have no way
+     of knowing what the upper bytes are.  */
   if (REG_P (SUBREG_REG (x))
+      && (GET_MODE_SIZE (mode) <= GET_MODE_SIZE (imode))
       && (new_rtx = equiv_constant (SUBREG_REG (x))) != 0)
     return simplify_subreg (mode, new_rtx, imode, SUBREG_BYTE (x));
 
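
For anyone who wants to see the problem outside of an RTL dump, here is a
rough standalone C model of insns 12-15 after "cse1". It is only a sketch,
not GCC code. All of the names in it (lowpart_qi, paradoxical_si,
may_fold_subreg, replay_insns) are made up for illustration, and it assumes
that the undefined upper bytes of the paradoxical subreg simply keep
whatever the 32-bit register already held, which is one possible outcome
rather than a guarantee. Mode sizes are given in bytes (QImode = 1,
SImode = 4).

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical model, not GCC code: a 32-bit register that may hold a
   narrower value in its low bits.  */

/* (subreg:QI (reg:SI r) 0): a lowpart read, only the low byte matters.  */
uint8_t lowpart_qi(uint32_t reg)
{
  return (uint8_t) (reg & 0xffu);
}

/* (subreg:SI (reg:QI r) 0): a paradoxical read.  Only the low byte is
   defined.  ASSUMPTION: the upper bytes are whatever the register
   already contained, so the whole register comes back unchanged.  */
uint32_t paradoxical_si(uint32_t reg)
{
  return reg;
}

/* Mirrors the size check added to equiv_constant: fold a subreg of a
   known constant only when the outer mode is no wider than the inner
   mode, i.e. when the subreg is not paradoxical.  */
int may_fold_subreg(unsigned outer_bytes, unsigned inner_bytes)
{
  return outer_bytes <= inner_bytes;
}

/* Replays insns 12-15 for a given value of reg 140.  Stores the
   modelled value of reg 142 through R142_OUT and returns reg 134.  */
uint32_t replay_insns(uint32_t r140, uint32_t *r142_out)
{
  uint8_t r141 = lowpart_qi(r140);        /* insn 13 */
  uint32_t r142 = paradoxical_si(r140);   /* insn 14 */

  /* The defined low byte of reg 142 always agrees with reg 141.  */
  assert((r142 & 0xffu) == r141);

  *r142_out = r142;
  return r142 & 0xffu;                    /* insn 15 before cse1 */
}
```

With r140 = 0xff00, the model gives reg 141 == 0 and reg 134 == 0, but
reg 142 itself is 0xff00, so a REG_EQUAL (const_int 0) note on insn 14 is
not safe to propagate into insn 29. The check may_fold_subreg (4, 1)
rejects exactly that case while still allowing the lowpart fold in insn 13.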