On Thu, Sep 27, 2012 at 4:25 PM, Richard Sandiford <rdsandif...@googlemail.com> wrote:
>>>> I agree (subreg:M (op:N A C) 0) to (op:M (subreg:N (A 0)) C) is
>>>> a good transformation, but why do we need to handle as special
>>>> the case where the subreg is itself the operand of a plus or minus?
>>>> I think it should happen regardless of where the subreg occurs.
>>>
>>> Don't we need to restrict this to the low part though?
>>
>> I have tried this approach with attached patch. Unfortunately,
>> although it survived bootstrap without libjava on x86_64, it failed
>> building libjava with:
>>
>> /home/uros/gcc-svn/trunk/libjava/classpath/javax/swing/plaf/basic/BasicSliderUI.java:1299:0:
>> error: insn does not satisfy its constraints:
>> }
>> ^
>> (insn 237 398 399 7 (set (reg:SI 1 dx [125])
>>         (plus:SI (subreg:SI (mult:DI (reg:DI 1 dx [orig:72 D.78627 ] [72])
>>                     (const_int 2 [0x2])) 0)
>>             (reg:SI 5 di)))
>> /home/uros/gcc-svn/trunk/libjava/classpath/javax/swing/plaf/basic/BasicSliderUI.java:1271
>> 240 {*leasi}
>>      (expr_list:REG_DEAD (reg:DI 5 di)
>>         (nil)))
>>
>> Original RTX was (subreg:SI (plus:DI (mult:DI (...) reg:DI))), which
>> is valid RTX pattern for lea insn, the above is not.
>>
>> Due to these problems, I think the safer approach is to limit the
>> transformation to (plus:SI (subreg:SI (plus:DI (...) 0)) RTXes, as was
>> the case with original patch. This approach would fix a specific
>> problem where simplify_plus_minus is not able to simplify the combined
>> RTX at combine time. Please note, that combined RTXes are always
>> checked for correctness at combine pass.
>
> I think instead the (subreg (plus ...)) handling should be applied
> to (subreg (mult ...)) too. IMO the correct form of the above address
> ought to be:
>
> (set (reg:SI 1 dx [125])
>      (plus:SI (mult:SI (reg:SI 1 dx [orig:72 D.78627 ] [72])
>                        (const_int 2 [0x2]))
>               (reg:SI 5 di))

Great, this works as expected!

After some off-line discussion with Richard, attached is v2 of the patch.

2012-09-27  Uros Bizjak  <ubiz...@gmail.com>

        PR rtl-optimization/54457
        * simplify-rtx.c (simplify_subreg): Simplify
        (subreg:SI (op:DI ((x:DI) (y:DI)), 0)
        to (op:SI (subreg:SI (x:DI) 0) (subreg:SI (y:DI) 0)).

testsuite/ChangeLog:

2012-09-27  Uros Bizjak  <ubiz...@gmail.com>

        PR rtl-optimization/54457
        * gcc.target/i386/pr54457.c: New test.

The patch was bootstrapped and regression tested on
x86_64-pc-linux-gnu {,-m32}.

BTW: I propose that we start with a limited selection of opcodes
(PLUS, MINUS and MULT), so that the x32 autotester will pick up the
patch and test it with SImode addresses.

OK for mainline?

Uros.

Index: simplify-rtx.c
===================================================================
--- simplify-rtx.c (revision 191808)
+++ simplify-rtx.c (working copy)
@@ -5689,6 +5689,28 @@ simplify_subreg (enum machine_mode outermode, rtx
        return CONST0_RTX (outermode);
     }
 
+  /* Simplify (subreg:SI (op:DI ((x:DI) (y:DI)), 0)
+     to (op:SI (subreg:SI (x:DI) 0) (subreg:SI (y:DI) 0)), where
+     the outer subreg is effectively a truncation to the original mode.  */
+  if ((GET_CODE (op) == PLUS
+       || GET_CODE (op) == MINUS
+       || GET_CODE (op) == MULT)
+      && SCALAR_INT_MODE_P (outermode)
+      && SCALAR_INT_MODE_P (innermode)
+      && GET_MODE_PRECISION (outermode) < GET_MODE_PRECISION (innermode)
+      && byte == subreg_lowpart_offset (outermode, innermode))
+    {
+      rtx op0 = simplify_gen_subreg (outermode, XEXP (op, 0),
+                                     innermode, byte);
+      if (op0)
+        {
+          rtx op1 = simplify_gen_subreg (outermode, XEXP (op, 1),
+                                         innermode, byte);
+          if (op1)
+            return simplify_gen_binary (GET_CODE (op), outermode, op0, op1);
+        }
+    }
+
   /* Simplify (subreg:QI (lshiftrt:SI (sign_extend:SI (x:QI)) C), 0) into
      to (ashiftrt:QI (x:QI) C), where C is a suitable small constant and
      the outer subreg is effectively a truncation to the original mode.  */
Index: testsuite/gcc.target/i386/pr54457.c
===================================================================
--- testsuite/gcc.target/i386/pr54457.c (revision 0)
+++ testsuite/gcc.target/i386/pr54457.c (working copy)
@@ -0,0 +1,11 @@
+/* { dg-do compile { target { ! { ia32 } } } } */
+/* { dg-options "-O2 -mx32 -maddress-mode=short" } */
+
+extern char array[40];
+
+char foo (long long position)
+{
+  return array[position + 1];
+}
+
+/* { dg-final { scan-assembler-not "add\[lq\]?\[^\n\]*1" } } */
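
For illustration only, here is a minimal sketch in plain C (not GCC code) of the arithmetic fact the transformation relies on: truncating a 64-bit PLUS, MINUS or MULT to its low 32 bits gives the same result as performing the operation on the truncated operands, whereas the same does not hold for the high part because of carries. The helper names low32, high32, lowpart_commutes and highpart_plus_commutes are hypothetical and only model the lowpart/non-lowpart subreg distinction checked by subreg_lowpart_offset in the patch.

```c
#include <assert.h>
#include <stdint.h>

/* Low 32 bits of a 64-bit value: models a lowpart (subreg:SI (x:DI)).  */
uint32_t low32 (uint64_t x)
{
  return (uint32_t) x;
}

/* High 32 bits of a 64-bit value: models a non-lowpart subreg.  */
uint32_t high32 (uint64_t x)
{
  return (uint32_t) (x >> 32);
}

/* Return 1 if truncating the DImode result equals operating on the
   truncated operands, for each of PLUS, MINUS and MULT.  */
int lowpart_commutes (uint64_t x, uint64_t y)
{
  uint32_t a = low32 (x);
  uint32_t b = low32 (y);

  return low32 (x + y) == (uint32_t) (a + b)
         && low32 (x - y) == (uint32_t) (a - b)
         && low32 (x * y) == (uint32_t) (a * b);
}

/* Return 1 if taking the high part commutes with PLUS for these
   particular operands.  A carry out of the low half makes this fail.  */
int highpart_plus_commutes (uint64_t x, uint64_t y)
{
  return high32 (x + y) == (uint32_t) (high32 (x) + high32 (y));
}
```

With x = 0xFFFFFFFF and y = 1, lowpart_commutes returns 1, while highpart_plus_commutes returns 0, because the carry out of the low half reaches the high half. This is the reason for the byte == subreg_lowpart_offset (outermode, innermode) condition.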