Re: PATCH: PR rtl-optimization/50696: [x32] Unnecessary lea

2011-10-14 Thread H.J. Lu
On Fri, Oct 14, 2011 at 9:23 AM, Paolo Bonzini wrote: > On 10/14/2011 05:36 PM, H.J. Lu wrote: >> >> There is a testcase at >> >> http://gcc.gnu.org/bugzilla/show_bug.cgi?id=50696 >> >> It passes with my patch. > > Cool, so let's wait for the results of testing. > > Paolo > Here is the complete p

Re: PATCH: PR rtl-optimization/50696: [x32] Unnecessary lea

2011-10-14 Thread Paolo Bonzini
On 10/14/2011 05:36 PM, H.J. Lu wrote: There is a testcase at http://gcc.gnu.org/bugzilla/show_bug.cgi?id=50696 It passes with my patch. Cool, so let's wait for the results of testing. Paolo

Re: PATCH: PR rtl-optimization/50696: [x32] Unnecessary lea

2011-10-14 Thread H.J. Lu
On Thu, Oct 13, 2011 at 11:51 PM, Paolo Bonzini wrote: > On 10/13/2011 10:07 PM, H.J. Lu wrote: >> >> On Thu, Oct 13, 2011 at 11:15 AM, Richard Kenner >>  wrote: The answer to H.J.'s "Why do we do it for MEM then?" is simply "because no one ever thought about not doing it" >>> >>>

Re: PATCH: PR rtl-optimization/50696: [x32] Unnecessary lea

2011-10-13 Thread Paolo Bonzini
On 10/13/2011 10:07 PM, H.J. Lu wrote: On Thu, Oct 13, 2011 at 11:15 AM, Richard Kenner wrote: The answer to H.J.'s "Why do we do it for MEM then?" is simply "because no one ever thought about not doing it" No, that's false. The same expand_compound_operation / make_compound_operation pair

Re: PATCH: PR rtl-optimization/50696: [x32] Unnecessary lea

2011-10-13 Thread H.J. Lu
On Thu, Oct 13, 2011 at 3:52 PM, Richard Kenner wrote: >> Like this? > > Yes, that's what I meant.  Thanks. > > Again, I'd suggest doing some performance testing on this just to verify > that it doesn't pessimize things. > I will run SPEC CPU 2K/2006 on ia32, x86-64 and x32. -- H.J.

Re: PATCH: PR rtl-optimization/50696: [x32] Unnecessary lea

2011-10-13 Thread Richard Kenner
> Like this? Yes, that's what I meant. Thanks. Again, I'd suggest doing some performance testing on this just to verify that it doesn't pessimize things.

Re: PATCH: PR rtl-optimization/50696: [x32] Unnecessary lea

2011-10-13 Thread H.J. Lu
On Thu, Oct 13, 2011 at 3:33 PM, Richard Kenner wrote: >> I am testing this patch.  The difference is it checks nonzero >> bits of the first operand. > > I would suggest moving (and expanding) the comments from the existing block > into your new block. > Like this? -- H.J. --- diff --git a/gcc/c

Re: PATCH: PR rtl-optimization/50696: [x32] Unnecessary lea

2011-10-13 Thread Richard Kenner
> I am testing this patch. The difference is it checks nonzero > bits of the first operand. I would suggest moving (and expanding) the comments from the existing block into your new block.

Re: PATCH: PR rtl-optimization/50696: [x32] Unnecessary lea

2011-10-13 Thread Richard Kenner
> But the current code converts (and X 3) into a bit extraction > since ((i = exact_log2 (UINTVAL (XEXP (x, 1)) + 1)) >= 0) is true > when UINTVAL (XEXP (x, 1)) == 3. Should we do it or not? By adding the test for nonzero bits, you'd potentially be doing the conversion more often (which is the po
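The test discussed above can be illustrated with a small standalone sketch. It assumes only what the quoted expression says: a mask is treated as a candidate bit extraction when the mask plus one is an exact power of two. The helper names `exact_log2_sketch` and `low_field_width` are hypothetical stand-ins written for this example; they are not GCC's own implementation.

```c
#include <stdint.h>

/* Hypothetical stand-in for the exact_log2 call quoted in the thread:
   returns log2(v) when v is an exact power of two, and -1 otherwise.  */
static int exact_log2_sketch(uint64_t v)
{
    if (v == 0 || (v & (v - 1)) != 0)
        return -1;

    int n = 0;
    while ((v >>= 1) != 0)
        n++;
    return n;
}

/* Width of the low-order bit field selected by MASK, or -1 when MASK is
   not of the form 2^n - 1 (for n < 64).  This mirrors the shape of the
   quoted condition, exact_log2 (mask + 1) >= 0.  */
static int low_field_width(uint64_t mask)
{
    return exact_log2_sketch(mask + 1);
}
```

With these definitions, a mask of 3 is seen as a 2-bit field, which is why (and X 3) qualifies, while 4294967292 is rejected because its two low bits are clear.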

Re: PATCH: PR rtl-optimization/50696: [x32] Unnecessary lea

2011-10-13 Thread H.J. Lu
On Thu, Oct 13, 2011 at 2:45 PM, H.J. Lu wrote: > On Thu, Oct 13, 2011 at 2:30 PM, Richard Kenner > wrote: >>> It is because mask 0xffffffff is optimized to 0xfffffffc by keeping track >>> of non-zero bits in registers and the above code doesn't take that >>> into account. >> >> Then I'd suggest

Re: PATCH: PR rtl-optimization/50696: [x32] Unnecessary lea

2011-10-13 Thread H.J. Lu
On Thu, Oct 13, 2011 at 2:30 PM, Richard Kenner wrote: >> It is because mask 0xffffffff is optimized to 0xfffffffc by keeping track >> of non-zero bits in registers and the above code doesn't take that >> into account. > > Then I'd suggest modifying that code so that it does rather than > essentia

Re: PATCH: PR rtl-optimization/50696: [x32] Unnecessary lea

2011-10-13 Thread Richard Kenner
> It is because mask 0xffffffff is optimized to 0xfffffffc by keeping track > of non-zero bits in registers and the above code doesn't take that > into account. Then I'd suggest modifying that code so that it does rather than essentially duplicating it. But I'd recommend running some performance
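The interaction between the mask and the known-zero bits can be sketched as follows. This is an illustration only, not the patch under review: it assumes the two masks in question are 0xffffffff and 0xfffffffc (the latter matches the const_int 4294967292 quoted elsewhere in the thread), and it models "nonzero bits" as a plain bit set in which a cleared bit means the operand bit is known to be zero. All function names are hypothetical.

```c
#include <stdint.h>

/* Drop mask bits that cannot matter because the corresponding operand
   bit is known to be zero.  NONZERO has a 1 for every bit that may be
   nonzero in the operand.  */
static uint64_t narrow_mask(uint64_t mask, uint64_t nonzero)
{
    return mask & nonzero;
}

/* For pattern matching, bits known to be zero in the operand may be
   treated as if they were set in the mask, since they do not change
   the result of the AND.  */
static uint64_t widen_mask(uint64_t mask, uint64_t nonzero)
{
    return mask | ~nonzero;
}

/* Returns 1 when M has the form 2^n - 1, i.e. it selects a contiguous
   low-order bit field.  */
static int is_low_bit_mask(uint64_t m)
{
    return (m & (m + 1)) == 0;
}
```

When the operand's two low bits are known to be zero, narrowing turns 0xffffffff into 0xfffffffc, which no longer looks like a low bit field. Widening with the same nonzero-bits information recovers 0xffffffff, so a check that consults the first operand's nonzero bits can still recognize the zero extension.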

Re: PATCH: PR rtl-optimization/50696: [x32] Unnecessary lea

2011-10-13 Thread H.J. Lu
On Thu, Oct 13, 2011 at 2:23 PM, Richard Kenner wrote: >> Does it look OK? > > No. > > If I understand your code correctly, there's essentially the same code > as you have a bit above that: > >      /* If the constant is one less than a power of two, this might be >         representable by an ext

Re: PATCH: PR rtl-optimization/50696: [x32] Unnecessary lea

2011-10-13 Thread Richard Kenner
> Does it look OK? No. If I understand your code correctly, there's essentially the same code as you have a bit above that: /* If the constant is one less than a power of two, this might be representable by an extraction even if no shift is present. If it doesn't end

Re: PATCH: PR rtl-optimization/50696: [x32] Unnecessary lea

2011-10-13 Thread H.J. Lu
On Thu, Oct 13, 2011 at 11:15 AM, Richard Kenner wrote: >> The answer to H.J.'s "Why do we do it for MEM then?" is simply >> "because no one ever thought about not doing it" > > No, that's false.  The same expand_compound_operation / > make_compound_operation > pair is present in the MEM case as

Re: PATCH: PR rtl-optimization/50696: [x32] Unnecessary lea

2011-10-13 Thread Richard Kenner
> The answer to H.J.'s "Why do we do it for MEM then?" is simply > "because no one ever thought about not doing it" No, that's false. The same expand_compound_operation / make_compound_operation pair is present in the MEM case as in the SET case. It's just that there's some bug here that's noti

Re: PATCH: PR rtl-optimization/50696: [x32] Unnecessary lea

2011-10-13 Thread Richard Kenner
> We first expand zero_extend:DI address to and:DI and then try > to restore zero_extend:DI. Why do we do this transformation > to begin with? Suppose there were an outer AND that duplicated what this one did. Then when you combine those two, you merge it to one AND. Then make_compound_operatio
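The point about merging can be shown with ordinary integer arithmetic. The sketch below assumes a zero extension from 32 to 64 bits is represented as an AND with 0xffffffff, and that an outer AND applies some second mask; the function names are made up for this illustration and do not correspond to anything in combine.c.

```c
#include <stdint.h>

/* Zero extension from 32 to 64 bits, written as an explicit AND.  */
static uint64_t zext32_as_and(uint64_t x)
{
    return x & UINT64_C(0xffffffff);
}

/* Two nested ANDs: the expanded zero extension inside an outer mask.  */
static uint64_t nested_and(uint64_t x, uint64_t outer)
{
    return zext32_as_and(x) & outer;
}

/* The same computation after the two masks are folded into one AND.  */
static uint64_t merged_and(uint64_t x, uint64_t outer)
{
    return x & (UINT64_C(0xffffffff) & outer);
}
```

If the outer mask is also 0xffffffff, the merged form is a single AND with 0xffffffff, which is exactly the shape that can later be turned back into one zero extension. Keeping the inner operation as an opaque zero_extend would hide that redundancy.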

Re: PATCH: PR rtl-optimization/50696: [x32] Unnecessary lea

2011-10-13 Thread Paolo Bonzini
On Thu, Oct 13, 2011 at 19:06, Richard Kenner wrote: >> An and:DI is cheaper than a zero_extend:DI of an and:SI. > > That depends strongly on the constants and whether the machine is 32-bit > or 64-bit. Yes, the rtx_costs take care of that. > But that's irrelevant in this case since the and:SI w

Re: PATCH: PR rtl-optimization/50696: [x32] Unnecessary lea

2011-10-13 Thread H.J. Lu
On Thu, Oct 13, 2011 at 10:21 AM, Paolo Bonzini wrote: > On Thu, Oct 13, 2011 at 19:19, H.J. Lu wrote: >> On Thu, Oct 13, 2011 at 10:01 AM, Paolo Bonzini wrote: >>> On 10/13/2011 06:35 PM, Richard Kenner wrote: > > It never calls make_extraction.  There are several cases handled > fo

Re: PATCH: PR rtl-optimization/50696: [x32] Unnecessary lea

2011-10-13 Thread Paolo Bonzini
On Thu, Oct 13, 2011 at 19:19, H.J. Lu wrote: > On Thu, Oct 13, 2011 at 10:01 AM, Paolo Bonzini wrote: >> On 10/13/2011 06:35 PM, Richard Kenner wrote: It never calls make_extraction.  There are several cases handled for AND operation. But (and:DI (plus:DI (subreg:DI (mul

Re: PATCH: PR rtl-optimization/50696: [x32] Unnecessary lea

2011-10-13 Thread H.J. Lu
On Thu, Oct 13, 2011 at 10:01 AM, Paolo Bonzini wrote: > On 10/13/2011 06:35 PM, Richard Kenner wrote: >>> >>> It never calls make_extraction.  There are several cases handled >>> for AND operation. But >>> >>> (and:DI (plus:DI (subreg:DI (mult:SI (reg/v:SI 85 [ i ]) >>>                (const_int

Re: PATCH: PR rtl-optimization/50696: [x32] Unnecessary lea

2011-10-13 Thread Richard Kenner
> An and:DI is cheaper than a zero_extend:DI of an and:SI. That depends strongly on the constants and whether the machine is 32-bit or 64-bit. But that's irrelevant in this case since the and:SI will be removed (it reflects what has already been done).

Re: PATCH: PR rtl-optimization/50696: [x32] Unnecessary lea

2011-10-13 Thread Paolo Bonzini
On 10/13/2011 06:35 PM, Richard Kenner wrote: It never calls make_extraction. There are several cases handled for AND operation. But (and:DI (plus:DI (subreg:DI (mult:SI (reg/v:SI 85 [ i ]) (const_int 4 [0x4])) 0) (subreg:DI (reg:SI 106) 0)) (const_int 4294967292 [0x

Re: PATCH: PR rtl-optimization/50696: [x32] Unnecessary lea

2011-10-13 Thread Richard Kenner
> It never calls make_extraction. There are several cases handled > for AND operation. But > > (and:DI (plus:DI (subreg:DI (mult:SI (reg/v:SI 85 [ i ]) > (const_int 4 [0x4])) 0) > (subreg:DI (reg:SI 106) 0)) > (const_int 4294967292 [0xfffffffc])) > > isn't one of them.

Re: PATCH: PR rtl-optimization/50696: [x32] Unnecessary lea

2011-10-13 Thread H.J. Lu
On Thu, Oct 13, 2011 at 9:11 AM, Richard Kenner wrote: >> at the end.  make_compound_operation doesn't know how to >> restore ZERO_EXTEND. > > It does in general.  See make_extraction, which it calls.  The question is > why it doesn't in this case.  That's the bug. > It never calls make_extractio

Re: PATCH: PR rtl-optimization/50696: [x32] Unnecessary lea

2011-10-13 Thread Richard Kenner
> at the end. make_compound_operation doesn't know how to > restore ZERO_EXTEND. It does in general. See make_extraction, which it calls. The question is why it doesn't in this case. That's the bug.

Re: PATCH: PR rtl-optimization/50696: [x32] Unnecessary lea

2011-10-13 Thread H.J. Lu
On Thu, Oct 13, 2011 at 7:14 AM, Richard Kenner wrote: >> Or being fooled by the 0xfffffffc masking, perhaps. > > No, I'm pretty sure that's NOT the case.  The *whole point* of the > routine is to deal with that masking. > I got (gdb) step make_compound_operation (x=0x7139c4c8, in_code=MEM)

Re: PATCH: PR rtl-optimization/50696: [x32] Unnecessary lea

2011-10-13 Thread Richard Kenner
> Or being fooled by the 0xfffffffc masking, perhaps. No, I'm pretty sure that's NOT the case. The *whole point* of the routine is to deal with that masking.

Re: PATCH: PR rtl-optimization/50696: [x32] Unnecessary lea

2011-10-13 Thread Paolo Bonzini
On 10/13/2011 02:51 PM, Richard Kenner wrote:
    case MEM:
      /* Ensure that our address has any ASHIFTs converted to MULT in case
         address-recognizing predicates are called later.  */
      temp = make_compound_operation (XEXP (x, 0), MEM);
      SUBST (XEXP (x, 0), temp);

Re: PATCH: PR rtl-optimization/50696: [x32] Unnecessary lea

2011-10-13 Thread Richard Kenner
> Same here, I don't like it but I hardly see any alternative. The only > possibility could be to prevent calling expand_compound_operation > completely for addresses. Richard, what do you think? Don't worry, > combine hasn't changed much since your days. :) The problem wasn't potential chan

Re: PATCH: PR rtl-optimization/50696: [x32] Unnecessary lea

2011-10-13 Thread Paolo Bonzini
On 10/13/2011 01:04 AM, Richard Kenner wrote: I still don't like the patch, but I'm no longer as familiar with the code as I used to be so can't suggest a replacement. Let's see what others think about it. Same here, I don't like it but I hardly see any alternative. The only possibility cou

Re: PATCH: PR rtl-optimization/50696: [x32] Unnecessary lea

2011-10-12 Thread Richard Kenner
> 1. The placement of subreg in > > (plus:DI (subreg:DI (mult:SI (reg/v:SI 85 [ i ]) > (const_int 4 [0x4])) 0) > (subreg:DI (reg:SI 106) 0)) > > isn't supported by x86 backend. That's easy to fix. > 2. The biggest problem is optimizing mask 0xffffffff to

Re: PATCH: PR rtl-optimization/50696: [x32] Unnecessary lea

2011-10-12 Thread H.J. Lu
On Wed, Oct 12, 2011 at 3:40 PM, Richard Kenner wrote: >> The x86 backend doesn't accept the new expression as a valid address while >> (zero_extend:DI) works just fine.  This patch keeps ZERO_EXTEND >> when zero-extending an address to Pmode.  It reduces the number of lea instructions from >> 24173 to 21428 in x32 libgf

Re: PATCH: PR rtl-optimization/50696: [x32] Unnecessary lea

2011-10-12 Thread Richard Kenner
> The x86 backend doesn't accept the new expression as a valid address while > (zero_extend:DI) works just fine. This patch keeps ZERO_EXTEND > when zero-extending an address to Pmode. It reduces the number of lea instructions from > 24173 to 21428 in x32 libgfortran.so. Does it make any sense? I'd be inclined to hav
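For scale, the two lea counts quoted above can be turned into an absolute and a relative reduction. The sketch below only does that arithmetic on the figures from the message; the function names are invented for the example and nothing here was measured independently.

```c
/* Number of instructions removed, given the counts before and after.  */
static long count_removed(long before, long after)
{
    return before - after;
}

/* Relative reduction in percent, given the counts before and after.  */
static double reduction_percent(long before, long after)
{
    return 100.0 * (double)count_removed(before, after) / (double)before;
}
```

Called with 24173 and 21428, `count_removed` gives 2745 fewer lea instructions, and `reduction_percent` gives a reduction of a little over 11 percent for x32 libgfortran.so.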

PATCH: PR rtl-optimization/50696: [x32] Unnecessary lea

2011-10-12 Thread H.J. Lu
Hi, When combine tries to combine:
(insn 37 35 39 3 (set (reg:SI 90)
        (plus:SI (mult:SI (reg/v:SI 84 [ i ])
                (const_int 4 [0x4]))
            (reg:SI 106))) x.i:11 247 {*leasi_2}
     (nil))
(insn 39 37 41 3 (set (mem:SI (zero_extend:DI (reg:SI 90)) [3 MEM[symbol: x, index:
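The address computation in these two insns can be modelled in plain C. The sketch below assumes x32 semantics as described in the thread: the sum i*4 + base is formed in 32 bits (SImode) and then zero-extended to a 64-bit address (DImode). The second function shows the equivalent form that combine produces, a 64-bit add followed by an AND with 0xffffffff. Function and parameter names are illustrative only.

```c
#include <stdint.h>

/* Address as in the insns above: a 32-bit i*4 + base, zero-extended
   to 64 bits.  The 32-bit arithmetic wraps modulo 2^32.  */
static uint64_t addr_zero_extend(uint32_t i, uint32_t base)
{
    uint32_t sum = (uint32_t)(i * 4u) + base;  /* plus:SI of mult:SI */
    return (uint64_t)sum;                      /* zero_extend:DI */
}

/* The same address written as a 64-bit add followed by a mask, which
   is the and:DI form discussed in the thread.  */
static uint64_t addr_and_mask(uint32_t i, uint32_t base)
{
    uint64_t wide = (uint64_t)(uint32_t)(i * 4u) + (uint64_t)base;
    return wide & UINT64_C(0xffffffff);
}
```

Both functions return the same value for every input, including cases where the 32-bit sum wraps. The difference is only in which form the backend recognizes as a valid address, which is what the thread is about.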