On Mon, Aug 8, 2011 at 8:16 AM, Uros Bizjak <ubiz...@gmail.com> wrote:
> Hello!
>
> Attached patch implements addr32 prefixed addresses for x86_64
> targets, where memory locations are accessed with 32bit base and index
> registers in the form (zero_extend:DI (... SImode registers ...)).
> The optimization rarely (if at all) triggers on x86_64, but is very
> important on x32 (see [1]), where many LEAs get moved into addresses
> of the operators.
>
> Of some interest is inability of reload to fix-up its own generated
> moves for offsetable memory operand constraint "o", as it happens with
> TImode moves. See [2] for further analysis and [3] for the workaround.
>
> 2011-08-08  Uros Bizjak  <ubiz...@gmail.com>
>
>        PR target/49781
>        * config/i386/i386.c (ix86_decompose_address): Allow zero-extended
>        SImode addresses.
>        (ix86_print_operand_address): Handle zero-extended addresses.
>        (memory_address_length): Add length of addr32 prefix for
>        zero-extended addresses.
>        (ix86_secondary_reload): Handle moves to/from double-word general
>        registers from/to zero-extended addresses.
>        * config/i386/predicates.md (lea_address_operand): Reject
>        zero-extended operands.
>
> Patch was bootstrapped and regression tested on x86_64-pc-linux-gnu
> {,-m32}. Additionally, H.J. tested the patch on x32 target with GCC
> bootstrap/regression tests, build of glibc (+regression tests) and
> SPEC2000/2006.
>
> Patch was committed to mainline SVN.
>
> BTW: There is a strange optimization in combine pass, where
> zero-extended address is converted on-the-fly to:
>
> Trying 9 -> 10:
> Failed to match this instruction:
> (... (and:DI (subreg:DI (plus:SI (ashift:SI (reg/v:SI 63 [ i ])
>                    (const_int 2 [0x2]))
>                (subreg:SI (reg/v/f:DI 62 [ a ]) 0)) 0)
>        (const_int 4294967295 [0xffffffff]))
> ...)
>
> While it is easy to add a pattern recognizer for this RTX to
> ix86_decompose_address/ix86_legitimate_address_p, I would like to
> understand the purpose of the conversion better and eventually fix it
> in combine pass.
>
> [1] http://gcc.gnu.org/bugzilla/show_bug.cgi?id=49781
> [2] http://gcc.gnu.org/ml/gcc/2011-08/msg00129.html
> [3] http://gcc.gnu.org/ml/gcc/2011-08/msg00157.html
>
> Uros.
>
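
For reference, a minimal C sketch of why the and:DI form quoted above
yields the same value as (zero_extend:DI (plus:SI ...)). It is only an
illustration, not GCC code: the function names are hypothetical, and
the paradoxical subreg is modelled (as an assumption) by a 64-bit value
whose upper half holds arbitrary bits.

```c
#include <assert.h>
#include <stdint.h>

/* Models (zero_extend:DI (plus:SI (ashift:SI i 2) (subreg:SI a 0))).
   The add is done in 32 bits, so it wraps modulo 2^32, and the result
   is then widened with zeros in the upper half.  */
uint64_t
addr_zero_extend (uint64_t a, uint32_t i)
{
  uint32_t sum = (i << 2) + (uint32_t) a;
  return (uint64_t) sum;
}

/* Models (and:DI (subreg:DI (plus:SI ...) 0) (const_int 0xffffffff)).
   The upper 32 bits of the widened value are filled with junk on
   purpose (an assumption standing in for "unspecified"); the mask
   discards them.  */
uint64_t
addr_and_mask (uint64_t a, uint32_t i)
{
  uint32_t sum = (i << 2) + (uint32_t) a;
  uint64_t wide = (UINT64_C (0xdeadbeef) << 32) | sum;
  return wide & UINT64_C (0xffffffff);
}

/* Hypothetical helper: both forms agree for a given base and index.  */
void
check_forms_agree (uint64_t a, uint32_t i)
{
  assert (addr_zero_extend (a, i) == addr_and_mask (a, i));
}
```

Calling check_forms_agree with a base whose low half is close to
0xffffffff exercises the 32-bit wraparound, which is the case where an
addr32 address and a plain 64-bit address would differ.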
I checked in this testcase.

Thanks.

-- 
H.J.
---
Index: gcc.target/i386/pr49781-1.c
===================================================================
--- gcc.target/i386/pr49781-1.c	(revision 0)
+++ gcc.target/i386/pr49781-1.c	(revision 0)
@@ -0,0 +1,26 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -fpic" } */
+/* { dg-require-effective-target fpic } */
+
+static int heap[2*(256 +1+29)+1];
+static int heap_len;
+static int heap_max;
+void
+foo (int elems)
+{
+  int n, m;
+  int max_code = -1;
+  int node = elems;
+  heap_len = 0, heap_max = (2*(256 +1+29)+1);
+  for (n = 0; n < elems; n++)
+    heap[++heap_len] = max_code = n;
+  do {
+    n = heap[1];
+    heap[1] = heap[heap_len--];
+    m = heap[1];
+    heap[--heap_max] = n;
+    heap[--heap_max] = m;
+  } while (heap_len >= 2);
+}
+
+/* { dg-final { scan-assembler-not "lea\[lq\]?\[ \t\]\\((%|)r\[a-z0-9\]*" } } */
Index: ChangeLog
===================================================================
--- ChangeLog	(revision 177568)
+++ ChangeLog	(working copy)
@@ -1,3 +1,8 @@
+2011-08-08  H.J. Lu  <hongjiu...@intel.com>
+
+	PR target/49781
+	* gcc.target/i386/pr49781-1.c: New.
+
 2011-08-08  Jason Merrill  <ja...@redhat.com>
 
 	* g++.dg/cpp0x/range-for20.C: Adjust to test 50020 as well.