On Thu, Nov 28, 2013 at 2:11 PM, Vladimir Makarov <[email protected]> wrote:
> The following patch fixes PR57293
>
> http://gcc.gnu.org/bugzilla/show_bug.cgi?id=57293
>
> It is actually an implementation of missed LRA functionality in reg
> elimination. Before the patch any explicit change of stack pointer in
> RTL resulted in necessity to use the frame pointer.
>
> The patch has practically no effect on generic tuning of x86/x86-64.
> But it has a dramatic effect on code performance for other tunings
> like corei7 which don't use incoming args accumulation. The maximum
> SPEC2000 improvement 2.5% is achieved on x86 SPECInt2000. But
> SPECFP2000 rate also has improvement about 1% on x86 and x86-64. Too
> bad that I did not implement it at the first place. The results would
> have been even much better ones reported on 2012 GNU Cauldron as I
> also used -mtune=corei7 that time.
>
> The patch was bootstrapped and tested on x86-64/x86 and ppc.
>
> Committed as rev. 205498.
>
> 2013-11-28 Vladimir Makarov<[email protected]>
>
> PR target/57293
> * ira.h (ira_setup_eliminable_regset): Remove parameter.
> * ira.c (ira_setup_eliminable_regset): Ditto. Add
> SUPPORTS_STACK_ALIGNMENT for crtl->stack_realign_needed.
> Don't call lra_init_elimination.
> (ira): Call ira_setup_eliminable_regset without arguments.
> * loop-invariant.c (calculate_loop_reg_pressure): Remove argument
> from ira_setup_eliminable_regset call.
> * gcse.c (calculate_bb_reg_pressure): Ditto.
> * haifa-sched.c (sched_init): Ditto.
> * lra.h (lra_init_elimination): Remove the prototype.
> * lra-int.h (lra_insn_recog_data): New member sp_offset. Move
> used_insn_alternative upper.
> (lra_eliminate_regs_1): Add one more parameter.
> (lra-eliminate): Ditto.
> * lra.c (lra_invalidate_insn_data): Set sp_offset.
> (setup_sp_offset): New.
> (lra_process_new_insns): Call setup_sp_offset.
> (lra): Add argument to lra_eliminate calls.
> * lra-constraints.c (get_equiv_substitution): Rename to get_equiv.
> (get_equiv_with_elimination): New.
> (process_addr_reg): Call get_equiv_with_elimination instead of
> get_equiv_substitution.
> (equiv_address_substitution): Ditto.
> (loc_equivalence_change_p): Ditto.
> (loc_equivalence_callback, lra_constraints): Ditto.
> (curr_insn_transform): Ditto. Print the sp offset
> (process_alt_operands): Prevent stack pointer reloads.
> (lra_constraints): Remove one argument from lra_eliminate call.
> Move it up. Mark used hard regs bfore it. Use
> get_equiv_with_elimination instead of get_equiv_substitution.
> * lra-eliminations.c (lra_eliminate_regs_1): Add parameter and
> assert for param values combination. Use sp offset. Add argument
> to lra_eliminate_regs_1 calls.
> (lra_eliminate_regs): Add argument to lra_eliminate_regs_1 call.
> (curr_sp_change): New static var.
> (mark_not_eliminable): Add parameter. Update curr_sp_change.
> Don't prevent elimination to sp if we can calculate its change.
> Pass the argument to mark_not_eliminable calls.
> (eliminate_regs_in_insn): Add a parameter. Use sp offset. Add
> argument to lra_eliminate_regs_1 call.
> (update_reg_eliminate): Move calculation of hard regs for spill
> lower. Switch off lra_in_progress temporarily to generate regs
> involved into elimination.
> (lra_init_elimination): Rename to init_elimination. Make it
> static. Set up insn sp offset, check the offsets at the end of
> BBs.
> (process_insn_for_elimination): Add parameter. Pass its value to
> eliminate_regs_in_insn.
> (lra_eliminate): : Add parameter. Pass its value to
> process_insn_for_elimination. Add assert for param values
> combination. Call init_elimination. Don't update offsets in
> equivalence substitutions.
> * lra-spills.c (assign_mem_slot): Don't call lra_eliminate_regs_1
> for created stack slot.
> (remove_pseudos): Call lra_eliminate_regs_1 before changing memory
> onto stack slot.
>
Hi Vladimir,
Thanks for your hard work. I noticed a few regressions
on x86-64:
FAIL: gcc.dg/guality/pr54519-1.c -O2 line 20 y == 25
FAIL: gcc.dg/guality/pr54519-1.c -O2 line 20 z == 6
FAIL: gcc.dg/guality/pr54519-1.c -O2 line 23 y == 117
FAIL: gcc.dg/guality/pr54519-1.c -O2 line 23 z == 8
FAIL: gcc.dg/guality/pr54519-1.c -O3 -fomit-frame-pointer line 20 x == 36
FAIL: gcc.dg/guality/pr54519-1.c -O3 -fomit-frame-pointer line 20 y == 25
FAIL: gcc.dg/guality/pr54519-1.c -O3 -fomit-frame-pointer line 20 z == 6
FAIL: gcc.dg/guality/pr54519-1.c -O3 -g line 20 x == 36
FAIL: gcc.dg/guality/pr54519-1.c -O3 -g line 20 y == 25
FAIL: gcc.dg/guality/pr54519-1.c -O3 -g line 20 z == 6
FAIL: gcc.dg/guality/pr54519-3.c -O2 -flto -fno-use-linker-plugin
-flto-partition=none line 20 x == 36
FAIL: gcc.dg/guality/pr54519-3.c -O2 -flto -fno-use-linker-plugin
-flto-partition=none line 23 x == 98
FAIL: gcc.dg/guality/pr54519-3.c -O2 line 20 x == 36
FAIL: gcc.dg/guality/pr54519-3.c -O2 line 20 y == 25
FAIL: gcc.dg/guality/pr54519-3.c -O2 line 20 z == 6
FAIL: gcc.dg/guality/pr54519-3.c -O2 line 23 x == 98
FAIL: gcc.dg/guality/pr54519-3.c -O2 line 23 y == 117
FAIL: gcc.dg/guality/pr54519-3.c -O2 line 23 z == 8
FAIL: gcc.dg/guality/pr54519-3.c -O3 -fomit-frame-pointer line 20 x == 36
FAIL: gcc.dg/guality/pr54519-3.c -O3 -fomit-frame-pointer line 20 y == 25
FAIL: gcc.dg/guality/pr54519-3.c -O3 -fomit-frame-pointer line 20 z == 6
FAIL: gcc.dg/guality/pr54519-3.c -O3 -fomit-frame-pointer line 23 x == 98
FAIL: gcc.dg/guality/pr54519-3.c -O3 -fomit-frame-pointer line 23 y == 117
FAIL: gcc.dg/guality/pr54519-3.c -O3 -fomit-frame-pointer line 23 z == 8
FAIL: gcc.dg/guality/pr54519-3.c -O3 -g line 20 x == 36
FAIL: gcc.dg/guality/pr54519-3.c -O3 -g line 20 y == 25
FAIL: gcc.dg/guality/pr54519-3.c -O3 -g line 20 z == 6
FAIL: gcc.dg/guality/pr54519-3.c -O3 -g line 23 x == 98
FAIL: gcc.dg/guality/pr54519-3.c -O3 -g line 23 y == 117
FAIL: gcc.dg/guality/pr54519-3.c -O3 -g line 23 z == 8
FAIL: gcc.dg/guality/pr54693-2.c -O3 -fomit-frame-pointer
-funroll-all-loops -finline-functions line 21 i == v + 1
FAIL: gcc.dg/guality/pr54693-2.c -O3 -fomit-frame-pointer
-funroll-loops line 21 i == v + 1
FAIL: gcc.target/i386/pr9771-1.c (internal compiler error)
Besides guality failures, there is
[hjl@gnu-6 gcc]$ make check-gcc
RUNTESTFLAGS="--target_board='unix{-m32\ -march=corei7}'
i386.exp=pr9771-1.c"
make[1]: Entering directory `/export/build/gnu/gcc/build-x86_64-linux/gcc'
test -d plugin || mkdir plugin
test -d testsuite || mkdir testsuite
test -d testsuite/gcc || mkdir testsuite/gcc
(rootme=`${PWDCMD-pwd}`; export rootme; \
srcdir=`cd /export/gnu/import/git/gcc/gcc; ${PWDCMD-pwd}` ; export srcdir ; \
cd testsuite/gcc; \
rm -f tmp-site.exp; \
sed '/set tmpdir/ s|testsuite$|testsuite/gcc|' \
< ../../site.exp > tmp-site.exp; \
/bin/sh ${srcdir}/../move-if-change tmp-site.exp site.exp; \
EXPECT=`if [ -f ${rootme}/../expect/expect ] ; then echo
${rootme}/../expect/expect ; else echo expect ; fi` ; export EXPECT ;
\
if [ -f ${rootme}/../expect/expect ] ; then \
TCL_LIBRARY=`cd .. ; cd ${srcdir}/../tcl/library ; ${PWDCMD-pwd}` ; \
export TCL_LIBRARY ; fi ; \
runtestflags= ; \
if [ -n "" ] ; then \
runtestflags=""; \
elif [ -n "" ] ; then \
parts="`echo ' ' \
| sed 's/=[^ ]* / /g'`"; \
for part in `find $srcdir/testsuite/gcc* -name \*.exp` ; do \
part=`basename $part` ; \
case " $parts $runtestflags " in \
*" $part "*) ;; \
*) runtestflags="$runtestflags $part" ;; \
esac ; \
done ; \
fi ; \
`if [ -f ${srcdir}/../dejagnu/runtest ] ; then echo
${srcdir}/../dejagnu/runtest ; else echo runtest; fi` --tool gcc
--target_board='unix{-m32\ -march=corei7}' i386.exp=pr9771-1.c
$runtestflags)
WARNING: Couldn't find the global config file.
Test Run By hjl on Thu Nov 28 16:50:05 2013
Native configuration is x86_64-unknown-linux-gnu
=== gcc tests ===
Schedule of variations:
unix/-m32 -march=corei7
Running target unix/-m32 -march=corei7
Using /usr/share/dejagnu/baseboards/unix.exp as board description file
for target.
Using /usr/share/dejagnu/config/unix.exp as generic interface file for target.
Using /export/gnu/import/git/gcc/gcc/testsuite/config/default.exp as
tool-and-target-specific interface file.
Running /export/gnu/import/git/gcc/gcc/testsuite/gcc.target/i386/i386.exp ...
FAIL: gcc.target/i386/pr9771-1.c (internal compiler error)
FAIL: gcc.target/i386/pr9771-1.c (test for excess errors)
=== gcc Summary ===
# of expected passes 10
# of unexpected failures 2
# of unresolved testcases 1
"-m32 -march=corei7" triggers the ICE.
--
H.J.