Re: IRA_COVER_CLASSES for m32c

2008-09-12 Thread Jeff Law

DJ Delorie wrote:

Opening this up to the gcc public, since I appear to be unable to get
this to work right.


Still no luck defining a working IRA_COVER_CLASSES for m32c.  My
latest attempt:

#define IRA_COVER_CLASSES \
{ \
  HC_REGS, MEM_REGS, LIM_REG_CLASSES\
}

(effectively GENERAL_REGS (which I also tried), but MEM_REGS regs are
implemented as memory locations, so I had tried without them) results
in this build failure in newlib:

m32c-elf-gcc -B/greed/dj/m32c/newlib/m32c-elf/m32c-elf/m32cm/newlib/ -isystem 
/greed/dj/m32c/newlib/m32c-elf/m32c-elf/m32cm/newlib/targ-include -isystem /greed/dj/m32c/newlib/src/newlib/libc/include 
-B/greed/dj/m32c/newlib/m32c-elf/m32c-elf/m32cm/libgloss/m32c -L/greed/dj/m32c/newlib/m32c-elf/m32c-elf/m32cm/libgloss/libnosys 
-L/greed/dj/m32c/newlib/src/libgloss/m32c  -mcpu=m32cm -DPACKAGE_NAME=\"newlib\" -DPACKAGE_TARNAME=\"newlib\" 
-DPACKAGE_VERSION=\"1.16.0\" -DPACKAGE_STRING=\"newlib\ 1.16.0\" -DPACKAGE_BUGREPORT=\"\" -I. 
-I../../../../../../src/newlib/libc/stdlib -Os -DPREFER_SIZE_OVER_SPEED -DABORT_PROVIDED -DSMALL_MEMORY -DMISSING_SYSCALL_NAMES 
-fno-builtin  -g -O2-mcpu=m32cm -DINTERNAL_NEWLIB -DDEFINE_MALLOC -c ../../../../../../src/newlib/libc/stdlib/mallocr.c 
-o lib_a-mallocr.o
../../../../../../src/newlib/libc/stdlib/mallocr.c: In function '_malloc_r':
../../../../../../src/newlib/libc/stdlib/mallocr.c:2588: error: unable to find 
a register to spill in class 'HL_REGS'
../../../../../../src/newlib/libc/stdlib/mallocr.c:2588: error: this is the 
insn:
(insn 661 660 662 101 ../../../../../../src/newlib/libc/stdlib/mallocr.c:2194 
(set (reg:HI 0 r0 [262])
(and:HI (subreg:HI (reg:PSI 5 a1 [258]) 0)
(const_int 127 [0x7f]))) 26 {andhi3_24} (expr_list:REG_DEAD 
(reg:PSI 5 a1 [258])
(nil)))
../../../../../../src/newlib/libc/stdlib/mallocr.c:2588: internal compiler 
error: in spill_failure, at reload1.c:2093
  
What are the recorded reloads for this insn?  I've never worked with the 
m32c, but ISTM that the only problem here is we need op0 and op1 to 
match which can be accomplished by reloading op1 into r0.


Jeff



worst case register classes (Was: Re: IRA_COVER_CLASSES for m32c)

2008-09-12 Thread Joern Rennecke
As I've said before, m32c is probably a "worst case" scenario for gcc
as it has not one, not two, not even three, but FOUR different types
of registers (8/16 bit general, 16 bit only general, 24 bit address
registers, and control (incl $fp) registers), and only a small number
(2) of each.

I think our mxp is more 'interesting'.  It has several hundred register
classes, auto-generated, and some important ones have only a single member.

We have a number of vector registers, a flags register, and an accumulator.
All of them are 128 bit wide and an 8-bit mask selects which 16-bit wide
lanes of the 128 bit word are operated on - i.e. 255 different combinations.
A lot of instructions clobber the accumulator and/or the flags
and to make sure that the registers are lined up, I have to make flags and
accumulator allocated registers.
Then the first few of the vector registers are also special because they
can also be used as 16 and 32 bit scalar registers.
A truly complete register class set would have 2^32 register classes even
before getting into such niceties as call used / clobbered regs and sibcalls,
but I considered this slightly impractical considering the amount of
memory that would cost.
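
As a rough illustration of the lane masking described above, here is a minimal C model of a 128-bit register split into eight 16-bit lanes, with an 8-bit mask choosing which lanes an operation touches. This is only a sketch: the type `vec128`, the functions `masked_add` and `count_nonempty_masks`, and the choice of an add as the operation are all hypothetical and are not taken from the mxp port.

```c
#include <stdint.h>

#define LANES 8 /* 8 lanes x 16 bits = 128 bits */

/* Hypothetical model of one 128-bit vector register. */
typedef struct { uint16_t lane[LANES]; } vec128;

/* Write a + b into dst, but only in the lanes whose mask bit is set.
   Lanes with a clear mask bit keep their previous value in dst. */
void masked_add(vec128 *dst, const vec128 *a, const vec128 *b, uint8_t mask)
{
  for (int i = 0; i < LANES; i++)
    if (mask & (1u << i))
      dst->lane[i] = (uint16_t)(a->lane[i] + b->lane[i]);
}

/* Count the masks that select at least one lane: every 8-bit value
   except zero, which is where the 255 combinations come from. */
int count_nonempty_masks(void)
{
  int n = 0;
  for (unsigned m = 1; m <= 0xFFu; m++)
    n++;
  return n;
}
```

With a mask of 0x05 only lanes 0 and 2 are written; each distinct non-zero mask is one of the 255 combinations mentioned above.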


Re: worst case register classes (Was: Re: IRA_COVER_CLASSES for m32c)

2008-09-12 Thread Paolo Bonzini

> I think our mxp is more 'interesting'. [snip]

I think it's more like 'insane', :-) and a miracle that a retargetable
compiler can be ported to it.

Paolo


Re: extra instructions lost from -O0 to -O1

2008-09-12 Thread Thomas A.M. Bernard
Well, I found another way to solve the problem: changing DCE so that it 
does not remove my instruction.


I inserted "setallocate" as a native operator in the back-end; it 
comes from a GIMPLE node and maps to the RTL pattern. Earlier in the 
discussion, it was pointed out that DCE was removing the 
instruction when -O1 was enabled. To solve that, in 
'tree-ssa-dce.c', I flagged this node with the function 
"mark_stmt_necessary". It works fine so far: the instruction is no 
longer omitted by DCE :-)


Thanks for your help guys.
Thomas

Ian Lance Taylor wrote:

"Thomas A.M. Bernard" <[EMAIL PROTECTED]> writes:

  

I guess I am missing something here. I've tried the following as Paolo
suggested,

(define_insn "setallocate"
  [(unspec_volatile:DI [(match_operand:DI 0 "general_operand" "r")]
                       UNSPEC_ALLOCATE)]
  ""
  "allocate %0\t\t#TCB_INSTRUCTIONS"
  [(set_attr "type" "multi")])

When flag -O0 is on, everything's fine. But when flag -O1 is engaged,
the instruction is still omitted. Something missing ?



The only way that gcc will remove an unspec_volatile instruction is if it
is on a code path which is never executed.
  Look at the RTL dump files (from, e.g., -fdump-rtl-all), see where
it is disappearing, and why.
  
  

With this pattern for setallocate, when the flag -O1 is engaged, the
instruction is already omitted just at the expansion. As Ian
mentioned, I suspect the problem comes from the fact that the compiler
thinks this code path won't be executed. I presume this should be done
at the CFG level. I added in CFG a "node" which describes
setallocate. Any clue how to say explicitly that this will be executed
in any case?



The conventional way to get a special purpose instruction through the
tree code is to use a builtin function.  For example, look at the
calls to __builtin_XXX in config/i386/mmintrin.h (a header file
included by target programs) and the associated code in
config/i386/i386.c.

Ian
  


Re: extra instructions lost from -O0 to -O1

2008-09-12 Thread Paolo Bonzini
Thomas A.M. Bernard wrote:
> Well I found another way to solve the problem by updating the dce for
> not taking out my instructions.
> 
> I inserted "setallocate" as a native operator in the back-end which
> comes from a GIMPLE node and map to the RTL pattern. Earlier in the
> discussion, it's been discussed that the dce was taking out the
> instruction when flag -O1 was engaged. To solve that, in
> 'tree-ssa-dce.c', I flagged this node with the function,
> "mark_stmt_necessary". And it works fine so far. The instruction is not
> omitted anymore by the dce :-)

Do not add it as a GIMPLE node.  Add it as a builtin function, so that
the tree-level DCE will treat it like every other call and not remove it.

IOW, do not add new kinds of node.  Use builtins for trees, and unspecs
for RTL.
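
A minimal sketch of what the user-visible side of that advice could look like, in the style of the inline wrappers in config/i386/mmintrin.h. The builtin name `__builtin_setallocate` is hypothetical (the port would have to register it in its backend and expand it to the unspec_volatile pattern); the fallback branch, `HAVE_SETALLOCATE_BUILTIN` and `last_allocate_arg` are stand-ins invented here so that the sketch also compiles with a stock compiler.

```c
/* Older compilers have no __has_builtin; treat every builtin as absent. */
#ifndef __has_builtin
#define __has_builtin(x) 0
#endif

#if __has_builtin(__builtin_setallocate)
/* Hypothetical builtin provided by the port: the call is an ordinary
   call as far as tree-level DCE is concerned, so it is not removed. */
#define HAVE_SETALLOCATE_BUILTIN 1
static inline void setallocate(long long tcb)
{
  __builtin_setallocate(tcb);
}
#else
/* Stand-in for compilers without the builtin: just record the argument. */
#define HAVE_SETALLOCATE_BUILTIN 0
static long long last_allocate_arg;
static inline void setallocate(long long tcb)
{
  last_allocate_arg = tcb;
}
#endif
```

Target programs would include a header like this and call setallocate(); no new tree node and no change to tree-ssa-dce.c is involved.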

Paolo


Problems building Windows hosted mips-elf toolchain using Linux as build machine

2008-09-12 Thread Øyvind Harboe
I'm trying to build a mips-elf toolchain hosted on Windows using
Linux as the build machine but I'm running into the following error:



mips-elf-gcc -nostdinc -isystem /tmp/gccbuild/build/gcc/./gcc/include
-B/tmp/gccbuild/build/gcc/mips-elf/newlib/ -isystem
/tmp/gccbuild/build/gcc/mips-elf/newlib/targ-include -isystem
/tmp/gccbuild/src/gcc/newlib/libc/include
-B/tmp/gccbuild/build/gcc/mips-elf/libgloss/mips
-L/tmp/gccbuild/build/gcc/mips-elf/libgloss/libnosys
-L/tmp/gccbuild/src/gcc/libgloss/mips -O2 -g -g -O2 -msoft-float -O2
-O2 -g -g -O2   -DIN_GCC -DCROSS_DIRECTORY_STRUCTURE   -W -Wall
-Wwrite-strings -Wstrict-prototypes -Wmissing-prototypes
-Wold-style-definition  -isystem ./include  -G 0 -g  -DIN_LIBGCC2
-D__GCC_FLOAT_NOT_NEEDED -Dinhibit_libc  -I. -I. -I../../.././gcc
-I/tmp/gccbuild/src/gcc/libgcc -I/tmp/gccbuild/src/gcc/libgcc/.
-I/tmp/gccbuild/src/gcc/libgcc/../gcc
-I/tmp/gccbuild/src/gcc/libgcc/../include  -DHAVE_CC_TLS -o
_fixunssfsi.o -MT _fixunssfsi.o -MD -MP -MF _fixunssfsi.dep
-DL_fixunssfsi -c /tmp/gccbuild/src/gcc/libgcc/../gcc/libgcc2.c  \
-DLIBGCC2_UNITS_PER_WORD=4
In file included from /tmp/gccbuild/src/gcc/libgcc/../gcc/libgcc2.c:1697:
/tmp/gccbuild/src/gcc/newlib/libc/include/limits.h:130:26: error: no
include path in which to search for limits.h
make[4]: *** [_fixunssfsi.o] Error 1
make[4]: Leaving directory `/tmp/gccbuild/build/gcc/mips-elf/soft-float/libgcc'
make[3]: *** [multi-do] Error 1
make[3]: Leaving directory `/tmp/gccbuild/build/gcc/mips-elf/libgcc'
make[2]: *** [all-multi] Error 2
make[2]: Leaving directory `/tmp/gccbuild/build/gcc/mips-elf/libgcc'
make[1]: *** [all-target-libgcc] Error 2
make[1]: Leaving directory `/tmp/gccbuild/build/gcc'
make: *** [all] Error 2
[EMAIL PROTECTED]:/tmp/toolchain/build$



What I did:

- built toolchain hosted on Linux, generating mips-elf target code. Added it
to the path.
- ran script below
- gcc 4.3.2, gmp 4.3.2, mpfr 2.3.1, newlib CVS HEAD(>16.0), binutils 2.18
- I messed around with trying to add the right include files without too much
luck. The problem is that I don't really know what's wrong...
- trying to build a GCC toolchain under Windows seems like a complete
non-starter
- tried GCC CVS HEAD
- tried some other targets without luck, same result. (arm-elf, powerpc-eabi).

The most recent google hits I've found:

- http://sourceware.org/ml/libc-alpha/2007-03/msg00017.html


# Build toolchain for $1 target

set -e
rm -rf $GCCBUILD/gcc
#rm -rf $GCCBUILD/binutils
rm -rf $GCCBUILD/gdb
rm -rf $GCCBUILD/install

export PATH=$GCCBUILD/install/bin:$PATH

export TARGET=$1

export HOST_OPTION="--host=i586-mingw32msvc --build=i686-pc-linux-gnu"




#mkdir $GCCBUILD/binutils
#cd $GCCBUILD/binutils
#$GCCSRC/binutils/configure $HOST_OPTION --enable-multilib \
#  --enable-interwork --target=$TARGET  --prefix=$GCCBUILD/install
#make
#make install
#cd ..


mkdir $GCCBUILD/gcc
cd $GCCBUILD/gcc
$GCCSRC/gcc/configure  $HOST_OPTION --disable-libssp --target=$TARGET \
  --enable-languages=c,c++ --with-gnu-as --with-gnu-ld --with-newlib \
  --disable-shared --enable-newlib -v  --enable-multilib \
  --disable-threads --enable-sjlj-exceptions \
  --enable-libstdcxx-allocator=malloc  --prefix=$GCCBUILD/install \
  --enable-interwork --with-gmp=$GCCBUILD/gmp --with-mpfr=$GCCBUILD/mpfr
make
make install
cd ..

mkdir $GCCBUILD/gdb
cd $GCCBUILD/gdb
$GCCSRC/gdb/configure $HOST_OPTION  --target=$TARGET  --prefix=$GCCBUILD/install
make
make install
cd ..

-- 
Øyvind Harboe
http://www.zylin.com/zy1000.html
ARM7 ARM9 XScale Cortex
JTAG debugger and flash programmer


Re: IRA_COVER_CLASSES for m32c

2008-09-12 Thread Vladimir Makarov

DJ Delorie wrote:

Opening this up to the gcc public, since I appear to be unable to get
this to work right.


Still no luck defining a working IRA_COVER_CLASSES for m32c.  My
latest attempt:

#define IRA_COVER_CLASSES \
{ \
  HC_REGS, MEM_REGS, LIM_REG_CLASSES\
}

(effectively GENERAL_REGS (which I also tried), but MEM_REGS regs are
implemented as memory locations, so I had tried without them) results
in this build failure in newlib:

m32c-elf-gcc -B/greed/dj/m32c/newlib/m32c-elf/m32c-elf/m32cm/newlib/ -isystem 
/greed/dj/m32c/newlib/m32c-elf/m32c-elf/m32cm/newlib/targ-include -isystem /greed/dj/m32c/newlib/src/newlib/libc/include 
-B/greed/dj/m32c/newlib/m32c-elf/m32c-elf/m32cm/libgloss/m32c -L/greed/dj/m32c/newlib/m32c-elf/m32c-elf/m32cm/libgloss/libnosys 
-L/greed/dj/m32c/newlib/src/libgloss/m32c  -mcpu=m32cm -DPACKAGE_NAME=\"newlib\" -DPACKAGE_TARNAME=\"newlib\" 
-DPACKAGE_VERSION=\"1.16.0\" -DPACKAGE_STRING=\"newlib\ 1.16.0\" -DPACKAGE_BUGREPORT=\"\" -I. 
-I../../../../../../src/newlib/libc/stdlib -Os -DPREFER_SIZE_OVER_SPEED -DABORT_PROVIDED -DSMALL_MEMORY -DMISSING_SYSCALL_NAMES 
-fno-builtin  -g -O2-mcpu=m32cm -DINTERNAL_NEWLIB -DDEFINE_MALLOC -c ../../../../../../src/newlib/libc/stdlib/mallocr.c 
-o lib_a-mallocr.o
../../../../../../src/newlib/libc/stdlib/mallocr.c: In function '_malloc_r':
../../../../../../src/newlib/libc/stdlib/mallocr.c:2588: error: unable to find 
a register to spill in class 'HL_REGS'
../../../../../../src/newlib/libc/stdlib/mallocr.c:2588: error: this is the 
insn:
(insn 661 660 662 101 ../../../../../../src/newlib/libc/stdlib/mallocr.c:2194 
(set (reg:HI 0 r0 [262])
(and:HI (subreg:HI (reg:PSI 5 a1 [258]) 0)
(const_int 127 [0x7f]))) 26 {andhi3_24} (expr_list:REG_DEAD 
(reg:PSI 5 a1 [258])
(nil)))
../../../../../../src/newlib/libc/stdlib/mallocr.c:2588: internal compiler 
error: in spill_failure, at reload1.c:2093


As I've said before, m32c is probably a "worst case" scenario for gcc
as it has not one, not two, not even three, but FOUR different types
of registers (8/16 bit general, 16 bit only general, 24 bit address
registers, and control (incl $fp) registers), and only a small number
(2) of each.

I'm beginning to suspect that anyone doing ANYTHING with register
allocation or reload should include m32c in their testing, as it seems
to have broken every time those got changed.

Vlad, could you try your hand at this?  Please? :-)

  
Sure, DJ.  I'll look at this, but unfortunately I can only do it next week 
because I am busy with numerous other IRA bugs.  As I wrote, m32c is a 
pretty nasty case and may even need insn description changes.


If we find that RA is not possible for m32c or other weird targets 
(Joern wrote about one of them) with non-intersecting register classes 
(cover classes), I could implement priority coloring for intersecting 
register classes.  I think it will be about 100 lines of additional 
code.  But I'd rather avoid this scenario.




Re: IRA_COVER_CLASSES for m32c

2008-09-12 Thread DJ Delorie

> Sure, DJ.  I'll look at this but unfortunately I can do it on next week 
> because I am busy with numerous other IRA bugs.

Next week would be fine :-)

> As I wrote m32c is pretty nasty case and may be will need even insn
> description changes.

I'm OK with that.


Re: IRA_COVER_CLASSES for m32c

2008-09-12 Thread Jeff Law

DJ Delorie wrote:

Opening this up to the gcc public, since I appear to be unable to get
this to work right.


Still no luck defining a working IRA_COVER_CLASSES for m32c.  My
latest attempt:

#define IRA_COVER_CLASSES \
{ \
  HC_REGS, MEM_REGS, LIM_REG_CLASSES\
}
  

[ ... ]
I actually got a failure building libgcc.

The first thing I'll note is the CR register class.  If I read the 
port correctly, it's got 3 members: SP, FP and one non-fixed 
allocatable register. 

The first thing I'd recommend would be to remove SP from that class.  
You can't ever allocate the SP register, so there's little point in 
including it in the CR class.  Removing it may give some of the tiny 
register class heuristics a chance to try and avoid CR regs when the 
frame pointer is not being eliminated.


Second, you don't generally want the CR class to be the goal class for 
reload.  ISTM that alternatives which consist solely of CR_REGS ought to 
have a '!' prefix in their constraint.  Adding that gets me to the same 
failure you're seeing building m32cm mallocr.


Jeff



Bootstrap failure in sparc-sun-solaris2.10

2008-09-12 Thread Arthur Haas
Hi.

Even with the patch listed in the bug report below my bootstrap would
still fail when the compiler tries to build libgcc.

The bug report for the bootstrap failure is here:

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=37424

The build failed while building libgcc - the code would trip up on an
assert in haifa-sched.c, similar to the message posted here, though now
the assert is on line 2318:

http://gcc.gnu.org/ml/gcc/2008-09/msg00106.html

Applying the patch Adam created and posted in the message below resolved
the issue and the compiler successfully bootstrapped:

http://gcc.gnu.org/ml/gcc/2008-09/msg00139.html

There was one reply to this message; I don't know if the patch is being
reworked or been formally submitted yet, but it did fix my build.

Art Haas


Re: Bootstrap failure in sparc-sun-solaris2.10

2008-09-12 Thread Eric Botcazou
> Even with the patch listed in the bug report below my bootstrap would
> still fail when the compiler tries to build libgcc.
>
> The bug report for the bootstrap failure is here:
>
> http://gcc.gnu.org/bugzilla/show_bug.cgi?id=37424

That patch is not the fix, this is explained in the audit trail.  The fix is

2008-09-11  Jeff Law <[EMAIL PROTECTED]>

* reload1.c (alter_reg): Undo the BYTE_BIG_ENDIAN correction performed
by assign_stack_local on the IRA path for stack slot sharing
as well as the non-IRA path.

> The build failed while building libgcc - the code would trip up on an
> assert in haifa-sched.c, similar to the message posted here, though now
> the assert is on line 2318:
>
> http://gcc.gnu.org/ml/gcc/2008-09/msg00106.html
>
> Applying the patch Adam created and posted in the message below resolved
> the issue and the compiler successfully bootstrapped:
>
> http://gcc.gnu.org/ml/gcc/2008-09/msg00139.html

Thanks for reporting this.  I now can close PR 37424.

> There was one reply to this message; I don't know if the patch is being
> reworked or been formally submitted yet, but it did fix my build.

OK, I'll take a look.

-- 
Eric Botcazou


Announce: MPFR 2.3.2 is released

2008-09-12 Thread Vincent Lefevre
MPFR 2.3.2 is now available for download from the MPFR web site:

  http://www.mpfr.org/mpfr-2.3.2/

Thanks very much to those who sent us bug reports and/or tested
the release candidate.

The MD5's:
e02dff02dbcc813572395f20a89c4d96  mpfr-2.3.2.tar.lzma
527147c097874340cb9cee0579dacf3b  mpfr-2.3.2.tar.bz2
3559d1713b97baef53f241c374be291a  mpfr-2.3.2.tar.gz
ff18379f950ccbef6794bd10ecf77100  mpfr-2.3.2.zip

Changes from version 2.3.1 to version 2.3.2:
- Bug fixes; see .
- Improved MPFR manual.
- Behavior of mpfr_check_range changed: if the value is an inexact
  infinity, the overflow flag is set (in case it was lost).
- Function mpfr_init_gmp_rand (only defined when building MPFR without
  the --with-gmp-build configure option) is no longer defined at all.
  This function was private and not documented, and was used only in
  the MPFR test suite. User code that calls it is regarded as broken
  and may fail as a consequence. Running the old test suite against
  MPFR 2.3.2 may also fail.

It has also been decided to change a small part of the copyright
notice of the MPFR manual, so that the manual can be included in
Debian (together with the free packages).

  with no Invariant Sections, with the Front-Cover Texts being "A GNU
  Manual", and with the Back-Cover Texts being "You have freedom to
  copy and modify this GNU Manual, like GNU software".

has been changed to:

  with no Invariant Sections, with no Front-Cover Texts, and with no
  Back-Cover Texts.

You can send success and failure reports to <[EMAIL PROTECTED]>, and
give us the canonical system name as returned by the config.guess
script, the processor and compiler version, in order to complete
the "Platforms Known to Support MPFR" section of the MPFR 2.3.2
web page.

Regards,

-- 
Vincent Lefèvre <[EMAIL PROTECTED]> - Web: 
100% accessible validated (X)HTML - Blog: 
Work: CR INRIA - computer arithmetic / Arenaire project (LIP, ENS-Lyon)


Re: IRA performance regressions on PPC

2008-09-12 Thread Vladimir Makarov

Vladimir Makarov wrote:

Luis Machado wrote:


Upon further investigation on facerec's regression, it looks like the
code generated by the IRA-enabled gcc has many more spills than the one
with a disabled IRA, twice or sometimes three times more.

I'm trying to reduce the testcase a bit further so it's simpler to
analyse.

  


I am going to look at facerec too.  Only, please, don't expect all 
this problem will be solved soon.





 Analysis of the 187.facerec problem was actually easier than the applu one.
It has one very hot (80%) function localmove::graphRoutines.f90 and
there is only one hot loop in the function, although the loop is
pretty big because of inlining TopCostFct.  The loop contains a few
if-statements and several switch-statements, but only part of the loop
body is hot.

 After comparing the hot loop parts I found that IRA generates
about 20 more insns, some of which are stores but mostly loads.  I did not
find any problem with spilling in reload (that is after fixing one
spilling problem about which I wrote in my previous email).  Reload
spills only two registers which live throughout the hot parts.

 So I concluded that the problem was actually in IRA allocation.  I
did not find anything wrong in the IRA implementation of the coloring
algorithm.  Changing the spill-first heuristic there (we choose the
allocno with the smaller number to spill) from

   SPILL_COST / (NUMBER_OF_LEFT_CONFLICTS * NUMBER_HARD_REGISTER_NEEDED + 1)


to

   SPILL_COST * log (NREFS) / (NUMBER_OF_LEFT_CONFLICTS
   * NUMBER_HARD_REGISTER_NEEDED
   * LIVE_RANGE_LENGTH + 1)

increased the facerec score on Power6 from 1760 to 2039 (vs 1850 for the
old RA).  It looks very good, but unfortunately the overall SPECFP2000
score did not change much and SPECINT2000 did not change at all.
Some programs' scores became better; the rest became worse.
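
To make the two formulas above concrete, here is a small C sketch of both spill priorities. It is an illustration under stated assumptions, not IRA's code: the struct and function names are hypothetical, and because the message does not say which base log (NREFS) uses, the sketch assumes an integer floor of log base 2.

```c
/* Hypothetical per-allocno data; the field names mirror the terms in the
   formulas above and are not the names used inside IRA. */
struct allocno_info {
  double spill_cost;
  int left_conflicts;    /* NUMBER_OF_LEFT_CONFLICTS */
  int hard_regs_needed;  /* NUMBER_HARD_REGISTER_NEEDED */
  int nrefs;             /* NREFS */
  int live_range_length; /* LIVE_RANGE_LENGTH */
};

/* Assumption: floor(log2(n)); returns 0 for n <= 1. */
int floor_log2_int(int n)
{
  int r = 0;
  while (n > 1) {
    n >>= 1;
    r++;
  }
  return r;
}

/* Old heuristic: SPILL_COST / (CONFLICTS * REGS_NEEDED + 1). */
double old_spill_priority(const struct allocno_info *a)
{
  return a->spill_cost
         / ((double)a->left_conflicts * a->hard_regs_needed + 1.0);
}

/* New heuristic: SPILL_COST * log (NREFS)
   / (CONFLICTS * REGS_NEEDED * LIVE_RANGE_LENGTH + 1). */
double new_spill_priority(const struct allocno_info *a)
{
  return a->spill_cost * floor_log2_int(a->nrefs)
         / ((double)a->left_conflicts * a->hard_regs_needed
            * a->live_range_length + 1.0);
}
```

The allocno with the smaller value is spilled first. With made-up numbers, a short-lived allocno referenced 64 times (cost 100, 4 conflicts, length 2) is spilled before a long-lived one referenced twice (cost 120, 4 conflicts, length 40) under the old formula, while the new formula reverses that order.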

 This is a classical example of the drawback of a heuristic approach.  You
never get the best code using one heuristic for all programs.  Sometimes
the old RA will generate better code.  What our goal should be is to
achieve better code on "average", i.e. for some credible benchmark like
SPECFP2000, and we can achieve this with IRA.

 It is hard to say what heuristic will work best for a given program
and to make the choice automatically (especially without profiling,
because the branch probability prediction algorithm is not that accurate).
We could, though, add an option to choose different spill heuristics
in the hope of achieving the best overall score using a machine learning
algorithm like those Google or Netflix use to find better predictions
(e.g. neighborhood based search).  There is one promising project for
GCC using this approach.  It was reported by Grigory Fursin at this
year's GCC Summit.  I think it is promising because different spill
heuristics result in very different times (about 20% for facerec).

 Probably I'll submit the new heuristic because the SPECFP2000 score is a
bit better (about 0.5%) with it.  But I need some time to
check it on other platforms.



Re: Bootstrap failure in sparc-sun-solaris2.10

2008-09-12 Thread Adam Nemet
Eric Botcazou <[EMAIL PROTECTED]> writes:
>> Applying the patch Adam created and posted in the message below resolved
>> the issue and the compiler successfully bootstrapped:
>>
>> http://gcc.gnu.org/ml/gcc/2008-09/msg00139.html
>
> Thanks for reporting this.  I now can close PR 37424.
>
>> There was one reply to this message; I don't know if the patch is being
>> reworked or been formally submitted yet, but it did fix my build.
>
> OK, I'll take a look.

Yes it was formally submitted here, no review so far:

  http://gcc.gnu.org/ml/gcc-patches/2008-09/msg00574.html

Adam


gdb test suite failure on i386 and x86_64 in gdb.base/break.exp

2008-09-12 Thread Cary Coutant
There are a couple of failures in the gdb test suite on i386 and
x86_64 with gcc 4.3.0 or newer. The tests gdb.base/break.exp and
gdb.base/sepdebug.exp are failing when a function begins with a while
loop. The into_cfg_layout_mode pass was added in 4.3.0 (see
http://gcc.gnu.org/ml/gcc-patches/2007-03/msg00687.html), and that
pass removes the instruction that jumps from the end of the first
basic block to the bottom of the while loop. The outof_cfg_layout_mode
pass reintroduces the branch (in force_nonfallthru_and_redirect) once
the basic blocks have been laid out, but the source location
information has been lost by that point.

When you try to set a breakpoint at the beginning of the function, gdb
looks for the second row in the line table (it skips the first to get
past the prologue), and sets the breakpoint there. Because of the
missing locator on the jump, the second row is now the interior of the
while loop, and the breakpoint is in the wrong place.

Here's a reduced test case:

void foo(int a)
{
  while (a) { // line 3
    a--;      // line 4
  }
}

If you compile this (for x86_64) with a top-of-trunk gcc with -S -g,
you can see that the jmp to .L2 has no .loc directive in front of it,
and the first .loc directive is now the one for the body of the while
loop:

        .file 1 "foo.cc"
        .loc 1 1 0
        pushq   %rbp
.LCFI0:
        movq    %rsp, %rbp
.LCFI1:
        movl    %edi, -4(%rbp)
        jmp     .L2
.L3:
        .loc 1 4 0
        subl    $1, -4(%rbp)
.L2:
        .loc 1 3 0
        cmpl    $0, -4(%rbp)
        jne     .L3
        .loc 1 6 0
        leave
        ret

For comparison, here's the output from gcc 4.2.1:

        .file 1 "foo.cc"
        .loc 1 1 0
        pushq   %rbp
.LCFI0:
        movq    %rsp, %rbp
.LCFI1:
        movl    %edi, -4(%rbp)
        .loc 1 3 0
        jmp     .L2
.L3:
        .loc 1 4 0
        subl    $1, -4(%rbp)
.L2:
        .loc 1 3 0
        cmpl    $0, -4(%rbp)
        jne     .L3
        .loc 1 6 0
        leave
        ret
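
The effect of the missing .loc on the "second row" heuristic can be modelled with a few lines of C. This is a simplified sketch of the behaviour as described in this message, not gdb's actual code: `struct line_row`, `second_row_breakpoint` and the byte offsets in the two tables are illustrative assumptions; only the line numbers are taken from the two listings above.

```c
#include <stddef.h>

/* One row of a simplified line table: code offset and source line. */
struct line_row {
  unsigned offset;
  unsigned line;
};

/* Simplified model: skip the first row (function entry) and place the
   breakpoint at the second row.  Returns NULL if there is no second row. */
const struct line_row *second_row_breakpoint(const struct line_row *rows,
                                             size_t n)
{
  return n >= 2 ? &rows[1] : NULL;
}

/* gcc 4.2.1 listing: the jmp carries ".loc 1 3 0" (offsets illustrative). */
const struct line_row rows_gcc_4_2_1[] = {
  { 0, 1 }, { 7, 3 }, { 9, 4 }, { 13, 3 }, { 19, 6 }
};

/* Trunk listing: the jmp has no .loc, so the row for line 3 at the jmp
   is missing and the body of the loop becomes the second row. */
const struct line_row rows_trunk[] = {
  { 0, 1 }, { 9, 4 }, { 13, 3 }, { 19, 6 }
};
```

With the 4.2.1 table the breakpoint lands on line 3 (the jump to the loop test); with the trunk table it lands on line 4, inside the loop body.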

I've tried changing force_nonfallthru_and_redirect (in cfgrtl.c) to
use the e->goto_locus field as the location for the reintroduced jump,
but that seems to mark the jump with line #6 (goto_locus might not
even be valid yet at this point, I'm told, and I'm not even sure that
a locus can be used where an INSN_LOCATOR is expected -- the
location_from_locus macro was removed). I've also tried looking
through the target bb's instruction list to find the first instruction
with an INSN_LOCATOR and using that for the locator of the jump -- it
fixed this problem, but broke other tests because now a forward branch
in other contexts (if-then-else, for example) gets the line number of
its target, and gdb will now use that branch as the breakpoint
location for that line number.

I'd argue that gcc really ought to be flagging the end of the prologue
-- there's a debug hook for that, and it's used by most of the debug
formats, but not by DWARF. The DWARF spec was extended (in version 3)
to allow the line number table to indicate the end of prologue, so gcc
(and gas) ought to be updated to record it in the line table, and gdb
ought to be taught to use that in lieu of looking for the second row
in the line table. Until all that happens, though, I think a quicker
fix is necessary.
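
For illustration, here is how a consumer might use such a marker. DWARF 3 adds a prologue_end flag to the line number state machine (set by the DW_LNS_set_prologue_end opcode); the sketch below assumes a decoded table where each row carries that flag. The struct, the function name and the fallback policy are hypothetical and do not describe gdb's implementation.

```c
#include <stdbool.h>
#include <stddef.h>

/* Decoded line table row with the DWARF 3 prologue_end flag. */
struct pe_line_row {
  unsigned offset;
  unsigned line;
  bool prologue_end;
};

/* Prefer the first row flagged prologue_end; if the producer did not
   emit one, fall back to the second-row heuristic.  Returns NULL when
   neither is available. */
const struct pe_line_row *breakpoint_after_prologue(
    const struct pe_line_row *rows, size_t n)
{
  for (size_t i = 0; i < n; i++)
    if (rows[i].prologue_end)
      return &rows[i];
  return n >= 2 ? &rows[1] : NULL;
}
```

With an explicit marker the choice no longer depends on which instruction happens to own the second row, so a jump without a locator would not move the breakpoint.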

Any suggestions?

-cary


Re: gdb test suite failure on i386 and x86_64 in gdb.base/break.exp

2008-09-12 Thread Joseph S. Myers
On Fri, 12 Sep 2008, Cary Coutant wrote:

> There are a couple of failures in the gdb test suite on i386 and
> x86_64 with gcc 4.3.0 or newer. The tests gdb.base/break.exp and
> gdb.base/sepdebug.exp are failing when a function begins with a while
> loop. The into_cfg_layout_mode pass was added in 4.3.0 (see

This is PR 36690 which has various bits of analysis.

-- 
Joseph S. Myers
[EMAIL PROTECTED]


Re: IRA performance regressions on PPC

2008-09-12 Thread Luis Machado
Hi Vladimir,

Firstly, thanks for looking into this.

>   Analysis of the 187.facerec problem was actually easier than the applu one.
> It has one very hot (80%) function localmove::graphRoutines.f90 and
> there is only one hot loop in the function, although the loop is
> pretty big because of inlining TopCostFct.  The loop contains a few
> if-statements and several switch-statements, but only part of the loop
> body is hot.

I've been chasing the specific portion of code that performs badly and
I've come to the same conclusion. I could isolate some parts of the loop
that have a greater effect on the performance, but it's still a big
loop. I was going to open a PR for this. It might be a good idea to keep
track of the progress, and we can attach the testcases there.

>   After comparing the hot loop parts I found that IRA generates
> about 20 more insns, some of which are stores but mostly loads.  I did not
> find any problem with spilling in reload (that is after fixing one
> spilling problem about which I wrote in my previous email).  Reload
> spills only two registers which live throughout the hot parts.

I've seen quite a number of loads/stores using the same one or two
registers over and over again. For example, r12 was heavily used during
that hot loop, according to what I saw.

>   So I concluded that the problem was actually in IRA allocation.  I
> did not find anything wrong in the IRA implementation of the coloring
> algorithm.  Changing the spill-first heuristic there (we choose the
> allocno with the smaller number to spill) from
> 
> SPILL_COST / (NUMBER_OF_LEFT_CONFLICTS * NUMBER_HARD_REGISTER_NEEDED + 1)
> 
> to
> 
> SPILL_COST * log (NREFS) / (NUMBER_OF_LEFT_CONFLICTS
> * NUMBER_HARD_REGISTER_NEEDED
> * LIVE_RANGE_LENGTH + 1)
> 
> increased the facerec score on Power6 from 1760 to 2039 (vs 1850 for the
> old RA).  It looks very good, but unfortunately the overall SPECFP2000
> score did not change much and SPECINT2000 did not change at all.
> Some programs' scores became better; the rest became worse.
> 
>   This is a classical example of the drawback of a heuristic approach.  You
> never get the best code using one heuristic for all programs.  Sometimes
> the old RA will generate better code.  What our goal should be is to
> achieve better code on "average", i.e. for some credible benchmark like
> SPECFP2000, and we can achieve this with IRA.
> 
>   It is hard to say what heuristic will work best for a given program
> and to make the choice automatically (especially without profiling,
> because the branch probability prediction algorithm is not that accurate).
> We could, though, add an option to choose different spill heuristics
> in the hope of achieving the best overall score using a machine learning
> algorithm like those Google or Netflix use to find better predictions
> (e.g. neighborhood based search).  There is one promising project for
> GCC using this approach.  It was reported by Grigory Fursin at this
> year's GCC Summit.  I think it is promising because different spill
> heuristics result in very different times (about 20% for facerec).
> 
>   Probably I'll submit the new heuristic because the SPECFP2000 score is a
> bit better (about 0.5%) with it.  But I need some time to
> check it on other platforms.

If you have any patches that you think are worth testing, let me know so
I can give them a go. I'll be following closely the progress on this
topic.

Thanks,
Luis


