Re: Mudflap and freeing C runtime memory upon exit (feature request)

2006-08-05 Thread Vesselin Peev

[EMAIL PROTECTED] (Frank Ch. Eigler) wrote:


> "Vesselin Peev" <[EMAIL PROTECTED]> writes:

> [...]  I have a feature request for mudflap. It should have an
> option to run glibc's __libc_freeres function that forces the C
> runtime library to free all of its memory [...]

Good idea.  (It should not take more than a dozen lines of code - a
threshold below which one may not even need a copyright assignment in
order to contribute.)


I get the message :). Thanks for the pointer, I'll go for it. Once done, 
I'll most likely have to get back to the appropriate list to help resolve 
the "mudflap warning: unaccessed registered object" problem that I 
described.
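
For reference, a minimal sketch of what such an option could do, written
as something a program can try on its own. It assumes glibc, which
exports __libc_freeres() for memory debuggers without declaring it in a
public header. The helper names are hypothetical, and this is not
mudflap's implementation.

```c
#include <stdlib.h>

/* glibc exports __libc_freeres() but no public header declares it, so it
   is declared here.  The weak attribute (a GCC extension) lets the program
   still link against a C library that does not provide the symbol. */
extern void __libc_freeres(void) __attribute__((weak));

/* Exit handler: ask glibc to release its internal allocations, if the
   function is available in this C library. */
static void release_libc_memory(void)
{
    if (__libc_freeres)
        __libc_freeres();
}

/* Hypothetical helper: register the handler.  Returns 0 on success, like
   atexit().  Handlers run in reverse order of registration, so calling
   this early in main() makes the release happen late during exit. */
int install_libc_freeres_at_exit(void)
{
    return atexit(release_libc_memory);
}
```

Calling install_libc_freeres_at_exit() once near the start of main() is
enough; a leak checker that reports at process exit should then see fewer
allocations attributed to the C library itself.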





Re: libgcc-math specification

2006-08-05 Thread Richard Guenther

On 8/4/06, Sashan Govender <[EMAIL PROTECTED]> wrote:

Hi

Is there a specification that describes a set of routines for
libgcc-math? I read through previous emails on this topic and it seems
that it has been removed from head. I'd like to contribute but not
sure what direction to go in. Is there a specific branch that needs
checking out?


It is not on a branch at the moment, but is supposed to re-appear once
4.2 branches.  Also there is not a fixed set of routines at the moment, but
what is used by the patch supporting vectorization of math intrinsics.

Richard.


Can't use exceptions on non-mainstream targets

2006-08-05 Thread Aaron Graham

It appears that sjlj exceptions are broken (PRs 19774/25266/28493).
I'm not sure this is true for _every_ target, but it appears to be
true for many.

So it seems that those of us affected by this bug have three workarounds:
1) compile everything with -fno-exceptions and remove all eh from project code
2) implement dwarf2 eh for the target
3) fix the #19774 regression

Given the relative significance of tasks #1 and #2, I'm surprised that
a fix for #19774 isn't higher priority.  In any case, I am attempting
to fix it myself.  I have been dumping trees and have narrowed it down
to either the eh or vregs RTL pass, and would be glad to hear if
anyone else has information on this problem.

Aaron


Re: Can't use exceptions on non-mainstream targets

2006-08-05 Thread Paolo Bonzini

Aaron Graham wrote:

It appears that sjlj exceptions are broken (PRs 19774/25266/28493).
I'm not sure this is true for _every_ target, but it appears to be
true for many.


The real bug is PR28493, not PR19774.  The latter (and PR25266 which is 
related) only affect less common cases, i.e. using alloca or 
variable-length arrays.  Unlike PR19774, which is as old as 3.4, PR28493 
is new to 4.1.0 (as you know because you are the reporter).


I think it is low priority only because nobody has yet confirmed it 
(most likely because most GCC developers work on dw2 targets).


Paolo
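
For readers who have only used dw2 targets, here is a rough sketch in
plain C of the setjmp/longjmp idea that sjlj exception handling is built
on. All names are hypothetical and this is only an illustration of the
control flow, not GCC's unwinder or its generated code.

```c
#include <setjmp.h>
#include <stddef.h>
#include <stdlib.h>

/* Innermost active handler; NULL when no "try" region is active. */
static jmp_buf *current_handler = NULL;

/* Error code of the exception in flight. */
static int pending_code = 0;

/* "throw": record the code and jump to the innermost handler. */
static void throw_error(int code)
{
    if (current_handler == NULL)
        abort();                      /* no handler: terminate */
    pending_code = code;
    longjmp(*current_handler, 1);
}

static int checked_divide(int a, int b)
{
    if (b == 0)
        throw_error(7);
    return a / b;
}

/* "try/catch": returns 0 and stores the quotient on success, or the
   thrown error code if checked_divide() raised one. */
int try_divide(int a, int b, int *result)
{
    jmp_buf handler;
    jmp_buf *saved = current_handler; /* not modified after setjmp */

    current_handler = &handler;
    if (setjmp(handler) == 0) {
        *result = checked_divide(a, b);
        current_handler = saved;      /* leave the "try" region */
        return 0;
    }
    current_handler = saved;          /* arrived here via longjmp */
    return pending_code;
}
```

The cost model is the point: every entry into a "try" region pays for a
setjmp, whereas dw2 unwinding pays nothing until an exception is thrown.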


gcc-4.2-20060805 is now available

2006-08-05 Thread gccadmin
Snapshot gcc-4.2-20060805 is now available on
  ftp://gcc.gnu.org/pub/gcc/snapshots/4.2-20060805/
and on various mirrors, see http://gcc.gnu.org/mirrors.html for details.

This snapshot has been generated from the GCC 4.2 SVN branch
with the following options: svn://gcc.gnu.org/svn/gcc/trunk revision 115951

You'll find:

gcc-4.2-20060805.tar.bz2  Complete GCC (includes all of below)

gcc-core-4.2-20060805.tar.bz2 C front end and core compiler

gcc-ada-4.2-20060805.tar.bz2  Ada front end and runtime

gcc-fortran-4.2-20060805.tar.bz2  Fortran front end and runtime

gcc-g++-4.2-20060805.tar.bz2  C++ front end and runtime

gcc-java-4.2-20060805.tar.bz2 Java front end and runtime

gcc-objc-4.2-20060805.tar.bz2 Objective-C front end and runtime

gcc-testsuite-4.2-20060805.tar.bz2 The GCC testsuite

Diffs from 4.2-20060729 are available in the diffs/ subdirectory.

When a particular snapshot is ready for public consumption the LATEST-4.2
link is updated and a message is sent to the gcc list.  Please do not use
a snapshot before it has been announced that way.


Re: Modifying ARM code generator for elimination of 8bit writes - need help

2006-08-05 Thread Wolfgang Mües
Rask,

On Friday 21 July 2006 15:26, Rask Ingemann Lambertsen wrote:
> I found that this peephole optimization improves the code a whole
> lot:

Done.

> Another way of improving the code was to swap the order of the two
> last alternatives of _arm_movqi_insn_swp.

Done.

Anyway, the problems with reload continue... error: unrecognizable insn

First, I have had a problem with loading a register with a constant.
(no clobber). I have solved this problem by adding

> (define_insn "_arm_movqi_insn_const"
>   [(set (match_operand:QI 0 "register_operand" "=r")
>         (match_operand:QI 1 "const_int_operand" ""))]
>   "TARGET_ARM && TARGET_SWP_BYTE_WRITES
>    && (   register_operand (operands[0], QImode))"
>   "@
>    mov%?\\t%0, %1"
>   [(set_attr "type" "*")
>    (set_attr "predicable" "yes")]
> )

I am very sure that this only cures the symptoms, and that it would be
better to fix this in the reload stage, but at least it worked, and I
was able to compile the whole Linux kernel!

After testing that the kernel is running, I have tried to compile
uClinux. And there is the next problem:

> ../ncurses/./base/lib_set_term.c: In function '_nc_setupscreen':
> ../ncurses/./base/lib_set_term.c:470: error: unrecognizable insn:
> (insn 1199 1198 696 37 ../ncurses/./base/lib_set_term.c:429 (parallel
> [ (set (mem/s/j:QI (reg/f:SI 3 r3 [491]) [0 ._clear+0 S1
> A8]  
>  ) (reg:QI 0 r0))
> (clobber (subreg:QI (reg:DI 11 fp) 0))
> ]) -1 (nil)
> (nil))
> ../ncurses/./base/lib_set_term.c:470: internal compiler error: in
> extract_insn,
> at recog.c:2020 P

The source code line is:

> newscr->_clear = TRUE;

Obviously, TRUE is loaded in r0, but I don't know why this construct 
(storing a byte into a struct member referenced by a pointer) is not
evaluated.

I fear that these problems are creating an endless story, and sorry for 
generating traffic on this list, because I'm still no gcc expert...

On the other hand, the compiler now has generated code from hundreds of 
files, and maybe I'm very near to success now.

regards

Wolfgang
-- 
We're back to the times when men were men 
and wrote their own device drivers.

(Linus Torvalds)
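
The construct that fails in the message above is an ordinary one-byte
store through a pointer. A reduced example of that shape might look like
the sketch below. The struct and function names are hypothetical
stand-ins for the ncurses ones, and it has not been checked that this
reduction reproduces the internal compiler error.

```c
/* Hypothetical stand-in for the ncurses structure: a one-byte member
   (like _clear) sitting between wider members. */
struct screen_like {
    int rows;
    unsigned char clear_flag;
    int cols;
};

/* Same shape as "newscr->_clear = TRUE;": store a byte constant into a
   struct member reached through a pointer. */
void request_clear(struct screen_like *scr)
{
    scr->clear_flag = 1;
}
```

Compiling a file like this on its own with the modified compiler would be
one way to get a much smaller RTL dump to study than the full
lib_set_term.c.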


Re: Modifying ARM code generator for elimination of 8bit writes - need help

2006-08-05 Thread Rask Ingemann Lambertsen
On Sat, Aug 05, 2006 at 09:03:34PM +0200, Wolfgang Mües wrote:

> First, I have had a problem with loading a register with a constant.
> (no clobber). I have solved this problem by adding
> 
> > (define_insn "_arm_movqi_insn_const"
[cut]
> 
> I am very sure that this only cures the symptoms, and that it would be
> better to fix this in the reload stage, but at least it worked, and I
> was able to compile the whole Linux kernel!

Yes, it only cures the symptom, but it could take a lot of time to find the
cause, and the gain is small, so I think it is OK to leave it like this for
now.
 
> After testing that the kernel is running, I have tried to compile
> uClinux. And there is the next problem:
> 
> > ../ncurses/./base/lib_set_term.c: In function '_nc_setupscreen':
> > ../ncurses/./base/lib_set_term.c:470: error: unrecognizable insn:
> > (insn 1199 1198 696 37 ../ncurses/./base/lib_set_term.c:429 (parallel
> > [ (set (mem/s/j:QI (reg/f:SI 3 r3 [491]) [0 ._clear+0 S1
> > A8]  
> >  ) (reg:QI 0 r0))
> > (clobber (subreg:QI (reg:DI 11 fp) 0))
> > ]) -1 (nil)
> > (nil))
> > ../ncurses/./base/lib_set_term.c:470: internal compiler error: in
> > extract_insn,
> > at recog.c:2020 P

This insn was generated from the "reload_outqi" pattern. I don't completely
understand why it isn't recognized. The (subreg:QI (reg:DI 11 fp) 0) part
won't be matched by (match_scratch ...), but simplify_gen_subreg() should
have simplified it to (reg:QI 11 fp) since this is one of the main purposes
of having simplify_(gen_)subreg() in the first place. Try changing

   operands[3] = simplify_gen_subreg (QImode, operands[2], DImode, 0);

into

   operands[3] = gen_rtx_REG (QImode, REGNO (operands[2]));

(in "reload_outqi") and see if that works.

> I fear that these problems are creating an endless story, and sorry for 
> generating traffic on this list, because I'm still no gcc expert...

You shouldn't be sorry about that. GCC provides a good, solid foundation
for learning something new every day.

> On the other hand, the compiler now has generated code from hundreds of 
> files, and maybe I'm very near to success now.

I think so too.

-- 
Rask Ingemann Lambertsen


___divti3 and ___umodti3 missing on Darwin

2006-08-05 Thread Jack Howarth
While testing the state of gfortran in gcc trunk at -m64 on MacOS X 10.4
I discovered a huge number of test failures (848 compared to 26 with -m32).
Almost all of these failures appear to be due to two undefined symbols in
libgfortran's shared library in the ppc64 version...

http://gcc.gnu.org/ml/fortran/2006-08/msg00112.html

The symbols, ___divti3 and ___umodti3, are not present in
darwin-libgcc.10.4.ver or darwin-libgcc.10.5.ver found in
gcc/config/rs6000, but they are present in the libgcc-std.ver in the gcc
directory. I haven't found anything in bugzilla about this issue;
however, it really should be addressed before gcc 4.2 is released.
   Jack


___divti3 and ___umodti3 missing on Darwin

2006-08-05 Thread Jack Howarth
   I can produce the first missing symbol with current gcc trunk on
MacOS X 10.4 using this testcase...

main()
{
    __int128_t a, b;
    b = a % 10;
}

When compiled with "gcc-4 -m64 modulo.c", this produces the linkage
failure of...

can't resolve symbols:
  ___modti3, referenced from:
  _main in cc87vdlF.o
ld64 failed: symbol(s) not found
collect2: ld returned 1 exit status

Hopefully this helps. 
Jack


RE: ___divti3 and ___umodti3 missing on Darwin

2006-08-05 Thread Jack Howarth
The second missing symbol generated by gcc trunk on
MacOS X 10.4 when compiling with -m64 can be demonstrated with
this test case...

main()
{
    __int128_t a, b;
    b = a / 10;
}

which when compiled with "gcc-4 -m64 division.c" produces the
linkage error...

can't resolve symbols:
  ___divti3, referenced from:
  _main in ccDcwJYL.o
ld64 failed: symbol(s) not found
collect2: ld returned 1 exit status

Hopefully fixes for these already reside in Apple's gcc branch and
can be ported over without much grief.
Jack
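
As a companion to the two test cases above, the sketch below puts the
four 128-bit division and remainder operations side by side with the
libgcc helper each one normally lowers to. It assumes a 64-bit target
where GCC provides __int128. The function names are hypothetical. On
Darwin every C symbol gains one more leading underscore, which is why
__modti3 shows up in the linker output as ___modti3.

```c
/* Signed remainder: normally a call to __modti3. */
__int128 mod_s128(__int128 a, __int128 b)
{
    return a % b;
}

/* Signed division: normally a call to __divti3. */
__int128 div_s128(__int128 a, __int128 b)
{
    return a / b;
}

/* Unsigned remainder: normally a call to __umodti3. */
unsigned __int128 mod_u128(unsigned __int128 a, unsigned __int128 b)
{
    return a % b;
}

/* Unsigned division: normally a call to __udivti3. */
unsigned __int128 div_u128(unsigned __int128 a, unsigned __int128 b)
{
    return a / b;
}
```

Linking a program that calls all four is a quick way to see which of the
helpers a given libgcc build actually exports.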


fancy x87 ops, SSE and -mfpmath=sse,387 performance

2006-08-05 Thread tbp

Basically i'd like to have my cake and eat it too.

With g++-4.2-20060805/cygwin on a k8 box, on some software path with
lots of sp float ops but no transcendentals or library calls:
  -mfpmath=sse,387: 5.2 Mray/s
  -mfpmath=sse:     6 Mray/s
That 15% performance difference is no surprise when you see things like
 4037c8:   flds   0x4(%esp)
 4037cc:   mulss  %xmm5,%xmm2
 4037d0:   fsubrp %st,%st(1)
 4037d2:   movss  %xmm1,0x4(%esp)
 4037d8:   addss  0x278(%esp,%ecx,4),%xmm0
 4037e1:   flds   0x4(%esp)
 4037e5:   fsubrp %st,%st(1)
 4037e7:   addss  %xmm2,%xmm0
 4037eb:   movss  %xmm0,0x4(%esp)
 4037f1:   flds   0x4(%esp)
 4037f5:   fdivrp %st,%st(1)
 4037f7:   fcomi  %st(1),%st
 4037f9:   fldz
 4037fb:   setae  %dl
 4037fe:   fcomip %st(1),%st
 403800:   seta   %al
 403803:   or     %al,%dl
 403805:   je     4036ca

Therefore -mfpmath=sse is the way to go, and it is in fact on par with or
better than what i get out of icc 9.1 for the same code.
Where it gets ugly is when, for example, you throw some cosf() into
the same compilation unit: with -mfpmath=sse you pay for some really,
really slow library function calls (at least on cygwin).
Wishful thinking got me trying -march=k8 -mfpmath=sse
-mfancy-math-387, to no avail :(
Is there a way to enable such exotic codegen for 32-bit environments?
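
One possible workaround, offered only as a sketch and not as an answer
from this thread, is to wrap the x87 instruction in GCC inline assembly
so that the rest of the unit can stay on -mfpmath=sse. The wrapper name
is hypothetical. It assumes GCC on an x86 target with x87 support
enabled, and falls back to the library cosf() elsewhere.

```c
#include <math.h>

/* Hypothetical wrapper: cosine of x computed with the x87 fcos
   instruction where available. */
float x87_cosf(float x)
{
#if defined(__GNUC__) && (defined(__i386__) || defined(__x86_64__))
    float result;
    /* The "t" constraint names the top of the x87 register stack, st(0);
       "0" ties the input to that same register.  fcos replaces st(0)
       with its cosine.  The instruction only handles |x| < 2^63; larger
       arguments are left unchanged by the hardware. */
    __asm__ ("fcos" : "=t" (result) : "0" (x));
    return result;
#else
    /* Assumption: on other targets just use the C library. */
    return cosf(x);
#endif
}
```

With SSE math the compiler has to move the value through memory to reach
the x87 stack and back, so whether this beats the library call would
need measuring on the code in question.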