Re: volatile qualifier hurts single-threaded optimized case

2006-08-30 Thread Benjamin Kosnik

> bits/atomicity.h has volatile qualifiers on the _Atomic_word* arguments to
> the __*_single and __*_dispatch variants of the atomic operations.  This
> hurts especially the single-threaded optimization variants, which are usually
> inlined.  Removing those qualifiers allows to reduce code size significantly
> as can be seen in the following simple testcase

I've been able to reproduce this with your example and the following
patch. Thanks for looking at this.

without volatile:
19:    546 FUNC    GLOBAL DEFAULT    2 _Z3fooPKcS0_

with:
19:    578 FUNC    GLOBAL DEFAULT    2 _Z3fooPKcS0_
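To see why the qualifier costs code size on the inlined, single-threaded
path, here is a plain C sketch (illustrative only -- this is not the
testcase from the report, and the names are made up):

/* With a volatile pointer every access must be materialized: the
   compiler emits a load for result, then another load and a store for
   the increment, and it cannot cache *mem in a register across the
   inlined call.  Without volatile the accesses can be combined and
   redundant ones removed.  */
static inline int
exchange_and_add_with_volatile (volatile int *mem, int val)
{
  int result = *mem;
  *mem += val;
  return result;
}

static inline int
exchange_and_add_plain (int *mem, int val)
{
  int result = *mem;
  *mem += val;
  return result;
}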

I don't understand the ABI objections to your suggestion, and feel like
there must be a misunderstanding somewhere. These helper functions are
not exported at all, in fact. Also, the *_dispatch and *_single parts
of the atomicity.h interface are new with 4.2, so I'd like to get the
correct signatures in with their introduction, and not have to patch
this up later.

?

tested x86/linux
abi tested x86/linux

-benjamin





2006-08-30  Benjamin Kosnik  <[EMAIL PROTECTED]>
Richard Guenther  <[EMAIL PROTECTED]>

* config/abi/pre/gnu.ver: Spell out exact signatures for atomic
access functions.

* include/bits/atomicity.h (__atomic_add_dispatch): Remove
volatile qualification for _Atomic_word argument.
(__atomic_add_single): Same.
(__exchange_and_add_dispatch): Same.
(__exchange_and_add_single): Same.

Index: include/bits/atomicity.h
===
--- include/bits/atomicity.h(revision 116581)
+++ include/bits/atomicity.h(working copy)
@@ -60,7 +60,7 @@
 #endif
 
   static inline _Atomic_word
-  __exchange_and_add_single(volatile _Atomic_word* __mem, int __val)
+  __exchange_and_add_single(_Atomic_word* __mem, int __val)
   {
 _Atomic_word __result = *__mem;
 *__mem += __val;
@@ -68,12 +68,12 @@
   }
 
   static inline void
-  __atomic_add_single(volatile _Atomic_word* __mem, int __val)
+  __atomic_add_single(_Atomic_word* __mem, int __val)
   { *__mem += __val; }
 
   static inline _Atomic_word
   __attribute__ ((__unused__))
-  __exchange_and_add_dispatch(volatile _Atomic_word* __mem, int __val)
+  __exchange_and_add_dispatch(_Atomic_word* __mem, int __val)
   {
 #ifdef __GTHREADS
 if (__gthread_active_p())
@@ -87,7 +87,7 @@
 
   static inline void
   __attribute__ ((__unused__))
-  __atomic_add_dispatch(volatile _Atomic_word* __mem, int __val)
+  __atomic_add_dispatch(_Atomic_word* __mem, int __val)
   {
 #ifdef __GTHREADS
 if (__gthread_active_p())
@@ -101,8 +101,9 @@
 
 _GLIBCXX_END_NAMESPACE
 
-// Even if the CPU doesn't need a memory barrier, we need to ensure that
-// the compiler doesn't reorder memory accesses across the barriers.
+// Even if the CPU doesn't need a memory barrier, we need to ensure
+// that the compiler doesn't reorder memory accesses across the
+// barriers.
 #ifndef _GLIBCXX_READ_MEM_BARRIER
 #define _GLIBCXX_READ_MEM_BARRIER __asm __volatile ("":::"memory")
 #endif
Index: config/abi/pre/gnu.ver
===
--- config/abi/pre/gnu.ver  (revision 116581)
+++ config/abi/pre/gnu.ver  (working copy)
@@ -378,8 +378,8 @@
 
 # __gnu_cxx::__atomic_add
 # __gnu_cxx::__exchange_and_add
-_ZN9__gnu_cxx12__atomic_add*;
-_ZN9__gnu_cxx18__exchange_and_add*;
+_ZN9__gnu_cxx12__atomic_addEPVii;
+_ZN9__gnu_cxx18__exchange_and_addEPVii;
 
 # debug mode
 _ZN10__gnu_norm15_List_node_base4hook*;


Re: volatile qualifier hurts single-threaded optimized case

2006-08-30 Thread Richard Guenther

On 8/30/06, Benjamin Kosnik <[EMAIL PROTECTED]> wrote:


> bits/atomicity.h has volatile qualifiers on the _Atomic_word* arguments to
> the __*_single and __*_dispatch variants of the atomic operations.  This
> hurts especially the single-threaded optimization variants, which are usually
> inlined.  Removing those qualifiers allows to reduce code size significantly
> as can be seen in the following simple testcase

I've been able to reproduce this with your example and the following
patch. Thanks for looking at this.

without volatile:
19:    546 FUNC    GLOBAL DEFAULT    2 _Z3fooPKcS0_

with:
19:    578 FUNC    GLOBAL DEFAULT    2 _Z3fooPKcS0_

I don't understand the ABI objections to your suggestion, and feel like
there must be a misunderstanding somewhere. These helper functions are
not exported at all, in fact. Also, the *_dispatch and *_single parts
of the atomicity.h interface are new with 4.2, so I'd like to get the
correct signatures in with their introduction, and not have to patch
this up later.


Oh, indeed.  If the *_dispatch and *_single parts are new, we can change the
function signatures there.  The patch looks fine and should get us the runtime
and size improvements I saw.

Thanks for having a look!

Richard.


Re: Successful Build: gcc-4.1-20051230 i686-pc-mingw32

2006-08-30 Thread klamer

Using: 
> ../gcc-4.1.1/configure --host=mingw32 --build=mingw32 --target=mingw32
> --enable-threads --enable-optimize --disable-nls
> --enable-languages=c,c++,fortran --prefix=/c/prog/mingw4
> --with-cpu=pentium4 --with-ld=/c/prog/mingw4/bin/ld.exe
> --with-as=/c/prog/mingw4/bin/as.exe --with-gmp=/c/prog/mingw4
 ... (successful output)
> make bootstrap
 ...
stage1/xgcc.exe -Bstage1/ -B/c/prog/mingw4/mingw32/bin/ -c   -g -O2 -DIN_GCC  
-W -Wall -Wwrite-strings -Wstrict-prototypes -Wmissing-prototypes -pedantic
-Wno-long-long -Wno-variadic-macros -Wold-style-definition
-Wmissing-format-attribute -Wno-format -DHAVE_CONFIG_H -I. -Ifortran
-I../../gcc-4.1.1/gcc -I../../gcc-4.1.1/gcc/fortran
-I../../gcc-4.1.1/gcc/../include -I../../gcc-4.1.1/gcc/../libcpp/include
-I/c/prog/mingw4/include ../../gcc-4.1.1/gcc/fortran/arith.c -o
fortran/arith.o
In file included from ../../gcc-4.1.1/gcc/system.h:42,
 from ../../gcc-4.1.1/gcc/fortran/arith.c:29:
c:/prog/mingw4/include/stdio.h:219: warning: no previous prototype for
'vsnprintf'
c:/prog/mingw4/include/stdio.h:258: warning: no previous prototype for
'getc'
c:/prog/mingw4/include/stdio.h:265: warning: no previous prototype for
'putc'
c:/prog/mingw4/include/stdio.h:272: warning: no previous prototype for
'getchar'
c:/prog/mingw4/include/stdio.h:279: warning: no previous prototype for
'putchar'
In file included from ../../gcc-4.1.1/gcc/system.h:42,
 from ../../gcc-4.1.1/gcc/fortran/arith.c:29:
c:/prog/mingw4/include/stdio.h:401: warning: no previous prototype for
'fopen64'
c:/prog/mingw4/include/stdio.h:413: warning: no previous prototype for
'ftello64'
c:/prog/mingw4/include/stdio.h:468: warning: no previous prototype for
'vsnwprintf'
In file included from ../../gcc-4.1.1/gcc/system.h:195,
 from ../../gcc-4.1.1/gcc/fortran/arith.c:29:
c:/prog/mingw4/include/string.h:97: warning: no previous prototype for
'strcasecmp'
c:/prog/mingw4/include/string.h:103: warning: no previous prototype for
'strncasecmp'
In file included from c:/prog/mingw4/include/unistd.h:10,
 from ../../gcc-4.1.1/gcc/system.h:231,
 from ../../gcc-4.1.1/gcc/fortran/arith.c:29:
c:/prog/mingw4/include/io.h:150: warning: no previous prototype for
'lseek64'
In file included from ../../gcc-4.1.1/gcc/fortran/arith.c:31:
../../gcc-4.1.1/gcc/fortran/gfortran.h:1738: error: expected declaration
specifiers or '...' before 'uint'
make[2]: *** [fortran/arith.o] Error 1
make[2]: Leaving directory `/home/schuttek/gcc-4.1.1-build/gcc'
make[1]: *** [stage2_build] Error 2
make[1]: Leaving directory `/home/schuttek/gcc-4.1.1-build/gcc'
make: *** [bootstrap] Error 2

Does anyone have an idea how to solve this?
-- 
View this message in context: 
http://www.nabble.com/Successful-Build%3A-gcc-4.1-20051230-i686-pc-mingw32-tf834182.html#a6058353
Sent from the gcc - Dev forum at Nabble.com.



segmentation fault in building __floatdisf.o

2006-08-30 Thread kernel coder

hi,
   I'm having a problem during the build of libgcc2, in the function
__floatdisf (i.e. the build of __floatdisf.o).  Actually, I'm modifying the
MIPS backend.  The error is

../../gcc-4.1.0/gcc/libgcc2.c: In function '__floatdisf':
../../gcc-4.1.0/gcc/libgcc2.c:1354: internal compiler error: Segmentation fault

I tried to debug the reason for the crash, and the following are my findings.

Before the crash, the following pattern is called:

(define_expand "cmp"
 [(set (cc0)
   (compare:CC (match_operand:GPR 0 "register_operand")
   (match_operand:GPR 1 "nonmemory_operand")))]
 ""
{
fprintf(stderr," cmp \n");
 branch_cmp[0] = operands[0];
 branch_cmp[1] = operands[1];
debug_rtx(branch_cmp[0]);
debug_rtx(branch_cmp[1]);
 DONE;
})

As you can see, I've printed the operands, which are as follows:

operand[0]
--
(subreg:SI (reg:DI 30) 4)

operand[1]
-
(subreg:SI (reg:DI 25 [ u.0 ]) 4)

After this I think it tries to match some sCOND or bCOND pattern, but in
this case it fails.

Is this supposition of mine correct?



In another, working case, no error is generated.  The following is
the sequence of patterns called:

(define_expand "cmp"
 [(set (cc0)
   (compare:CC (match_operand:GPR 0 "register_operand")
   (match_operand:GPR 1 "nonmemory_operand")))]
 ""
{
fprintf(stderr," cmp \n");
 cmp_operands[0] = operands[0];
 cmp_operands[1] = operands[1];
debug_rtx(cmp_operands[0]);
debug_rtx(cmp_operands[1]);
 DONE;
})

Here the operands are:

operands[0]
---
(subreg:SI (reg:DI 30) 0)


operands[1]
---
(subreg:SI (reg:DI 25 [ u.0 ]) 0)

Then the following pattern is matched

(define_expand "bltu"
 [(set (pc)
   (if_then_else (ltu (cc0) (const_int 0))
 (label_ref (match_operand 0 ""))
 (pc)))]
 ""
{
fprintf(stderr,"\n branch_fp 8 bltu\n");
})


So in the failing case it should match the above-mentioned pattern but
fails to do so.  The only difference seems to be that the byte offset
in the subreg expression is different: in the failing case it is 4 and
in the successful case it is 0.

Both directories seem to be copies of each other, so why are the operands
different in the cmpsi patterns?  There are no floating-point registers;
the option passed to gcc is -msoft-float.

I've tried my best to track down the problem but could not, due to my
limited knowledge.  Would you please give me some hints on how to debug it?

thanks,
shahzad


RE: segmentation fault in building __floatdisf.o

2006-08-30 Thread Dave Korn
On 30 August 2006 15:11, kernel coder wrote:

> hi,
> I'm having a problem during the build of libgcc2, in the function
> __floatdisf (i.e. the build of __floatdisf.o).  Actually, I'm modifying the
> MIPS backend.  The error is
> 
> ../../gcc-4.1.0/gcc/libgcc2.c: In function '__floatdisf':
> ../../gcc-4.1.0/gcc/libgcc2.c:1354: internal compiler error: Segmentation
> fault 

  This is always the first sign of a bug in your backend, as it's the first
bit of code that gets compiled for the target by the newly-compiled backend.

  In this case, it's a really bad bug, because we're bombing out with a SEGV,
rather than getting a nice assert.  This could be because of dereferencing a
null pointer.

> before crash following pattern is called
> 
> (define_expand "cmp<mode>"
>   [(set (cc0)
> (compare:CC (match_operand:GPR 0 "register_operand")
> (match_operand:GPR 1 "nonmemory_operand")))]
>   ""
> {
> fprintf(stderr," cmp \n");
>   branch_cmp[0] = operands[0];
>   branch_cmp[1] = operands[1];
> debug_rtx(branch_cmp[0]);
> debug_rtx(branch_cmp[1]);
>   DONE;
> })

> (subreg:SI (reg:DI 30) 4)
> (subreg:SI (reg:DI 25 [ u.0 ]) 4)
> 
> After this I think it tries to match some sCOND or bCOND pattern, but in
> this case it fails.
> 
> Is this supposition of mine correct?

  Unlikely.  You'd expect to see a proper ICE message with backtrace if recog
failed.

> In another working case where no error is being generated.Following is
> the sequence of called patterns
> 
> (define_expand "cmp<mode>"
>   [(set (cc0)
> (compare:CC (match_operand:GPR 0 "register_operand")
> (match_operand:GPR 1 "nonmemory_operand")))]
>   ""
> {
> fprintf(stderr," cmp \n");
>   cmp_operands[0] = operands[0];
>   cmp_operands[1] = operands[1];
> debug_rtx(cmp_operands[0]);
> debug_rtx(cmp_operands[1]);
>   DONE;
> })

> (subreg:SI (reg:DI 30) 0)
> (subreg:SI (reg:DI 25 [ u.0 ]) 0)
> 
> Then the following pattern is matched
> 
> (define_expand "bltu"
>   [(set (pc)
> (if_then_else (ltu (cc0) (const_int 0))
>   (label_ref (match_operand 0 ""))
>   (pc)))]
>   ""
> {
> fprintf(stderr,"\n branch_fp 8 bltu\n");
> })

> I've tried my best to track down the problem but could not, due to my
> limited knowledge.  Would you please give me some hints on how to debug it?

  I suspect that what's going wrong is that, in the error case, one of the
'branch_cmp' or 'cmp_operands' arrays is getting set, but when the branch
pattern comes to be matched, it is the /other/ array that is used, which is
where the null pointer is coming from.

  You've given no information about what you've actually changed, and I'm no
MIPS expert, but in my local copy of the gcc 4 sources there's no such thing
as 'branch_cmp', only 'cmp_operands', whereas in the gcc 3 series it's the
other way round.

  So I'm guessing that you've added a new variant of the cmp expander,
and you've based it on some old code from a series 3 version of the compiler,
and it's not good in series 4?
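
  To make the suspected failure mode concrete, here is a self-contained C
sketch of that mismatch (all names are invented for illustration; this is
not the actual MIPS backend code):

#include <stdio.h>

/* The compare expander fills one global array, but the branch expander
   reads the other, so the branch sees null pointers and the compiler
   dies with a SEGV rather than a nice ICE.  */
typedef struct fake_rtx { int code; } *rtx_t;

static rtx_t branch_cmp[2];    /* the gcc 3.x-era array the user sets     */
static rtx_t cmp_operands[2];  /* the array the 4.x branch expander reads */

static void
expand_cmp (rtx_t op0, rtx_t op1)
{
  branch_cmp[0] = op0;         /* wrong array for a 4.x backend */
  branch_cmp[1] = op1;
}

static void
expand_bltu (void)
{
  /* cmp_operands[] was never filled in: null dereference here.  */
  printf ("comparing code %d\n", cmp_operands[0]->code);
}

int
main (void)
{
  struct fake_rtx a = { 1 }, b = { 2 };
  expand_cmp (&a, &b);
  expand_bltu ();              /* crashes, mimicking the segfault */
  return 0;
}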

cheers,
  DaveK
-- 
Can't think of a witty .sigline today



MyGCC and whole program static analysis

2006-08-30 Thread Basile STARYNKEVITCH

Maybe some of you are aware of MyGCC http://mygcc.free.fr/ which
seems to be an extended GCC to add some kind of static analysis.

I'm quite surprised that the mygcc page gives x86/linux binaries, but
no source tarball of their compiler (this seems to me against the
spirit of the GPL licence, but I am not a lawyer).

As a few people might already know, the GGCC (globalgcc) project is
just starting (partly funded within the ITEA framework by French,
Spanish and Swedish public money) - its kick-off meeting is next week in
Paris.

GGCC aims to provide a (GPL opensource) extension to GCC for program
wide static analysis (& optimisations) and coding rules
validation. But this mail is not a formal announcement of it...

I am also extremely interested in the LTO framework, in particular
their persistence of GIMPLE trees. Could LTO people explain (if
possible) if their framework is extensible (to some new Gimple nodes)
and usable in some other setting (for example, storing program wide
static analysis partial results, perhaps in a "project" related
database or file). It is too bad that they only store information in
DWARF3 format... Do they have some technical description of their
format and persistence machinery (I've read their introductory
paper).

BTW, I am still a newbie on GCC...

Regards.

-- 
Basile STARYNKEVITCH http://starynkevitch.net/Basile/ 
email: basilestarynkevitchnet 
aliases: basiletunesorg = bstarynknerimnet
8, rue de la Faïencerie, 92340 Bourg La Reine, France


Re: MyGCC and whole program static analysis

2006-08-30 Thread Basile STARYNKEVITCH
Le Wed, Aug 30, 2006 at 06:36:19PM +0200, basile écrivait/wrote:
> 
> Maybe some of you are aware of MyGCC http://mygcc.free.fr/ which
> seems to be an extended GCC to add some kind of static analysis.
> 
> I'm quite surprised that the mygcc page gives x86/linux binaries, but
> no source tarball of their compiler (this seems to me against the
> spirit of the GPL licence, but I am not a lawyer).
> 

My public apologies to MyGCC. There is a patch on
http://gcc.gnu.org/ml/gcc-patches/2006-05/msg01437.html but sadly the
http://mygcc.free.fr does not provide any link to it.

It is sad to have to google to find their patch; it would be simpler
if they linked to it (or even provided a full source tarball).

Regards.
-- 
Basile STARYNKEVITCH http://starynkevitch.net/Basile/ 
email: basilestarynkevitchnet 
aliases: basiletunesorg = bstarynknerimnet
8, rue de la Faïencerie, 92340 Bourg La Reine, France


Re: MyGCC and whole program static analysis

2006-08-30 Thread Joe Buck
On Wed, Aug 30, 2006 at 06:36:19PM +0200, Basile STARYNKEVITCH wrote:
> Maybe some of you are aware of MyGCC http://mygcc.free.fr/ which
> seems to be an extended GCC to add some kind of static analysis.
> 
> I'm quite surprised that the mygcc page gives x86/linux binaries, but
> no source tarball of their compiler (this seems to me against the
> spirit of the GPL licence, but I am not a lawyer).

Not just the spirit; the GPL requires that full sources be made available,
not just a patch (alternatively, a written offer to provide full source
can be provided with the binary, but one or the other is required).
However, I can't tell if there is a violation, since I haven't downloaded
the tarball and looked for written offers, and it appears that if there is
a violation, it is a well-intentioned error.  This is the kind of thing
that the FSF usually resolves quietly and amicably.




Re: MyGCC and whole program static analysis

2006-08-30 Thread Joe Buck
On Wed, Aug 30, 2006 at 06:52:59PM +0200, Basile STARYNKEVITCH wrote:
> Le Wed, Aug 30, 2006 at 06:36:19PM +0200, basile écrivait/wrote:
> > 
> > Maybe some of you are aware of MyGCC http://mygcc.free.fr/ which
> > seems to be an extended GCC to add some kind of static analysis.
> > 
> > I'm quite surprised that the mygcc page gives x86/linux binaries, but
> > no source tarball of their compiler (this seems to me against the
> > spirit of the GPL licence, but I am not a lawyer).
> > 
> 
> My public apologies to MyGCC. There is a patch on
> http://gcc.gnu.org/ml/gcc-patches/2006-05/msg01437.html but sadly the
> http://mygcc.free.fr does not provide any link to it.

You shouldn't apologize.  The MyGCC people need to read the GPL FAQ,
particularly

http://www.gnu.org/licenses/gpl-faq.html#DistributingSourceIsInconvenient




RE: MyGCC and whole program static analysis

2006-08-30 Thread Dave Korn
On 30 August 2006 17:53, Joe Buck wrote:

> On Wed, Aug 30, 2006 at 06:36:19PM +0200, Basile STARYNKEVITCH wrote:
>> Maybe some of you are aware of MyGCC http://mygcc.free.fr/ which
>> seems to be an extended GCC to add some kind of static analysis.
>> 
>> I'm quite surprised that the mygcc page gives x86/linux binaries, but
>> no source tarball of their compiler (this seems to me against the
>> spirit of the GPL licence, but I am not a lawyer).
> 
> Not just the spirit; the GPL requires that full sources be made available,
> not just a patch (alternatively, a written offer to provide full source
> can be provided with the binary, but one or the other is required).
> However, I can't tell if there is a violation, since I haven't downloaded
> the tarball and looked for written offers, and it appears that if there is
> a violation, it is a well-intentioned error.  This is the kind of thing
> that the FSF usually resolves quietly and amicably.

  Do you know if the included copy of $prefix/man/man7/gpl.7 counts as a
"written offer"?

cheers,
  DaveK
-- 
Can't think of a witty .sigline today



Re: MyGCC and whole program static analysis

2006-08-30 Thread Sebastian Pop
Joe Buck wrote:
> On Wed, Aug 30, 2006 at 06:52:59PM +0200, Basile STARYNKEVITCH wrote:
> > Le Wed, Aug 30, 2006 at 06:36:19PM +0200, basile écrivait/wrote:
> > > 
> > > Maybe some of you are aware of MyGCC http://mygcc.free.fr/ which
> > > seems to be an extended GCC to add some kind of static analysis.
> > > 
> > > I'm quite surprised that the mygcc page gives x86/linux binaries, but
> > > no source tarball of their compiler (this seems to me against the
> > > spirit of the GPL licence, but I am not a lawyer).
> > > 
> > 
> > My public apologies to MyGCC. There is a patch on
> > http://gcc.gnu.org/ml/gcc-patches/2006-05/msg01437.html but sadly the
> > http://mygcc.free.fr does not provide any link to it.
> 
> You shouldn't apologize.  The MyGCC people need to read the GPL FAQ,
> particularly
> 
> http://www.gnu.org/licenses/gpl-faq.html#DistributingSourceIsInconvenient
> 

If somebody wants to try mygcc, I included the patch in the
graphite-branch some months ago.  In my opinion the patch needs major
rework and improvements to be included in trunk.  You're welcome to
help improving that patch.

Sebastian


Re: MyGCC and whole program static analysis

2006-08-30 Thread Sebastian Pop
Sebastian Pop wrote:

> In my opinion the patch needs major rework and improvements to be
> included in trunk.

Here is my short review of the mygcc patch that lists some possible
improvements and things that have to be redesigned:
http://gcc.gnu.org/ml/gcc-patches/2006-08/msg00616.html



Re: MyGCC and whole program static analysis

2006-08-30 Thread Joe Buck
On Wed, Aug 30, 2006 at 06:18:06PM +0100, Dave Korn wrote:
>   Do you know if the included copy of $prefix/man/man7/gpl.7 counts as a
> "written offer"?

We're way off topic, so I'll reply to Dave offline.  It appears otherwise
we'll have a big gnu.misc.discuss thread here.


Inserting function calls

2006-08-30 Thread jean-christophe . beyler

Dear all,

I have been trying to insert function calls during a new pass in the  
compiler but it does not seem to like my way of doing it. The basic  
idea is to insert a function call before any load in the program  
(later on I'll be selecting a few loads but for now I just want to do  
it for each and every one).


This looks like the profiling pass but I can't seem to understand what  
is the matter...


This is what the compiler says when I try to compile this test case :

hello.c:18: internal compiler error: tree check: expected ssa_name,  
have symbol_memory_tag in verify_ssa, at tree-ssa.c:776


This is what I did in my pass :

/* Generation of the call type; the function will be of type
   void (*) (int).  */
tree call_type = build_function_type_list (void_type_node,
                                           integer_type_node,
                                           NULL_TREE);
tree call_fn = build_fn_decl ("__MarkovMainEntry", call_type);

data_reference_p a;
for (j = 0; VEC_iterate (data_reference_p, datarefs, j, a); j++)
  {
    tree stmt = DR_STMT (a);

    /* Is it a load?  */
    if (DR_IS_READ (a))
      {
        printf ("Have a load : %d\n", compteur_interne);
        tree compteur = build_int_cst (integer_type_node,
                                       compteur_interne);

        compteur_interne++;

        /* Argument creation: just pass the constant integer node.  */
        tree args = tree_cons (NULL_TREE, compteur, NULL_TREE);

        tree call = build_function_call_expr (call_fn, args);

        block_stmt_iterator bsi;
        bsi = bsi_for_stmt (stmt);
        bsi_insert_after (&bsi, call, BSI_SAME_STMT);
      }
  }


And the test code is :

#include <stdio.h>

int fct(int *t);

int main()
{
int tab[10];
int i;

tab[0] = 0;

printf("Hello World %d\n",tab[5]);
printf("Sum is : %d\n",fct(tab));
return 0;
}

int fct(int *t)
{
int i=10;
int tab[10];
while(i) {
tab[i] = tab[i]*2+1+tab[i+1];
i--;
printf("Here %d\n",tab[5]);
}

return t[i];
}


Does anyone have any ideas on to how I can modify my function and get  
it to insert the functions correctly ?


Thanx for any help in finishing this pass,
Jc

Finally here is the patch that shows what I did using the trunk 4.2 :


Index: doc/invoke.texi
===
--- doc/invoke.texi(revision 116394)
+++ doc/invoke.texi(working copy)
@@ -342,7 +342,7 @@ Objective-C and Objective-C++ Dialects}.
 -fsplit-ivs-in-unroller -funswitch-loops @gol
 -fvariable-expansion-in-unroller @gol
 -ftree-pre  -ftree-ccp  -ftree-dce -ftree-loop-optimize @gol
--ftree-loop-linear -ftree-loop-im -ftree-loop-ivcanon -fivopts @gol
+-ftree-loop-linear -ftree-load-inst -ftree-loop-im -ftree-loop-ivcanon -fivopts @gol
 -ftree-dominator-opts -ftree-dse -ftree-copyrename -ftree-sink @gol
 -ftree-ch -ftree-sra -ftree-ter -ftree-lrs -ftree-fre -ftree-vectorize @gol
 -ftree-vect-loop-version -ftree-salias -fipa-pta -fweb @gol
@@ -5120,6 +5120,10 @@ at @option{-O} and higher.
 Perform linear loop transformations on tree.  This flag can improve cache
 performance and allow further loop optimizations to take place.

+@item -ftree-load-inst
+Perform instrumentation of loads on trees.  This flag inserts a call to
+a profiling function before the loads of a program.
+
 @item -ftree-loop-im
 Perform loop invariant motion on trees.  This pass moves only invariants that
 would be hard to handle at RTL level (function calls, operations that expand to

Index: tree-pass.h
===
--- tree-pass.h(revision 116394)
+++ tree-pass.h(working copy)
@@ -251,6 +251,7 @@ extern struct tree_opt_pass pass_empty_l
 extern struct tree_opt_pass pass_record_bounds;
 extern struct tree_opt_pass pass_if_conversion;
 extern struct tree_opt_pass pass_vectorize;
+extern struct tree_opt_pass pass_load_inst;
 extern struct tree_opt_pass pass_complete_unroll;
 extern struct tree_opt_pass pass_loop_prefetch;
 extern struct tree_opt_pass pass_iv_optimize;
Index: tree-load-inst.c
===
--- tree-load-inst.c(revision 0)
+++ tree-load-inst.c(revision 0)
@@ -0,0 +1,125 @@
+#include 
+#include "config.h"
+#include "system.h"
+#include "coretypes.h"
+#include "tm.h"
+#include "ggc.h"
+#include "tree.h"
+#include "target.h"
+
+#include "rtl.h"
+#include "basic-block.h"
+#include "diagnostic.h"
+#include "tree-flow.h"
+#include "tree-dump.h"
+#include "timevar.h"
+#include "cfgloop.h"
+#include "expr.h"
+#include "optabs.h"
+#include "tree-chrec.h"
+#include "tree-data-ref.h"
+#include "tree-scalar-evolution.h"
+#include "tree-pass.h"
+#include "lambda.h"
+
+extern struct loops *current_loops;
+static void tree_handle_loop (struct loops *loops);

Re: Can we limit one bug fix per checkin please?

2006-08-30 Thread H. J. Lu
On Sun, Jul 30, 2006 at 04:38:38PM -0700, H. J. Lu wrote:
> When one checkin is used to fix multiple bugs, it isn't easy to back
> out just the offending bug fix only if one of the bug fixes causes
> regression. Can we limit one bug fix per checkin?
> 
> Thanks.

It happened again. This checkin:

http://gcc.gnu.org/ml/gcc-cvs/2006-08/msg00427.html

causes the regression:

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=28908

There are several bug fixes in revision 116268. It is hard to just back
out one bug fix. Paul, would you mind just fixing one bug in one
checkin?

Thanks.


H.J.


Re: Can we limit one bug fix per checkin please?

2006-08-30 Thread Andrew Pinski
> It happened again. This checkin:

Yes, the standard thing is one checkin per fix.  But it is also annoying
that you (HJL) don't understand how to file a bug report, which is
actually documented.

-- Pinski


Re: Inserting function calls

2006-08-30 Thread Diego Novillo
[EMAIL PROTECTED] wrote on 08/30/06 14:44:

> Does anyone have any ideas on to how I can modify my function and get it
> to insert the functions correctly ?
> 
Browse through omp-low.c.  In particular create_omp_child_function and
expand_omp_parallel.  The new function needs to be added to the call
graph and queued for processing (cgraph_add_new_function).


Re: First cut on outputing gimple for LTO using DWARF3. Discussion invited!!!!

2006-08-30 Thread Tom Tromey
> "KZ" == Kenneth Zadeck <[EMAIL PROTECTED]> writes:

KZ> 2) To have a discussion about the use of DWARF3.  I am now against the
KZ> use of DWARF3 for encoding the GIMPLE.

FWIW your arguments convinced me.

I think what matters most is that the resulting format be relatively
well documented (say, better than GIMPLE), efficient, suitable, etc.
Reusing DWARF3 seems cute but inessential.

[...]
KZ> +case TRUTH_NOT_EXPR:
KZ> +case VIEW_CONVERT_EXPR:
KZ> +#if STUPID_TYPE_SYSTEM
KZ> +  output_type_ref (ob, TREE_TYPE (expr));
KZ> +#endif

I think VIEW_CONVERT_EXPR needs to be treated like NOP_EXPR and
CONVERT_EXPR in the STUPID_TYPE_SYSTEM case.  VIEW_CONVERT_EXPR is a
type-casting expression.

KZ> +/* When we get a strongly typed gimple, which may not be for another
KZ> +   15 years, this flag should be set to 0 so we do not waste so much
KZ> +   space printing out largely redundant type codes.  */
KZ> +#define STUPID_TYPE_SYSTEM 1

You could write a more compact form by emitting explicit "fake nop"
nodes where needed, and then strip those when reading.  I think this
would avoid tweaking the optimizer bugs, as the reloaded trees would
be identical.

This does bring up another point about the format though: there's got
to be some versioning capability in there.

Tom


Re: First cut on outputing gimple for LTO using DWARF3. Discussion invited!!!!

2006-08-30 Thread Andrew Pinski
> [...]
> KZ> +case TRUTH_NOT_EXPR:
> KZ> +case VIEW_CONVERT_EXPR:
> KZ> +#if STUPID_TYPE_SYSTEM
> KZ> +  output_type_ref (ob, TREE_TYPE (expr));
> KZ> +#endif
> 
> I think VIEW_CONVERT_EXPR needs to be treated like NOP_EXPR and
> CONVERT_EXPR in the STUPID_TYPE_SYSTEM case.  VIEW_CONVERT_EXPR is a
> type-casting expression.

No it is not; it is more complex than just a simple type-casting
expression: it is a cast which does nothing to the bits at all.  It acts
more like a reference than a cast, as it can also be used on the
left-hand side.  I am working on a patch to have GCC use VCE more.

> You could write a more compact form by emitting explicit "fake nop"
> nodes where needed, and then strip those when reading.  I think this
> would avoid tweaking the optimizer bugs, as the reloaded trees would
> be identical.

Actually it would be better just to fix the problem in the first place
as mentioned before.

-- Pinski


Re: Can we limit one bug fix per checkin please?

2006-08-30 Thread Paul Thomas

Andrew Pinski wrote:


It happened again. This checkin:

Yes, we did discuss it before - sorry, HJ; I am trying to get as much
done before I am forced to reduce my work on gfortran. It is much easier
to do multiple patches, but I will desist.

Yes the standard thing is one checkin per fix ...snip...

OK - will do. However, all that said, I am onto it.

Paul



problem when returning a structure containing arrays.

2006-08-30 Thread Uwe Schmitt

Hi, 

I compiled the following code with gcc -shared buglib.c -o buglib.dll:


>>> buglib.h is:

   struct T
   {
 double x[256];
 double y[256];
 int   i;
   };
   struct T fun(int a);

>>> buglib.c is
   
   #include "buglib.h"

   struct T fun(int a)
   {
 struct T retval;
 int i;
 for (i=0; i<256;++i)
 {
 retval.x[i]=(double)i;
 retval.y[i]=(double)i;  
 }
 return retval;
   }

If I linkt this lib to

>>> main.c 

#include <stdio.h>
#include "buglib.h"


int main()
{
struct T x = fun(1);
int i;
for (i=0; i<10; ++i)
printf("%d %d\n", x.x[i],  x.y[i]);
}

Now the output is totally wrong!
I tried it with the cygwin port of gcc 3.4.4 and with
gcc 3.3.1 on SuSE Linux.

Any hints?

Greetings, Uwe



Re: problem when returning a structure containing arrays

2006-08-30 Thread Uwe Schmitt

Sorry, 
I made a mistake with the printf() formatting
characters.
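
(For reference: since x.x[i] and x.y[i] are double, the %d conversions
invoke undefined behaviour; something along these lines is presumably the
intended fix.)

    /* Use a floating-point conversion for the double members.  */
    printf("%f %f\n", x.x[i], x.y[i]);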

Greetings, Uwe



Re: Inserting function calls

2006-08-30 Thread jean-christophe . beyler

Browse through omp-low.c.  In particular create_omp_child_function


I understand the beginning of the function with its declaration of the  
function but I have a question about these lines :


/* Allocate memory for the function structure.  The call to
 allocate_struct_function clobbers CFUN, so we need to restore
 it afterward.  */
  allocate_struct_function (decl);
  DECL_SOURCE_LOCATION (decl) = EXPR_LOCATION (ctx->stmt);
  cfun->function_end_locus = EXPR_LOCATION (ctx->stmt);
  cfun = ctx->cb.src_cfun;

Is that a necessary process for the declaration of a function?  I ask
because I do not want the compiler to compile my function directly, but
rather to let the linker take care of that (it will be an external
function).



and expand_omp_parallel.


I notice that at the end of that function there is a call to  
expand_parallel_call and in that function I don't see a difference  
with how I prepare the arguments.


This leads me to think that the problem lies with the declaration of
the function.  Am I correct?



The new function needs to be added to the call
graph and queued for processing (cgraph_add_new_function).


This would be true if I wanted it to be compiled, but what if I do not
(I am using a precompiled version)?


Thank you for your time,
Jc
-
‹Degskalle› There is no point in arguing with an idiot, they will just
drag you down to their level and beat you with experience

Reference: http://www.bash.org/?latest
-




Re: Inserting function calls

2006-08-30 Thread Diego Novillo
[EMAIL PROTECTED] wrote on 08/30/06 16:41:

> Is that a necessary process for the declaration of a function ? I ask
> because I do not want the compiler to compile directly my function but
> rather ask the linker to take care of that (it will be an external
> function).
> 
Oh, so you only want to insert a library call?  In that case the work
done in omp-low.c is going to be a lot more than you need.  You only
need to check how the actual CALL_EXPR is built to call the newly added
function.

In create_omp_child_function, an identifier for the new function is
created.  We then create a call to it using build_function_call_expr in
expand_parallel_call.
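
A minimal sketch of that for a call to an external (library) function,
using the GCC 4.2-era tree API that appears elsewhere in this thread; the
function name and the constant argument are placeholders, and this is
untested:

  tree fn_type = build_function_type_list (void_type_node,
                                           integer_type_node, NULL_TREE);
  /* Build the decl from an identifier, not from a raw string.  */
  tree fn_decl = build_decl (FUNCTION_DECL, get_identifier ("__my_probe"),
                             fn_type);
  DECL_EXTERNAL (fn_decl) = 1;   /* body provided at link time */
  TREE_PUBLIC (fn_decl) = 1;

  tree arg = build_int_cst (integer_type_node, 42);
  tree args = tree_cons (NULL_TREE, arg, NULL_TREE);
  tree call = build_function_call_expr (fn_decl, args);
  /* CALL can then be inserted at a block_stmt_iterator as in the
     original pass.  */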


gcc 4.1.1 - successful build and install - i386-pc-mingw32 (msys running on a WinXP box)

2006-08-30 Thread Marcelo Slomp
Follows the build info:

config.guess:
i386-pc-mingw32

$ gcc -v
Using built-in specs.
Target: mingw32
Configured with: ../../source/gcc-4.1.1/configure --prefix=/mingw 
--host=mingw32 --target=mingw32 --program-prefix="" --with-gcc --with-gnu-ld 
--with-gnu-as --enable-threads --disable-nls --enable-languages=c,c++ 
--disable-win32-registry --disable-shared --without-x --enable-interpreter 
--enable-hash-synchronization --enable-libstdcxx-debug
Thread model: win32
gcc version 4.1.1

$ uname -a
MINGW32_NT-5.1 THERGOTHON 1.0.10(0.46/3/2) 2004-03-15 07:17 i686 unknown

host system: WinXP Pro SP2 i686

/me: Marcelo A B Slomp - Brazil

-- 
__
Now you can search for products and services
http://search.mail.com

Powered by Outblaze


Re: Inserting function calls

2006-08-30 Thread jean-christophe . beyler

In create_omp_child_function, an identifier for the new function is
created.  We then create a call to it using build_function_call_expr in
expand_parallel_call.


OK, so that's what I saw.  Is this call necessary for what I need:

  decl = lang_hooks.decls.pushdecl (decl);


Then, simplifying the problem, I just want to call a void _foo(void)
function, so, taking code from create_omp_child_function and
expand_parallel_call, I did this:


type = build_function_type_list (void_type_node, NULL_TREE);

decl = build_decl (FUNCTION_DECL, "_foo" , type);

TREE_STATIC (decl) = 1;
TREE_USED (decl) = 1;
DECL_ARTIFICIAL (decl) = 1;
DECL_IGNORED_P (decl) = 0;
TREE_PUBLIC (decl) = 0;
DECL_UNINLINABLE (decl) = 1;
DECL_EXTERNAL (decl) = 0;
DECL_CONTEXT (decl) = NULL_TREE;
DECL_INITIAL (decl) = make_node (BLOCK);

t = build_decl (RESULT_DECL, NULL_TREE, void_type_node);
DECL_ARTIFICIAL (t) = 1;
DECL_IGNORED_P (t) = 1;
DECL_RESULT (decl) = t;

t = build_decl (PARM_DECL, NULL_TREE, void_type_node);
DECL_ARTIFICIAL (t) = 1;
DECL_ARG_TYPE (t) = void_type_node;
DECL_CONTEXT (t) = current_function_decl;
TREE_USED (t) = 1;
DECL_ARGUMENTS (decl) = t;

tree list = NULL_TREE;
tree call = build_function_call_expr (decl, NULL);
gimplify_and_add(call,&list);

bsi_insert_before(&bsi, list, BSI_CONTINUE_LINKING);


But this gives me the following error when I try compiling it:

hello.c:18: internal compiler error: tree check: expected  
identifier_node, have obj_type_ref in special_function_p, at calls.c:475


Any ideas ?
Jc

-
‹Degskalle› There is no point in arguing with an idiot, they will just
drag you down to their level and beat you with experience

Reference: http://www.bash.org/?latest
-




Re: Inserting function calls

2006-08-30 Thread Zdenek Dvorak
Hello,

> I have been trying to insert function calls during a new pass in the  
> compiler but it does not seem to like my way of doing it. The basic  
> idea is to insert a function call before any load in the program  
> (later on I'll be selecting a few loads but for now I just want to do  
> it for each and every one).
> 
> This looks like the profiling pass but I can't seem to understand what  
> is the matter...
> 
> This is what the compiler says when I try to compile this test case :
> 
> >hello.c:18: internal compiler error: tree check: expected ssa_name,  
> >have symbol_memory_tag in verify_ssa, at tree-ssa.c:776

what you do seems basically OK to me.  The problem is that you also need
to fix the ssa form for the virtual operands of the added calls
(i.e., you must call mark_new_vars_to_rename for each of the calls,
and update_ssa once at the end of tree_handle_loop).
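
Applied to the loop from the original message, that would look roughly
like this (a sketch, untested, reusing the variables from the original
pass):

  for (j = 0; VEC_iterate (data_reference_p, datarefs, j, a); j++)
    {
      tree stmt = DR_STMT (a);
      if (DR_IS_READ (a))
        {
          tree compteur = build_int_cst (integer_type_node,
                                         compteur_interne++);
          tree args = tree_cons (NULL_TREE, compteur, NULL_TREE);
          tree call = build_function_call_expr (call_fn, args);
          block_stmt_iterator bsi = bsi_for_stmt (stmt);
          bsi_insert_after (&bsi, call, BSI_SAME_STMT);
          /* Tell the renamer about the new virtual operands.  */
          mark_new_vars_to_rename (call);
        }
    }
  /* Once, at the end of tree_handle_loop.  */
  update_ssa (TODO_update_ssa);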

Zdenek


Re: First cut on outputing gimple for LTO using DWARF3. Discussion invited!!!!

2006-08-30 Thread Mark Mitchell

Kenneth Zadeck wrote:


This posting is a progress report of my task of encoding and decoding
the GIMPLE stream into LTO.   Included in this posting is a patch that
encodes functions and dumps the result to files.  


[I'm sorry for not replying to this sooner.  I've been on a plane or in 
a meeting virtually all of my waking hours since you wrote this...]


Exciting!


2) To have a discussion about the use of DWARF3.  I am now against the
use of DWARF3 for encoding the GIMPLE. 


As the person who suggested this to you, I'm sorry it doesn't seem to 
meet your needs.  I'll make a few more specific comments below, but, to 
echo what I said at the time: I think it was a good idea to try it, but 
if it's not the right tool for the job, then so be it.  My opinion has 
always been that, for the bodies of functions, DWARF is just an 
encoding: if it's a decent encoding, then, it's nice that it's a 
standard; if it's a bad encoding, then, by all means, let's not use it!


I do think DWARF is a good choice for the non-executable information, 
for the same reasons I did initially:


(a) for debugging, we need to generate most of that information anyhow, 
so we're piggy-backing on existing code -- and not making object files 
bigger by encoding the same information twice in the case of "-O2 -g", 
which is the default way that many GNU applications are built.


(b) it's a standard, and we already have tools for reading DWARF, so it 
saves the trouble of coming up with a new encoding,


(c) because bugs in the DWARF emission may not result in problems at 
LTO, we'll be validating our LTO information every time we use GDB, and, 
similarly, improving the GDB experience every time we fix an LTO bug in 
this area.


I understand that some of these benefits do not apply to the executable 
code, and that, even to the extent they may apply, the tradeoffs are 
different.  The comments I've made below about specific issues should 
therefore be considered as academic responses, not an attempt to argue 
the decision you have made.



3) To get someone to tell me what option we are going to add to the
compiler to tell it to write this information.  


I think a reasonable spelling is probably "-flto".  It should not be a 
-m option since it is not machine-specific.  I don't think it should be 
a -O option either, since writing out LTO information isn't really 
supposed to affect optimization per se.



2) The code is, by design, fragile.  It takes nothing for granted.
Every case statement has gcc_unreachable as its default case. 


That's the same way I've approached the DWARF reading code, and for the 
same reason.  I think that's exactly the right decision.



1) ABBREV TABLES ARE BAD FOR LTO.

However, this mechanism is only self descriptive if you do not extend
the set of tags.  That is not an option for LTO.  


Definitely true.  When we talked on the phone, we talked about creating 
a tag corresponding to each GIMPLE tree code.  However, you could also 
create a numeric attribute giving the GIMPLE tree code.  If you did 
that, you might find that the abbreviation table became extremely small 
-- because almost all interior nodes would be DW_TAG_GIMPLE_EXPR nodes. 
 The downside, of course, is that the storage required to store the 
nodes would be greater, as it would now contain the expression code 
(e.g., PLUS_EXPR), rather than having a DW_TAG_GIMPLE_PLUS_EXPR.



I strongly believe that for LTO to work, we are going to have to
implement some mechanism where the function bodies are loaded into the
compiler on demand (possibly kept in cache, but maybe not). 


Agreed.


This
will be more cumbersome if we have to keep reloading each object
file's abbrev table just to tear apart a single function in that .o
file.  While the abbrev sections average slightly less than %2 of the
of the size of the GIMPLE encoding for an entire file, each abbrev table
averages about the same size as a single function.


Interesting datapoint.

(Implied, but not stated, in your mail is the fact that the abbreviation 
table cannot be indexed directly.  If it could be, then you wouldn't 
have to read the entire abbreviation table for each function; you would 
just read the referenced abbreviations.  Because the abbreviation table 
records are of variable length, it is indeed true that you cannot make 
random accesses to the table.  So, this paragraph is just fleshing out 
your argument.)


I think the conclusion that you reach (that the size of the tables is a 
problem) depends on how you expect the compiler to process functions at 
link-time.  My expectation was that you would form a global control-flow 
graph for the entire program (by reading CFG data encoded in each .o 
file), eliminate unreachable functions, and then inline/optimize 
functions one-at-a-time.


If you sort the function-reading so that you prefer to read functions 
from the same object file in order, then I would expect that you would 
considerably reduce the impact of reading the abbreviation tables.

Re: First cut on outputing gimple for LTO using DWARF3. Discussion invited!!!!

2006-08-30 Thread Kenneth Zadeck
Mark Mitchell wrote:
> Kenneth Zadeck wrote:
>
>
>
>> This
>> will be more cumbersome if we have to keep reloading each object
>> file's abbrev table just to tear apart a single function in that .o
>> file.  While the abbrev sections average slightly less than 2% of
>> the size of the GIMPLE encoding for an entire file, each abbrev table
>> averages about the same size as a single function.
>
> Interesting datapoint.
>
> (Implied, but not stated, in your mail is the fact that the
> abbreviation table cannot be indexed directly.  If it could be, then
> you wouldn't have to read the entire abbreviation table for each
> function; you would just read the referenced abbreviations.  Because
> the abbreviation table records are of variable length, it is indeed
> true that you cannot make random accesses to the table.  So, this
> paragraph is just fleshing out your argument.)
>
> I think the conclusion that you reach (that the size of the tables is
> a problem) depends on how you expect the compiler to process functions
> at link-time.  My expectation was that you would form a global
> control-flow graph for the entire program (by reading CFG data encoded
> in each .o file), eliminate unreachable functions, and then
> inline/optimize functions one-at-a-time.
>
> If you sort the function-reading so that you prefer to read functions
> from the same object file in order, then I would expect that you would
> considerably reduce the impact of reading the abbreviation tables. 
> I'm making the assumption that if f calls N functions, then they
> probably come from < N object files.  I have no data to back up that
> assumption.
>
> (There is nothing that says that you can only have one abbreviation
> table for all functions.  You can equally well have one abbreviation
> table per function.  In that mode, you trade space (more abbreviation
> tables, and the same abbreviation appearing in multiple tables)
> against the fact that you now only need to read the abbreviation
> tables you need.  I'm not claiming this is a good idea.)
>
> I don't find this particular argument (that the abbreviation tables
> will double file I/O) very convincing.  I don't think it's likely that
> the problem we're going to have with LTO is running out of *virtual*
> memory, especially as 64-bit hardware becomes nearly universal.  The
> problem is going to be running out of physical memory, and thereby
> paging like crazy, running out of D-cache.  So, I'd assume you'd just
> read the tables as-needed, and never both discarding them.  As long as
> there is reasonable locality of reference to abbreviation tables
> (i.e., you can arrange to hit object files in groups), then the cost
> here doesn't seem like it would be very big.
>
Even if we decide that we are going to process all of the functions in
one file at one time, we still have to have access to the functions that
are going to be inlined into the function being compiled.  Getting at
those functions that are going to be inlined is where the doubled I/O
argument comes from.

I have never depended on the kindness of strangers or the virtues of
virtual memory.  I fear the size of the virtual memory when we go to
compile really large programs. 


>> 2) I PROMISED TO USE THE DWARF3 STACK MACHINE AND I DID NOT.
>
> I never imagined you doing this; as per above, I always expected that
> you would use DWARF tags for the expression nodes.  I agree that the
> stack-machine is ill-suited.
>
>> 3) THERE IS NO COMPRESSION IN DWARF3.
>
>> In 1 file per mode, zlib -9 compression is almost 6:1.  In 1 function
>> per mode, zlib -9 compression averages about 3:1.
>
> In my opinion, if you considered DWARF + zlib to be satisfactory, then
> I think that would be fine.  For LTO, we're allowed to do whatever we
> want.  I feel the same about your confession that you invented a new
> record form; if DWARF + extensions is a suitable format, that's fine.
> In other words, in principle, using a somewhat non-standard variant of
> DWARF for LTO doesn't seem evil to me -- if that met our needs.
>
One of the comments that was made by a person on the dwarf committee is
that the abbrev tables really can be used for compression.  If you have
information that is really common to a bunch of records, you can build
an abbrev entry with the common info in it. 

I have not seen a place where any use can be made of this for encoding
gimple except for a couple of places where I have encoded a true or
false.  I therefor really do not see that they really add anything
except for the code to read and write them. 


>> 2) LOCAL DECLARATIONS 
>> Mark was going to do all of the types and all of the declarations.
>> His plan was to use the existing DWARF3 and enhance it where it was
>> necessary eventually replacing the GCC type trees with direct
>> references to the DWARF3 symbol table.  
>
> > The types and global variables are likely OK, or at least Mark
>> should be able to add any missing info.
>
I had a discussion on chat today with drow and he indicated that you
were busily adding all of the missing stuff here.  I told him that I
thought this was fine as long as there is not a temporal drift in
information encoded for the types and decls between the time I write my
stuff and when the types and decls are written.

Re: regress and -m64

2006-08-30 Thread Bradley Lucier
After some discussion with Jack Howarth, I have found that the  
gfortran and libgomp executable tests on powerpc-apple-darwin8.7.0  
(at least) do not link the correct, just-built-using-"make  
bootstrap", libraries until those libraries have first been installed  
in $prefix/lib/...


I filed a bug report at

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=28913

I noted the different results there between two "make check"  
commands, one just before "make install" and one just after.


64-bit test results are now as follows.  (See also

http://gcc.gnu.org/ml/gcc-testresults/2006-08/msg01383.html

)

=== g++ Summary ===

# of expected passes            11595
# of unexpected failures        1350
# of expected failures          69
# of unresolved testcases       28
# of unsupported tests          129
/Users/lucier/programs/gcc/mainline/objdir64/gcc/testsuite/g++/../../g++  version 4.2.0 20060829 (experimental)

=== gcc Summary ===

# of expected passes            41550
# of unexpected failures        45
# of unexpected successes       1
# of expected failures          108
# of untested testcases         28
# of unsupported tests          507
/Users/lucier/programs/gcc/mainline/objdir64/gcc/xgcc  version 4.2.0 20060829 (experimental)

=== gfortran Summary ===

# of expected passes            14014
# of unexpected failures        33
# of unexpected successes       3
# of expected failures          7
# of unsupported tests          41
/Users/lucier/programs/gcc/mainline/objdir64/gcc/testsuite/gfortran/../../gfortran  version 4.2.0 20060829 (experimental)

=== objc Summary ===

# of expected passes            1707
# of unexpected failures        68
# of expected failures          7
# of unresolved testcases       1
# of unsupported tests          2
/Users/lucier/programs/gcc/mainline/objdir64/gcc/xgcc  version 4.2.0 20060829 (experimental)

=== libffi Summary ===

# of expected passes            472
# of unexpected failures        384
# of expected failures          8
# of unsupported tests          8

=== libgomp Summary ===

# of expected passes            1075
# of unexpected failures        205
# of unsupported tests          111

=== libjava Summary ===

# of expected passes            1776
# of unexpected failures        2069
# of expected failures          32
# of untested testcases         3021

=== libstdc++ Summary ===

# of expected passes            2052
# of unexpected failures        1668
# of unexpected successes       1
# of expected failures          15
# of unsupported tests          321



Re: regress and -m64

2006-08-30 Thread Jack Howarth
Bradley,
Something is still astray with your build configuration.
Look at my last set of results.

http://gcc.gnu.org/ml/gcc-testresults/2006-08/msg01333.html

I only have 28 unexpected failures for g++ at -m64 and you
have 1350. Likewise for libstdc++ at -m64, I only have 6
unexpected failures whereas you have 1668. Try building
some of the g++ testcases manually and see what the errors
are. Assuming you really have the ld64 from Xcode 2.3 installed
I am guessing you have some bogus copies of libstdc++ laying
around in your path.
Jack


Re: First cut on outputing gimple for LTO using DWARF3. Discussion invited!!!!

2006-08-30 Thread Mark Mitchell

Kenneth Zadeck wrote:


Even if we decide that we are going to process all of the functions in
one file at one time, we still have to have access to the functions that
are going to be inlined into the function being compiled.  Getting at
those functions that are going to be inlined is where the doubled I/O
argument comes from.


I understand -- but it's natural to expect that those functions will be 
clumped together.  In a gigantic program, I expect there are going to be 
clumps of tightly connected object files, with relatively few 
connections between the clumps.  So, you're likely to get good cache 
behavior for any per-object-file specific data that you need to access.



I have never depended on the kindness of strangers or the virtues of
virtual memory.  I fear the size of the virtual memory when we go to
compile really large programs. 


I don't think we're going to blow out a 64-bit address space any time 
soon.  Disks are big, but they are nowhere near *that* big, so it's 
going to be pretty hard for anyone to hand us that many .o files.  And, 
there's no point manually reading/writing stuff (as opposed to mapping 
it into memory), unless we actually run out of address space.


In fact, if you're going to design your own encoding formats, I would 
consider a format with self-relative pointers (or, offsets from some 
fixed base) that you could just map into memory.  It wouldn't be as 
compact as using compression, so the total number of bytes written when 
generating the object files would be bigger.  But, it will be very quick 
to load it into memory.
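
For what it's worth, the self-relative idea fits in a few lines of C
(purely illustrative; the struct and helpers are invented, not a proposal
for the actual encoding):

#include <stddef.h>
#include <stdint.h>

/* A record refers to another record by the offset from its own address,
   so the file can be mmap()ed at any base address and used directly,
   with no relocation pass.  */
struct selfrel
{
  int32_t offset;                /* 0 means "no target" */
};

static inline void *
selfrel_get (struct selfrel *r)
{
  return r->offset ? (char *) r + r->offset : NULL;
}

static inline void
selfrel_set (struct selfrel *r, void *target)
{
  r->offset = target ? (int32_t) ((char *) target - (char *) r) : 0;
}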


I guess my overriding concern is that we're focusing heavily on the data 
format here (DWARF?  Something else?  Memory-mappable?  What compression 
scheme?) and we may not have enough data.  I guess we just have to pick 
something and run with it.  I think we should try to keep that code as 
as separate as possible so that we can recover easily if whatever we 
pick turns out to be (another) bad choice. :-)



One of the comments that was made by a person on the dwarf committee is
that the abbrev tables really can be used for compression.  If you have
information that is really common to a bunch of records, you can build
an abbrev entry with the common info in it. 


Yes.  I was a little bit surprised that you don't seem to have seen much 
commonality.  If you recorded most of the tree flags, and treated them 
as DWARF attributes, I'd expect you would see relatively many 
expressions of a fixed form.  Like, there must be a lot of PLUS_EXPRs 
with TREE_USED set on them.  But, I gather that you're trying to avoid 
recording some of these flags, hoping either that (a) they won't be 
needed, or (b) you can recreate them when reading the file.  I think 
both (a) and (b) hold in many cases, so I think it's reasonable to 
assume we're writing out very few attributes.



I had a discussion on chat today with drow and he indicated that you
were busily adding all of the missing stuff here.


"All" is an overstatement. :-) Sandra is busily adding missing stuff and 
I'll be working on the new APIs you need.



I told him that I
thought this was fine as long as there is not a temporal drift in
information encoded for the types and decls between the time I write my
stuff and when the types and decls are written.


I'm not sure what this means.

Thanks,

--
Mark Mitchell
CodeSourcery
[EMAIL PROTECTED]
(650) 331-3385 x713



gcc.target/powerpc vs -m64

2006-08-30 Thread Jack Howarth
Geoff,
   I am assuming that quite a few of the remaining regressions at -m64
on Darwin PPC with your TImode patch applied will be resolved when Eric
posts his x86_64 patches. However there are a few in gcc.target/powerpc
which likely won't be addressed by those patches. I am seeing the following
test cases fail for -m64 only...

gcc-4 -m64  -O2 ppc64-abi-1.c
/var/tmp//cchkbsnr.s:22:Parameter syntax error (parameter 2)
/var/tmp//cchkbsnr.s:22:Invalid mnemonic 'got(r2)'
/var/tmp//cchkbsnr.s:93:Parameter syntax error (parameter 2)
/var/tmp//cchkbsnr.s:93:Invalid mnemonic 'got(r2)'
/var/tmp//cchkbsnr.s:161:Parameter syntax error (parameter 2)
/var/tmp//cchkbsnr.s:161:Invalid mnemonic 'got(r2)'
/var/tmp//cchkbsnr.s:252:Parameter syntax error (parameter 2)
/var/tmp//cchkbsnr.s:252:Invalid mnemonic 'got(r2)'
/var/tmp//cchkbsnr.s:372:Parameter syntax error (parameter 2)
/var/tmp//cchkbsnr.s:372:Invalid mnemonic 'got(r2)'
/var/tmp//cchkbsnr.s:450:Parameter syntax error (parameter 2)
/var/tmp//cchkbsnr.s:450:Invalid mnemonic 'got(r2)'

gcc-4 -m64 -O2 darwin-bool-1.c

darwin-bool-1.c:5: error: size of array 'dummy1' is too large

The failures for...

FAIL: gcc.target/powerpc/ppc-and-1.c scan-assembler rlwinm [0-9]+,[0-9]+,0,0,30
FAIL: gcc.target/powerpc/ppc-and-1.c scan-assembler rlwinm [0-9]+,[0-9]+,0,29,30
FAIL: gcc.target/powerpc/ppc-negeq0-1.c scan-assembler-not cntlzw

are a tad confusing because if I do...

gcc-4 -O2 -m64 -S -c ppc-and-1.c
grep rlwinm ppc-and-1.s
rlwinm r4,r4,0,0,30
rlwinm r4,r4,0,29,30
grep rldicr ppc-and-1.s
(no results)

This is confusing because it suggests the test *should* be passing!
On the other hand, the failure in the ppc-negeq0-1 is understandable...

gcc-4 -O2 -m64 -S -c ppc-negeq0-1.c
grep cntlzw ppc-negeq0-1.s
cntlzw r3,r3
cntlzw r3,r3

Geoff, should I file PR's for each of these?
Jack


Re: First cut on outputing gimple for LTO using DWARF3. Discussion invited!!!!

2006-08-30 Thread Seongbae Park

On 8/30/06, Mark Mitchell <[EMAIL PROTECTED]> wrote:
...

I guess my overriding concern is that we're focusing heavily on the data
format here (DWARF?  Something else?  Memory-mappable?  What compression
scheme?) and we may not have enough data.  I guess we just have to pick
something and run with it.  I think we should try to keep that code as
as separate as possible so that we can recover easily if whatever we
pick turns out to be (another) bad choice. :-)


At the risk of stating the obvious and also repeating myself,
please allow me to give my thoughts on this issue.

I think we should go even a step further than "try to keep the code
as separate".
We should try to come up with a set of
procedural and data-structure interfaces for the input/output
of the program structure,
and try to *completely* separate the optimization/data-structure cleanup
work from the encoding/decoding.

Beside the basic requirement of being able to pass through
enough information to produce valid program,
I think there is a critical requirement
to implement inter-module/inter-procedural optimization efficiently
- that the I/O interface allows efficient handling of
iterating through module/procedure-level information
without reading each and every module/procedure bodies
(as Ken mentioned).

There is a certain amount of information per object/procedure that
is accessed during different optimizations and with sufficiently
different patterns -
e.g. the type tree is naturally object-level information
that we may want to go through for each and every object file
without reading all function bodies,
and other function-level information, such as caller/callee information,
would be useful without the function body.

We'll need to identify such information (in other words,
the requirements of the interprocedural optimization/analysis)
so that the new interface provides ways to walk through it
without loading the entire function bodies.  Even with a large address space,
if the data is scattered everywhere it becomes extremely inefficient
on modern machines to go through it,
so it's actually more important to identify what logical information
we want to access during various interprocedural optimizations,
and the I/O interface needs to handle it efficiently.

This requirement should dictate how we encode/layout the data
into the disk, before anything else. Also how the information is
presented to the actual inter-module optimization/analysis.
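
Purely as an illustration of the kind of interface being argued for
(every name below is invented; this is not existing GCC code or a
concrete proposal), it might look something like:

/* Summaries are walkable without touching bodies; bodies are
   materialized on demand and can be released again.  */
typedef struct lto_module lto_module;

typedef struct lto_fn_summary
{
  const char *name;
  unsigned n_callees;
  const char *const *callees;    /* call-graph info without the body */
} lto_fn_summary;

lto_module *lto_open (const char *path);
unsigned lto_fn_count (lto_module *m);
const lto_fn_summary *lto_fn_summary_get (lto_module *m, unsigned i);

/* Only read and decode GIMPLE when an optimizer needs the body.  */
void *lto_fn_body_load (lto_module *m, unsigned i);
void lto_fn_body_release (void *body);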

Also, part of defining the interface would involve restricting
the existing structures (e.g. GIMPLE) to a possibly more limited form
than what's currently allowed. By virtue of having an interface
that separates the encoding/decoding from the rest of the compilation,
we can throw away and recompute certain information
(e.g. often the control flow graph can be recovered,
hence does not need to be encoded),
but those details can be worked out as the implementation of the I/O
interface takes shape.
--
#pragma ident "Seongbae Park, compiler, http://seongbae.blogspot.com";