Re: volatile qualifier hurts single-threaded optimized case
> bits/atomicity.h has volatile qualifiers on the _Atomic_word* arguments to
> the __*_single and __*_dispatch variants of the atomic operations.  This
> hurts especially the single-threaded optimization variants, which are
> usually inlined.  Removing those qualifiers allows to reduce code size
> significantly, as can be seen in the following simple testcase.

I've been able to reproduce this with your example and the following
patch.  Thanks for looking at this.

without volatile:
    19: 546 FUNC GLOBAL DEFAULT 2 _Z3fooPKcS0_
with:
    19: 578 FUNC GLOBAL DEFAULT 2 _Z3fooPKcS0_

I don't understand the ABI objections to your suggestion, and feel like
there must be a misunderstanding somewhere.  These helper functions are
not exported at all, in fact.  Also, the *_dispatch and *_single parts
of the atomicity.h interface are new with 4.2, so I'd like to get the
correct signatures in with their introduction, and not have to patch
this up later.

tested x86/linux
abi tested x86/linux

-benjamin

2006-08-30  Benjamin Kosnik  <[EMAIL PROTECTED]>
	    Richard Guenther  <[EMAIL PROTECTED]>

	* config/abi/pre/gnu.ver: Spell out exact signatures for atomic
	access functions.
	* include/bits/atomicity.h (__atomic_add_dispatch): Remove
	volatile qualification for _Atomic_word argument.
	(__atomic_add_single): Same.
	(__exchange_and_add_dispatch): Same.
	(__exchange_and_add_single): Same.
Index: include/bits/atomicity.h
===================================================================
--- include/bits/atomicity.h	(revision 116581)
+++ include/bits/atomicity.h	(working copy)
@@ -60,7 +60,7 @@
 #endif
 
   static inline _Atomic_word
-  __exchange_and_add_single(volatile _Atomic_word* __mem, int __val)
+  __exchange_and_add_single(_Atomic_word* __mem, int __val)
   {
     _Atomic_word __result = *__mem;
     *__mem += __val;
@@ -68,12 +68,12 @@
   }
 
   static inline void
-  __atomic_add_single(volatile _Atomic_word* __mem, int __val)
+  __atomic_add_single(_Atomic_word* __mem, int __val)
   { *__mem += __val; }
 
   static inline _Atomic_word
   __attribute__ ((__unused__))
-  __exchange_and_add_dispatch(volatile _Atomic_word* __mem, int __val)
+  __exchange_and_add_dispatch(_Atomic_word* __mem, int __val)
   {
 #ifdef __GTHREADS
     if (__gthread_active_p())
@@ -87,7 +87,7 @@
 
   static inline void
   __attribute__ ((__unused__))
-  __atomic_add_dispatch(volatile _Atomic_word* __mem, int __val)
+  __atomic_add_dispatch(_Atomic_word* __mem, int __val)
   {
 #ifdef __GTHREADS
     if (__gthread_active_p())
@@ -101,8 +101,9 @@
 
 _GLIBCXX_END_NAMESPACE
 
-// Even if the CPU doesn't need a memory barrier, we need to ensure that
-// the compiler doesn't reorder memory accesses across the barriers.
+// Even if the CPU doesn't need a memory barrier, we need to ensure
+// that the compiler doesn't reorder memory accesses across the
+// barriers.
 #ifndef _GLIBCXX_READ_MEM_BARRIER
 #define _GLIBCXX_READ_MEM_BARRIER __asm __volatile ("":::"memory")
 #endif

Index: config/abi/pre/gnu.ver
===================================================================
--- config/abi/pre/gnu.ver	(revision 116581)
+++ config/abi/pre/gnu.ver	(working copy)
@@ -378,8 +378,8 @@
 
     # __gnu_cxx::__atomic_add
     # __gnu_cxx::__exchange_and_add
-    _ZN9__gnu_cxx12__atomic_add*;
-    _ZN9__gnu_cxx18__exchange_and_add*;
+    _ZN9__gnu_cxx12__atomic_addEPVii;
+    _ZN9__gnu_cxx18__exchange_and_addEPVii;
 
     # debug mode
     _ZN10__gnu_norm15_List_node_base4hook*;
Re: volatile qualifier hurts single-threaded optimized case
On 8/30/06, Benjamin Kosnik <[EMAIL PROTECTED]> wrote:
> > bits/atomicity.h has volatile qualifiers on the _Atomic_word* arguments
> > to the __*_single and __*_dispatch variants of the atomic operations.
> > This hurts especially the single-threaded optimization variants, which
> > are usually inlined.  Removing those qualifiers allows to reduce code
> > size significantly, as can be seen in the following simple testcase.
>
> I've been able to reproduce this with your example and the following
> patch.  Thanks for looking at this.
>
> without volatile:
>     19: 546 FUNC GLOBAL DEFAULT 2 _Z3fooPKcS0_
> with:
>     19: 578 FUNC GLOBAL DEFAULT 2 _Z3fooPKcS0_
>
> I don't understand the ABI objections to your suggestion, and feel like
> there must be a misunderstanding somewhere.  These helper functions are
> not exported at all, in fact.  Also, the *_dispatch and *_single parts
> of the atomicity.h interface are new with 4.2, so I'd like to get the
> correct signatures in with their introduction, and not have to patch
> this up later.

Oh, indeed.  If the *_dispatch and *_single parts are new, we can change
the function signatures there.  The patch looks fine and should get us
the runtime and size improvements I saw.

Thanks for having a look!
Richard.
Re: Successful Build: gcc-4.1-20051230 i686-pc-mingw32
Using:

> ../gcc-4.1.1/configure --host=mingw32 --build=mingw32 --target=mingw32
>   --enable-threads --enable-optimize --disable-nls
>   --enable-languages=c,c++,fortran --prefix=/c/prog/mingw4
>   --with-cpu=pentium4 --with-ld=/c/prog/mingw4/bin/ld.exe
>   --with-as=/c/prog/mingw4/bin/as.exe --with-gmp=/c/prog/mingw4

... (successful output)

> make bootstrap

...
stage1/xgcc.exe -Bstage1/ -B/c/prog/mingw4/mingw32/bin/ -c -g -O2 -DIN_GCC
  -W -Wall -Wwrite-strings -Wstrict-prototypes -Wmissing-prototypes
  -pedantic -Wno-long-long -Wno-variadic-macros -Wold-style-definition
  -Wmissing-format-attribute -Wno-format -DHAVE_CONFIG_H -I. -Ifortran
  -I../../gcc-4.1.1/gcc -I../../gcc-4.1.1/gcc/fortran
  -I../../gcc-4.1.1/gcc/../include -I../../gcc-4.1.1/gcc/../libcpp/include
  -I/c/prog/mingw4/include ../../gcc-4.1.1/gcc/fortran/arith.c
  -o fortran/arith.o
In file included from ../../gcc-4.1.1/gcc/system.h:42,
                 from ../../gcc-4.1.1/gcc/fortran/arith.c:29:
c:/prog/mingw4/include/stdio.h:219: warning: no previous prototype for 'vsnprintf'
c:/prog/mingw4/include/stdio.h:258: warning: no previous prototype for 'getc'
c:/prog/mingw4/include/stdio.h:265: warning: no previous prototype for 'putc'
c:/prog/mingw4/include/stdio.h:272: warning: no previous prototype for 'getchar'
c:/prog/mingw4/include/stdio.h:279: warning: no previous prototype for 'putchar'
In file included from ../../gcc-4.1.1/gcc/system.h:42,
                 from ../../gcc-4.1.1/gcc/fortran/arith.c:29:
c:/prog/mingw4/include/stdio.h:401: warning: no previous prototype for 'fopen64'
c:/prog/mingw4/include/stdio.h:413: warning: no previous prototype for 'ftello64'
c:/prog/mingw4/include/stdio.h:468: warning: no previous prototype for 'vsnwprintf'
In file included from ../../gcc-4.1.1/gcc/system.h:195,
                 from ../../gcc-4.1.1/gcc/fortran/arith.c:29:
c:/prog/mingw4/include/string.h:97: warning: no previous prototype for 'strcasecmp'
c:/prog/mingw4/include/string.h:103: warning: no previous prototype for 'strncasecmp'
In file included from c:/prog/mingw4/include/unistd.h:10,
                 from ../../gcc-4.1.1/gcc/system.h:231,
                 from ../../gcc-4.1.1/gcc/fortran/arith.c:29:
c:/prog/mingw4/include/io.h:150: warning: no previous prototype for 'lseek64'
In file included from ../../gcc-4.1.1/gcc/fortran/arith.c:31:
../../gcc-4.1.1/gcc/fortran/gfortran.h:1738: error: expected declaration
specifiers or '...' before 'uint'
make[2]: *** [fortran/arith.o] Error 1
make[2]: Leaving directory `/home/schuttek/gcc-4.1.1-build/gcc'
make[1]: *** [stage2_build] Error 2
make[1]: Leaving directory `/home/schuttek/gcc-4.1.1-build/gcc'
make: *** [bootstrap] Error 2

Anyone an idea how to solve this?

--
View this message in context:
http://www.nabble.com/Successful-Build%3A-gcc-4.1-20051230-i686-pc-mingw32-tf834182.html#a6058353
Sent from the gcc - Dev forum at Nabble.com.
segmentation fault in building __floatdisf.o
hi,

I'm having some problem during the build of libgcc2, in function
__floatdisf (building __floatdisf.o).  Actually I'm modifying the MIPS
backend.  The error is:

../../gcc-4.1.0/gcc/libgcc2.c: In function '__floatdisf':
../../gcc-4.1.0/gcc/libgcc2.c:1354: internal compiler error: Segmentation fault

I tried to debug the reason for the crash, and the following are my
findings.  Before the crash the following pattern is called:

(define_expand "cmp<mode>"
  [(set (cc0)
	(compare:CC (match_operand:GPR 0 "register_operand")
		    (match_operand:GPR 1 "nonmemory_operand")))]
  ""
{
  fprintf (stderr, " cmp \n");
  branch_cmp[0] = operands[0];
  branch_cmp[1] = operands[1];
  debug_rtx (branch_cmp[0]);
  debug_rtx (branch_cmp[1]);
  DONE;
})

As you can see, I've printed the operands, which are as follows:

operands[0] -- (subreg:SI (reg:DI 30) 4)
operands[1] -- (subreg:SI (reg:DI 25 [ u.0 ]) 4)

After this I think it tries to match some bCOND pattern, but in this
case it fails.  Is this proposition of mine correct?

In another, working case no error is generated.  The following is the
sequence of called patterns:

(define_expand "cmp<mode>"
  [(set (cc0)
	(compare:CC (match_operand:GPR 0 "register_operand")
		    (match_operand:GPR 1 "nonmemory_operand")))]
  ""
{
  fprintf (stderr, " cmp \n");
  cmp_operands[0] = operands[0];
  cmp_operands[1] = operands[1];
  debug_rtx (cmp_operands[0]);
  debug_rtx (cmp_operands[1]);
  DONE;
})

Here the operands are:

operands[0] -- (subreg:SI (reg:DI 30) 0)
operands[1] -- (subreg:SI (reg:DI 25 [ u.0 ]) 0)

Then the following pattern is matched:

(define_expand "bltu"
  [(set (pc)
	(if_then_else (ltu (cc0) (const_int 0))
		      (label_ref (match_operand 0 ""))
		      (pc)))]
  ""
{
  fprintf (stderr, "\n branch_fp 8 bltu\n");
})

So in the first, failed case it must match the above-mentioned pattern
but fails to do so.  The only difference seems to be that the byte
offset in the subreg expression is different: in the failed case it is
4, and in the successful case it is 0.

Both directories seem to be copies of each other, so why are the
operands different in the cmpsi patterns?  There are no floating-point
registers; the option passed to gcc is -msoft-float.

I've tried my best to track the problem but could not, due to my
limited knowledge.  Would you please give me some hint to debug the
problem?

thanks,
shahzad
RE: segmentation fault in building __floatdisf.o
On 30 August 2006 15:11, kernel coder wrote:

> hi,
> I'm having some problem during the build of libgcc2, in function
> __floatdisf (building __floatdisf.o).  Actually I'm modifying the
> MIPS backend.  The error is:
>
> ../../gcc-4.1.0/gcc/libgcc2.c: In function '__floatdisf':
> ../../gcc-4.1.0/gcc/libgcc2.c:1354: internal compiler error:
> Segmentation fault

This is always the first sign of a bug in your backend, as it's the
first bit of code that gets compiled for the target by the
newly-compiled backend.  In this case, it's a really bad bug, because
we're bombing out with a SEGV rather than getting a nice assert.  This
could be because of dereferencing a null pointer.

> Before the crash the following pattern is called:
>
> (define_expand "cmp<mode>"
>   [(set (cc0)
> 	(compare:CC (match_operand:GPR 0 "register_operand")
> 		    (match_operand:GPR 1 "nonmemory_operand")))]
>   ""
> {
>   fprintf (stderr, " cmp \n");
>   branch_cmp[0] = operands[0];
>   branch_cmp[1] = operands[1];
>   debug_rtx (branch_cmp[0]);
>   debug_rtx (branch_cmp[1]);
>   DONE;
> })
>
> (subreg:SI (reg:DI 30) 4)
> (subreg:SI (reg:DI 25 [ u.0 ]) 4)
>
> After this I think it tries to match some bCOND pattern, but in this
> case it fails.
>
> Is this proposition of mine correct?

Unlikely.  You'd expect to see a proper ICE message with a backtrace if
recog failed.

> In another, working case no error is generated.  The following is the
> sequence of called patterns:
>
> (define_expand "cmp<mode>"
>   [(set (cc0)
> 	(compare:CC (match_operand:GPR 0 "register_operand")
> 		    (match_operand:GPR 1 "nonmemory_operand")))]
>   ""
> {
>   fprintf (stderr, " cmp \n");
>   cmp_operands[0] = operands[0];
>   cmp_operands[1] = operands[1];
>   debug_rtx (cmp_operands[0]);
>   debug_rtx (cmp_operands[1]);
>   DONE;
> })
>
> (subreg:SI (reg:DI 30) 0)
> (subreg:SI (reg:DI 25 [ u.0 ]) 0)
>
> Then the following pattern is matched:
>
> (define_expand "bltu"
>   [(set (pc)
> 	(if_then_else (ltu (cc0) (const_int 0))
> 		      (label_ref (match_operand 0 ""))
> 		      (pc)))]
>   ""
> {
>   fprintf (stderr, "\n branch_fp 8 bltu\n");
> })
>
> I've tried my best to track the problem but could not, due to my
> limited knowledge.  Would you please give me some hint to debug the
> problem?

I suspect that what's going wrong is that, in the error case, one of the
'branch_cmp' or 'cmp_operands' arrays is getting set, but when the
branch pattern comes to be matched, it is the /other/ array that is
used, which is where the null pointer is coming from.

You've given no information about what you've actually changed, and I'm
no MIPS expert, but in my local copy of the gcc 4 sources there's no
such thing as 'branch_cmp', only 'cmp_operands', whereas in the gcc 3
series it's the other way round.  So I'm guessing that you've added a
new variant of the cmp expander, based it on some old code from a
series 3 version of the compiler, and it doesn't work in series 4?

cheers,
  DaveK
--
Can't think of a witty .sigline today
MyGCC and whole program static analysis
Maybe some of you are aware of MyGCC, http://mygcc.free.fr/, which
seems to be an extended GCC that adds some kind of static analysis.

I'm quite surprised that the mygcc page gives x86/linux binaries, but
no source tarball of their compiler (this seems to me against the
spirit of the GPL licence, but I am not a lawyer).

As some few people might already know, the GGCC (globalgcc) project is
just starting (partly funded within the ITEA framework by French,
Spanish, and Swedish public money) - its kick-off meeting is next week
in Paris.  GGCC aims to provide a (GPL open-source) extension to GCC
for program-wide static analysis (& optimisations) and coding-rules
validation.  But this mail is not a formal announcement of it...

I am also extremely interested in the LTO framework, in particular its
persistence of GIMPLE trees.  Could the LTO people explain (if
possible) whether their framework is extensible (to some new GIMPLE
nodes) and usable in some other setting (for example, storing
program-wide static analysis partial results, perhaps in a
"project"-related database or file)?  It is too bad that they only
store information in DWARF3 format...  Do they have some technical
description of their format and persistence machinery (I've read their
introductory paper)?

BTW, I am still a newbie on GCC...

Regards.
--
Basile STARYNKEVITCH         http://starynkevitch.net/Basile/
email: basile<at>starynkevitch<dot>net
aliases: basile<at>tunes<dot>org = bstarynk<at>nerim<dot>net
8, rue de la Faïencerie, 92340 Bourg La Reine, France
Re: MyGCC and whole program static analysis
Le Wed, Aug 30, 2006 at 06:36:19PM +0200, basile écrivait/wrote:
>
> Maybe some of you are aware of MyGCC http://mygcc.free.fr/ which
> seems to be an extended GCC to add some kind of static analysis.
>
> I'm quite surprised that the mygcc page gives x86/linux binaries, but
> no source tarball of their compiler (this seems to me against the
> spirit of the GPL licence, but I am not a lawyer).

My public apologies to MyGCC.  There is a patch at
http://gcc.gnu.org/ml/gcc-patches/2006-05/msg01437.html but sadly
http://mygcc.free.fr does not provide any link to it.

It is sad to have to google to find their patch; it would be simpler if
they linked to it (or even gave a full source tarball).

Regards.
--
Basile STARYNKEVITCH         http://starynkevitch.net/Basile/
email: basile<at>starynkevitch<dot>net
aliases: basile<at>tunes<dot>org = bstarynk<at>nerim<dot>net
8, rue de la Faïencerie, 92340 Bourg La Reine, France
Re: MyGCC and whole program static analysis
On Wed, Aug 30, 2006 at 06:36:19PM +0200, Basile STARYNKEVITCH wrote:
> Maybe some of you are aware of MyGCC http://mygcc.free.fr/ which
> seems to be an extended GCC to add some kind of static analysis.
>
> I'm quite surprised that the mygcc page gives x86/linux binaries, but
> no source tarball of their compiler (this seems to me against the
> spirit of the GPL licence, but I am not a lawyer).

Not just the spirit; the GPL requires that full sources be made
available, not just a patch (alternatively, a written offer to provide
full source can be provided with the binary, but one or the other is
required).  However, I can't tell if there is a violation, since I
haven't downloaded the tarball and looked for written offers, and it
appears that if there is a violation, it is a well-intentioned error.
This is the kind of thing that the FSF usually resolves quietly and
amicably.
Re: MyGCC and whole program static analysis
On Wed, Aug 30, 2006 at 06:52:59PM +0200, Basile STARYNKEVITCH wrote:
> Le Wed, Aug 30, 2006 at 06:36:19PM +0200, basile écrivait/wrote:
> >
> > Maybe some of you are aware of MyGCC http://mygcc.free.fr/ which
> > seems to be an extended GCC to add some kind of static analysis.
> >
> > I'm quite surprised that the mygcc page gives x86/linux binaries,
> > but no source tarball of their compiler (this seems to me against
> > the spirit of the GPL licence, but I am not a lawyer).
>
> My public apologies to MyGCC.  There is a patch on
> http://gcc.gnu.org/ml/gcc-patches/2006-05/msg01437.html but sadly the
> http://mygcc.free.fr does not provide any link to it.

You shouldn't apologize.  The MyGCC people need to read the GPL FAQ,
particularly

http://www.gnu.org/licenses/gpl-faq.html#DistributingSourceIsInconvenient
RE: MyGCC and whole program static analysis
On 30 August 2006 17:53, Joe Buck wrote:

> On Wed, Aug 30, 2006 at 06:36:19PM +0200, Basile STARYNKEVITCH wrote:
>> Maybe some of you are aware of MyGCC http://mygcc.free.fr/ which
>> seems to be an extended GCC to add some kind of static analysis.
>>
>> I'm quite surprised that the mygcc page gives x86/linux binaries, but
>> no source tarball of their compiler (this seems to me against the
>> spirit of the GPL licence, but I am not a lawyer).
>
> Not just the spirit; the GPL requires that full sources be made
> available, not just a patch (alternatively, a written offer to provide
> full source can be provided with the binary, but one or the other is
> required).  However, I can't tell if there is a violation, since I
> haven't downloaded the tarball and looked for written offers, and it
> appears that if there is a violation, it is a well-intentioned error.
> This is the kind of thing that the FSF usually resolves quietly and
> amicably.

Do you know if the included copy of $prefix/man/man7/gpl.7 counts as a
"written offer"?

cheers,
  DaveK
--
Can't think of a witty .sigline today
Re: MyGCC and whole program static analysis
Joe Buck wrote:
> On Wed, Aug 30, 2006 at 06:52:59PM +0200, Basile STARYNKEVITCH wrote:
> > Le Wed, Aug 30, 2006 at 06:36:19PM +0200, basile écrivait/wrote:
> > >
> > > Maybe some of you are aware of MyGCC http://mygcc.free.fr/ which
> > > seems to be an extended GCC to add some kind of static analysis.
> > >
> > > I'm quite surprised that the mygcc page gives x86/linux binaries,
> > > but no source tarball of their compiler (this seems to me against
> > > the spirit of the GPL licence, but I am not a lawyer).
> >
> > My public apologies to MyGCC.  There is a patch on
> > http://gcc.gnu.org/ml/gcc-patches/2006-05/msg01437.html but sadly
> > the http://mygcc.free.fr does not provide any link to it.
>
> You shouldn't apologize.  The MyGCC people need to read the GPL FAQ,
> particularly
>
> http://www.gnu.org/licenses/gpl-faq.html#DistributingSourceIsInconvenient

If somebody wants to try mygcc, I included the patch in the
graphite-branch some months ago.

In my opinion the patch needs major rework and improvements to be
included in trunk.  You're welcome to help improving that patch.

Sebastian
Re: MyGCC and whole program static analysis
Sebastian Pop wrote:
> In my opinion the patch needs major rework and improvements to be
> included in trunk.

Here is my short review of the mygcc patch that lists some possible
improvements and things that have to be redesigned:
http://gcc.gnu.org/ml/gcc-patches/2006-08/msg00616.html
Re: MyGCC and whole program static analysis
On Wed, Aug 30, 2006 at 06:18:06PM +0100, Dave Korn wrote:
> Do you know if the included copy of $prefix/man/man7/gpl.7 counts as a
> "written offer"?

We're way off topic, so I'll reply to Dave offline.  It appears
otherwise we'll have a big gnu.misc.discuss thread here.
Inserting function calls
Dear all,

I have been trying to insert function calls during a new pass in the
compiler, but it does not seem to like my way of doing it.  The basic
idea is to insert a function call before any load in the program (later
on I'll be selecting a few loads, but for now I just want to do it for
each and every one).

This looks like the profiling pass, but I can't seem to understand what
is the matter...  This is what the compiler says when I try to compile
this test case:

hello.c:18: internal compiler error: tree check: expected ssa_name,
have symbol_memory_tag in verify_ssa, at tree-ssa.c:776

This is what I did in my pass:

/* Generation of the call type; the function will be
   of type void (*) (int).  */
tree call_type = build_function_type_list (void_type_node,
					   integer_type_node, NULL_TREE);
tree call_fn = build_fn_decl ("__MarkovMainEntry", call_type);

data_reference_p a;
for (j = 0; VEC_iterate (data_reference_p, datarefs, j, a); j++)
  {
    tree stmt = DR_STMT (a);

    /* Is it a load?  */
    if (DR_IS_READ (a))
      {
	printf ("Have a load : %d\n", compteur_interne);
	tree compteur = build_int_cst (integer_type_node,
				       compteur_interne);
	compteur_interne++;

	/* Argument creation, just pass the constant integer node.  */
	tree args = tree_cons (NULL_TREE, compteur, NULL_TREE);
	tree call = build_function_call_expr (call_fn, args);
	block_stmt_iterator bsi;
	bsi = bsi_for_stmt (stmt);
	bsi_insert_after (&bsi, call, BSI_SAME_STMT);
      }
  }

And the test code is:

#include <stdio.h>

int fct (int *t);

int main ()
{
  int tab[10];
  int i;

  tab[0] = 0;
  printf ("Hello World %d\n", tab[5]);
  printf ("Sum is : %d\n", fct (tab));
  return 0;
}

int fct (int *t)
{
  int i = 10;
  int tab[10];

  while (i)
    {
      tab[i] = tab[i]*2 + 1 + tab[i+1];
      i--;
      printf ("Here %d\n", tab[5]);
    }
  return t[i];
}

Does anyone have any ideas on how I can modify my function and get it
to insert the calls correctly?
Thanks for any help in finishing this pass,
Jc

Finally, here is the patch that shows what I did, using the 4.2 trunk:

Index: doc/invoke.texi
===================================================================
--- doc/invoke.texi	(revision 116394)
+++ doc/invoke.texi	(working copy)
@@ -342,7 +342,7 @@ Objective-C and Objective-C++ Dialects}.
 -fsplit-ivs-in-unroller -funswitch-loops @gol
 -fvariable-expansion-in-unroller @gol
 -ftree-pre -ftree-ccp -ftree-dce -ftree-loop-optimize @gol
--ftree-loop-linear -ftree-loop-im -ftree-loop-ivcanon -fivopts @gol
+-ftree-loop-linear -ftree-load-inst -ftree-loop-im -ftree-loop-ivcanon -fivopts @gol
 -ftree-dominator-opts -ftree-dse -ftree-copyrename -ftree-sink @gol
 -ftree-ch -ftree-sra -ftree-ter -ftree-lrs -ftree-fre -ftree-vectorize @gol
 -ftree-vect-loop-version -ftree-salias -fipa-pta -fweb @gol
@@ -5120,6 +5120,10 @@ at @option{-O} and higher.
 Perform linear loop transformations on tree.  This flag can improve cache
 performance and allow further loop optimizations to take place.
 
+@item -ftree-load-inst
+Perform instrumentation of loads on trees.  This flag inserts a call to a
+profiling function before the loads of a program.
+
 @item -ftree-loop-im
 Perform loop invariant motion on trees.  This pass moves only invariants
 that would be hard to handle at RTL level (function calls, operations
 that expand to
Index: tree-pass.h
===================================================================
--- tree-pass.h	(revision 116394)
+++ tree-pass.h	(working copy)
@@ -251,6 +251,7 @@ extern struct tree_opt_pass pass_empty_l
 extern struct tree_opt_pass pass_record_bounds;
 extern struct tree_opt_pass pass_if_conversion;
 extern struct tree_opt_pass pass_vectorize;
+extern struct tree_opt_pass pass_load_inst;
 extern struct tree_opt_pass pass_complete_unroll;
 extern struct tree_opt_pass pass_loop_prefetch;
 extern struct tree_opt_pass pass_iv_optimize;
Index: tree-load-inst.c
===================================================================
--- tree-load-inst.c	(revision 0)
+++ tree-load-inst.c	(revision 0)
@@ -0,0 +1,125 @@
+#include <stdio.h>
+#include "config.h"
+#include "system.h"
+#include "coretypes.h"
+#include "tm.h"
+#include "ggc.h"
+#include "tree.h"
+#include "target.h"
+
+#include "rtl.h"
+#include "basic-block.h"
+#include "diagnostic.h"
+#include "tree-flow.h"
+#include "tree-dump.h"
+#include "timevar.h"
+#include "cfgloop.h"
+#include "expr.h"
+#include "optabs.h"
+#include "tree-chrec.h"
+#include "tree-data-ref.h"
+#include "tree-scalar-evolution.h"
+#include "tree-pass.h"
+#include "lambda.h"
+
+extern struct loops *current_loops;
+static void tree_handle_loop (struct loops *loops);
Re: Can we limit one bug fix per checkin please?
On Sun, Jul 30, 2006 at 04:38:38PM -0700, H. J. Lu wrote:
> When one checkin is used to fix multiple bugs, it isn't easy to back
> out just the offending bug fix only if one of the bug fixes causes
> regression.  Can we limit one bug fix per checkin?
>
> Thanks.

It happened again.  This checkin:

http://gcc.gnu.org/ml/gcc-cvs/2006-08/msg00427.html

causes the regression:

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=28908

There are several bug fixes in revision 116268, so it is hard to back
out just one of them.  Paul, would you mind fixing just one bug per
checkin?

Thanks.

H.J.
Re: Can we limit one bug fix per checkin please?
> It happened again.  This checkin:

Yes, the standard thing is one checkin per fix, but it is also annoying
that you (HJL) don't understand how to file a bug report, which is
actually documented.

-- Pinski
Re: Inserting function calls
[EMAIL PROTECTED] wrote on 08/30/06 14:44:
> Does anyone have any ideas on how I can modify my function and get it
> to insert the calls correctly?

Browse through omp-low.c, in particular create_omp_child_function and
expand_omp_parallel.  The new function needs to be added to the call
graph and queued for processing (cgraph_add_new_function).
Re: First cut on outputing gimple for LTO using DWARF3. Discussion invited!!!!
>>>>> "KZ" == Kenneth Zadeck <[EMAIL PROTECTED]> writes:

KZ> 2) To have a discussion about the use of DWARF3.  I am now against
KZ> the use of DWARF3 for encoding the GIMPLE.

FWIW your arguments convinced me.  I think what matters most is that
the resulting format be relatively well documented (say, better than
GIMPLE), efficient, suitable, etc.  Reusing DWARF3 seems cute but
inessential.

[...]

KZ> +    case TRUTH_NOT_EXPR:
KZ> +    case VIEW_CONVERT_EXPR:
KZ> +#if STUPID_TYPE_SYSTEM
KZ> +      output_type_ref (ob, TREE_TYPE (expr));
KZ> +#endif

I think VIEW_CONVERT_EXPR needs to be treated like NOP_EXPR and
CONVERT_EXPR in the STUPID_TYPE_SYSTEM case.  VIEW_CONVERT_EXPR is a
type-casting expression.

KZ> +/* When we get a strongly typed gimple, which may not be for another
KZ> +   15 years, this flag should be set to 0 so we do not waste so much
KZ> +   space printing out largely redundant type codes.  */
KZ> +#define STUPID_TYPE_SYSTEM 1

You could write a more compact form by emitting explicit "fake nop"
nodes where needed, and then stripping those when reading.  I think
this would avoid triggering the optimizer bugs, as the reloaded trees
would be identical.

This does bring up another point about the format, though: there's got
to be some versioning capability in there.

Tom
Re: First cut on outputing gimple for LTO using DWARF3. Discussion invited!!!!
> [...]
> KZ> +    case TRUTH_NOT_EXPR:
> KZ> +    case VIEW_CONVERT_EXPR:
> KZ> +#if STUPID_TYPE_SYSTEM
> KZ> +      output_type_ref (ob, TREE_TYPE (expr));
> KZ> +#endif
>
> I think VIEW_CONVERT_EXPR needs to be treated like NOP_EXPR and
> CONVERT_EXPR in the STUPID_TYPE_SYSTEM case.  VIEW_CONVERT_EXPR is a
> type-casting expression.

No, it is not; it is more complex than just a simple type-casting
expression.  It is a cast which does nothing to the bits at all.  It
acts more like a reference than a cast, as it can also be used on the
left-hand side.  I am working on a patch to have GCC use VCE more.

> You could write a more compact form by emitting explicit "fake nop"
> nodes where needed, and then stripping those when reading.  I think
> this would avoid triggering the optimizer bugs, as the reloaded trees
> would be identical.

Actually it would be better just to fix the problem in the first place,
as mentioned before.

-- Pinski
Re: Can we limit one bug fix per checkin please?
Andrew Pinski wrote:
>> It happened again.  This checkin:

Yes, we did discuss it before - sorry, HJ; I am trying to get as much
done as possible before I am forced to reduce my work on gfortran.  It
is much easier to do multiple patches, but I will desist.

> Yes, the standard thing is one checkin per fix ...snip...

OK - will do.

However, all that said, I am onto it.

Paul
problem when returning a structure containing arrays.
Hi,

I compiled the following code with gcc -shared buglib.c -o buglib.dll:

>>> buglib.h is:

struct T
{
  double x[256];
  double y[256];
  int i;
};

struct T fun(int a);

>>> buglib.c is:

#include "buglib.h"

struct T fun(int a)
{
  struct T retval;
  int i;
  for (i = 0; i < 256; ++i)
    {
      retval.x[i] = (double) i;
      retval.y[i] = (double) i;
    }
  return retval;
}

If I link this lib to

>>> main.c

#include <stdio.h>
#include "buglib.h"

int main()
{
  struct T x = fun(1);
  int i;
  for (i = 0; i < 10; ++i)
    printf("%d %d\n", x.x[i], x.y[i]);
}

now the output is totally wrong!

I tried it with the cygwin port gcc 3.4.4 and with gcc 3.3.1 on Suse
Linux.  Any hints?

Greetings,
Uwe
Re: problem when returning a structure containing arrays
Sorry, I made a mistake with the printf() formatting characters.

Greetings,
Uwe
Re: Inserting function calls
> Browse through omp-low.c, in particular create_omp_child_function

I understand the beginning of the function, with its declaration of the
function, but I have a question about these lines:

  /* Allocate memory for the function structure.  The call to
     allocate_struct_function clobbers CFUN, so we need to restore
     it afterward.  */
  allocate_struct_function (decl);
  DECL_SOURCE_LOCATION (decl) = EXPR_LOCATION (ctx->stmt);
  cfun->function_end_locus = EXPR_LOCATION (ctx->stmt);
  cfun = ctx->cb.src_cfun;

Is that a necessary process for the declaration of a function?  I ask
because I do not want the compiler to compile my function directly, but
rather ask the linker to take care of that (it will be an external
function).

> and expand_omp_parallel.

I notice that at the end of that function there is a call to
expand_parallel_call, and in that function I don't see a difference
with how I prepare the arguments.  This leads me to think that the
problem lies with the declaration of the function; am I correct?

> The new function needs to be added to the call graph and queued for
> processing (cgraph_add_new_function).

This would be true if I wanted it to be compiled, but what if I do not
(using a precompiled version)?

Thank you for your time,
Jc

-
Degskalle
There is no point in arguing with an idiot, they will just drag you
down to their level and beat you with experience.
Reference: http://www.bash.org/?latest
-
Re: Inserting function calls
[EMAIL PROTECTED] wrote on 08/30/06 16:41:
> Is that a necessary process for the declaration of a function?  I ask
> because I do not want the compiler to compile my function directly,
> but rather ask the linker to take care of that (it will be an
> external function).

Oh, so you only want to insert a library call?  In that case the work
done in omp-low.c is going to be a lot more than you need.  You only
need to check how the actual CALL_EXPR is built to call the newly added
function.  In create_omp_child_function, an identifier for the new
function is created.  We then create a call to it using
build_function_call_expr in expand_parallel_call.
gcc 4.1.1 - successful build and install - i386-pc-mingw32 (msys running on a WinXP box)
Follows the build info:

config.guess: i386-pc-mingw32

$ gcc -v
Using built-in specs.
Target: mingw32
Configured with: ../../source/gcc-4.1.1/configure --prefix=/mingw
  --host=mingw32 --target=mingw32 --program-prefix="" --with-gcc
  --with-gnu-ld --with-gnu-as --enable-threads --disable-nls
  --enable-languages=c,c++ --disable-win32-registry --disable-shared
  --without-x --enable-interpreter --enable-hash-synchronization
  --enable-libstdcxx-debug
Thread model: win32
gcc version 4.1.1

$ uname -a
MINGW32_NT-5.1 THERGOTHON 1.0.10(0.46/3/2) 2004-03-15 07:17 i686 unknown

host system: WinXP Pro SP2 i686

/me: Marcelo A B Slomp - Brazil
Re: Inserting function calls
> In create_omp_child_function, an identifier for the new function is
> created.  We then create a call to it using build_function_call_expr
> in expand_parallel_call.

Ok, so that's what I saw.  Is this call necessary for what I'd need:

  decl = lang_hooks.decls.pushdecl (decl);

Then, simplifying the problem, I just want to call a void _foo(void)
function, so, taking from create_omp_child_function and
expand_parallel_call, I did this:

  type = build_function_type_list (void_type_node, NULL_TREE);
  decl = build_decl (FUNCTION_DECL, "_foo", type);

  TREE_STATIC (decl) = 1;
  TREE_USED (decl) = 1;
  DECL_ARTIFICIAL (decl) = 1;
  DECL_IGNORED_P (decl) = 0;
  TREE_PUBLIC (decl) = 0;
  DECL_UNINLINABLE (decl) = 1;
  DECL_EXTERNAL (decl) = 0;
  DECL_CONTEXT (decl) = NULL_TREE;
  DECL_INITIAL (decl) = make_node (BLOCK);

  t = build_decl (RESULT_DECL, NULL_TREE, void_type_node);
  DECL_ARTIFICIAL (t) = 1;
  DECL_IGNORED_P (t) = 1;
  DECL_RESULT (decl) = t;

  t = build_decl (PARM_DECL, NULL_TREE, void_type_node);
  DECL_ARTIFICIAL (t) = 1;
  DECL_ARG_TYPE (t) = void_type_node;
  DECL_CONTEXT (t) = current_function_decl;
  TREE_USED (t) = 1;
  DECL_ARGUMENTS (decl) = t;

  tree list = NULL_TREE;
  tree call = build_function_call_expr (decl, NULL);
  gimplify_and_add (call, &list);
  bsi_insert_before (&bsi, list, BSI_CONTINUE_LINKING);

But this gives me this error when I try compiling it:

hello.c:18: internal compiler error: tree check: expected
identifier_node, have obj_type_ref in special_function_p, at
calls.c:475

Any ideas?

Jc

-
Degskalle
There is no point in arguing with an idiot, they will just drag you
down to their level and beat you with experience.
Reference: http://www.bash.org/?latest
-
Re: Inserting function calls
Hello,

> I have been trying to insert function calls during a new pass in the
> compiler but it does not seem to like my way of doing it. The basic
> idea is to insert a function call before any load in the program
> (later on I'll be selecting a few loads but for now I just want to do
> it for each and every one).
>
> This looks like the profiling pass but I can't seem to understand what
> is the matter...
>
> This is what the compiler says when I try to compile this test case:
>
> hello.c:18: internal compiler error: tree check: expected ssa_name,
> have symbol_memory_tag in verify_ssa, at tree-ssa.c:776

What you do seems basically OK to me. The problem is that you also need to fix the SSA form for the virtual operands of the added calls (i.e., you must call mark_new_vars_to_rename for each of the calls, and update_ssa once at the end of tree_handle_loop).

Zdenek
Re: First cut on outputing gimple for LTO using DWARF3. Discussion invited!!!!
Kenneth Zadeck wrote:

> This posting is a progress report of my task of encoding and decoding
> the GIMPLE stream into LTO. Included in this posting is a patch that
> encodes functions and dumps the result to files.

[I'm sorry for not replying to this sooner. I've been on a plane or in a meeting virtually all of my waking hours since you wrote this...]

Exciting!

> 2) To have a discussion about the use of DWARF3. I am now against the
> use of DWARF3 for encoding the GIMPLE.

As the person who suggested this to you, I'm sorry it doesn't seem to meet your needs. I'll make a few more specific comments below, but, to echo what I said at the time: I think it was a good idea to try it, but if it's not the right tool for the job, then so be it. My opinion has always been that, for the bodies of functions, DWARF is just an encoding: if it's a decent encoding, then it's nice that it's a standard; if it's a bad encoding, then, by all means, let's not use it!

I do think DWARF is a good choice for the non-executable information, for the same reasons I did initially: (a) for debugging, we need to generate most of that information anyhow, so we're piggy-backing on existing code -- and not making object files bigger by encoding the same information twice in the case of "-O2 -g", which is the default way that many GNU applications are built; (b) it's a standard, and we already have tools for reading DWARF, so it saves the trouble of coming up with a new encoding; (c) because bugs in the DWARF emission may not result in problems at LTO, we'll be validating our LTO information every time we use GDB and, similarly, improving the GDB experience every time we fix an LTO bug in this area.

I understand that some of these benefits do not apply to the executable code, and that, even to the extent they may apply, the tradeoffs are different. The comments I've made below about specific issues should therefore be considered as academic responses, not an attempt to argue the decision you have made.
> 3) To get someone to tell me what option we are going to add to the
> compiler to tell it to write this information.

I think a reasonable spelling is probably "-flto". It should not be a -m option, since it is not machine-specific. I don't think it should be a -O option either, since writing out LTO information isn't really supposed to affect optimization per se.

> 2) The code is, by design, fragile. It takes nothing for granted.
> Every case statement has gcc_unreachable as its default case.

That's the same way I've approached the DWARF reading code, and for the same reason. I think that's exactly the right decision.

> 1) ABBREV TABLES ARE BAD FOR LTO. However, this mechanism is only
> self-descriptive if you do not extend the set of tags. That is not an
> option for LTO.

Definitely true. When we talked on the phone, we talked about creating a tag corresponding to each GIMPLE tree code. However, you could also create a numeric attribute giving the GIMPLE tree code. If you did that, you might find that the abbreviation table became extremely small -- because almost all interior nodes would be DW_TAG_GIMPLE_EXPR nodes. The downside, of course, is that the storage required to store the nodes would be greater, as it would now contain the expression code (e.g., PLUS_EXPR), rather than having a DW_TAG_GIMPLE_PLUS_EXPR.

> I strongly believe that for LTO to work, we are going to have to
> implement some mechanism where the function bodies are loaded into the
> compiler on demand (possibly kept in cache, but maybe not).

Agreed.

> This will be more cumbersome if we have to keep reloading each object
> file's abbrev table just to tear apart a single function in that .o
> file. While the abbrev sections average slightly less than 2% of the
> size of the GIMPLE encoding for an entire file, each abbrev table
> averages about the same size as a single function.

Interesting datapoint.

(Implied, but not stated, in your mail is the fact that the abbreviation table cannot be indexed directly.
If it could be, then you wouldn't have to read the entire abbreviation table for each function; you would just read the referenced abbreviations. Because the abbreviation table records are of variable length, it is indeed true that you cannot make random accesses to the table. So, this paragraph is just fleshing out your argument.)

I think the conclusion that you reach (that the size of the tables is a problem) depends on how you expect the compiler to process functions at link-time. My expectation was that you would form a global control-flow graph for the entire program (by reading CFG data encoded in each .o file), eliminate unreachable functions, and then inline/optimize functions one at a time.

If you sort the function-reading so that you prefer to read functions from the same object file in order, then I would expect that you would considerably reduce the impact of reading the abbreviation tables.
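The variable-length property that rules out random access into the abbreviation table can be made concrete with a small sketch of ULEB128, the variable-length integer encoding DWARF uses for abbreviation codes, tags, and attribute/form pairs. The "table" below is a simplification for illustration, not real DWARF:

```python
def uleb128_encode(value):
    # DWARF's ULEB128: 7 payload bits per byte; the high bit of each
    # byte means "more bytes follow".
    out = bytearray()
    while True:
        byte = value & 0x7F
        value >>= 7
        if value:
            out.append(byte | 0x80)
        else:
            out.append(byte)
            return bytes(out)

def uleb128_decode(buf, pos):
    # Decoding must walk byte by byte; there is no way to jump to the
    # Nth value without decoding everything before it.
    result = shift = 0
    while True:
        byte = buf[pos]
        pos += 1
        result |= (byte & 0x7F) << shift
        shift += 7
        if not (byte & 0x80):
            return result, pos

# A toy "abbreviation table" of variable-length records: reaching entry
# N requires decoding entries 0..N-1 first -- exactly the sequential-
# access property discussed above.
table = b"".join(uleb128_encode(v) for v in [3, 300, 70000])
pos, values = 0, []
while pos < len(table):
    v, pos = uleb128_decode(table, pos)
    values.append(v)
print(values)  # [3, 300, 70000]
```

This is why an index (or fixed-size records) would be needed before individual abbreviations could be fetched without scanning the whole table.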
Re: First cut on outputing gimple for LTO using DWARF3. Discussion invited!!!!
Mark Mitchell wrote:

> Kenneth Zadeck wrote:
>
>> This will be more cumbersome if we have to keep reloading each object
>> file's abbrev table just to tear apart a single function in that .o
>> file. While the abbrev sections average slightly less than 2% of the
>> size of the GIMPLE encoding for an entire file, each abbrev table
>> averages about the same size as a single function.
>
> Interesting datapoint.
>
> (Implied, but not stated, in your mail is the fact that the
> abbreviation table cannot be indexed directly. If it could be, then
> you wouldn't have to read the entire abbreviation table for each
> function; you would just read the referenced abbreviations. Because
> the abbreviation table records are of variable length, it is indeed
> true that you cannot make random accesses to the table. So, this
> paragraph is just fleshing out your argument.)
>
> I think the conclusion that you reach (that the size of the tables is
> a problem) depends on how you expect the compiler to process functions
> at link-time. My expectation was that you would form a global
> control-flow graph for the entire program (by reading CFG data encoded
> in each .o file), eliminate unreachable functions, and then
> inline/optimize functions one at a time.
>
> If you sort the function-reading so that you prefer to read functions
> from the same object file in order, then I would expect that you would
> considerably reduce the impact of reading the abbreviation tables.
> I'm making the assumption that if f calls N functions, then they
> probably come from < N object files. I have no data to back up that
> assumption.
>
> (There is nothing that says that you can only have one abbreviation
> table for all functions. You can equally well have one abbreviation
> table per function.
> In that mode, you trade space (more abbreviation tables, and the same
> abbreviation appearing in multiple tables) against the fact that you
> now only need to read the abbreviation tables you need. I'm not
> claiming this is a good idea.)
>
> I don't find this particular argument (that the abbreviation tables
> will double file I/O) very convincing. I don't think it's likely that
> the problem we're going to have with LTO is running out of *virtual*
> memory, especially as 64-bit hardware becomes nearly universal. The
> problem is going to be running out of physical memory, and thereby
> paging like crazy, or running out of D-cache. So, I'd assume you'd
> just read the tables as needed, and never bother discarding them. As
> long as there is reasonable locality of reference to abbreviation
> tables (i.e., you can arrange to hit object files in groups), then the
> cost here doesn't seem like it would be very big.

Even if we decide that we are going to process all of the functions in one file at one time, we still have to have access to the functions that are going to be inlined into the function being compiled. Getting at those functions that are going to be inlined is where the doubled-I/O argument comes from. I have never depended on the kindness of strangers or the virtues of virtual memory. I fear the size of the virtual memory when we go to compile really large programs.

>> 2) I PROMISED TO USE THE DWARF3 STACK MACHINE AND I DID NOT.
>
> I never imagined you doing this; as per above, I always expected that
> you would use DWARF tags for the expression nodes. I agree that the
> stack-machine is ill-suited.
>
>> 3) THERE IS NO COMPRESSION IN DWARF3.
>>
>> In 1 file per mode, zlib -9 compression is almost 6:1. In 1 function
>> per mode, zlib -9 compression averages about 3:1.
>
> In my opinion, if you considered DWARF + zlib to be satisfactory, then
> I think that would be fine. For LTO, we're allowed to do whatever we
> want.
> I feel the same about your confession that you invented a new record
> form; if DWARF + extensions is a suitable format, that's fine. In
> other words, in principle, using a somewhat non-standard variant of
> DWARF for LTO doesn't seem evil to me -- if that met our needs.

One of the comments that was made by a person on the DWARF committee is that the abbrev tables really can be used for compression. If you have information that is really common to a bunch of records, you can build an abbrev entry with the common info in it. I have not seen a place where any use can be made of this for encoding GIMPLE, except for a couple of places where I have encoded a true or false. I therefore really do not see that they add anything except for the code to read and write them.

>> 2) LOCAL DECLARATIONS
>>
>> Mark was going to do all of the types and all of the declarations.
>> His plan was to use the existing DWARF3 and enhance it where it was
>> necessary, eventually replacing the GCC type trees with direct
>> references to the DWARF3 symbol table.
>
> The types and global variables are likely OK, or at least Mark should
> be able to add any missing info.

I had a discussion on chat today with drow and he indicated that you were busily adding all of the missing stuff here.
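The ratios quoted above (roughly 6:1 when compressing a whole file, about 3:1 when compressing function by function) reflect a general property of stream compressors: redundancy *across* units is lost when each unit is compressed independently, and each unit also pays fixed header overhead. A small, synthetic demonstration -- the "functions" here are made-up byte strings standing in for GIMPLE records, not real data from the patch:

```python
import zlib

# Fifty small "functions" sharing vocabulary, as GIMPLE records
# repeated across a file would.
functions = [(("PLUS_EXPR VAR_DECL INTEGER_CST fn%d " % i) * 20).encode()
             for i in range(50)]

# One compression over the whole concatenated file...
whole_file = zlib.compress(b"".join(functions), 9)
# ...versus one compression per function, summed.
per_function = sum(len(zlib.compress(f, 9)) for f in functions)

total = sum(len(f) for f in functions)
# Whole-file compression exploits redundancy across functions, so it
# should beat per-function compression, which in turn beats no
# compression at all.
assert len(whole_file) < per_function < total
print(total, per_function, len(whole_file))
```

The flip side, as discussed later in the thread, is that whole-file compression forfeits random access to individual functions, which is the same tension as with the shared abbreviation tables.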
Re: regress and -m64
After some discussion with Jack Howarth, I have found that the gfortran and libgomp executable tests on powerpc-apple-darwin8.7.0 (at least) do not link against the correct just-built ("make bootstrap") libraries until those libraries have first been installed in $prefix/lib/... I filed a bug report at http://gcc.gnu.org/bugzilla/show_bug.cgi?id=28913 and noted the different results there between two "make check" commands, one just before "make install" and one just after.

64-bit test results are now as follows. (See also http://gcc.gnu.org/ml/gcc-testresults/2006-08/msg01383.html )

=== g++ Summary ===

# of expected passes        11595
# of unexpected failures    1350
# of expected failures      69
# of unresolved testcases   28
# of unsupported tests      129

/Users/lucier/programs/gcc/mainline/objdir64/gcc/testsuite/g++/../../g++ version 4.2.0 20060829 (experimental)

=== gcc Summary ===

# of expected passes        41550
# of unexpected failures    45
# of unexpected successes   1
# of expected failures      108
# of untested testcases     28
# of unsupported tests      507

/Users/lucier/programs/gcc/mainline/objdir64/gcc/xgcc version 4.2.0 20060829 (experimental)

=== gfortran Summary ===

# of expected passes        14014
# of unexpected failures    33
# of unexpected successes   3
# of expected failures      7
# of unsupported tests      41

/Users/lucier/programs/gcc/mainline/objdir64/gcc/testsuite/gfortran/../../gfortran version 4.2.0 20060829 (experimental)

=== objc Summary ===

# of expected passes        1707
# of unexpected failures    68
# of expected failures      7
# of unresolved testcases   1
# of unsupported tests      2

/Users/lucier/programs/gcc/mainline/objdir64/gcc/xgcc version 4.2.0 20060829 (experimental)

=== libffi Summary ===

# of expected passes        472
# of unexpected failures    384
# of expected failures      8
# of unsupported tests      8

=== libgomp Summary ===

# of expected passes        1075
# of unexpected failures    205
# of unsupported tests      111

=== libjava Summary ===

# of expected passes        1776
# of unexpected failures    2069
# of expected failures      32
# of untested testcases     3021

=== libstdc++ Summary ===

# of expected passes        2052
# of unexpected failures    1668
# of unexpected successes   1
# of expected failures      15
# of unsupported tests      321
Re: regress and -m64
Bradley,

Something is still astray with your build configuration. Look at my last set of results:

http://gcc.gnu.org/ml/gcc-testresults/2006-08/msg01333.html

I only have 28 unexpected failures for g++ at -m64 and you have 1350. Likewise for libstdc++ at -m64, I only have 6 unexpected failures whereas you have 1668. Try building some of the g++ testcases manually and see what the errors are. Assuming you really have the ld64 from Xcode 2.3 installed, I am guessing you have some bogus copies of libstdc++ lying around in your path.

Jack
Re: First cut on outputing gimple for LTO using DWARF3. Discussion invited!!!!
Kenneth Zadeck wrote:

> Even if we decide that we are going to process all of the functions in
> one file at one time, we still have to have access to the functions
> that are going to be inlined into the function being compiled. Getting
> at those functions that are going to be inlined is where the
> doubled-I/O argument comes from.

I understand -- but it's natural to expect that those functions will be clumped together. In a gigantic program, I expect there are going to be clumps of tightly connected object files, with relatively few connections between the clumps. So, you're likely to get good cache behavior for any per-object-file data that you need to access.

> I have never depended on the kindness of strangers or the virtues of
> virtual memory. I fear the size of the virtual memory when we go to
> compile really large programs.

I don't think we're going to blow out a 64-bit address space any time soon. Disks are big, but they are nowhere near *that* big, so it's going to be pretty hard for anyone to hand us that many .o files. And, there's no point manually reading/writing stuff (as opposed to mapping it into memory) unless we actually run out of address space.

In fact, if you're going to design your own encoding formats, I would consider a format with self-relative pointers (or offsets from some fixed base) that you could just map into memory. It wouldn't be as compact as using compression, so the total number of bytes written when generating the object files would be bigger. But it will be very quick to load into memory.

I guess my overriding concern is that we're focusing heavily on the data format here (DWARF? Something else? Memory-mappable? What compression scheme?) and we may not have enough data. I guess we just have to pick something and run with it. I think we should try to keep that code as separate as possible so that we can recover easily if whatever we pick turns out to be (another) bad choice.
:-) One of the comments that was made by a person on the dwarf committee is that the abbrev tables really can be used for compression. If you have information that is really common to a bunch of records, you can build an abbrev entry with the common info in it. Yes. I was a little bit surprised that you don't seem to have seen much commonality. If you recorded most of the tree flags, and treated them as DWARF attributes, I'd expect you would see relatively many expressions of a fixed form. Like, there must be a lot of PLUS_EXPRs with TREE_USED set on them. But, I gather that you're trying to avoid recording some of these flags, hoping either that (a) they won't be needed, or (b) you can recreate them when reading the file. I think both (a) and (b) hold in many cases, so I think it's reasonable to assume we're writing out very few attributes. I had a discussion on chat today with drow and he indicated that you were busily adding all of the missing stuff here. "All" is an overstatement. :-) Sandra is busily adding missing stuff and I'll be working on the new APIs you need. I told him that I thought this was fine as long as there is not a temporal drift in information encoded for the types and decls between the time I write my stuff and when the types and decls are written. I'm not sure what this means. Thanks, -- Mark Mitchell CodeSourcery [EMAIL PROTECTED] (650) 331-3385 x713
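Mark's "offsets from some fixed base" suggestion can be sketched in a few lines. The layout below is purely illustrative -- the record format, field widths, and names are made up, not anything proposed in the thread -- but it shows the key property: because every reference is an offset from the start of the buffer, the bytes are position-independent, could be mmap'ed into memory, and allow random access to any one record without touching the others:

```python
import struct

# Toy mappable layout: [count][offset table][records...], where each
# record is [length][payload] and every offset is relative to the start
# of the blob (the "fixed base"), so no pointer fixups are ever needed.
def write_blob(records):
    header = struct.pack("<I", len(records))
    table_size = 4 * len(records)
    base = len(header) + table_size
    offsets, body = [], b""
    for r in records:
        offsets.append(base + len(body))
        body += struct.pack("<I", len(r)) + r
    table = b"".join(struct.pack("<I", o) for o in offsets)
    return header + table + body

def read_record(blob, n):
    # Random access: jump straight to record n via the offset table,
    # without parsing any other record.
    count, = struct.unpack_from("<I", blob, 0)
    assert n < count
    off, = struct.unpack_from("<I", blob, 4 + 4 * n)
    size, = struct.unpack_from("<I", blob, off)
    return blob[off + 4 : off + 4 + size]

blob = write_blob([b"fn_a body", b"fn_b body", b"fn_c body"])
print(read_record(blob, 2))  # b'fn_c body'
```

This is the space/speed trade Mark describes: larger on disk than a compressed stream, but loadable (or mappable) with essentially no decode cost.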
gcc.target/powerpc vs -m64
Geoff, I am assuming that quite a few of the remaining regressions at -m64 on Darwin PPC with your TImode patch applied will be resolved when Eric posts his x86_64 patches. However there are a few in gcc.target/powerpc which likely won't be addressed by those patches. I am seeing the following test cases fail for -m64 only... gcc-4 -m64 -O2 ppc64-abi-1.c /var/tmp//cchkbsnr.s:22:Parameter syntax error (parameter 2) /var/tmp//cchkbsnr.s:22:Invalid mnemonic 'got(r2)' /var/tmp//cchkbsnr.s:93:Parameter syntax error (parameter 2) /var/tmp//cchkbsnr.s:93:Invalid mnemonic 'got(r2)' /var/tmp//cchkbsnr.s:161:Parameter syntax error (parameter 2) /var/tmp//cchkbsnr.s:161:Invalid mnemonic 'got(r2)' /var/tmp//cchkbsnr.s:252:Parameter syntax error (parameter 2) /var/tmp//cchkbsnr.s:252:Invalid mnemonic 'got(r2)' /var/tmp//cchkbsnr.s:372:Parameter syntax error (parameter 2) /var/tmp//cchkbsnr.s:372:Invalid mnemonic 'got(r2)' /var/tmp//cchkbsnr.s:450:Parameter syntax error (parameter 2) /var/tmp//cchkbsnr.s:450:Invalid mnemonic 'got(r2)' gcc-4 -m64 -O2 darwin-bool-1.c darwin-bool-1.c:5: error: size of array 'dummy1' is too large The failures for... FAIL: gcc.target/powerpc/ppc-and-1.c scan-assembler rlwinm [0-9]+,[0-9]+,0,0,30 FAIL: gcc.target/powerpc/ppc-and-1.c scan-assembler rlwinm [0-9]+,[0-9]+,0,29,30 FAIL: gcc.target/powerpc/ppc-negeq0-1.c scan-assembler-not cntlzw are a tad confusing because if I do... gcc-4 -O2 -m64 -S -c ppc-and-1.c grep rlwinm ppc-and-1.s rlwinm r4,r4,0,0,30 rlwinm r4,r4,0,29,30 grep rldicr ppc-and-1.s (no results) This is confusing because it suggests the test *should* be passing! On the other hand, the failure in the ppc-negeq0-1 is understandable... gcc-4 -O2 -m64 -S -c ppc-negeq0-1.c grep cntlzw ppc-negeq0-1.s cntlzw r3,r3 cntlzw r3,r3 Geoff, should I file PR's for each of these? Jack
Re: First cut on outputing gimple for LTO using DWARF3. Discussion invited!!!!
On 8/30/06, Mark Mitchell <[EMAIL PROTECTED]> wrote: ...

> I guess my overriding concern is that we're focusing heavily on the
> data format here (DWARF? Something else? Memory-mappable? What
> compression scheme?) and we may not have enough data. I guess we just
> have to pick something and run with it. I think we should try to keep
> that code as separate as possible so that we can recover easily if
> whatever we pick turns out to be (another) bad choice. :-)

At the risk of stating the obvious and also repeating myself, please allow me to give my thoughts on this issue.

I think we should go even a step further than "try to keep the code as separate as possible". We should try to come up with a set of procedural and data-structure interfaces for the input/output of the program structure, and try to *completely* separate the optimization/data-structure cleanup work from the encoding/decoding.

Besides the basic requirement of being able to pass through enough information to produce a valid program, I think there is a critical requirement for implementing inter-module/inter-procedural optimization efficiently: the I/O interface must allow iterating over module- and procedure-level information without reading each and every module/procedure body (as Ken mentioned). There is a certain amount of information per object/procedure that is accessed by different optimizations, with sufficiently different patterns -- e.g. the type tree is naturally object-level information that we may want to go through for each and every object file without reading all function bodies, and other function-level information, such as caller/callee information, would be useful without the function body.
We'll need to identify such information (in other words, the requirements of the interprocedural optimizations/analyses) so that the new interface provides ways to walk through it without loading entire function bodies. Even with a large address space, if the data is scattered everywhere, it becomes extremely inefficient on modern machines to go through it, so it's actually more important to identify what logical information we want to access during the various interprocedural optimizations; the I/O interface needs to handle that efficiently. This requirement should dictate how we encode/lay out the data on disk, before anything else, and also how the information is presented to the actual inter-module optimization/analysis.

Also, part of defining the interface would involve restricting the existing structures (e.g. GIMPLE) to a possibly more limited form than what's currently allowed. By virtue of having an interface that separates the encoding/decoding from the rest of the compilation, we can throw away and recompute certain information (e.g. often the control flow graph can be recovered, hence does not need to be encoded), but those details can be worked out as the implementation of the I/O interface takes shape.

--
#pragma ident "Seongbae Park, compiler, http://seongbae.blogspot.com";
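The separation argued for above -- summaries walkable without body I/O, bodies loaded lazily on demand -- can be sketched as a toy reader. Every name here is hypothetical; this is a shape for the interface, not a proposal for the real API:

```python
# Toy reader: function-level summaries (here, callee lists) are
# available without touching bodies; bodies are fetched lazily and
# cached. All names are illustrative.
class ModuleReader:
    def __init__(self, summaries, load_body):
        self._summaries = summaries   # {function name: [callee names]}
        self._load_body = load_body   # expensive callback: name -> body
        self._bodies = {}             # cache of bodies read so far

    def functions(self):
        # Cheap: iterate function-level info with no body reads.
        return self._summaries.keys()

    def callees(self, name):
        return self._summaries[name]

    def body(self, name):
        # Expensive: read (and cache) the function body on first use.
        if name not in self._bodies:
            self._bodies[name] = self._load_body(name)
        return self._bodies[name]

reads = []
def load_body(name):
    reads.append(name)                # record each (expensive) body read
    return "<gimple for %s>" % name

r = ModuleReader({"main": ["helper"], "helper": [], "dead": []}, load_body)

# Toy reachability computed over summaries only, then load just the
# bodies that are actually needed.
reachable = {"main"} | set(r.callees("main"))
for fn in sorted(reachable):
    r.body(fn)

assert "dead" not in reads            # unreachable body never read
print(sorted(reads))  # ['helper', 'main']
```

The point is that interprocedural analysis (here, the trivial reachability walk) runs entirely against the summary layer; body I/O happens only for the functions that survive it.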