Re: How to activate instruction scheduling in GCC?
Thanks, your reply is really helpful.

By the way, I checked the MIPS backend in mips.c, but I can't find the
definitions of some functions, such as get_attr_hazard() and
gen_hazard_nop(). Does anyone know where those functions are defined?

Ian Lance Taylor-3 wrote:
>
> petruk_gile <[EMAIL PROTECTED]> writes:
>
>> I'm a pure beginner in GCC, and currently working on a project to
>> implement instruction scheduling for a new DSP processor. This processor
>> doesn't have pipeline interlock, so the compiler HAVE to schedule the
>> instruction without relying on hardware help anymore
>>
>> The problem is, I'm a very beginner in GCC. I think the scheduling in GCC
>> is activated by INSN_SCHEDULING variable (in automatically generated file:
>> insn-attr.h), but I don't even know how to activate this variable.
>
> INSN_SCHEDULING will automatically be turned on if you have any
> define_insn_reservation clauses in your CPU.md file. See the
> "Processor pipeline description" documentation in the gcc internals
> manual.
>
> That said, the gcc scheduler unfortunately does not work very well for
> processors which do not have hardware interlocks. The scheduler will
> lay out the instructions more or less optimally. But the scheduler
> has no ability to insert nops when they are required to satisfy
> interlock constraints.
>
> I know of two workable approaches. You can either insert the required
> nops in the TARGET_MACHINE_DEPENDENT_REORG pass or in the
> TARGET_ASM_FUNCTION_PROLOGUE hook. I personally prefer the latter
> approach, as it takes effect after all other instruction rearrangement
> is complete, but there are existing backends which use the former.
>
> For an example of inserting nops in TARGET_MACHINE_DEPENDENT_REORG,
> see the MIPS backend, specifically mips_avoid_hazards. For an example
> of inserting nops in TARGET_ASM_FUNCTION_PROLOGUE, see the FRV
> backend, specifically frv_pack_insns.
>
> Ian
GCC 4.2.1 : testsuite says WARNING: program timed out
Is there a way to allow the testsuite to just run regardless of how long it takes?

I am getting "program timed out" warnings for multiple tests:

Running /export/home/dclarke/build/gcc-4.2.1/gcc/testsuite/gcc.c-torture/compile/compile.exp ...
WARNING: program timed out.
FAIL: gcc.c-torture/compile/20001226-1.c -O1 (test for excess errors)
WARNING: program timed out.
FAIL: gcc.c-torture/compile/20001226-1.c -O2 (test for excess errors)
WARNING: program timed out.
FAIL: gcc.c-torture/compile/20001226-1.c -O3 -fomit-frame-pointer (test for excess errors)
WARNING: program timed out.
FAIL: gcc.c-torture/compile/20001226-1.c -O3 -g (test for excess errors)
WARNING: program timed out.

- Dennis Clarke
Re: GCC 4.2.1 : testsuite says WARNING: program timed out
Dennis Clarke wrote:
> Is there a way to allow the testsuite to just run regardless of how long
> it takes?
>
> I am getting "program timed out" warnings for multiple tests :
>
> Running /export/home/dclarke/build/gcc-4.2.1/gcc/testsuite/gcc.c-torture/compile/compile.exp ...
> WARNING: program timed out.
> FAIL: gcc.c-torture/compile/20001226-1.c -O1 (test for excess errors)
> WARNING: program timed out.
> FAIL: gcc.c-torture/compile/20001226-1.c -O2 (test for excess errors)
> WARNING: program timed out.
> FAIL: gcc.c-torture/compile/20001226-1.c -O3 -fomit-frame-pointer (test for excess errors)
> WARNING: program timed out.
> FAIL: gcc.c-torture/compile/20001226-1.c -O3 -g (test for excess errors)
> WARNING: program timed out.

You need a faster computer. Those tests take a long time. On slow systems
they take longer than the default testsuite timeout to compile.

You can probably safely ignore timeouts for 20001226-1.c.

David Daney
Re: GCC 4.2.1 : testsuite says WARNING: program timed out
> Dennis Clarke wrote:
>> Is there a way to allow the testsuite to just run regardless of how long
>> it takes?
>>
>> I am getting "program timed out" warnings for multiple tests :
>>
>> Running /export/home/dclarke/build/gcc-4.2.1/gcc/testsuite/gcc.c-torture/compile/compile.exp ...
>> WARNING: program timed out.
>> FAIL: gcc.c-torture/compile/20001226-1.c -O1 (test for excess errors)
>> WARNING: program timed out.
>> FAIL: gcc.c-torture/compile/20001226-1.c -O2 (test for excess errors)
>> WARNING: program timed out.
>> FAIL: gcc.c-torture/compile/20001226-1.c -O3 -fomit-frame-pointer (test for excess errors)
>> WARNING: program timed out.
>> FAIL: gcc.c-torture/compile/20001226-1.c -O3 -g (test for excess errors)
>> WARNING: program timed out.
>>
> You need a faster computer.

Trust me .. I know. :-)

I do have access to top of the line Sun gear, but I am running this
experiment and this bootstrap on this machine. Thus, the question stands:

Is there a way to allow the testsuite to just run regardless of how long it takes?

Dennis
Re: How to activate instruction scheduling in GCC?
Sorry, there is no need to bother with my last question. I have found that
those functions are (again) generated automatically from the machine
description file.

petruk_gile wrote:
>
> Thanks .. your reply is really helpful ...
>
> Btw, I checked the MIPS backend at MIPS.c, but I can't find the definition
> of some functions such as:
>
> get_attr_hazard(), gen_hazard_nop (), etc.
>
> Anyone know where those functions defined?
>
> Ian Lance Taylor-3 wrote:
>>
>> petruk_gile <[EMAIL PROTECTED]> writes:
>>
>>> I'm a pure beginner in GCC, and currently working on a project to
>>> implement instruction scheduling for a new DSP processor. This processor
>>> doesn't have pipeline interlock, so the compiler HAVE to schedule the
>>> instruction without relying on hardware help anymore
>>>
>>> The problem is, I'm a very beginner in GCC. I think the scheduling in
>>> GCC is activated by INSN_SCHEDULING variable (in automatically generated
>>> file: insn-attr.h), but I don't even know how to activate this variable.
>>
>> INSN_SCHEDULING will automatically be turned on if you have any
>> define_insn_reservation clauses in your CPU.md file. See the
>> "Processor pipeline description" documentation in the gcc internals
>> manual.
>>
>> That said, the gcc scheduler unfortunately does not work very well for
>> processors which do not have hardware interlocks. The scheduler will
>> lay out the instructions more or less optimally. But the scheduler
>> has no ability to insert nops when they are required to satisfy
>> interlock constraints.
>>
>> I know of two workable approaches. You can either insert the required
>> nops in the TARGET_MACHINE_DEPENDENT_REORG pass or in the
>> TARGET_ASM_FUNCTION_PROLOGUE hook. I personally prefer the latter
>> approach, as it takes effect after all other instruction rearrangement
>> is complete, but there are existing backends which use the former.
>>
>> For an example of inserting nops in TARGET_MACHINE_DEPENDENT_REORG,
>> see the MIPS backend, specifically mips_avoid_hazards. For an example
>> of inserting nops in TARGET_ASM_FUNCTION_PROLOGUE, see the FRV
>> backend, specifically frv_pack_insns.
>>
>> Ian
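Editorial sketch: the nop-insertion idea described in this thread can be pictured with a toy model in plain C. This is not GCC code and uses none of GCC's RTL or target-hook APIs. The struct, the function names, and the hazard rule (a result may not be read by the very next instruction) are all assumptions made for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Toy instruction (hypothetical, not GCC RTL): writes register `dest`
   and reads `src1` and `src2`.  -1 means "no register", so a NOP is
   { -1, -1, -1 }.  */
struct toy_insn {
    int dest;
    int src1;
    int src2;
};

/* Nonzero if `insn` reads register `reg`.  Negative `reg` never matches.  */
static int reads_reg(const struct toy_insn *insn, int reg)
{
    return reg >= 0 && (insn->src1 == reg || insn->src2 == reg);
}

/* Copy `in` to `out`, inserting one NOP wherever an instruction reads
   the register written by the instruction emitted just before it.
   Assumed hazard rule: one delay slot between producer and consumer.
   Returns the number of instructions written; `out` needs room for
   2 * n entries.  */
size_t insert_hazard_nops(const struct toy_insn *in, size_t n,
                          struct toy_insn *out)
{
    static const struct toy_insn nop = { -1, -1, -1 };
    size_t count = 0;

    assert(in != NULL && out != NULL);
    for (size_t i = 0; i < n; i++) {
        if (count > 0 && reads_reg(&in[i], out[count - 1].dest))
            out[count++] = nop;
        out[count++] = in[i];
    }
    return count;
}
```

Running such a pass last, after every other reordering, mirrors the reason given above for preferring a late hook: nothing afterwards can move instructions back into a hazard.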
RE: GCC 4.2.1 : testsuite says WARNING: program timed out
Dennis Clarke wrote:
>Is there a way to allow the testsuite to just run regardless of
>how long it takes?

I think you need to pass "set timeout -1" into dejagnu. I'd suggest a larger
positive timeout instead.

I forget the correct way to do this - I used to end up editing the .exp files
in /usr/share/dejagnu.

Rup.
Re: [RFC] Improve Tree-SSA if-conversion - convergence of efforts
"Daniel Berlin" <[EMAIL PROTECTED]> wrote on 31/07/2007 18:00:57:

>
> I agree with you for conditional stores/loads.

Great!

>
> The unconditional store/load stuff, however, is exactly what
> tree-ssa-sink was meant to do, and belongs there (this is #3 above).
> I'm certainly going to fight tooth and nail against trying to shoehorn
> unconditional store sinking into if-conv.

Sometimes, store sinking can cause performance degradation.
One reason for that is increased register pressure, due to extending the
live ranges of registers.

In addition, if we have a store followed by a branch, the result of store
sinking will be a branch followed by a store.
On some architectures, the former can be executed in parallel, as opposed to
the latter.
Thus, in this case, it is worth executing store sinking only when it helps
the if-conversion to get rid of the branch.

How do you suggest solving this problem if store sinking is part of the
tree-sink pass?

Another point: what about (unconditional) load hoisting?
It's surely not related to the sink pass, right?

Tehila.
creating low gimple code for gimplify_omp_atomic_pipeline
Hi,

In order to generate code for omp_atomic, I use force_gimple_operand, which
calls gimplify_omp_atomic; in some cases it calls
gimplify_omp_atomic_pipeline, which expands the atomic operation into a loop
(implementing it using an atomic compare-and-swap primitive).
However, the cond_expr that is generated is structured, and needs to be
lowered.

Any suggestions on how to create low gimple code for the
gimplify_omp_atomic_pipeline cases?

Thanks,
Razya
Re: AMD64 ABI compatibility
Hi Jan,

Jan Hubicka wrote on 31.07.2007 23:40:40:

> > Hi Kai,
> >
> > so, could you resolve the remaining issues? Or have you kind of
> > paused the project?
> >
> > Cheers,
> > Nicolas
> >
> > On Jul 12, 2007, at 2:14 , Kai Tietz wrote:
> >
> > >Hi,
> > >
> > >I am nearly through :) The remaining macros left to be ported are
> > >REGPARM_MAX and SSE_REGPARM_MAX. The sysv_abi uses 6 regs and 8 sses,
> > >ms_abi uses 4 regs and 4 sse registers. The problem is for example
> > >the use in i386.md of SSE_REGPARM_MAX without any hint, how to choose
> > >the required abi. Do you have an idea how this could be done ?
>
> This should not be difficult - ix86_regparm is used in
> ix86_function_regparm, init_cumulative_args, setup_incoming_varargs_64
> functions. In all those cases you know the function declaration and
> thus you can take a look if it is call to different ABI and overwrite
> the value.

Ok, here is my update.

Cheers,
i.A. Kai Tietz

Index: gcc/gcc/calls.c
===================================================================
--- gcc.orig/gcc/calls.c
+++ gcc/gcc/calls.c
@@ -1187,6 +1187,7 @@ initialize_argument_information (int num
 static int
 compute_argument_block_size (int reg_parm_stack_space,
                              struct args_size *args_size,
+                             tree fndecl, //
                              int preferred_stack_boundary ATTRIBUTE_UNUSED)
 {
   int unadjusted_args_size = args_size->constant;
@@ -1224,7 +1225,7 @@ compute_argument_block_size (int reg_par
           /* The area corresponding to register parameters is not to count in
              the size of the block we need.  So make the adjustment.  */
-          if (!OUTGOING_REG_PARM_STACK_SPACE)
+          if (!OUTGOING_REG_PARM_STACK_SPACE (fndecl))
             args_size->var
               = size_binop (MINUS_EXPR, args_size->var,
                             ssize_int (reg_parm_stack_space));
@@ -1245,7 +1246,7 @@ compute_argument_block_size (int reg_par
           args_size->constant = MAX (args_size->constant,
                                      reg_parm_stack_space);
-          if (!OUTGOING_REG_PARM_STACK_SPACE)
+          if (!OUTGOING_REG_PARM_STACK_SPACE (fndecl))
             args_size->constant -= reg_parm_stack_space;
         }
       return unadjusted_args_size;
@@ -2036,7 +2037,7 @@ expand_call (tree exp, rtx target, int i
   reg_parm_stack_space = REG_PARM_STACK_SPACE (fndecl);
 #endif
-  if (!OUTGOING_REG_PARM_STACK_SPACE && reg_parm_stack_space > 0 && PUSH_ARGS)
+  if (!OUTGOING_REG_PARM_STACK_SPACE (fndecl) && reg_parm_stack_space > 0 && PUSH_ARGS)
     must_preallocate = 1;
   /* Set up a place to return a structure.  */
@@ -2442,7 +2443,7 @@ expand_call (tree exp, rtx target, int i
               /* Since we will be writing into the entire argument area,
                  the map must be allocated for its entire size, not just
                  the part that is the responsibility of the caller.  */
-              if (!OUTGOING_REG_PARM_STACK_SPACE)
+              if (!OUTGOING_REG_PARM_STACK_SPACE (fndecl))
                 needed += reg_parm_stack_space;
 #ifdef ARGS_GROW_DOWNWARD
@@ -2541,7 +2542,7 @@ expand_call (tree exp, rtx target, int i
             {
               rtx push_size
                 = GEN_INT (adjusted_args_size.constant
-                           + (OUTGOING_REG_PARM_STACK_SPACE ? 0
+                           + (OUTGOING_REG_PARM_STACK_SPACE (fndecl) ? 0
                               : reg_parm_stack_space));
               if (old_stack_level == 0)
                 {
@@ -2712,7 +2713,7 @@ expand_call (tree exp, rtx target, int i
       /* If register arguments require space on the stack and stack space
          was not preallocated, allocate stack space here for arguments
          passed in registers.  */
-      if (OUTGOING_REG_PARM_STACK_SPACE && !ACCUMULATE_OUTGOING_ARGS
+      if (OUTGOING_REG_PARM_STACK_SPACE (fndecl) && !ACCUMULATE_OUTGOING_ARGS
           && must_preallocate == 0 && reg_parm_stack_space > 0)
         anti_adjust_stack (GEN_INT (reg_parm_stack_space));
@@ -3537,7 +3538,7 @@ emit_library_call_value_1 (int retval, r
       args_size.constant = MAX (args_size.constant,
                                 reg_parm_stack_space);
-  if (!OUTGOING_REG
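Editorial sketch: the register counts quoted in this message (sysv_abi: 6 integer and 8 SSE argument registers; ms_abi: 4 and 4) and the suggestion to pick the value from the function being called can be pictured as a small lookup. The enum, struct, and function names below are hypothetical and are not the i386 backend's macros or functions; only the four numbers come from the thread.

```c
#include <assert.h>

/* Hypothetical ABI tag; in the real backend this would be derived from
   the function declaration, as suggested in the quoted reply.  */
enum toy_abi { TOY_SYSV_ABI, TOY_MS_ABI };

/* Number of argument registers available under one ABI.  */
struct toy_regparm {
    int int_regs;   /* integer argument registers */
    int sse_regs;   /* SSE argument registers */
};

/* Return the per-ABI limits instead of one global constant, so that a
   call to a function with a different ABI can override the default.  */
struct toy_regparm toy_regparm_for_abi(enum toy_abi abi)
{
    struct toy_regparm r;

    assert(abi == TOY_SYSV_ABI || abi == TOY_MS_ABI);
    if (abi == TOY_MS_ABI) {
        r.int_regs = 4;
        r.sse_regs = 4;
    } else {
        r.int_regs = 6;
        r.sse_regs = 8;
    }
    return r;
}
```

The point of the sketch is only that the limit becomes a function of the callee's ABI, which is the same shape as turning OUTGOING_REG_PARM_STACK_SPACE into a macro taking fndecl in the patch above.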
Re: GCC 4.2.1 : testsuite says WARNING: program timed out
2007/8/1, Rupert Wood <[EMAIL PROTECTED]>: > Dennis Clarke wrote: > > >Is there a way to allow the testsuite to just run regardless of > >how long it takes? > > I think you need to pass "set timeout -1" into dejagnu. I'd suggest a larger > positive timeout instead. > > I forget the correct way to do this - I used to end up editing the .exp files > in /usr/share/dejagnu. that's right, however, I recall some issues with, e.g., libstdc++ testsuite not using the system set in, if memory serves me right, remote.exp. -- Cheers, /ChJ
Re: RFC: RTL sharing between decls and instructions
Richard Sandiford <[EMAIL PROTECTED]> writes: > gcc/ > * emit-rtl.c (reset_used_decls): Rename to... > (set_used_decls): ...this. Set the used flag rather than clearing it. > (unshare_all_rtl_again): Update accordingly. Set flags on argument > DECL_RTLs rather than resetting them. This is OK if it passes testing. Your argument sounds right to me. Thanks. Ian
Re: GCC 4.2.1 : testsuite says WARNING: program timed out
On Wed, Aug 01, 2007 at 03:57:19AM -0400, Dennis Clarke wrote:
> WARNING: program timed out.
> FAIL: gcc.c-torture/compile/20001226-1.c -O1 (test for excess errors)

It's in the archives: http://gcc.gnu.org/ml/gcc/2006-09/msg00155.html

--
Rask Ingemann Lambertsen
ICE on valid code, cse related
Hi,

I am working on a private port and getting an ICE on valid code. This is
mainly because of the following (which is part of the full RTL dump of the
source file):

(insn 13 8 14 2 /fc3/testcases/reduce/testcase-min.i:8
    (set (reg:SI 138)
         (const_int 0 [0x0])) 44 {*movsi}
    (expr_list:REG_LIBCALL_ID (const_int 0 [0x0])
        (nil)))

(insn 14 13 15 2 /fc3/testcases/reduce/testcase-min.i:8
    (set (reg:SI 1 $c1)
         (reg/f:SI 112 *fp*)) 44 {*movsi}
    (expr_list:REG_LIBCALL_ID (const_int 1 [0x1])
        (insn_list:REG_LIBCALL 17
            (nil))))

(insn 15 14 16 2 /fc3/testcases/reduce/testcase-min.i:8
    (set (reg:SI 2 $c2)
         (reg:SI 138)) 44 {*movsi}
    (expr_list:REG_LIBCALL_ID (const_int 1 [0x1])
        (nil)))

(call_insn 16 15 18 2 /fc3/testcases/reduce/testcase-min.i:8
    (parallel [
        (call (mem:SI (symbol_ref:SI ("__floatsisf") [flags 0x41]) [0 S4 A32])
              (const_int 0 [0x0]))
        (use (const_int 0 [0x0]))
        (clobber (reg:SI 31 $link))
    ]) 41 {*call_direct}
    (expr_list:REG_LIBCALL_ID (const_int 1 [0x1])
        (expr_list:REG_EH_REGION (const_int -1 [0x])
            (nil)))
    (expr_list:REG_DEP_TRUE (use (reg:SI 2 $c2))
        (expr_list:REG_DEP_TRUE (use (reg:SI 1 $c1))
            (nil))))

(insn 18 16 17 2 /fc3/testcases/reduce/testcase-min.i:8
    (clobber (reg:SF 139)) -1
    (expr_list:REG_LIBCALL_ID (const_int 1 [0x1])
        (nil)))

(insn 17 18 19 2 /fc3/testcases/reduce/testcase-min.i:8
    (set (subreg:SI (reg:SF 139) 0)
         (mem/c/i:SI (reg/f:SI 112 *fp*) [2 S4 A32])) 44 {*movsi}
    (expr_list:REG_LIBCALL_ID (const_int 1 [0x1])
        (insn_list:REG_RETVAL 14
            (expr_list:REG_EQUAL (float:SF (reg:SI 138))
                (nil)))))

Note the REG_EQUAL note of insn 17. cse tries to replace reg:SI 138
with a constant and, because of insn 13, the note becomes
(float:SF (const_int 0)), which in turn cse converts into

REG_EQUAL (const_double:SF 0 [0x0] 0.0 [0x0.0p+0])

and when CONST_DOUBLE_LOW is applied to the above, the compiler crashes:

"internal compiler error: RTL check: expected code 'const_double' and
mode 'VOID', have code 'const_double' and mode 'SF' in plus_constant,
at explow.c:103"

i.e. the compiler is crashing after converting a const_int to an SFmode value.

Could this possibly be a generic issue, or is it a problem with my backend
(as in, will I need to define movsf in my backend, which isn't defined at
present)?

Apologies for the rather verbose post.

Thanks in advance,
Pranav
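Editorial sketch: the error text above says the accessor expected code 'const_double' with mode 'VOID' but was handed mode 'SF'. A toy model of that kind of checked accessor, in plain C, may make the failure easier to picture. The types and names below are hypothetical and far simpler than GCC's rtx. The only fact taken from the thread is the pair of conditions named in the error message.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for RTL codes and machine modes.  */
enum toy_code { TOY_CONST_INT, TOY_CONST_DOUBLE };
enum toy_mode { TOY_VOIDmode, TOY_SImode, TOY_SFmode };

struct toy_rtx {
    enum toy_code code;
    enum toy_mode mode;
    long low;   /* integer low word; meaningful only for VOIDmode */
};

/* Returns 1 when reading the integer low word is legitimate under the
   rule quoted in the error message: the code must be const_double AND
   the mode must be VOID.  A floating-point const_double (mode SF)
   fails the second condition.  */
int toy_const_double_low_ok(const struct toy_rtx *x)
{
    assert(x != NULL);
    return x->code == TOY_CONST_DOUBLE && x->mode == TOY_VOIDmode;
}
```

In this model the SFmode constant produced from (float:SF (const_int 0)) has the right code but the wrong mode, which matches the "have code 'const_double' and mode 'SF'" half of the message.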
Re: creating low gimple code for gimplify_omp_atomic_pipeline
On 8/1/07 8:07 AM, Razya Ladelsky wrote: > Any suggestions on how to create low gimple code for > gimplify_omp_atomic_pipeline > cases? Interesting. I think it's the first time we run into this problem. I don't see force_gimple_operand trying to emit low GIMPLE. But we always use it from the optimizers, so it should. You cannot force the omp_atomic gimplifiers to emit low GIMPLE as those are called by the GENERIC->GIMPLE conversion. The easiest way to fix this, I think, is to call lower_stmt() from force_gimple_operand() after the call to gimplify_expr. For this you'll need to setup a stmt iterator on the resulting list of statements from gimplify_expr and call lower_stmt on each of them (this should be implemented in gimple-low.c). Longer term, I think we need to have an indicator of what level of GIMPLE the function is in. This way the various helpers like force_gimple_operand can decide what to do.
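Editorial sketch: the difference between a structured conditional and its lowered form can be pictured in plain C. This is an analogy only. It does not use GIMPLE, lower_stmt, or any GCC API, and the function names are hypothetical. The first function keeps the conditional with nested arms, as a structured COND_EXPR does. The second expresses the same control flow with conditional jumps to labels, which is the shape lowering produces.

```c
#include <assert.h>

/* Structured form: the conditional owns both of its arms.  */
int clamp_structured(int x, int limit)
{
    int r;

    assert(limit >= 0);
    if (x > limit)
        r = limit;
    else
        r = x;
    return r;
}

/* Lowered form: the conditional only selects a jump target; the arms
   are flat statement sequences reached through labels.  */
int clamp_lowered(int x, int limit)
{
    int r;

    assert(limit >= 0);
    if (x > limit)
        goto then_arm;
    else
        goto else_arm;
then_arm:
    r = limit;
    goto done;
else_arm:
    r = x;
done:
    return r;
}
```

Both functions compute the same value for every input; only the representation of control flow differs.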
RE: GCC 4.2.1 : testsuite says WARNING: program timed out
> Dennis Clarke wrote:
>
>>Is there a way to allow the testsuite to just run regardless of
>>how long it takes?
>
> I think you need to pass "set timeout -1" into dejagnu. I'd suggest a larger
> positive timeout instead.
>
> I forget the correct way to do this - I used to end up editing the .exp
> files in /usr/share/dejagnu.

okay .. that sounds like a good hint.

Well .. the file in the default share/dejagnu directory look like so :

$ ls
baseboards    framework.exp  mondfe.exp   standard.exp   testglue.c
config        ftp.exp        remote.exp   stub-loader.c  tip.exp
debugger.exp  kermit.exp     rlogin.exp   target.exp     util-defs.exp
dejagnu.exp   libexec        rsh.exp      targetdb.exp   utils.exp
dg.exp        libgloss.exp   runtest.exp  telnet.exp     xsh.exp
$

somehow .. that can not be right.

Let's look in the GCC 4.2.1 objdir area for files that end with .exp :

$ cd gcc-4.2.1-build
$ find . -type f | grep "\.exp"
./gcc/testsuite/gcc/site.exp
./gcc/site.exp
$

okay .. now we are getting somewhere. Maybe :-\

$ cat ./gcc/testsuite/gcc/site.exp
## these variables are automatically generated by make ##
# Do not edit here. If you wish to override these values
# add them to the last section
set rootme "/opt/build/gcc-4.2.1-build/gcc"
set srcdir "/export/home/dclarke/build/gcc-4.2.1/gcc"
set host_triplet sparc-sun-solaris2.8
set build_triplet sparc-sun-solaris2.8
set target_triplet sparc-sun-solaris2.8
set target_alias sparc-sun-solaris2.8
set libiconv "/export/home/dclarke/local/lib/libiconv.so -R/export/home/dclarke/local/lib"
set CFLAGS ""
set CXXFLAGS ""
set HOSTCC "cc"
set HOSTCFLAGS "-g"
set TESTING_IN_BUILD_TREE 1
set HAVE_LIBSTDCXX_V3 1
set tmpdir /opt/build/gcc-4.2.1-build/gcc/testsuite/gcc
set srcdir "${srcdir}/testsuite"
## All variables above are generated by configure. Do Not Edit ##
$

there is not much there that looks helpful ... and both those files look to
be the same :

$ ls -li ./gcc/testsuite/gcc/site.exp ./gcc/site.exp
   263449 -rw-r--r--   1 dclarke  csw          759 Jul 31 20:58 ./gcc/site.exp
  1282849 -rw-r--r--   1 dclarke  csw          763 Jul 31 20:58 ./gcc/testsuite/gcc/site.exp
$ diff ./gcc/testsuite/gcc/site.exp ./gcc/site.exp
17c17
< set tmpdir /opt/build/gcc-4.2.1-build/gcc/testsuite/gcc
---
> set tmpdir /opt/build/gcc-4.2.1-build/gcc/testsuite
$

great ... so then ... perhaps I do have to go back to the exp files in the
default dejagnu area ?

oh to heck with this ... perhaps I can tar up the whole objdir and move it
over to a 1.6GHz UltraSparc box and test it there .. but that defeats the
purpose.

Thanks for trying

Dennis
Re: [RFC] Improve Tree-SSA if-conversion - convergence of efforts
On 8/1/07, Tehila Meyzels <[EMAIL PROTECTED]> wrote:
> "Daniel Berlin" <[EMAIL PROTECTED]> wrote on 31/07/2007 18:00:57:
>
> >
> > I agree with you for conditional stores/loads.
>
> Great!
>
> >
> > The unconditional store/load stuff, however, is exactly what
> > tree-ssa-sink was meant to do, and belongs there (this is #3 above).
> > I'm certainly going to fight tooth and nail against trying to shoehorn
> > unconditional store sinking into if-conv.
>
> Sometimes, store-sinking can cause performance degradations.
> One reason for that, is increasing register pressure, due to extending life
> range of registers.
>
> In addition, in case we have a store followed by a branch, store sinking
> result will be a branch followed by a store.
> On some architectures, the former can be executed in parallel, as opposed
> to the latter.
> Thus, in this case, it worth executing store-sinking only when it helps the
> if-conversion to get rid of the branch.
>
> How do you suggest to solve this problem, in case store-sinking will be
> part of the tree-sink pass?
>
Store sinking already *is* part of the tree-sink pass. It just only
sinks a small number of stores.
The solution to the problem that "sometimes you make things harder for
the target" is to fix that in the backend. In this case, the
scheduler will take care of it.

All of our middle end optimizations will sometimes have bad effects
unless the backend fixes it up. Trying to guess what is going to
happen 55 passes down the line is a bad idea unless you happen to be a
very good psychic.

As a general rule of thumb, we are happy to make the backend as target
specific and ask as many target questions as you like. The middle
end, not so much. There are very few passes in the middle end that
can/should/do ask anything about the target. Store sinking is not one
of them, and I see no good reason it should be.

> Another point, what about (unconditional) load hoisting:
> It's surely not related to sink pass, right?
>
PRE already will hoist unconditional loads out of loops, and in places
where it will eliminate redundancy.

It could also hoist loads in non-redundancy situations; it is simply
the case that its current heuristic does not think this is a good
idea.

Thus, if you wanted to do unconditional load hoisting, the thing to do
is to make a function like do_regular_insertion in tree-ssa-pre.c, and
call it from insert_aux.

We already have another heuristic for partially antic fully available
expressions, see do_partial_partial_insertion
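Editorial sketch: the "store followed by a branch" versus "branch followed by a store" shapes discussed in this thread can be pictured in plain C. This is a hand-written illustration of the two orderings, not output of tree-ssa-sink or any GCC pass, and the function names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Before sinking: the store to *p is issued ahead of the branch.  */
int store_then_branch(int *p, int a, int flag)
{
    int t;

    assert(p != NULL);
    *p = a;             /* store */
    if (flag)           /* branch */
        t = a + 1;
    else
        t = a - 1;
    return t;
}

/* After sinking: the same store has moved below the branch.  The
   observable result is identical; only the order of store and branch
   differs, which is what may matter to a target's scheduler.  */
int branch_then_store(int *p, int a, int flag)
{
    int t;

    assert(p != NULL);
    if (flag)           /* branch */
        t = a + 1;
    else
        t = a - 1;
    *p = a;             /* store */
    return t;
}
```

Whether the second ordering is slower on a given machine is exactly the target-specific question the thread argues should be left to the backend.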
The Linux binutils 2.17.50.0.18 is released
This is the beta release of binutils 2.17.50.0.18 for Linux, which is
based on binutils 2007 0731 in CVS on sourceware.org plus various
changes. It is purely for Linux.

All relevant patches in patches have been applied to the source tree.
You can take a look at patches/README to see what have been applied and
in what order they have been applied.

Starting from the 2.17.50.0.4 release, the default output section LMA
(load memory address) has changed for allocatable sections from being
equal to VMA (virtual memory address), to keeping the difference between
LMA and VMA the same as the previous output section in the same region.

For

.data.init_task : { *(.data.init_task) }

LMA of .data.init_task section is equal to its VMA with the old linker.
With the new linker, it depends on the previous output section. You
can use

.data.init_task : AT (ADDR(.data.init_task)) { *(.data.init_task) }

to ensure that LMA of .data.init_task section is always equal to its
VMA. The linker script in the older 2.6 x86-64 kernel depends on the
old behavior. You can add AT (ADDR(section)) to force LMA of
.data.init_task section equal to its VMA. It will work with both old
and new linkers. The x86-64 kernel linker script in kernel 2.6.13 and
above is OK.

The new x86_64 assembler no longer accepts

	monitor %eax,%ecx,%edx

You should use

	monitor %rax,%ecx,%edx

or
	monitor

which works with both old and new x86_64 assemblers. They should
generate the same opcode.

The new i386/x86_64 assemblers no longer accept instructions for moving
between a segment register and a 32bit memory location, i.e.,

	movl (%eax),%ds
	movl %ds,(%eax)

To generate instructions for moving between a segment register and a
16bit memory location without the 16bit operand size prefix, 0x66,

	mov (%eax),%ds
	mov %ds,(%eax)

should be used. It will work with both new and old assemblers. The
assembler starting from 2.16.90.0.1 will also support

	movw (%eax),%ds
	movw %ds,(%eax)

without the 0x66 prefix. Patches for 2.4 and 2.6 Linux kernels are
available at

http://www.kernel.org/pub/linux/devel/binutils/linux-2.4-seg-4.patch
http://www.kernel.org/pub/linux/devel/binutils/linux-2.6-seg-5.patch

The ia64 assembler is now defaulted to tune for Itanium 2 processors.
To build a kernel for Itanium 1 processors, you will need to add

ifeq ($(CONFIG_ITANIUM),y)
	CFLAGS += -Wa,-mtune=itanium1
	AFLAGS += -Wa,-mtune=itanium1
endif

to arch/ia64/Makefile in your kernel source tree.

Please report any bugs related to binutils 2.17.50.0.18 to [EMAIL PROTECTED]

and

http://www.sourceware.org/bugzilla/

Changes from binutils 2.17.50.0.17:

1. Update from binutils 2007 0731.
2. Switching from GPLv2 to GPLv3.
3. Add a new ELF linker option, --build-id, to generate a unique
per-binary identifier embedded in a note section.
4. Remove COFF/x86-64 from PE-COFF/x86-64.
5. Fix a "nm -l" crash on DWARF info. PR 4797.
6. Match symbol type when creating symbol aliase in ELF shared library.
7. Fix addr2line on relocatable linux kernel. PR 4756.
8. Change disassembler to print addend as signed.
9. Support section alignment from 128 to 8192 bytes for PE-COFF.
10. Add attribute section to ELF linker.
11. Fix ELF linker to meet gABI alignment requirement. PR 4701.
12. Add support for reading in debug information via a .gnu_debuglink
section.
13. Fix string merge for ia64 linker. PR 4590.
14. Add --common to size to display total size for *COM* syms.
15. Fix "strip --strip-unneeded" on relocatable files. PR 4716.
16. Fix "objcopy/strip --only-keep-debug" for SHT_NOTE sections.
17. Fix objdump -S with unit-at-a-time.
18. Properly handle "-shared -pie" in linker. PR 4409.
19. Fix x86 disassembler in Intel mode for various SIMD instruction.
PRs 4667/4834.
20. Update x86-64 assembler to long nop sequence by default.
21. Fix --32 for x86-64 mingw assembler.
22. Fix a memory corruption in assembler. PR 4722.
22. Properly support 64bit PE-COFF on hosts where long isn't 64bit.
23. Add #line in generated linker source files.
24. Fix linker crash on SIZEOF. PR 4782.
27. Add CR16 support.
28. Add windmc tool for Windows.
29. Generate x86 instruction/register definitions from ascii tables.
30. Fix strip for Solaris. PR 4712.
31. Fix various mips bugs.
32. Fix various ppc bugs.
33. Fix various spu bugs.
34. Fix various xtensa bugs.

Changes from binutils 2.17.50.0.16:

1. Update from binutils 2007 0615.
2. Preserve section alignment for copy relocation. PR 4504.
3. Properly fix regression with objcopy --only-keep-debug. PR 4479.
4. Fix ELF eh frame handling. PR 4497.
5. Fix ia64 string merge. PR 4590.
5. Don't use PE target on EFI files nor EFI target on PE files.
6. Speed up linker with many input files.
7. Support cross compiling windres. PR 2737.
8. Fix various windres bugs.
9. Fix various arms bugs.
10. Fix various m68k bugs.
11. Fix various mips bugs.
12. Fix various ppc bugs.
13. Fix various sparc bugs.
14
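Editorial sketch: the change in the default LMA described in this announcement (old linker: LMA equals VMA; new linker: keep the same LMA-to-VMA difference as the previous output section in the same region) can be written as two small formulas. This is arithmetic only, not linker code. The struct and function names are hypothetical, and the addresses used in any call are made up for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Addresses of one output section (hypothetical model).  */
struct toy_section {
    unsigned long vma;   /* virtual memory address */
    unsigned long lma;   /* load memory address */
};

/* Old default: a section with no AT() clause loads where it runs.  */
unsigned long default_lma_old(unsigned long vma)
{
    return vma;
}

/* New default: preserve the previous section's (LMA - VMA) offset.
   Unsigned wraparound makes this work when LMA is below VMA.  */
unsigned long default_lma_new(const struct toy_section *prev,
                              unsigned long vma)
{
    assert(prev != NULL);
    return vma + (prev->lma - prev->vma);
}
```

For example, if the previous section has VMA 0xc0100000 and LMA 0x00100000, a following section at VMA 0xc0200000 would get LMA 0xc0200000 under the old rule and 0x00200000 under the new one. Writing AT (ADDR(section)) explicitly, as the announcement recommends, pins the LMA to the VMA under either rule.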
Re: ICE on valid code, cse related
"Pranav Bhandarkar" <[EMAIL PROTECTED]> writes: > Note the REG_EQUAL note of insn 17. cse tries to replace reg:SI 138 > with a constant and because of insn 13, the note becomes (float:SF > (const_int 0)) which in turn cse converts into > > REG_EQUAL (const_double:SF 0 [0x0] 0.0 [0x0.0p+0]) That seems OK at first glance. > and when CONST_DOUBLE_LOW is done on the above, the compiler crashes - > > " internal compiler error: RTL check: expected code 'const_double' and > mode 'VOID', have code 'const_double' and mode 'SF' in plus_constant, > at explow.c:103" > > i.e the compiler is crashing after converting a const_int to an SFmode value. Who is calling CONST_DOUBLE_LOW on this value? Ian
Re: Workshop on GCC for Research in Embedded and Parallel Systems
======================================================================
CALL FOR PAPERS - One Final Week Extension

GREPS '07
Workshop on GCC for Research in Embedded and Parallel Systems
Brasov, Romania, September 16, 2007
http://sysrun.haifa.il.ibm.com/hrl/greps2007/

in conjunction with PACT '07
http://pactconf.org
======================================================================

We are honored to have two prominent keynote speakers:

Paul H J Kelly
Imperial College, London
Title: GCC in software performance research: just plug in

Benoit Dupont de Dinechin
ST Microelectronics, Grenoble, France
Title: GCC for Embedded VLIW Processors: Why Not?

A final extension of one week has been granted: submissions are due
August 7, 2007. They are to be 6-12 pages long, for review purposes.
The submission site is http://papers.haifa.il.ibm.com/greps2007/
For more details see http://sysrun.haifa.il.ibm.com/hrl/greps2007/
Early notification of the intention to participate (submit and/or attend)
would be helpful.

Important Dates
Papers due: August 7, 2007
Acceptance notices: August 13, 2007
Workshop date: September 16, 2007

Ayal Zaks/Haifa/IBM wrote on 26/06/2007 18:08:43:

> Acronymed GREPS (... just what you were looking for), is to be held on
> September 16 in Brasov, Romania, co-located with PACT. We'd like to bring
> this workshop to your attention; the submission site is now open until
> July 24, after the upcoming GCC Developers' Summit. For more details see
> http://sysrun.haifa.il.ibm.com/hrl/greps2007/
>
> Thanks, Albert and Ayal.
Re: [tuples] heads up. you need to specify --enable-checking
On 8/1/07 12:37 PM, Diego Novillo wrote: > So, when configuring the branch make sure you specify --enable-checking. Oh, never mind. Andrew pointed out that it's much easier to just modify version.c as we usually do on branches. Silly me. No need to explicitly --enable-checking now. Apologies for the noise.
Re: ICE on valid code, cse related
> Who is calling CONST_DOUBLE_LOW on this value? plus_constant calls CONST_DOUBLE_LOW on this value. simplify_binary_operation_1 calls plus_constant ( while trying to simplify PLUS on (const_double:SF 0 [0x0] 0.0 [0x0.0p+0]) & (const_int -2147483648 [0x8000]) ), which in turn calls CONST_DOUBLE_LOW. Thanks, Pranav
[tuples] heads up. you need to specify --enable-checking
I just got tricked by my change to DEV-PHASE. Since the branch no longer says 'experimental' but it specifies the branch name and the mainline merge revision number, configure is defaulting to --enable-checking=release. So, when configuring the branch make sure you specify --enable-checking.
Re: [RFC] Improve Tree-SSA if-conversion - convergence of efforts
"Daniel Berlin" <[EMAIL PROTECTED]> wrote on 01/08/2007 18:27:35:

> On 8/1/07, Tehila Meyzels <[EMAIL PROTECTED]> wrote:
> > "Daniel Berlin" <[EMAIL PROTECTED]> wrote on 31/07/2007 18:00:57:
> >
> > > I agree with you for conditional stores/loads.
> >
> > Great!
> >
> > > The unconditional store/load stuff, however, is exactly what
> > > tree-ssa-sink was meant to do, and belongs there (this is #3 above).
> > > I'm certainly going to fight tooth and nail against trying to shoehorn
> > > unconditional store sinking into if-conv.
> >
> > Sometimes, store-sinking can cause performance degradations.
> > One reason for that is increased register pressure, due to extending
> > the live range of registers.
> >
> > In addition, in case we have a store followed by a branch, the result
> > of store sinking will be a branch followed by a store.
> > On some architectures, the former can be executed in parallel, as opposed
> > to the latter.
> > Thus, in this case, it is worth executing store-sinking only when it
> > helps the if-conversion to get rid of the branch.
> >
> > How do you suggest to solve this problem, in case store-sinking will be
> > part of the tree-sink pass?
>
> Store sinking already *is* part of the tree-sink pass.  It just only
> sinks a small number of stores.
> The solution to the problem that "sometimes you make things harder for
> the target" is to fix that in the backend.  In this case, the
> scheduler will take care of it.
>
> All of our middle end optimizations will sometimes have bad effects
> unless the backend fixes it up.  Trying to guess what is going to
> happen 55 passes down the line is a bad idea unless you happen to be a
> very good psychic.
>
> As a general rule of thumb, we are happy to make the backend as target
> specific and ask as many target questions as you like.  The middle
> end, not so much.  There are very few passes in the middle end that
> can/should/do ask anything about the target.  Store sinking is not one
> of them, and I see no good reason it should be.
>
> > Another point, what about (unconditional) load hoisting:
> > It's surely not related to sink pass, right?
>
> PRE already will hoist unconditional loads out of loops, and in places
> where it will eliminate redundancy.
>
> It could also hoist loads in non-redundancy situations, it is simply
> the case that its current heuristic does not think this is a good
> idea.

Hoisting a non-redundant load speculatively above an if may indeed be a
bad idea, unless that if gets converted as a result (and possibly even
then ...).  Are we in agreement then that unconditional load/store motion
for the sake of redundancy elimination continues to belong to
PRE/tree-sink, and that conditional load/store motion for the sake of
conditional-branch elimination better be coordinated by if-cvt?

Ayal.

> Thus, if you wanted to do unconditional load hoisting, the thing to do
> is to make a function like do_regular_insertion in tree-ssa-pre.c, and
> call it from insert_aux.
>
> We already have another heuristic for partially antic fully available
> expressions, see do_partial_partial_insertion
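
To make the load-hoisting trade-off above concrete, here is a minimal sketch at the source level.  It is not GCC code: the function names are invented for this illustration, and the "hoisted" variant only mimics by hand what a speculative hoist would produce.  The point is that the hoisted load runs even when the condition is false, so it is only valid when the pointer is known to be dereferenceable on every path.

```c
#include <stddef.h>

/* Original form: the load from *p happens only when c is nonzero,
   so p may be NULL or otherwise invalid whenever c is zero.  */
int load_in_branch (const int *p, int c, int d)
{
  int x = d;
  if (c)
    x = *p;
  return x;
}

/* Hand-written "speculatively hoisted" form (an illustration, not
   compiler output): the load is executed unconditionally, which
   leaves a branch-free select behind.  This is only safe when p is
   dereferenceable regardless of c.  */
int load_hoisted (const int *p, int c, int d)
{
  int t = *p;           /* executed on every path */
  return c ? t : d;     /* candidate for a conditional move */
}
```

With a valid pointer both functions return the same value for any c.  Only load_in_branch may be called as load_in_branch (NULL, 0, d), which is why such a hoist needs a safety check and, as argued above, a reason such as the branch actually being converted.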
Re: printing cfg
On 8/1/07 3:03 PM, Bob Rossi wrote:

> Is there a way to make it show the actual expressions in the code
> instead?

Other than changing the code in tree-cfg.c:tree_cfg2vcg(), not really.
Also, this dump is fairly static in that it only happens right after the
CFG is built for the first time (before any optimizations).

> Also, is there a native way to display this information using
> dot instead?

Perhaps it would be easier to post-process the dumps that contain basic
block information (-fdump-tree-all-blocks).  I generally use the
attached script to get the CFG out of an arbitrary pass.  It's very
simplistic, but it could be adapted to do what you want.

#!/bin/sh
#
# (C) 2005 Free Software Foundation
# Contributed by Diego Novillo <[EMAIL PROTECTED]>.
#
# This script is Free Software, and it can be copied, distributed and
# modified as defined in the GNU General Public License.  A copy of
# its license can be downloaded from http://www.gnu.org/copyleft/gpl.html

if [ "$1" = "" ] ; then
    echo "usage: $0 file"
    echo
    echo "Generates a GraphViz .dot graph file from 'file'."
    echo "It assumes that 'file' has been generated with -fdump-tree-...-blocks"
    echo
    exit 1
fi

file=$1
out=$file.dot
echo "digraph cfg {"> $out
echo " node [shape=box]" >>$out
echo ' size="11,8.5"' >>$out
echo>>$out
(grep -E '# BLOCK|# PRED:|# SUCC:' $file | \
    sed -e 's:\[\([0-9\.%]*\)*\]::g;s:([a-z_,]*)::g' | \
    awk '{ #print $0; \
        if ($2 == "BLOCK") \
        { \
            bb = $3;\
            print "\t", bb, "[label=\"", bb, "\", style=filled, color=gray]"; \
        } \
        else if ($2 == "PRED:") \
        { \
            for (i = 3; i <= NF; i++) \
                print "\t", $i, "->", bb, ";"; \
        } \
    }') >> $out
echo "}">> $out
printing cfg
Hi,

I'm trying to print the cfg so that I can visualize it.  I have a simple
file,

  $ cat foo.c
  int foo (int param)
  {
    param++;
    if (param)
      param++;
    return param;
  }

I run the command,
  $ gcc -fdump-tree-vcg-blocks -c foo.c
and then I run,
  xvcg *.vcg
which displays a picture of the cfg.

It appears for some reason, that the expressions in the basic blocks
just show things like,
  modify_expr (4)
  cond_expr (5)

Is there a way to make it show the actual expressions in the code
instead?  Also, is there a native way to display this information using
dot instead?

Thanks,
Bob Rossi
Re: ICE on valid code, cse related
"Pranav Bhandarkar" <[EMAIL PROTECTED]> writes:

> > Who is calling CONST_DOUBLE_LOW on this value?
> plus_constant calls CONST_DOUBLE_LOW on this value.
>
> simplify_binary_operation_1 calls plus_constant ( while trying to
> simplify PLUS on (const_double:SF 0 [0x0] 0.0 [0x0.0p+0]) & (const_int
> -2147483648 [0x8000]) ), which in turn calls CONST_DOUBLE_LOW.

How can we have a PLUS on a CONST_DOUBLE and a CONST_INT?  That does not
make sense, as there is no MODE argument that could make this work
correctly.  From your description, MODE must be some integer mode, in
which case it is wrong to be using a CONST_DOUBLE in SFmode.

(I don't know where the bug is; I'm just trying to help pin it down.)

Ian
Re: [RFC] Improve Tree-SSA if-conversion - convergence of efforts
On 8/1/07, Ayal Zaks <[EMAIL PROTECTED]> wrote:
> "Daniel Berlin" <[EMAIL PROTECTED]> wrote on 01/08/2007 18:27:35:
>
> > On 8/1/07, Tehila Meyzels <[EMAIL PROTECTED]> wrote:
> > > "Daniel Berlin" <[EMAIL PROTECTED]> wrote on 31/07/2007 18:00:57:
> > >
> > > > I agree with you for conditional stores/loads.
> > >
> > > Great!
> > >
> > > > The unconditional store/load stuff, however, is exactly what
> > > > tree-ssa-sink was meant to do, and belongs there (this is #3 above).
> > > > I'm certainly going to fight tooth and nail against trying to
> > > > shoehorn unconditional store sinking into if-conv.
> > >
> > > Sometimes, store-sinking can cause performance degradations.
> > > One reason for that is increased register pressure, due to extending
> > > the live range of registers.
> > >
> > > In addition, in case we have a store followed by a branch, the result
> > > of store sinking will be a branch followed by a store.
> > > On some architectures, the former can be executed in parallel, as
> > > opposed to the latter.
> > > Thus, in this case, it is worth executing store-sinking only when it
> > > helps the if-conversion to get rid of the branch.
> > >
> > > How do you suggest to solve this problem, in case store-sinking will be
> > > part of the tree-sink pass?
> >
> > Store sinking already *is* part of the tree-sink pass.  It just only
> > sinks a small number of stores.
> > The solution to the problem that "sometimes you make things harder for
> > the target" is to fix that in the backend.  In this case, the
> > scheduler will take care of it.
> >
> > All of our middle end optimizations will sometimes have bad effects
> > unless the backend fixes it up.  Trying to guess what is going to
> > happen 55 passes down the line is a bad idea unless you happen to be a
> > very good psychic.
> >
> > As a general rule of thumb, we are happy to make the backend as target
> > specific and ask as many target questions as you like.  The middle
> > end, not so much.  There are very few passes in the middle end that
> > can/should/do ask anything about the target.  Store sinking is not one
> > of them, and I see no good reason it should be.
> >
> > > Another point, what about (unconditional) load hoisting:
> > > It's surely not related to sink pass, right?
> >
> > PRE already will hoist unconditional loads out of loops, and in places
> > where it will eliminate redundancy.
> >
> > It could also hoist loads in non-redundancy situations, it is simply
> > the case that its current heuristic does not think this is a good
> > idea.
>
> Hoisting a non-redundant load speculatively above an if may indeed be a bad
> idea, unless that if gets converted as a result (and possibly even then
> ...).  Are we in agreement then that unconditional load/store motion for
> the sake of redundancy elimination continues to belong to PRE/tree-sink,
> and that conditional load/store motion for the sake of conditional-branch
> elimination better be coordinated by if-cvt?

Yes.

My only issue here is duplication of code that exists in other passes,
not one of who/when/why things get done.

I.e., it is easier to use PRE's infrastructure to do the unconditional
load elimination, but still only do more than redundancy elimination
when you will if-convert branches, than it would be to write a new pass.

Your new pass would end up probably missing loads that PRE goes to
trouble to get, and would duplicate a lot of the safety computation PRE
already knows how to do.

Of course, if you only see yourself moving one or two loads per function,
it may be quicker to do just those in their own pass controlled by
ifcvt.  But if you are going to try to if-convert every branch, and
every load inside those branches, you really don't want to try to make
your computation as efficient as PRE makes it.

A similar situation exists for unconditional store sinking/tree-ssa-sink.
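
For readers following the distinction between "unconditional" and "conditional" stores in this thread, here is a minimal source-level sketch.  It is an assumption about how the terms are being used, not GCC code: the function names are invented, and the "sunk" and "ifcvt" variants are written by hand to mimic what the respective transformation would produce.

```c
/* "Unconditional" store: every path through the if stores to *p.  */
void store_both_arms (int *p, int c, int a, int b)
{
  if (c)
    *p = a;
  else
    *p = b;
}

/* The same function after sinking the store to the join point by
   hand.  The arms now only pick a value, so the branch can become a
   select, and no new memory access is introduced.  */
void store_sunk (int *p, int c, int a, int b)
{
  int t = c ? a : b;
  *p = t;
}

/* "Conditional" store: only one path stores to *p.  */
void store_one_arm (int *p, int c, int a)
{
  if (c)
    *p = a;
}

/* Hand-written if-converted form of the conditional store.  It adds
   a load and a store on the path where c is zero, so it is only
   valid if *p is known to be readable and writable and the extra
   store cannot introduce a data race.  */
void store_one_arm_ifcvt (int *p, int c, int a)
{
  int old = *p;
  *p = c ? a : old;
}
```

The first pair is plain code motion with no new safety obligations.  The second pair is only worthwhile, and only valid, under extra conditions tied to removing the branch, which is one way to read the agreed split between tree-ssa-sink and if-conversion.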
Re: AMD64 ABI compatibility
Kai,

did you make your diff against the current CVS checkout or against your
first patch?  Should your changes already work for some cases?  I would
like to test if they produce the right instructions.  However, I do not
have enough insight into gcc to work on it myself.

Thanks,
Nicolas

On Aug 1, 2007, at 5:48 , Kai Tietz wrote:

Hi Jan,

Jan Hubicka wrote on 31.07.2007 23:40:40:

Hi Kai,

so, could you resolve the remaining issues?  Or have you kind of paused
the project?

Cheers,
Nicolas

On Jul 12, 2007, at 2:14 , Kai Tietz wrote:

Hi,

I am nearly through :)  The remaining macros left to be ported are
REGPARM_MAX and SSE_REGPARM_MAX.  The sysv_abi uses 6 regs and 8 SSE
registers, ms_abi uses 4 regs and 4 SSE registers.  The problem is, for
example, the use of SSE_REGPARM_MAX in i386.md without any hint how to
choose the required abi.  Do you have an idea how this could be done?

This should not be difficult - ix86_regparm is used in the
ix86_function_regparm, init_cumulative_args, and
setup_incoming_varargs_64 functions.  In all those cases you know the
function declaration, and thus you can check whether it is a call to a
different ABI and overwrite the value.

Ok, here is my update.

Cheers,
i.A. Kai Tietz

| (\_/) This is Bunny. Copy and paste Bunny
| (='.'=) into your signature to help him gain
| (")_(") world domination.

--
OneVision Software Entwicklungs GmbH & Co. KG
Dr.-Leo-Ritter-Straße 9 - 93049 Regensburg
Tel: +49.(0)941.78004.0 - Fax: +49.(0)941.78004.489 - www.OneVision.com
Commerzbank Regensburg - bank code 750 400 62 - account 6011050
Commercial register: HRA 6744, Regensburg District Court
General partner: OneVision Software Entwicklungs Verwaltungs GmbH
Dr.-Leo-Ritter-Straße 9 - 93049 Regensburg
Commercial register: HRB 8932, Regensburg District Court - Managing
directors: Ulrike Döhler, Manuela Kluger
Re: Semicolons at the end of member function definitions
Volker Reichelt wrote:

> 2007-03-26  Dirk Mueller  <[EMAIL PROTECTED]>
>
>         * parser.c (cp_parser_member_declaration): Pedwarn
>         about stray semicolons after member declarations.
>
> It makes
>
>   struct A
>   {
>     void foo() {};
>   }

That is indeed still legal in the current working draft.  (The reason
that I copied the grammar productions above the parser functions was so
that it would be easy to check things like this...)

> Therefore, IMHO the patch is wrong and should be reverted.

Yes, please go ahead and revert it.  And, if you have time, please add a
test-case specifically for this case.  The previous patch removed
semicolons from lots of valid code, but probably none of those test
cases were specifically for this case.

Thanks,

--
Mark Mitchell
CodeSourcery
[EMAIL PROTECTED]
(650) 331-3385 x713
missing libtool sources?
ltmain.sh starts with this line:

  # Generated from ltmain.m4sh; do not edit by hand

but we don't seem to have ltmain.m4sh in the source tree.