Re: Bug in expand_builtin_setjmp_receiver ?
Hi Nathan,

> lm32 has a gdb simulator available, so it should be fairly easy to write
> a board file for it if one doesn't already exist.
>
> Unfortunately, building lm32-elf is broken in several different ways
> right now.

What problems do you have building lm32-elf? If you let me know, I can try to look into them.

Cheers,
Jon
Re: peephole2: dead regs not marked as dead
Paolo Bonzini wrote:
> On 10/26/2010 07:42 PM, Georg Lay wrote:
>> I set a break at the end of df_simulate_one_insn_backwards.
>> CURRENT = *(live->current->bits)
>> FIRST = *(live->first->bits)
>
> Or call debug_bitmap (). :)
>
>> reg 26 (Stackpointer) and reg 27 (return address) do not matter here.
>> The result is
>>
>> insn 10 (CALL) CURRENT = FIRST = 0xc008010 = {...,4,15}
>
> Ok, this looks like a bug somewhere (either in DF or in your backend).

Hmmm. How could the backend introduce a bug in liveness?
REG15 is special in some ways:

- it is call-saved
- it is neither an element of FUNCTION_ARG_REGNO_P nor of EPILOGUE_USES
  nor of CALL_[REALLY_]USED_REGISTERS
- CLASS_LIKELY_SPILLED_P is true for REG15
- it has its own regclass. This is because REG15 is an implicit reg to many
  instructions, which allows for shorter encoding, so REG15 is preferred over
  other regs. However, there is no special functionality attached to REG15.
  This regclass is the first regclass after NO_REGS and a subset of a more
  general set of GPRs, so it's not a "stand alone" reg.
- REG15 is at the top of REG_ALLOC_ORDER
- IRA uses the priority-based allocator (but with CB the dumps are the same)

> One reason could be an artificial use at the bottom of the basic block.
> This seems strange because the implicit restore of the register in the
> epilogue would be a definition, not a use. Anyway, can you print the
> liveness bitmap after df_simulate_initialize_backwards?

I patched the following snippet into the end of that function. The %? just prints additional info: gcc-func[funcname:passname(passno)]

+ {
+   tric_edump ("%?:\n");
+   debug_bitmap (live);
+ }

It prints

...
df_simulate_initialize_backwards[and:peephole2(202)]:
first = 0x885ee1c current = 0x885ee1c indx = 0
0x885ee1c next = (nil) prev = (nil) indx = 0
bits = { 2 15 26 27 }
...

> I also don't see any reason why insn 10 should use d15, though. To
> exclude this, can you walk through df_simulate_defs and df_simulate_uses
> for insn 10, and print (p *def / p *use respectively) each of the defs
> and uses that they encounter?

I attached the dumps for the pass and also the changes I made to df-problems.c so that you can see in which context the dumps are printed.

> Since liveness is being computed backwards, it's better to think about
> it as "already" being alive. Since it is not alive at the end of the
> basic block (in the dump it's not part of "lr out"), it must have been
> added either by df_simulate_initialize_backwards, or by
> df_simulate_one_insn_backwards on the CALL insn.

Georg Lay

--- df_simulate_initialize_backwards[and:peephole2(202)]:
first = 0x885ee1c current = 0x885ee1c indx = 0
0x885ee1c next = (nil) prev = (nil) indx = 0
bits = { 2 15 26 27 }
u-1 reg 26 bb 2 insn -1 flag 0x0 type 0x1 chain { }
first = 0x885ee1c current = 0x885ee1c indx = 0
0x885ee1c next = (nil) prev = (nil) indx = 0
bits = { 2 15 26 27 }
first = 0x885ee1c current = 0x885ee1c indx = 0
0x885ee1c next = (nil) prev = (nil) indx = 0
bits = { 2 15 26 27 }

--- df_simulate_defs[and:peephole2(202)]:
(call_insn/j 10 25 11 2 peep2.c:5 (parallel [
            (set (reg:SI 2 d2)
                (call (mem:HI (symbol_ref:SI ("f") [flags 0x41] ) [0 S2 A16])
                    (const_int 0 [0x0])))
            (use (const_int 1 [0x1]))
        ]) 92 {call_value_insn} (expr_list:REG_DEAD (reg:SI 4 d4)
        (nil))
    (expr_list:REG_DEP_TRUE (use (reg:SI 4 d4))
        (nil)))
d-1 reg 0 bb 2 insn 10 flag 0x40 type 0x0 chain { }
d-1 reg 1 bb 2 insn 10 flag 0x40 type 0x0 chain { }
d-1 reg 3 bb 2 insn 10 flag 0x40 type 0x0 chain { }
d-1 reg 4 bb 2 insn 10 flag 0x40 type 0x0 chain { }
d-1 reg 5 bb 2 insn 10 flag 0x40 type 0x0 chain { }
d-1 reg 6 bb 2 insn 10 flag 0x40 type 0x0 chain { }
d-1 reg 7 bb 2 insn 10 flag 0x40 type 0x0 chain { }
d-1 reg 18 bb 2 insn 10 flag 0x40 type 0x0 chain { }
d-1 reg 19 bb 2 insn 10 flag 0x40 type 0x0 chain { }
d-1 reg 20 bb 2 insn 10 flag 0x40 type 0x0 chain { }
d-1 reg 21 bb 2 insn 10 flag 0x40 type 0x0 chain { }
d-1 reg 22 bb 2 insn 10 flag 0x40 type 0x0 chain { }
d-1 reg 23 bb 2 insn 10 flag 0x40 type 0x0 chain { }
d-1 reg 2 bb 2 insn 10 flag 0x8 type 0x0 loc 0xb753ce80(0xb759c810) chain { }
first = 0x885ee1c current = 0x885ee1c indx = 0
0x885ee1c next = (nil) prev = (nil) indx = 0
bits = { 15 26 27 }

--- df_simulate_uses[and:peephole2(202)]:
(call_insn/j 10 25 11 2 peep2.c:5 (parallel [
            (set (reg:SI 2 d2)
                (call (mem:HI (symbol_ref:SI ("f") [flags 0x41] ) [0 S2 A16])
                    (const_int 0 [0x0])))
            (use (const_int 1 [0x1]))
        ]) 92 {call_value_insn} (expr_list:REG_DEAD (reg:SI 4 d4)
        (nil))
    (expr_list:REG_DEP_TRUE (use (reg:SI 4 d4))
        (nil)))
u-1 reg 26 bb 2 insn 10 flag 0x2008 type 0x1 chain { }
u-1 reg 4 bb 2 insn 10 flag 0x8 type 0x1 loc 0xb755ccb
gengtype installation (where, how)?
Hello

I am at the GCC Summit. If some GCC Makefile maintainer could meet me to discuss face to face how and where, concretely, the gengtype program should be installed, I would be grateful.

As you know, I am pushing patches to make gengtype really usable from plugins, and that means persisting its state somewhere, and having the user who develops a plugin with GTY-s invoke it. So gengtype is becoming a user-visible program and has to be installed.

I have no idea about the details.

Cheers
--
Basile STARYNKEVITCH http://starynkevitch.net/Basile/
email: basilestarynkevitchnet mobile: +33 6 8501 2359
8, rue de la Faiencerie, 92340 Bourg La Reine, France
*** opinions {are only mines, sont seulement les miennes} ***
Re: %pc relative addressing of string literals/const data
On 27/10/2010 07:47, Joakim Tjernlund wrote:
> Alan Modra wrote on 2010/10/27 04:01:50:
>> On Wed, Oct 27, 2010 at 12:53:00AM +0100, Dave Korn wrote:
>>> On 26/10/2010 23:37, Joakim Tjernlund wrote:
>>>> Everything went dead quiet the minute I started to send patches, what did I do wrong?
>>> Nothing, you just ran into the lack-of-manpower problem. Sorry! And I
>>> can't even help, I'm not a ppc maintainer.
>> I also cannot approve gcc patches.
>
> Sent it to gcc-patches too. I already sent another gcc patch there but that
> didn't trigger any response either.
> Perhaps you can notify whoever that can approve patches?

We have a convention on the patches list: if a patch hasn't gotten an answer after ten to fourteen days or so, send a reply to the original post, adding "[PING]" to the beginning of the subject line. (Sometimes it can take two or three pings; unfortunately, that's just a consequence of our limited resources.)

I see your first patch was posted on the 19th. Give it another few days, then ping it. When you do so, you could also mention your other patch at the same time.

cheers,
DaveK
Re: peephole2: dead regs not marked as dead
On 10/27/2010 12:54 PM, Georg Lay wrote:
>>> reg 26 (Stackpointer) and reg 27 (return address) do not matter here.
>>> The result is
>>>
>>> insn 10 (CALL) CURRENT = FIRST = 0xc008010 = {...,4,15}
>>
>> Ok, this looks like a bug somewhere (either in DF or in your backend).
>
> hmmm. How could the backend introduce a bug in liveness?
> REG15 is special in some way
>
> - it is call-saved
> - it is neither an element of FUNCTION_ARG_REGNO_P nor of EPILOGUE_USES
>   nor of CALL_[REALLY_]USED_REGISTERS

Looks fine.

However, your previous dump showed {2,26,27} for lr out, while this debug output shows {2,15,26,27} at the beginning of df_simulate_initialize_backwards.

I would then look at the dump for the previous pass, and/or put breakpoints before/after df_analyze to see why the two dumps differ.

Paolo
Re: peephole2: dead regs not marked as dead
Paolo Bonzini wrote:
> On 10/27/2010 12:54 PM, Georg Lay wrote:
>>>> reg 26 (Stackpointer) and reg 27 (return address) do not matter here.
>>>> The result is
>>>>
>>>> insn 10 (CALL) CURRENT = FIRST = 0xc008010 = {...,4,15}
>>>
>>> Ok, this looks like a bug somewhere (either in DF or in your backend).
>>
>> hmmm. How could the backend introduce a bug in liveness?
>> REG15 is special in some way
>>
>> - it is call-saved
>> - it is neither an element of FUNCTION_ARG_REGNO_P nor of EPILOGUE_USES
>>   nor of CALL_[REALLY_]USED_REGISTERS
>
> Looks fine.
>
> However, your previous dump showed {2,26,27} for lr out, while this
> debug output shows {2,15,26,27} at the beginning of
> df_simulate_initialize_backwards.
>
> I would then look at the dump for the previous pass, and/or put
> breakpoints before/after df_analyze to see why the two dumps differ.

That dump was from IRA. The first time d15 can be seen in "lr out" is dse2:

peep2.c.193r.split2:;; lr out 2 [d2] 26 [SP] 27 [a11]
peep2.c.195r.pro_and_epilogue:;; lr out 2 [d2] 26 [SP] 27 [a11]
peep2.c.196r.dse2:;; lr out 2 [d2] 15 [d15] 26 [SP] 27 [a11]

The first time it occurs in "exit block uses" is in pro/epilogue:

peep2.c.193r.split2:;; exit block uses 2 [d2] 26 [SP] 27 [a11]
peep2.c.195r.pro_and_epilogue:;; exit block uses 2 [d2] 15 [d15] 26 [SP] 27 [a11]
peep2.c.196r.dse2:;; exit block uses 2 [d2] 15 [d15] 26 [SP] 27 [a11]
peep2.c.196r.dse2:;; exit block uses 2 [d2] 15 [d15] 26 [SP] 27 [a11]

I defined REG_DEAD_DEBUGGING, but that doesn't give more information, and none of these passes produces dumps for my match in df-problems.c.

...and df is much too complicated not to get lost in...

Georg

;; Function and (and)

scanning new insn with uid = 22.
scanning new insn with uid = 23.
deleting insn with uid = 9.
deleting insn with uid = 9.

and

Dataflow summary:
;; invalidated by call 0 [d0] 1 [d1] 2 [d2] 3 [d3] 4 [d4] 5 [d5] 6 [d6] 7 [d7] 18 [a2] 19 [a3] 20 [a4] 21 [a5] 22 [a6] 23 [a7]
;; hardware regs used 26 [SP]
;; regular block artificial uses 26 [SP]
;; eh block artificial uses 26 [SP] 32 [ARGP]
;; entry block defs 2 [d2] 4 [d4] 5 [d5] 6 [d6] 7 [d7] 19 [a3] 20 [a4] 21 [a5] 22 [a6] 23 [a7] 26 [SP] 27 [a11] 31 [a15]
;; exit block uses 2 [d2] 26 [SP] 27 [a11]
;; regs ever live 2[d2] 4[d4] 15[d15] 26[SP]
;; ref usage r0={1d} r1={1d} r2={2d,1u} r3={1d} r4={3d,3u} r5={2d} r6={2d} r7={2d} r15={2d,2u} r18={1d} r19={2d} r20={2d} r21={2d} r22={2d} r23={2d} r26={1d,3u} r27={1d,1u} r31={1d}
;;total ref usage 40{30d,10u,0e} in 4{3 regular + 1 call} insns.
(note 1 0 4 NOTE_INSN_DELETED)

;; Start of basic block ( 0) -> 2
;; bb 2 artificial_defs: { }
;; bb 2 artificial_uses: { u-1(26){ }}
;; lr in 4 [d4] 26 [SP] 27 [a11]
;; lr use 4 [d4] 26 [SP]
;; lr def 0 [d0] 1 [d1] 2 [d2] 3 [d3] 4 [d4] 5 [d5] 6 [d6] 7 [d7] 15 [d15] 18 [a2] 19 [a3] 20 [a4] 21 [a5] 22 [a6] 23 [a7]
;; live in 4 [d4] 26 [SP] 27 [a11]
;; live gen 2 [d2] 4 [d4] 15 [d15]
;; live kill

;; Pred edge ENTRY [100.0%] (fallthru)
(note 4 1 2 2 [bb 2] NOTE_INSN_BASIC_BLOCK)

(note 2 4 3 2 NOTE_INSN_DELETED)

(note 3 2 8 2 NOTE_INSN_FUNCTION_BEG)

(note 8 3 22 2 NOTE_INSN_DELETED)

(insn 22 8 23 2 peep2.c:5 (set (reg:SI 15 d15)
        (and:SI (reg:SI 4 d4 [ x ])
            (const_int -98305 [0xfffe7fff]))) 139 {*andsi3_zeroes.insert.ic} (nil))

(insn 23 22 21 2 peep2.c:5 (set (reg:SI 15 d15)
        (xor:SI (reg:SI 15 d15)
            (reg:SI 4 d4 [ x ]))) 39 {*xorsi3} (nil))

(insn 21 23 10 2 peep2.c:5 (set (reg:SI 4 d4)
        (reg:SI 15 d15)) 2 {*movsi_insn} (nil))

(call_insn/j 10 21 11 2 peep2.c:5 (parallel [
            (set (reg:SI 2 d2)
                (call (mem:HI (symbol_ref:SI ("f") [flags 0x41] ) [0 S2 A16])
                    (const_int 0 [0x0])))
            (use (const_int 1 [0x1]))
        ]) 92 {call_value_insn} (nil)
    (expr_list:REG_DEP_TRUE (use (reg:SI 4 d4))
        (nil)))
;; End of basic block 2 -> ( 1)
;; lr out 2 [d2] 26 [SP] 27 [a11]
;; live out 2 [d2] 26 [SP] 27 [a11]

;; Succ edge EXIT [100.0%] (ab,sibcall)

(barrier 11 10 20)

(note 20 11 0 NOTE_INSN_DELETED)

;; Function and (and)

try_optimize_cfg iteration 1

verify found no changes in insn with uid = 10.

and

Dataflow summary:
;; invalidated by call 0 [d0] 1 [d1] 2 [d2] 3 [d3] 4 [d4] 5 [d5] 6 [d6] 7 [d7] 18 [a2] 19 [a3] 20 [a4] 21 [a5] 22 [a6] 23 [a7]
;; hardware regs used 26 [SP]
;; regular block artificial uses 26 [SP]
;; eh block artificial uses 26 [SP] 32 [ARGP]
;; entry block defs 2 [d2] 4 [d4] 5 [d5] 6 [d6] 7 [d7] 15 [d15] 19 [a3] 20 [a4] 21 [a5] 22 [a6] 23 [a7] 26 [SP] 27 [a11] 31 [a15]
;; exit block uses 2 [d2] 15 [d15] 26 [SP] 27 [a11]
;; regs ever live 2[d2] 4[d4] 15[d15] 26[SP]
;; ref usage r0={1d} r1={1d} r2={2d,1u} r3={1d} r4={3d,3u} r5={2d} r6={2d} r7={2d} r15={3d,3u} r18={1d} r1
Re: peephole2: dead regs not marked as dead
On 10/27/2010 04:30 PM, Georg Lay wrote:
> The first time it occurs in "exit block uses" is in pro/epilogue:
>
> peep2.c.193r.split2:;; exit block uses 2 [d2] 26 [SP] 27 [a11]
> peep2.c.195r.pro_and_epilogue:;; exit block uses 2 [d2] 15 [d15] 26 [SP] 27 [a11]
> peep2.c.196r.dse2:;; exit block uses 2 [d2] 15 [d15] 26 [SP] 27 [a11]
> peep2.c.196r.dse2:;; exit block uses 2 [d2] 15 [d15] 26 [SP] 27 [a11]

Oh, that helps a lot: this is what df_get_exit_block_use_set says:

  if (HAVE_epilogue && epilogue_completed)
    {
      /* Mark all call-saved registers that we actually used.  */
      for (i = 0; i < FIRST_PSEUDO_REGISTER; i++)
        if (df_regs_ever_live_p (i) && !LOCAL_REGNO (i)
            && !TEST_HARD_REG_BIT (regs_invalidated_by_call, i))
          bitmap_set_bit (exit_block_uses, i);
    }

These uses are needed to ensure that, if restores of call-saved registers are explicitly in the RTL for the epilogue, they are not considered dead. However, they make d15 live.

It looks like you need to represent the definition of call-saved registers explicitly in the RTL for the epilogue. These definitions will make d15 dead as you expect, and you need them even if they will not output any assembly.

In this sense, this is a backend bug: you are not making the RTL stream accurate, and it's biting you. I acknowledge it's quite tricky; from what I understood it happens because of a peculiarity of your architecture with respect to callee-save registers (d15 is automatically saved/restored, or something like that).

Paolo
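
The effect described above can be reproduced with a toy model of backward liveness over the single basic block from the dumps. This is only a sketch under stated assumptions, not GCC's df code: `struct insn`, `live_after`, the register masks and the placement of the "restore" pseudo-instruction before the sibcall are all hypothetical names and modelling choices made for illustration. Registers are bits in a 32-bit mask, and each instruction is reduced to its def and use sets.

```c
#include <assert.h>
#include <stdint.h>

/* Bit N stands for hard register N (only N < 32 is modelled here).  */
#define REG(n) (UINT32_C(1) << (n))

/* Registers the dump lists as "invalidated by call": d0-d7 and regs 18-23.  */
#define CALL_CLOBBERED (UINT32_C(0xff) | (UINT32_C(0x3f) << 18))

/* "lr out" before pro_and_epilogue: d2, SP, a11.  */
#define LIVE_OUT_IRA   (REG(2) | REG(26) | REG(27))
/* "lr out" once d15 has been added to the exit block uses.  */
#define LIVE_OUT_PEEP2 (LIVE_OUT_IRA | REG(15))

/* Hypothetical, heavily simplified instruction: just defs and uses.  */
struct insn {
    uint32_t defs;
    uint32_t uses;
};

/* The block from the dump: insns 22, 23, 21 and the sibcall 10.  */
const struct insn block[] = {
    { REG(15), REG(4) },                    /* insn 22: d15 = d4 & const */
    { REG(15), REG(15) | REG(4) },          /* insn 23: d15 = d15 ^ d4   */
    { REG(4), REG(15) },                    /* insn 21: d4 = d15         */
    { CALL_CLOBBERED, REG(4) | REG(26) },   /* call_insn 10              */
};

/* Same block with an explicit definition of d15 standing in for the
   epilogue restore (assumed here to sit just before the sibcall).  */
const struct insn block_with_restore[] = {
    { REG(15), REG(4) },
    { REG(15), REG(15) | REG(4) },
    { REG(4), REG(15) },
    { REG(15), 0 },                         /* restore of d15: a def, no use */
    { CALL_CLOBBERED, REG(4) | REG(26) },
};

/* Return the set of registers live immediately after insns[idx], obtained
   by walking backwards from the end of the block with the usual transfer
   function live = (live & ~defs) | uses.  */
uint32_t
live_after(const struct insn *insns, int n, int idx, uint32_t live_out)
{
    uint32_t live = live_out;

    assert(idx >= 0 && idx < n);
    for (int i = n - 1; i > idx; i--)
        live = (live & ~insns[i].defs) | insns[i].uses;
    return live;
}
```

Calling `live_after (block, 4, 2, LIVE_OUT_PEEP2)` gives the set after insn 21 and contains d15, because nothing between insn 21 and the end of the block defines it; with `LIVE_OUT_IRA`, or with `block_with_restore`, d15 drops out of that set, which is the state a peephole2 pattern would need in order to treat d15 as dead.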
Re: Bug in expand_builtin_setjmp_receiver ?
Hi Jon,

On Tuesday 26 October 2010 at 13:07 +0100, Jon Beniston wrote:
> What problems do you have building lm32-elf? If you let me know, I can try
> to look in to them.

If you have access to a lm32 toolchain, can you test if gcc.c-torture/execute/built-in-setjmp.c passes at different optimization levels?

Many thanks,
Fred
Re: Bug in expand_builtin_setjmp_receiver ?
On Tue, Oct 26, 2010 at 01:07:26PM +0100, Jon Beniston wrote:
> > lm32 has a gdb simulator available, so it should be fairly easy to write
> > a board file for it if one doesn't already exist.
> >
> > Unfortunately, building lm32-elf is broken in several different ways
> > right now.
>
> What problems do you have building lm32-elf? If you let me know, I can try
> to look in to them.

At least INCOMING_RETURN_ADDR_RTX and TARGET_EXCEPT_UNWIND_INFO need to be defined, as in the patch below (not sure about the definition of INCOMING_RETURN_ADDR_RTX). I think even with those defined, compiling libgcc ICEs, though I don't remember the details.

-Nathan

diff --git a/gcc/config/lm32/lm32.c b/gcc/config/lm32/lm32.c
index 671f0e1..b355309 100644
--- a/gcc/config/lm32/lm32.c
+++ b/gcc/config/lm32/lm32.c
@@ -100,6 +100,9 @@ static void lm32_option_override (void);
 #undef TARGET_LEGITIMATE_ADDRESS_P
 #define TARGET_LEGITIMATE_ADDRESS_P lm32_legitimate_address_p
 
+#undef TARGET_EXCEPT_UNWIND_INFO
+#define TARGET_EXCEPT_UNWIND_INFO sjlj_except_unwind_info
+
 struct gcc_target targetm = TARGET_INITIALIZER;
 
 /* Current frame information calculated by lm32_compute_frame_size. */
diff --git a/gcc/config/lm32/lm32.h b/gcc/config/lm32/lm32.h
index b0c2d59..4c63e94 100644
--- a/gcc/config/lm32/lm32.h
+++ b/gcc/config/lm32/lm32.h
@@ -249,6 +249,8 @@ enum reg_class
 #define ARG_POINTER_REGNUM FRAME_POINTER_REGNUM
 
+#define INCOMING_RETURN_ADDR_RTX gen_rtx_REG (SImode, RA_REGNUM)
+
 #define RETURN_ADDR_RTX(count, frame) \
   lm32_return_addr_rtx (count, frame)
Re: Constant propagation and CSE
Hi Jeff,

On 26 October 2010 16:22, Jeff Law wrote:
> There is currently no pass which does "un-cse"; however, using insn
> splitting and operand costing and suitable insn constraints/predicates you
> can usually arrange to avoid expensive constants in places where it makes
> sense.

The thing is, the cprop pass doesn't look at insn costs while doing its job AFAICS. I'm interested to see how insn splitting can help with this, if you don't mind explaining.

> Perhaps if you gave us more information about the target and the situations
> you're trying to avoid we could give more specific advice.

The problem is quite simple: if a target allows big immediates in its instructions, the cprop pass can cause considerable code-size inflation on some kinds of code. Imagine that a 64-bit constant is propagated into every iteration of an unrolled loop. For that case, it would be much more cache friendly to have the constant in register(s) and not propagate it. Do not focus too much on the 'loop unrolling' thing; it's just an example of one kind of code that uses the same constant a bunch of times.

The tradeoff is hard to decide. If you manage to factor out the constants so that they're held in registers, then you might significantly increase the register pressure, especially if they are used at distant points of the function... and thus degrade the performance. The thing is, I don't see any knob to tweak that tradeoff, hence my question.

As I mentioned, I managed to achieve part of what I wanted by making the big-immediate variants illegal during the cprop pass, but that seems like too much of a hack.

Thanks!
Fred
Sorry for abrupt departure
Sorry for having to bail out without saying goodbye to anyone or participating in the GCC Steering Committee panel; I got word from my attorney that the affidavit that he needed did not get properly transferred this morning. After repeated attempts I gave up on the hotel fax service (Les Suites has got to have the worst wifi & fax services I've ever used or tried to use).

For reference, the best way to handle getting out a fax while in the Ottawa airport is to go to lost and found in the baggage claim area. Under no circumstances should you go through customs first ;-)

Jeff
Re: Constant propagation and CSE
On 10/27/10 12:15, Frederic Riss wrote:
> Hi Jeff,
>
> On 26 October 2010 16:22, Jeff Law wrote:
>> There is currently no pass which does "un-cse"; however, using insn
>> splitting and operand costing and suitable insn constraints/predicates you
>> can usually arrange to avoid expensive constants in places where it makes
>> sense.
>
> The thing is the cprop pass doesn't look at insn costs while doing its
> job AFAICS. I'm interested to see how insn splitting can help with
> this if you don't mind explaining.

Certainly the SSA propagators don't use costing information; CSE, on the other hand, does use costing info, but not always in the way you might think (addresses in memory references, for example, are often backwards from what you might think).

> The problem is quite simple: if a target allows big immediates in its
> instructions, the cprop pass can generate quite an inflation in code
> size on some kinds of code. Imagine that a 64-bit constant is
> propagated in every iteration of an unrolled loop. For that case, it
> would be much more cache friendly to have the constant in register(s)
> and not propagate it. Do not focus too much on the 'loop unrolling'
> thing, it's just an example of one kind of code that uses the same
> constant a bunch of times.

This is a common problem. For constants, it's generally preferable to first load them into registers and allow CSE to try and commonize the large constants. Combine then will propagate single-use constants into their use, leaving the multi-use constants commonized.

Register pressure isn't as much of a problem as you might think because constants are relatively easy to rematerialize when there is excess register pressure.

Jeff
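
The code-size side of this tradeoff is plain arithmetic, and a small sketch may make the "single use versus multiple uses" distinction concrete. Everything here is hypothetical: `struct size_model`, its fields and the helper names are invented for illustration, the byte sizes would have to come from a real target, and register pressure is deliberately left out of the model.

```c
#include <assert.h>

/* Hypothetical per-target sizes in bytes; not taken from any real backend.  */
struct size_model {
    int insn_with_reg;  /* instruction reading the constant from a register   */
    int insn_with_imm;  /* same instruction with the constant as an immediate */
    int load_const;     /* one-off load of the constant into a register       */
};

/* Total bytes when the constant is propagated into every use.  */
int
size_if_propagated(const struct size_model *m, int uses)
{
    return uses * m->insn_with_imm;
}

/* Total bytes when the constant is loaded once and shared by all uses.  */
int
size_if_commonized(const struct size_model *m, int uses)
{
    return m->load_const + uses * m->insn_with_reg;
}

/* Nonzero when keeping the constant in a register is strictly smaller.
   Register pressure is ignored, which is the part the thread says is
   hard to decide.  */
int
prefer_register(const struct size_model *m, int uses)
{
    assert(uses >= 0);
    return size_if_commonized(m, uses) < size_if_propagated(m, uses);
}
```

With a made-up model of 4-byte register forms, 12-byte immediate forms and a 12-byte load, a single use is smaller when propagated (12 bytes against 16), while eight uses are smaller when commonized (44 bytes against 96), which matches the split between single-use and multi-use constants described above.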
Re: Constant propagation and CSE
On 27 October 2010 21:21, Jeff Law wrote:
> On 10/27/10 12:15, Frederic Riss wrote:
>> On 26 October 2010 16:22, Jeff Law wrote:
>>
>> The thing is the cprop pass doesn't look at insn costs while doing its
>> job AFAICS. I'm interested to see how insn splitting can help with
>> this if you don't mind explaining.
>
> Certainly the SSA propagators don't use costing information; CSE on the
> other hand does use costing info, but not always in the way you might
> think (addresses in memory references for example are often backwards from
> what you might think)

Care to expand on that use of the costing info? I followed the code path in gcse.c, and e.g. do_local_cprop doesn't seem to care about costs. BTW, I'm on the 4.5 branch if that matters.

> This is a common problem. For constants, it's generally preferable to first
> load them into registers and allow CSE to try and commonize the large
> constants. Combine then will propagate single use constants into their use,
> leaving the multi-use constants commonized.

That's the situation I managed to get by preventing cprop from propagating large constants. Once propagated, no CSE pass will extract and commonize them again.

> Register pressure isn't as much of a problem as you might think because
> constants are relatively easy to rematerialize when there is excess register
> pressure.

That's what I thought, but I have yet to see GCC split the liveness of the commonized constants. When should that be done?

Many thanks,
Fred