Re: Should -Wjump-misses-init be in -Wall?

2009-06-23 Thread Gabriel Dos Reis
On Tue, Jun 23, 2009 at 12:43 AM, Alan Modra wrote:
> On Mon, Jun 22, 2009 at 09:45:52PM -0400, Robert Dewar wrote:
>> Joe Buck wrote:
>>> I think that this should be the standard: a warning belongs in -Wall if
>>> it tends to expose bugs.  If it doesn't, then it's just somebody's idea
>>> of proper coding style but with no evidence in support of its correctness.
>>>
>>> A -Wall warning should expose bugs, and should be easy to silence in
>>> correct code.
>>
>> To understand what you are saying, we need to know what bug means, since
>> it can have two meanings:
>>
>> 1. An actual error, that could show up right now in certain circumstances
>>
>> 2. An error resulting in undefined behavior in the standard, but
>> for the current version of gcc, it cannot actually cause any real
>> misbehavior, but some future version of gcc might take advantage
>> of this error status and do something weird.
>>
>> For me it is enough if warnings expose case 2 situations, even if
>> they find few if any case 1 situations.
>
> I agree, but I think this warning should be in -Wc++-compat, not -Wall
> or even -Wextra.  Why?  I'd argue the warning is useless for C code,
> unless you care about C++ style.

I do not think it is useless for C99 codes because C99 allows
C++ style declarations/initialization in the middle of a block.

-- Gaby


Re: gcc 4.3.2 vectorizes access to volatile array

2009-06-23 Thread Andrew Haley
Till Straumann wrote:

> Andrew Haley wrote:
>> H.J. Lu wrote:
>>>
>>> That may be too old.  Gcc 4.3.4 revision 148680
>>> generates:
>>>
>>> .L5:
>>>     leaq    (%rsi,%rdx), %rax
>>>     movzbl  (%rax), %eax
>>>     movb    %al, (%rdi,%rdx)
>>>     addq    $1, %rdx
>>>     cmpq    $32, %rdx
>>>     jne     .L5
>>> 
>>
>> 4.4.0 20090307 generates truly bizarre code, though:

> That's roughly the same that 4.3.3 produces.
> I had not quoted the full assembly code but just
> the essential part that is executed when
> source and destination are 4-byte aligned
> and are more than 4-bytes apart.
> Otherwise (not longword-aligned) the
> (correct) code labeled '.L5' is executed.

Right.  I suspect this is just a matter of finding the place where the
vectorization happens and turning it off if source or dest are volatile.
Should be easy enough.

Andrew.
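
The test case itself is not quoted in this part of the thread. As a rough sketch only, a loop of the following shape would match the 32-iteration byte copy visible in the assembly above. The function name and the COUNT macro are assumptions made for illustration. The thread's expectation is that, because both pointers are volatile-qualified, the compiler keeps the byte-wise loop rather than merging the accesses into wider vector loads and stores.

```c
#include <stddef.h>

#define COUNT 32  /* assumed; matches the loop bound in the quoted assembly */

/* Copy COUNT bytes one at a time.  Both pointers are volatile-qualified,
   so each element is read and written through a volatile access.  */
void
copy_volatile (volatile unsigned char *dst,
               const volatile unsigned char *src)
{
  for (size_t i = 0; i < COUNT; i++)
    dst[i] = src[i];
}
```

Comparing the generated assembly with and without the volatile qualifiers is one way to see whether the vectorizer has been applied to the loop.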


Re: Unnecessary regmoves in modulo scheduler?

2009-06-23 Thread Revital1 Eres
Hello Bingfeng,

> I found a true register dependency is always accompanied by a
> cross-iteration anti dependency.

When the -fmodulo-sched-allow-regmoves flag is set, some anti-dep edges are
not created.
Please see the add_cross_iteration_register_deps () function in ddg.c.

HTH,
Revital

> This should guarantee the true dependence cannot have lifetime
> longer than II.
>
> A --> B true dep (e.g., insn 36 -> insn 38 in the DDG)
> B --> A anti dep with distance = 1 (e.g., insn 38 -> insn 36 in the DDG)
>
> The second dependency should lead to: Sched_Time(A) + II >(=) Sched_Time(B)
> which means Sched_Time(B) - Sched_Time(A) <(=) II and no need for reg move.
>
> Similarly, an anti register dependency is always accompanied by a
> cross-iteration true dependency.
>
> A --> B anti dep (e.g., insn 36 -> insn 53 in the DDG)
> B --> A true dep with distance = 1 (e.g., insn 53 -> insn 36 in the DDG)
> We can reach a similar conclusion.
>
> I wonder what other scenarios would require generating reg moves. Am I
> missing some obvious points? Thanks in advance.
>
>
> Cheers,
> Bingfeng Mei
>
> Broadcom UK
>
>
> DDG from sms-6.c
> SMS loop num: 1, file: sms-6.c, line: 9
> Node num: 0
> (insn 36 35 37 3 sms-6.c:11 (set (reg:SI 113)
> (mem:SI (reg:SI 108 [ ivtmp.44 ]) [4 S4 A32])) 184 {*movwsi} (nil))
> OUT ARCS:  [36 -(A,0,0)-> 53]  [36 -(T,6,0)-> 38]
> IN ARCS:  [38 -(A,0,1)-> 36]  [53 -(T,1,1)-> 36]
> Node num: 1
> (insn 37 36 38 3 sms-6.c:11 (set (reg:SI 114)
> (mem:SI (reg:SI 109 [ ivtmp.42 ]) [5 S4 A32])) 184 {*movwsi} (nil))
> OUT ARCS:  [37 -(A,0,0)-> 52]  [37 -(T,6,0)-> 38]
> IN ARCS:  [38 -(A,0,1)-> 37]  [52 -(T,1,1)-> 37]
> Node num: 2
> (insn 38 37 39 3 sms-6.c:11 (set (reg:SI 115)
> (mult:SI (reg:SI 113)
> (reg:SI 114))) 262 {mulsi3} (expr_list:REG_DEAD (reg:SI 114)
> (expr_list:REG_DEAD (reg:SI 113)
> (nil
> OUT ARCS:  [38 -(A,0,1)-> 37]  [38 -(A,0,1)-> 36]  [38 -(T,8,0)-> 39]
> IN ARCS:  [39 -(A,0,1)-> 38]  [36 -(T,6,0)-> 38]  [37 -(T,6,0)-> 38]
> Node num: 3
> (insn 39 38 40 3 sms-6.c:11 (set (mem:SI (reg:SI 107 [ ivtmp.45 ]) [3 S4 A32])
> (reg:SI 115)) 184 {*movwsi} (expr_list:REG_DEAD (reg:SI 115)
> (nil)))
> OUT ARCS:  [39 -(A,0,1)-> 38]  [39 -(O,1,0)-> 69]  [39 -(A,0,0)-> 54]
> IN ARCS:  [54 -(T,1,1)-> 39]  [51 -(O,1,1)-> 39]  [47 -(O,1,1)-> 39]  [43 -(O,1,1)-> 39]  [38 -(T,8,0)-> 39]
> Node num: 4
> (insn 40 39 41 3 sms-6.c:12 (set (reg:SI 116)
> (mem:SI (plus:SI (reg:SI 108 [ ivtmp.44 ])
> (const_int 4 [0x4])) [4 S4 A32])) 184 {*movwsi} (nil))
> OUT ARCS:  [40 -(A,0,0)-> 53]  [40 -(T,6,0)-> 42]
> IN ARCS:  [42 -(A,0,1)-> 40]  [53 -(T,1,1)-> 40]
> Node num: 5
> (insn 41 40 42 3 sms-6.c:12 (set (reg:SI 117)
> (mem:SI (plus:SI (reg:SI 109 [ ivtmp.42 ])
> (const_int 4 [0x4])) [5 S4 A32])) 184 {*movwsi} (nil))
> OUT ARCS:  [41 -(A,0,0)-> 52]  [41 -(T,6,0)-> 42]
> IN ARCS:  [42 -(A,0,1)-> 41]  [52 -(T,1,1)-> 41]
> Node num: 6
> (insn 42 41 43 3 sms-6.c:12 (set (reg:SI 118)
> (mult:SI (reg:SI 116)
> (reg:SI 117))) 262 {mulsi3} (expr_list:REG_DEAD (reg:SI 117)
> (expr_list:REG_DEAD (reg:SI 116)
> (nil
> OUT ARCS:  [42 -(A,0,1)-> 41]  [42 -(A,0,1)-> 40]  [42 -(T,8,0)-> 43]
> IN ARCS:  [43 -(A,0,1)-> 42]  [40 -(T,6,0)-> 42]  [41 -(T,6,0)-> 42]
> Node num: 7
> (insn 43 42 44 3 sms-6.c:12 (set (mem:SI (plus:SI (reg:SI 107 [ ivtmp.45 ])
> (const_int 4 [0x4])) [3 S4 A32])
> (reg:SI 118)) 184 {*movwsi} (expr_list:REG_DEAD (reg:SI 118)
> (nil)))
> OUT ARCS:  [43 -(A,0,1)-> 42]  [43 -(O,1,0)-> 69]  [43 -(A,0,0)-> 54]  [43 -(O,1,1)-> 39]
> IN ARCS:  [54 -(T,1,1)-> 43]  [51 -(O,1,1)-> 43]  [47 -(O,1,1)-> 43]  [42 -(T,8,0)-> 43]
> Node num: 8
> (insn 44 43 45 3 sms-6.c:13 (set (reg:SI 119)
> (mem:SI (plus:SI (reg:SI 108 [ ivtmp.44 ])
> (const_int 8 [0x8])) [4 S4 A32])) 184 {*movwsi} (nil))
> OUT ARCS:  [44 -(A,0,0)-> 53]  [44 -(T,6,0)-> 46]
> IN ARCS:  [46 -(A,0,1)-> 44]  [53 -(T,1,1)-> 44]
> Node num: 9
> (insn 45 44 46 3 sms-6.c:13 (set (reg:SI 120)
> (mem:SI (plus:SI (reg:SI 109 [ ivtmp.42 ])
> (const_int 8 [0x8])) [5 S4 A32])) 184 {*movwsi} (nil))
> OUT ARCS:  [45 -(A,0,0)-> 52]  [45 -(T,6,0)-> 46]
> IN ARCS:  [46 -(A,0,1)-> 45]  [52 -(T,1,1)-> 45]
> Node num: 10
> (insn 46 45 47 3 sms-6.c:13 (set (reg:SI 121)
> (mult:SI (reg:SI 119)
> (reg:SI 120))) 262 {mulsi3} (expr_list:REG_DEAD (reg:SI 120)
> (expr_list:REG_DEAD (reg:SI 119)
> (nil
> OUT ARCS:  [46 -(A,0,1)-> 45]  [46 -(A,0,1)-> 44]  [46 -(T,8,0)-> 47]
> IN ARCS:  [47 -(A,0,1)-> 46]  [44 -(T,6,0)-> 46]  [45 -(T,6,0)-> 46]
> Node num: 11
> (insn 47 46 48 3 sms-6.c:13 (set (mem:SI (plus:SI (reg:SI 107 [ ivtmp.45 ])
> (const_int 8 [0x8])) [3 S4 A32])
> (reg:SI 121)) 184 {*movwsi} (expr_list:REG_DEAD (re

RE: Unnecessary regmoves in modulo scheduler?

2009-06-23 Thread Bingfeng Mei
Thanks. I didn't notice the option. Which approach is generally better
according to your experience? Producing regmoves or more dependencies?


RE: Unnecessary regmoves in modulo scheduler?

2009-06-23 Thread Revital1 Eres
Hello,

>
> Thanks. I didn't notice the option. Which approach is generally better
> according to your experience? Producing regmoves or more dependencies?

I think it depends on the target. Having reg-moves could increase
register pressure so using it on targets with small set of registers
could be painful.

Revital
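
The inequality discussed in this thread can be written down as a small check. This is only a sketch: the struct below is a simplified, hypothetical stand-in and not GCC's ddg_edge, and the initiation interval used in the usage note is an assumed value. It relies on the arc notation in the DDG dump being (type, latency, distance).

```c
#include <stdbool.h>

/* Hypothetical, simplified dependence edge; not GCC's ddg_edge.  */
struct dep_edge
{
  int src_time;   /* schedule time of the source insn */
  int dst_time;   /* schedule time of the destination insn */
  int latency;    /* cycles the destination must wait after the source */
  int distance;   /* iteration distance of the dependence */
};

/* A modulo schedule with initiation interval II honors an edge when
   dst_time + distance * II >= src_time + latency.  */
static bool
edge_satisfied (const struct dep_edge *e, int ii)
{
  return e->dst_time + e->distance * ii >= e->src_time + e->latency;
}
```

For the anti dependency B --> A with latency 0 and distance 1, the check reduces to Sched_Time(A) + II >= Sched_Time(B), so the value produced by A lives for at most II cycles. When that edge is not created, as Revital describes for -fmodulo-sched-allow-regmoves, nothing bounds the lifetime this way.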


-funswitch-loops slowdown. Is it possible to change settings in the backend?

2009-06-23 Thread Bernd Roesch
Hello,

The -funswitch-loops option does not seem to be good for speed on gcc 4.3.0
and above. I tested with m68k gcc.

It generates much larger code (wma123), and the code is slower in many cases
(try ffmpeg H264 decoding). I got a report from an Athlon 2600+ with
single-channel RAM running an Amiga 68k emulator.

On my own system, an AMD64 3000+ with dual-channel RAM running an Amiga
emulator, -funswitch-loops only causes larger files but no slowdown.

I guess that on a real 68k/ColdFire CPU without a second-level cache,
-funswitch-loops is even less optimal.
gcc 3.4.0 also has this option enabled at -O3, or am I wrong? There the
speed is better and the code is smaller.

Is there a way to tweak some values in the backend for a specific CPU so that
-funswitch-loops works as it does in 3.4.0 (maybe not unrolling so many
loops)?

For now, as far as I can tell from the speed reports I get, the best solution
for speed is to disable -funswitch-loops (H264 decoding on the system with
single-channel RAM is then the same as or a little faster than the 3.4.0
build).

Here are some values that also show a slowdown with compilers 4.2.4 and
4.3.0, but on x86:

http://multimedia.cx/eggs/compiler-performance-profiling-with-ffmpeg/

Regards
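
For readers unfamiliar with the transformation, here is a hand-written before/after sketch of loop unswitching. The function names and the loop body are invented for illustration, and this is not GCC's actual output. It does show why code size grows: the loop body is duplicated once per outcome of the loop-invariant test.

```c
#include <stddef.h>

/* Original form: the loop-invariant test on `scale` runs every iteration.  */
static void
apply_plain (int *out, const int *in, size_t n, int scale)
{
  for (size_t i = 0; i < n; i++)
    {
      if (scale)
        out[i] = in[i] * 2;
      else
        out[i] = in[i];
    }
}

/* Hand-unswitched form: the test is hoisted out of the loop and the
   loop is duplicated, once for each outcome of the test.  */
static void
apply_unswitched (int *out, const int *in, size_t n, int scale)
{
  if (scale)
    {
      for (size_t i = 0; i < n; i++)
        out[i] = in[i] * 2;
    }
  else
    {
      for (size_t i = 0; i < n; i++)
        out[i] = in[i];
    }
}
```

On a CPU with a small instruction cache, the duplicated loops may cost more than the removed branch saves, which would be consistent with the reports above.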



[OTish] Proving compiler algorithms implement same semantics as language specs? [was Re: [PATCH][RFC] Re-implement restrict support]

2009-06-23 Thread Dave Korn
[ redirected away from the -patches list because I want to ask a more general
theoretical question about compiler development ]

Richard Guenther wrote:

> During points-to pointer equivalence sets are computed by adding
> special RESTRICT heap-variables to points-to sets of targets of
> pointer conversions to restrict, global restrict qualified pointers
> and restrict qualified pointer arguments.
> 
> A RESTRICT in the points-to set of a restrict qualified pointer
> acts as a filter for NONLOCAL and ANYTHING.  The RESTRICT in the
> points-to sets make pointers based on each other conflict,
> non-restrict qualified pointers conflict with restrict qualified
> pointers if they point to anonymous memory (NONLOCAL or ANYTHING)
> or otherwise.

> Comments?  Holes in my treatment of restrict?

  I'd guess there has to be some way in formal logic or propositional calculus
by which we could take descriptions such as Richard's above, and the
description of restrict semantics in the standard, and reduce them each to a
pile of propositions that we could feed into a theorem-proving system and get
it to prove they were identical.

  But I'm guessing: this kind of area is a million miles outside anything I'm
familiar with.  However, we've got a load of very clever academic types on the
list here, so I thought I'd throw it open for discussion.  There have been a
bunch of papers and a few big projects in academia on provable compiler
correctness, but they all seem very ambitious and not like anything we could
make an applied use of in GCC; but is there some simpler, practical and
well-understood tool-set already existing that we could put into use for small
jobs such as the above?

cheers,
  DaveK


Re: [OTish] Proving compiler algorithms implement same semantics as language specs? [was Re: [PATCH][RFC] Re-implement restrict support]

2009-06-23 Thread Richard Guenther
On Tue, Jun 23, 2009 at 4:53 PM, Dave Korn wrote:
> [ redirected away from the -patches list because I want to ask a more general
> theoretical question about compiler development ]
>
> Richard Guenther wrote:
>
>> During points-to pointer equivalence sets are computed by adding
>> special RESTRICT heap-variables to points-to sets of targets of
>> pointer conversions to restrict, global restrict qualified pointers
>> and restrict qualified pointer arguments.
>>
>> A RESTRICT in the points-to set of a restrict qualified pointer
>> acts as a filter for NONLOCAL and ANYTHING.  The RESTRICT in the
>> points-to sets make pointers based on each other conflict,
>> non-restrict qualified pointers conflict with restrict qualified
>> pointers if they point to anonymous memory (NONLOCAL or ANYTHING)
>> or otherwise.
>
>> Comments?  Holes in my treatment of restrict?
>
>  I'd guess there has to be some way in formal logic or propositional calculus
> by which we could take descriptions such as Richard's above, and the
> description of restrict semantics in the standard, and reduce them each to a
> pile of propositions that we could feed into a theorem-proving system and get
> it to prove they were identical.
>
>  But I'm guessing: this kind of area is a million miles outside anything I'm
> familiar with.  However, we've got a load of very clever academic types on the
> list here, so I thought I'd throw it open for discussion.  There have been a
> bunch of papers and a few big projects in academia on provable compiler
> correctness, but they all seem very ambitious and not like anything we could
> make an applied use of in GCC; but is there some simpler, practical and
> well-understood tool-set already existing that we could put into use for small
> jobs such as the above?

We should have proper testsuite coverage for language features.
In this case the XPASS on the vector testcase shows me a major
flaw in my implementation:

int a[N];

__attribute__ ((noinline)) int
foo (int * __restrict__ b, int k){
...

with the proposed patch b would not alias a.  Maybe desirable
for optimization but certainly not what the C language suggests.

Thus I'm back to a separate representation and oracle treatment
for restrict.  Bah.  That'll cost.

Richard.
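
Richard's test case is elided above, so the following is only a sketch of the kind of call that makes the point. A restrict-qualified parameter may legitimately be passed the address of a global array, so an alias analysis cannot conclude that b never points to a. The size N, the body of foo and the stored value are assumptions made for illustration.

```c
#define N 4   /* assumed size; the original value is not shown */

int a[N];

/* Inside foo the array is reached only through b, so calling foo (a, k)
   is well defined even though b is restrict-qualified.  */
int
foo (int *restrict b, int k)
{
  b[k] = 42;
  return b[k];
}
```

After a call such as foo (a, 2), the caller must observe a[2] == 42. An implementation that treats b and a as never aliasing could miss that store.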


Re: [OTish] Proving compiler algorithms implement same semantics as language specs? [was Re: [PATCH][RFC] Re-implement restrict support]

2009-06-23 Thread Joseph S. Myers
On Tue, 23 Jun 2009, Dave Korn wrote:

>   I'd guess there has to be some way in formal logic or propositional calculus
> by which we could take descriptions such as Richard's above, and the
> description of restrict semantics in the standard, and reduce them each to a
> pile of propositions that we could feed into a theorem-proving system and get
> it to prove they were identical.

Norrish's thesis  
formalised a subset of C90 in HOL, but I don't know what developments 
there have been in this area since 1998.  It may not always be clear what 
the standard means, and I also quote Norrish:

This ... tells us nothing about the quality of our semantics with 
respect to the original specification  Better would be to have the 
specification of the semantics inspected by another individual who was 
both familiar with the fine details of the ISO standard, and the 
techniques of operational semantics.  Unfortunately, such people are 
hard to find, which is rather an indictment of the divergence between 
theory and practice in computer science.

Theorem proving systems have been successfully used for IEEE 
floating-point algorithms, where the intended semantics are more precisely 
defined and better understood.  I believe that following the Pentium FDIV 
bug Intel now uses formal verification for the floating-point algorithms 
in its processors.

-- 
Joseph S. Myers
jos...@codesourcery.com


Re: Should -Wjump-misses-init be in -Wall?

2009-06-23 Thread Joe Buck

On Tue, Jun 23, 2009 at 12:43 AM, Alan Modra wrote:
> > ..., but I think this warning should be in -Wc++-compat, not -Wall
> > or even -Wextra.  Why?  I'd argue the warning is useless for C code,
> > unless you care about C++ style.

On Tue, Jun 23, 2009 at 12:35:48AM -0700, Gabriel Dos Reis wrote:
> I do not think it is useless for C99 codes because C99 allows
> C++ style declarations/initialization in the middle of a block.

But if the initialization is skipped and the variable is then used,
won't we get an uninitialized-variable warning?


Re: OpenCL inquire

2009-06-23 Thread Phil Pratt-Szeliga
The following emails were exchanged between Soufiane and me.
Paolo Bonzini asked that I cc the emails to the list.

=
Hi Soufiane,

I am working on OpenCL for Google Summer of Code.  I just emailed my
mentor asking about what the rules of collaboration are while I am
being paid under gsoc.  (It is a little fuzzy on the website: Q: Can a
group apply for and work on a single proposal? A: No, only an
individual may work on a given project. Of course, students should
feel free to collaborate with others to accomplish their project
goals.)

You can certainly do work on your own and I would love to work with you.
I will let you know what my mentor says.

Sincerely,
Phil
=
Hello Phil,

I am happy to work with you. Just tell me: which parser do you use?
Do you convert your parsed OpenCL grammar to ATI IL or Stream?
Can I start from where you are, or should I do all the work from scratch?

Soufiane


Hi Soufiane,

My gsoc project is not making an OpenCL C compiler, but instead the
supporting infrastructure.  I have made rough draft functions that
support clCreateContext, clCreateCommandQueue, clCreateMemoryObject,
clCreateProgramWithBinaries, clCreateKernel, clEnqueueNDRangeKernel
and clEnqueueBufferRead.  Without an OpenCL C compiler, I use
clCreateProgramWithBinaries to load compiled code into the runtime. So
far this works to load dynamic link libraries on the cpu (without
knowing the number of kernel arguments at compile time).  I am making
a structure where there are various drivers for all the different
devices and I am focusing on CPU and Cell.  So if you wanted to make a
device driver to load and execute the ATI binaries, that would be a
big help (as long as it is okay with google).

We use git for this project.  There is barely any documentation right
now as I am developing things still (and things can change a little if
you need info for the driver).  The code is in github:
http://github.com/pcpratts/gcc_opencl/tree/master

The code I was talking about is in cl/devices/cpu/generic/driver.c

To run the code you will need the dependencies in the NOTES file.
Also you will need to put the output of
# lshw -xml > lshw_output.xml

in the root folder of the project.

Phil


Re: Should -Wjump-misses-init be in -Wall?

2009-06-23 Thread Gabriel Dos Reis
On Tue, Jun 23, 2009 at 11:12 AM, Joe Buck wrote:
>
> On Tue, Jun 23, 2009 at 12:43 AM, Alan Modra wrote:
>> > ..., but I think this warning should be in -Wc++-compat, not -Wall
>> > or even -Wextra.  Why?  I'd argue the warning is useless for C code,
>> > unless you care about C++ style.
>
> On Tue, Jun 23, 2009 at 12:35:48AM -0700, Gabriel Dos Reis wrote:
>> I do not think it is useless for C99 codes because C99 allows
>> C++ style declarations/initialization in the middle of a block.
>
> But if the initialization is skipped and the variable is then used,
> won't we get an uninitialized-variable warning?

Did we get any in the cases Ian reported?

-- Gaby


Re: Should -Wjump-misses-init be in -Wall?

2009-06-23 Thread Joe Buck

On Tue, Jun 23, 2009 at 11:12 AM, Joe Buck wrote:
> > But if the initialization is skipped and the variable is then used,
> > won't we get an uninitialized-variable warning?

On Tue, Jun 23, 2009 at 09:32:51AM -0700, Gabriel Dos Reis wrote:
> Did we get any in the cases Ian reported?

Note the second condition I gave: "and the variable is then used".
The new warning just tests the first part: the initialization is
skipped.
 


Re: Should -Wjump-misses-init be in -Wall?

2009-06-23 Thread Paolo Bonzini

Gabriel Dos Reis wrote:
> On Tue, Jun 23, 2009 at 11:12 AM, Joe Buck wrote:
>> On Tue, Jun 23, 2009 at 12:43 AM, Alan Modra wrote:
>>> ..., but I think this warning should be in -Wc++-compat, not -Wall
>>> or even -Wextra.  Why?  I'd argue the warning is useless for C code,
>>> unless you care about C++ style.
>>
>> On Tue, Jun 23, 2009 at 12:35:48AM -0700, Gabriel Dos Reis wrote:
>>> I do not think it is useless for C99 codes because C99 allows
>>> C++ style declarations/initialization in the middle of a block.
>>
>> But if the initialization is skipped and the variable is then used,
>> won't we get an uninitialized-variable warning?
>
> Did we get any in the cases Ian reported?

No, because they were all like this:

  goto fail;

  ...
  int a = ...;
  if (a)
    {
     fail:
      // does not use a
      return;
    }

  ...

This is a bit ugly, but it's valid C code.

The only dubious one in my opinion was a missing bracket around switch, 
which was also harmless but warrants a warning.  This however could be 
implemented as a separate warning going into -Wextra.


I don't think this warning can report anything that -Wuninitialized 
cannot report, so it should go in -Wc++-compat only.


Paolo
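
A self-contained variant of the pattern sketched above, for anyone who wants to try it. The function name, its parameters and the threshold are made up for illustration. It must be compiled as C, since C++ rejects a jump that bypasses an initialization.

```c
/* Returns -1 on the shared failure path, otherwise twice the input.  */
static int
process (int skip, int input)
{
  if (skip)
    goto fail;          /* jumps past the initialization of a */

  int a = input * 2;    /* C99 declaration in the middle of a block */
  if (a > 10)
    {
     fail:
      /* a is not read on this path, so skipping its initialization
         is harmless.  */
      return -1;
    }

  return a;
}
```

Here the variable is never read after the jump, which is the situation where -Wjump-misses-init has something to say but a use-based warning would have nothing to report.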


Re: Should -Wjump-misses-init be in -Wall?

2009-06-23 Thread Ian Lance Taylor
Paolo Bonzini  writes:

> I don't think this warning can report anything that -Wuninitialized
> cannot report, so it should go in -Wc++-compat only.

For the record, it can, as in when compiling this case without
optimization.  This is not a strong example by any means.

extern void f2 (int *);
int
f1 ()
{
  goto lab1;
  {
    int i = 1;
    f2 (&i);
  lab1:
    return i;
  }
}

Ian


TEMPLATE_PARM_PARAMETER_PACK redundant check in find_parameter_packs_r

2009-06-23 Thread Larry Evans

At pt.c:2462

http://gcc.gnu.org/viewcvs/trunk/gcc/cp/pt.c?revision=148666&view=markup

there's:

  switch (TREE_CODE (t))
    {
    case TEMPLATE_PARM_INDEX:
      if (TEMPLATE_PARM_PARAMETER_PACK (t))
        parameter_pack_p = true;
      break;

In gdb, macro exp shows:

(gdb) macro exp TEMPLATE_PARM_PARAMETER_PACK(t)
expands to: (((__extension__ ({ __typeof (t) const __t = (t); if (((enum 
tree_code) (__t)->base.code) != (TEMPLATE_PARM_INDEX)) tree_check_failed 
(__t, __FILE__, __LINE__, __FUNCTION__, (TEMPLATE_PARM_INDEX), 0); __t; 
}))->base.lang_flag_0))


which I don't understand because it seems to just recheck
that TREE_CODE is TEMPLATE_PARM_INDEX and if not issue an
error message(from tree_check_failed).  Why is there this
redundant check for TREE_CODE(t)== TEMPLATE_PARM_INDEX?
The only useful thing it does is return t->base.lang_flag_0.

What am I missing?

TIA.

-Larry



[gnu.org #456639] broken link in libstdc++ manual online

2009-06-23 Thread Rob Myers via RT
> [spoon.reloa...@gmail.com - Sun Jun 21 16:20:11 2009]:
> 
> In the page of the libstdc++ manual about the API documentation:
> http://gcc.gnu.org/onlinedocs/libstdc++/api.html
> the link to "the latest collection" goes to a "404 Not Found"

Hi. 

I'm forwarding you this email that was assigned to gnu webmasters for
the attention of your web page guys.

Thanks.

- Rob Myers.



Re: [gnu.org #456639] broken link in libstdc++ manual online

2009-06-23 Thread Jonathan Wakely
2009/6/23 Rob Myers via RT:
>>
>> In the page of the libstdc++ manual about the API documentation:
>> http://gcc.gnu.org/onlinedocs/libstdc++/api.html
>> the link to "the latest collection" goes to a "404 Not Found"
>
> Hi.
>
> I'm forwarding you this email that was assigned to gnu webmasters for
> the attention of your web page guys.

CC'd to the libstdc++ list


Re: TEMPLATE_PARM_PARAMETER_PACK redundant check in find_parameter_packs_r

2009-06-23 Thread Ian Lance Taylor
Larry Evans  writes:

> At pt.c:2462
>
> http://gcc.gnu.org/viewcvs/trunk/gcc/cp/pt.c?revision=148666&view=markup
>
> there's:
>
>   switch (TREE_CODE (t))
>     {
>     case TEMPLATE_PARM_INDEX:
>       if (TEMPLATE_PARM_PARAMETER_PACK (t))
>         parameter_pack_p = true;
>       break;
>
> In gdb, macro exp shows:
>
> (gdb) macro exp TEMPLATE_PARM_PARAMETER_PACK(t)
> expands to: (((__extension__ ({ __typeof (t) const __t = (t); if
> (((enum tree_code) (__t)->base.code) != (TEMPLATE_PARM_INDEX))
> tree_check_failed (__t, __FILE__, __LINE__, __FUNCTION__,
> (TEMPLATE_PARM_INDEX), 0); __t; }))->base.lang_flag_0))
>
> which I don't understand because it seems to just recheck
> that TREE_CODE is TEMPLATE_PARM_INDEX and if not issue an
> error message (from tree_check_failed).  Why is there this
> redundant check for TREE_CODE (t) == TEMPLATE_PARM_INDEX?
> The only useful thing it does is return t->base.lang_flag_0.
>
> What am I missing?

You may be missing the fact that TEMPLATE_PARM_PARAMETER_PACK only
expands to code which checks t->base.code if ENABLE_TREE_CHECKING is
defined.  This is the default in the development sources, but not on the
release branches.  On release branches (or if you configure with
--enable-checking=no), TEMPLATE_PARM_PARAMETER_PACK really does just
return base.lang_flag_0.

Or, you may be missing the fact that TEMPLATE_PARM_PARAMETER_PACK is not
always called in a case where it is so very obvious that it is being
called on a TEMPLATE_PARM_INDEX tree (e.g., the call in
template_parameter_pack_p).  Since it would always be wrong to call it
on a tree which is not a TEMPLATE_PARM_INDEX, but since there is no
compiler type checking (since everything has type tree) the code
verifies that the macro is being used correctly.

These kinds of runtime checks have caught many bugs early on and saved a
lot of debugging time.
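
As a rough illustration of the pattern described above, here is a minimal, self-contained sketch in C.  It is not the GCC implementation: struct node, node_check, NODE_CHECK, PARM_PACK_P and the SKETCH_* macros are hypothetical stand-ins for tree, tree_check_failed, the tree-check macros, TEMPLATE_PARM_PARAMETER_PACK and ENABLE_TREE_CHECKING.  Unlike the real check, which reports an internal error, the sketch only counts failures so that the effect is easy to observe.

```c
#include <stdio.h>

/* Hypothetical stand-in for tree codes.  */
enum node_code { PARM_INDEX, OTHER_NODE };

/* Hypothetical stand-in for a tree node: every node has the same C type,
   so the compiler cannot tell one kind of node from another.  */
struct node
{
  enum node_code code;
  unsigned flag_0 : 1;
};

/* Checking is on by default in this sketch, mirroring development builds.
   Define SKETCH_NO_CHECKING to mimic a build configured without checking.  */
#ifndef SKETCH_NO_CHECKING
#define SKETCH_CHECKING 1
#endif

/* Number of failed checks seen so far (assumption: the sketch counts
   failures instead of aborting compilation).  */
static int check_failures;

/* Report a misuse of an accessor and hand the node back unchanged.  */
static struct node *
node_check (struct node *n, enum node_code expected,
            const char *file, int line)
{
  if (n->code != expected)
    {
      fprintf (stderr, "%s:%d: node check failed\n", file, line);
      check_failures++;
    }
  return n;
}

#ifdef SKETCH_CHECKING
#define NODE_CHECK(N, CODE) node_check ((N), (CODE), __FILE__, __LINE__)
#else
#define NODE_CHECK(N, CODE) (N)
#endif

/* With checking disabled this is just a read of flag_0.  */
#define PARM_PACK_P(N) (NODE_CHECK ((N), PARM_INDEX)->flag_0)
```

Applying PARM_PACK_P to a PARM_INDEX node simply yields flag_0; applying it to any other node still yields that node's flag_0, but with checking enabled the misuse is reported at the file and line of the call.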

Ian


Question about dead_or_predicable

2009-06-23 Thread Steven Bosscher
Hi,

I have a question about ifcvt.c:dead_or_predicable.  This function is
pretty complicated and it's not really clear to me what it is doing.
But I'll have to understand what is going on because there is a bug in
this function that I would like to fix (see
http://gcc.gnu.org/PR40525).

The code I don't understand is the part where the changes are
actually applied.  This code starts with the following comment:

 33547rth
 33547rth  no_body:
 33547rth   /* We don't want to use normal invert_jump or redirect_jump because
 33547rth  we don't want to delete_insn called.  Also, we want to do our own
 33547rth  change group management.  */
 33547rth

The comment doesn't explain *why* we don't want delete_insn to be
called, or why we want to do our own change group management.

I am guessing this code was required when ifcvt.c was contributed (the
file appears in r33547). The code has, at this point, proven that
MERGE_BB is "dead or predicable" and it now tries to rewire the insn
stream and the CFG to apply the transformation. Perhaps historically
the changes to the jump from old_dest to new_dest had to be in the
same change group as the changes to predicate the insns.

But now there is OTHER_BB, which is not documented, and for which the
changes to the jumps are done *after* the change group for everything
else has been applied.  The OTHER_BB is a bit newer (but only a little
bit) than the first revision in SVN of ifcvt.c. These changes are
applied directly, i.e. not in the change group, and it isn't even
verified that the changes are applied successfully?!

What I would like to do is to apply the change group *before*
changing the jumps. That way, I can move the apply_change_group() call
into the "if (HAVE_conditional_execution)" block higher up in
dead_or_predicable, and try the non-conditional execution case if
predicating the insns fails. When neither succeeds, we're done.  When
one or the other succeeds, we change the jumps and rewire the CFG.
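
To make the proposed ordering concrete, here is a small self-contained sketch in C.  It is only loosely modeled on the idea of a change group (make changes tentatively, then keep all of them or undo all of them) and is not the code in ifcvt.c or recog.c: queue_change, cancel_group, apply_group, is_nonnegative and try_transform are hypothetical names, and plain ints stand in for insns.

```c
#include <stdbool.h>
#include <stddef.h>

#define MAX_PENDING 16

/* One tentative change: where it was made and what to restore.  */
struct pending_change
{
  int *loc;
  int old_val;
};

static struct pending_change pending[MAX_PENDING];
static size_t num_pending;

/* Make a change tentatively and record how to undo it.  */
static bool
queue_change (int *loc, int new_val)
{
  if (num_pending == MAX_PENDING)
    return false;
  pending[num_pending].loc = loc;
  pending[num_pending].old_val = *loc;
  num_pending++;
  *loc = new_val;
  return true;
}

/* Undo every queued change, newest first.  */
static void
cancel_group (void)
{
  while (num_pending > 0)
    {
      num_pending--;
      *pending[num_pending].loc = pending[num_pending].old_val;
    }
}

/* Keep the whole group if every changed value is valid, else undo it.  */
static bool
apply_group (bool (*valid) (int))
{
  for (size_t i = 0; i < num_pending; i++)
    if (!valid (*pending[i].loc))
      {
        cancel_group ();
        return false;
      }
  num_pending = 0;
  return true;
}

/* Stand-in validity test (assumption: any non-negative value is "valid").  */
static bool
is_nonnegative (int v)
{
  return v >= 0;
}

/* Try one strategy, fall back to the other, and report success only if
   one group was applied.  The caller would redirect jumps afterwards.  */
static bool
try_transform (int *a, int *b, int first_b, int fallback_b)
{
  queue_change (a, 10);
  queue_change (b, first_b);
  if (apply_group (is_nonnegative))
    return true;

  queue_change (a, 20);
  queue_change (b, fallback_b);
  return apply_group (is_nonnegative);
}
```

In this sketch a failed first attempt leaves both values untouched before the fallback is tried, and when both attempts fail nothing has changed at all, which is the property that makes it safe to postpone the jump changes until after a group has been applied.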

But since in one case (from merge_bb to new_dest) the changes are
applied as part of a change group, and in the other case (other_bb to
new_dest) the changes are not, I'm confused. Can I assume that the
changes to the jumps never fail? Or should the changes to redirect
other_bb to new_dest actually be part of the big change group (or at
least, should it be somehow verified that these changes are applied
successfully) and is it a bug in ifcvt.c that these changes are
applied without checks?

Hope the question makes enough sense for someone to help me out here :-)

Ciao!
Steven


gcc-4.4-20090623 is now available

2009-06-23 Thread gccadmin
Snapshot gcc-4.4-20090623 is now available on
  ftp://gcc.gnu.org/pub/gcc/snapshots/4.4-20090623/
and on various mirrors, see http://gcc.gnu.org/mirrors.html for details.

This snapshot has been generated from the GCC 4.4 SVN branch
with the following options: svn://gcc.gnu.org/svn/gcc/branches/gcc-4_4-branch 
revision 148879

You'll find:

gcc-4.4-20090623.tar.bz2            Complete GCC (includes all of below)

gcc-core-4.4-20090623.tar.bz2       C front end and core compiler

gcc-ada-4.4-20090623.tar.bz2        Ada front end and runtime

gcc-fortran-4.4-20090623.tar.bz2    Fortran front end and runtime

gcc-g++-4.4-20090623.tar.bz2        C++ front end and runtime

gcc-java-4.4-20090623.tar.bz2       Java front end and runtime

gcc-objc-4.4-20090623.tar.bz2       Objective-C front end and runtime

gcc-testsuite-4.4-20090623.tar.bz2  The GCC testsuite

Diffs from 4.4-20090616 are available in the diffs/ subdirectory.

When a particular snapshot is ready for public consumption the LATEST-4.4
link is updated and a message is sent to the gcc list.  Please do not use
a snapshot before it has been announced that way.


Re: Question about dead_or_predicable

2009-06-23 Thread Dave Korn
Steven Bosscher wrote:

> The comment doesn't explain *why* we don't want delete_insn to be
> called, or why we want to do our own change group management.

  dead_or_predicable is called from find_if_case_[12] which is called from
find_if_header which is called from if_convert within a FOR_EACH_BB loop; is
this perhaps another example of "don't mess with something while you're
iterating over it or your iterator might break"?

cheers,
  DaveK


Things in specs vs. things explicit in the generated source [was Re: [PATCH] Pass -mtune and -march options to assembler.]

2009-06-23 Thread Dave Korn
Mark Mitchell wrote:

> I agree with you that people *should* assemble .s files with "gcc".
> But, in practice, many of them assemble them with "as" -- just as many
> people link with "ld".  (We still install these tools in $bindir, not in
> $libexecdir, which is what we should do if we really don't want people
> using them...)  In any case, relying on specs is error-prone; it's
> subject to users failing to pass the right flags, invoking the wrong
> tool, and on changes to specs processing over time.  In contrast,
> putting the information in the assembly file accurately conveys the
> user's intent when performing the initial compilation.

  So, what does everyone think of putting something like this in a spec:

+/* To implement C++ function replacement we always wrap the cxx
+   malloc-like operators.  See N2800 #17.6.4.6 [replacement.functions] */
+#define CXX_WRAP_SPEC "\
+  --wrap _Znwj \
+  --wrap _Znaj \
+  --wrap _ZdlPv \
+  --wrap _ZdaPv \
+  --wrap _ZnwjRKSt9nothrow_t \
+  --wrap _ZnajRKSt9nothrow_t \
+  --wrap _ZdlPvRKSt9nothrow_t \
+  --wrap _ZdaPvRKSt9nothrow_t "

... and relying on it being applied consistently for strict C++ conformance?

  (Background: I'm trying to make libstdc++ work as a DLL on windows; windows
DLLs differ from ELF .SOs as they must be fully-resolved at link time; this
means that e.g. calls to operator new from within libstdc++ itself have to be
resolved when the DLL is compiled, and won't subsequently take note if there's
a replacement definition available when an application gets finally linked.
So, making all C++ libs link against wrappers that can be redirected at
runtime according to the replacement functions available in the exe seems like
the way to go, but is this the way to do it?)

cheers,
  DaveK



Basic frontend question about layout

2009-06-23 Thread Jerry Quinn
Hi, folks,

I'm having trouble seeing how layout is specified at the GENERIC level
for RECORD_TYPEs.  The docs and comments in tree.def say that you cannot
rely on the order of fields of the type.  In stor-layout.c,
layout_types() seems to do the obvious thing, taking the fields in
order, but the docs make it sound like there is no way to be sure what
you'll get.

Theoretically this would mean that you couldn't even reliably link a
structure in two separate compilation units, which is bogus.

Could someone please clear up my confusion?

Thanks,
Jerry Quinn




Re: Basic frontend question about layout

2009-06-23 Thread Andrew Pinski
On Tue, Jun 23, 2009 at 8:48 PM, Jerry Quinn wrote:
> Hi, folks,
>
> I'm having trouble seeing how layout is specified at the GENERIC level
> for RECORD_TYPEs.  The docs and comments in tree.def say that you cannot
> rely on the order of fields of the type.  In stor-layout.c,
> layout_types() seems to do the obvious thing, taking the fields in
> order, but the docs make it sound like there is no way to be sure what
> you'll get.
> Could someone please clear up my confusion?

The confusion here is that layout_type is separate from the rest of
the middle-end: a front-end can do the layout itself, and it calls
layout_type only if it does not.

For example, the Ada front-end lays out records itself, and the order
of the fields sometimes differs from the order of the laid-out offsets.

Thanks,
Andrew Pinski


Re: Basic frontend question about layout

2009-06-23 Thread Jerry Quinn
On Tue, 2009-06-23 at 20:52 -0700, Andrew Pinski wrote:
> On Tue, Jun 23, 2009 at 8:48 PM, Jerry Quinn wrote:
> > Hi, folks,
> >
> > I'm having trouble seeing how layout is specified at the GENERIC level
> > for RECORD_TYPEs.  The docs and comments in tree.def say that you cannot
> > rely on the order of fields of the type.  In stor-layout.c,
> > layout_types() seems to do the obvious thing, taking the fields in
> > order, but the docs make it sound like there is no way to be sure what
> > you'll get.
> > Could someone please clear up my confusion?
> 
> The confusion here is that layout_type is separate from the rest of
> the middle-end: a front-end can do the layout itself, and it calls
> layout_type only if it does not.

As I look at the code, it seems like a front end doesn't actually need
to call layout_type at all.  Is that correct?  If so, is the layout
C-compatible, assuming the field types being used are individually
compatible?

I'm gleaning this from a mix of code, comments, and gccint texinfo docs.
Would it make sense to update them to indicate that layout is in order
as long as the rest of the language-specific front end doesn't choose to
lay things out differently?
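
For reference, this is what "in order" means on the C side, as a minimal sketch.  The C standard requires the first member of a struct to sit at offset 0 and later members to have increasing addresses in declaration order; struct sample_record and fields_in_declaration_order are hypothetical names used only for illustration, and the sketch says nothing about what any particular front end chooses to do.

```c
#include <stddef.h>

/* Hypothetical record with three fields of different sizes.  */
struct sample_record
{
  char tag;
  int count;
  double value;
};

/* Return nonzero if the fields appear at increasing offsets in the
   order they were declared, starting at offset 0.  */
static int
fields_in_declaration_order (void)
{
  return offsetof (struct sample_record, tag) == 0
         && offsetof (struct sample_record, tag)
            < offsetof (struct sample_record, count)
         && offsetof (struct sample_record, count)
            < offsetof (struct sample_record, value);
}
```

A layout that is meant to be C-compatible would have to reproduce this ordering, along with the target's padding and alignment, for the equivalent record.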

Thanks,
Jerry




Re: Question about dead_or_predicable

2009-06-23 Thread Steven Bosscher
On Wed, Jun 24, 2009 at 1:49 AM, Dave Korn wrote:
> Steven Bosscher wrote:
>
>> The comment doesn't explain *why* we don't want delete_insn to be
>> called, or why we want to do our own change group management.
>
>  dead_or_predicable is called from find_if_case_[12] which is called from
> find_if_header which is called from if_convert within a FOR_EACH_BB loop; is
> this perhaps another example of "don't mess with something while you're
> iterating over it or your iterator might break"?

I don't think so. I know the rest of ifcvt.c pretty well. The code
expects blocks to be merged, moved, etc. when it has identified an
if-block.

Ciao!
Steven