Abt RTL expression

2006-11-10 Thread Rohit Arul Raj

Hello all,

While going through the RTL dumps, I noticed a few things which I need
to get clarified.
Below is the extract about which I have a doubt.

(insn 106 36 107 6 (set (reg:SI 13 a5)
   (const_int -20 [0xffec])) 17 {movsi_short_const} (nil)
   (nil))

(insn 107 106 108 6 (parallel [
   (set (reg:SI 13 a5)
   (plus:SI (reg:SI 13 a5)
   (reg/f:SI 14 a6)))
   (clobber (reg:CC 21 cc))
   ]) 29 {addsi3} (nil)
   (expr_list:REG_EQUIV (plus:SI (reg/f:SI 14 a6)
   (const_int -20 [0xffec]))
   (nil)))

(insn 108 107 38 6 (set (reg:SI 13 a5)
   (mem/c:SI (reg:SI 13 a5) [0 S4 A32])) 15 {movsi_load} (nil)
   (nil))

My Deductions:
1. In insn 106, we are storing -20 into register 13 (a5).
2. In insn 107, we are taking the value from register 14 (a6), which is
a pointer, subtracting 20 from it, and storing the result in a5.

Now a6 contains the stack pointer. Therefore a5 now contains SP-20.

3. In insn 108, we are storing the value pointed to by register a5 into a5.
  Is my deduction for insn 108 right?
  If it is right, shouldn't the expression be like this:
   (mem/c:SI (reg/f:SI 13 a5) [0 S4 A32])) 15 {movsi_load} (nil)

If I am wrong, can anyone tell me what insn 108 actually means?

Regards,
Rohit


strict aliasing question

2006-11-10 Thread Howard Chu
I see a lot of APIs (e.g. Cyrus SASL) that have accessor functions 
returning values through a void ** argument. As far as I can tell, this 
doesn't actually cause any problems, but gcc 4.1 with -Wstrict-aliasing 
will complain. For example, take these two separate source files:


alias1.c


#include <stdio.h>

extern void getit( void **arg );

int main() {
   int *foo;

   getit( (void **)&foo);
   printf("foo: %x\n", *foo);
}



alias2.c

static short x[] = {16,16};

void getit( void **arg ) {
   *arg = x;
}


gcc -O3 -fstrict-aliasing -Wstrict-aliasing *.c -o alias

The program prints the expected result with both strict-aliasing and 
no-strict-aliasing on my x86_64 box.  As such, when/why would I need to 
worry about  this warning?


--
 -- Howard Chu
 Chief Architect, Symas Corp.  http://www.symas.com
 Director, Highland Sun    http://highlandsun.com/hyc
 OpenLDAP Core Team    http://www.openldap.org/project/



Re: strict aliasing question

2006-11-10 Thread Richard Guenther

On 11/10/06, Howard Chu <[EMAIL PROTECTED]> wrote:

I see a lot of APIs (e.g. Cyrus SASL) that have accessor functions
returning values through a void ** argument. As far as I can tell, this
doesn't actually cause any problems, but gcc 4.1 with -Wstrict-aliasing
will complain. For example, take these two separate source files:

alias1.c


#include <stdio.h>

extern void getit( void **arg );

int main() {
int *foo;

getit( (void **)&foo);
printf("foo: %x\n", *foo);
}



alias2.c

static short x[] = {16,16};

void getit( void **arg ) {
*arg = x;
}


gcc -O3 -fstrict-aliasing -Wstrict-aliasing *.c -o alias

The program prints the expected result with both strict-aliasing and
no-strict-aliasing on my x86_64 box.  As such, when/why would I need to
worry about  this warning?


If you compile with -O3 -combine *.c -o alias it will break.

Richard.


RE: [m32c-elf] losing track of register lifetime in combine?

2006-11-10 Thread Dave Korn
On 10 November 2006 07:13, Ian Lance Taylor wrote:

> DJ Delorie <[EMAIL PROTECTED]> writes:
> 
>> I compared the generated code with an equivalent explicit test,
>> and discovered that gcc uses a separate rtx for the intermediate:
>> 
>> i = 0xf;
>> if (j >= 16)
>>   {
>> int i2;
>> i2 = i >> 8;
>> i = i2 >> 8;
>> j -= 16;
>>   }
>> 
>> This seems to avoid the combiner problem, because you don't have the
>> same register being set and being used in one insn.  Does this explain
>> why combine was having a problem, or was this a legitimate thing to do
>> and the combiner is still wrong?  Using a temp in the expander works
>> around the problem.
> 
> Interesting.  Using a temporary is the natural way to implement this
> code.  But not using a temporary should be valid.  So I think there is
> a bug in combine.


  Doesn't this just suggest that there's a '+' constraint modifier missing
from an operand in a pattern in the md file somewhere, such as the one that
expands the conditional in the first place?


cheers,
  DaveK
-- 
Can't think of a witty .sigline today



RE: How to create both -option-name-* and -option-name=* options?

2006-11-10 Thread Dave Korn
On 10 November 2006 07:34, Brooks Moses wrote:

> The Fortran front end currently has a lang.opt entry of the following form:
> 
>ffixed-line-length-
>Fortran RejectNegative Joined UInteger
> 
> I would like to add to this the following option which differs in the
> last character, but should be treated identically:
> 
>ffixed-line-length=
>Fortran RejectNegative Joined UInteger

>In file included from tm.h:7,
> from ../../svn-source/gcc/genconstants.c:32:
>options.h:659: error: redefinition of `OPT_ffixed_line_length_'
>options.h:657: error: `OPT_ffixed_line_length_' previously defined
>  here
> 
> This is because both the '=' and the '-' in the option name reduce to a
> '_' in the enumeration name, which of course causes the enumerator to
> get defined twice -- and that's a problem, even though I'm quite happy
> for the options to both be treated identically.
> 
> There's not really any good way around this problem, is there?


  It may seem a bit radical, but is there any reason not to modify the
option-parsing machinery so that either '-' or '=' are treated interchangeably
for /all/ options with joined arguments?  That is, whichever is specified in
the .opt file, the parser accepts either?  


cheers,
  DaveK
-- 
Can't think of a witty .sigline today



Re: Abt RTL expression - combining instruction

2006-11-10 Thread Rohit Arul Raj

Hi all,

Finally got the combined compare_and_branch instruction to work. But
it has some side effects while testing other files.

20010129-1.s: Assembler messages:
20010129-1.s:46: Error: Value of 0x88 too large for 7-bit relative
instruction offset

I just designed my compare and branch insn as given below:

(define_insn "compare_and_branch_insn"
  [(set (pc)
        (if_then_else (match_operator 3 "comparison_operator"
                        [(match_operand:SI 1 "register_operand"  "r,r")
                         (match_operand:SI 2 "nonmemory_operand" "O,r")])
                      (label_ref (match_operand 0 "" ""))
                      (pc)))]
  ""
  "*
   output_asm_insn (\"cmp\\t%2, %1\", operands);
   /* Body of branch insn */
"
  [(set (attr "length")
        (if_then_else
          (ltu (plus (minus (match_dup 0) (pc))
                     (const_int 128))
               (const_int 250))
          (const_int 4)
          (if_then_else
            (ltu (plus (minus (match_dup 0) (pc))
                       (const_int 65536))
                 (const_int 131072))
            (if_then_else (eq_attr "align_lbranch" "true")
              (const_int 6)
              (const_int 5))
            (if_then_else (eq_attr "call_type" "short")
              (const_int 8)
              (const_int 16)))))
   (set_attr "delay_type" "delayed")
   (set_attr "type" "compare,branch")])

1. Does the length attribute affect the calculation of the offset?
2. What are the other factors that I have to take into consideration
while combining a compare and branch instruction?

Regards,
Rohit

On 08 Nov 2006 07:00:29 -0800, Ian Lance Taylor <[EMAIL PROTECTED]> wrote:

"Rohit Arul Raj" <[EMAIL PROTECTED]> writes:

> I have used the cbranch<mode>4 instruction to generate a combined
> compare and branch instruction.
>
> (define_insn "cbranch<mode>4"
>  [(set (pc) (if_then_else
>   (match_operator:CC 0 "comparison_operator"
> [ (match_operand:SI 1  "register_operand"  "r,r")
>   (match_operand:SI 2 "nonmemory_operand" "O,r")])
>   (label_ref (match_operand 3 "" ""))
>   (pc)))]
> This pattern matches if the code is of the form
>
> if ( h == 1)
>  p = 0;
>
> if the code is of the form
> if (h), if (h >= 0)
> p = 0;
>
> Then it matches the separate compare and branch instructions and not
> the cbranch instruction.
>
> Can anyone point out where I am going wrong?

If you have a cbranch insn, and you want that one to always be
recognized, then why do you also have separate compare and branch
insns?

Ian



Question on tree-nested.c:convert_nl_goto_reference

2006-11-10 Thread Richard Kenner
I have a test case (involving lots of new code that's not checked in yet)
that's blowing up with a nonlocal goto and I'm wondering how it ever worked
because it certainly appears to me that DECL_CONTEXT has to be copied
from label to new_label.  But it isn't.  So how are nonlocal gotos
working?


Re: Abt long long support

2006-11-10 Thread Mohamed Shafi

On 11/10/06, Mike Stump <[EMAIL PROTECTED]> wrote:

On Nov 9, 2006, at 6:39 AM, Mohamed Shafi wrote:
> When I diff the RTL dumps for programs passing a negative value with and
> without the frame pointer, I find changes starting from the .greg file.

A quick glance at the rtl shows that insn 95 tries to use [a4+4] but
insn 94 clobbered a4 already, also d3 is used by insn 93, but there
isn't a set for it.



The following part of the  rtl dump of greg pass is the one which is
giving the wrong output.


(insn 90 29 91 6 (set (reg:SI 12 a4)
   (const_int -16 [0xfff0])) 17 {movsi_short_const} (nil)
   (nil))

(insn 91 90 94 6 (parallel [
   (set (reg:SI 12 a4)
   (plus:SI (reg:SI 12 a4)
   (reg/f:SI 14 a6)))
   (clobber (reg:CC 21 cc))
   ]) 29 {addsi3} (nil)
   (expr_list:REG_EQUIV (plus:SI (reg/f:SI 14 a6)
   (const_int -16 [0xfff0]))
   (nil)))

(insn 94 91 95 6 (set (reg:SI 12 a4)
   (mem/c:SI (reg:SI 12 a4) [0 D.1863+0 S4 A32])) 15 {movsi_load} (nil)
   (nil))

(insn 95 94 31 6 (set (reg:SI 13 a5 [orig:12+4 ] [12])
   (mem/c:SI (plus:SI (reg:SI 12 a4)
   (const_int 4 [0x4])) [0 D.1863+4 S4 A32])) 15 {movsi_load} (nil)
   (nil))

(insn 31 95 87 6 (parallel [
   (set (reg:DI 2 d2)
   (minus:DI (reg:DI 0 d0 [34])
   (reg:DI 12 a4)))
   (clobber (reg:CC 21 cc))
   ]) 33 {subdi3} (nil)
   (nil))

The setting of register d3 is actually done in insn 31: (set (reg:DI 2 d2) ...).
Since this is in DImode, it uses d2 and d3. Similarly, d0 and a4 are
accessed in DImode, so d1 and a5 are also used in this insn. Hence the
negation is proper.

As Mike pointed out, insn 95 tries to use [a4+4], but insn 94 has
already clobbered a4.
The compiler should actually generate insns similar to insns 90 and 91
between insns 94 and 95, but without using a4, or after saving a4. This
is not happening. Insns 90 to 94 are emitted only from the .greg pass
onwards.

When I inserted the assembly instructions corresponding to
movsi_short_const and addsi3 between insns 94 and 95 in the assembly
file, the program worked fine.

There is spill code for insn 31 at the beginning of the .greg file,
but I can't understand any of it.

Spilling for insn 31.
Using reg 2 for reload 2
Using reg 12 for reload 3
Using reg 13 for reload 0
Using reg 13 for reload 1

The same program works with the gcc 3.2 and gcc 3.4.6 ports of the same private target.

I am not sure whether this is because of the reload pass or global
register allocation.

1. What could be the reason for this behavior?
2. How can this behavior be overcome?

Regards,
Shafi


Re: Canonical type nodes, or, comptypes considered harmful

2006-11-10 Thread Joern RENNECKE

Mike Stump wrote:

 

Now, what are the benefits and weaknesses between mine and yours? You
don't have to carry around type_context the way mine would; that's a
big win. You don't have to do anything special to move a reference to a
type around; that's a big win. You have to do a structural walk if
there are any bits that are used for type equality.


No, these bits can be placed together; a structural walk is only
necessary when (some of) these bits themselves need more scrutiny,
i.e. when, on at least one of the sides, some of the constituent parts
are partially incomplete.  And I can't see how you can avoid that
complexity.


  In my scheme, I don't have to.  I just have a vector of items; they
are right next to each other, in the same cache line.


Again, the equality of the items might not be trivial.


Re: [m32c-elf] losing track of register lifetime in combine?

2006-11-10 Thread Ian Lance Taylor
"Dave Korn" <[EMAIL PROTECTED]> writes:

> On 10 November 2006 07:13, Ian Lance Taylor wrote:
> 
> > DJ Delorie <[EMAIL PROTECTED]> writes:
> > 
> >> I compared the generated code with an equivalent explicit test,
> >> and discovered that gcc uses a separate rtx for the intermediate:
> >> 
> >> i = 0xf;
> >> if (j >= 16)
> >>   {
> >> int i2;
> >> i2 = i >> 8;
> >> i = i2 >> 8;
> >> j -= 16;
> >>   }
> >> 
> >> This seems to avoid the combiner problem, because you don't have the
> >> same register being set and being used in one insn.  Does this explain
> >> why combine was having a problem, or was this a legitimate thing to do
> >> and the combiner is still wrong?  Using a temp in the expander works
> >> around the problem.
> > 
> > Interesting.  Using a temporary is the natural way to implement this
> > code.  But not using a temporary should be valid.  So I think there is
> > a bug in combine.
> 
> 
>   Doesn't this just suggest that there's a '+' constraint modifier missing
> from an operand in a pattern in the md file somewhere, such as the one that
> expands the conditional in the first place?

Not necessarily.  I would guess that it's a define_expand which
generates a pseudo-register and uses it as
(set (reg) (ashiftrt (reg) (const_int 8)))
That is OK.

In any case a '+' constraint doesn't make any difference this early in
the RTL passes.  combine doesn't look at constraints.

Ian


Re: Abt RTL expression - combining instruction

2006-11-10 Thread Ian Lance Taylor
"Rohit Arul Raj" <[EMAIL PROTECTED]> writes:

> 1. Does attribute length affect the calculation of offset?

It does if you tell it to.  The "length" attribute must be managed
entirely by your backend.  Most backends with variable size branches
use the length attribute to select which branch insn to generate.  The
usual pattern is to call get_attr_length and use that to pick the
assembler instruction.  For example, jump_compact in sh.md.

Who wrote the backend that you are modifying?  Why can't you ask them?

Ian


Re: Canonical type nodes, or, comptypes considered harmful

2006-11-10 Thread Doug Gregor

On 11/9/06, Mike Stump <[EMAIL PROTECTED]> wrote:

On Nov 8, 2006, at 5:59 AM, Doug Gregor wrote:
> However, this approach could have some odd side effects when there are
> multiple mappings within one context. For instance, we could have
> something like:
>
>  typedef int foo_t;
>  typedef int bar_t;
>
>  foo_t* x = strlen("oops");

x is a decl, the decl has a type, the context of that instance of the
type is x.

map(int,x) == foo_t.

It is this, because we know that foo_t was used to create x, and we
set map(int,x) equal to foo_t as it is created.


Ah, I understand now. When you wrote "context", I was thinking of some
coarse-grained approach to context, e.g., a scope or a block. With
variable-level granularity, your idea certainly works.


> The error message that pops out would likely reference "bar_t *"

map(int,x) doesn't yield bar_t.


Right, got it.


> We can't literally combine T and U into a single canonical
> type node, because they start out as different types.

?


To get a truly canonical type node, whenever we create a new type that
may be equivalent to an existing type, we need to find that existing
type node at the time that we create the new type, e.g.,

 typedef int foo_t;

When we create the decl for "foo_t", its TREE_TYPE will be "int" (the
canonical type node) and with its context we know the name the user
wrote ("foo_t"). When we create "foo_t*", the idea is the same: find
the canonical type node (int*), and its context will tell us the actual
type written ("foo_t *"). All the time, we're finding the canonical
type node for a particular type ("foo_t *") before we go and create a
new type node.

With concepts, there are cases where we end up creating two different
type nodes that we later find out should have been the same type node.
Here's an example:

  template<typename T, typename U>
  where LessThanComparable<T> && SameType<T, U>
  const T& weird_min(const T& t, const U& u) {
    return t < u? t : u;
  }

When we parse the template header, we create two different type nodes
for T and U. Then we start parsing the where clause, and create a type
node for LessThanComparable. Now we hit the SameType requirement,
which says that T and U are the same type. It's a little late to make
T and U actually have the same type node (unless we want to re-parse
the template or re-write the AST).


> Granted, we could layer a union-find implementation (that better
> supports
> concepts) on top of this approach.

Ah, but once you break the fundamental property that different
addresses imply different types, you limit things to structural
equality, and that is slow.


Not necessarily. If you have an efficient way to map from a type to
its canonical type node, then you pay for that mapping but not for a
structural equality check. In a union-find data structure, the mapping
amounts to a bit of pointer chasing... but in most cases it's only one
pointer hop. Actually, without concepts we can guarantee that it's
only one pointer hop... with concepts, we need to keep a little more
information around in the AST and we sometimes have more than one
pointer hop to find the answer.

I already use a union-find data structure inside ConceptGCC, because I
don't have the option to map to a canonical type when type nodes are
initially created. But, since there are no canonical type nodes in GCC
now, I have to use a hash table that hashes based on structural
properties to keep track of the canonical type nodes. *Any* system
that gives us canonical type nodes in GCC would be a huge benefit for
ConceptGCC.

 Cheers,
 Doug


RE: [m32c-elf] losing track of register lifetime in combine?

2006-11-10 Thread Dave Korn
On 10 November 2006 15:01, Ian Lance Taylor wrote:


> In any case a '+' constraint doesn't make any difference this early in
> the RTL passes.  combine doesn't look at constraints.


  bah, of course!  Ignore me, I'll just go sit in the dunce's corner for a
while :)

cheers,
  DaveK
-- 
Can't think of a witty .sigline today



Re: Abt long long support

2006-11-10 Thread Ian Lance Taylor
"Mohamed Shafi" <[EMAIL PROTECTED]> writes:

> (insn 94 91 95 6 (set (reg:SI 12 a4)
> (mem/c:SI (reg:SI 12 a4) [0 D.1863+0 S4 A32])) 15 {movsi_load} (nil)
> (nil))
> 
> (insn 95 94 31 6 (set (reg:SI 13 a5 [orig:12+4 ] [12])
> (mem/c:SI (plus:SI (reg:SI 12 a4)
> (const_int 4 [0x4])) [0 D.1863+4 S4 A32])) 15 {movsi_load} 
> (nil)
> (nil))

> I am not sure whether this is because of reload pass or global
> register allocation.

If those two instructions appear for the first time in the .greg dump
file, then they have been created by reload.

> 1. What could be the reason for this behavior?

I'm really shooting in the dark here, but my guess is that you have a
define_expand for movdi that is not reload safe.  You can do this
operation correctly, you just have to reverse the instructions: load
a5 from (a4 + 4) before you load a4 from (a4).  See, e.g.,
mips_split_64bit_move in mips.c and note the use of
reg_overlap_mentioned_p.

Ian


Re: Abt RTL expression

2006-11-10 Thread Ian Lance Taylor
"Rohit Arul Raj" <[EMAIL PROTECTED]> writes:

> (insn 106 36 107 6 (set (reg:SI 13 a5)
> (const_int -20 [0xffec])) 17 {movsi_short_const} (nil)
> (nil))
> 
> (insn 107 106 108 6 (parallel [
> (set (reg:SI 13 a5)
> (plus:SI (reg:SI 13 a5)
> (reg/f:SI 14 a6)))
> (clobber (reg:CC 21 cc))
> ]) 29 {addsi3} (nil)
> (expr_list:REG_EQUIV (plus:SI (reg/f:SI 14 a6)
> (const_int -20 [0xffec]))
> (nil)))
> 
> (insn 108 107 38 6 (set (reg:SI 13 a5)
> (mem/c:SI (reg:SI 13 a5) [0 S4 A32])) 15 {movsi_load} (nil)
> (nil))
> 
> My Deductions:
> 1. In insn 106, we are storing -20 into register 13 (a5).

Yes.

> 2. In insn 107, we are taking the value from register 14 (a6), which is
> a pointer, subtracting 20 from it, and storing the result in a5.

Yes.

> Now a6 contains the stack pointer. Therefore a5 now contains SP-20.
> 
> 3. In insn 108, we are storing the value pointed to by register a5 into a5.

I would describe it as a load from memory, but, yes.

>Is my deduction for insn 108 right?
>If it is right, shouldn't the expression be like this:
> (mem/c:SI (reg/f:SI 13 a5) [0 S4 A32])) 15 {movsi_load} (nil)

Yes, probably it should.  You neglected to say which dump you are
looking at.  REG_POINTER, which is the flag that generates the /f, is
not reliable after reload.

Does it matter?  In a memory load, the register has to hold a pointer
value anyhow, so I don't see how it could matter for code generation.
REG_POINTER exists because on the PA addresses which use two registers
need to know which one is the pointer and which is the offset, for
some hideous reason which I hope I never learn.  In a memory address
with only one register, REG_POINTER doesn't seem like an interesting
flag.

Ian


Re: Abt long long support

2006-11-10 Thread Rask Ingemann Lambertsen
On Thu, Nov 09, 2006 at 11:52:02AM -0800, Mike Stump wrote:

> The way the instructions are numbered suggests that the code went  
> wrong before this point.  You have to read and understand all the  
> dumps, whether they are right or wrong and why, track down the code  
> in the compiler that is creating the wrong code and then see if you  
> can guess why.

   I usually diff the dump files. If your screen is wide enough, a side by
side diff can help you a lot for small functions.

-- 
Rask Ingemann Lambertsen


Re: Question on tree-nested.c:convert_nl_goto_reference

2006-11-10 Thread Ian Lance Taylor
[EMAIL PROTECTED] (Richard Kenner) writes:

> I have a test case (involving lots of new code that's not checked in yet)
> that's blowing up with a nonlocal goto and I'm wondering how it ever worked
> because it certainly appears to me that DECL_CONTEXT has to be copied
> from label to new_label.  But it isn't.  So how are nonlocal gotos
> working?

I think they mostly work because the DECL_CONTEXT of a label isn't
very important.  As far as I know we only use it to make sure the
label is emitted.

But I do get a failure in verify_flow_info with the appended test
case.  verify_flow_info is only used when checking is enabled, so
maybe that is why people aren't seeing it?  Maybe we just need to add
this test case to the testsuite?

Ian

extern void abort (void);
extern void exit (int);

int
main ()
{
  int f1 ()
  {
__label__ lab;

int f2 ()
{
  goto lab;
}

return f2 () + f2 ();

  lab:
return 2;
  }

  if (f1 () != 2)
abort ();
  exit (0);
}


Re: Planned LTO driver work

2006-11-10 Thread Ian Lance Taylor
Mark Mitchell <[EMAIL PROTECTED]> writes:

> Though, if we *are* doing the template-repository dance, we'll have to
> do that for a while, declare victory, then invoke the LTO front end,
> and, finally, the actual linker, which will be a bit complicated.  It
> might be that we should move the invocation of the real linker back into
> gcc.c, so that collect2's job just becomes generating the right pile of
> object files via template instantiation and static
> constructor/destructor generation?

For most targets we don't need to invoke collect2 at all anyhow,
unless the user is using -frepo.  It's somewhat wasteful that we
always run it.

Moving the invocation of the linker into the gcc driver makes sense to
me, especially if it we can skip invoking collect2 entirely.  Note
that on some targets, ones which do not use GNU ld, collect2 does
provide the feature of demangling the ld error output.  That facility
would have to be moved into the gcc driver as well.

Ian


Re: Planned LTO driver work

2006-11-10 Thread Mark Mitchell
Ian Lance Taylor wrote:
> Mark Mitchell <[EMAIL PROTECTED]> writes:
> 
>> Though, if we *are* doing the template-repository dance, we'll have to
>> do that for a while, declare victory, then invoke the LTO front end,
>> and, finally, the actual linker, which will be a bit complicated.  It
>> might be that we should move the invocation of the real linker back into
>> gcc.c, so that collect2's job just becomes generating the right pile of
>> object files via template instantiation and static
>> constructor/destructor generation?
> 
> For most targets we don't need to invoke collect2 at all anyhow,
> unless the user is using -frepo.  It's somewhat wasteful that we
> always run it.
> 
> Moving the invocation of the linker into the gcc driver makes sense to
> me, especially if it we can skip invoking collect2 entirely.  Note
> that on some targets, ones which do not use GNU ld, collect2 does
> provide the feature of demangling the ld error output.  That facility
> would have to be moved into the gcc driver as well.

I agree that this sounds like the best long-term plan.  I'll try to work
out whether it's actually a short-term win for me to do anything to
collect2 at all; if not, then I'll just put stuff straight into the
driver, since that's what we really want anyhow.

Thanks for the feedback!

-- 
Mark Mitchell
CodeSourcery
[EMAIL PROTECTED]
(650) 331-3385 x713


expanding __attribute__((format,..))

2006-11-10 Thread Nuno Lopes

Hi,

I've been thinking that it would be a good idea to extend the current 
__attribute__((format,..)) to use an arbitrary user callback.
I searched the mailing list archives and I found some references to similar 
ideas. So do you think this is feasible?


It would allow specifying arbitrary char codes and also an arbitrary
number of required arguments. It would be nice if it could also import
the attributes from other defined callbacks.


E.g.:
#define my_format_callback(x,params) \
   (import printf), \
   ("%v", zval**, size_t), \
   ("%foo", void*)

int my_printf(char *format, ...) 
__attribute__((format,("my_format_callback")))



Thanks in advance,
Nuno 



Re: expanding __attribute__((format,..))

2006-11-10 Thread Ian Lance Taylor
"Nuno Lopes" <[EMAIL PROTECTED]> writes:

> I've been thinking that it would be a good idea to extend the current
> __attribute__((format,..)) to use an arbitrary user callback.
> I searched the mailing list archives and I found some references to
> similar ideas. So do you think this is feasible?

I think it would be nice.  We usually founder on trying to provide a
facility which can replace the builtin printf support, since printf is
very complicated.

I kind of liked this idea:
http://gcc.gnu.org/ml/gcc-patches/2005-07/msg00797.html
but of course it was insane.

And then there was this idea, which I think was almost workable:
http://gcc.gnu.org/ml/gcc/2005-08/msg00469.html
But nobody really liked it.

So you need to find something which is on the one hand very simple and
on the other hand able to support the complexity which people need in
practice.

Ian


Re: expanding __attribute__((format,..))

2006-11-10 Thread Joseph S. Myers
On Fri, 10 Nov 2006, Ian Lance Taylor wrote:

> I kind of liked this idea:
> http://gcc.gnu.org/ml/gcc-patches/2005-07/msg00797.html
> but of course it was insane.

I still think a higher level state machine as described in the followups 
is how things should be done.

The first step (or the first hundred steps) would need to be a series of 
small incremental patches moving all the existing logic about format 
string structure from the code into more generic datastructures.

In so doing you need to consider how xgettext could be made to extract a 
superset of the possible diagnostic sentences so that i18n for the format 
checking messages can work properly again (which requires that full 
sentences be passed to xgettext and be known by the translators, while 
maintainability of the format descriptions requires that the information 
about valid combinations be maintained at a different level more like 
that used by the present datastructures).

Once the datastructures are suitably general, then interfaces to them can 
be considered.

-- 
Joseph S. Myers
[EMAIL PROTECTED]


PATCH: wwwdocs: Update Intel64 and IA32 SDM website

2006-11-10 Thread H. J. Lu
Intel has published Core 2 Duo Optimization Reference Manual. I will
check in this patch to update wwwdocs.


H.J.

2006-11-10  H.J. Lu  <[EMAIL PROTECTED]>

* readings.html: Update Intel64 and IA32 SDM website.

Index: readings.html
===
RCS file: /cvs/gcc/wwwdocs/htdocs/readings.html,v
retrieving revision 1.160
diff -u -p -r1.160 readings.html
--- readings.html   23 Oct 2006 16:17:21 -  1.160
+++ readings.html   10 Nov 2006 17:32:21 -
@@ -133,8 +133,8 @@ names.
  i386 (i486, i586, i686, i786)
Manufacturer: Intel
 
-  http://developer.intel.com/design/pentium4/manuals/index_new.htm";>
-  IA-32 Intel Architecture Software Developer's Manuals
+  http://developer.intel.com/products/processor/manuals/index.htm";>
+Intel®64 and IA-32 Architectures Software Developer's Manuals
 
   Some information about optimizing for x86 processors, links to
   x86 manuals and documentation:


Re: strict aliasing question

2006-11-10 Thread Howard Chu

Richard Guenther wrote:

On 11/10/06, Howard Chu <[EMAIL PROTECTED]> wrote:


The program prints the expected result with both strict-aliasing and
no-strict-aliasing on my x86_64 box.  As such, when/why would I need to
worry about  this warning?


If you compile with -O3 -combine *.c -o alias it will break.

Thanks for pointing that out. But that's not a realistic danger for the 
actual application. The accessor function is always going to be in a 
library compiled at a separate time. The call will always be from a 
program built at a separate time, so -combine isn't a factor.


--
 -- Howard Chu
 Chief Architect, Symas Corp.  http://www.symas.com
 Director, Highland Sun    http://highlandsun.com/hyc
 OpenLDAP Core Team    http://www.openldap.org/project/



Re: expanding __attribute__((format,..))

2006-11-10 Thread Nuno Lopes

I've been thinking that it would be a good idea to extend the current
__attribute__((format,..)) to use an arbitrary user callback.
I searched the mailing list archives and I found some references to
similar ideas. So do you think this is feasible?


I think it would be nice.  We usually founder on trying to provide a
facility which can replace the builtin printf support, since printf is
very complicated.


Thanks for your quick answer. I'm glad that you are receptive to this 
problem.




I kind of liked this idea:
   http://gcc.gnu.org/ml/gcc-patches/2005-07/msg00797.html
but of course it was insane.

And then there was this idea, which I think was almost workable:
   http://gcc.gnu.org/ml/gcc/2005-08/msg00469.html
But nobody really liked it.


IMHO, those two seem way too difficult to use...



So you need to find something which is on the one hand very simple and
on the other hand able to support the complexity which people need in
practice.


I sent you a rough proposal in my last e-mail (very Lisp-like,
unfortunately), but it seems workable.
I'm a PHP developer, and this feature would be really great for us: I
currently maintain a PHP script (regex-based) to do this kind of
verification (and a few more checks, like use of uninitialized vars).
But "parsing" C with regexes is painful and error-prone.



Nuno 



Re: Abt long long support

2006-11-10 Thread Rask Ingemann Lambertsen
On Fri, Nov 10, 2006 at 07:17:29AM -0800, Ian Lance Taylor wrote:
> "Mohamed Shafi" <[EMAIL PROTECTED]> writes:
> 
> > 1. What could be the reason for this behavior?
> 
> I'm really shooting in the dark here, but my guess is that you have a
> define_expand for movdi that is not reload safe.

   And in case the target doesn't have registers capable of holding DImode
values, consider deleting the movdi pattern.

-- 
Rask Ingemann Lambertsen


Core 2 Duo Optimization Reference Manual is available

2006-11-10 Thread H. J. Lu
On Fri, Nov 10, 2006 at 09:36:59AM -0800, H. J. Lu wrote:
> Intel has published Core 2 Duo Optimization Reference Manual. I will
> check in this patch to update wwwdocs.
> 

I checked it in. You can find Core 2 Duo Optimization Reference Manual
at

http://developer.intel.com/products/processor/manuals/index.htm


H.J.


Re: Canonical type nodes, or, comptypes considered harmful

2006-11-10 Thread Doug Gregor

On 08 Nov 2006 03:45:26 +0100, Gabriel Dos Reis
<[EMAIL PROTECTED]> wrote:

[EMAIL PROTECTED] (Richard Kenner) writes:

| > Like when int and long have the same range on a platform?
| > The answer is they are different, even when they imply the same object
| > representation.
| >
| > The notion of unified type nodes is closer to syntax than semantics.
|
| I'm more than a little confused, then, as to what we are talking about
| canonicalizing.  We already have only one pointer to each type, for example.

yes, but it is not done systematically.  Furthermore, we don't unify
function types -- because for some reason, it was decided they would
hold default arguments.


... and exception specifications, and some attributes that are really
meant to go on the declaration.

So, until we bring our types into line with C++'s type system, we're
going to have to retain some level of structural checking. Based on
Dale's suggestion, I'm inclined to add a new flag
"requires_structural_comparison" to every type. This will only be set
true for cases where either GCC's internal representation or the
language forces us into structural equality testing. For C++, I think
we're only forced into structural equality testing where GCC's
internal representation doesn't match C++'s view of type equality. For
C, it looks like array types like int[10] require structural equality
testing (but my understanding of the C type system is rather weak).

 Cheers,
 Doug


Re: Canonical type nodes, or, comptypes considered harmful

2006-11-10 Thread Gabriel Dos Reis
Ian Lance Taylor <[EMAIL PROTECTED]> writes:

[...]

| I meant something very simple: for every type, there is a
| TYPE_CANONICAL field.  This is how you tell whether two types are
| equivalent:
| TYPE_CANONICAL (a) == TYPE_CANONICAL (b)
| That is what I mean when I saw one memory dereference and one pointer
| comparison.

That certainly matches my understanding and implementation in the
Pivot.

-- Gaby


Re: Canonical type nodes, or, comptypes considered harmful

2006-11-10 Thread Gabriel Dos Reis
"Doug Gregor" <[EMAIL PROTECTED]> writes:

[...]

| With concepts, there are cases where we end up creating two different
| type nodes that we later find out should have been the same type node.
| Here's an example:
| 
|   template<typename T, typename U>
|   where LessThanComparable<T> && SameType<T, U>
|   const T& weird_min(const T& t, const U& u) {
|     return t < u ? t : u;
|   }
| 
| When we parse the template header, we create two different type nodes
| for T and U. Then we start parsing the where clause, and create a type
| node for LessThanComparable<T>. Now we hit the SameType<T, U> requirement,
| which says that T and U are the same type. It's a little late to make
| T and U actually have the same type node (unless we want to re-parse
| the template or re-write the AST).

I don't think that implies rewriting the AST or reparsing.  The
same-type constraint reads to me as "it is assumed T and U have the
same canonical type", i.e. the predicate SameType<T, U> translates to
the constraint

 TYPE_CANONICAL(T) == TYPE_CANONICAL(U)

This equation can be added to the constraint set without reparsing (it
is a semantic constraint).

-- Gaby


Re: Canonical type nodes, or, comptypes considered harmful

2006-11-10 Thread Doug Gregor

On 10 Nov 2006 19:15:45 +0100, Gabriel Dos Reis
<[EMAIL PROTECTED]> wrote:

"Doug Gregor" <[EMAIL PROTECTED]> writes:
| With concepts, there are cases where we end up creating two different
| type nodes that we later find out should have been the same type node.
| Here's an example:
|
   template<typename T, typename U>
   where LessThanComparable<T> && SameType<T, U>
   const T& weird_min(const T& t, const U& u) {
     return t < u ? t : u;
   }
|
| When we parse the template header, we create two different type nodes
| for T and U. Then we start parsing the where clause, and create a type
| node for LessThanComparable<T>. Now we hit the SameType<T, U> requirement,
| which says that T and U are the same type. It's a little late to make
| T and U actually have the same type node (unless we want to re-parse
| the template or re-write the AST).

I don't think that implies rewriting the AST or reparsing.  The
same-type constraint reads to me as "it is assumed T and U have the
same canonical type", i.e. the predicate SameType<T, U> translates to
the constraint

 TYPE_CANONICAL(T) == TYPE_CANONICAL(U)

This equation can be added to the constraint set without reparsing (it
is a semantic constraint).


Yes, but there are types built from 'T' and 'U' that also need to be
"merged" in this way. For instance, say we have built the types T* and
U* before seeing that same-type constraint. Now, we also need
TYPE_CANONICAL (T*) == TYPE_CANONICAL (U*).
And TYPE_CANONICAL(LessThanComparable<T>) ==
TYPE_CANONICAL(LessThanComparable<U>).
If you know about all of these other types that have been built from T
and U, you can use Nelson and Oppen's algorithm to update the
TYPE_CANONICAL information relatively quickly. If you don't have that
information... you're stuck with structural checks or rewriting the
AST to eliminate the duplicated nodes.

 Cheers,
 Doug
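The union-find merging Doug describes can be sketched in C (a toy model; the struct and function names are hypothetical, not GCC internals): each type node carries a canonical pointer, and merging two equivalence classes links one representative to the other.

```c
#include <assert.h>
#include <stddef.h>

/* Toy type node: only the canonical-type link matters here.  */
struct type_node {
    struct type_node *canonical;  /* TYPE_CANONICAL analogue */
};

/* Find the representative, with path compression.  */
static struct type_node *canon(struct type_node *t) {
    if (t->canonical != t)
        t->canonical = canon(t->canonical);
    return t->canonical;
}

/* Merge the equivalence classes of a and b, as a SameType<T, U>
   constraint would require.  */
static void unify(struct type_node *a, struct type_node *b) {
    struct type_node *ra = canon(a), *rb = canon(b);
    if (ra != rb)
        rb->canonical = ra;
}

/* Equivalence test is then one find plus one pointer comparison.  */
static int same_type_p(struct type_node *a, struct type_node *b) {
    return canon(a) == canon(b);
}
```

With this in place a SameType constraint is just a unify() call on T and U; the derived types T* and U* still need their own unify() call, which is exactly the propagation step the Nelson-Oppen algorithm automates.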


RE: Abt long long support

2006-11-10 Thread Dave Korn
On 10 November 2006 17:55, Rask Ingemann Lambertsen wrote:

> On Fri, Nov 10, 2006 at 07:17:29AM -0800, Ian Lance Taylor wrote:
>> "Mohamed Shafi" <[EMAIL PROTECTED]> writes:
>> 
>>> 1. What could be the reason for this behavior?
>> 
>> I'm really shooting in the dark here, but my guess is that you have a
>> define_expand for movdi that is not reload safe.
> 
>And in case the target doesn't have registers capable of holding DImode
> values, consider deleting the movdi pattern.

  No, surely you don't want to do that!  You really need a movdi pattern -
even more so if there are no natural DImode-sized registers, as gcse can get
terribly confused by bad reg_equal notes if you don't.  See e.g.:

http://gcc.gnu.org/ml/gcc/2003-04/msg01397.html
http://gcc.gnu.org/ml/gcc/2004-06/msg00993.html

cheers,
  DaveK
-- 
Can't think of a witty .sigline today



Re: Canonical type nodes, or, comptypes considered harmful

2006-11-10 Thread Gabriel Dos Reis
"Doug Gregor" <[EMAIL PROTECTED]> writes:

| On 10 Nov 2006 19:15:45 +0100, Gabriel Dos Reis
| <[EMAIL PROTECTED]> wrote:
| > "Doug Gregor" <[EMAIL PROTECTED]> writes:
| > | With concepts, there are cases where we end up creating two different
| > | type nodes that we later find out should have been the same type node.
| > | Here's an example:
| > |
| > |   template<typename T, typename U>
| > |   where LessThanComparable<T> && SameType<T, U>
| > |   const T& weird_min(const T& t, const U& u) {
| > |     return t < u ? t : u;
| > |   }
| > |
| > | When we parse the template header, we create two different type nodes
| > | for T and U. Then we start parsing the where clause, and create a type
| > | node for LessThanComparable<T>. Now we hit the SameType<T, U> requirement,
| > | which says that T and U are the same type. It's a little late to make
| > | T and U actually have the same type node (unless we want to re-parse
| > | the template or re-write the AST).
| >
| > I don't think that implies rewriting the AST or reparsing.  The
| > same-type constraint reads to me as "it is assumed T and U have the
| > same canonical type", i.e. the predicate SameType<T, U> translates to
| > the constraint
| >
| >  TYPE_CANONICAL(T) == TYPE_CANONICAL(U)
| >
| > This equation can be added to the constraint set without reparsing (it
| > is a semantic constraint).
| 
| Yes, but there are types built from 'T' and 'U' that also need to be
| "merged" in this way.

I don't see why you need to merge the types, as opposed to setting
their canonical types.

| For instance, say we have built the types T* and
| U* before seeing that same-type constraint. Now, we also need
| TYPE_CANONICAL (T*) == TYPE_CANONICAL (U*).
| And TYPE_CANONICAL(LessThanComparable<T>) ==
| TYPE_CANONICAL(LessThanComparable<U>).
| If you know about all of these other types that have been built from T
| and U, you can use Nelson and Oppen's algorithm to update the
| TYPE_CANONICAL information relatively quickly. If you don't have that
| information... 

In a template definition, one has that information.

-- Gaby


subreg transformation causes incorrect post_inc

2006-11-10 Thread TabonyEE
Hi,

My port, based on (GCC) 4.2.0 20061002 (experimental), is producing
incorrect code for the following test case:

int f(short *p){
  int sum, i;
  sum = 0;
  for(i = 0; i < 256; i++){
sum += *p++ & 0xFF;
  }
  return sum;
}


The RTL snippet of interest, before combine, is,

(insn 23 22 24 3 (set (reg:SI 96)
(sign_extend:SI (mem:HI (post_inc:SI (reg/v/f:SI 89 [ p.38 ]))
[2 S2 A16]))) 178 {extendhisi2} (nil)
(expr_list:REG_INC (reg/v/f:SI 89 [ p.38 ])
(nil)))

(insn 24 23 25 3 (set (reg:SI 98)
(const_int 255 [0xff])) 12 {movsi_real} (nil)
(nil))

(insn 25 24 26 3 (set (reg:SI 97)
(and:SI (reg:SI 96)
(reg:SI 98))) 81 {andsi3} (insn_list:REG_DEP_TRUE 23
(insn_list:REG_DEP_TRUE 24 (nil)))
(expr_list:REG_DEAD (reg:SI 96)
(expr_list:REG_DEAD (reg:SI 98)
(expr_list:REG_EQUAL (and:SI (reg:SI 96)
(const_int 255 [0xff])) 
(nil)  


Combine combines that into the following and it remains that way until greg:

(insn 25 24 26 3 (set (reg:SI 97)  
(zero_extend:SI (subreg:QI (mem:HI (post_inc:SI (reg/v/f:SI 89 [
p.38 ])) [2 S2 A16]) 0))) 181 {zero_extendqisi2} (nil)
(expr_list:REG_INC (reg/v/f:SI 89 [ p.38 ])
(nil)))


After greg, the insn becomes,

(insn:HI 25 29 59 3 (set (reg:SI 0 r0 [97])
(zero_extend:SI (mem:QI (post_inc:SI (reg/v/f:SI 4 r4 [orig:89
p.38 ] [89])) [2 S1 A16]))) 181 {zero_extendqisi2} (nil)
(expr_list:REG_INC (reg/v/f:SI 4 r4 [orig:89 p.38 ] [89])
(nil)))


The problem, as I see it, is that "(subreg:QI (mem:HI (post_inc" becomes
"(mem:QI (post_inc".  post_inc has changed meaning.

Experience has taught me that almost all such problems are the fault of
the backend, but I am stumped as to what part of the backend could cause
this problem.  Any ideas?

Does anyone know of any recent 4.2 bug fixes that could fix this?

If this is a bug in the middle end, then where do you think the bug is?
 Is it combine's, greg's, or some other pass's responsibility to ensure
correct code in this case?

Ideally, the post_inc would be replaced with a +2 post_modify, which
this target has.  For targets without post_modify, either the post_inc
could be dropped and a separate add emitted or the mem could remain
"(mem:HI (post_inc" followed by a zero_extend.

Thanks in advance for your help,
Charles J. Tabony


Re: Canonical type nodes, or, comptypes considered harmful

2006-11-10 Thread Gabriel Dos Reis
"Doug Gregor" <[EMAIL PROTECTED]> writes:

| On 08 Nov 2006 03:45:26 +0100, Gabriel Dos Reis
| <[EMAIL PROTECTED]> wrote:
| > [EMAIL PROTECTED] (Richard Kenner) writes:
| >
| > | > Like when int and long have the same range on a platform?
| > | > The answer is they are different, even when they imply the same object
| > | > representation.
| > | >
| > | > The notion of unified type nodes is closer to syntax than semantics.
| > |
| > | I'm more than a little confused, then, as to what we are talking about
| > | canonicalizing.  We already have only one pointer to each type, for 
example.
| >
| > yes, but it is not done systematically.  Furthermore, we don't unify
| > function types -- because for some reason, it was decided they would
| > hold default arguments.
| 
| ... and exception specifications, and some attributes that are really
| meant to go on the declaration.
| 
| So, until we bring our types into line with C++'s type system, we're
| going to have to retain some level of structural checking. Based on
| Dale's suggestion, I'm inclined to add a new flag
| "requires_structural_comparison" to every type.

I hope that is a short-term work-around.

| This will only be set
| true for cases where either GCC's internal representation or the
| language forces us into structural equality testing.

For function types, all C++ front-end maintainers agreed (some time
ago) that the front-end should move to a state where default arguments
and the like are moved out of the type nodes, therefore enabling more
sharing.  I think Kazu did some good preliminary work there.

| For C++, I think
| we're only forced into structural equality testing where GCC's
| internal representation doesn't match C++'s view of type equality. For
| C, it looks like array types like int[10] require structural equality
| testing (but my understanding of the C type system is rather weak).

I'm not worried about the C type system :-/

-- Gaby


Re: subreg transformation causes incorrect post_inc

2006-11-10 Thread Mark Shinwell

[EMAIL PROTECTED] wrote:

My port, based on (GCC) 4.2.0 20061002 (experimental), is producing
incorrect code for the following test case:

[snip]

I've only had a very quick look at your code, but I have a feeling that
this is an instance of the kind of slip-up with GO_IF_MODE_DEPENDENT_ADDRESS
that my patch at http://gcc.gnu.org/ml/gcc-patches/2006-08/msg00858.html is
aimed at preventing.  (This patch is currently only applied to the
addrmodes branch.)

Mark



Re: Canonical type nodes, or, comptypes considered harmful

2006-11-10 Thread Doug Gregor

On 10 Nov 2006 20:12:27 +0100, Gabriel Dos Reis
<[EMAIL PROTECTED]> wrote:

"Doug Gregor" <[EMAIL PROTECTED]> writes:
I don't see why you need to merge the types, as opposed to setting
their canonical types.


I have union-find on the mind, so I'm using the terms interchangeably.
Setting their canonical types to the same value merges the equivalence
classes of types.


| For instance, say we have built the types T* and
| U* before seeing that same-type constraint. Now, we also need
| TYPE_CANONICAL (T*) == TYPE_CANONICAL (U*).
| And TYPE_CANONICAL(LessThanComparable<T>) ==
| TYPE_CANONICAL(LessThanComparable<U>).
| If you know about all of these other types that have been built from T
| and U, you can use Nelson and Oppen's algorithm to update the
| TYPE_CANONICAL information relatively quickly. If you don't have that
| information...

In a template definition, one has that information.


?

Our same-type constraint says SameType<T, U>.

We can easily set TYPE_CANONICAL for T and U.

We also need to set the TYPE_CANONICAL fields of LessThanComparable<T>
and LessThanComparable<U> to the same value.

How do we get from 'T' to 'LessThanComparable<T>'?

 Cheers,
 Doug


Re: Canonical type nodes, or, comptypes considered harmful

2006-11-10 Thread Gabriel Dos Reis
"Doug Gregor" <[EMAIL PROTECTED]> writes:

[...]

| > | For instance, say we have built the types T* and
| > | U* before seeing that same-type constraint. Now, we also need
| > | TYPE_CANONICAL (T*) == TYPE_CANONICAL (U*).
| > | And TYPE_CANONICAL(LessThanComparable<T>) ==
| > | TYPE_CANONICAL(LessThanComparable<U>).
| > | If you know about all of these other types that have been built from T
| > | and U, you can use Nelson and Oppen's algorithm to update the
| > | TYPE_CANONICAL information relatively quickly. If you don't have that
| > | information...
| >
| > In a template definition, one has that information.
| 
| ?
| 
| Our same-type constraint says SameType<T, U>.
| 
| We can easily set TYPE_CANONICAL for T and U.
| 
| We also need to set the TYPE_CANONICAL fields of LessThanComparable<T>
| and LessThanComparable<U> to the same value.
| 
| How do we get from 'T' to 'LessThanComparable<T>'?


Delay semantic processing (that is, canonical types, etc.) until you
have built all of the constraint set -- that is basically what we do
for C++98 and C++03 templates.

At the end of the definition's declaration, you have enough information
to unify the canonical type nodes for T*, U*, LessThanComparable<T>, etc.

In a sense that is very similar to what we do for definitions that
appear lexically in class definitions.

-- Gaby



Configuration question

2006-11-10 Thread Steve Ellcey
I have run into a libstdc++ configuration issue and was wondering if it
is a known issue or not.

My build failed because the compiler I am using to build GCC and
libstdc++ does not have wchar support and does not define mbstate_t.
The compiler (and library) that I am creating however, do support wchar
and do define mbstate_t.  Both compilers are GCC, the old one does not
include a -D that the new one does.  mbstate_t (defined in the system
header files) is only seen when this define is set.

The problem is that the libstdc++ configure script is using the original
GCC to check for the existence of mbstate_t (doesn't find it) and using
that information to say that it needs to define mbstate_t when compiling
libstdc++, but libstdc++ is compiled with the newly built GCC which
does have an mbstate_t from the system header files.  Shouldn't the
libstdc++ configure script use the new GCC when checking things with
AC_TRY_COMPILE?  Or is this just not possible?  Is this why some tests
don't use AC_TRY_COMPILE but say "Fake what AC_TRY_COMPILE does"?  See
acinclude.m4 for these comments; there is no explanation about why it is
faking what AC_TRY_COMPILE does.

Steve Ellcey
[EMAIL PROTECTED]


Re: subreg transformation causes incorrect post_inc

2006-11-10 Thread TabonyEE
From: Mark Shinwell <[EMAIL PROTECTED]>
> [EMAIL PROTECTED] wrote:
> > My port, based on (GCC) 4.2.0 20061002 (experimental), is producing
> > incorrect code for the following test case:
> [snip]
> 
> I've only had a very quick look at your code, but I have a feeling that
> this is an instance of the kind of slip-up with
> GO_IF_MODE_DEPENDENT_ADDRESS that my patch at
> http://gcc.gnu.org/ml/gcc-patches/2006-08/msg00858.html is
> aimed at preventing.  (This patch is currently only applied to the
> addrmodes branch.)
> 
> Mark

I will give it a try.  Thanks!

Charles J. Tabony


Re: Canonical type nodes, or, comptypes considered harmful

2006-11-10 Thread Mike Stump

On Nov 9, 2006, at 11:09 PM, Ian Lance Taylor wrote:

I meant something very simple: for every type, there is a
TYPE_CANONICAL field.  This is how you tell whether two types are
equivalent:
TYPE_CANONICAL (a) == TYPE_CANONICAL (b)


Ah, yes, that would work.  Hum, so simple, why was I thinking  
something was not going to work about it.  There are advantages to  
real-time conversations...  anyway, can't think of any down sides  
right now except for the obvious, this is gonna eat 1 extra pointer  
per type.  In my scheme, one would have to collect stats on the sizes  
to figure out if there are enough types that don't have typedefs to  
pay for the data structure for those that do.  I think mine would  
need less storage, but your scheme is so much easier to implement
and transition to that I think it is preferable to a separate side
data structure.  Thanks for bearing with me.
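The cost difference being weighed here can be made concrete with a toy model in C (hypothetical node layout, not GCC's trees): structural comparison recurses over the whole type, while the canonical scheme is one load and one pointer compare, at the price of one extra pointer per type.

```c
#include <assert.h>
#include <stddef.h>

enum kind { K_INT, K_POINTER };

struct type {
    enum kind kind;
    struct type *pointee;    /* for K_POINTER */
    struct type *canonical;  /* the "1 extra pointer per type" */
};

/* Structural equality: cost proportional to the size of the type.  */
static int comptypes(const struct type *a, const struct type *b) {
    if (a->kind != b->kind)
        return 0;
    if (a->kind == K_POINTER)
        return comptypes(a->pointee, b->pointee);
    return 1;
}

/* Canonical equality: one dereference and one pointer comparison.  */
static int same_type_p(const struct type *a, const struct type *b) {
    return a->canonical == b->canonical;
}
```

For deeply nested template instantiations the recursive walk is what blows up; the canonical comparison stays constant-time regardless of depth.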


Re: How to create both -option-name-* and -option-name=* options?

2006-11-10 Thread Mark Mitchell
Dave Korn wrote:

>   It may seem a bit radical, but is there any reason not to modify the
> option-parsing machinery so that either '-' or '=' are treated interchangeably
> for /all/ options with joined arguments?  That is, whichever is specified in
> the .opt file, the parser accepts either?  

I like that idea.

-- 
Mark Mitchell
CodeSourcery
[EMAIL PROTECTED]
(650) 331-3385 x713


Re: expanding __attribute__((format,..))

2006-11-10 Thread Nuno Lopes

On Fri, 10 Nov 2006, Ian Lance Taylor wrote:


I kind of liked this idea:
http://gcc.gnu.org/ml/gcc-patches/2005-07/msg00797.html
but of course it was insane.


I still think a higher level state machine as described in the followups
is how things should be done.


wouldn't that be killing a mosquito with a bomb? :)  (unless of course we 
can find a simple description language)
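A simple description language need not be heavyweight. As a purely illustrative sketch in C (not a proposal for GCC's actual format tables), even a flat table mapping conversion characters to expected argument types moves most of the knowledge out of code and into data:

```c
#include <string.h>

/* Declarative description: one row per conversion specifier.  */
struct fmt_desc {
    char spec;              /* conversion character, e.g. 'd' */
    const char *arg_type;   /* type the matching argument must have */
};

static const struct fmt_desc printf_table[] = {
    { 'd', "int" },
    { 's', "char *" },
    { 'f', "double" },
};

/* Look up the expected argument type for a specifier; NULL if unknown.
   A real checker would layer length modifiers and flags on top.  */
static const char *expected_type(char spec) {
    for (size_t i = 0; i < sizeof printf_table / sizeof *printf_table; i++)
        if (printf_table[i].spec == spec)
            return printf_table[i].arg_type;
    return NULL;
}
```

The hard part Joseph raises below is orthogonal to the table itself: the diagnostics produced from such data still have to reach xgettext as full sentences.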




The first step (or the first hundred steps) would need to be a series of
small incremental patches moving all the existing logic about format
string structure from the code into more generic datastructures.

In so doing you need to consider how xgettext could be made to extract a
superset of the possible diagnostic sentences so that i18n for the format
checking messages can work properly again (which requires that full
sentences be passed to xgettext and be known by the translators, while
maintainability of the format descriptions requires that the information
about valid combinations be maintained at a different level more like
that used by the present datastructures).

Once the datastructures are suitably general, then interfaces to them can
be considered.


Can I do anything to help?  I mean, can you point me to the files and what 
I should do in order to make this move forward?  (The most I've done was a 
few little patches to make a customized cross-compiler for a MIPS robot, so 
I'm not very familiar with the code...)



Regards,
Nuno 



Re: Canonical type nodes, or, comptypes considered harmful

2006-11-10 Thread Mark Mitchell
Ian Lance Taylor wrote:

> This assumes, of course, that we can build an equivalence set for
> types.  I think that we need to make that work in the middle-end, and
> force the front-ends to conform.  As someone else mentioned, there are
> horrific cases in C like a[] being compatible with both a[5] and a[10]
> but a[5] and a[10] not being compatible with each other, and similarly
> f() is compatible with f(int) and f(float) but the latter two are not
> compatible with each other. 

I don't think these cases are serious problems; they're compatible
types, not equivalent types.  You don't need to check compatibility as
often as equivalence.  Certainly, in the big C++ test cases, Mike is
right that templates are the killer, and there you're generally
testing equivalence.

So, if you separate same_type_p from compatible_type_p, and make
same_type_p fast, then that's still a big win.
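Mark's compatible-but-not-equivalent point can be illustrated with a toy model of C array types in C itself (hypothetical representation, not GCC's): int[] is compatible with both int[5] and int[10], yet int[5] and int[10] are neither compatible nor the same, so compatibility is not even transitive.

```c
/* Toy model of a C array type: element type id plus length,
   where len == 0 stands for an incomplete bound, i.e. int[].  */
struct atype {
    int elem;   /* element type id */
    int len;    /* 0 means unknown bound */
};

/* same_type_p: strict equivalence -- here, node identity.  This is
   the fast check templates need.  */
static int same_type_p(const struct atype *a, const struct atype *b) {
    return a == b;
}

/* compatible_type_p: the looser C notion.  An incomplete array is
   compatible with any completion of it.  */
static int compatible_type_p(const struct atype *a, const struct atype *b) {
    if (a->elem != b->elem)
        return 0;
    return a->len == 0 || b->len == 0 || a->len == b->len;
}
```

Because compatible_type_p is not an equivalence relation, it cannot be answered by comparing canonical pointers; only same_type_p can.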

-- 
Mark Mitchell
CodeSourcery
[EMAIL PROTECTED]
(650) 331-3385 x713


Re: Canonical type nodes, or, comptypes considered harmful

2006-11-10 Thread Gabriel Dos Reis
Mike Stump <[EMAIL PROTECTED]> writes:

| On Nov 9, 2006, at 11:09 PM, Ian Lance Taylor wrote:
| > I meant something very simple: for every type, there is a
| > TYPE_CANONICAL field.  This is how you tell whether two types are
| > equivalent:
| > TYPE_CANONICAL (a) == TYPE_CANONICAL (b)
| 
| Ah, yes, that would work.  Hum, so simple, why was I thinking
| something was not going to work about it.  There are advantages to
| real-time conversations...  anyway, can't think of any down sides
| right now except for the obvious, this is gonna eat 1 extra pointer
| per type.

That is what we use in our representation, so if you find something
seriously wrong with it I'm highly interested.

As for the extra pointer: we use C++ to represent this whole stuff,
and it uses conventional object orientation combined with
non-conventional C++ templates.  Consequently we do not always
store the pointers for the canonical types.  For example, built-in
types are their own canonical types, so we just return "*this".

For typedefs (and general aliases, e.g. namespace aliases), we store
pointers to the canonical type of the aliasee.
For classes and enums, we return the pointer to the "class
expression" (when present), etc.

-- Gaby


Re: subreg transformation causes incorrect post_inc

2006-11-10 Thread TabonyEE
From: Mark Shinwell <[EMAIL PROTECTED]>
> [EMAIL PROTECTED] wrote:
> > My port, based on (GCC) 4.2.0 20061002 (experimental), is producing
> > incorrect code for the following test case:
> [snip]
> 
> I've only had a very quick look at your code, but I have a feeling that
> this is an instance of the kind of slip-up with
> GO_IF_MODE_DEPENDENT_ADDRESS that my patch at
> http://gcc.gnu.org/ml/gcc-patches/2006-08/msg00858.html is
> aimed at preventing.  (This patch is currently only applied to the
> addrmodes branch.)
> 
> Mark

Hhmm.  Is the intent of your patch simply to prevent the mistake of
backends not defining GO_IF_MODE_DEPENDENT_ADDRESS properly?  My backend
checks for POST_INC and POST_DEC in GO_IF_MODE_DEPENDENT_ADDRESS.

Charles J. Tabony
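For readers following along, the check Charles describes usually has this shape (a hedged sketch over stand-in codes, not any real backend's header): the macro jumps to LABEL when an address would change meaning if the access mode changed, which is exactly the auto-increment case, since the increment amount of a post_inc is the size of the mode being accessed.

```c
/* Stand-ins for the relevant RTL address codes.  */
enum rtx_code { REG, PLUS, POST_INC, POST_DEC, PRE_INC, PRE_DEC };

/* An auto-increment address is mode-dependent: narrowing the access
   (as combine's subreg rewrite did above) changes the increment.  */
static int mode_dependent_address_p(enum rtx_code code) {
    switch (code) {
    case POST_INC: case POST_DEC: case PRE_INC: case PRE_DEC:
        return 1;
    default:
        return 0;
    }
}

/* The GO_IF_MODE_DEPENDENT_ADDRESS-style wrapper a backend header
   would define in terms of the predicate above.  */
#define GO_IF_MODE_DEPENDENT_ADDRESS(CODE, LABEL) \
    do { if (mode_dependent_address_p(CODE)) goto LABEL; } while (0)
```

If a backend defines this correctly and the subreg still gets folded into the mem, that points at the middle end ignoring the macro, which is the class of bug Mark's patch targets.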


Threading the compiler

2006-11-10 Thread Mike Stump
We're going to have to think seriously about threading the compiler.
Intel predicts 80 cores in the near future (5 years).  See
http://hardware.slashdot.org/article.pl?sid=06/09/26/1937237&from=rss
To use this many cores for a single compile, we have to find ways to
split the work.  The best way, of course, is to have make -j80 do that
for us; this usually results in excellent efficiencies and an ability
to use as many cores as there are jobs to run.  However, for the
edit, compile, debug cycle of development, utilizing many cores is
harder.


To get compile speed in this type of case, we will need to start
thinking about segregating data and work out into hunks.  Today, I
already have a need for 4-8 hunks.  That puts me 4x to 8x slower than
I'd like to be.  8x slower, well, just hurts.


The competition is already starting to make progress in this area.

I think it is time to start thinking about it for gcc.

We don't want to spend time in locks or spinning, and we don't want to
litter our code with such things.  So, if we form areas that are fairly
well isolated and independent and then have a manager manage the
compilation process, only the manager has to know about and deal
with such issues.  The rules would be something like: while working
in a hunk, you'd only have access to data from your own hunk and
global shared read-only data.


The hope is that we can farm compilation of different functions out
into different cores.  All global state updates would be fed back to
the manager, and then the manager could farm out the results into
hunks, and so on until done.  I think we can also split lexing out
into a hunk.  We can have the lexer give hunks of tokens to the
manager to feed on to the parser, and have the parser feed hunks
of work to do back to the manager, and so on.
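The manager/hunk split can be prototyped with an ordinary thread pool; here is a toy sketch in C11 (nothing GCC-specific, all names invented): workers claim whole "functions" from an atomic counter, touch only their own result slot plus read-only shared data, and the manager just creates and joins the workers.

```c
#include <pthread.h>
#include <stdatomic.h>

#define NFUNCS 64
#define NWORKERS 8

static int results[NFUNCS];   /* one slot per function: no sharing */
static atomic_int next_func;  /* the manager's work counter */

/* "Compile" one function: purely local work on its own hunk.  */
static int compile_one(int f) { return f * 2 + 1; }

static void *worker(void *arg) {
    (void)arg;
    for (;;) {
        int f = atomic_fetch_add(&next_func, 1);
        if (f >= NFUNCS)
            return NULL;
        results[f] = compile_one(f);  /* writes only its own slot */
    }
}

/* The manager: farm out all functions, then wait for the workers.  */
static void compile_all(void) {
    pthread_t tid[NWORKERS];
    for (int i = 0; i < NWORKERS; i++)
        pthread_create(&tid[i], NULL, worker, NULL);
    for (int i = 0; i < NWORKERS; i++)
        pthread_join(tid[i], NULL);
}
```

The only synchronization is the shared counter, which is the spirit of the proposal: no locks inside a hunk, all coordination funneled through the manager.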


How many hunks do we need?  Well, today I want 8 for 4.2 and 16 for
mainline; each release, just 2x more.  I'm assuming nice, equal sized
hunks.  For larger variations in hunk size, I'd need even more hunks.


Or, so that is just an off the cuff proposal to get the discussion  
started.


Thoughts?


Re: Threading the compiler

2006-11-10 Thread H. J. Lu
On Fri, Nov 10, 2006 at 12:38:07PM -0800, Mike Stump wrote:
> How many hunks do we need?  Well, today I want 8 for 4.2 and 16 for
> mainline; each release, just 2x more.  I'm assuming nice, equal sized
> hunks.  For larger variations in hunk size, I'd need even more hunks.
> 
> Or, so that is just an off the cuff proposal to get the discussion  
> started.
> 
> Thoughts?

Will using C++ help or hurt compiler parallelism? Does it really matter?


H.J.


RE: How to create both -option-name-* and -option-name=* options?

2006-11-10 Thread Dave Korn
On 10 November 2006 20:06, Mark Mitchell wrote:

> Dave Korn wrote:
> 
>>   It may seem a bit radical, but is there any reason not to modify the
>> option-parsing machinery so that either '-' or '=' are treated
>> interchangeably for /all/ options with joined arguments?  That is,
>> whichever is specified in the .opt file, the parser accepts either?
> 
> I like that idea.


  Would it be a suitable solution to just provide a specialised wrapper around
the two strncmp invocations in find_opt?  It seems ok to me; we only want this
change to affect comparisons.  We call whichever form is listed in the .opt
file the canonical form, and just don't worry if the (canonical) way a flag is
reported in an error message doesn't quite match when the non-canonical form
was used on the command line.

  (I'm not even going to mention the 'limitation' that we are now no longer
free to create -fFLAG=VALUE and -fFLAG-VALUE options with different meanings!)
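A wrapper of the kind suggested could look like this (a hypothetical helper, not the actual opts.c code): an strncmp variant that treats '-' and '=' as the same character, so "-ffoo=bar" matches an option declared as "-ffoo-bar" and vice versa.

```c
#include <stddef.h>

/* strncmp variant that treats '-' and '=' as equal, so a command-line
   option matches its .opt entry regardless of which joiner was typed.  */
static int opt_strncmp(const char *a, const char *b, size_t n) {
    for (size_t i = 0; i < n; i++) {
        char ca = a[i], cb = b[i];
        if (ca == '=') ca = '-';
        if (cb == '=') cb = '-';
        if (ca != cb)
            return (unsigned char)ca - (unsigned char)cb;
        if (ca == '\0')
            return 0;  /* both strings ended together */
    }
    return 0;
}
```

Substituting this for the plain strncmp calls in find_opt would implement exactly the behaviour above: the .opt spelling stays canonical, only matching is relaxed.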


cheers,
  DaveK
-- 
Can't think of a witty .sigline today



Re: How to create both -option-name-* and -option-name=* options?

2006-11-10 Thread Mark Mitchell
Dave Korn wrote:
> On 10 November 2006 20:06, Mark Mitchell wrote:
> 
>> Dave Korn wrote:
>>
>>>   It may seem a bit radical, but is there any reason not to modify the
>>> option-parsing machinery so that either '-' or '=' are treated
>>> interchangeably for /all/ options with joined arguments?  That is,
>>> whichever is specified in the .opt file, the parser accepts either?
>> I like that idea.
> 
> 
>   Would it be a suitable solution to just provide a specialised wrapper around
> the two strncmp invocations in find_opt? 

FWIW, that seems reasonable to me, but I've not looked hard at the code
to be sure that's technically 100% correct.  It certainly seems like the
right idea.

-- 
Mark Mitchell
CodeSourcery
[EMAIL PROTECTED]
(650) 331-3385 x713


9 Nov 06 notes from GCC improvement for Itanium conference call

2006-11-10 Thread Mark K. Smith
ON THE CALL: Kenneth Zadeck (NaturalBridge), Diego Novillo (Red Hat),
Vladimir Makarov (Red Hat), Mark Smith (Gelato), Bob Kidd (UIUC),
Andrey Belevantsev (RAS), Arutyun Avetisyan (RAS), Mark Davis (Intel),
Sebastian Pop (Ecole des Mines de Paris)

Agenda:
1) Gelato ICE April GCC track proposed content (Mark S.)
2) GCC4.2 & GCC 4.3 update, alias analysis update (Diego)
3) Scheduler work update, potential new software pipelining project
(Andrey)
4) LTO update (Kenny)
5) Superblock work update (Bob)

##
1) Gelato ICE April GCC track proposed content (Mark S.)

The content for the GCC track at the upcoming Gelato ICE San Jose
(April 15-18) conference was proposed (still need to confirm with
several speakers):

 - ISP-Russian Academy of Sciences: update on scheduler work, discuss
progress on new software pipelining work
 - Martin Michlmayr: compiling Debian using GCC 4.2 and Osprey
 - Shin-ming Liu: HP GCC and Osprey update
 - Kenneth Zadeck - update on LTO
 - Bob Kidd - update on superblock work
 - Zdenek Dvorak - update on prefetching work
 - Diego Novillo - update on alias analysis work
 - Matthieu Delahaye - update on Gelato GCC build farm
 - Dan Berlin - GPL2 and GPL3 presentation

##
2) GCC4.2 & GCC 4.3 update, alias analysis update (Diego)

GCC 4.2 & 4.3 update:
-
4.2 has branched.  Likely release in early 2007.

Many major pieces of work being scheduled for GCC 4.3 (SSA across the
callgraph, overhaul dataflow in the backend, overhaul SSA form for
memory, reduce memory footprint in the IL, autoparallelization, new
vectorization, new interprocedural optimizations, etc).

The full list is at http://gcc.gnu.org/wiki/GCC_4.3_Release_Planning

Alias analysis update: 
--
no changes to analysis, representation of aliasing is being modified
for 4.3

##
3) Scheduler work update, potential new software pipelining project
(Andrey)

We have merged all major features of selective scheduling and are
tuning it for Itanium.  We use a set of small benchmarks to analyze
the performance of the scheduler. At this moment we are neutral on
half of the benchmarks; we get a 3% speedup on linpack and a 5% speedup on
mgrid. We have fixed all of the >1% regressions except dhrystone, which
regresses by 4% due to alignment issues. Most of the bugs we have
fixed arise because the bundling and instruction choosing mechanisms are
tightly coupled with the Haifa scheduler, and we need to support both
schedulers at the same time. We plan to proceed with tuning and
implement the driver for software pipelining over the next month.

We also plan to fix swing modulo scheduling to make it work on ia64
and improve it by propagating data dependency information to RTL. We
plan to discuss this project on the GCC mailing list in a few weeks.

Comments by Vladimir:
-
About software pipelining for Itanium: it is completely broken; more
accurately, it never worked for Itanium. Because it is a very important
(probably the most important) optimization for Itanium, after making
software pipelining work it should be switched on by default, at least
for Itanium, to keep it working.  GCC has a lot of optimizations which
are not on by default, and they have a tendency to break over time.

About insn bundling and insn scheduler hooks for Itanium: usually
the scheduler hooks are very few lines.  This is not the case for
Itanium.  I believe there is potential for generating better quality
code by improving the bundling and hooks.  Unfortunately, the code is
"spaghetti" code which is hard to understand.

##
4) LTO update (Kenny)

The LTO branch has made some progress: there is work underway by Mark
Mitchell, Sandra Loosemore, and myself to serialize tree codes and
the declarations and types into sections inside the .o files.  I have
stopped working on this temporarily while I complete the work on the
dataflow branch. 

I expect that after the first of the year there will be a lot of
progress on this.  We should soon be able to serialize code and
compile it at link time.  This will still leave many problems open for
others to help with, including more aggressive optimizations, as well
as providing some mechanism for distributing/parallelizing the
compilation.

##
5) Superblock work update (Bob)

I'm merging mainline into the ia64-improvements branch. As soon as
that is finished, I will run a regression on the Superblock pass and
prepare a patch to submit to gcc-patches. This work has been on the
back burner for the past couple weeks due to an upcoming paper
deadline.

I'm writing a paper documenting the changes we made to IMPACT's
intermediate representation to allow interprocedural analysis to be
performed more easily. We were able to extend IMPACT's IR from one
stored completely in memory to one that can be stored partially in
memory and partially on disk. This allows us to reduce the memory
requirements for the compiler when processing modern, large programs.
This paper will lik

Re: How to create both -option-name-* and -option-name=* options?

2006-11-10 Thread Brooks Moses

Dave Korn wrote:

On 10 November 2006 20:06, Mark Mitchell wrote:

Dave Korn wrote:

 It may seem a bit radical, but is there any reason not to modify the
option-parsing machinery so that either '-' or '=' are treated
interchangeably for /all/ options with joined arguments?  That is,
whichever is specified in the .opt file, the parser accepts either?


I like that idea.


  Would it be a suitable solution to just provide a specialised wrapper around
the two strncmp invocations in find_opt?  It seems OK to me: we only want this
change to affect comparisons, so we call whichever form is listed in the .opt
file the canonical form and just don't worry if the (canonical) way a flag is
reported in an error message doesn't quite match when the non-canonical form
was used on the command line.


I would think that would be suitable, certainly.  Having the error 
message report the canonical form would, to me, just be a beneficial 
small reminder to people to use the canonical form.



  (I'm not even going to mention the 'limitation' that we are now no longer
free to create -fFLAG=VALUE and -fFLAG-VALUE options with different meanings!)


But that's already not possible -- that's essentially how I got into 
this problem in the first place.  If one tries to define both of those, 
the declaration of the enumeration-type holding the option flags breaks, 
so you can't do that.


(Well, you could hack that to make it work; define -fFLAG as the option 
name, so that the '-' or '=' is the first character of the argument. 
That will still work, but it's a pain if VALUE is otherwise a UInteger.)


This does raise a point about how the options are compared, though -- to 
be useful, this needs to also handle cases where a Joined option is 
emulated by a "normal" option.  For instance, Fortran's lang.opt 
contains something like:


  -ffixed-line-length-none
  Fortran

  -ffixed-line-length-
  Fortran Joined

We would also want "-ffixed-line-length=none" to be handled 
appropriately, which makes this a bit trickier than just handling the 
last character of Joined options.


Are there any meaningful downsides to just having the option-matcher 
treat all '-' and '=' values in the option name as equivalent?  It would 
mean that we'd also match "-ffixed=line=length-none", for instance, but 
I don't think that causes any real harm.


An alternative would be to specify that an '=' in the name in the .opt 
file will match either '=' or '-' on the command line.  This does 
require that the canonical form be the one with '=' in it, and means 
that things with '-' in them need to be changed in the .opt file to 
accept both, but the benefit is that it can accept pseudo-Joined options 
in either form without accepting all sorts of weird things with random
'='s in them.


- Brooks



Re: Threading the compiler

2006-11-10 Thread Mike Stump

On Nov 10, 2006, at 12:46 PM, H. J. Lu wrote:

Will use C++ help or hurt compiler parallelism? Does it really matter?


I'm not an expert, but, in the simple world I want, I want it to not  
matter in the least.  For the people writing most code in the  
compiler, I want clear simple rules for them to follow.


For example, google uses mapreduce http://labs.google.com/papers/ 
mapreduce.html as a primitive, and there are a few experts that  
manage that code, and everyone else just mindlessly uses it.  The  
rules are explained to them, and they just follow the rules and it  
just works.  No locking, no atomic, no volatile, no clever lock-free
code, no algorithmic changes (other than decomposing into isolated
composable parts).  I'd like something similar for us.


Re: Threading the compiler

2006-11-10 Thread Sohail Somani
On Fri, 2006-11-10 at 12:46 -0800, H. J. Lu wrote:
> On Fri, Nov 10, 2006 at 12:38:07PM -0800, Mike Stump wrote:
> > How many hunks do we need, well, today I want 8 for 4.2 and 16 for  
> > mainline, each release, just 2x more.  I'm assuming nice, equal sized  
> > hunks.  For larger variations in hunk size, I'd need even more hunks.
> > 
> > Or, so that is just an off the cuff proposal to get the discussion  
> > started.
> > 
> > Thoughts?
> 
> Will use C++ help or hurt compiler parallelism? Does it really matter?

My 2c.

I don't think it can possibly hurt as long as people follow normal C++
coding rules.

The main issue is not really language choice, though. The main issue
would likely be defining data that is isolated enough to be worked on
in parallel.

Lots of threads communicating a lot would be bad.

Sohail



Re: Threading the compiler

2006-11-10 Thread Sohail Somani
On Fri, 2006-11-10 at 13:31 -0800, Mike Stump wrote:
> On Nov 10, 2006, at 12:46 PM, H. J. Lu wrote:
> > Will use C++ help or hurt compiler parallelism? Does it really matter?
> 
> I'm not an expert, but, in the simple world I want, I want it to not  
> matter in the least.  For the people writing most code in the  
> compiler, I want clear simple rules for them to follow.
> 
> For example, google uses mapreduce http://labs.google.com/papers/ 
> mapreduce.html as a primitive, and there are a few experts that  
> manage that code, and everyone else just mindlessly uses it.  The  
> rules are explained to them, and they just follow the rules and it  
> just works.  No locking, no atomic, no volatile, no clever lock-free
> code, no algorithmic changes (other than decomposing into isolated
> composable parts).  I'd like something similar for us.

What parts could be done in parallel besides things that can be done by
make -j80? I would guess that certain tree transformations could be run
in parallel.

Do threads help for io-bound apps?

Thanks,

Sohail



Re: Threading the compiler

2006-11-10 Thread Marcin Dalecki


On 2006-11-10, at 21:46, H. J. Lu wrote:


On Fri, Nov 10, 2006 at 12:38:07PM -0800, Mike Stump wrote:

How many hunks do we need, well, today I want 8 for 4.2 and 16 for
mainline, each release, just 2x more.  I'm assuming nice, equal sized
hunks.  For larger variations in hunk size, I'd need even more hunks.

Or, so that is just an off the cuff proposal to get the discussion
started.

Thoughts?


Will use C++ help or hurt compiler parallelism? Does it really matter?


It should be helpful, because it seriously helps in keeping the
semantic scope of data items at bay.

Marcin Dalecki




Re: strict aliasing question

2006-11-10 Thread Mike Stump

On Nov 10, 2006, at 9:48 AM, Howard Chu wrote:

Richard Guenther wrote:

If you compile with -O3 -combine *.c -o alias it will break.


Thanks for pointing that out. But that's not a realistic danger for  
the actual application. The accessor function is always going to be  
in a library compiled at a separate time. The call will always be  
from a program built at a separate time, so -combine isn't a factor.


We are building a compiler to outsmart you.  We are presently working
on technology (google "LTO") to break your code.  :-)  Don't cry when
we turn it on by default and it does.  I'd recommend understanding
the rules and following them.


Re: Threading the compiler

2006-11-10 Thread Marcin Dalecki


On 2006-11-10, at 22:33, Sohail Somani wrote:


On Fri, 2006-11-10 at 12:46 -0800, H. J. Lu wrote:

On Fri, Nov 10, 2006 at 12:38:07PM -0800, Mike Stump wrote:

How many hunks do we need, well, today I want 8 for 4.2 and 16 for
mainline, each release, just 2x more.  I'm assuming nice, equal  
sized
hunks.  For larger variations in hunk size, I'd need even more  
hunks.


Or, so that is just an off the cuff proposal to get the discussion
started.

Thoughts?


Will use C++ help or hurt compiler parallelism? Does it really  
matter?


My 2c.

I don't think it can possibly hurt as long as people follow normal C++
coding rules.


Contrary to C, there is no single general coding style for C++.  In
fact, for a project of such a scale this may indeed be the most
significant deployment problem for C++.



Lots of threads communicating a lot would be bad.


This simply isn't true.  The compiler would be fine having many
threads handling a lot of data between them in a pipelined way.  In
fact it already does just that, however without using the opportunity
for parallel execution.

Marcin Dalecki




Re: Threading the compiler

2006-11-10 Thread Basile STARYNKEVITCH
Le Fri, Nov 10, 2006 at 01:33:42PM -0800, Sohail Somani écrivait/wrote:

> I don't think it can possibly hurt as long as people follow normal C++
> coding rules.
> 
> The main issue is not really language choice, though. The main issue
> would likely be defining data that is isolated enough to be worked on
> in parallel.

I see the following issues

First (once parsing is done) we could (at least in non-inter-procedural
phases & passes, which might be common, in particular at -O1 or maybe -O2)
handle in parallel different functions inside a C compilation unit.

Another trick (particularly in LTO) could be to store persistently some
internal representation for each function (within a compilation unit) and to
recall it if the compiler notices that a given function didn't change.

However, for multi-threading the compiler, a significant issue might be the
internal GCC garbage collector (I'm not sure it is easily multi-threadable).

I'm not familiar enough with actual gcc timings to be sure all this would
really speed up compilation time.

I'm not pretending to volunteer for the multi-threading effort, because I am
not yet familiar enough with GCC internals.

My 0.02 euros!

Regards.



-- 
Basile STARYNKEVITCH http://starynkevitch.net/Basile/ 
email: basilestarynkevitchnet 
mobile: +33 6 8501 2359 
8, rue de la Faïencerie, 92340 Bourg La Reine, France


Re: expanding __attribute__((format,..))

2006-11-10 Thread Mike Stump


On Nov 10, 2006, at 9:14 AM, Ian Lance Taylor wrote:


"Nuno Lopes" <[EMAIL PROTECTED]> writes:


I've been thinking that it would be a good idea to extend the current
__attribute__((format,..)) to use an arbitrary user callback.
I searched the mailing list archives and I found some references to
similar ideas. So do you think this is feasible?


I think it would be nice.  We usually founder


I think that a 20% solution would handle 95% of the cases.  :-)

__attribute((extra_formats, "AB"))

for example.  Easy to use, easy to describe, handles things well
enough to keep people happy for 10 years.  The next version after
this would be comprehensive enough to handle describing the values
and the types, the checking rules, and the warning/error messages to
generate.


Re: Planned LTO driver work

2006-11-10 Thread Mike Stump

On Nov 9, 2006, at 11:37 PM, Mark Mitchell wrote:
It might be that we should move the invocation of the real linker  
back into gcc.c, so that collect2's job just becomes


Or move all of collect2 back into gcc.c.  There isn't a reason for it  
being separate any longer.


Re: Question on tree-nested.c:convert_nl_goto_reference

2006-11-10 Thread Richard Kenner
> But I do get a failure in verify_flow_info with the appended test case.

Indeed that's where I get the ICE.

> verify_flow_info is only used when checking is enabled, so
> maybe that is why people aren't seeing it?  

But isn't that the default on the trunk?


Re: Threading the compiler

2006-11-10 Thread Kevin Handy

Mike Stump wrote:

...


Thoughts?



Raw thoughts:

1. Threading isn't going to help for I/O bound portions.

2. The OS should already be doing some of the work of threading.
 Some 'parts' of the compiler should already be using CPUs: 'make',
 the front-end (gcc) command, the language compiler, the assembler,
 linker, etc.

3. The OS will likely be using some of the CPUs for its own purposes:
 I/O prefetch, display drivers, sound, etc. (and these processes will
 probably increase over time as the OS vendors get used to them
  being available). Different machines will also have differing numbers
  of CPUs. Old systems will still have one or two cores; some multi-core
  systems may have 160. What will the multi-core compiler design do to
  the old processors (extreme slowness?)

4. Will you "serialize" error messages so that two compiles of a file
  will always display the errors in the same order? Also, will the object
  files created be the same between compiles?

5. Will more "heavy" optimizations be available? i.e. Will the multi-core
 speed things up enough that really hard optimizations (speed wise)
 become reasonable?



Re: Abt long long support

2006-11-10 Thread 'Rask Ingemann Lambertsen'
On Fri, Nov 10, 2006 at 07:11:34PM -, Dave Korn wrote:

>   No, surely you don't want to do that!  You really need a movdi pattern -
> even more so if there are no natural DImode-sized registers, as gcse can get
> terribly confused by bad reg_equal notes if you don't.  See e.g.:
> 
> http://gcc.gnu.org/ml/gcc/2003-04/msg01397.html
> http://gcc.gnu.org/ml/gcc/2004-06/msg00993.html

   PR number?

-- 
Rask Ingemann Lambertsen


Re: strict aliasing question

2006-11-10 Thread Howard Chu

Mike Stump wrote:

On Nov 10, 2006, at 9:48 AM, Howard Chu wrote:

Richard Guenther wrote:

If you compile with -O3 -combine *.c -o alias it will break.


Thanks for pointing that out. But that's not a realistic danger for 
the actual application. The accessor function is always going to be 
in a library compiled at a separate time. The call will always be 
from a program built at a separate time, so -combine isn't a factor.


We are building a compiler to outsmart you.  We are presently working
on technology (google "LTO") to break your code.  :-)  Don't cry when
we turn it on by default and it does.  I'd recommend understanding the
rules and following them.


Heh heh. Looking forward to using that. Google further back and you'll 
see that I did link time optimization with gcc 1.4 for m68k/Atari, 
almost 20 years ago. More power to you. (Why in my day, we had to carry 
bitbuckets twenty miles uphill, BOTH DIRECTIONS!)


As for following the rules, I didn't define the SASL API. It strikes me 
that (void **) is pretty unfriendly as an argument type. While it's easy 
to make the warning go away with a union, that doesn't actually 
guarantee that the memory being pointed to will be in a defined state.


With the previous example, if alias1.c was instead:


#include <stdio.h>

extern void getit( void **arg );

main() {
   union {
   int *foo;
   void *bar;
   } u;

   getit( &u.bar );
   printf("foo: %x\n", *u.foo);
}


gcc no longer complains, but according to the spec, the code is not any 
more reliable.


On the other hand, I don't see any good reason for an optimizer to break 
this code. The compiler knows absolutely that the two pointers have 
identical values and therefore point to the same piece of memory. There 
is no way it can legitimately squash out any loads or stores here - it's 
not executing in a loop, where a prior load may have already fetched the 
data.


--
 -- Howard Chu
 Chief Architect, Symas Corp.  http://www.symas.com
 Director, Highland Sun   http://highlandsun.com/hyc
 OpenLDAP Core Team   http://www.openldap.org/project/



Re: Handling of extern inline in c99 mode

2006-11-10 Thread Hallvard B Furuseth
I'm not subscribed to this list; I just noticed this discussion
while browsing around.  I don't know if the list accepts
non-subscriber messages either, but let's see:


Ian Lance Taylor wrote:
> codesearch.google.com finds about 6000 uses of "extern inline" in
>   code written in C, but the search
>   inline -static -extern -# lang:c file:\.c$
>   finds only 100 occurrences (...)

Because you don't search for "inline" declarations with no "static" nor
"extern", but files with "inline" which contain no "static" nor "extern"
_anywhere_ in the file, if I understand codesearch correctly.


One wish for whatever happens with "inline":

Please document what #if tests one should put in a portable (non-GNU:-)
program in order to (a) get the intended operation of gcc 'inline' and
(b) not drown the program's users in warning messages.

In this regard, 'inline' which behaves differently with -std=c99 and
gnu99 will make for a more complicated test.  So will introducing the
change - even just the default warning - in many branches at once.
A new -Wno-inline-warning option would not help either, since older
gcc versions will complain about the new option.

Maybe you should #define __gcc_gnu_inline__ and __gcc_c99_inline__
as the proper attribute/keyword so that a program can #ifdef on them.


I wonder what "-pedantic" should do about "inline"?  I've seen many
people use "-pedantic" without "-std"/"-ansi", because on many
systems the latter break some header files.

-- 
Regards,
Hallvard


RE: How to create both -option-name-* and -option-name=* options?

2006-11-10 Thread Dave Korn
On 10 November 2006 21:18, Brooks Moses wrote:

> Dave Korn wrote:

> But that's already not possible -- that's essentially how I got into
> this problem in the first place.  If one tries to define both of those,
> the declaration of the enumeration-type holding the option flags breaks,
> so you can't do that.

  That aside, it would have been possible before, and the mangling could
easily have been fixed to support it had we wanted to.

> (Well, you could hack that to make it work; define -fFLAG as the option
> name, so that the '-' or '=' is the first character of the argument.
> That will still work, but it's a pain if VALUE is otherwise a UInteger.)

  Yeh, but it's also the right thing to do with the machinery as it stands.
 
> This does raise a point about how the options are compared, though -- to
> be useful, this needs to also handle cases where a Joined option is
> emulated by a "normal" option.  For instance, Fortran's lang.opt
> contains something like:
> 
>-ffixed-line-length-none
>Fortran
> 
>-ffixed-line-length-
>Fortran Joined
> 
> We would also want "-ffixed-line-length=none" to be handled
> appropriately, which makes this a bit trickier than just handling the
> last character of Joined options.
> 
> Are there any meaningful downsides to just having the option-matcher
> treat all '-' and '=' values in the option name as equivalent?  It would
> mean that we'd also match "-ffixed=line=length-none", for instance, but
> I don't think that causes any real harm.

  I think it's horribly ugly!  (Yes, this would not be a show-stopper in
practice; I have a more serious reason to object, read on...)

> An alternative would be to specify that an '=' in the name in the .opt
> file will match either '=' or '-' on the command line.  This does
> require that the canonical form be the one with '=' in it, and means
> that things with '-' in them need to be changed in the .opt file to
> accept both, but the benefit is that it can accept pseudo-Joined options
> in either form without accepting all sorts of wierd things with random
> '='s in them.

  I think that for this one case we should just say that you have to supply
both forms -ffixed-line-length-none and -ffixed-line-length=none.

  What you have here is really a joined option that has an argument that can
be either a text field or an integer, and to save the trouble of parsing the
field properly you're playing a trick on the options parser by specifying
something that looks to the options machinery like a longer option with a
common prefix, but looks to the human viewer like the same option with a text
rather than integer parameter joined.

  Treating a trailing '-' as also matching a '=' (and vice-versa) doesn't blur
the boundary between what are separate concepts in the option parsing
machinery.  I think if you really want these pseudo-joined fields, add support
to the machinery to understand that the joined field can be either a string or
a numeric.

  The change I'm proposing is kind of orthogonal to that.  It solves your
problem with the enum; there becomes only one enum to represent both forms and
both forms are accepted and parse to that same enumerated value.  It does not
solve nor attempt to address your other problem, with the limitations on
parsing joined fields, and I don't think we should try and bend it into shape
to do this second job as well.

  If you address the parsing limitation on joined fields, the flexibility that
my suggestion offers /will/ automatically be available to your usage.


cheers,
  DaveK
-- 
Can't think of a witty .sigline today



gcc-4.1-20061110 is now available

2006-11-10 Thread gccadmin
Snapshot gcc-4.1-20061110 is now available on
  ftp://gcc.gnu.org/pub/gcc/snapshots/4.1-20061110/
and on various mirrors, see http://gcc.gnu.org/mirrors.html for details.

This snapshot has been generated from the GCC 4.1 SVN branch
with the following options: svn://gcc.gnu.org/svn/gcc/branches/gcc-4_1-branch 
revision 118667

You'll find:

gcc-4.1-20061110.tar.bz2  Complete GCC (includes all of below)

gcc-core-4.1-20061110.tar.bz2 C front end and core compiler

gcc-ada-4.1-20061110.tar.bz2  Ada front end and runtime

gcc-fortran-4.1-20061110.tar.bz2  Fortran front end and runtime

gcc-g++-4.1-20061110.tar.bz2  C++ front end and runtime

gcc-java-4.1-20061110.tar.bz2 Java front end and runtime

gcc-objc-4.1-20061110.tar.bz2 Objective-C front end and runtime

gcc-testsuite-4.1-20061110.tar.bz2  The GCC testsuite

Diffs from 4.1-20061103 are available in the diffs/ subdirectory.

When a particular snapshot is ready for public consumption the LATEST-4.1
link is updated and a message is sent to the gcc list.  Please do not use
a snapshot before it has been announced that way.


Re: C++: Implement code transformation in parser or tree

2006-11-10 Thread Mark Mitchell
Sohail Somani wrote:

> struct __some_random_name
> {
> void operator()(int & t){t++;}
> };
> 
> for_each(b,e,__some_random_name());
> 
> Would this require a new tree node like LAMBDA_FUNCTION or should the
> parser do the translation? In the latter case, no new nodes should be
> necessary (I think).

Do you need new class types, or just an anonymous FUNCTION_DECL?

-- 
Mark Mitchell
CodeSourcery
[EMAIL PROTECTED]
(650) 331-3385 x713


Re: expanding __attribute__((format,..))

2006-11-10 Thread Joseph S. Myers
On Fri, 10 Nov 2006, Nuno Lopes wrote:

> > On Fri, 10 Nov 2006, Ian Lance Taylor wrote:
> > 
> > > I kind of liked this idea:
> > > http://gcc.gnu.org/ml/gcc-patches/2005-07/msg00797.html
> > > but of course it was insane.
> > 
> > I still think a higher level state machine as described in the followups
> > is how things should be done.
> 
> wouldn't that be killing a mosquito with a bomb? :)  (unless of course we can
> find a simple description language)

Format checking is complicated.  Over 5% of all the test
assertions in a gcc testsuite run are from format checking testcases.
Format checking is one of the most difficult parts of the compiler to get 
correct from an i18n perspective (i.e. having all complete sentences 
available for translation); everything else in the C front end apart from 
parse errors should be correct in that regard.

> Can I do anything to help? I mean, can you point me the files and what should
> I do in order to make this move forward? (the most I've made was a few little
> patches to make a customized cross-compiler to a mips robot, so I'm not very
> familiarized with the code...)

c-format.[ch].  Understand the logic in there as a whole.  Consider what 
aspects of information about format strings are embedded in the code and 
how you might improve the datastructures, one aspect at a time, to 
describe that aspect in data rather than code.  For verifying there are no 
unintended changes in the compiler's behavior, compare the exact 
diagnostic texts in gcc.log from test runs before and after each change.

-- 
Joseph S. Myers
[EMAIL PROTECTED]


Re: Threading the compiler

2006-11-10 Thread Sohail Somani
On Fri, 2006-11-10 at 22:49 +0100, Marcin Dalecki wrote:
> > I don't think it can possibly hurt as long as people follow normal C++
> > coding rules.
> 
> Contrary to C, there is no single general coding style for C++.  In
> fact, for a project of such a scale this may indeed be the most
> significant deployment problem for C++.

There isn't a single coding style, this is true. But there are styles
which are generally understood to be bad. 

> > Lots of threads communicating a lot would be bad.
> 
> This simply isn't true.  The compiler would be fine having many
> threads handling a lot of data between them in a pipelined way.  In
> fact it already does just that, however without using the opportunity
> for parallel execution.

What I meant by that statement was that in general, when there is a lot
of synchronization, race conditions happen if discipline is not applied.
Correct multi-threaded code is hard. I would submit that the last thing
you need are race conditions as a matter of course in a compiler because
someone forgot to lock resource A before B. Not saying anything about
the gcc developers in particular of course.

Aside: I think the RAII nature of C++ constructors/destructors is
helpful in locking code.

More 2c?

Sohail



Re: Threading the compiler

2006-11-10 Thread Mike Stump

On Nov 10, 2006, at 2:19 PM, Kevin Handy wrote:
What will the multi-core compiler design do to the old processors  
(extreme slowness?)


Roughly speaking, I want it to add around 1000 extra instructions per  
function compiled, in other words, nothing.  The compile speed will  
be what the compile speed is.  Now, I will caution, the world doesn't  
look kindly on people trying to bootstrap gcc on an 8 MHz m68k  
anymore, even though it might even be possible.  In 5 years, I'm  
gonna be compiling on an 80 or 160 way box.  :-)  Yeah, Intel  
promised.  If you're trying to compile on a single 1 GHz CPU, it's  
gonna be slow.  I don't want to make the compiler any slower, I  
want to make it faster, others will make use of the faster compiler,  
to make it slower, but that is orthogonal to my wanting to make it  
faster.


4. Will you "serialize" error messages so that two compiles of a  
file will always display the errors in the same order?


Yes, I think that messages should feed back into the manager, so that
the manager can `manage' things.  A stable, rational ordering for
messages makes sense.



Also, will the object  files created be the same between compiles.


Around here, we predicate life on determinism, you can pry that away  
from my cold dead fingers.  We might have to switch from L472 to  
L10.22 for internal labels for example.  This way, each thread can  
create infinite amounts of labels that don't conflict with other  
threads (functions).


5. Will more "heavy" optimizations be available? i.e. Will the  
multi-core

 speed things up enough that really hard optimizations (speed wise)
 become reasonable?


See my first paragraph.


Re: C++: Implement code transformation in parser or tree

2006-11-10 Thread Sohail Somani
On Fri, 2006-11-10 at 14:47 -0800, Mark Mitchell wrote:
> Sohail Somani wrote:
> 
> > struct __some_random_name
> > {
> > void operator()(int & t){t++;}
> > };
> > 
> > for_each(b,e,__some_random_name());
> > 
> > Would this require a new tree node like LAMBDA_FUNCTION or should the
> > parser do the translation? In the latter case, no new nodes should be
> > necessary (I think).
> 
> Do you need new class types, or just an anonymous FUNCTION_DECL?

Hi Mark, thanks for your reply.

In general it would be a new class. If the lambda function looks like:

void myfunc()
{

int a;

...<>(int i1,int i2) extern (a) {a=i1+i2}...

}

That would be a new class with an int reference (initialized to a) and
operator()(int,int).

Does that clarify?

Sohail



Re: Question on tree-nested.c:convert_nl_goto_reference

2006-11-10 Thread Ian Lance Taylor
[EMAIL PROTECTED] (Richard Kenner) writes:

> > But I do get a failure in verify_flow_info with the appended test case.
> 
> Indeed that's where I get the ICE.
> 
> > verify_flow_info is only used when checking is enabled, so
> > maybe that is why people aren't seeing it?  
> 
> But isn't that the default on the trunk?

Yes.  But it's not on releases, so non-developers are going to see it.
And I can't find any C test cases which detect the problem.  As far as
I can tell, in C the problem will only arise when a nested function
itself contains a nested function, and the inner nested function does
a non-local goto to the outer nested function.  That is, the test case
I posted earlier is about as simple as it gets.

I don't know whether there are any functions nested inside nested
functions which do non-local gotos in the Ada testsuite.

Ian


Re: How to create both -option-name-* and -option-name=* options?

2006-11-10 Thread Brooks Moses

Dave Korn wrote:

On 10 November 2006 21:18, Brooks Moses wrote:

But that's already not possible -- that's essentially how I got into
this problem in the first place.  If one tries to define both of those,
the declaration of the enumeration-type holding the option flags breaks,
so you can't do that.


  That aside, it would have been possible before, and the mangling could
easily have been fixed to support it had we wanted to.


Right, yeah -- my point was just that nobody _had_ fixed the mangling to 
support it, and thus that this was only eliminating a theoretical 
possibility rather than something someone might actually be doing, which 
means in practice it's not changing very much.



Are there any meaningful downsides to just having the option-matcher
treat all '-' and '=' values in the option name as equivalent?  It would
mean that we'd also match "-ffixed=line=length-none", for instance, but
I don't think that causes any real harm.


  I think it's horribly ugly!  (Yes, this would not be a show-stopper in
practice; I have a more serious reason to object, read on...)


I think it's horribly ugly, too -- but I don't see that the ugliness 
shows up anywhere unless some user is _intentionally_ doing something 
ugly; it just means that their ugly usage is rewarded by the compiler 
doing essentially what they expect, rather than throwing an error.



An alternative would be to specify that an '=' in the name in the .opt
file will match either '=' or '-' on the command line.  This does
require that the canonical form be the one with '=' in it, and means
that things with '-' in them need to be changed in the .opt file to
accept both, but the benefit is that it can accept pseudo-Joined options
> in either form without accepting all sorts of weird things with random
'='s in them.


  I think that for this one case we should just say that you have to supply
both forms -ffixed-line-length-none and -ffixed-line-length=none.


Which I would be glad to do, except that as far as I can tell, it's not 
possible to actually do that.  The same problem arises there as arises 
when it doesn't have "none" on the end and "Joined" in the specification.



  What you have here is really a joined option that has an argument that can
be either a text field or an integer, and to save the trouble of parsing the
field properly you're playing a trick on the options parser by specifying
something that looks to the options machinery like a longer option with a
common prefix, but looks to the human viewer like the same option with a text
rather than integer parameter joined.


Right, agreed.  Though it's not so much "to save the trouble" as "to be 
able to leverage all the useful things the option parser does to verify 
numeric fields".



  Treating a trailing '-' as also matching a '=' (and vice-versa) doesn't blur
the boundary between what are separate concepts in the option parsing
machinery.  I think if you really want these pseudo-joined fields, add support
to the machinery to understand that the joined field can be either a string or
a numeric.


Well, I'm not sure that I "want" them, exactly.  They're only in 
gfortran because we're supporting backwards compatibility going back to 
the very early days of g77.



  The change I'm proposing is kind of orthogonal to that.  It solves your
problem with the enum; there becomes only one enum to represent both forms and
both forms are accepted and parse to that same enumerated value.  It does not
solve nor attempt to address your other problem, with the limitations on
parsing joined fields, and I don't think we should try and bend it into shape
to do this second job as well.

  If you address the parsing limitation on joined fields, the flexibility that
my suggestion offers /will/ automatically be available to your usage.


Hmm.  Valid points.

And, given that adding support for both string and numeric values looks 
fairly easy (much more so than I would have guessed), that's probably 
the better way to go.  A UIntegerOrString property would be incompatible 
with the Var property, since it would need two variables for storing the 
result, but I think this is not a notable loss since the combination of 
Var and UInteger is already rare -- the only flag that uses them both is 
-fabi-version.


Or, given that the only thing that appears to use this at the moment is 
this old g77-style fixed-line-length Fortran option that we're only 
supporting for legacy purposes, I suppose we could just go for the 
cop-out of supporting the "-none" version and not the "=none" version, 
and only document it as accepting "=0".


- Brooks



Re: strict aliasing question

2006-11-10 Thread Howard Chu

Richard Guenther wrote:

On 11/10/06, Howard Chu <[EMAIL PROTECTED]> wrote:

I see a lot of APIs (e.g. Cyrus SASL) that have accessor functions
returning values through a void ** argument. As far as I can tell, this
doesn't actually cause any problems, but gcc 4.1 with -Wstrict-aliasing
will complain. For example, take these two separate source files:

alias1.c


#include <stdio.h>

extern void getit( void **arg );

main() {
int *foo;

getit( (void **)&foo);
printf("foo: %x\n", *foo);
}



alias2.c

static short x[] = {16,16};

void getit( void **arg ) {
*arg = x;
}


gcc -O3 -fstrict-aliasing -Wstrict-aliasing *.c -o alias

The program prints the expected result with both strict-aliasing and
no-strict-aliasing on my x86_64 box.  As such, when/why would I need to
worry about  this warning?


If you compile with -O3 -combine *.c -o alias it will break.


Hm, actually it still prints the correct result for me. What platform 
are you using where it actually makes a difference? Again, I don't see 
how it's possible for a correct code generator to get this wrong. The 
only way that can happen is if the compiler ignores the store of x into 
*arg.  Any compiler that did that would quite plainly be broken. It 
seems to be academic, since gcc produces the right result, regardless of 
optimization level.


--
 -- Howard Chu
 Chief Architect, Symas Corp.  http://www.symas.com
 Director, Highland Sun    http://highlandsun.com/hyc
 OpenLDAP Core Team    http://www.openldap.org/project/



Re: strict aliasing question

2006-11-10 Thread Joe Buck
On Fri, Nov 10, 2006 at 04:18:25PM -0800, Howard Chu wrote:
> Richard Guenther wrote:
> >If you compile with -O3 -combine *.c -o alias it will break.
> 
> Hm, actually it still prints the correct result for me. What platform 
> are you using where it actually makes a difference?

Rather, he is saying that, with those flags, it is possible that gcc
will do optimizations that break the code, but these optimizations
might show up only on some platforms, with some releases, under
some conditions.  You might luck out, but it is possible that a future
gcc will do an optimization that changes the meaning of the code.

The compiler is allowed to reason roughly as follows: "I have a copy of
foo in register R1.  foo is of type long.  There have been no writes,
since foo was loaded into R1, for any types compatible with type long.
Therefore the copy of foo in R1 is still good, so I don't have to reload
it from memory."  The C standard has rules of this form because, without
them, it can be hard to do decent loop optimization.  The reason Richard
mentioned the options he did was that they enable some optimization across
function boundaries, meaning that the compiler is more likely to see that
there can't have been any legal modifications to some objects.




Re: strict aliasing question

2006-11-10 Thread Howard Chu

Joe Buck wrote:

On Fri, Nov 10, 2006 at 04:18:25PM -0800, Howard Chu wrote:
  

Richard Guenther wrote:


If you compile with -O3 -combine *.c -o alias it will break.
  
Hm, actually it still prints the correct result for me. What platform 
are you using where it actually makes a difference?



Rather, he is saying that, with those flags, it is possible that gcc
will do optimizations that break the code, but these optimizations
might show up only on some platforms, with some releases, under
some conditions.  You might luck out, but it is possible that a future
gcc will do an optimization that changes the meaning of the code.
  


OK, that's fair.

The compiler is allowed to reason roughly as follows: "I have a copy of
foo in register R1.  foo is of type long.  There have been no writes,
since foo was loaded into R1, for any types compatible with type long.
Therefore the copy of foo in R1 is still good, so I don't have to reload
it from memory."  The C standard has rules of this form because, without
them, it can be hard to do decent loop optimization.  The reason Richard
mentioned the options he did was that they enable some optimization across
function boundaries, meaning that the compiler is more likely to see that
there can't have been any legal modifications to some objects.
  
I understand that logic, in the general case. In this specific example, 
none of those conditions apply. foo is an uninitialized local variable. 
Therefore the compiler cannot know that it has a valid copy of it in any 
register. In fact what it should know is that it has no valid copy of 
it. And of course, there are no loops to consider here, so that type of 
reload optimization isn't relevant. As such, the compiler has no choice 
but to do the right thing, and load the value from memory, thus getting 
the correct result. Which it does.


--
 -- Howard Chu
 Chief Architect, Symas Corp.  http://www.symas.com
 Director, Highland Sun    http://highlandsun.com/hyc
 OpenLDAP Core Team    http://www.openldap.org/project/



Re: Threading the compiler

2006-11-10 Thread Ross Ridge
Mike Stump writes:
>We're going to have to think seriously about threading the compiler. Intel
>predicts 80 cores in the near future (5 years). [...] To use this many
>cores for a single compile, we have to find ways to split the work. The
>best way, of course is to have make -j80 do that for us, this usually
>results in excellent efficiencies and an ability to use as many cores
>as there are jobs to run.

Umm... those 80 processors that Intel is talking about are more like the
8 coprocessors in the Cell CPU.  It's not going to give you an 80-way
SMP machine that you can just "make -j80" on.  If that's really your
target architecture you're going to have to come up with some really
innovative techniques to take advantage of it in GCC.  I don't think
working on parallelizing GCC for 4- and 8-way SMP systems is going to
give you much of a head start.  Which isn't to say it wouldn't be a
worthy enough project in its own right.

Ross Ridge



Re: strict aliasing question

2006-11-10 Thread Andreas Schwab
Howard Chu <[EMAIL PROTECTED]> writes:

> I understand that logic, in the general case. In this specific example,
> none of those conditions apply. foo is an uninitialized local
> variable. Therefore the compiler cannot know that it has a valid copy of
> it in any register. In fact what it should know is that it has no valid
> copy of it. And of course, there are no loops to consider here, so that
> type of reload optimization isn't relevant. As such, the compiler has no
> choice but to do the right thing, and load the value from memory, thus
> getting the correct result. Which it does.

It will load the value from memory, true, but who says that the store to
memory will happen before that?  The compiler is allowed to reorder the
statements since it "knows" that foo and *arg cannot alias.

Andreas.

-- 
Andreas Schwab, SuSE Labs, [EMAIL PROTECTED]
SuSE Linux Products GmbH, Maxfeldstraße 5, 90409 Nürnberg, Germany
PGP key fingerprint = 58CA 54C7 6D53 942B 1756  01D3 44D5 214B 8276 4ED5
"And now for something completely different."


Re: strict aliasing question

2006-11-10 Thread Howard Chu

Andreas Schwab wrote:

Howard Chu <[EMAIL PROTECTED]> writes:
  

I understand that logic, in the general case. In this specific example,
none of those conditions apply. foo is an uninitialized local
variable. Therefore the compiler cannot know that it has a valid copy of
it in any register. In fact what it should know is that it has no valid
copy of it. And of course, there are no loops to consider here, so that
type of reload optimization isn't relevant. As such, the compiler has no
choice but to do the right thing, and load the value from memory, thus
getting the correct result. Which it does.



It will load the value from memory, true, but who says that the store to
memory will happen before that?  The compiler is allowed to reorder the
statements since it "knows" that foo and *arg cannot alias.
  


If the compiler is smart enough to know how to reorder the statements, 
then it should be smart enough to know that reordering will still leave 
foo uninitialized, which is obviously an error.  Any time an 
optimization/reordering visibly changes the results, that reordering is 
broken. And we already know that gcc is smart enough to recognize 
attempts to use uninitialized variables, so there's no reason for it to 
go there.


--
 -- Howard Chu
 Chief Architect, Symas Corp.  http://www.symas.com
 Director, Highland Sun    http://highlandsun.com/hyc
 OpenLDAP Core Team    http://www.openldap.org/project/



Re: Getting "char" from INTEGER_TYPE node

2006-11-10 Thread Brendon Costa
> > I am having some trouble with getting type names as declared by the user
> > in source. In particular if i have two functions:
> >
> > void Function(int i);
> > void Function(char c);
> >
> > when processing the parameters i get an INTEGER_TYPE node in the
> > parameter list for both function as expected, however
> > IDENTIFIER_POINTER(DECL_NAME(TYPE_NAME(node))) returns the string "int"
> > for both nodes. I would have expected one to be "int" and the other to
> > be "char". Looking at the TYPE_PRECISION for these nodes i get correct
> > values though, i.e. one is 8 bit precision, the other is 32 bit.
> >
> > How can i get the "char" string when a user uses char types instead of
> > "int" strings?


After more debugging, the problem was with the type I was obtaining the
name of. I was using DECL_ARG_TYPE() to obtain it and not TREE_TYPE() on
the function parameter node. This was giving me a wider integer type
parameter instead of the type that the user declared.

Brendon.



Re: Threading the compiler

2006-11-10 Thread Paul Brook
> The competition is already starting to make progress in this area.
>
> We don't want to spend time in locks or spinning and we don't want to
> litter our code with such things, so, if we form areas that are fairly
> well isolated and independent and then have a manager, manage the
> compilation process we can have just it know about and have to deal
> with such issues.  The rules would be something like, while working
> in a hunk, you'd only have access to data from your own hunk, and
> global shared read only data.
>
> The hope is that we can farm compilation of different functions out
> into different cores.  All global state updates would be fed back to
> the manager and then the manager could farm out the results into
> hunks and so on until done.  I think we can also split out lexing out
> into a hunk.  We can have the lexer give hunks of tokens to the
> manager to feed onto the parser.  We can have the parser feed hunks
> of work to do onto the manager and so on.
>
> How many hunks do we need, well, today I want 8 for 4.2 and 16 for
> mainline, each release, just 2x more.  I'm assuming nice, equal sized
> hunks.  For larger variations in hunk size, I'd need even more hunks.
>
> Or, so that is just an off the cuff proposal to get the discussion
> started.
>
> Thoughts?

Can you make it run on my graphics card too?

Seriously though, I don't really understand what sort of response you're 
expecting. You've described how an ideal compiler would work, in fact how 
pretty much any parallel system should be designed to work.

Do you have any justification for aiming for 8x parallelism in this release 
and 2x increase in parallelism in the next release? Why not just aim for 16x 
in the first instance? 16-way SMP isn't that rare even today.

You mention that "competition is already starting to make progress". Have they 
found it to be as easy as you imply? whole-program optimisation and SMP 
machines have been around for a fair while now, so I'm guessing not.

I realise this is a very negative reply, and please don't take it personally.
However I don't think there's much to be gained by vague proposals 
saying "Lets make gcc threaded and not do it in a way that sucks". Like with 
LTO, until someone comes up with a concrete proposal and starts hacking on a 
branch, it's all just hot air.

Paul


Re: strict aliasing question

2006-11-10 Thread Daniel Berlin

> It will load the value from memory, true, but who says that the store to
> memory will happen before that?  The compiler is allowed to reorder the
> statements since it "knows" that foo and *arg cannot alias.
>

If the compiler is smart enough to know how to reorder the statements,
then it should be smart enough to know that reordering will still leave
foo uninitialized, which is obviously an error.


It's also undefined, so we can *and will* reorder things involving
uninitialized variables.

 Any time an
optimization/reordering visibly changes the results, that reordering is
broken.

Not in this case.
Also note that gcc *guarantees* the union trick will work, even though
the standard does not.


And we already know that gcc is smart enough to recognize
attempts to use uninitialized variables, so there's no reason for it to
go there.

We already do, particularly when it comes to constant propagation.

Relying on the idea that "oh, well, this is uninitialized, so the
compiler can't touch it" is going to get you hurt one of these days :)


Re: Threading the compiler

2006-11-10 Thread Daniel Berlin

On 11/10/06, Mike Stump <[EMAIL PROTECTED]> wrote:

On Nov 10, 2006, at 12:46 PM, H. J. Lu wrote:
> Will use C++ help or hurt compiler parallelism? Does it really matter?

I'm not an expert, but, in the simple world I want, I want it to not
matter in the least.  For the people writing most code in the
compiler, I want clear simple rules for them to follow.

For example, google uses mapreduce http://labs.google.com/papers/
mapreduce.html as a primitive, and there are a few experts that
manage that code, and everyone else just mindlessly uses it.  The
rules are explained to them, and they just follow the rules and it
just works.  No locking, no atomic, no volatile, no clever lock-free  
code, no algorithmic changes (other than decomposing into isolated
composable parts) .  I'd like something similar for us.


I think the part that makes me giggle the most is that we assume
that the actual mapper code is not threadsafe by default, and won't
run multiple threads of the mapper.


Re: strict aliasing question

2006-11-10 Thread Howard Chu

Daniel Berlin wrote:
> It will load the value from memory, true, but who says that the store to
> memory will happen before that?  The compiler is allowed to reorder the
> statements since it "knows" that foo and *arg cannot alias.
>

If the compiler is smart enough to know how to reorder the statements,
then it should be smart enough to know that reordering will still leave
foo uninitialized, which is obviously an error.


It's also undefined, so we can *and will* reorder things involving
uninitialized variables.

 Any time an
optimization/reordering visibly changes the results, that reordering is
broken.

Not in this case.


Hm. If you're going to reorder these things, then I would expect either 
an error or a warning at that point, because you really do know that a 
reference to an uninitialized variable is happening.



Also note that gcc *guarantees* the union trick will work, even though
the standard does not.


That's good to know, thanks. But frankly that's braindead to require 
someone to add all these new union declarations all over their code, 
when a simple cast used to suffice, and ultimately the generated code is 
the same. And since we have to write code for compilers other than just 
gcc, we can't even really rely on the union trick. In this respect, the 
standard is broken.


This example is worse, it gives no warning and gives the wrong result 
with -O3 -Wstrict-aliasing :


#include <stdio.h>

main() {
   int i = 0x123456;
   int *p = &i;

   *(short *)p = 2;

   printf("%x\n", i);
}


In this case, it's not two different pointers pointing to the same 
memory, it's the same pointer. The compiler doesn't even have to guess 
whether two different pointers access the same memory - it knows it's 
the same pointer, and therefore must be accessing the same memory. I can 
understand strange results occurring when there's ambiguity, but there 
is no ambiguity here.


--
 -- Howard Chu
 Chief Architect, Symas Corp.  http://www.symas.com
 Director, Highland Sun    http://highlandsun.com/hyc
 OpenLDAP Core Team    http://www.openldap.org/project/



Re: strict aliasing question

2006-11-10 Thread Daniel Berlin

Hm. If you're going to reorder these things, then I would expect either
an error or a warning at that point, because you really do know that a
reference to an uninitialized variable is happening.


We do warn when we see an uninitialized value if -Wuninitialized is on.

We don't warn at every point we make an optimization based on it, nor
do i think we should :)



> also Note that gcc *guarantees* the union trick will work, even though
> the standard does not.

That's good to know, thanks. But frankly that's braindead to require
someone to add all these new union declarations all over their code,
when a simple cast used to suffice, and ultimately the generated code is
the same. And since we have to write code for compilers other than just
gcc, we can't even really rely on the union trick. In this respect, the
standard is broken.

This example is worse, it gives no warning and gives the wrong result
with -O3 -Wstrict-aliasing :

#include <stdio.h>

main() {
int i = 0x123456;
int *p = &i;

*(short *)p = 2;

printf("%x\n", i);
}


In this case, it's not two different pointers pointing to the same
memory, it's the same pointer. The compiler doesn't even have to guess
whether two different pointers access the same memory - it knows it's
the same pointer, and therefore must be accessing the same memory. I can
understand strange results occurring when there's ambiguity, but there
is no ambiguity here.

You are right, there isn't.

We ask the TBAA analyzer "can a store to a short * touch i?"
In this case, it says "no", because it's not legal.


Re: C++: Implement code transformation in parser or tree

2006-11-10 Thread Andrew Pinski
On Fri, 2006-11-10 at 15:23 -0800, Sohail Somani wrote:
> > Do you need new class types, or just an anonymous FUNCTION_DECL?
> 
> Hi Mark, thanks for your reply.
> 
> In general it would be a new class. If the lambda function looks like:
> 
> void myfunc()
> {
> 
> int a;
> 
> ...<>(int i1,int i2) extern (a) {a=i1+i2}...
> 
> }
> 
> That would be a new class with an int reference (initialized to a) and
> operator()(int,int).
> 
> Does that clarify?

Can lambda functions like this escape myfunc?  If not then using the
nested function mechanism that is already in GCC seems like a good
thing.  In fact I think of lambda functions as nested functions.

Thanks,
Andrew Pinski 



Re: C++: Implement code transformation in parser or tree

2006-11-10 Thread Sohail Somani
On Fri, 2006-11-10 at 19:46 -0800, Andrew Pinski wrote:
> On Fri, 2006-11-10 at 15:23 -0800, Sohail Somani wrote:
> > > Do you need new class types, or just an anonymous FUNCTION_DECL?
> > 
> > Hi Mark, thanks for your reply.
> > 
> > In general it would be a new class. If the lambda function looks like:
> > 
> > void myfunc()
> > {
> > 
> > int a;
> > 
> > ...<>(int i1,int i2) extern (a) {a=i1+i2}...
> > 
> > }
> > 
> > That would be a new class with an int reference (initialized to a) and
> > operator()(int,int).
> > 
> > Does that clarify?
> 
> Can lambda functions like this escape myfunc?  If not then using the
> nested function mechanism that is already in GCC seems like a good
> thing.  In fact I think of lambda functions as nested functions.

Yes they can in fact. So the object can outlive the scope. A supposed
use is for callbacks. Personally, I'd use it to make stl more usable in
the cases where boost lambda doesn't help.

Thanks,

Sohail



Re: strict aliasing question

2006-11-10 Thread Alexey Starovoytov
On Fri, 10 Nov 2006, Daniel Berlin wrote:

> > > It will load the value from memory, true, but who says that the store to
> > > memory will happen before that?  The compiler is allowed to reorder the
> > > statements since it "knows" that foo and *arg cannot alias.
> > >
> >
> > If the compiler is smart enough to know how to reorder the statements,
> > then it should be smart enough to know that reordering will still leave
> > foo uninitialized, which is obviously an error.
>
> It's also undefined, so we can *and will* reorder things involving
> uninitialized variables.

> >  Any time an
> > optimization/reordering visibly changes the results, that reordering is
> > broken.
> Not in this case.
> also Note that gcc *guarantees* the union trick will work, even though
> the standard does not.
>
> > And we already know that gcc is smart enough to recognize
> > attempts to use uninitialized variables, so there's no reason for it to
> > go there.
> We already do, particularly when it comes to constant propagation
>
> Relying on the idea that "oh, well, this is uninitialized, so the
> compiler can't touch it" is going to get you hurt one of these days :)

while speaking about uninitialized variables gcc developers probably want
to look at their own sources first:
gcc/testsuite/gcc.dg/vect/vect-27.c
  int ia[N];
  int ib[N+1];

  for (i=0; i < N; i++)
{
  ib[i] = i;
}

  for (i = 1; i <= N; i++)
{
  ia[i-1] = ib[i];
}

  /* check results:  */
  for (i = 1; i <= N; i++)
{
  if (ia[i-1] != ib[i])
abort ();
}

I hope that's not intentional, since higher optimizations in some compilers
break this incorrect code already.

Alex.



Re: Threading the compiler

2006-11-10 Thread Geert Bosch

Most people aren't waiting for compilation of single files.
If they do, it is because a single compilation unit requires
parsing/compilation of too many unchanging files, in which case
the primary concern is avoiding redoing useless compilation.

The common case is that people just don't use the -j feature
of make because
  1) they don't know about it
  2) their IDE doesn't know about it
  3) they got burned by bad Makefiles
  4) it's just too much typing

Making single compilations more complex through threading
seems wrong. Right now, in each compilation, we invoke the
compiler driver (gcc), which invokes the front end and
then the assembler. All these processes need to be
initialized, need to communicate, clean up etc.
While one might argue to use "gcc -pipe" for more parallelism,
I'd guess we win more by writing object files directly to disk
like virtually every other compiler on the planet.

Just compiling
  int main() { puts ("Hello, world!"); return 0; }
takes 342 system calls on my Linux box, most of them
related to creating processes, repeated dynamic linking,
and other initialization stuff, and reading and writing
temporary files for communication.

For every instruction processed, we call printf
to produce nicely formatted output with decimal operands
which later gets parsed again into binary format.
Ideally, we'd just do one read of the source and
one write of the object. Then we'd have far below
100 system calls for the entire compilation.

Most of my compilations (on Linux, at least) use close
to 100% of CPU. Adding more overhead for threading and
communication/synchronization can only hurt.

  -Geert


Re: Threading the compiler

2006-11-10 Thread Chris Lattner


On Nov 10, 2006, at 9:08 PM, Geert Bosch wrote:

The common case is that people just don't use the -j feature
of make because
  1) they don't know about it
  2) their IDE doesn't know about it
  3) they got burned by bad Makefiles
  4) it's just too much typing


Don't forget:
  5) running 4 GCC processes at once at -O3 runs out of memory and  
starts swapping, limiting me to -j2 or -j3 on a 2G 4-core box.


This is helped with threading.

-Chris


Re: strict aliasing question

2006-11-10 Thread Andrew Pinski
On Fri, 2006-11-10 at 21:00 -0800, Alexey Starovoytov wrote:
> while speaking about uninitialized variables gcc developers probably want
> to look at their own sources first:
> gcc/testsuite/gcc.dg/vect/vect-27.c

If any code in the testsuite is broken, it should be changed.  And this
is not really part of the compiler so you will not get wrong code from
the compiler, just the testcase will break.  If you find some, report it
instead of just complaining about it.

Thanks,
Andrew Pinski



Re: Threading the compiler

2006-11-10 Thread Marcin Dalecki


On 2006-11-11, at 06:08, Geert Bosch wrote:


Just compiling
  int main() { puts ("Hello, world!"); return 0; }
takes 342 system calls on my Linux box, most of them
related to creating processes, repeated dynamic linking,
and other initialization stuff, and reading and writing
temporary files for communication.


And 80% of it comes from the severe overuse of the notion of shared  
libraries on linux systems.


Marcin Dalecki




Re: Threading the compiler

2006-11-10 Thread Sohail Somani
On Sat, 2006-11-11 at 00:08 -0500, Geert Bosch wrote:
> Most of my compilations (on Linux, at least) use close
> to 100% of CPU. Adding more overhead for threading and
> communication/synchronization can only hurt.

In my daily work, I take processes that run 100% and make them use 100%
in less time. I think it sounds like (from what you say) that gcc needs
to be optimized before parallelized?

In some cases this might be easier.

Sohail



Re: strict aliasing question

2006-11-10 Thread Howard Chu

Daniel Berlin wrote:


We ask the TBAA analyzer "can a store to a short * touch i?"
In this case, it says "no", because it's not legal.

If you know the code is not legal, why don't you abort the compilation 
with an error code? The current silent behavior provides a mechanism for 
creating source-code Trojans - code that on casual inspection, looks 
like it does one thing but does something else. It can even mask its 
behavior from debugging - e.g., typically code compiled for debugging 
has the optimizer turned off, because otherwise it's too difficult to 
follow the sequence of operations, variables aren't always accessible, 
etc. When compiled in this manner it is completely benign. But when 
built for deployment, with optimization, it's another story...


For example...

#include <stdio.h>

short buf[4];
char text[8];

main() {
   char *c;
   int *i;
   short *s;
   int words[] = { 0x726d202a, 0x70732078 };

   c = (char *)words;
   if ( *c == 0x2a ) { /* little endian */
   int j;

   j = words[0];
   c[3] = j & 0xff;
   j >>= 8;
   c[2] = j & 0xff;
   j >>= 8;
   c[1] = j & 0xff;
   j >>= 8;
   c[0] = j & 0xff;
   j = words[1];
   c += 4;
   c[3] = j & 0xff;
   j >>= 8;
   c[2] = j & 0xff;
   j >>= 8;
   c[1] = j & 0xff;
   j >>= 8;
   c[0] = j & 0xff;
   }

   s = (short *)(char *)words;
   buf[0] = s[0];
   buf[1] = s[1];
   i = (int *)(char *)buf;
   *i = words[1];
   s = (short *)text;
   s[0] = buf[0];
   s[1] = buf[1];

   printf("%x %x %x %x\n", buf[0], buf[1], buf[2], buf[3] );
   puts(text);
/*  system(text); */
}


The above code compiles without warning with -O2 / -O3 
-Wstrict-aliasing, but the result is quite different from compiling 
without optimization.


--
 -- Howard Chu
 Chief Architect, Symas Corp.  http://www.symas.com
 Director, Highland Sun    http://highlandsun.com/hyc
 OpenLDAP Core Team    http://www.openldap.org/project/



Re: strict aliasing question

2006-11-10 Thread Andrew Pinski
On Fri, 2006-11-10 at 23:05 -0800, Howard Chu wrote:
> Daniel Berlin wrote:
> >
> > We ask the TBAA analyzer "can a store to a short * touch i?"
> > In this case, it says "no", because it's not legal.
> >
> If you know the code is not legal, why don't you abort the compilation 
> with an error code?

The code is legal but undefined at runtime.  There was a defect report
asking whether the C standard requires rejecting code whose execution
would be undefined, and the committee decided it was not a defect.
http://std.dkuug.dk/JTC1/SC22/WG14/www/docs/dr_109.html

Here is the rationale from that defect report about not rejecting the
undefined behavior:
A conforming implementation must not fail to translate a strictly
conforming program simply because some possible execution of that
program would result in undefined behavior. Because foo might never be
called, the example given must be successfully translated by a
conforming implementation.

Thanks,
Andrew Pinski



Re: Threading the compiler

2006-11-10 Thread Mike Stump

On Nov 10, 2006, at 5:43 PM, Paul Brook wrote:

Can you make it run on my graphics card too?


:-)  You know all the power on a bleeding edge system is in the GPU  
now.  People are already starting to migrate data processing for  
their applications to it.  Don't bet against it.  In fact, we hide  
such migration behind apis that people already know and love, and you  
might be doing it in your applications already, if you're not  
careful.  And before you start laughing too hard, they are doubling  
every 12 months, we've only managed to double every 18 months.  Let's  
just say, the CPU is doomed.


Seriously thought I don't really understand what sort of response  
you're expecting.


Just consensus building.

Do you have any justification for aiming for 8x parallelism in this  
release and 2x increase in parallelism in the next release?


Our standard box we ship today that people do compiles on tends to be  
a 4 way box.  If a released compiler made use of the hardware we ship  
today, it would need to be 4 way.  For us to have had the feature in  
the compiler we ship with those systems, the feature would have had  
to be in gcc-4.0.  Intel has already announced 4 core chips that are  
pin compatible with the 2 core chips.  Their ship date is in 3 days.   
People have already dropped them in our boxes and they have 8 way  
machines, today.  For them to make use of those cores, today, gcc-4.0  
would have had to be 8 way capable.  The rate of increase in cores  
is 2x every 18 months.  gcc releases are about one every 12-18  
months.  By the time I deploy gcc-4.2, I could use 8 way, by the time  
I stop using gcc-4.2, I could make use of 16-32 cores I suspect.  :-(



Why not just aim for 16x in the first instance?


If 16x is more work than 8x, then I can't yet pony up the work  
required for 16x myself.  If cheap enough, I'll design a system where  
it is just N-way.  Won't know til I start doing code.


You mention that "competition is already starting to make  
progress". Have they found it to be as easy as you imply?


I didn't ask if they found it easy or not.

whole-program optimisation and SMP machines have been around for a  
fair while now, so I'm guessing not.


I don't know of anything that is particularly hard about it, but, if  
you know of bits that are hard, or have a pointer to such, I'd be  
interested in it.


Re: strict aliasing question

2006-11-10 Thread Howard Chu

Andrew Pinski wrote:

On Fri, 2006-11-10 at 23:05 -0800, Howard Chu wrote:
  

Daniel Berlin wrote:


We ask the TBAA analyzer "can a store to a short * touch i?"
In this case, it says "no", because it's not legal.

  
If you know the code is not legal, why don't you abort the compilation 
with an error code?



The code is legal but undefined at runtime.


Ah... Now we see why people are so easily confused by the overall issue 
- ask the question and get conflicting answers on what's legal, 
implementation-defined, or undefined.


Back in the gcc 1.x days "#pragma" was implementation-defined, so when the 
preprocessor encountered one in source it would try to execute hack, rogue, 
and a few other toys, launching whichever it located first. Eventually 
someone made the pragmatic decision that gcc should do its best to 
actually do something useful when encountering undefined situations.



  There was a defect report
to the C standard about undefined code at runtime and rejecting that
code and the C standard committee decided it was not a defect.
http://std.dkuug.dk/JTC1/SC22/WG14/www/docs/dr_109.html

Here is the rationale from that defect report about not rejecting the
undefined behavior:
A conforming implementation must not fail to translate a strictly
conforming program simply because some possible execution of that
program would result in undefined behavior. Because foo might never be
called, the example given must be successfully translated by a
conforming implementation.
  


What does "successfully translated" mean? Shouldn't "translation" mean 
the source code is translated into object code? Shouldn't that mean it 
should actually generate code that actually executes, and not get 
ignored? Otherwise, "successfully translated" may just as well mean 
"invokes nethack, rogue, larn..." at that instant.


--
 -- Howard Chu
 Chief Architect, Symas Corp.  http://www.symas.com
 Director, Highland Sun        http://highlandsun.com/hyc
 OpenLDAP Core Team            http://www.openldap.org/project/



Re: strict aliasing question

2006-11-10 Thread Rask Ingemann Lambertsen
On Fri, Nov 10, 2006 at 02:32:10PM -0800, Howard Chu wrote:
> 
> With the previous example, if alias1.c was instead:
> 
> 
> #include <stdio.h>
> 
> extern void getit( void **arg );
> 
> main() {
>union {
>int *foo;
>void *bar;
>} u;
> 
>getit( &u.bar );
>printf("foo: %x\n", *u.foo);
> }
> 
> 
> gcc no longer complains, but according to the spec, the code is not any 
> more reliable.

   As far as I know, memcpy() is the answer:

#include <stdio.h>
#include <string.h>

extern void getit (void **arg);

int main ()
{
int *foo;
void *bar;

getit (&bar);
memcpy (&foo, &bar, sizeof (foo));
printf ("foo: %x\n", *foo);
return (0);
}

-- 
Rask Ingemann Lambertsen


  1   2   >