Re: GCC 4.3.2 bug (was: Illegal subtraction in tmp-dive_1.s)

2009-04-20 Thread James Dennett
2009/4/19 Jason Mancini :
>
>> Vincent Lefevre  writes:
>>    while ((*(q++))-- == 0) ;
>
> Is that defined and legal??  Is q incremented before or after *q is 
> decremented?  They are both post operators!
> Jason Mancini

It's defined and legal (so long as q != &q, which might well be
guaranteed by the type system for an incrementable q -- it's late, and
I might be missing a counterexample to that).  The order of the
increment/decrement makes no difference except in the pathological
case where they attempt to change the same object (and in that case,
the behavior is undefined).

Note: the decrement is done to *initial_value_of_q, as q++ evaluates
to a copy of q's initial value.  q could even be incremented before
that, so long as the decrement still applies to *initial_value_of_q.
All of this assumes the absence of "volatile", of course.
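
A minimal sketch of the evaluation described above, assuming plain
(non-volatile) unsigned long data.  The function names dec_compact and
dec_explicit are made up for illustration; the second one spells out
one ordering of the steps that the compact loop is allowed to use.

```c
/* Compact form, as written in the thread.  */
unsigned long *dec_compact (unsigned long *q)
{
  while ((*(q++))-- == 0)
    ;
  return q;
}

/* Spelled-out equivalent: the decrement always applies to the object
   that q pointed to before the increment.  */
unsigned long *dec_explicit (unsigned long *q)
{
  unsigned long *old;
  unsigned long value;

  do
    {
      old = q;           /* q++ yields a copy of the initial q ...  */
      q = q + 1;         /* ... and q itself is advanced.  */
      value = *old;      /* (*old)-- yields the old value ...  */
      *old = value - 1;  /* ... and stores the decremented one.  */
    }
  while (value == 0);    /* the comparison uses the old value */

  return q;
}
```

Both functions turn leading zero words into ULONG_MAX, decrement the
first nonzero word, and return a pointer one past that word.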

-- James


Re: vector<> issue in g++?

2009-04-20 Thread James Dennett
On Sun, Apr 19, 2009 at 7:19 PM, James Dennett  wrote:
> On Sun, Apr 19, 2009 at 4:15 PM, Paolo Piacentini
>  wrote:
>> I don't think this is a bug but certainly it is a problem.
>>
>> Would you please consider it and let me know? I hope so. Thanks.
>>
>> The following simple volcalc.cpp code compiles with no errors (and
>> works) in Windows Visual C++.
>> It simply sizes the "alldata" array later in the code.
>
> If Visual C++ does not diagnose the error in the code in its best
> standards-conforming mode, that is a bug in Visual C++.  Allowing it
> as an extension is an entirely reasonable thing to do though.
>
>> With g++ v.4.3.2 instead I get the error reported hereby.
>> For some reason it does not like the fact that struct is declared
>> local.
>
> That is what C++ requires; template parameters must have external
> linkage, and in C++98 local types do not have external linkage.
>
>> If you declare struct as global it will be working but I cannot
>> change the code so drastically.
>>
>> I would thankfully appreciate any help (including tough critics to the code).
>
> The "drastic" change would be needed to make your code valid C++, and
> if you do that then g++ will compile it.
>
> There has been discussion of changing this rule for the next C++
> standard, see e.g.,
> http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2007/n2402.pdf
> though I don't see signs of it having been merged into the current
> committee draft.  N2402 does mention the change in MSVC++ as being a
> relatively recent extension.  A quick search hasn't turned up the
> current status of N2402, though there was some discussion of weakening
> it by removing support for using unnamed types as template parameters,
> and it seemed to have reasonably strong support in that form.

After asking around a little more, I've been pointed to
  http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2008/n2657.htm
which has been incorporated into the current committee draft for
C++0x, and allows use of local types as template parameters.  It looks
like your code will become legal in the next C++ standard, and g++
might well start supporting this before then (if someone is inspired
to implement it, that is).

-- James


Re: [RFC] Massive recursion in tree_ssa_phiprop_1?

2009-04-20 Thread Richard Guenther
On Mon, Apr 20, 2009 at 8:29 AM, Paolo Bonzini  wrote:
> Dave Korn wrote:
>> Richard Guenther wrote:
>>
>>> Well ... in this case it's likely the problem that propagate_with_phi is
>>> inlined (single-use static function) and maybe other helpers of it too.
>>
>>   It is inlined.  I rebuilt jc1 after adding __attribute__ ((noinline)), and
>> the stack frame size for tree_ssa_phiprop_1 went down from 0xcc to 0x3c, so
>> that buys us some breathing room, but the problem is still lurking there;
>> compilation of a larger function could still trip it.  (It saved enough
>> headroom for my trial build of the libjava html parser to complete 
>> successfully.)
>>
>>   Should we be concerned that end-users might run into this in real-world
>> situations when they're compiling large files of bulk auto-generated code?
>
> Indeed we should use dom-walk.c, or better copy the worklist approach
> from it.
>
>  worklist[sp++] = ENTRY_BLOCK_PTR;
>  while (sp)
>    {
>      bb = worklist[--sp];
>      for (gsi = gsi_start_phis (bb); !gsi_end_p (gsi); gsi_next (&gsi))
>        did_something |= propagate_with_phi (bb, gsi_stmt (gsi), phivn, n);
>
>      for (dest = first_dom_son (walk_data->dom_direction, bb);
>           dest; dest = next_dom_son (walk_data->dom_direction, dest))
>        worklist[sp++] = dest;
>    }
>  return did_something;

Feel free to post patches replacing the various similar walks with
the above pattern (or add a FOR_EACH_BB_IN_DOM_ORDER
that does it, possibly with a BREAK_FROM_ that frees the
VEC used for the worklist).

grep next_dom_son *.c

only finds 22 possible uses of the above pattern.

dom-walk.c is indeed overkill for the simple cases.
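
For readers outside GCC, here is a self-contained sketch of the
worklist pattern discussed above, on a toy tree rather than on GCC's
dominator data.  struct toy_block, its fields, and walk_dom_tree are
hypothetical names, not GCC API; the point is only that an explicit,
heap-allocated stack replaces the recursion, so stack usage no longer
grows with the depth of the tree.

```c
#include <stdlib.h>

/* Toy stand-in for a basic block in a dominator tree.  */
struct toy_block
{
  int index;
  struct toy_block *first_son;  /* first dominated child, or NULL */
  struct toy_block *next_son;   /* next sibling, or NULL */
};

/* Visit ENTRY and everything below it, parents before children.
   ORDER must have room for N_BLOCKS entries.  Returns the number of
   blocks visited, or -1 if the worklist cannot be allocated.  */
int walk_dom_tree (struct toy_block *entry, int n_blocks, int *order)
{
  struct toy_block **worklist;
  int sp = 0, count = 0;

  if (entry == NULL || n_blocks <= 0)
    return 0;

  /* Each block is pushed at most once, so N_BLOCKS slots suffice.  */
  worklist = malloc ((size_t) n_blocks * sizeof *worklist);
  if (worklist == NULL)
    return -1;

  worklist[sp++] = entry;
  while (sp)
    {
      struct toy_block *bb = worklist[--sp];
      struct toy_block *son;

      order[count++] = bb->index;  /* the per-block work goes here */

      for (son = bb->first_son; son; son = son->next_son)
        worklist[sp++] = son;
    }

  free (worklist);
  return count;
}
```

Siblings come off the stack in reverse order, which does not matter
for a pass that only needs each block to be handled after its
dominator.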

Thanks,
Richard.


Re: [RFC] Massive recursion in tree_ssa_phiprop_1?

2009-04-20 Thread Paolo Bonzini

> Feel free to post patches replacing the various similar walks with
> the above pattern (or add a FOR_EACH_BB_IN_DOM_ORDER
> that does it, possibly with a BREAK_FROM_ that frees the
> VEC used for the worklist).
> 
> grep next_dom_son *.c
> 
> only finds 22 possible uses of the above pattern.

Even fewer actually.  You can even centralize it in get_dominated_blocks
and then use it.

I already have enough cleanups on my "worklist", and usually
"regressions" (even of this kind) should be fixed by the people who
introduce them, shouldn't they? ;-)

Paolo


branches/gcc-4_4-branch frozen in preparation of 4.4.0 release

2009-04-20 Thread Jakub Jelinek
Hi!

The 4.4 branch is now frozen, all commits to it require explicit RM approval.

If there is anything that should block the 4.4 release, please CC me on it,
otherwise I plan to create the 4.4.0 release tomorrow.

Jakub


Re: [RFC] Massive recursion in tree_ssa_phiprop_1?

2009-04-20 Thread Jan Hubicka
> On Mon, Apr 20, 2009 at 8:29 AM, Paolo Bonzini  wrote:
> > Dave Korn wrote:
> >> Richard Guenther wrote:
> >>
> >>> Well ... in this case it's likely the problem that propagate_with_phi is
> >>> inlined (single-use static function) and maybe other helpers of it too.
> >>
> >>   It is inlined.  I rebuilt jc1 after adding __attribute__ ((noinline)), 
> >> and
> >> the stack frame size for tree_ssa_phiprop_1 went down from 0xcc to 0x3c, so
> >> that buys us some breathing room, but the problem is still lurking there;
> >> compilation of a larger function could still trip it.  (It saved enough
> >> headroom for my trial build of the libjava html parser to complete 
> >> successfully.)
> >>
> >>   Should we be concerned that end-users might run into this in real-world
> >> situations when they're compiling large files of bulk auto-generated code?
> >
> > Indeed we should use dom-walk.c, or better copy the worklist approach
> > from it.
> >
> >  worklist[sp++] = ENTRY_BLOCK_PTR;
> >  while (sp)
> >    {
> >      bb = worklist[--sp];
> >      for (gsi = gsi_start_phis (bb); !gsi_end_p (gsi); gsi_next (&gsi))
> >        did_something |= propagate_with_phi (bb, gsi_stmt (gsi), phivn, n);
> >
> >      for (dest = first_dom_son (walk_data->dom_direction, bb);
> >           dest; dest = next_dom_son (walk_data->dom_direction, dest))
> >        worklist[sp++] = dest;
> >    }
> >  return did_something;
> 
> Feel free to post patches replacing the various similar walks with
> the above pattern (or add a FOR_EACH_BB_IN_DOM_ORDER

I am very sure I implemented dom order iterators and FOR_EACH* macro
once for this reason.  Will try to search archives.

Honza
> that does it, possibly with a BREAK_FROM_ that frees the
> VEC used for the worklist).
> 
> grep next_dom_son *.c
> 
> only finds 22 possible uses of the above pattern.
> 
> dom-walk.c is indeed overkill for the simple cases.
> 
> Thanks,
> Richard.


Re: [RFC] Massive recursion in tree_ssa_phiprop_1?

2009-04-20 Thread Richard Guenther
2009/4/20 Jan Hubicka :
>> On Mon, Apr 20, 2009 at 8:29 AM, Paolo Bonzini  wrote:
>> > Dave Korn wrote:
>> >> Richard Guenther wrote:
>> >>
>> >>> Well ... in this case it's likely the problem that propagate_with_phi is
>> >>> inlined (single-use static function) and maybe other helpers of it too.
>> >>
>> >>   It is inlined.  I rebuilt jc1 after adding __attribute__ ((noinline)), 
>> >> and
>> >> the stack frame size for tree_ssa_phiprop_1 went down from 0xcc to 0x3c, 
>> >> so
>> >> that buys us some breathing room, but the problem is still lurking there;
>> >> compilation of a larger function could still trip it.  (It saved enough
>> >> headroom for my trial build of the libjava html parser to complete 
>> >> successfully.)
>> >>
>> >>   Should we be concerned that end-users might run into this in real-world
>> >> situations when they're compiling large files of bulk auto-generated code?
>> >
>> > Indeed we should use dom-walk.c, or better copy the worklist approach
>> > from it.
>> >
>> >  worklist[sp++] = ENTRY_BLOCK_PTR;
>> >  while (sp)
>> >    {
>> >      bb = worklist[--sp];
>> >      for (gsi = gsi_start_phis (bb); !gsi_end_p (gsi); gsi_next (&gsi))
>> >        did_something |= propagate_with_phi (bb, gsi_stmt (gsi), phivn, n);
>> >
>> >      for (dest = first_dom_son (walk_data->dom_direction, bb);
>> >           dest; dest = next_dom_son (walk_data->dom_direction, dest))
>> >        worklist[sp++] = dest;
>> >    }
>> >  return did_something;
>>
>> Feel free to post patches replacing the various similar walks with
>> the above pattern (or add a FOR_EACH_BB_IN_DOM_ORDER
>
> I am very sure I implemented dom order iterators and FOR_EACH* macro
> once for this reason.  Will try to search archives.

I am testing a patch to use/rewrite get_all_dominated_blocks.

Richard.

> Honza
>> that does it, possibly with a BREAK_FROM_ that frees the
>> VEC used for the worklist).
>>
>> grep next_dom_son *.c
>>
>> only finds 22 possible uses of the above pattern.
>>
>> dom-walk.c is indeed overkill for the simple cases.
>>
>> Thanks,
>> Richard.
>


Re: GCC 4.3.2 bug (was: Illegal subtraction in tmp-dive_1.s)

2009-04-20 Thread Vincent Lefevre
On 2009-04-17 12:09:42 -0500, Gabriel Dos Reis wrote:
> At least, let's get it archived on GCC mailing lists.

Is it a bug that has been identified? If not, perhaps this should
be added to the regression tests.

The program without the quotes:

/* With GCC 4.3.2 and -O2 option: output value is 1 instead of 0.
 * If -fno-strict-aliasing is added, this bug disappears.
 */

#include <stdio.h>
#include <stdlib.h>

int test (int n)
{
  unsigned long *p, *q;
  int i;

  q = p = malloc (n * sizeof (unsigned long));
  if (p == NULL)
    return 2;
  for (i = 0; i < n - 1; i++)
    p[i] = 0;
  p[n - 1] = 1;
  while ((*(q++))-- == 0) ;
  return p[n - 1] == 1;
}

int main (void)
{
  int r;

  r = test (17);
  printf ("%d\n", r);
  return r;
}

-- 
Vincent Lefèvre  - Web: 
100% accessible validated (X)HTML - Blog: 
Work: CR INRIA - computer arithmetic / Arenaire project (LIP, ENS-Lyon)


Re: GCC 4.3.2 bug (was: Illegal subtraction in tmp-dive_1.s)?

2009-04-20 Thread Vincent Lefevre
On 2009-04-20 00:30:21 -0700, James Dennett wrote:
> 2009/4/19 Jason Mancini :
> >
> >> Vincent Lefevre  writes:
> >>    while ((*(q++))-- == 0) ;
> >
> > Is that defined and legal??  Is q incremented before or after *q
> > is decremented?  They are both post operators!
> 
> It's defined and legal (so long as q != &q, which might well be
[...]

Yes. BTW, I wondered if this could be due to a pathological case
such as a[a[i]] = ..., which is undefined when a[i] == i, even
though the code looks correct. But this seems to be OK.

As the bug occurs only when malloc is in the tested function, I also
wondered whether the failure was due to the use of uninitialized data
or a buffer overflow (note that in particular, the GMP code was much
more complex than my testcase), but again, this is OK.

-- 
Vincent Lefèvre  - Web: 
100% accessible validated (X)HTML - Blog: 
Work: CR INRIA - computer arithmetic / Arenaire project (LIP, ENS-Lyon)


Re: Reserving a number of consecutive registers

2009-04-20 Thread fearyourself
OK, I've moved forward on this problem by identifying the loads I want
to suggest hard registers for. Taking inspiration from the code in
combine_regs, I have tried this:

  /* Suggest pseudo register to hard register */
  s_qtyno = reg_qty[pseudo];

  if (s_qtyno >= 0)
    {
      if (qty_phys_copy_sugg[s_qtyno] != 0)
        {
          SET_HARD_REG_BIT (qty_phys_copy_sugg[s_qtyno], hard);
          qty_phys_num_copy_sugg[s_qtyno]++;
        }
      else if (qty_phys_sugg[s_qtyno] != 0)
        {
          SET_HARD_REG_BIT (qty_phys_sugg[s_qtyno], hard);
          qty_phys_num_sugg[s_qtyno]++;
        }
    }

It seems that this works for most cases but there are still cases
where this does not work (I of course first test if the register hard
is free).

This is a little bit simpler than what you find in find_free_reg. Any
comments/suggestions at this point?

Thanks a lot,
Jc

On Fri, Apr 17, 2009 at 2:50 PM, fearyourself  wrote:
> Hi all,
>
> My target architecture has a load-multiple instruction requiring a
> certain number of consecutive registers. I've been working on handling
> this case and trying to convince the local register allocator that it
> really does want to try to get those consecutive registers for the
> loads, but I have been running into certain difficulties.
>
> For example:
>
> a = tab[0];
> b = tab[1];
> c = tab[2];
> d = tab[3];
> e = tab[4];
>
> sum = a + b + c + d + e;
>
> I would like to get :
>
> load_multiple tab, r10-r14
>
> and then have r10 linked to a, r11 to b, etc.
>
> Basically, I've worked in the direction of detecting a number of loads
> in the block from the same base register or symbol reference. I've got
> that detection running and I know in what order I want to associate
> the pseudo registers to the hard registers but I can't seem to request
> hard registers at that point. I thought I could be playing with the
> arrays qty_phys_sugg and qty_phys_copy_sugg but I have trouble
> wrapping my mind around what they really represent and how to do that.
>
> Do you have any ideas about how I can achieve this, am I going in the
> wrong direction?
>
> Thanks a lot,
> Jc
>


Re: GCC 4.3.2 bug (was: Illegal subtraction in tmp-dive_1.s)?

2009-04-20 Thread Joern Rennecke

> As the bug occurs only when malloc is in the tested function,


Note that gcc 'knows' that memory obtained by malloc does not
alias other memory.

You can use a differently named wrapper function for malloc,
or use the malloc attribute for another function, to experiment
how this affects code generation.


Re: GCC 4.3.2 bug (was: Illegal subtraction in tmp-dive_1.s)?

2009-04-20 Thread Vincent Lefevre
On 2009-04-20 10:04:00 -0400, Joern Rennecke wrote:
>> As the bug occurs only when malloc is in the tested function,
>
> Note that gcc 'knows' that memory obtained by malloc does not
> alias other memory.

Yes, in the case of GMP, this was a GMP internal function, not malloc,
but this function is declared with __attribute__ ((malloc)), and the
bug disappears if I remove this attribute. I noticed that when I tried
to simplify the testcase.

> You can use a differently named wrapper function for malloc,
> or use the malloc attribute for another function, to experiment
> how this affects code generation.

I did a test with a wrapper function my_malloc (that just calls
malloc and returns its value), but the bug was still visible,
perhaps due to optimization.
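
A sketch of the experiment suggested above, under stated assumptions:
plain_alloc, tagged_alloc and the ALLOC macro are hypothetical names,
and the noinline attribute is added on the guess that inlining the
wrapper is what made it behave like a direct malloc call.  This is not
known to reproduce the 4.3.2 miscompilation; it only shows how the two
variants could be set up for comparison.

```c
#include <stdlib.h>

#if defined(__GNUC__)
# define ATTR_MALLOC   __attribute__ ((malloc))
# define ATTR_NOINLINE __attribute__ ((noinline))
#else
# define ATTR_MALLOC
# define ATTR_NOINLINE
#endif

/* Wrapper with no aliasing promise attached.  */
ATTR_NOINLINE void *plain_alloc (size_t size)
{
  return malloc (size);
}

/* Wrapper that promises, like malloc itself, that the returned
   memory does not alias any other live object.  */
ATTR_MALLOC ATTR_NOINLINE void *tagged_alloc (size_t size)
{
  return malloc (size);
}

/* Build with -DALLOC=plain_alloc to compare code generation.  */
#ifndef ALLOC
# define ALLOC tagged_alloc
#endif

/* The test function from the thread, with the allocator swapped out
   and the buffer freed.  The correct result is 0.  */
int test (int n)
{
  unsigned long *p, *q;
  int i, r;

  q = p = ALLOC (n * sizeof (unsigned long));
  if (p == NULL)
    return 2;
  for (i = 0; i < n - 1; i++)
    p[i] = 0;
  p[n - 1] = 1;
  while ((*(q++))-- == 0) ;
  r = p[n - 1] == 1;
  free (p);
  return r;
}
```

Comparing the assembly for the two ALLOC settings at -O2, with and
without -fno-strict-aliasing, would show whether the attribute alone
changes the generated loop.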

-- 
Vincent Lefèvre  - Web: 
100% accessible validated (X)HTML - Blog: 
Work: CR INRIA - computer arithmetic / Arenaire project (LIP, ENS-Lyon)


Re: Reserving a number of consecutive registers

2009-04-20 Thread Eric Botcazou
> This is a little bit simpler than what you find in find_free_reg. Any
> comments/suggestions at this point?

The local register allocator (local-alloc.c) has been removed in the upcoming 
4.4.x series of compilers so you might want to try something else.  Doing it
during register allocation seems tricky, a machine-specific reorg pass could 
be more appropriate.  I'm not sure we already have something like this in the 
compiler (SPARC has a double-load instruction but it's only implemented as a 
peephole).

-- 
Eric Botcazou


Re: Reserving a number of consecutive registers

2009-04-20 Thread fearyourself
For the moment, we will be remaining in the 4.3.2 version and have no
plans to follow the next 4.4/4.5 versions.

Does any architecture do such a "machine-specific reorg" pass? I've
looked around and haven't really seen one. Could you give me an idea
of where to look and how exactly that would work?

Thanks a lot,
Jc

On Mon, Apr 20, 2009 at 12:36 PM, Eric Botcazou  wrote:
>> This is a little bit simpler than what you find in find_free_reg. Any
>> comments/suggestions at this point?
>
> The local register allocator (local-alloc.c) has been removed in the upcoming
> 4.4.x series of compilers so you might want to try something else.  Doing it
> during register allocation seems tricky, a machine-specific reorg pass could
> be more appropriate.  I'm not sure we already have something like this in the
> compiler (SPARC has a double-load instruction but it's only implemented as a
> peephole).
>
> --
> Eric Botcazou
>


Re: Reserving a number of consecutive registers

2009-04-20 Thread Alexandre Pereira Nunes
On Mon, Apr 20, 2009 at 1:33 PM, fearyourself  wrote:
>
> For the moment, we will be remaining in the 4.3.2 version and have no
> plans to follow the next 4.4/4.5 versions.
>
> Does any architecture do such a "machine-specific reorg" pass. I've
> looked around and haven't really seen one. Could you give me an idea
> of where to look and how exactly that would work?
>
> Thanks a lot,
> Jc

ARM has multiple load/store and it had some predictable support in gcc
4.3.x, perhaps you can take a look at it. But it was far from good,
IMHO, see PR9831 for details.


Re: [gnat] reuse of ASTs already constructed

2009-04-20 Thread Geert Bosch


On Apr 12, 2009, at 13:29, Oliver Kellogg wrote:

> On Tue, 4 Mar 2003, Geert Bosch wrote:
>
>> [...]
>> Best would be to first post a design overview,
>> before doing a lot of work in order to prevent spending time
>> on implementing something that may turn out to have fundamental
>> problems.
>
> I've done a little experimenting to get a feel for this.
>
> I've looked at the work done toward the GCC compile server but
> decided that I want to concentrate on GNAT trees (whereas the
> compile server targets the GNU trees.)
>
> Also I am aiming somewhat lower - not making a separate compile
> server process but rather extending gnat1 to handle multiple
> files in a single invocation.


While this may be an interesting idea, there are some fundamental
assumptions in the compiler that each compilation indeed processes
a single compilation unit, resulting in a single object and .ali
file. It would be best to first contemplate what output a single
invocation of the compiler, with multiple compilation units
as arguments, should produce.

How would you decide if a unit needs recompilation if there was no 1:1
correspondence between compilation units and object/.ali files?
Note that unlike many other languages, Ada requires checks to avoid
including out-of-date compilation results in a program.

  -Geert


Re: Reserving a number of consecutive registers

2009-04-20 Thread Eric Botcazou
> Does any architecture do such a "machine-specific reorg" pass. I've
> looked around and haven't really seen one.

IA-64 has one, to build bundles; it reuses the scheduler.

> Could you give me an idea of where to look and how exactly that would work?

The reorg pass runs after register allocation.  You could try to identify 
consecutive loads within basic blocks, group them and rename registers or add 
copy insns to be able to replace them with multiple loads.  Then you could rerun 
a cprop_hardreg pass to clean up things.
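
The grouping step can be sketched on toy data as follows.  This is
only an illustration of the idea, not GCC code: struct toy_load and
both functions are hypothetical, and a real reorg pass would walk RTL
insns and check for intervening stores and register uses as well.

```c
#include <stddef.h>

/* Toy description of one load: dest_reg = *(base_reg + offset).  */
struct toy_load
{
  int base_reg;
  int offset;
  int dest_reg;
};

/* Length of the run of loads starting at START that read consecutive
   words (WORD_SIZE bytes apart) from the same base register.  */
size_t consecutive_run (const struct toy_load *loads, size_t n,
                        size_t start, int word_size)
{
  size_t len = 1;

  if (start >= n)
    return 0;

  while (start + len < n
         && loads[start + len].base_reg == loads[start].base_reg
         && loads[start + len].offset
            == loads[start].offset + (int) len * word_size)
    len++;

  return len;
}

/* Nonzero if the run already targets consecutive registers, in which
   case no renaming or copy insns would be needed.  */
int dests_consecutive (const struct toy_load *loads, size_t start,
                       size_t len)
{
  size_t i;

  for (i = 1; i < len; i++)
    if (loads[start + i].dest_reg != loads[start].dest_reg + (int) i)
      return 0;

  return 1;
}
```

A run of length N whose destinations are not consecutive is the case
where renaming or copies come in, followed by the cprop_hardreg
cleanup mentioned above.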

-- 
Eric Botcazou


Re: Reserving a number of consecutive registers

2009-04-20 Thread fearyourself
> The reorg pass runs after register allocation.

> You could try to identify consecutive loads within basic blocks, group them

That is not too difficult, I've written a pass that checks for that
and identifies the loads.

> and rename registers or add
> copy insns to be able to replace them with multiple loads.  Then you could rerun
> a cprop_hardreg pass to clean up things.

OK, so if I copy the locally allocated registers into N consecutive
hard registers BEFORE the group of loads, then rerun cprop_hardreg, it
should replace the registers used by the loads with those of the
copies? And then I suppose the copies will just disappear in a later
pass, correct?

I guess my final question is: that pass will be defined with the
TARGET_MACHINE_DEPENDENT_REORG define, but how do I see whether a
register is free at that point? Can I still use the TEST_HARD_REG_BIT
macros?
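
To make the question concrete, here is a toy version of the search for
N consecutive free registers.  It assumes the set of live hard
registers is already available as a 32-bit mask; how to obtain that
set inside GCC is exactly the open question, and find_free_run is a
hypothetical helper, not a GCC function.

```c
#include <stdint.h>

/* Return the lowest register number R such that R .. R+COUNT-1 are
   all clear in LIVE_MASK, considering registers 0 .. NUM_REGS-1.
   Returns -1 if there is no such run.  */
int find_free_run (uint32_t live_mask, int num_regs, int count)
{
  int r, run = 0;

  if (count <= 0 || num_regs > 32 || count > num_regs)
    return -1;

  for (r = 0; r < num_regs; r++)
    {
      if (live_mask & ((uint32_t) 1 << r))
        run = 0;                 /* register r is live: restart */
      else if (++run == count)
        return r - count + 1;    /* found COUNT free registers */
    }

  return -1;
}
```

With r0-r9 live out of 16 registers, a request for five registers
yields r10, matching the r10-r14 example from the first message.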

Thanks again,
Jc

On Mon, Apr 20, 2009 at 1:16 PM, Eric Botcazou  wrote:
>> Does any architecture do such a "machine-specific reorg" pass. I've
>> looked around and haven't really seen one.
>
> IA-64 has one, to build bundles; it reuses the scheduler.
>
>> Could you give me an idea of where to look and how exactly that would work?
>
> The reorg pass runs after register allocation.  You could try to identify
> consecutive loads within basic blocks, group them and rename registers or add
> copy insns to be able to replace them with multiple loads.  Then you could rerun
> a cprop_hardreg pass to clean up things.
>
> --
> Eric Botcazou
>


Re: Reserving a number of consecutive registers

2009-04-20 Thread Eric Botcazou
> Ok, so if I copy the locally allocated registers into N consecutive
> hard registers BEFORE the group of loads, then rerun cprop_hardreg, it
> should replace the ones of the loads to the ones of the copies ? And
> then I suppose the copies will just disappear at a later pass, correct
> ?

Something like that, to be experimented though.

> I guess, my final question is, that pass will be defined with the
> TARGET_MACHINE_DEPENDENT_REORG define but how do I see if a register
> is free at that point ? Can I still use the TEST_HARD_REG_BIT macros ?

You should use the DF framework in 4.3.x and later.

-- 
Eric Botcazou


Re: Reserving a number of consecutive registers

2009-04-20 Thread fearyourself
> You should use the DF framework in 4.3.x and later.

Ok, I'll try to look at that. Is there an area where I can see how to
initialize the framework and get information about which registers are
free?

Right now, I'm looking in combine.c to see how they are using it. Any
insight would be useful,

Thanks!
Jc

On Mon, Apr 20, 2009 at 1:52 PM, Eric Botcazou  wrote:
>> Ok, so if I copy the locally allocated registers into N consecutive
>> hard registers BEFORE the group of loads, then rerun cprop_hardreg, it
>> should replace the ones of the loads to the ones of the copies ? And
>> then I suppose the copies will just disappear at a later pass, correct
>> ?
>
> Something like that, to be experimented though.
>
>> I guess, my final question is, that pass will be defined with the
>> TARGET_MACHINE_DEPENDENT_REORG define but how do I see if a register
>> is free at that point ? Can I still use the TEST_HARD_REG_BIT macros ?
>
> You should use the DF framework in 4.3.x and later.
>
> --
> Eric Botcazou
>


Re: [gnat] reuse of ASTs already constructed

2009-04-20 Thread Oliver Kellogg
Hi Geert,

Thanks for your answer - I was starting to feel I'm entertaining a
monologue here ;)

On Mon, 2009-04-20 at 13:05 -0400, Geert Bosch wrote:
>
> While this may be an interesting idea, there are some fundamental  
> assumptions in the compiler that each compilation indeed processes
> a single compilation unit, resulting in a single object and .ali
> file. It would be best to first contemplate what output a single
> invocation of the compiler, with multiple compilation units
> as arguments, should produce.

For an invocation
 gnat1 a.adb b.adb c.adb
, the files a.{s,ali} b.{s,ali} c.{s,ali} are produced.

> How would you decide if a unit needs recompilation if there was no 1:1
> correspondence between compilation units and object/.ali files?

The correspondence is still 1:1, see above.

> Note that unlike many other languages, Ada requires checks to avoid
> including out-of-date compilation results in a program.

Certainly - but that's the job of gnatmake. I'm at the level of gnat1.
In the end, gnatmake would not generate a sequence of individual calls
to gcc but rather supply all files to be recompiled in a single call.

Oliver




Re: Reserving a number of consecutive registers

2009-04-20 Thread Eric Botcazou
> Ok, I'll try to look at that. Is there an area where I can see how to
> initialize the framework and get information about which registers are
> free?

The API is in df.h, see for example ifcvt.c.

-- 
Eric Botcazou


Re: [gnat] reuse of ASTs already constructed

2009-04-20 Thread Geert Bosch


On Apr 20, 2009, at 14:45, Oliver Kellogg wrote:

>> It would be best to first contemplate what output a single
>> invocation of the compiler, with multiple compilation units
>> as arguments, should produce.
>
> For an invocation
>  gnat1 a.adb b.adb c.adb
> , the files a.{s,ali} b.{s,ali} c.{s,ali} are produced.


The back end is not prepared to produce multiple assembly files.
The "gcc" driver program also assumes each invocation produces a
single .s file.

So, if this is what you want to do, you'd have to address all these
underlying limitations first.

  -Geert


The Linux binutils 2.19.51.0.4 is released

2009-04-20 Thread H.J. Lu
This is the beta release of binutils 2.19.51.0.4 for Linux, which is
based on binutils 2009 0418 in CVS on sourceware.org plus various
changes. It is purely for Linux.

All relevant patches in patches have been applied to the source tree.
You can take a look at patches/README to see what has been applied and
in what order the patches have been applied.

Starting from the 2.18.50.0.4 release, the x86 assembler no longer
accepts

fnstsw %eax

fnstsw stores 16 bits into %ax and leaves the upper 16 bits of %eax
unchanged.
Please use

fnstsw %ax

Starting from the 2.17.50.0.4 release, the default output section LMA
(load memory address) has changed for allocatable sections from being
equal to VMA (virtual memory address), to keeping the difference between
LMA and VMA the same as the previous output section in the same region.

For

.data.init_task : { *(.data.init_task) }

LMA of .data.init_task section is equal to its VMA with the old linker.
With the new linker, it depends on the previous output section. You
can use

.data.init_task : AT (ADDR(.data.init_task)) { *(.data.init_task) }

to ensure that LMA of .data.init_task section is always equal to its
VMA. The linker script in the older 2.6 x86-64 kernel depends on the
old behavior.  You can add AT (ADDR(section)) to force LMA of
.data.init_task section equal to its VMA. It will work with both old
and new linkers. The x86-64 kernel linker script in kernel 2.6.13 and
above is OK.

The new x86_64 assembler no longer accepts

monitor %eax,%ecx,%edx

You should use

monitor %rax,%ecx,%edx

or

monitor

which works with both old and new x86_64 assemblers. They should
generate the same opcode.

The new i386/x86_64 assemblers no longer accept instructions for moving
between a segment register and a 32bit memory location, i.e.,

movl (%eax),%ds
movl %ds,(%eax)

To generate instructions for moving between a segment register and a
16bit memory location without the 16bit operand size prefix, 0x66,

mov (%eax),%ds
mov %ds,(%eax)

should be used. It will work with both new and old assemblers. The
assembler starting from 2.16.90.0.1 will also support

movw (%eax),%ds
movw %ds,(%eax)

without the 0x66 prefix. Patches for 2.4 and 2.6 Linux kernels are
available at

http://www.kernel.org/pub/linux/devel/binutils/linux-2.4-seg-4.patch
http://www.kernel.org/pub/linux/devel/binutils/linux-2.6-seg-5.patch

The ia64 assembler is now defaulted to tune for Itanium 2 processors.
To build a kernel for Itanium 1 processors, you will need to add

ifeq ($(CONFIG_ITANIUM),y)
CFLAGS += -Wa,-mtune=itanium1
AFLAGS += -Wa,-mtune=itanium1
endif

to arch/ia64/Makefile in your kernel source tree.

Please report any bugs related to binutils 2.19.51.0.4 to
hjl.to...@gmail.com

and

http://www.sourceware.org/bugzilla/

Changes from binutils 2.19.51.0.3:

1. Update from binutils 2009 0418.
2. Remove EFI targets and use PEI targets for EFI. Add --file-alignment,
--heap, --image-base, --section-alignment, --stack and --subsystem command
line options for objcopy.  PR 10074.
3. Update linker to warn about alternate ELF machine code.
4. Fix x86 linker TLS transition.  PR 9938.
5. Improve DWARF dumper to check relocations against STT_SECTION
symbol.
6. Guard DWARF dumper on bad DWARF input.
7. Add EM_ETPU and EM_SLE9X.  Reserve 3 ELF machine types for Intel.
8. Add a linker warning for a missing entry symbol with -pie. PR 9970.
9. Make the linker's -e option imply -u.  PR 6766.
10. Properly handle paging for PEI targets.
11. Fix assembler listing with input from stdin.
12. Update objcopy/string to generate symbol table if there is any
relocation in output.  PR 9945.
13. Require texinfo 4.7 for build.  PR 10039.
14. Add moxie support.
15. Improve gold support.
16. Improve AIX support.
17. Improve arm support.
18. Improve cris support.
19. Improve crx support.
20. Improve mips support.
21. Improve ppc support.
22. Improve s390 support.
23. Improve spu support.
24. Improve vax support.

Changes from binutils 2.19.51.0.2:

1. Update from binutils 2009 0310.
2. Fix strip on common symbols in relocatable file.  PR 9933.
3. Fix --enable-targets=all build.
4. Fix ia64 build with -Wformat-security.  PR 9874.
5. Add REGION_ALIAS support in linker script.
6. Add thin archive support to readelf.
7. Improve DWARF support in objdump.
8. Improve alpha support.
9. Improve arm support.
10. Improve hppa support.
11. Improve m68k support.
12. Improve mips support.
13. Improve ppc support.
14. Improve xtensa support.
15. Add score 7 support.

Changes from binutils 2.19.51.0.1:

1. Update from binutils 2009 0204.
2. Support AVX Programming Reference (January, 2009)
3. Improve .s suffix support in x86 disassembler.
4. Add --prefix/--prefix-strip for objdump -S.  PR 9784.
5. Change "ld --as-needed" to resolve undefined references in DSO.
6. Add -Ttext-segment to ld to set address of text segment.
7. Fix "ld -r --gc-sections --entry" crash with

Re: [gnat] reuse of ASTs already constructed

2009-04-20 Thread Oliver Kellogg
On Mon, 2009-04-20 at 14:55 -0400, Geert Bosch wrote:
> > For an invocation
> > gnat1 a.adb b.adb c.adb
> > , the files a.{s,ali} b.{s,ali} c.{s,ali} are produced.
> 
> The back end is not prepared to produce multiple assembly files.
> The "gcc" driver program also assumes each invocation produces a
> single .s file.
> 
> So, if this is what you want to do, you'd have to address all these
> underlying limitations first.
> 

Sure, that's what I'm doing. 
See also the first rough patch which I attached to
http://gcc.gnu.org/ml/gcc/2009-04/msg00380.html , namely
http://gcc.gnu.org/ml/gcc/2009-04/msg00380/gnat1_multi_source_compile-0.diff.gz
(which meanwhile is outdated.)

Oliver




Summer of Code 2009 "Support for an ELF writer"

2009-04-20 Thread Kirill Kononenko
Hello


So how did it happen that the only project which was a candidate for
libJIT Summer of Code in GNU, with the same title got selected in
LLVM?


Does it mean that the same genius idea came to two minds?


Thanks,
Kirill

-- 
http://code.google.com/p/libjit-linear-scan-register-allocator/


Re: Summer of Code 2009 "Support for an ELF writer"

2009-04-20 Thread Joe Buck
On Mon, Apr 20, 2009 at 01:34:15PM -0700, Kirill Kononenko wrote:
> So how did it happen that the only project which was a candidate for
> libJIT Summer of Code in GNU, with the same title got selected in
> LLVM?

You can ask the Google folks that question, but it's off-topic for the GCC
list.


Re: [LLVMdev] Summer of Code 2009 "Support for an ELF writer"

2009-04-20 Thread Bill Wendling
On Mon, Apr 20, 2009 at 1:34 PM, Kirill Kononenko wrote:
> Hello
>
>
> So how did it happen that the only project which was a candidate for
> libJIT Summer of Code in GNU, with the same title got selected in
> LLVM?
>
>
> Does it mean that the same genius idea came to two minds?
>
No. It's a conspiracy, of course.

-bw


For backend maintainers: changes for C++ compatibility

2009-04-20 Thread Ian Lance Taylor
Last week I committed a patch to check for enum comparisons which are
invalid in C++.  Today I committed a patch to check for enum conversions
during function calls which are invalid in C++.  These new warnings are
enabled by -Wc++-compat.

I have not fixed every gcc backend to compile without producing any of
the new warnings.  I have fixed some: arm, i386, ia64, mips, pa, rs6000,
s390, sparc, spu.  This covers all the primary and secondary platforms,
plus spu which I had to touch because I changed a target vector which it
implements.

Adding these warnings to -Wc++-compat may break bootstrap for other
platforms, and may cause additional warnings when building
cross-compilers.  I would like to ask the maintainers for backends which
I did not mention to bootstrap their targets if possible, and/or to
build them with a newly built mainline compiler, to see if there are new
warnings about C++ compatibility.  These warnings are normally
straightforward to fix.  For warnings related to calling
gen_rtx_EXPR_LIST to add a register note, change the code to use the
add_reg_note function instead.

I am willing to help fix any backends which have trouble with the new
warnings.  However, I do not plan to build every backend myself.

There will be some more improvements to -Wc++-compat coming, causing
more warnings on existing gcc code, and I plan to follow this same
procedure.

Ian
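
As an illustration of the kind of code the enum conversion check is
aimed at, here is a minimal sketch.  The enum and function names are
hypothetical and are not taken from the gcc sources.  C converts an
int to an enum type implicitly when passing an argument, while C++
requires an explicit cast, so the first wrapper is valid C but would
be rejected by a C++ compiler.

```c
#include <assert.h>

/* Hypothetical enum, standing in for the enums used in a backend. */
enum note_kind { NOTE_DEAD, NOTE_UNUSED, NOTE_EQUAL };

/* A helper that takes an enum parameter. */
int note_rank(enum note_kind kind)
{
    return (int) kind + 1;
}

/* Valid C, invalid C++: the int argument is converted to
   enum note_kind implicitly at the call. */
int rank_from_int(int raw)
{
    return note_rank(raw);
}

/* Valid in both languages: the conversion is spelled out. */
int rank_from_int_portable(int raw)
{
    return note_rank((enum note_kind) raw);
}

/* Both forms compute the same value when compiled as C. */
void self_check(void)
{
    assert(rank_from_int(2) == rank_from_int_portable(2));
    assert(rank_from_int_portable(0) == 1);
}
```

Calling self_check() from a small main() exercises both wrappers.  The
usual fix is the one shown in rank_from_int_portable: either add the
cast or change the variable to the enum type in the first place.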


Re: For backend maintainers: changes for C++ compatibility

2009-04-20 Thread Andrew Pinski
On Mon, Apr 20, 2009 at 2:30 PM, Ian Lance Taylor  wrote:
> I have not fixed every gcc backend to compile without producing any of
> the new warnings.  I have fixed some: arm, i386, ia64, mips, pa, rs6000,
> s390, sparc, spu.  This covers all the primary and secondary platforms,
> plus spu which I had to touch because I changed a target vector which it
> implements.

Did you test the SPU compiler before you committed this?  If you did
not, then you did not follow the procedure that new developers are
asked to follow, which is documented at
http://gcc.gnu.org/contribute.html.  You can ask someone to help
you out if you need a place to test the spu compiler, but I did not see
such a request.

Thanks,
Andrew Pinski


CONSTRAINT__LIMIT

2009-04-20 Thread Ian Lance Taylor
Vlad, I noticed that the code in setup_cover_and_important_classes in
ira.c does #ifdef CONSTRAINT__LIMIT.  However, CONSTRAINT__LIMIT is not
a preprocessor macro; it is an enum constant defined in the generated
file tm-preds.h.  I think that the code within the #ifdef is either
unnecessary or is not running when it should.  I haven't investigated
further.

Ian
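
The underlying issue can be reproduced in isolation.  The sketch below
uses a hypothetical enum as a stand-in for the one in the generated
header; only the name CONSTRAINT__LIMIT is taken from the message
above.  The preprocessor knows nothing about enum constants, so
#ifdef on one is always false and the guarded code is silently
dropped.

```c
#include <assert.h>

/* Hypothetical stand-in for the enum in the generated header.
   Only the name CONSTRAINT__LIMIT comes from the real code. */
enum constraint_num { CONSTRAINT_a, CONSTRAINT_b, CONSTRAINT__LIMIT };

/* The preprocessor runs before enum constants exist, so the
   #ifdef branch is never compiled in. */
int limit_seen_by_ifdef(void)
{
#ifdef CONSTRAINT__LIMIT
    return 1;
#else
    return 0;
#endif
}

/* The compiler proper does see the enum constant. */
int limit_seen_by_compiler(void)
{
    return (int) CONSTRAINT__LIMIT;
}

void self_check(void)
{
    assert(limit_seen_by_ifdef() == 0);
    assert(limit_seen_by_compiler() == 2);
}
```

If the guarded code is meant to run, one option is to drop the #ifdef
and use the constant directly; another is to have the generator also
emit a real macro to test.  Which of these suits ira.c is not something
this sketch can decide.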


Re: For backend maintainers: changes for C++ compatibility

2009-04-20 Thread Joseph S. Myers
On Mon, 20 Apr 2009, Andrew Pinski wrote:

> On Mon, Apr 20, 2009 at 2:30 PM, Ian Lance Taylor  wrote:
> > I have not fixed every gcc backend to compile without producing any of
> > the new warnings.  I have fixed some: arm, i386, ia64, mips, pa, rs6000,
> > s390, sparc, spu.  This covers all the primary and secondary platforms,
> > plus spu which I had to touch because I changed a target vector which it
> > implements.
> 
> Did you test the SPU compiler before you committed this?  If you did
> not test this, then you did not follow at all what new developers are
> requested to do which is documented at
> http://gcc.gnu.org/contribute.html.  You can request someone to help
> you out if you need a place to test the spu compiler but I did not see
> that request before.

With experience one may judge exactly what testing is appropriate for 
complicated and wide-ranging patches (whether ones one submits or ones one 
approves) and declare what has been done when sending the patch.  
Verifying that cc1 builds is a common form of testing for some mechanical 
changes (though one might explicitly say in some cases that a certain 
amount of time is being left for target maintainers to object to the 
changes to their target).  The instructions for contributors (much of 
which I wrote) are generally applicable to most changes and new developers 
would be well advised to check their contributions carefully against them, 
but they do not provide a perfect algorithm for determining exactly what 
should be done with every patch.  (The legal requirements of FSF policy do 
however always have to be followed unless the FSF grants an exception in a 
particular case.)

-- 
Joseph S. Myers
jos...@codesourcery.com

How does gcc compile source code just like *(ptr base + offset) ?

2009-04-20 Thread 房陈
Hi!
    I really want to know how gcc compiles code like *(ptr base +
offset), where ptr base is the initial address of a pointer variable
and offset is any legal integer expression. Here is an example:

    int i = 1;
    int j = 1;
    int *buf = (int*)malloc(10 *sizeof(int));
    *(buf + i + j) = 7;

    And the corresponding assembly code is:
    ..
        int i = 1;
 80483b5:   c7 45 f0 01 00 00 00    movl   $0x1,-0x10(%ebp)
    int j = 1;
 80483bc:   c7 45 f4 01 00 00 00    movl   $0x1,-0xc(%ebp)
    int *buf = (int*)malloc(10 * sizeof(int));
 80483c3:   c7 04 24 28 00 00 00    movl   $0x28,(%esp)
 80483ca:   e8 09 ff ff ff  call   80482d8 
 80483cf:   89 45 f8    mov    %eax,-0x8(%ebp)

    *(buf + i + j) = 7;
 80483d2:   8b 55 f0    mov    -0x10(%ebp),%edx
 80483d5:   8b 45 f4    mov    -0xc(%ebp),%eax
 80483d8:   8d 04 02    lea    (%edx,%eax,1),%eax
 80483db:   c1 e0 02    shl    $0x2,%eax
 80483de:   03 45 f8    add    -0x8(%ebp),%eax
 80483e1:   c7 00 07 00 00 00   movl   $0x7,(%eax)
   ..
So I guess that gcc always computes the offset "i + j" first, and then
adds the result to the base address of buf to obtain the
final address. Is my guess right? Are there any exceptions?
PS: My gcc version is 4.3.3.
 Thank you!
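
For reference, the three arithmetic steps in the listing above (lea,
shl, add) can be written out by hand in C.  This is only a sketch of
what that particular unoptimized listing does; element_address is a
hypothetical name, and the shift by 2 in the listing corresponds to
sizeof(int) being 4 on that target.

```c
#include <assert.h>
#include <stddef.h>

/* Spell out the address computation from the listing:
   add the two indices, scale by the element size, add the base. */
int *element_address(int *base, int i, int j)
{
    int index = i + j;                          /* lea (%edx,%eax,1),%eax */
    ptrdiff_t byte_offset =
        (ptrdiff_t) index * (ptrdiff_t) sizeof(int);  /* shl $0x2,%eax */
    return (int *) ((char *) base + byte_offset);     /* add -0x8(%ebp),%eax */
}

void self_check(void)
{
    int buf[10] = { 0 };

    /* Same address as the ordinary pointer expression. */
    assert(element_address(buf, 1, 1) == buf + 1 + 1);

    *element_address(buf, 1, 1) = 7;
    assert(buf[2] == 7);
}
```

The result is the same address that buf + i + j denotes in C, whatever
order the compiler chooses for the additions.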


[m32c] IRA reload failures in libstdc++

2009-04-20 Thread DJ Delorie

This is typical of the types of failures m32c got before IRA, too.  I
had a good build on Feb 19th, but if I try to reproduce it, it fails
too.

Fails with -O2, works with -Os.  Note: you might need -fno-ivopts to
get around the recent m32c/IV problems.

Any ideas?  Any thoughts on why gcc has so many problems with this
chip?


d...@greed pts/9 ~/m32c/gcc/m32c-elf/m32c-elf/m32cm/libstdc++-v3/src
$ make CXXFLAGS="-g -O2 -dap" wstring-inst.o
/greed/dj/m32c/gcc/m32c-elf/./gcc/xgcc -shared-libgcc 
-B/greed/dj/m32c/gcc/m32c-elf/./gcc -nostdinc++ 
-L/greed/dj/m32c/gcc/m32c-elf/m32c-elf/m32cm/libstdc++-v3/src 
-L/greed/dj/m32c/gcc/m32c-elf/m32c-elf/m32cm/libstdc++-v3/src/.libs 
-B/greed/dj/m32c/install/m32c-elf/bin/ -B/greed/dj/m32c/install/m32c-elf/lib/ 
-isystem /greed/dj/m32c/install/m32c-elf/include -isystem 
/greed/dj/m32c/install/m32c-elf/sys-include  -mcpu=m32cm -DHAVE_CONFIG_H -I. 
-I../../../../../gcc/libstdc++-v3/src -I..  
-I/greed/dj/m32c/gcc/m32c-elf/m32c-elf/m32cm/libstdc++-v3/include/m32c-elf 
-I/greed/dj/m32c/gcc/m32c-elf/m32c-elf/m32cm/libstdc++-v3/include 
-I/greed/dj/m32c/gcc/gcc/libstdc++-v3/libsupc++  -fno-implicit-templates -Wall 
-Wextra -Wwrite-strings -Wcast-qual  -fdiagnostics-show-location=once  
-ffunction-sections -fdata-sections  -g -O2 -dap -std=gnu++0x -c 
../../../../../gcc/libstdc++-v3/src/wstring-inst.cc

In file included from 
/greed/dj/m32c/gcc/m32c-elf/m32c-elf/m32cm/libstdc++-v3/include/string:53,
 from ../../../../../gcc/libstdc++-v3/src/string-inst.cc:33,
 from ../../../../../gcc/libstdc++-v3/src/wstring-inst.cc:34:
/greed/dj/m32c/gcc/m32c-elf/m32c-elf/m32cm/libstdc++-v3/include/bits/basic_string.h:
 In member function 'std::basic_string<_CharT, _Traits, _Alloc>& 
std::basic_string<_CharT, _Traits, 
_Alloc>::assign(std::initializer_list<_CharT>) [with _CharT = wchar_t, _Traits 
= std::char_traits, _Alloc = std::allocator]':

/greed/dj/m32c/gcc/m32c-elf/m32c-elf/m32cm/libstdc++-v3/include/bits/basic_string.h:1007:
 error: unable to find a register to spill in class 'A_REGS'
/greed/dj/m32c/gcc/m32c-elf/m32c-elf/m32cm/libstdc++-v3/include/bits/basic_string.h:1007:
 error: this is the insn:
(insn 18 17 19 2 
/greed/dj/m32c/gcc/m32c-elf/m32c-elf/m32cm/libstdc++-v3/include/bits/basic_string.h:1475
 (parallel [
(set (reg:HI 0 r0 [39])
(ashift:HI (mem/s:HI (plus:PSI (subreg:PSI (reg/f:SI 25 [ 
__i1$_M_current ]) 0)
(const_int -6 [0xfffa])) [3 
.D.12386._M_length+0 S2 A8])
(const_int 2 [0x2])))
(clobber (scratch:HI))
]) 224 {ashlhi3_i} (nil))
/greed/dj/m32c/gcc/m32c-elf/m32c-elf/m32cm/libstdc++-v3/include/bits/basic_string.h:1007:
 internal compiler error: in spill_failure, at reload1.c:2094


[wstring-inst.cc.175r.ira]

Spilling for insn 18.
reload failure for reload 0

Reloads for insn # 18
Reload 0: reload_in (SI) = (reg/f:SI 25 [ __i1$_M_current ])
A_REGS, RELOAD_FOR_OTHER_ADDRESS (opnum = 0)
reload_in_reg: (reg/f:SI 25 [ __i1$_M_current ])
Reload 1: reload_in (HI) = (mem/s:HI (plus:PSI (subreg:PSI (reg/f:SI 25 [ 
__i1$_M_current ]) 0)
(const_int -6 
[0xfffa])) [3 .D.12386._M_length+0 S2 A8])
reload_out (HI) = (reg:HI 0 r0 [39])
A_HI_MEM_REGS, RELOAD_OTHER (opnum = 0)
reload_in_reg: (mem/s:HI (plus:PSI (subreg:PSI (reg/f:SI 25 [ 
__i1$_M_current ]) 0)
(const_int -6 
[0xfffa])) [3 .D.12386._M_length+0 S2 A8])
reload_out_reg: (reg:HI 0 r0 [39])
reload_reg_rtx: (reg:HI 0 r0 [39])


Re: CONSTRAINT__LIMIT

2009-04-20 Thread Vladimir Makarov

Ian Lance Taylor wrote:

> Vlad, I noticed that the code in setup_cover_and_important_classes in
> ira.c does #ifdef CONSTRAINT__LIMIT.  However, CONSTRAINT__LIMIT is not
> a preprocessor macro; it is an enum constant defined in the generated
> file tm-preds.h.  I think that the code within the #ifdef is either
> unnecessary or is not running when it should.  I haven't investigated
> further.
  
Ian, thanks for reporting this.  I'll investigate this further.  It
affects only ports using priority allocation (I know of only one, which
is m32c).  DJ just recently reported a reload failure problem on m32c.
That is probably caused by this wrong code.




Re: How does gcc compile source code just like *(ptr base + offset) ?

2009-04-20 Thread Ian Lance Taylor
房陈  writes:

>     I really want to how does gcc compile code like *(ptr base +
> offset), where ptr base is the initial address of a pointer variable
> and offset is any legal integer expression. There is a example here:
>
>     int i = 1;
>     int j = 1;
>     int *buf = (int*)malloc(10 *sizeof(int));
>     *(buf + i + j) = 7;
>
>     And the correspondent assembly code is :
>     ..
>         int i = 1;
>  80483b5:   c7 45 f0 01 00 00 00    movl   $0x1,-0x10(%ebp)
>     int j = 1;
>  80483bc:   c7 45 f4 01 00 00 00    movl   $0x1,-0xc(%ebp)
>     int *buf = (int*)malloc(10 * sizeof(int));
>  80483c3:   c7 04 24 28 00 00 00    movl   $0x28,(%esp)
>  80483ca:   e8 09 ff ff ff  call   80482d8 
>  80483cf:   89 45 f8    mov    %eax,-0x8(%ebp)
>
>     *(buf + i + j) = 7;
>  80483d2:   8b 55 f0    mov    -0x10(%ebp),%edx
>  80483d5:   8b 45 f4    mov    -0xc(%ebp),%eax
>  80483d8:   8d 04 02    lea    (%edx,%eax,1),%eax
>  80483db:   c1 e0 02    shl    $0x2,%eax
>  80483de:   03 45 f8    add    -0x8(%ebp),%eax
>  80483e1:   c7 00 07 00 00 00   movl   $0x7,(%eax)
>    ..
> So I guess that gcc would always compute offset "i+j" first, and then
> add the result of "i + j" to the base address of buf to obtain the
> final address. Do I guess right? Is there any exception?
> ps: My gcc version is 4.3.3.

Unless you plan to modify gcc itself, this question would be more
appropriate for the gcc-help mailing list.  Please take any followups
there.

If you compile at -O0, gcc will probably generate code more or less as
you describe.  However, there is no guarantee of that.  If you compile
with optimization, then the instructions can and will be completely
changed.  In particular, for your example, gcc will most likely forward
propagate i + j, fold the constant, and simply use buf + 2.

One way to see what gcc does is to use -fdump-tree-all and examine the
generated dump files.

Ian
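
To make the point about source-level semantics concrete: C parses
buf + i + j as (buf + i) + j, and the compiler may regroup or fold the
arithmetic as long as the observable result is the same.  The sketch
below uses hypothetical function names and only shows that the
expression as written and its folded form store to the same element.
It says nothing about which instructions a given gcc version emits.

```c
#include <assert.h>

/* As written in the question: parsed as (buf + i) + j. */
void store_as_written(int *buf)
{
    int i = 1;
    int j = 1;
    *(buf + i + j) = 7;
}

/* The form an optimizer is allowed to reduce it to once i and j
   are known to be 1. */
void store_folded(int *buf)
{
    buf[2] = 7;
}

void self_check(void)
{
    int a[10] = { 0 };
    int b[10] = { 0 };
    int k;

    store_as_written(a);
    store_folded(b);

    /* Both versions leave the arrays in the same state. */
    for (k = 0; k < 10; k++)
        assert(a[k] == b[k]);
    assert(a[2] == 7);
}
```

Compiling the two functions at different optimization levels and
comparing the output is one way to check what a particular gcc does
with them.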


Re: [LLVMdev] Summer of Code 2009 "Support for an ELF writer"

2009-04-20 Thread Kirill Kononenko
Sorry for the off-topic post.

I did not say it is a conspiracy. I only found this funny.


Thanks,
Kirill

2009/4/21 Bill Wendling :
> On Mon, Apr 20, 2009 at 1:34 PM, Kirill Kononenko wrote:
>> Hello
>>
>>
>> So how did it happen that the only project which was a candidate for
>> libJIT Summer of Code in GNU, with the same title got selected in
>> LLVM?
>>
>>
>> Does it mean that the same genius idea came to two minds?
>>
> No. It's a conspiracy, of course.
>
> -bw
>



-- 
http://code.google.com/p/libjit-linear-scan-register-allocator/