Profile-directed feedback and remote testing

2005-03-25 Thread Mark Mitchell
When we generate data for feedback, we insert the .gcda name into the 
object file as an absolute path.  As a result, when we try to do remote 
testing, we lose, as, in general the remote file system does not have 
the same file hierarchy as the build system.

I understand why we generate an absolute path; we want to make sure that 
the data ends up there, not in the directory where the user happens to 
run the program.  So, I intend to disable these tests when $host != 
$target.  Any objections, or better ideas?

--
Mark Mitchell
CodeSourcery, LLC
[EMAIL PROTECTED]
(916) 791-8304


Re: Profile-directed feedback and remote testing

2005-03-25 Thread Gabriel Dos Reis
Mark Mitchell <[EMAIL PROTECTED]> writes:

| When we generate data for feedback, we insert the .gcda name into the
| object file as an absolute path.  As a result, when we try to do
| remote testing, we lose, as, in general the remote file system does
| not have the same file hierarchy as the build system.
| 
| I understand why we generate an absolute path; we want to make sure
| that the data ends up there, not in the directory where the user
| happens to run the program.  So, I intend to disable these tests when
| $host != $target.  Any objections, or better ideas?

I'm supportive of that idea.

Maybe an option to tell the compiler where to put those .gcda files?

(in general, when the "user" running the compiler does not have the
same rights as the "builder", one runs into the same problem.  I used
to find that quite annoying.)

-- Gaby


maybe a gcc bug

2005-03-25 Thread zouq

/testcom.c
int main (void)
{
int i,j;
int u[100][100], v[100][100],
p[100][100], unew[100][100],
vnew[100][100],pnew[100][100],
uold[100][100],vold[100][100],
pold[100][100],cu[100][100],
cv[100][100],z[100][100],h[100][100],psi[100][100];

 int tdts8=2;
 int tdtsdx=3;
 int tdtsdy=4;

 for (i=0;i<100;i++)
   for (j=0;j<100;j++)
   {
   unew[i+1][j]=uold[i+1][j]+tdts8*(z[i+1][j]+z[i+1][j])*
(cv[i+1][j+1]+cv[i][j+1]+cv[i][j]+cv[i+1][j])
-tdtsdx*(h[i+1][j]-h[i][j]);
   /*vnew[i][j+1]=vold[i][j+1]-tdts8*(z[i+1][j+1]+z[i][j+1])
*(cu[i+1][j+1]+cu[i][j+1]+cu[i][j]+cu[i+1][j])
-tdtsdy*(h[i][j+1]-h[i][j]);*/
   /*pnew[i][j]=pold[i][j]-tdtsdx*(cu[i+1][j]-cu[i][j])-
tdtsdy*(cv[i][j+1]-cv[i][j]);*/

   }

 for (i=0;i<100;i++)
   for (j=0;j<100;j++)
 printf ("%d\n%d\n%d\n",unew[i][j], vnew[i][j], pnew[i][j]);

 return 1;
}

First I built gcc-4.1-20050320 as a cross-compiler for powerpc.
When I compile the above program, it fails like this:

testcom.c:34: internal compiler error: in schedule_insns, at sched-rgn.c:2549

Can anyone tell me why?
Why does it cause a compiler error?





Re: GCC3 to GCC4 performance regression. Bug?

2005-03-25 Thread Steven Bosscher
On Friday 25 March 2005 01:31, James E Wilson wrote:
> On Thu, 2005-03-24 at 15:52, Steven Bosscher wrote:
> > I'd suggest trying -fmove-loop-invariants, and report a bug about
> > that instead if it does not move those loop invariants.  We really
> > should move away from loop.c anyway.
>
> In general, yes, but we will probably always need some RTL loop
> optimizations.

I don't believe anyone has ever claimed otherwise.  And my suggestion
is exactly to enable the new RTL invariant code motion pass ;-)  See
loop-invariant.c.


> Lowering gimple to RTL may expose target dependent loop 
> invariants that were not present in the gimple.  Hence, we still need
> the RTL loop pass to work.

No, we need the *new*, CFG based RTL loop optimizers to work.  They,
together with the tree loop optimizers, should subsume and improve on
all the billion-and-a-half things loop.c currently does.

> There is also the issue of the special looping branches,

If you mean doloop, see loop-doloop.c; the code in loop.c has been
gone for a long time now ;-)

> There is also the more pragmatic problem that we are suffering user
> visible performance losses right now, and we shouldn't force users to
> wait for future tree-ssa enhancements to fix them when an apparently
> relatively simple RTL optimizer change can fix them.

Unless the new loop optimizer fails to fix the performance losses,
I'd disagree.  We should concentrate on blowing away loop.c.  It is
a tragedy we still need it.  Right now, loop.c makes it impossible to
shuffle the passes in the RTL optimizer path, because loop.c destroys
the CFG, and all profile information with it.  And there really is
not that much that loop.c still does.  On the tree-profiling-branch
we have already disabled the old RTL loop optimizer (note *old*) and
I would be very disappointed if we cannot do the same thing on
mainline for GCC 4.1.

Gr.
Steven


Re: maybe a gcc bug

2005-03-25 Thread Giovanni Bajo
zouq <[EMAIL PROTECTED]> wrote:

> /testcom.c
> int main (void)
> {
> int i,j;
> int u[100][100], v[100][100],
> p[100][100], unew[100][100],
> vnew[100][100],pnew[100][100],
> uold[100][100],vold[100][100],
> pold[100][100],cu[100][100],
> cv[100][100],z[100][100],h[100][100],psi[100][100];
>
>  int tdts8=2;
>  int tdtsdx=3;
>  int tdtsdy=4;
>
>  for (i=0;i<100;i++)
>for (j=0;j<100;j++)
>{
>unew[i+1][j]=uold[i+1][j]+tdts8*(z[i+1][j]+z[i+1][j])*
> (cv[i+1][j+1]+cv[i][j+1]+cv[i][j]+cv[i+1][j])
> -tdtsdx*(h[i+1][j]-h[i][j]);
>/*vnew[i][j+1]=vold[i][j+1]-tdts8*(z[i+1][j+1]+z[i][j+1])
> *(cu[i+1][j+1]+cu[i][j+1]+cu[i][j]+cu[i+1][j])
> -tdtsdy*(h[i][j+1]-h[i][j]);*/
>/*pnew[i][j]=pold[i][j]-tdtsdx*(cu[i+1][j]-cu[i][j])-
> tdtsdy*(cv[i][j+1]-cv[i][j]);*/
>
>}
>
>  for (i=0;i<100;i++)
>for (j=0;j<100;j++)
>  printf ("%d\n%d\n%d\n",unew[i][j], vnew[i][j], pnew[i][j]);
>
>  return 1;
> }
>
> First I built gcc-4.1-20050320 as a cross-compiler for powerpc.
> When I compile the above program, it fails like this:
>
> testcom.c:34: internal compiler error: in schedule_insns, at sched-rgn.c:2549
>
> Can anyone tell me why?
> Why does it cause a compiler error?

Any internal compiler error, *whatever* source file you use, is a bug in
GCC.  Would you please submit this as a proper bug report in Bugzilla?
Read the instructions at http://gcc.gnu.org/bugs.html.

Thanks
-- 
Giovanni Bajo



Re: BOOT_CFLAGS and -fomit-frame-pointer

2005-03-25 Thread Greg Schafer
On Fri, Mar 25, 2005 at 08:46:12AM +0100, Eric Botcazou wrote:

> Isn't that always the case in general?  With a 'make bootstrap' the compiler 
> is built by itself whereas with a bare 'make' it is built by the installed 
> compiler.  So in general the final compilers are not identical.

Umm.. you've missed my point. I'm talking about the case where the installed
compiler is an already "make bootstrapped" compiler of the same version.

Imagine this scenario:

 - GCC-4.0 is "make bootstrapped" with --prefix=/usr and installed as the
   system compiler.

 - Another GCC-4.0 is then built with a plain `make' but with
   --prefix=/opt/gcc-test1 and installed

 - Another GCC-4.0 is then built with `make bootstrap' but with
   --prefix=/opt/gcc-test2 and installed

The compilers in /opt/gcc-test1 and /opt/gcc-test2 are different because one
was built with -fomit-frame-pointer and the other one wasn't.

To reiterate, this is different behaviour from past GCC releases, and it
appears wrong to me.

> What prevents you from setting CFLAGS="-O2 -fomit-frame-pointer" if you
> happen to be rebuilding the compiler with an installed version of itself?

Nothing at all. I already mentioned the issue is easily worked around. But
that is not the point. My point is that the behaviour seems wrong, but I
wasn't sure whether it was worth entering in BZ, hence my raising the issue
here for insight by GCC developers. Not to worry.. it's no big deal.

Regards
Greg


Re: BOOT_CFLAGS and -fomit-frame-pointer

2005-03-25 Thread Eric Botcazou
> Umm.. you've missed my point.

Not really, if you read carefully. :-)  I was saying that the compilers are 
not meant to be identical in the general case.

> To reiterate, this is different behaviour from past GCC releases, and it
> appears wrong to me.

What is wrong exactly?  Why should 2 different build processes generate the 
same executable?  Is there a (written) rule about this?

-- 
Eric Botcazou




Re: BOOT_CFLAGS and -fomit-frame-pointer

2005-03-25 Thread Greg Schafer
On Fri, Mar 25, 2005 at 12:06:33PM +0100, Eric Botcazou wrote:

> What is wrong exactly?  Why should 2 different build processes generate the 
> same executable?  Is there a (written) rule about this?

No, there is no written rule. However, some folks (like me) are concerned
with matters of binary reproducibility, and this subtle change in x86 GCC's
default behaviour seemed a little suspect (to me at least). Like I said, no
big deal. If no one else thinks it's a problem, don't worry.

Regards
Greg


Re: reload question

2005-03-25 Thread tm_gccmail
On 22 Mar 2005, Ian Lance Taylor wrote:

> Miles Bader <[EMAIL PROTECTED]> writes:
> 
> > I've defined SECONDARY_*_RELOAD_CLASS (and PREFERRED_* to try to help
> > things along), and am now running into more understandable reload
> > problems:  "unable to find a register to spill in class"  :-/
> > 
> > The problem, as I understand, is that reload doesn't deal with conflicts
> > between secondary and primary reloads -- which are common with my arch
> > because it's an accumulator architecture.
> > 
> > For instance, slightly modifying my previous example:
> > 
> >Say I've got a mov instruction that only works via an accumulator A,
> >and a two-operand add instruction.  "r" regclass includes regs A,X,Y,
> >and "a" regclass only includes reg A.
> > 
> >mov has constraints like: 0 = "g,a"   1 = "a,gi"
> >and add3 has constraints: 0 = "a"   1 = "0"   2 = "ri" (say)
> > 
> > So if before reload you've got an instruction like:
> > 
> >add temp, [sp + 4], [sp + 6]
> > 
> > and v2 and v3 are in memory, it will have to generate something like:
> > 
> >mov A, [sp + 4]; primary reload 1 in X, with secondary reload 0 A
> >mov X, A   ;   ""
> >mov A, [sp + 6]; primary reload 2 in A, with no secondary reload
> >add A, X
> >mov temp, A
> > 
> > There's really only _one_ register that can be used for many reloads, A.
> 
> I don't think there is any way that reload can cope with this
> directly.  reload would have to get a lot smarter about ordering the
> reloads.
> 
> Since you need the accumulator for so much, one approach you should
> consider is not exposing the accumulator register until after reload.
> You could do this by writing pretty much every insn as a
> define_insn_and_split, with reload_completed as the split condition.
> Then you split into code that uses the accumulator.  Your add
> instruction permits you to add any two general registers, and you
> split into moving one into the accumulator, doing the add, and moving
> the result wherever it should go.  If you then split all the insns
> before the postreload pass, perhaps the generated code won't even be
> too horrible.
> 
> Ian
> 

This approach by itself has obvious problems.

It will generate a lot of redundant moves to/from the accumulator because
the accumulator is exposed much too late.

Consider the 3AC code:

add i,j,k
add k,l,m

it will be broken down into:

mov i,a
add j,a
mov a,k
mov k,a
add l,a
mov a,m

where the third and fourth instructions are basically redundant.

I did a lot of processor architecture research about three years ago, and
I came to some interesting conclusions about accumulator architectures.

Basically, with naive code generation, you will generate 3x as many
instructions for an accumulator machine as for a 3AC machine.

If you have a SWAP instruction so you can swap the accumulator with the
index registers, then you can lower the instruction count penalty to about
2x that of a 3AC machine. If you think about this for a while, the reason
will become readily apparent.

Reaching this 2x figure requires a good understanding of how
the data flows through the accumulator in an accumulator arch.

Toshi






Re: reload question

2005-03-25 Thread Ian Lance Taylor
<[EMAIL PROTECTED]> writes:

> It will generate a lot of redundant moves to/from the accumulator because
> the accumulator is exposed much too late.
> 
> Consider the 3AC code:
> 
> add i,j,k
> add k,l,m
> 
> it will be broken down into:
> 
> mov i,a
> add j,a
> mov a,k
> mov k,a
> add l,a
> mov a,m
> 
> where the third and fourth instructions are basically redundant.

In the gcc context, the fourth instruction can be removed by the
reload CSE pass in postreload.c.  Still, obviously the generated code
is going to be bad.

I agree that gcc is not well designed to cope with an accumulator
architecture.  Reload can't cope.

Ian


Re: reload question

2005-03-25 Thread Alan Lehotsky
Look at the IP2K port.  It's an 8-bit chip with a 16-bit accumulator
and VERY limited registers and addressing.  When I did this port
originally, I mostly hid the accumulator from the register allocator.
But I did implement extended precision arithmetic as a pattern that
optimized use of the accumulator.

We got pretty good code generated.  There's a pretty complete TCP/IP
stack implemented for this chip (it's basically an internet toaster
controller), and this is a machine with (IIRC) only 16K words of
instruction memory and 4K BYTES of data (and that's banked memory at
that!).

On Mar 25, 2005, at 08:13, <[EMAIL PROTECTED]> wrote:
> [... full quote of the preceding "Re: reload question" exchange snipped ...]

Alan Lehotsky - [EMAIL PROTECTED]
Carbon Design Systems, Inc


Re: A plan for eliminating cc0

2005-03-25 Thread Paul Schlie
> Ian Lance Taylor  writes:
> We would like to eliminate cc0 and the associated machinery from the
> compiler, because it is complicated and not supported on popular or
> modern processors.  Here is a plan which I think could accomplish that
> without unreasonable effort.

I pre-apologize if this is a dumb question, but:

Does GCC truly need to identify/treat condition state registers uniquely
from any other value produced as a result of a calculation?

It would seem that condition state dependencies may be tracked just as
any other defined (virtual or physical) register value dependency is
between instructions; the only thing unique about a machine's condition
state value is that it tends to be an implied rather than an explicit
operand of an instruction, just as accumulators tend to be on some
machines.

It would then seem that as long as a target accurately describes its
instructions' dependencies on whatever implied register operands it
chooses to partition machine state into, those dependencies may be
treated just as any other register value dependency would be,
irrespective of the value's purpose.

GCC would then only need to attach the statically determined "branch
condition", expressed as a comparison of a single previously computed
register value against 0*, to a logical branch instruction along with a
basic-block label; the target may implement that branch as any
combination of instruction and/or register value dependencies it
requires, without GCC needing to be aware of their purpose, only their
dependencies.

(* As 0 tends to be the basis of all arithmetic and logical operation
result comparisons, including comparison operations themselves, which
tend to be simply a subtraction whose result is compared against 0 but
not saved, explicit comparison operations are really optimizations, not
fundamental operations, and should never be strictly required.  I.e.,
conditional branching should be based on the canonical comparison of an
arbitrary value against 0:

 (set rx (op ...)) ;; rx is the result of an arbitrary operation
 (branch bc rx label)  ;; (branch bc rx label) :: if (rx bc 0) goto label;

 Some targets may need to generate an explicit subtract (or compare);
 others may specify an implied register in which the result of rx ?? 0
 is stored and on which the branch depends; or the branch itself may
 compute the comparison between rx and 0; etc.)





Re: Profile-directed feedback and remote testing

2005-03-25 Thread Daniel Jacobowitz
On Thu, Mar 24, 2005 at 11:59:55PM -0800, Mark Mitchell wrote:
> When we generate data for feedback, we insert the .gcda name into the 
> object file as an absolute path.  As a result, when we try to do remote 
> testing, we lose, as, in general the remote file system does not have 
> the same file hierarchy as the build system.
> 
> I understand why we generate an absolute path; we want to make sure that 
> the data ends up there, not in the directory where the user happens to 
> run the program.  So, I intend to disable these tests when $host != 
> $target.  Any objections, or better ideas?

It would be nice if we could preserve the ability to run them - when
your build directory is mounted on the target system at the same path,
the tests will pass.  Perhaps a compiler option, as Gabriel
suggested...

-- 
Daniel Jacobowitz
CodeSourcery, LLC


Re: A plan for eliminating cc0

2005-03-25 Thread Ian Lance Taylor
Paul Schlie <[EMAIL PROTECTED]> writes:

> Does GCC truly need to identify/treat condition state registers uniquely
> from any other value produced as a result of a calculation?

No, it doesn't.  The change I am proposing removes the unique handling
of condition state registers, and treats them like other registers.
The unique handling of condition state registers is historical, and
arose because of the characteristics of the initial gcc targets (e.g.,
vax, m68k).

The idea to do this is not mine; for more background see the
discussion of cc0 here:
http://gcc.gnu.org/wiki/general%20backend%20cleanup

Ian


Re: maybe a gcc bug

2005-03-25 Thread David Edelsohn
> zouq  writes:

zouq> First I built gcc-4.1-20050320 as a cross-compiler for powerpc.
zouq> When I compile the above program, it fails like this:

zouq> testcom.c:34: internal compiler error: in schedule_insns, at sched-rgn.c:2549

zouq> Can anyone tell me why?
zouq> Why does it cause a compiler error?

As Giovanni mentioned, please report this problem through the GCC
Bugzilla bug report system.  Also, please include all details about how you
invoked the compiler, especially the command-line options used.  The
information in your email is not sufficient to reproduce your problem
without a lot of experimentation.

Thanks, David



Re: GCC 4.0 Status Report (2005-03-24)

2005-03-25 Thread Mark Mitchell
Eric Botcazou wrote:
>> 20263 SPARC64 ASM bug
>> Eric has a patch; I've asked about possible other ways to fix it.
>
> I've answered, but probably not very constructively as your remark was not
> crystal-clear either. :-)  Btw, I think you should "Add CC" yourself when you
> comment on specific PRs in order to speed up the discussion.

OK.  (FWIW, you're not on the CC: list for that PR either.)

> Note that the patch has been approved by Roger for 4.x, so it should already
> have been checked in, had I not run into technical contingencies lately.

Great; I shan't second-guess then.

> However the same problem is present in 3.4.x for the C++ compiler (but is not
> a regression there) so I'd like you to make a decision for that branch too.

I'd prefer not to apply this for 3.4.
Thanks,
--
Mark Mitchell
CodeSourcery, LLC
[EMAIL PROTECTED]
(916) 791-8304


Re: Profile-directed feedback and remote testing

2005-03-25 Thread Timothy J . Wood
On Mar 24, 2005, at 11:59 PM, Mark Mitchell wrote:

> When we generate data for feedback, we insert the .gcda name into the
> object file as an absolute path.  As a result, when we try to do remote
> testing, we lose, as, in general the remote file system does not have
> the same file hierarchy as the build system.

  A similar thing happens for coverage files (.bb, .bbg, .da): the
file goes into the cwd of the compiler when invoked.

> I understand why we generate an absolute path; we want to make sure
> that the data ends up there, not in the directory where the user
> happens to run the program.  So, I intend to disable these tests when
> $host != $target.  Any objections, or better ideas?

  A compiler option to set the target directory for these files (and
the coverage ones!) would be great.  Possibly even better would be an
environment variable.  If the user wants to compare two sets of
coverage information from two sets of tests, they shouldn't have to
rebuild their entire project to do so (though I guess they could do
directory swapping after each run).

-tim


Re: GCC 4.0 Status Report (2005-03-24)

2005-03-25 Thread Eric Botcazou
> OK.  (FWIW, you're not on the CC: list for that PR either.)

No, but I'm the assignee so... :-)

> > Note that the patch has been approved by Roger for 4.x, so it should
> > already have been checked in, had I not run into technical contingencies
> > lately.
>
> Great; I shan't second-guess then.

Sorry, I only wanted to explain why the patch is still pending.  About your 
question: am I right in thinking that the real name is the name as written 
in the assembly file?  If so, that's what is now implemented in 4.x.

> > However the same problem is present in 3.4.x for the C++ compiler (but
> > is not a regression there) so I'd like you to make a decision for that
> > branch too.
>
> I'd prefer not to apply this for 3.4.

Agreed.

-- 
Eric Botcazou




Re: GCC 4.0 Status Report (2005-03-24)

2005-03-25 Thread Mark Mitchell
Eric Botcazou wrote:
>> OK.  (FWIW, you're not on the CC: list for that PR either.)
>
> Sorry, I only wanted to explain why the patch is still pending.  About your
> question: am I right in thinking that the real name is the name as written
> in the assembly file?  If so, that's what is now implemented in 4.x.

Yes, I think that makes sense.
I'm a little concerned about the fact that (in theory) DECL_NAME could 
have spaces, or other assembler-unfriendly characters.  I'm not sure 
what to do in that circumstance; it's probably impossible to do anything 
better than we do now, without the assembler providing some kind of 
special support.  (I'm not actually sure what the assembler does with 
the name; presumably puts it in debug information.)

--
Mark Mitchell
CodeSourcery, LLC
[EMAIL PROTECTED]
(916) 791-8304


Re: Profile-directed feedback and remote testing

2005-03-25 Thread Joe Buck

Mark Mitchell <[EMAIL PROTECTED]> writes:
> | When we generate data for feedback, we insert the .gcda name into the
> | object file as an absolute path.  As a result, when we try to do
> | remote testing, we lose, as, in general the remote file system does
> | not have the same file hierarchy as the build system.

I had just that problem the other day, when a colleague asked me to help
him debug a problem.  Since his executable was set up to generate gcov
data, I couldn't run it as I didn't have permission to write his count
files, and he'd left for the day, so I couldn't get him to change the
permissions.  I wound up rebuilding the whole large executable from
source, but that was an annoying waste of time.

On Fri, Mar 25, 2005 at 10:17:22AM +0100, Gabriel Dos Reis wrote:
> Maybe an option to tell the compiler where to put those .gcda files?

That wouldn't have saved me in the case described above, as the pathnames
are already set in the executable.  A *runtime* way of altering the
locations of the .gcda files would be nice to have.  For example, we could
have something like

GCDA_PATH_PREFIX

which, if set, would be prepended to the pathnames of the .gcda files.
We could even arrange to create needed directories on demand when
creating new .gcda files when this option is set.




Re: BOOT_CFLAGS and -fomit-frame-pointer

2005-03-25 Thread Joe Buck
On Fri, Mar 25, 2005 at 10:45:36PM +1100, Greg Schafer wrote:
> On Fri, Mar 25, 2005 at 12:06:33PM +0100, Eric Botcazou wrote:
> 
> > What is wrong exactly?  Why should 2 different build processes generate the 
> > same executable?  Is there a (written) rule about this?
> 
> No, there is no written rule. However, some folks (like me) are concerned
> with matters of binary reproducibility, and this subtle change in x86 GCC's
> default behaviour seemed a little suspect (to me at least). Like I said, no
> big deal. If no one else thinks it's a problem, don't worry.

It does seem surprising that the flags used are different, and this kind
of thing might potentially trip up a new GCC developer (who might
experience a Heisenbug when he makes a change and just runs "make" for
a quick test, not noticing that the flags have changed).

But I don't think it raises an issue of binary reproducibility, because
the three-stage bootstrap explicitly tests that the binaries come out byte
for byte identical.



Re: Profile-directed feedback and remote testing

2005-03-25 Thread Gabriel Dos Reis
"Timothy J.Wood" <[EMAIL PROTECTED]> writes:

| On Mar 24, 2005, at 11:59 PM, Mark Mitchell wrote:
| 
| > When we generate data for feedback, we insert the .gcda name into
| > the object file as an absolute path.  As a result, when we try to do
| > remote testing, we lose, as, in general the remote file system does
| > not have the same file hierarchy as the build system.
| 
|    A similar thing happens for coverage files (.bb, .bbg, .da) (the
| file goes into the cwd of the compiler when invoked).
| 
| > I understand why we generate an absolute path; we want to make sure
| > that the data ends up there, not in the directory where the user
| > happens to run the program.  So, I intend to disable these tests
| > when $host != $target.  Any objections, or better ideas?
| 
|A compiler option to set the target directory for these files (and
| the coverage ones!) would be great.  Possibly even better would be an
| environment variable.  If the user wants to compare two sets of
| coverage information from two sets of tests, they shouldn't have to
| rebuild their entire project to do so (though I guess they could do
| directory swapping after each run).

Yes, but would not a compiler option suffice?

-- Gaby


Texinfo appears to be FUBAR.

2005-03-25 Thread Steve Kargl
When I try to do "gmake dvi" in the build directory, gfortran.texi
eventually dies with:


Loading texinfo [version 2004-10-31.06]: Basics, pdf, fonts, page headings,
tables, conditionals, indexing, sectioning, toc, environments, defuns, macros,
cross references, insertions, (/usr/local/share/texmf/tex/generic/misc/epsf.tex
) localization, and turning on texinfo input format.) (./gfortran.aux
! Missing @endcsname inserted.
 
   @begingroup 
@code ->@begingroup 
@catcode [EMAIL PROTECTED]@active @let [EMAIL PROTECTED] 
@catcode [EMAIL PROTECTED]@activ...
 @code 
 {ABORT}-title
@xrdef #1#2->@expandafter @gdef @csname XR#1
@endcsname [EMAIL PROTECTED] @iff...
l.71 [EMAIL PROTECTED] {ABORT} --- Abort the program}
  

-- 
Steve


Re: A plan for eliminating cc0

2005-03-25 Thread Paul Schlie
> From: Ian Lance Taylor 
>> Paul Schlie <[EMAIL PROTECTED]> writes:
>> Does GCC truly need to identify/treat condition state registers uniquely
>> from any other value produced as a result of a calculation?
> 
> No, it doesn't.  The change I am proposing removes the unique handling
> of condition state registers, and treats them like other registers.
> The unique handling of condition state registers is historical, and
> arose because of the characteristics of the initial gcc targets (e.g.,
> vax, m68k).
> 
> The idea to do this is not mine; for more background see the
> discussion of cc0 here:
> http://gcc.gnu.org/wiki/general%20backend%20cleanup

Thank you. After reviewing that reference, a question comes to mind:

Is there any convenient way to reference the newly set register by an
instruction, as opposed to otherwise needing to redundantly re-specify
the operation producing its value again?

Thereby enabling something like:

(insn xxx [(set (reg: A) (xxx: (reg: B) (reg: C)))
   (set (reg: CC) (newly-set-reg: A))
  )

(insn branch-equal (set (pc) (if_then_else
   (ge: CC 0)
   (label_ref 23)
   (pc)))
...)

Thereby enabling an xxx instruction to specify the CC register value being
virtually assigned the result of the instruction's operation (i.e. no code
will actually be generated for assignments to the CC register), upon which
an independently specified conditional branch may be defined to be dependent
upon it (the virtual CC register). Which would seem to be a simple way to
closely approximate the semantics of a global cc-state register?




Re: Profile-directed feedback and remote testing

2005-03-25 Thread Timothy J . Wood
On Mar 25, 2005, at 9:47 AM, Gabriel Dos Reis wrote:
"Timothy J.Wood" <[EMAIL PROTECTED]> writes:
|A compiler option to set the target directory for these files (and
| the coverage ones!) would be great.  Possibly even better would be an
| environment variable.  If the user wants to compare two sets of
| coverage information from two sets of tests, they shouldn't have to
| rebuild their entire project to do so (though I guess they could do
| directory swapping after each run).
yes, but would not a compiler option suffice?
  Certainly; as I indicated above, directory swizzling between runs 
would make up for the lack of being able to configure this at runtime.  
I also like Joe Buck's suggestion of creating the intermediate 
directories if missing (under either approach).

-tim


Re: A plan for eliminating cc0

2005-03-25 Thread Ian Lance Taylor
Paul Schlie <[EMAIL PROTECTED]> writes:

> Is there any convenient way to reference the newly set register by an
> instruction, as opposed to otherwise needing to redundantly re-specify
> the operation producing its value?
> 
> Thereby enabling something like:
> 
> (insn xxx [(set (reg: A) (xxx: (reg: B) (reg: C)))
>(set (reg: CC) (newly-set-reg: A))
>   )
> 
> (insn branch-equal (set (pc) (if_then_else
>(ge: CC 0)
>(label_ref 23)
>(pc)))
> ...)
> 
> Thereby enabling an xxx instruction to specify the CC register value being
> virtually assigned the result of the instruction's operation (i.e. no code
> will actually be generated for assignments to the CC register), upon which
> an independently specified conditional branch may be defined to be dependent
> upon it (the virtual CC register). Which would seem to be a simple way to
> closely approximate the semantics of a global cc-state register?

Yes, a backend could be implemented this way.  There are two problems.

1) Many of the optimizers analyze instructions by first calling
   single_set and working with the results of that.  For example,
   combine won't work with any insn for which single_set returns NULL.
   And single_set will normally return NULL for your insn xxx above.

2) Reload requires the ability to insert arbitrary instructions
   between any pair of instructions.  The instructions inserted by
   reload will load registers from memory, store registers to memory,
   and possibly, depending on the port, move values between different
   classes of registers and add constants to base registers.  If
   reload can't do that without affecting the dependencies between
   instructions, then it will break.  And I don't think reload will be
   able to do that between your two instructions above, on a typical
   cc0 machine in which every move affects the condition codes.

Ian


Re: GCC 4.0 Status Report (2005-03-24)

2005-03-25 Thread Eric Botcazou
> I'm a little concerned about the fact that (in theory) DECL_NAME could
> have spaces, or other assembler-unfriendly characters.  I'm not sure
> what to do in that circumstance; it's probably impossible to do anything
> better than we do now, without the assembler providing some kind of
> special support.  (I'm not actually sure what the assembler does with
> the name; presumably puts it in debug information.)

I can speak for the SPARC 64-bit assembler: it creates a special ELF symbol 
for it (STB_GLOBAL, STT_REGISTER, SHN_UNDEF).

-- 
Eric Botcazou




Re: getopt.h getopt() decl broken for many targets

2005-03-25 Thread Ian Lance Taylor
"Aaron W. LaFramboise" <[EMAIL PROTECTED]> writes:

> This is due to this code:
> 
> #if !HAVE_DECL_GETOPT
> #if defined (__GNU_LIBRARY__) || defined (HAVE_DECL_GETOPT)
> /* Many other libraries have conflicting prototypes for getopt, with
>differences in the consts, in unistd.h.  To avoid compilation
>errors, only prototype getopt for the GNU C library.  */
> extern int getopt (int argc, char *const *argv, const char *shortopts);
> #else
> #ifndef __cplusplus
> extern int getopt ();
> #endif /* __cplusplus */
> #endif
> #endif /* !HAVE_DECL_GETOPT */
> 
> Is the situation described in this comment still true?  Would it be
> possible to turn this whitelist into a blacklist?

Yes, the comment is still true enough that I don't think we should
change it.  Instead, copy the gcc configure code which sets
HAVE_DECL_GETOPT to the binutils.  In fact, I think Nick already did
this, although I don't know whether he checked it in yet.
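(For reference, the configure check in question is typically just an AC_CHECK_DECLS test; this is a minimal sketch of that style of check, not the exact code from gcc's configure:)

```m4
# configure.ac fragment: defines HAVE_DECL_GETOPT to 1 or 0 depending
# on whether the default headers (including unistd.h, when present)
# declare getopt.  The generated config.h macro is what the getopt.h
# code above tests with "#if !HAVE_DECL_GETOPT".
AC_CHECK_DECLS([getopt])
```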

Ian


Re: Profile-directed feedback and remote testing

2005-03-25 Thread Gabriel Dos Reis
Joe Buck <[EMAIL PROTECTED]> writes:

| Mark Mitchell <[EMAIL PROTECTED]> writes:
| > | When we generate data for feedback, we insert the .gcda name into the
| > | object file as an absolute path.  As a result, when we try to do
| > | remote testing, we lose, as, in general the remote file system does
| > | not have the same file hierarchy as the build system.
| 
| I had just that problem the other day, when a colleague asked me to help
| him debug a problem.  Since his executable was set up to generate gcov
| data, I couldn't run it as I didn't have permission to write his count
| files, and he'd left for the day, so I couldn't get him to change the
| permissions.  I wound up rebuilding the whole large executable from
| source, but that was an annoying waste of time.
| 
| On Fri, Mar 25, 2005 at 10:17:22AM +0100, Gabriel Dos Reis wrote:
| > Maybe an option to tell the compiler where to put those .gcda files?
| 
| That wouldn't have saved me in the case described above, as the pathnames
| are already set in the executable.  A *runtime* way of altering the
| locations of the .gcda files would be nice to have.  For example, we could
| have something like

I guess I was unclear:  What I meant was a compiler option, meaning it
is specified when the compiler is run to compile something, like

gcc --profile-output-dir=/blah/blah

| 
| GCDA_PATH_PREFIX
| 
| which, if set, would be prepended to the pathnames of the .gcda files.
| We could even arrange to create needed directories on demand when
| creating new .gcda files when this option is set.

-- Gaby


Re: Profile-directed feedback and remote testing

2005-03-25 Thread Joe Buck
On Fri, Mar 25, 2005 at 08:03:55PM +0100, Gabriel Dos Reis wrote:
> Joe Buck <[EMAIL PROTECTED]> writes:
> | That wouldn't have saved me in the case described above, as the pathnames
> | are already set in the executable.  A *runtime* way of altering the
> | locations of the .gcda files would be nice to have.  For example, we could
> | have something like
> 
> I guess I was unclear:  What I meant was a compiler option, meaning it
> is specified when the compiler is run to compile something, like
> 
> gcc --profile-output-dir=/blah/blah

No, you were clear.  As I said, the problem with your suggested approach
is that, once the executable is compiled, the paths for the profile data
are still wired in.  All you've done is change the paths.

What I'd like is the ability to alter the paths at runtime, without
recompiling.  This would have all kinds of uses, like maintaining
separate counts files for two classes of runs, allowing people to
generate profiling data for the same executable without stepping on
each others' work, etc.




Re: Profile-directed feedback and remote testing

2005-03-25 Thread Gabriel Dos Reis
Joe Buck <[EMAIL PROTECTED]> writes:

| On Fri, Mar 25, 2005 at 08:03:55PM +0100, Gabriel Dos Reis wrote:
| > Joe Buck <[EMAIL PROTECTED]> writes:
| > | That wouldn't have saved me in the case described above, as the pathnames
| > | are already set in the executable.  A *runtime* way of altering the
| > | locations of the .gcda files would be nice to have.  For example, we could
| > | have something like
| > 
| > I guess I was unclear:  What I meant was a compiler option, meaning it
| > is specified when the compiler is run to compile something, like
| > 
| > gcc --profile-output-dir=/blah/blah
| 
| No, you were clear.  As I said, the problem with your suggested approach
| is that, once the executable is compiled, the paths for the profile data
| are still wired in.  All you've done is change the paths.

OK, I think I get what you're saying.  Thanks!

| What I'd like is the ability to alter the paths at runtime, without
| recompiling.  This would have all kinds of uses, like maintaining
| separate counts files for two classes of runs, allowing people to
| generate profiling data for the same executable without stepping on
| each others' work, etc.

-- Gaby


ISO C prototype style for libiberty?

2005-03-25 Thread Gabriel Dos Reis

Hi,

  Would there be any objection to patches that convert function
definitions in libiberty to use ISO C prototype style, instead of 
K&R style?

(rationale: they are getting in the way when compiling GCC with a C++
compiler, for example).
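Concretely, the conversion replaces old-style (K&R) definitions with ISO prototype-style ones, e.g. (copy_string is an invented example, not an actual libiberty function):

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Old-style K&R definition, as still found in parts of libiberty:

     char *
     copy_string (s)
          const char *s;
     {
       ...
     }

   The same function in ISO C prototype style, which a C++ compiler
   also accepts.  (copy_string is an illustrative name only.)  */

char *
copy_string (const char *s)
{
  char *result = (char *) malloc (strlen (s) + 1);
  if (result != NULL)
    strcpy (result, s);
  return result;
}
```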

-- Gaby


Re: A plan for eliminating cc0

2005-03-25 Thread Paul Schlie
> From: Ian Lance Taylor 
>> Paul Schlie <[EMAIL PROTECTED]> writes:
>> Thereby enabling something like:
>> 
>> (insn xxx [(set (reg: A) (xxx: (reg: B) (reg: C)))
>>(set (reg: CC) (newly-set-reg: A))
>>   )
>> 
>> (insn branch-equal (set (pc) (if_then_else
>>(ge: CC 0)
>>(label_ref 23)
>>(pc)))
>> ...)
>> 
> Yes, a backend could be implemented this way.  There are two problems.
> 
> 1) Many of the optimizers analyze instructions by first calling
>single_set and working with the results of that.  For example,
>combine won't work with any insn for which single_set returns NULL.
>And single_set will normally return NULL for your insn xxx above.

- As leading-edge processor architectures seem to be slowly increasing
  intra-instruction-level parallelism (i.e. backing slowly away from a
  pure one-operation/one-instruction RISC ISA, toward enabling a
  single instruction to potentially operate on a small set of operands
  which may yield multiple, non-interdependent results simultaneously,
  i.e. results dependent only on the operands): are there any
  plans to eliminate this "single-set" restricting presumption, thereby
  enabling the potential optimization of multi-operand/operation/result
  instruction sequences?

- Regardless, although the above represents a primitive example
  of intra-instruction-level multi-operation/result parallelism, I wonder
  if it's a restricted enough case that optimization may simply be enabled
  for all instructions which have a "single-live-set"?

  In other words, although I understand parallel instruction optimization
  may be beyond the capabilities of many of the present optimizers, it
  seems "safe" to enable optimization of the "live" path, which would be
  the case when only a "single-set" has live dependencies and the remaining
  "set(s)" are "dead" (i.e. have no dependents), and therefore irrelevant?

  Which would seem most likely to be the case when a second parallel
  "set" is used to specify an update to a global condition-state, as most
  updates won't likely have conditional branch dependents? (Therefore it is
  safe to optimize the data path; or, in cases when the data path is dead,
  the condition-state path may be optimized irrespective of the
  data path.  Which would be analogous to turning a subtract into a compare
  instruction when the result of the subtract isn't used other than as
  required to set the global condition-state.)
  
> 2) Reload requires the ability to insert arbitrary instructions
>between any pair of instructions.  The instructions inserted by
>reload will load registers from memory, store registers to memory,
>and possibly, depending on the port, move values between different
>classes of registers and add constants to base registers.  If
>reload can't do that without affecting the dependencies between
>instructions, then it will break.  And I don't think reload will be
>able to do that between your two instructions above, on a typical
>cc0 machine in which every move affects the condition codes.

- Understood; however, along a similar line of thought, it would seem "safe"
  to simply "save/restore" the global condition-state around all potentially
  destructive memory operations.

  Which at first glance may seem excessive; however, the most
  frequently required simple load/store operations on most targets won't
  modify global condition-state, and in the circumstances when more complex
  operations which may modify it are required, it need only be saved/restored
  if it has dependents.  Which, as observed above, won't typically be
  the case.

  So overall it appears that there's likely very little effective overhead
  in making all memory transactions effectively transparent to the global
  condition-state when it matters, as the global condition-state will only
  need to be saved/restored in the likely few circumstances when a complex
  memory transaction may modify it and it has dependents?

  (does that seem rational? or may I be missing something fundamental?)

thanks again, -paul-





gcc-3.4-20050325 is now available

2005-03-25 Thread gccadmin
Snapshot gcc-3.4-20050325 is now available on
  ftp://gcc.gnu.org/pub/gcc/snapshots/3.4-20050325/
and on various mirrors, see http://gcc.gnu.org/mirrors.html for details.

This snapshot has been generated from the GCC 3.4 CVS branch
with the following options: -rgcc-ss-3_4-20050325 

You'll find:

gcc-3.4-20050325.tar.bz2  Complete GCC (includes all of below)

gcc-core-3.4-20050325.tar.bz2 C front end and core compiler

gcc-ada-3.4-20050325.tar.bz2  Ada front end and runtime

gcc-g++-3.4-20050325.tar.bz2  C++ front end and runtime

gcc-g77-3.4-20050325.tar.bz2  Fortran 77 front end and runtime

gcc-java-3.4-20050325.tar.bz2 Java front end and runtime

gcc-objc-3.4-20050325.tar.bz2 Objective-C front end and runtime

gcc-testsuite-3.4-20050325.tar.bz2The GCC testsuite

Diffs from 3.4-20050318 are available in the diffs/ subdirectory.

When a particular snapshot is ready for public consumption the LATEST-3.4
link is updated and a message is sent to the gcc list.  Please do not use
a snapshot before it has been announced that way.


[rtl-optimization] Improve Data Prefetch for IA-64

2005-03-25 Thread Canqun Yang
Hi, all

Currently, GCC just ignores all data prefetches within 
a loop when the number of prefetches exceeds 
SIMULTANEOUS_PREFETCHES.  This isn't advisable. 

Also, the macros defined in ia64.h for data prefetching 
have values that are too small.

This patch modifies the data prefetch algorithm 
defined in loop.c and redefines some macros in ia64.h 
accordingly.  Testing shows a 2.5 percent performance 
improvement for SPEC CFP2000 benchmarks on 
IA-64.  If the new loop unroller were implemented as 
completely as the old one (which was removed), much 
more performance improvement would be gained.

Canqun Yang
Creative Compiler Research Group.
National University of Defense Technology, China.
2005-03-25  Canqun Yang  <[EMAIL PROTECTED]>

* ia64.c (SIMULTANEOUS_PREFETCHES): Redefine as 18.
(PREFETCH_BLOCK): Redefine as 64.
(PREFETCH_BLOCKS_BEFORE_LOOP_MAX): New definition.

2005-03-25  Canqun Yang  <[EMAIL PROTECTED]>

* loop.c (PREFETCH_BLOCKS_BEFORE_LOOP_MAX): Defined conditionally.
(scan_loop): Change extra_size from 16 to 128.
(emit_prefetch_instructions): Don't ignore all prefetches within loop.

Index: loop.c
===
RCS file: /cvs/gcc/gcc/gcc/loop.c,v
retrieving revision 1.522
diff -c -3 -p -r1.522 loop.c
*** loop.c  17 Jan 2005 08:46:15 -  1.522
--- loop.c  25 Mar 2005 12:03:44 -
*** struct loop_info
*** 434,440 
--- 434,442 
  #define MAX_PREFETCHES 100
  /* The number of prefetch blocks that are beneficial to fetch at once before
 a loop with a known (and low) iteration count.  */
+ #ifndef PREFETCH_BLOCKS_BEFORE_LOOP_MAX
  #define PREFETCH_BLOCKS_BEFORE_LOOP_MAX  6
+ #endif
  /* For very tiny loops it is not worthwhile to prefetch even before the loop,
 since it is likely that the data are already in the cache.  */
  #define PREFETCH_BLOCKS_BEFORE_LOOP_MIN  2
*** scan_loop (struct loop *loop, int flags)
*** 1100,1106 
/* Allocate extra space for REGs that might be created by load_mems.
   We allocate a little extra slop as well, in the hopes that we
   won't have to reallocate the regs array.  */
!   loop_regs_scan (loop, loop_info->mems_idx + 16);
insn_count = count_insns_in_loop (loop);
  
if (loop_dump_stream)
--- 1102,1108 
/* Allocate extra space for REGs that might be created by load_mems.
   We allocate a little extra slop as well, in the hopes that we
   won't have to reallocate the regs array.  */
!   loop_regs_scan (loop, loop_info->mems_idx + 128);
insn_count = count_insns_in_loop (loop);
  
if (loop_dump_stream)
*** emit_prefetch_instructions (struct loop 
*** 4398,4406 
{
  if (loop_dump_stream)
fprintf (loop_dump_stream,
!"Prefetch: ignoring prefetches within loop: ahead is zero; %d < %d\n",
 SIMULTANEOUS_PREFETCHES, num_real_prefetches);
! num_real_prefetches = 0, num_real_write_prefetches = 0;
}
  }
/* We'll also use AHEAD to determine how many prefetch instructions to
--- 4400,4411 
{
  if (loop_dump_stream)
fprintf (loop_dump_stream,
!"Prefetch: ignoring some prefetches within loop: ahead is zero; %d < %d\n",
 SIMULTANEOUS_PREFETCHES, num_real_prefetches);
! num_real_prefetches = MIN (num_real_prefetches,
!SIMULTANEOUS_PREFETCHES);
! num_real_write_prefetches = MIN (num_real_write_prefetches,
!  SIMULTANEOUS_PREFETCHES);
}
  }
/* We'll also use AHEAD to determine how many prefetch instructions to
Index: config/ia64/ia64.h
===
RCS file: /cvs/gcc/gcc/gcc/config/ia64/ia64.h,v
retrieving revision 1.194
diff -c -3 -p -r1.194 ia64.h
*** config/ia64/ia64.h  17 Mar 2005 17:35:16 -  1.194
--- config/ia64/ia64.h  25 Mar 2005 12:05:05 -
*** do {
\
*** 1993,2004 
 ??? This number is bogus and needs to be replaced before the value is
 actually used in optimizations.  */
  
! #define SIMULTANEOUS_PREFETCHES 6
  
  /* If this architecture supports prefetch, define this to be the size of
 the cache line that is prefetched.  */
  
! #define PREFETCH_BLOCK 32
  
  #define HANDLE_SYSV_PRAGMA 1
  
--- 1993,2008 
 ??? This number is bogus and needs to be replaced before the value is
 actually used in optimizations.  */
  
! #define SIMULTANEOUS_PREFETCHES 18
  
  /* If this architecture supports prefetch, define this to be the size of
 the cache line that is prefetched.  */
  
! #define PREFETCH_BLOCK 64 
! 
! /* The number of prefetch blocks that are beneficial to fetch at once before
!a loop.  */
! #define PREFETCH_BLOCKS_BEFORE

Re: [rtl-optimization] Improve Data Prefetch for IA-64

2005-03-25 Thread Steven Bosscher
On Saturday 26 March 2005 02:22, Canqun Yang wrote:
> * loop.c (PREFETCH_BLOCKS_BEFORE_LOOP_MAX): Defined 
> conditionally.
> (scan_loop): Change extra_size from 16 to 128.
> (emit_prefetch_instructions): Don't ignore all prefetches 
> within
> loop.

OK, so I know this is not a popular subject, but can we *please* stop
working on loop.c and focus on getting the new RTL and tree loop passes
to do what we want?  All this loop.c patching is a typical example of
why free software development does not always work: always going for
the low-hanging fruit.  In this case, there have been several attempts
to replace the prefetching stuff in loop.c with something better.  On
the rtl-opt branch there is a new RTL loop-prefetch.c, and on the LNO
branch there is a re-use analysis based prefetching pass.  Why don't
you try to finish and improve those passes, instead of making it yet
again harder to remove loop.c.  This one file is a *huge* problem for
just about the entire RTL optimizer path.  It is, for example, the
reason why there is no profile information available before this old
piece of, if I may say, junk runs, and it is the only reason why a great
many functions in for example jump.c and the various cfg*.c files can
still not be removed.

Gr.
Steven



Re: A plan for eliminating cc0

2005-03-25 Thread Ian Lance Taylor
Paul Schlie <[EMAIL PROTECTED]> writes:

> > 1) Many of the optimizers analyze instructions by first calling
> >single_set and working with the results of that.  For example,
> >combine won't work with any insn for which single_set returns NULL.
> >And single_set will normally return NULL for your insn xxx above.
> 
> - As leading-edge processor architectures seem to be slowly increasing
>   intra-instruction-level parallelism (i.e. backing slowly away from a
>   pure one-operation/one-instruction RISC ISA, toward enabling a
>   single instruction to potentially operate on a small set of operands
>   which may yield multiple, non-interdependent results simultaneously,
>   i.e. results dependent only on the operands): are there any
>   plans to eliminate this "single-set" restricting presumption, thereby
>   enabling the potential optimization of multi-operand/operation/result
>   instruction sequences?

I'm not aware of any particular plans to eliminate the single_set
presumption.  I think that there is a general awareness that it is an
issue, but I don't know of any general agreement on how serious it is
or how difficult it would be to change.

I'm also not aware of processors changing as you describe, except for
the particular special case of SIMD vector instructions.  gcc can and
does represent vector instructions as a single set.

> - Regardless, although the above represents a primitive example
>   of intra-instruction-level multi-operation/result parallelism, I wonder
>   if it's a restricted enough case that optimization may simply be enabled
>   for all instructions which have a "single-live-set"?
> 
>   In other words, although I understand parallel instruction optimization
>   may be beyond the capabilities of many of the present optimizers, it
>   seems "safe" to enable optimization of the "live" path, which would be
>   the case when only a "single-set" has live dependencies and the remaining
>   "set(s)" are "dead" (i.e. have no dependents), and therefore irrelevant?

Yes, that is how it works today.  A parallel set in which all but one
of the results is dead counts as a single set.  See the code in
rtlanal.c.
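The rule can be modelled with a toy version of that check. This sketch only mirrors the "all but one result dead" logic; the real single_set in rtlanal.c walks actual rtx structures and REG_UNUSED notes:

```c
#include <assert.h>
#include <stddef.h>

/* Toy model of the single_set rule: an insn is a list of SETs, each
   flagged dead when its result is unused.  (Illustrative only; not
   the real GCC data structures.)  */
struct toy_set
{
  int dest_reg;   /* register set by this SET */
  int dead;       /* nonzero when the result is unused */
};

/* Return the single live set, or NULL when zero or several sets are
   live -- the cases where optimizers such as combine give up.  */
static const struct toy_set *
toy_single_set (const struct toy_set *sets, int n)
{
  const struct toy_set *found = NULL;
  int i;

  for (i = 0; i < n; i++)
    if (!sets[i].dead)
      {
        if (found != NULL)
          return NULL;    /* two live results: not a single set */
        found = &sets[i];
      }
  return found;
}
```

Applied to the parallel insn above: once both the data result and the CC result are live, the check returns NULL, which is exactly problem (1).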

>   Which would seem most likely to be the case when a second parallel
>   "set" is used to specify an update to a global condition-state, as most
>   updates won't likely have conditional branch dependents? (Therefore it is
>   safe to optimize the data path; or, in cases when the data path is dead,
>   the condition-state path may be optimized irrespective of the
>   data path.  Which would be analogous to turning a subtract into a compare
>   instruction when the result of the subtract isn't used other than as
>   required to set the global condition-state.)

Yes, but the point of representing changes to the condition flags is
precisely to permit optimizations when the condition flags are used,
and that is precisely when the single_set assumption will fail.  You
are correct that in general adding descriptions of the condition code
changes to the RTL won't inhibit optimizations that don't meaningfully
set the condition code flags.  But it will inhibit optimizations which
do set the condition code flags, and that more or less obviates the
whole point of representing the condition code setting in the first
place.
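For reference, ports that have already moved off cc0 typically make the flags side effect explicit in the insn pattern itself, along these lines (a generic sketch modelled on common practice, not taken from any particular port; CC_REGNUM and the assembler template are invented):

```lisp
;; An add that also sets the flags register, written as a parallel of
;; two sets (the pattern vector of a define_insn is an implicit
;; parallel).  Register names and output template are illustrative.
(define_insn "*addsi3_flags"
  [(set (match_operand:SI 0 "register_operand" "=r")
        (plus:SI (match_operand:SI 1 "register_operand" "r")
                 (match_operand:SI 2 "register_operand" "r")))
   (set (reg:CC CC_REGNUM)
        (compare:CC (plus:SI (match_dup 1) (match_dup 2))
                    (const_int 0)))]
  ""
  "add %0,%1,%2")
```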

> > 2) Reload requires the ability to insert arbitrary instructions
> >between any pair of instructions.  The instructions inserted by
> >reload will load registers from memory, store registers to memory,
> >and possibly, depending on the port, move values between different
> >classes of registers and add constants to base registers.  If
> >reload can't do that without affecting the dependencies between
> >instructions, then it will break.  And I don't think reload will be
> >able to do that between your two instructions above, on a typical
> >cc0 machine in which every move affects the condition codes.
> 
> - Understood; however, along a similar line of thought, it would seem "safe"
>   to simply "save/restore" the global condition-state around all potentially
>   destructive memory operations.

Safe but very costly.  It assumes that every processor has a cheap way
to save and restore the condition codes in user mode, which is not
necessarily the case.  And it assumes that the save and restore can be
removed when not required, which is not obvious to me.

>   Which at first glance may seem excessive; however, the most
>   frequently required simple load/store operations on most targets won't
>   modify global condition-state, and in the circumstances when more complex
>   operations which may modify it are required, it need only be saved/restored
>   if it has dependents.  Which, as observed above, won't typically be
>   the case.

On machines which currently use cc0, which are the only machines under
discussion here, simple load/store operations do modify global
condition state.

Ian


Re: ISO C prototype style for libiberty?

2005-03-25 Thread DJ Delorie

>   Would there be any objection to patches that convert function
> definitions in libiberty to use ISO C prototype style, instead of 
> K&R style?

I would be in support of such a patch iff it converts all the
functions, not just the ones gcc happens to use.


Re: ISO C prototype style for libiberty?

2005-03-25 Thread Gabriel Dos Reis
DJ Delorie <[EMAIL PROTECTED]> writes:

| >   Would there be any objection to patches that convert function
| > definitions in libiberty to use ISO C prototype style, instead of 
| > K&R style?
| 
| I would be in support of such a patch iff it converts all the
| functions, not just the ones gcc happens to use.

Just to make sure I understand.  I was thinking of whatever was under
$GCC/libiberty (and included).  Are you thinking of something more?
A single patch would be huge; I propose to break it into a series of
patches.  Is that OK with you?

-- Gaby


Re: ISO C prototype style for libiberty?

2005-03-25 Thread DJ Delorie

> Just to make sure I understand.  I was thinking of whatever was
> under $GCC/libiberty (and included).  Are you thinking of something
> more?

No.

> A single patch would be huge; I propose to break it into a series
> of patches.  Is that OK with you?

I only want to avoid a situation where libiberty is left half
converted (except short term, of course).  The mechanics of the
process are irrelevant to me.


Re: ISO C prototype style for libiberty?

2005-03-25 Thread Gabriel Dos Reis
DJ Delorie <[EMAIL PROTECTED]> writes:

| > Just to make sure I understand.  I was thinking of whatever was
| > under $GCC/libiberty (and included).  Are you thinking of something
| > more?
| 
| No.

OK.

| > A single patch would be huge; I propose to break it into a series
| > of patches.  Is that OK with you?
| 
| I only want to avoid a situation where libiberty is left half
| converted (except short term, of course).  The mechanics of the
| process are irrelevant to me.

OK.  I'll convert all of libiberty.  Thanks.

-- Gaby


Re: ISO C prototype style for libiberty?

2005-03-25 Thread Zack Weinberg
DJ Delorie <[EMAIL PROTECTED]> writes:

> I only want to avoid a situation where libiberty is left half
> converted (except short term, of course).  The mechanics of the
> process are irrelevant to me.

I take it that all libiberty-using projects have taken the plunge,
then?  You vetoed this conversion awhile back because libiberty had
to be done last.

What's your opinion on dropping C89 library routines from libiberty?

zw


Re: ISO C prototype style for libiberty?

2005-03-25 Thread DJ Delorie

> I take it that all libiberty-using projects have taken the plunge,
> then?  You vetoed this conversion awhile back because libiberty had
> to be done last.

At this point, I think libiberty *is* the last.

> What's your opinion on dropping C89 library routines from libiberty?

What would that buy us?  I mean, aside from the obvious "less to
maintain" reason?


Re: ISO C prototype style for libiberty?

2005-03-25 Thread Zack Weinberg
DJ Delorie <[EMAIL PROTECTED]> writes:

>> I take it that all libiberty-using projects have taken the plunge,
>> then?  You vetoed this conversion awhile back because libiberty had
>> to be done last.
>
> At this point, I think libiberty *is* the last.

I'm glad to hear it.  It'll be nice to be completely done with this
conversion.

>> What's your opinion on dropping C89 library routines from libiberty?
>
> What would that buy us?  I mean, aside from the obvious "less to
> maintain" reason?

Less to maintain is all I was hoping for.  I think the configure
scripts (both libiberty's and gcc's) could be simplified quite a bit
if we assumed a C89 compliant runtime library, as could libiberty.h
and system.h.

zw


Re: ISO C prototype style for libiberty?

2005-03-25 Thread Joe Buck
On Fri, Mar 25, 2005 at 10:10:17PM -0800, Zack Weinberg wrote:
> DJ Delorie <[EMAIL PROTECTED]> writes:
[ dropping C89 functions ]
> > What would that buy us?  I mean, aside from the obvious "less to
> > maintain" reason?
> 
> Less to maintain is all I was hoping for.  I think the configure
> scripts (both libiberty's and gcc's) could be simplified quite a bit
> if we assumed a C89 compliant runtime library, as could libiberty.h
> and system.h.

Any retro people out there still trying to run SunOS 4.x?  Other than that
I can't think of any non-C89 systems people might be using.




Re: ISO C prototype style for libiberty?

2005-03-25 Thread DJ Delorie

> Less to maintain is all I was hoping for.  I think the configure
> scripts (both libiberty's and gcc's) could be simplified quite a bit
> if we assumed a C89 compliant runtime library, as could libiberty.h
> and system.h.

Well, gcc can make assumptions libiberty can't, and as far as
libiberty's configure goes, "if it ain't broke, don't fix it" seems to
be the best course.

I admit that cleaning up the includes intrigues me, but I also
hesitate to change that without completely understanding the OSs we
still need to support.

A target environment, for example, may use libiberty to *provide* C89
support functions for its runtime library.


new cctools for powerpc-darwin7 required for HEAD

2005-03-25 Thread Geoff Keating
To be used conveniently on Panther, the recent stfiwx change in HEAD  
requires a later version of cctools than the 528.5 version that's  
currently on gcc.gnu.org.  So, I've put cctools-576 on gcc.gnu.org.   
You can install it by clicking on the link below, or by running these  
commands:

ftp ftp://gcc.gnu.org/pub/gcc/infrastructure/cctools-576.dmg
hdiutil attach cctools-576.dmg
sudo installer -verbose -pkg /Volumes/cctools-576/cctools-576.pkg  
-target /

This version also handles 8-bit characters in identifiers properly, so  
all those ucnid* testcases should pass.

It's not necessary to upgrade cctools to use 4.0, since the features  
that need the cctools fixes aren't there.

Source for the new cctools is at  
.  You  
can also get it from  
 (they're the same tarfile, just different compression).

The checksums are:
0ebb1a56b1af0e21d9de30b644c1c059  cctools-576.dmg
3b9a5dd3db6b4a7e9c8de02198faea25  cctools-576.tar.bz2

