Re: gcc don't allow commas between clauses for openmp

2007-12-17 Thread Lijuan Hai
gcc-4.3-20070912 doesn't allow commas between clauses. Details given
following. I have just scanned c-parser.c and found we could change
c_parser_omp_clause_name () to enable it.  But I want to know more
before making any changes on it myself.  "openmp implementation in
gcc" in GCC SUMMIT 2006 seems not covering the details,  e.g. what
kind of features gcc doesn't presently support for openmp.  Thanks,

> --lijuan
>
> micro# /import/dr3/s10/gcc-4.3/bin/gcc a.c -fopenmp
> a.c: In function 'main':
> a.c:11: error: expected '#pragma omp' clause before ',' token
> micro# cat a.c
> #include <stdio.h>
> #include <omp.h>
>
> int main(void)
> {
>   int i = 1, j = 2;
>
>   omp_set_dynamic(0);
>   omp_set_num_threads(4);
>
>   #pragma omp parallel shared(i), private(j)
>   {
>     j = omp_get_thread_num();
>     printf("t#: %i  i: %i  j: %i\n", omp_get_thread_num(), i, j);
>   }
>
>   return 0;
> }
> micro# /import/dr3/s10/gcc-4.3/bin/gcc -v
> Using built-in specs.
> Target: sparc-sun-solaris2.10
> Configured with: /import/dr2/starlex/orig/trunk/configure 
> --prefix=/import/dr3/s10/gcc-4.3/ --enable-languages=c,c++,fortran 
> --disable-gnattools --with-mpfr=/ws/gccfss/tools --with-gmp=/ws/gccfss/tools
> Thread model: posix
> gcc version 4.3.0 20070912 (experimental) (GCC)
>
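
For compilers that reject the comma, one possible stopgap is to separate
the clauses with white space only.  The sketch below is not part of the
original report: the helper current_thread(), the function run_region()
and the reduction clause are assumptions added so that the region can be
checked, and the _OPENMP guard only lets the file build without -fopenmp.

```c
#include <stdio.h>
#ifdef _OPENMP
#include <omp.h>
#endif

/* Hypothetical helper (not in the original a.c): the calling thread's
   number, or 0 when the file is built without -fopenmp.  */
static int current_thread(void)
{
#ifdef _OPENMP
  return omp_get_thread_num();
#else
  return 0;
#endif
}

/* Same region as in a.c, but with the clauses separated by white space
   only.  Returns how many threads executed the region.  */
int run_region(void)
{
  int i = 1, j = 2, count = 0;

  #pragma omp parallel shared(i) private(j) reduction(+:count)
  {
    j = current_thread();
    printf("t#: %i  i: %i  j: %i\n", current_thread(), i, j);
    count += 1;
  }

  return count;
}
```

Calling run_region() from main() prints one line per thread; without
-fopenmp the pragma is ignored and the block runs once.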


Re: gcc don't allow commas between clauses for openmp

2007-12-17 Thread Diego Novillo

On 12/17/07 02:27, Lijuan Hai wrote:
> gcc-4.3-20070912 doesn't allow commas between clauses. Details given
> following. I have just scanned c-parser.c and found we could change
> c_parser_omp_clause_name () to enable it.


Thanks for the report.  Jakub submitted a patch to fix this problem 
which I recently approved.  It should be available in 4.3 and the 4.2 
branch (if backported).



> before making any changes on it myself.  "openmp implementation in gcc"
> in GCC SUMMIT 2006 seems not covering the details, e.g. what kind of
> features gcc doesn't presently support for openmp.  Thanks,


GCC should support the whole OpenMP 2.5 standard.  Support for 3.0 is 
being implemented by Jakub.  Anything not supported is considered a bug 
and we'd ask you to submit it to bugzilla.  Thanks.



Diego.


Re: Designs for better debug info in GCC

2007-12-17 Thread Alexandre Oliva
On Dec 16, 2007, "Daniel Berlin" <[EMAIL PROTECTED]> wrote:

>> It is obvious that you misunderstood what I want, and how intrusive
>> the approach is.

> Yes Alexandre, everyone who disagrees with you must not understand!

My conclusion is not based on disagreement, but rather on the faulty
arguments presented during the discussion.

For example, when you took the argument that every transformation had
effects on debug information, and used that to conclude that every
transformation would need difficult changes to generate correct debug
information, you left out from your reasoning a major strength of the
design, that I had mentioned in the e-mail you responded to: that the
optimizers already perform the transformations we need to keep debug
information accurate.

So, by missing or misunderstanding an essential part of the thought
process that went into the design, you came to a false conclusion
about it.

> That's really the problem here.
> None of us understand but you.

I guess I'm to blame, for having naïvely put the code out without as
much as a design and goals document, such that people started looking
at it without actually understanding what it was about, and at the
same time taking conclusions about it based on hunches rather than on
solid logical grounds.

At this point, we have a scenario in which people have already jumped
to their conclusions, and whatever I say requires a much higher
threshold to be listened to and accepted.  It's quite unfortunate that
psychological factors take such a large role in the making of
technical decisions, and I naïvely assumed this wouldn't raise so much
rejection, for being such a simple and well thought-out design.  Oh,
well...  Something to avoid next time...

-- 
Alexandre Oliva http://www.lsd.ic.unicamp.br/~oliva/
FSF Latin America Board Member http://www.fsfla.org/
Red Hat Compiler Engineer   [EMAIL PROTECTED], gcc.gnu.org}
Free Software Evangelist  [EMAIL PROTECTED], gnu.org}


Re: Designs for better debug info in GCC

2007-12-17 Thread Diego Novillo

On 12/17/07 12:51, Alexandre Oliva wrote:


> I guess I'm to blame, for having naïvely put the code out without as
> much as a design and goals document


Yes, you are.

You need to provide such a document now.  I can't see how you'll be able 
to incorporate your implementation without a convincing design.


The barrier is probably going to be higher.  You raised too much 
controversy, so I have my doubts about your simplicity claims.



Diego.


Re: Designs for better debug info in GCC

2007-12-17 Thread Alexandre Oliva
On Dec 16, 2007, Joe Buck <[EMAIL PROTECTED]> wrote:

> However, since preserving accurate debug information
> has a cost, I think it would be better to turn -O1, not -O2, into the
> mode that Alexandre wants, where debug information is preserved.

In terms of memory, that's true, it does have a cost, for we have to
keep more information around.  That's one of the reasons why I'm
implementing this all under the control of a command-line option: you
can selectively enable or disable it, regardless of the level of
optimization.  If we want to make it default for -O1, but not for -O2,
sure, that works.

But this won't make much of a difference in terms of code change.
Except for the fact that we could simply leave alone the passes that
are only executed at -O2 or higher (which is not worth it, given that
I've already done the small work needed for them to keep debug info
accurate), most of the passes will still keep the information
accurate, nearly all of them without any code changes whatsoever.

So, doing this only for -O1 seems like a waste, given that -O2 is the
most common optimization level, and it's most often accompanied by -g.

> Trying to rework all optimizations to keep perfect debug information
> is going to take forever and make the compiler worse.

This statement is easy to make and to believe, but my approach is
proving it false, given a design that took this concern into account.

-- 
Alexandre Oliva http://www.lsd.ic.unicamp.br/~oliva/
FSF Latin America Board Member http://www.fsfla.org/
Red Hat Compiler Engineer   [EMAIL PROTECTED], gcc.gnu.org}
Free Software Evangelist  [EMAIL PROTECTED], gnu.org}


Re: Help with another constraint

2007-12-17 Thread Rask Ingemann Lambertsen
On Wed, Dec 12, 2007 at 03:35:09PM +0100, 'Rask Ingemann Lambertsen' wrote:
> 
>The movxx patterns are special and you'll need to hold the compiler's
> hands a little. Since your target can't move immediates directly to memory,
> you have to ask for a secondary reload to an intermediate register. Use the
> target hook TARGET_SECONDARY_RELOAD.

   Actually, how do you do that? I can't see any place in the documentation
that says how TARGET_SECONDARY_RELOAD can be used for that purpose.

-- 
Rask Ingemann Lambertsen
Danish law requires addresses in e-mail to be logged and stored for a year


RE: Help with another constraint

2007-12-17 Thread Balaji V. Iyer
Hi Rask,
First, thank you very much for all the help you have provided me. It
really helped me finish my project.

This is what I did:

I capture all the moves regardless of the operand, and then, to move an
immediate into a register, I force it into a register.

Here is the code for this:


  if (!no_new_pseudos)
    {
      /* Taking care of moving constant integers.  */
      if (GET_CODE (operands[1]) == CONST_INT)
        {
          rtx reg = gen_reg_rtx (SImode);

          emit_insn (gen_movsi (reg, operands[1]));
          operands[1] = gen_lowpart (QImode, reg);
        }
      /* Moving memory operands.  */
      if (GET_CODE (operands[1]) == MEM)
        {
          rtx reg = gen_reg_rtx (SImode);

          emit_insn (gen_rtx_SET (SImode, reg,
                                  gen_rtx_ZERO_EXTEND (SImode,
                                                       operands[1])));

          operands[1] = gen_lowpart (QImode, reg);
        }
      /* Moving register operands.  */
      if (GET_CODE (operands[0]) != REG)
        operands[1] = force_reg (QImode, operands[1]);
    }

I hope this helps.

-Balaji V. Iyer.

-- 
 
Balaji V. Iyer
PhD Student, 
Center for Efficient, Scalable and Reliable Computing,
Department of Electrical and Computer Engineering,
North Carolina State University.


-Original Message-
From: Rask Ingemann Lambertsen [mailto:[EMAIL PROTECTED] 
Sent: Monday, December 17, 2007 1:33 PM
To: Balaji V. Iyer
Cc: gcc@gcc.gnu.org; [EMAIL PROTECTED]
Subject: Re: Help with another constraint

On Wed, Dec 12, 2007 at 03:35:09PM +0100, 'Rask Ingemann Lambertsen'
wrote:
> 
>The movxx patterns are special and you'll need to hold the 
> compiler's hands a little. Since your target can't move immediates 
> directly to memory, you have to ask for a secondary reload to an 
> intermediate register. Use the target hook TARGET_SECONDARY_RELOAD.

   Actually, how do you do that? I can't see any place in the
documentation that says how TARGET_SECONDARY_RELOAD can be used for that
purpose.

--
Rask Ingemann Lambertsen
Danish law requires addresses in e-mail to be logged and stored for a
year



Re: Designs for better debug info in GCC

2007-12-17 Thread Alexandre Oliva
On Dec 17, 2007, Diego Novillo <[EMAIL PROTECTED]> wrote:

> On 12/17/07 12:51, Alexandre Oliva wrote:
>> I guess I'm to blame, for having naïvely put the code out without as
>> much as a design and goals document

> Yes, you are.

Wow, thanks.  At least we agree on something! ;-)

> You need to provide such a document now.

Can't I instead provide it when it's ready?

You know, it wasn't me who asked to have the thing developed in the
open.  I didn't push it out just so that people who didn't want to
understand it could beat on it before it was ready to defend itself.
I put it out because there was an offer for contribution.

> I can't see how you'll be able to incorporate your implementation
> without a convincing design.

Agreed, I don't see how this would be doable for any but the most
trivial patches.

> The barrier is probably going to be higher.
> You raised too much controversy, so I have my doubts about your
> simplicity claims.

Oh, nice!  *I* raised too much controversy.  So people first ask me to
put the code out such that they can peek at it and help, then most
refrain from peeking at it because it's not ready and some who do
raise some concerns that are not reflected by the code, and then
everyone doubts I've taken those concerns into account and demand a
design document that will no more than just repeat the information
that's already out there but that people fail to take into account.

And then, this is a technical discussion, so historical controversy
shouldn't play any role in it, if people were rational about it.

Now, can you please explain to me how the efforts of repeating myself
one more time, rather than completing the implementation, are going to
make it any more likely that people who have already made up their
minds based on groundless fears will be convinced?

If you really think it would be worth it, can you point out at what
you feel to be missing in the consolidated documentation I posted
upthread, in response to your request?  I'd be happy to fill in the
blanks, if you're willing to listen.  But I wouldn't be happy to waste
more time.

(This is not to say that the document won't ever be produced; it's to
say that I'm not going to work on it right now.  I have other
deliverables ahead of it.)

-- 
Alexandre Oliva http://www.lsd.ic.unicamp.br/~oliva/
FSF Latin America Board Member http://www.fsfla.org/
Red Hat Compiler Engineer   [EMAIL PROTECTED], gcc.gnu.org}
Free Software Evangelist  [EMAIL PROTECTED], gnu.org}


Re: Designs for better debug info in GCC

2007-12-17 Thread Diego Novillo

On 12/17/07 15:28, Alexandre Oliva wrote:


>> You need to provide such a document now.
>
> Can't I instead provide it when it's ready?


Of course.


Diego.


gcc-4.1-20071217 is now available

2007-12-17 Thread gccadmin
Snapshot gcc-4.1-20071217 is now available on
  ftp://gcc.gnu.org/pub/gcc/snapshots/4.1-20071217/
and on various mirrors, see http://gcc.gnu.org/mirrors.html for details.

This snapshot has been generated from the GCC 4.1 SVN branch
with the following options: svn://gcc.gnu.org/svn/gcc/branches/gcc-4_1-branch 
revision 131021

You'll find:

gcc-4.1-20071217.tar.bz2  Complete GCC (includes all of below)

gcc-core-4.1-20071217.tar.bz2 C front end and core compiler

gcc-ada-4.1-20071217.tar.bz2  Ada front end and runtime

gcc-fortran-4.1-20071217.tar.bz2  Fortran front end and runtime

gcc-g++-4.1-20071217.tar.bz2  C++ front end and runtime

gcc-java-4.1-20071217.tar.bz2 Java front end and runtime

gcc-objc-4.1-20071217.tar.bz2 Objective-C front end and runtime

gcc-testsuite-4.1-20071217.tar.bz2  The GCC testsuite

Diffs from 4.1-20071210 are available in the diffs/ subdirectory.

When a particular snapshot is ready for public consumption the LATEST-4.1
link is updated and a message is sent to the gcc list.  Please do not use
a snapshot before it has been announced that way.


__builtin_expect for indirect function calls

2007-12-17 Thread trevor_smigiel
Hi,

I'm looking for comments on a possible GCC extension described below.

For the target I'm interested in, Cell SPU, taken branches are only
predicted correctly by explicitly inserting a specific instruction (a
hint instruction) that says "the branch at address A is branching to
address B".  This allows the processor to prefetch the instructions at
B, potentially with no penalty.

For indirect function calls, the ideal case is we know the target soon
enough at run-time that the hint instruction simply specifies the real
target.  Soon enough means about 18 cycles before the execution of the
branch.  I don't have any numbers as to how often this happens, but
there are enough cases where it doesn't.

When we can't hint the real target, we want to hint the most common
target.   There are potentially clever ways for the compiler to do this
automatically, but I'm most interested in giving the user some way to do
it explicitly.  One possibility is to have something similar to
__builtin_expect, but for functions.  For example, I propose:

  __builtin_expect_call (FP, PFP)

which returns the value of FP with the same type as FP, and tells the
compiler that PFP is the expected target of FP.  Trivial examples:

  typedef void (*fptr_t)(void);

  extern void foo(void);

  void
  call_fp (fptr_t fp)
  {
    /* Call the function pointed to by fp, but predict it as if it is
       calling foo() */
    __builtin_expect_call (fp, foo)();
  }

  void
  call_fp_predicted (fptr_t fp, fptr_t predicted)
  {
    /* same as above but the function we are calling doesn't have to be
       known at compile time */
    __builtin_expect_call (fp, predicted)();
  }

I believe I can add this just for the SPU target without affecting
anything else, but it could be useful for other targets.

Are there any comments about the name, semantics, or usefulness of this
extension?

Thanks,
Trevor
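
For comparison, the intent can be approximated today with the existing
__builtin_expect, by testing the pointer against the predicted target and
calling that target directly on the likely path.  This is only a sketch:
the macro CALL_EXPECTING, the counters and bar() are hypothetical names,
the predicted target must be known at compile time, and nothing here
claims to produce an SPU hint instruction.

```c
typedef void (*fptr_t)(void);

/* Counters so that the effect of each call can be observed.  */
int foo_calls, bar_calls;

void foo(void) { foo_calls++; }
void bar(void) { bar_calls++; }

/* Hypothetical stand-in for the proposed builtin: if fp is the
   predicted target, call it directly (a branch the compiler can treat
   as likely); otherwise fall back to the indirect call.  */
#define CALL_EXPECTING(fp, target)                  \
  do {                                              \
    if (__builtin_expect ((fp) == (target), 1))     \
      target ();                                    \
    else                                            \
      (fp) ();                                      \
  } while (0)

void
call_fp (fptr_t fp)
{
  /* Call the function pointed to by fp, predicting that it is foo.  */
  CALL_EXPECTING (fp, foo);
}
```

Unlike the proposed __builtin_expect_call, this form adds a compare and a
branch, and it cannot express the call_fp_predicted case where the
predicted target is only known at run time.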




Re: Designs for better debug info in GCC

2007-12-17 Thread Alexandre Oliva
On Dec 17, 2007, Diego Novillo <[EMAIL PROTECTED]> wrote:

> On 12/17/07 15:28, Alexandre Oliva wrote:
>>> You need to provide such a document now.
>> 
>> Can't I instead provide it when it's ready?

> Of course.

Thanks,

Now, since you're so interested in it and you've already read the
various perspectives on the issue that I listed in my yesterday's
e-mail to you, would you help me improve this document, by letting me
know what you believe to be missing from the selected postings on
design strategies, rationales and goals:

http://gcc.gnu.org/ml/gcc/2007-11/msg00229.html (goals)
http://gcc.gnu.org/ml/gcc-patches/2007-10/msg00160.html (initial plan)
http://gcc.gnu.org/ml/gcc/2007-11/msg00261.html (detailed plan)
http://gcc.gnu.org/ml/gcc/2007-11/msg00317.html (example)
http://gcc.gnu.org/ml/gcc/2007-11/msg00590.html (more example)
http://gcc.gnu.org/ml/gcc/2007-11/msg00176.html (design rationale)
http://gcc.gnu.org/ml/gcc/2007-11/msg00177.html (clarification)

I could then focus on these missing aspects too, in addition to the
ones I already have, while designing the best form to present the
ideas.

Thanks in advance,

-- 
Alexandre Oliva http://www.lsd.ic.unicamp.br/~oliva/
FSF Latin America Board Member http://www.fsfla.org/
Red Hat Compiler Engineer   [EMAIL PROTECTED], gcc.gnu.org}
Free Software Evangelist  [EMAIL PROTECTED], gnu.org}


Re: Designs for better debug info in GCC

2007-12-17 Thread Diego Novillo

On 12/17/07 19:50, Alexandre Oliva wrote:


> Now, since you're so interested in it and you've already read the
> various perspectives on the issue that I listed in my yesterday's
> e-mail to you, would you help me improve this document, by letting me
> know what you believe to be missing from the selected postings on
> design strategies, rationales and goals:


No.  I am not interested in organizing your thoughts for you.

I am interested in reading a single, concise and well organized design 
document that you produce for all of us to understand what you want to do.


Take your time.  It doesn't need to be now.


Diego.


Re: Designs for better debug info in GCC

2007-12-17 Thread Alexandre Oliva
On Dec 17, 2007, Geert Bosch <[EMAIL PROTECTED]> wrote:

> We could conceptually have inspection points between each source
> statement and declaration, which would roughly correspond to a
> use of all memory and all source variables, whether in memory or
> in registers.
> These inspection points would be considered potentially trapping.

Yes, I've considered something along these lines, but decided against
it, for we can't afford for debug information to affect executable
code generation in any way whatsoever, and we don't want to pessimize
optimized code when compiling without -g just so that compiling with
-g would get us the same code.

> Also, since no user-visible state can be modified by speculatively
> executed instructions such as loads, such instructions should not
> be tagged with their original source location information.

Line number information has a well-defined meaning: it ought to
represent the source code line that best represents the source-code
construct that ended up implemented using that instruction.

To address what we have in mind, there's an additional annotation on
top of line number information: the is_stmt flag.  This is what we
should use to tell debuggers what the best instruction is to set a
breakpoint at a certain line number or so, and for debuggers to be
able to step line by line more seamlessly.
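
To make the role of is_stmt concrete, here is a simplified model of how a
debugger might pick a breakpoint address from line table rows.  The
struct line_row layout, the function breakpoint_row() and the sample
table are assumptions made for illustration; real DWARF consumers decode
a line number program and may apply different policies.

```c
#include <stddef.h>

/* Simplified line table row: one instruction address, the source line
   it is attributed to, and the is_stmt flag.  */
struct line_row {
  unsigned long address;
  unsigned int line;
  int is_stmt;
};

/* Return the lowest-addressed row for LINE that is flagged is_stmt,
   or NULL if the line has no recommended breakpoint location.  */
const struct line_row *
breakpoint_row (const struct line_row *rows, size_t n, unsigned int line)
{
  const struct line_row *best = NULL;

  for (size_t k = 0; k < n; k++)
    if (rows[k].line == line && rows[k].is_stmt
        && (best == NULL || rows[k].address < best->address))
      best = &rows[k];
  return best;
}

/* Hypothetical table: a load for line 5 was scheduled early (0x10) and
   is therefore not marked is_stmt; the rest of line 5 starts at 0x18.  */
const struct line_row example_rows[] = {
  { 0x10, 5, 0 },
  { 0x14, 4, 1 },
  { 0x18, 5, 1 },
  { 0x1c, 6, 1 },
};
```

With this table, a breakpoint requested on line 5 lands at 0x18 rather
than on the early load at 0x10, even though both rows keep line 5 as
their line number.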

-- 
Alexandre Oliva http://www.lsd.ic.unicamp.br/~oliva/
FSF Latin America Board Member http://www.fsfla.org/
Red Hat Compiler Engineer   [EMAIL PROTECTED], gcc.gnu.org}
Free Software Evangelist  [EMAIL PROTECTED], gnu.org}


Re: Designs for better debug info in GCC

2007-12-17 Thread Joe Buck
On Mon, Dec 17, 2007 at 11:11:46PM -0200, Alexandre Oliva wrote:
> Line number information has a well-defined meaning: it ought to
> represent the source code line that best represents the source-code
> construct that ended up implemented using that instruction.

You implicitly assume that such a source code line exists.
Consider something like

int func(bool cond, int a, int b, int c)
{
  int out;
  if (cond)
    out = a + b;
  else
    out = a + b + c;
  return out;
}

The optimizer might produce something that structurally resembles

  out = a + b;
  if (!cond)
    out += c;
  return out;

If you set a breakpoint on the addition of a and b, it will trigger
regardless of the value of cond.  Furthermore, there isn't a place
to put a breakpoint that will trigger only for the case where cond
is true, as you can on unoptimized code.  So you need to choose
between natural debugging and optimization.
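
The two shapes can be put side by side to see that they compute the same
value while executing different statements.  In this sketch the second
function is a hand-written rendering of what an optimizer might produce;
the name func_hoisted and the use of <stdbool.h> are additions for
illustration.

```c
#include <stdbool.h>

/* The function as written in the source.  */
int func(bool cond, int a, int b, int c)
{
  int out;
  if (cond)
    out = a + b;
  else
    out = a + b + c;
  return out;
}

/* Hand-written equivalent of the optimized shape: the addition of a
   and b is shared by both paths, so it runs regardless of cond.  */
int func_hoisted(bool cond, int a, int b, int c)
{
  int out = a + b;
  if (!cond)
    out += c;
  return out;
}
```

In func_hoisted there is no instruction that runs only when cond is true,
which is the breakpoint that the unoptimized code offers and the
optimized code cannot.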



Re: __builtin_expect for indirect function calls

2007-12-17 Thread Jonathan Adamczewski
[EMAIL PROTECTED] wrote:
> Are there any comments about the name, semantics, or usefulness of this
> extension?
>   

Sounds very useful for SPU code. I look forward to trying it out.


Toying with the idea, the following seems like a potentially useful C++
form of the proposed extension:

struct A {
    virtual void foo();
};

struct B : public A {
    virtual void foo();
};

A* a;

...

__builtin_expect_call (a->foo, B::foo)();



jonathan.


A proposal to align GCC stack

2007-12-17 Thread Ye, Joey
-- 0. MOTIVATION --
Some local variables (such as those of __m128 type or marked with an
alignment attribute) require the stack to be aligned at a boundary larger
than the default stack boundary. Current GCC partially supports this, with
limitations. We are proposing a new design to fully solve the problem.
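
As an illustration of the kind of variable meant here, the sketch below
declares a local with a 32-byte alignment attribute and checks its
address.  The function name local_is_aligned() is hypothetical, and
whether the check succeeds depends on the compiler honoring the request
for stack variables, which is exactly what this proposal is about.

```c
#include <stdint.h>

/* Returns 1 if the over-aligned local ended up on a 32-byte boundary,
   0 otherwise.  */
int local_is_aligned(void)
{
  /* GCC attribute syntax: request 32-byte alignment for a local.  */
  float v[8] __attribute__ ((aligned (32)));

  v[0] = 1.0f;
  return ((uintptr_t) v % 32) == 0;
}
```

A result of 0 would mean the variable was placed without realigning the
stack, which is the failure mode the limitations below describe.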


-- 1. CURRENT IMPLEMENTATION --
There are two ways in which current GCC supports bigger-than-default stack
alignment.  One is to make sure that the stack is aligned at the program
entry point, and then ensure that for each non-leaf function, its frame
size is aligned. This approach doesn't work when linking with libs or
objects compiled by other psABI-conforming compilers. Some problems are
logged as PR 33721. Another is to adjust stack alignment at the entry point
of a function if it is marked with __attribute__ ((force_align_arg_pointer))
or the -mstackrealign option is provided. This method guarantees the
alignment in most cases, but with the following problems and limitations:

*  Only 16 bytes alignment is supported
*  Adjusting stack alignment at each function prologue hurts performance
unnecessarily, because not all functions need bigger alignment. In fact,
commonly only those functions which have SSE variables defined locally
(either declared by the user or compiler generated internal temporary
variables) need corresponding alignment.
*  Doesn't support x86_64 for the cases when required stack alignment
is > 16 bytes
*  Emits inefficient and complicated prologue/epilogue code to adjust
stack alignment
*  Doesn't work with nested functions
*  Has a bug handling register parameters, which resulted in a cpu2006
failure. A patch is available as a workaround.

-- 2. NEW PROPOSAL: DESIGN --
Here, we propose a new design to fully support stack alignment while
overcoming above problems. The new design will
*  Support arbitrary alignment value, including 4,8,16,32...
*  Adjust function stack alignment only when necessary
*  Initial development will be on i386 and x86_64, but can be extended
to other platforms
*  Emit more efficient prologue/epilogue code
*  Coexist with special features like dynamic stack allocation (alloca),
nested functions, register parameter passing, PIC code and tail call
optimization
*  Be able to debug and unwind stack

2.1 Support arbitrary alignment value
Different source code and optimizations require different stack
alignment, as in the following table:

Feature             Alignment (bytes)
i386_ABI            4
x86_64_ABI          16
char                1
short               2
int                 4
long                4/8*
long long           8
__m64               8
__m128              16
float               4
double              8
long double         4/16*
user specified      any power of 2

*Note: 4 for i386, 8/16 for x86_64
The new design will support any alignment value in this table.

2.2 Adjust function stack alignment only when necessary

Current GCC defines the following macros related to stack alignment:
i. STACK_BOUNDARY in bits, which is enforced by hardware, 32 for i386 and
64 for x86_64. It is the minimum stack boundary. It is fixed.
ii. PREFERRED_STACK_BOUNDARY. It sets the stack alignment when calling a
function. It may be set at command line and has no impact on stack
alignment at function entry. This proposal requires PREFERRED >= STACK,
and by default it is set to ABI_STACK_BOUNDARY.

This design will define a few more macros, or concepts not explicitly
defined in code:
iii. ABI_STACK_BOUNDARY in bits, which is the stack boundary specified by
the psABI, 32 for i386 and 128 for x86_64.  ABI_STACK_BOUNDARY >=
STACK_BOUNDARY. It is fixed for a given psABI.
iv. LOCAL_STACK_BOUNDARY in bits. Each function stack has its own stack
alignment requirement, which depends on the alignment of its stack
variables: LOCAL_STACK_BOUNDARY = MAX (alignment of each effective stack
variable).
v. INCOMING_STACK_BOUNDARY in bits, which is the stack boundary at function
entry. If a function is marked with __attribute__
((force_align_arg_pointer)) or the -mstackrealign option is provided,
INCOMING = STACK_BOUNDARY. Otherwise, INCOMING == MIN(ABI_STACK_BOUNDARY,
PREFERRED_STACK_BOUNDARY) because a function can be called via psABI
externally or called locally with PREFERRED_STACK_BOUNDARY.
vi. REQUIRED_STACK_ALIGNMENT in bits, which is the stack alignment required
by local variables and by calling other functions.
REQUIRED_STACK_ALIGNMENT == MAX(LOCAL_STACK_BOUNDARY,
PREFERRED_STACK_BOUNDARY) in the case of a non-leaf function. For a leaf
function, REQUIRED_STACK_ALIGNMENT == LOCAL_STACK_BOUNDARY.

This proposal won't adjust the stack when INCOMING_STACK_BOUNDARY >=
REQUIRED_STACK_ALIGNMENT. Only when INCOMING_STACK_BOUNDARY <
REQUIRED_STACK_ALIGNMENT will it adjust the stack to
REQUIRED_STACK_ALIGNMENT in the prologue.
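
The decision rule in items i to vi can be written out as a small C model.
This is a sketch of the definitions as stated in this section, not code
from GCC: the struct stack_params, the function names and the sample
values in the usage note are illustrative assumptions, and all
quantities are in bits.

```c
#include <stdbool.h>

#define MIN(a, b) ((a) < (b) ? (a) : (b))
#define MAX(a, b) ((a) > (b) ? (a) : (b))

/* Per-target values from items i to iii, in bits.  */
struct stack_params {
  int stack_boundary;            /* STACK_BOUNDARY */
  int abi_stack_boundary;        /* ABI_STACK_BOUNDARY */
  int preferred_stack_boundary;  /* PREFERRED_STACK_BOUNDARY */
};

/* Item v: the boundary a function may assume at entry.  FORCE_REALIGN
   models force_align_arg_pointer or -mstackrealign.  */
int incoming_stack_boundary(const struct stack_params *p, bool force_realign)
{
  if (force_realign)
    return p->stack_boundary;
  return MIN(p->abi_stack_boundary, p->preferred_stack_boundary);
}

/* Item vi: the alignment the function body needs.  */
int required_stack_alignment(const struct stack_params *p,
                             int local_stack_boundary, bool is_leaf)
{
  if (is_leaf)
    return local_stack_boundary;
  return MAX(local_stack_boundary, p->preferred_stack_boundary);
}

/* The prologue realigns only when the incoming boundary is too small.  */
bool needs_realign(const struct stack_params *p, int local_stack_boundary,
                   bool is_leaf, bool force_realign)
{
  return incoming_stack_boundary(p, force_realign)
         < required_stack_alignment(p, local_stack_boundary, is_leaf);
}
```

With i386 values of 32 for all three boundaries, a leaf function holding
a 128-bit local needs realignment; with x86_64 values of 64, 128 and 128
the same function does not, unless realignment is forced and the incoming
boundary drops to 64.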

2.3 Initial development on i386 and x86_64
We initially support i386 and x86_64. In this document we focus more on
i386 because it is harder to implement there, owing to the restriction of
having a small register file.  But all that we discuss can be easily
applied to x86_64.

2.4 Emit more efficient prologue/epil

Re: A proposal to align GCC stack

2007-12-17 Thread Ross Ridge
Ye, Joey writes:
>i. STACK_BOUNDARY in bits, which is enforced by hardware, 32 for i386
>and 64 for x86_64. It is the minimum stack boundary. It is fixed.

Strictly speaking by the above definition it would be 8 for i386.
The hardware doesn't force the stack to be 32-bit aligned, it just
performs poorly if it isn't.

>v. INCOMING_STACK_BOUNDARY in bits, which is the stack boundary
>at function entry. If a function is marked with __attribute__
>((force_align_arg_pointer)) or -mstackrealign option is provided,
>INCOMING = STACK_BOUNDARY.  Otherwise, INCOMING == MIN(ABI_STACK_BOUNDARY,
>PREFERRED_STACK_BOUNDARY) because a function can be called via psABI
>externally or called locally with PREFERRED_STACK_BOUNDARY.

This section doesn't make sense to me.  The force_align_arg_pointer
attribute and -mstackrealign assume that the ABI is being
followed, while the -fpreferred-stack-boundary option effectively
changes the ABI.  According to your definitions, I would think
that INCOMING should be ABI_STACK_BOUNDARY in the first case,
and MAX(ABI_STACK_BOUNDARY, PREFERRED_STACK_BOUNDARY) in the second.
(Or just PREFERRED_STACK_BOUNDARY because a boundary less than the ABI's
should be rejected during command line processing.)

>vi. REQUIRED_STACK_ALIGNMENT in bits, which is stack alignment required
>by local variables and calling other function. REQUIRED_STACK_ALIGNMENT
>== MAX(LOCAL_STACK_BOUNDARY,PREFERRED_STACK_BOUNDARY) in case of a
>non-leaf function. For a leaf function, REQUIRED_STACK_ALIGNMENT ==
>LOCAL_STACK_BOUNDARY.

Hmm... I think you should define STACK_BOUNDARY as the minimum
alignment that the ABI requires the stack pointer to keep at all times.
ABI_STACK_BOUNDARY should be defined as the stack alignment the
ABI requires at function entry.  In that case a leaf function's
REQUIRED_STACK_ALIGNMENT should be MAX(LOCAL_STACK_BOUNDARY,
STACK_BOUNDARY).

>Because I386 PIC requires BX as GOT pointer and I386 may use AX, DX
>and CX as parameter passing registers, there are limited candidates for
>this proposal to choose. Current proposal suggests EDI, because it won't
>conflict with i386 PIC or regparm.

Could you pick a call-clobbered register in cases where one is available?

>//  Reserve two stack slots and save return address 
>//  and previous frame pointer into them. By
>//  pointing new ebp to them, we build a pseudo 
>//  stack for unwinding

Hmmm... I don't know much about the DWARF unwind information, but
couldn't it handle this case without creating the "pseudo frame"?
Or at least be extended so it could?

Ross Ridge



Re: Designs for better debug info in GCC

2007-12-17 Thread Alexandre Oliva
On Dec 17, 2007, Joe Buck <[EMAIL PROTECTED]> wrote:

> On Mon, Dec 17, 2007 at 11:11:46PM -0200, Alexandre Oliva wrote:
>> Line number information has a well-defined meaning: it ought to
>> represent the source code line that best represents the source-code
>> construct that ended up implemented using that instruction.

> You implicitly assume that such a source code line exists.

Actually, no.  I'm not sure where you got that impression, and how you
came to the conclusion that I'd assign line numbers the way you have.
To me, when you hoist something that is present in both blocks of a
conditional, it probably makes more sense to give it the line number
of the conditional, rather than that of either block.  But I won't
pretend to have thought very hard about this particular issue.  For
the time being, I'm focusing my efforts on local variable locations.

Anyhow, very clearly you don't want to mark such hoisted-out
computation as is_stmt.  This should eliminate at least the solvable
problem you're worried about.

>   out = a + b;
>   if (!cond)
>     out += c;
>   return out;

> Furthermore, there isn't a place to put a breakpoint that will
> trigger only for the case where cond is true, as you can on
> unoptimized code.

Yep.  Sometimes code just is optimized away.  Can't stop that without
harming optimizations.

If dwarf line number programs were smarter, we could perhaps encode
multiple lines for the same instruction, along with conditions to tell
when the instruction applies to such or such lines, and even more
fancy stuff like that.  But line number programs don't let us express
this in Dwarf3.

-- 
Alexandre Oliva http://www.lsd.ic.unicamp.br/~oliva/
FSF Latin America Board Member http://www.fsfla.org/
Red Hat Compiler Engineer   [EMAIL PROTECTED], gcc.gnu.org}
Free Software Evangelist  [EMAIL PROTECTED], gnu.org}


Re: Designs for better debug info in GCC

2007-12-17 Thread Alexandre Oliva
On Dec 17, 2007, Diego Novillo <[EMAIL PROTECTED]> wrote:

> On 12/17/07 19:50, Alexandre Oliva wrote:
>> Now, since you're so interested in it and you've already read the
>> various perspectives on the issue that I listed in my yesterday's
>> e-mail to you, would you help me improve this document, by letting me
>> know what you believe to be missing from the selected postings on
>> design strategies, rationales and goals:

> No.  I am not interested in organizing your thoughts for you.

Wow, nice shot!

So tell me, what part of what you've read in the selected bibliography
seemed not organized for you?  Maybe that's what I have to work on
first.

> I am interested in reading a single, concise and well organized design
> document that you produce for all of us to understand what you want to
> do.

You got that already, except now I'm no longer sure you've actually
read it.  Have you?

You got the goals.  You got the way I intend to get there, in two
levels of detail.  You got examples that show why the goals can't be
achieved in other simpler ways.  You got various justifications for
the representation I've chosen.

Would reformatting these and stamping a title on top make it worthy of
your interest?

I really don't see what else you might want, and if the above isn't
enough, then my rephrasing it all into a single document still
wouldn't be enough.  I'd be just wasting my time, and yours.

So, please do tell me, what is it that you're still missing?  Note
that I can't promise to deliver, but I can't possibly give you what
you want unless you help me figure out what it is.

-- 
Alexandre Oliva http://www.lsd.ic.unicamp.br/~oliva/
FSF Latin America Board Member http://www.fsfla.org/
Red Hat Compiler Engineer   [EMAIL PROTECTED], gcc.gnu.org}
Free Software Evangelist  [EMAIL PROTECTED], gnu.org}


Re: A proposal to align GCC stack

2007-12-17 Thread H.J. Lu
On Mon, Dec 17, 2007 at 11:25:35PM -0500, Ross Ridge wrote:
> Ye, Joey writes:
> >i. STACK_BOUNDARY in bits, which is enforced by hardware, 32 for i386
> >and 64 for x86_64. It is the minimum stack boundary. It is fixed.
> 
> Strictly speaking by the above definition it would be 8 for i386.
> The hardware doesn't force the stack to be 32-bit aligned, it just
> performs poorly if it isn't.

We can change the wording.

> 
> >v. INCOMING_STACK_BOUNDARY in bits, which is the stack boundary
> >at function entry. If a function is marked with __attribute__
> >((force_align_arg_pointer)) or -mstackrealign option is provided,
> >INCOMING = STACK_BOUNDARY.  Otherwise, INCOMING == MIN(ABI_STACK_BOUNDARY,
> >PREFERRED_STACK_BOUNDARY) because a function can be called via psABI
> >externally or called locally with PREFERRED_STACK_BOUNDARY.
> 
> This section doesn't make sense to me.  The force_align_arg_pointer
> attribute and -mstackrealign assume that the ABI is being
> followed, while the -fpreferred-stack-boundary option effectively

According to the Apple engineer who implemented -mstackrealign,
on MacOS/ia32 the psABI boundary is 16 bytes, but -mstackrealign
assumes 4 bytes, which is STACK_BOUNDARY.

> changes the ABI.  According to your definitions, I would think
> that INCOMING should be ABI_STACK_BOUNDARY in the first case,
> and MAX(ABI_STACK_BOUNDARY, PREFERRED_STACK_BOUNDARY) in the second.

That isn't true since some .o files may not be compiled with
-fpreferred-stack-boundary or with a different value of
-fpreferred-stack-boundary.

> (Or just PREFERRED_STACK_BOUNDARY because a boundary less than the ABI's
> should be rejected during command line processing.)

On x86-64, ABI_STACK_BOUNDARY is 16 bytes, but the Linux kernel may
want to use 8 bytes for PREFERRED_STACK_BOUNDARY.

> 
> >vi. REQUIRED_STACK_ALIGNMENT in bits, which is stack alignment required
> >by local variables and calling other function. REQUIRED_STACK_ALIGNMENT
> >== MAX(LOCAL_STACK_BOUNDARY,PREFERRED_STACK_BOUNDARY) in case of a
> >non-leaf function. For a leaf function, REQUIRED_STACK_ALIGNMENT ==
> >LOCAL_STACK_BOUNDARY.
> 
> Hmm... I think you should define STACK_BOUNDARY as the minimum
> alignment that ABI requires the stack pointer to keep at all times.
> ABI_STACK_BOUNDARY should be defined as the stack alignment the
> ABI requires at function entry.  In that case a leaf function's
> REQUIRED_STACK_ALIGNMENT should be MAX(LOCAL_STACK_BOUNDARY,
> STACK_BOUNDARY).

That is true since if the only local variable is char, LOCAL_STACK_BOUNDARY
will be 1. But we want the stack to be aligned at STACK_BOUNDARY.
We will update our proposal. 
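
The boundary rules discussed above can be sketched in C. This is only an
illustration of the arithmetic under the definitions quoted in this thread,
not code from GCC: the struct and function names are hypothetical, all values
are in bits, and the leaf-function rule uses the
MAX(LOCAL_STACK_BOUNDARY, STACK_BOUNDARY) form suggested by Ross.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical container for the boundaries named in the proposal (in bits). */
struct stack_bounds {
    unsigned stack_boundary;      /* STACK_BOUNDARY: fixed minimum for the target */
    unsigned abi_boundary;        /* ABI_STACK_BOUNDARY: psABI alignment at entry */
    unsigned preferred_boundary;  /* PREFERRED_STACK_BOUNDARY: chosen per compile */
    unsigned local_boundary;      /* LOCAL_STACK_BOUNDARY: largest local alignment */
};

static unsigned min_u(unsigned a, unsigned b) { return a < b ? a : b; }
static unsigned max_u(unsigned a, unsigned b) { return a > b ? a : b; }

/* INCOMING_STACK_BOUNDARY: the alignment a function may assume at entry.
   force_realign models force_align_arg_pointer or -mstackrealign. */
unsigned incoming_stack_boundary(const struct stack_bounds *b, bool force_realign)
{
    assert(b);
    if (force_realign)
        return b->stack_boundary;
    /* Callers may follow the psABI or the preferred boundary, so only
       the smaller of the two can be relied on. */
    return min_u(b->abi_boundary, b->preferred_boundary);
}

/* REQUIRED_STACK_ALIGNMENT: the alignment the function body needs. */
unsigned required_stack_alignment(const struct stack_bounds *b, bool is_leaf)
{
    assert(b);
    /* A leaf function calls nobody, so it only has to keep the minimum
       boundary; a non-leaf function must provide the preferred boundary
       to its callees. */
    unsigned floor_bits = is_leaf ? b->stack_boundary : b->preferred_boundary;
    return max_u(b->local_boundary, floor_bits);
}
```

For example, with STACK_BOUNDARY 32, ABI_STACK_BOUNDARY 128,
PREFERRED_STACK_BOUNDARY 64 and a single char local (8 bits), a leaf function
gets a required alignment of 32 rather than 8.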

> 
> >Because I386 PIC requires BX as GOT pointer and I386 may use AX, DX
> >and CX as parameter passing registers, there are limited candidates for
> >this proposal to choose. Current proposal suggests EDI, because it won't
> >conflict with i386 PIC or regparm.
> 
> Could you pick a call-clobbered register in cases where one is available?

Joey, Xuepeng, is that doable?

> 
> >//  Reserve two stack slots and save return address 
> >//  and previous frame pointer into them. By
> >//  pointing new ebp to them, we build a pseudo 
> >//  stack for unwinding
> 
> Hmmm... I don't know much about the DWARF unwind information, but
> couldn't it handle this case without creating the "pseudo frame"?
> Or at least be extended so it could?


Joey, Xuepeng, what do you think?


H.J.


Re: Designs for better debug info in GCC

2007-12-17 Thread Robert Dewar

Alexandre Oliva wrote:


> Yes, I've considered something along these lines, but decided against
> it, for we can't afford for debug information to affect executable
> code generation in any way whatsoever, and we don't want to pessimize
> optimized code when compiling without -g just so that compiling with
> -g would get us the same code.


I disagree, I think it would be fine to degrade -O1 slightly to achieve
full debuggability, and of course -g cannot affect the generated code.
If indeed

a) it is possible to get perfect debuggability without any pessimization
b) that includes unexpected jumping around
c) everyone agrees on how to achieve a) and b)
d) this is implemented

then fine, but in the absence of these conditions, if we need to
pessimize -O1 code slightly to achieve this, that's OK by me. If
it really worries people, introduce a -Og that achieves this. In
my experience people use -O1 not because they are very performance
sensitive (those folk use -O2), but because -O0 is so horrible,
that they need something better than that for production delivery.


Re: Designs for better debug info in GCC

2007-12-17 Thread Robert Dewar

Alexandre Oliva wrote:


> Yep.  Sometimes code just is optimized away.  Can't stop that without
> harming optimizations.


OK, so you are agreeing that good debuggability is impossible
with all the optimizations in place, so once again, let's have
an optimization level that optimizes as far as possible without
harming debuggability.


> If dwarf line number programs were smarter, we could perhaps encode
> multiple lines for the same instruction, along with conditions to tell
> when the instruction applies to such or such lines, and even more
> fancy stuff like that.  But line number programs don't let us express
> this in Dwarf3.


So, that's not an option.




RE: A proposal to align GCC stack

2007-12-17 Thread Ye, Joey
Ross, HJ,

> 
> >Because I386 PIC requires BX as GOT pointer and I386 may use AX, DX
> >and CX as parameter passing registers, there are limited candidates for
> >this proposal to choose. Current proposal suggests EDI, because it won't
> >conflict with i386 PIC or regparm.
> 
> Could you pick a call-clobbered register in cases where one is available?
I think it is doable. The Apple engineer's current code for
-mstackrealign uses hard register ECX. We need to add code to find a
caller-saved register that is not used to pass parameters. If none is
available, we still have to use a callee-saved register such as EDI.
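
A minimal sketch of that selection, assuming only regparm-style argument
passing (registers assigned in the order EAX, EDX, ECX). The enum and function
names are hypothetical, and other claims on these registers (for example
fastcall or a static chain) are deliberately ignored here.

```c
#include <assert.h>

/* The first three values follow the regparm assignment order, so a
   register is used for arguments exactly when its value is below the
   regparm count. */
enum x86_reg { REG_EAX = 0, REG_EDX = 1, REG_ECX = 2, REG_EDI = 3 };

/* Pick the register that holds the incoming argument pointer.
   regparm_count is the number of arguments passed in registers (0..3). */
enum x86_reg pick_realign_register(int regparm_count)
{
    /* Call-clobbered candidates; ECX is tried first because it is the
       last register regparm hands out. */
    static const enum x86_reg caller_saved[] = { REG_ECX, REG_EDX, REG_EAX };
    unsigned i;

    assert(regparm_count >= 0 && regparm_count <= 3);

    for (i = 0; i < sizeof caller_saved / sizeof caller_saved[0]; i++) {
        if ((int)caller_saved[i] >= regparm_count)
            return caller_saved[i];   /* free: not carrying an argument */
    }
    /* Every call-clobbered register carries an argument, so fall back
       to the callee-saved register named in the proposal. */
    return REG_EDI;
}
```

With this ordering ECX is chosen for regparm counts 0 through 2, and EDI is
needed only when all three registers carry arguments.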

> 
> >//  Reserve two stack slots and save return address 
> >//  and previous frame pointer into them. By
> >//  pointing new ebp to them, we build a pseudo 
> >//  stack for unwinding
> 
> Hmmm... I don't know much about the DWARF unwind information, but
> couldn't it handle this case without creating the "pseudo frame"?
> Or at least be extended so it could?

I haven't spent time investigating it yet. I agree it would be much more
beautiful without the "pseudo frame", and I will be happy if a solution
can be found or suggested here. But I doubt it is worth the effort.
Remember that the "pseudo frame" is generated only when both stack
adjustment and alloca are present. That may not be common enough to
impact performance.
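
For illustration, a function of roughly that shape might look like the sketch
below. It assumes that "stack adjustment + alloca" means an over-aligned local
combined with a dynamically sized allocation; the function name is
hypothetical, and the aligned attribute is the GCC extension.

```c
#include <alloca.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Returns 1 if the over-aligned local landed on a 32-byte boundary. */
int aligned_local_with_alloca(size_t n)
{
    /* 32-byte alignment is more than the incoming stack boundary
       guarantees on i386, so the prologue has to realign the stack. */
    unsigned char vec[32] __attribute__((aligned(32)));

    /* alloca makes the frame size unknown at compile time. */
    unsigned char *scratch = alloca(n + 1);

    memset(vec, 0, sizeof vec);
    memset(scratch, 0, n + 1);
    return ((uintptr_t)vec % 32) == 0 && scratch[n] == 0;
}
```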


Re: Designs for better debug info in GCC

2007-12-17 Thread Alexandre Oliva
On Dec 18, 2007, Robert Dewar <[EMAIL PROTECTED]> wrote:

> Alexandre Oliva wrote:
>> Yep.  Sometimes code just is optimized away.  Can't stop that without
>> harming optimizations.

> OK, so you are agreeing that good debuggability is impossible
> with all the optimizations in place, so once again, let's have
> an optimization level that optimizes as far as possible without
> harming debuggability.

I don't oppose such an optimization level, even though I don't know
that we agree on what "good debuggability" stands for.

It's just that changing optimizations is precisely *against* the goals
of my current project.  So, don't expect significant efforts to this
end from me at this time.

>> If dwarf line number programs were smarter, we could perhaps encode
>> multiple lines for the same instruction, along with conditions to tell
>> when the instruction applies to such or such lines, and even more
>> fancy stuff like that.  But line number programs don't let us express
>> this in Dwarf3.

> So, that's not an option.

Yup.  Best we can do right now is to emit the condition line number.

-- 
Alexandre Oliva http://www.lsd.ic.unicamp.br/~oliva/
FSF Latin America Board Member http://www.fsfla.org/
Red Hat Compiler Engineer   [EMAIL PROTECTED], gcc.gnu.org}
Free Software Evangelist  [EMAIL PROTECTED], gnu.org}


Re: Designs for better debug info in GCC

2007-12-17 Thread Kai Henningsen
On Tue, Dec 18, 2007 at 02:38:31AM -0200, Alexandre Oliva wrote:

> Would reformatting these and stamping a title on top make it worthy of
> your interest?

Actually, I think that *would* help (though, of course, it's impossible
to predict if it would help *enough*).

I've noticed before (though this thread is a particularly extreme
example) that GCC developers seem no more immune than other people to
ignoring what's in a mail message (or news article) they're
replying to, even up to ignoring the carefully-selected part they're
quoting.

I don't claim to understand it (nor to be completely immune to it
myself), but I'm no longer surprised by it. Disappointed, but not
surprised.

Anyway, the point is that this seems much rarer when the subject is
*not* in the inbox or a newsgroup. For whatever reason, people apply
their reading skills differently in different situations.

So, my advice would be:

1. Wait a while, so people have time to calm down.

2. Reformat and reorganize the stuff.

3. Put it in an obviously different format - say, give a link to a PDF,
instead of putting it in a mail to this list.

Oh, and it probably wouldn't hurt to give a short summary of what you
did to the various optimizers, including mentioning "no change", *after*
you know that that actually works. (For a work in progress, people seem
to often disbelieve such claims, however well justified ... at least, if
they're already looking hard for arguments against it, however
spurious.)

And no, I have no idea why this particular discussion degenerated so
badly, and similar others didn't. Your style of argumentation may not
have been perfect, but the same can be said for many other people here,
and it doesn't always seem to lead to a meltdown. Maybe it depends on
unpredictable factors like the mood people are in when they go reading
their mail.