Re: Optimization of conditional access to globals: thread-unsafe?

2007-10-22 Thread Tomash Brechko
On Mon, Oct 22, 2007 at 00:07:50 +0100, Dave Korn wrote:
>   Because of the 'as-if' rule.  Since the standard is neutral with regard to
> threads, gcc does not have to take them into account when it decides whether
> an optimisation would satisfy the 'as-if' rule.

If this were true, then the compiler would be free to inject the
sequence

  mov mem -> reg
  mov reg -> mem

just _anywhere_.  How can the programmer predict where and when to
lock the mutex to protect mem?  The only thing we could rely on then
is that the compiler is sound and wouldn't inject such a sequence
unless it really feels like it.  But still, how do we determine when
the compiler really feels like it?

Here's another piece of code, more real and sound this time:


  #include <pthread.h>

  static pthread_mutex_t mutex = PTHREAD_MUTEX_INITIALIZER;
  static int acquires_count = 0;

  int
  trylock()
  {
    int res;

    res = pthread_mutex_trylock(&mutex);
    if (res == 0)
      ++acquires_count;

    return res;
  }


Is it thread safe?  Or rather, should the compiler preserve its
thread-safeness, as seen from the programmer's POV?  Otherwise I don't
get how pthread_mutex_trylock() could possibly ever be used, because
it's exactly the case when you _have_ to do the access based on the
condition, "assume the worst" won't work here.  GCC 4.3 with -O1
generates:

  trylock:
          pushl   %ebp
          movl    %esp, %ebp
          subl    $8, %esp
          movl    $mutex, (%esp)
          call    pthread_mutex_trylock
          cmpl    $1, %eax                ; test res
          movl    acquires_count, %edx    ; load
          adcl    $0, %edx                ; maybe add 1
          movl    %edx, acquires_count    ; store
          leave
          ret
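
Read back into C, the listing above behaves roughly like the sketch
below: the load and the store of acquires_count happen on every call,
and only the amount added depends on res.  The struct cell, cell_store,
count_as_written and count_as_compiled names are hypothetical, made up
here only so that the extra store becomes countable in a single thread;
none of them comes from GCC or from the original post.

```c
/* Hypothetical model of a memory cell that counts its stores, so the
 * extra write in the compiled form is observable in one thread. */
struct cell {
    int value;   /* stands in for acquires_count */
    int stores;  /* number of writes performed on value */
};

static void cell_store(struct cell *c, int v)
{
    c->value = v;
    c->stores++;
}

/* The source as written: a store happens only when res == 0. */
static void count_as_written(struct cell *c, int res)
{
    if (res == 0)
        cell_store(c, c->value + 1);
}

/* What the cmpl/adcl sequence amounts to: load, add the carry flag
 * (1 only when res == 0), then store back unconditionally. */
static void count_as_compiled(struct cell *c, int res)
{
    int reg = c->value;      /* movl acquires_count, %edx */
    reg += (res == 0);       /* cmpl $1, %eax ; adcl $0, %edx */
    cell_store(c, reg);      /* movl %edx, acquires_count */
}
```

Both functions leave the same value behind when run alone.  The
difference is that the second one writes even when the lock was not
acquired, which is exactly the write another thread holding the mutex
can race with.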


-- 
   Tomash Brechko


Re: Optimization of conditional access to globals: thread-unsafe?

2007-10-22 Thread Erik Trulsson
On Mon, Oct 22, 2007 at 01:36:17PM +0400, Tomash Brechko wrote:
> On Mon, Oct 22, 2007 at 00:07:50 +0100, Dave Korn wrote:
> >   Because of the 'as-if' rule.  Since the standard is neutral with regard to
> > threads, gcc does not have to take them into account when it decides whether
> > an optimisation would satisfy the 'as-if' rule.
> 
> If this would be true, then the compiler is free to inject the
> sequence
> 
>   mov mem -> reg
>   mov reg -> mem
> 
> just _anywhere_.

As far as the C standard is concerned, yes, the compiler is most certainly
free to insert such a sequence almost anywhere.

If a variable has been declared as 'volatile' however, then all accesses to
it must be according to the abstract machine defined by the C standard, i.e.
the compiler is not allowed to optimize away any access to the variable, nor
is it allowed to insert spurious accesses to the variable.

It is worth noting that exactly what constitutes an access is
implementation-defined.

It is also worth noting that just declaring a variable 'volatile' does not
help all that much in making it safer to use in a threaded environment if you
have multiple CPUs.  (There is nothing that says that a multi-CPU system has
to have any kind of automatic cache-coherence.)


>  How can the programmer predict where and when to
> lock the mutex to protect mem?  The only thing we could rely on then
> is that the compiler is sound, it wouldn't inject such a sequence
> unless it really feels so.  But still, how to determine when the
> compiler really feels so?

You will have to read the documentation for the compiler and the
threading library carefully, and hope that they have something useful
to say on this matter.  All too often they won't, in which case
you will have to do what most programmers do in practice in this situation:
Write something that "should" work, and hope for the best.  Most of the time
it will actually work.


My own conclusion from this discussion (and others) is that shared memory is
a lousy paradigm for communication between different threads of execution,
precisely because it is so hard to specify exactly what should happen or not
happen in various situations. (Most of the time the relevant standards
do not actually specify this in sufficient detail.)  I also conclude that
POSIX threads should be avoided if you are really concerned about
correctness.   (Which of course hasn't stopped lots of people from using
them - with varying results.)
Message passing has an advantage here since then only the people writing
the actual message-passing routines need to know about the underlying
details.



> 
> Here's another piece of code, more real and sound this time:
> 
> 
>   #include <pthread.h>
> 
>   static pthread_mutex_t mutex = PTHREAD_MUTEX_INITIALIZER;
>   static int acquires_count = 0;
> 
>   int
>   trylock()
>   {
>     int res;
> 
>     res = pthread_mutex_trylock(&mutex);
>     if (res == 0)
>       ++acquires_count;
> 
>     return res;
>   }
> 
> 
> Is it thread safe?  Or rather, should the compiler preserve its
> thread-safeness, as seen from the programmer's POV?  Otherwise I don't
> get how pthread_mutex_trylock() could possibly ever be used, because
> it's exactly the case when you _have_ to do the access based on the
> condition, "assume the worst" won't work here.  GCC 4.3 with -O1
> generates:
> 
>   trylock:
>           pushl   %ebp
>           movl    %esp, %ebp
>           subl    $8, %esp
>           movl    $mutex, (%esp)
>           call    pthread_mutex_trylock
>           cmpl    $1, %eax                ; test res
>           movl    acquires_count, %edx    ; load
>           adcl    $0, %edx                ; maybe add 1
>           movl    %edx, acquires_count    ; store
>           leave
>           ret
> 

What happens if you declare the variables as 'volatile' ?
(There is no guarantee that this will make things better, but it
is very likely.)
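
For concreteness, the suggestion amounts to the variant sketched below.
It is the trylock() function from the quoted message with one qualifier
added, nothing more.  Under the C standard's rules for volatile, the
increment may then only be performed where the source performs it; the
sketch makes no claim about what any particular GCC version emits, and
volatile adds neither atomicity nor any cross-CPU ordering.

```c
#include <pthread.h>

static pthread_mutex_t mutex = PTHREAD_MUTEX_INITIALIZER;

/* volatile: each access in the source must appear as written, so the
 * implementation may not add a store on the path where res != 0. */
static volatile int acquires_count = 0;

int
trylock(void)
{
    int res;

    res = pthread_mutex_trylock(&mutex);
    if (res == 0)
        ++acquires_count;   /* only reached while holding the mutex */

    return res;
}
```

Calling trylock() twice from one thread without unlocking shows the two
paths: the first call takes the mutex and counts it, the second fails
and leaves the counter alone.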



-- 

Erik Trulsson


Re: Optimization of conditional access to globals: thread-unsafe?

2007-10-22 Thread Robert Dewar

Erik Trulsson wrote:

> It is also worth noting that just declaring a variable 'volatile' does not
> help all that much in making it safer to use in a threaded environment if you
> have multiple CPUs.  (There is nothing that says that a multi-CPU system has
> to have any kind of automatic cache-coherence.)

The first sentence here could be misleading: there are LOTS of systems
where there is automatic cache-coherence, and of course the use of
'volatile' on such systems does indeed help.  If you are working on
a system without cache-coherence, you indeed have big problems, but
that's rarely the case; most multi-processor computers in common use
do guarantee cache coherence.



Re: Optimization of conditional access to globals: thread-unsafe?

2007-10-22 Thread Andrew Haley
Tomash Brechko writes:
 > On Mon, Oct 22, 2007 at 00:07:50 +0100, Dave Korn wrote:
 > >   Because of the 'as-if' rule.  Since the standard is neutral
 > > with regard to threads, gcc does not have to take them into
 > > account when it decides whether an optimisation would satisfy the
 > > 'as-if' rule.
 > 
 > If this would be true, then the compiler is free to inject the
 > sequence
 > 
 >   mov mem -> reg
 >   mov reg -> mem
 > 
 > just _anywhere_.

That's right.  This isn't a standards conformance issue, rather one of
quality of implementation.

The core problem here seems to be that the "C with threads" memory
model isn't sufficiently well-defined to make a determination
possible.  You're assuming that you have no responsibility to mark
shared memory protected by a mutex as volatile, but I know of nothing
in the C standard that makes such a guarantee.  A prudent programmer
will make conservative assumptions.

Please have a read of [1].  Let us know if anything you have observed
isn't covered in that paper.

Andrew.

[1] Hans-Juergen Boehm. Threads cannot be implemented as a library. In
Proc. of the ACM SIGPLAN 2005 Conf. on Programming Language
Design and Implementation (PLDI), pages 261-268, Chicago, IL, June
2005.


RE: Optimization of conditional access to globals: thread-unsafe?

2007-10-22 Thread Dave Korn
On 22 October 2007 02:20, skaller wrote:

> On Mon, 2007-10-22 at 00:07 +0100, Dave Korn wrote:
> 
>>   If you really want all externally-visible accesses to v to be made
>> exactly as the code directs, rather than allowing gcc to optimise them in
>> any way that (from the program's POV) it's just the same 'as-if' they had
>> been done exactly, make v volatile.
> 
> That is not enough. Apart from the lack of ISO semantics for volatile,
> typically a compiler will take volatile as a hint to not hold
> values of the variable in a register.
> 
> On a multi-processor, this is not enough, because each CPU
> may still hold modified values in separate caches.

  Yes.  volatile's job is to make the compiler issue real memory load and
store operations when and where you say in the code.  Beyond that it's all up
to you, just like fflush doesn't guarantee the kernel/filesystem write-back
cache is emptied, only the C runtime library buffer.

> But I don't actually know what gcc does, although I guess
> it does nothing.

  Yep.

>  The OS has to do the right thing here
> when a mutex is locked etc, but the code for that is
> probably in the kernel which is better able to manage
> things like cache synchronisation than a compiler.

  The OS and the system libc together, yes.


cheers,
  DaveK
-- 
Can't think of a witty .sigline today



Re: Optimization of conditional access to globals: thread-unsafe?

2007-10-22 Thread Tomash Brechko
On Mon, Oct 22, 2007 at 11:19:31 +0100, Andrew Haley wrote:
> Please have a read of [1].  Let us know if anything you have observed
> isn't covered in that paper.
> 
> [1] Hans-Juergen Boehm. Threads cannot be implemented as a library. In
> Proc. of the ACM SIGPLAN 2005 Conf. on Programming Language
> Design and Implementation (PLDI), pages 261-268, Chicago, IL, June
> 2005.

Unfortunately I'm not lucky enough to have ACM access.  But from the
Abstract:

  We provide specific arguments that a pure library approach, in which
  the compiler is designed independently of threading issues, cannot
  guarantee correctness of the resulting code.


Can't agree less!  That's why for _practical_ reasons I'd say GCC
should be thread-aware, even if _theoretically_ it doesn't have to.
And AFAIU it already _is_, for the most part of it.  That's why I want
to see Bug#31862 be confirmed, accepted, and fixed.


-- 
   Tomash Brechko


RE: Optimization of conditional access to globals: thread-unsafe?

2007-10-22 Thread Dave Korn
On 22 October 2007 11:51, Tomash Brechko wrote:

> On Mon, Oct 22, 2007 at 11:19:31 +0100, Andrew Haley wrote:
>> Please have a read of [1].  Let us know if anything you have observed
>> isn't covered in that paper. 
>> 
>> [1] Hans-Juergen Boehm. Threads cannot be implemented as a library. In
>> Proc. of the ACM SIGPLAN 2005 Conf. on Programming Language
>> Design and Implementation (PLDI), pages 261-268, Chicago, IL, June
>> 2005.
> 
> Unfortunately I'm not lucky enough to have ACM access.


http://www.google.com/search?q=Threads+cannot+be+implemented+as+a+library&sourceid=mozilla-search&start=0&start=0&ie=utf-8&oe=utf-8&client=firefox-a&rls=org.mozilla:en-GB:official


cheers,
  DaveK
-- 
Can't think of a witty .sigline today



Re: Optimization of conditional access to globals: thread-unsafe?

2007-10-22 Thread Tomash Brechko
On Mon, Oct 22, 2007 at 14:50:44 +0400, Tomash Brechko wrote:
> Can't agree less!

Can't agree more!, that's what it was supposed to say, think you've
got it right ;).


-- 
   Tomash Brechko


RE: Optimization of conditional access to globals: thread-unsafe?

2007-10-22 Thread Dave Korn
On 22 October 2007 11:51, Tomash Brechko wrote:

> Can't agree less!  That's why for _practical_ reasons I'd say GCC
> should be thread-aware, even if _theoretically_ it doesn't have to.
> And AFAIU it already _is_, for the most part of it.  That's why I want
> to see Bug#31862 be confirmed, accepted, and fixed.


  Re that particular bug, there are grounds to say that if gcc is going to
implement a flag -fopenmp, it should try and generate threading-compatible
code, I agree, but your point:

"And the essence of this bug report is that gcc chooses to unconditionally
write to variables that are simply lexically mentioned but otherwise aren't
accessed during execution."

is simply something that you have no right to expect of the compiler in the C
language unless you use volatile.  Here's Jakub's original example:

"int var;
void
foo (int x)
{
  int i;
  for (i = 0; i < 100; i++)
    {
      if (i > x)
        var = i;
    }
}

When some other thread modifies var at the same time while foo (200) is
executed, the compiler inserted a race which doesn't really exist in the
original program, as it will do reg = var; ... var = reg; even when var was
never modified."
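
The transformation described in that report can be written out in C as
a sketch.  foo_as_written is the example above under a different name;
foo_as_transformed is a hand-written rendering of the "reg = var; ...
var = reg;" form the report mentions, not actual GCC output, and both
function names are made up for this illustration.

```c
int var;

/* The example as written: var is stored to only when i > x. */
void foo_as_written(int x)
{
    int i;
    for (i = 0; i < 100; i++) {
        if (i > x)
            var = i;
    }
}

/* Hand-written rendering of the register-promoted loop. */
void foo_as_transformed(int x)
{
    int i;
    int reg = var;              /* reg = var; */
    for (i = 0; i < 100; i++) {
        if (i > x)
            reg = i;
    }
    var = reg;                  /* var = reg; runs even if reg never changed */
}
```

In a single thread the two are indistinguishable.  With x = 200 the
first never touches var, while the second reads it and writes the same
value back, so a store made by another thread in between can be
overwritten with the stale value.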

  If var is volatile, the compiler won't do that, and it is I'm afraid the
right answer to the problem in this case: 'var' is inappropriately declared if
it is to be used from multiple threads in this way.  And even volatile
wouldn't help if the code said

  if (i > x)
    var += i;

instead of a simple assignment.  The race in fact *does* exist in the original
program, but is hidden by the fact that you don't care which of two operations
that overwrite the previous value complete in which order, but you're assuming
the operation that modifies var is atomic, and there's nothing to innately
guarantee that in the original program.  The race condition *is* already
there.

  There should really be a lock/unlock mutex sequence around the assignment to
var, but within the scope of the if condition.  And at that point you'd find
that gcc didn't hoist anything past the subroutine calls to the mutex
lock/unlock and so only the code path through the then-part of the if would
ever touch the variable at all.
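
A minimal sketch of that arrangement, assuming a mutex named var_mutex
(a hypothetical name, not part of the bug report) that every thread
touching var agrees to take:

```c
#include <pthread.h>

/* Hypothetical mutex guarding var; the name is an assumption. */
static pthread_mutex_t var_mutex = PTHREAD_MUTEX_INITIALIZER;
int var;

void foo(int x)
{
    int i;
    for (i = 0; i < 100; i++) {
        if (i > x) {
            /* Lock and unlock sit inside the then-part, so the path
             * where the condition is false makes no call and, as
             * written, no access to var at all. */
            pthread_mutex_lock(&var_mutex);
            var = i;
            pthread_mutex_unlock(&var_mutex);
        }
    }
}
```

The point of the sketch is the placement: the calls into the threading
library bracket the store itself rather than the whole loop.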


cheers,
  DaveK
-- 
Can't think of a witty .sigline today



Re: Optimization of conditional access to globals: thread-unsafe?

2007-10-22 Thread Andrew Haley
Tomash Brechko writes:
 > On Mon, Oct 22, 2007 at 11:19:31 +0100, Andrew Haley wrote:
 > > Please have a read of [1].  Let us know if anything you have observed
 > > isn't covered in that paper.
 > > 
 > > [1] Hans-Juergen Boehm. Threads cannot be implemented as a library. In
 > > Proc. of the ACM SIGPLAN 2005 Conf. on Programming Language
 > > Design and Implementation (PLDI), pages 261-268, Chicago, IL, June
 > > 2005.
 > 
 > Unfortunately I'm not lucky enough to have ACM access.  But from the
 > Abstract:

www.hpl.hp.com/techreports/2004/HPL-2004-209.pdf 

 >   We provide specific arguments that a pure library approach, in which
 >   the compiler is designed independently of threading issues, cannot
 >   guarantee correctness of the resulting code.
 > 
 > Can't agree less!  That's why for _practical_ reasons I'd say GCC
 > should be thread-aware, even if _theoretically_ it doesn't have to.

Well, that's a big job: you'd have to decide on what a memory model
really should be, and then implement that model.  The right approach
is surely to do this within the standardization bodies, which seems to
be the approach Hans Boehm is suggesting.  In the meantime, a prudent
programmer will make conservative assumptions and use volatile,
especially if they hope to write portable programs.

Andrew.


Re: Optimization of conditional access to globals: thread-unsafe?

2007-10-22 Thread Tomash Brechko
On Mon, Oct 22, 2007 at 11:54:47 +0100, Dave Korn wrote:
> http://www.google.com/search?q=Threads+cannot+be+implemented+as+a+library&sourceid=mozilla-search&start=0&start=0&ie=utf-8&oe=utf-8&client=firefox-a&rls=org.mozilla:en-GB:official


Thanks!


-- 
   Tomash Brechko


From SSA back to GIMPLE

2007-10-22 Thread Jose .
Hi all,

this is my first post in this mailing list. I'm trying to understand
GCC 4 as part of my research, but I'm finding questions which are
difficult to answer just with online documentation.

I understand that the whole process of compiling a C file involves
GENERIC->GIMPLE->SSA->GIMPLE->RTL

If I'm not wrong, GCC currently cannot go from SSA to RTL directly.
What I don't understand is what happens with all versions of the same
variable when doing the SSA->GIMPLE step. Are they mixed into a single
variable declaration? Are they treated as separate variables and
handled later by the register allocator?

Thanks in advance.
Jose.


Re: Optimization of conditional access to globals: thread-unsafe?

2007-10-22 Thread Tomash Brechko
On Mon, Oct 22, 2007 at 12:07:20 +0100, Dave Korn wrote:
> And even volatile wouldn't help if the code said
> 
>   if (i > x)
>     var += i;
> 
> instead of a simple assignment.  The race in fact *does* exist in the original
> program, but is hidden by the fact that you don't care which of two operations
> that overwrite the previous value complete in which order, but you're assuming
> the operation that modifies var is atomic, and there's nothing to innately
> guarantee that in the original program.  The race condition *is* already
> there.

Why?  For that example, if executed verbatim, either i > x is
always false, or the mutex is properly acquired.  No one is assuming
atomic update.



-- 
   Tomash Brechko


RE: Optimization of conditional access to globals: thread-unsafe?

2007-10-22 Thread Dave Korn
On 22 October 2007 12:17, Tomash Brechko wrote:

> On Mon, Oct 22, 2007 at 12:07:20 +0100, Dave Korn wrote:
>> And even volatile wouldn't help if the code said
>> 
>>   if (i > x)
>>     var += i;
>> 
>> instead of a simple assignment.  The race in fact *does* exist in the
>> original program, but is hidden by the fact that you don't care which of
>> two operations that overwrite the previous value complete in which order,
>> but you're assuming the operation that modifies var is atomic, and there's
>> nothing to innately guarantee that in the original program.  The race
>> condition *is* already there.
> 
> Why?  For that example, if executed verbatim, it is either i > x
> always false, or the mutex is properly acquired.  No one is assuming
> atomic update.

  *What* mutex are you referring to?  There is no mutex in that code.

cheers,
  DaveK
-- 
Can't think of a witty .sigline today



Re: Optimization of conditional access to globals: thread-unsafe?

2007-10-22 Thread Tomash Brechko
On Mon, Oct 22, 2007 at 12:08:02 +0100, Andrew Haley wrote:
> Well, that's a big job: you'd have to decide on what a memory model
> really should be, and then implement that model.

Wouldn't the following rule of thumb work?: GCC is allowed to inject
additional store operations on some execution path only if there are
explicit store operations (i.e. issued by the user code if read
verbatim).

The whole problem would vanish if the last store that GCC adds were
made conditional, like

   if (there_were_explicit_stores_already)
     store;

When execution does not get to basic blocks that have stores, GCC
shouldn't add any.
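
Applied by hand to the loop from the bug report, the rule might look
like the sketch below.  The function name foo_flagged and the variable
stored are hypothetical; stored plays the role of
there_were_explicit_stores_already.  This shows the shape of the
proposal only, not something GCC is known to generate.

```c
#include <stdbool.h>

int var;

void foo_flagged(int x)
{
    int i;
    int reg = 0;             /* only meaningful once stored is true */
    bool stored = false;     /* there_were_explicit_stores_already */

    for (i = 0; i < 100; i++) {
        if (i > x) {
            reg = i;         /* the store is kept in a register */
            stored = true;
        }
    }
    if (stored)              /* the final store is made conditional */
        var = reg;
}
```

With x >= 99 the condition is never true, the flag stays clear, and var
is neither read nor written.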


-- 
   Tomash Brechko


Re: Optimization of conditional access to globals: thread-unsafe?

2007-10-22 Thread Tomash Brechko
On Mon, Oct 22, 2007 at 12:19:40 +0100, Dave Korn wrote:
>   *What* mutex are you referring to?  There is no mutex in that code.

I was talking about the code in the comment#7.  For the code in the
comment#1, the piece is simply incomplete.  For it, mutex should be
used if x < 99, not clear if x >= 99.


-- 
   Tomash Brechko


Safe optimization flags for x86 processors.

2007-10-22 Thread numpszi

I don't want to spam, but I have an interesting program.
It is at http://procbench.sourceforge.net/
It is only for Linux; with pb_gcc (or pb_g++) you can compile and
execute programs with the best optimization flags. It helps
me a lot! (But sometimes it doesn't generate the best flags.)



RE: Optimization of conditional access to globals: thread-unsafe?

2007-10-22 Thread Dave Korn
On 22 October 2007 12:27, Tomash Brechko wrote:

> On Mon, Oct 22, 2007 at 12:19:40 +0100, Dave Korn wrote:
>>   *What* mutex are you referring to?  There is no mutex in that code.
> 
> I was talking about the code in the comment#7.  For the code in the
> comment#1, the piece is simply incomplete.  For it, mutex should be
> used if x < 99, not clear if x >= 99.

  Gotcha.  Well, the rule still is: if you want an exact one-to-one
relationship between assignments in your program and externally-visible memory
accesses, use volatile.  C is not a glorified assembler, it is an idealised
virtual machine implemented on the hardware of a real underlying host, and you
can't make assumptions about internal implementation details of that virtual
machine or the relationship between it and the real machine which is hosting
the code.  The optimisation the compiler is making here is a big win in normal
code, you wouldn't want to disable it unless absolutely necessary; to be
precise, you wouldn't want to automatically disable it for every loop and
variable in a program that used -fopenmp just because /some/ of the variables
in that program couldn't be safely accessed that way.

cheers,
  DaveK
-- 
Can't think of a witty .sigline today



Re: From SSA back to GIMPLE.

2007-10-22 Thread J.C. Pizarro
Jose wrote:
> Hi all,
>
> this is my first post in this mailing list. I'm trying to understand
> GCC 4 as part of my research, but I'm finding questions which are
> difficult to answer just with online documentation.
>
> I understand that the whole process of compiling a C file involves
> GENERIC->GIMPLE->SSA->GIMPLE->RTL
>
> If I'm not wrong, GCC currently cannot go from SSA to RTL directly.
> What I don't understand is what happens with all versions of the same
> variable when doing the SSA->GIMPLE step. Are they mixed into a single
> variable declaration? Are they treated as separate variables and
> handled later by the register allocator?
>
> Thanks in advance.
> Jose.

Is not it easy to write 3 stages GENERIC->GIMPLE->RTL instead of 5 stages?

Is meaningful the optimization of the complex bi-transformation
GIMPLE->SSA->GIMPLE?

Is more powerful GENERIC->GIMPLE->RTL + "trial-and-error" local optimization?

   Sincerely, J.C. Pizarro


RE: one question: tree-ssa vs no tree-ssa? no such global optimization exists.

2007-10-22 Thread Dave Korn
On 20 October 2007 16:40, J.C. Pizarro wrote:

> * Was it useful the implementation of the complicated tree-ssa code
> waited for long time (many years)?
> 
> * Was it better the optimization without tree-ssa code?

  Why in a style like Yoda these questions you are asking?

> If doesn't exist a method for global optimization to use tree-ssa then
> * why did not it implement the simplest trial-and-error method for
> local optimization (e.g. minima/maxima local) following the K.I.S.S.
> principle without tree-ssa code?

  Because 99% of the optimisations performed by gcc cannot be easily modeled
as a problem of local minima determination in a scalar field, and to do so
would be massively inefficient and overgeneralised compared to implementing
more specialised code to run the optimisation.

> IMHO,

  Your opinion is based on uninformed guesswork.  Why don't you try making the
change yourself and performing *measurements*.  This is science, not religion:
we make observations of the universe to find out what is the case, rather than
pronouncing that whatever we wish must somehow be true.

> There are other methods of search of minima/maxima local as Hill
> Climbing, Beam Search, Genetic Algorithms, Simulated Annealing, Tabu
> Search, A*, Alfa-Beta, Min-Max, Branch-and-Bound, Greedy, etc.

  Well done.  You've got a hammer.  That doesn't mean that every problem in
the world just suddenly turned into a nail.

> The extension with more optimization's features is more easy without
> tree-ssa code.

  Massively wrong, but you don't have to take my word for it: try it and see.

  A few limited areas of some of the optimisations could be usefully modelled
in this way - for instance, it might be a useful technique for making better
decisions about the costs and benefits of inlining a function, or of
combining insns whilst bearing in mind all the related costs; but that's only
helping you make the decision about whether or not to optimise in a particular
case, and you still need all the code that can parse the source under
compilation, spot opportunities for optimisation, and rearrange the IR into
the optimised form.  That's where all the real complexity lies, and that's
what SSA makes a whole lot more efficient, reliable, debuggable and
maintainable.

cheers,
  DaveK
-- 
Can't think of a witty .sigline today



Re: Safe optimization flags for x86 processors.

2007-10-22 Thread Manuel López-Ibáñez
On 22/10/2007, numpszi wrote:
>
> I don't want to spam, but i have an interesting program.
> It is on the http://procbench.sourceforge.net/
> It is only for linux, with pb_gcc (or pb_g++) you can execute,
> and compile programs with the best optimization flags. It helps
> me a lot! (But, sometimes it doesn't generates the best flags)

Hi,

People in gcc-help may also be interested in this. I have added it to the Wiki:

http://gcc.gnu.org/wiki/Links

Cheers,

Manuel.


Re: Optimization of conditional access to globals: thread-unsafe?

2007-10-22 Thread Tomash Brechko
On Mon, Oct 22, 2007 at 14:53:41 +0100, Dave Korn wrote:
> The optimisation the compiler is making here is a big win in normal
> code, you wouldn't want to disable it unless absolutely necessary;
> to be precise, you wouldn't want to automatically disable it for
> every loop and variable in a program that used -fopenmp just because
> /some/ of the variables in that program couldn't be safely accessed
> that way.

I'd rather wish the optimization would be done differently.  Currently
we have:

                                     mem -> reg;
   loop                              loop
     if (condition)  => optimize =>    if (condition)
       val -> mem;                       val -> reg;
                                     reg -> mem;


But it could use additional register and be:

   0 -> flag_reg;
   loop
     if (condition)
       val -> reg;
       1 -> flag_reg;
   if (flag_reg == 1)
     reg -> mem;


Note that by doing so we also eliminate all memory accesses when they
are not needed (when condition is never true), and memory bandwidth is
a major limiting factor nowadays.  Actually, for the very first code
piece of this thread I'd say that optimization


                                     mem -> reg;
   if (condition)   => optimize =>   if (condition)
     val -> mem;                       val -> reg;
                                     reg -> mem;

(there's no loop) is actually a counter-optimization even in
single-threaded case: we replace a branch, which surely has its costs,
with unconditional memory load and store, which cost much more.  Even
if branching would flush CPU pipeline even when jump destination is
already in the pipeline (is this the case?), memory load has its own
quite big cost plus the cost of flushing one line from the cache just
to perform single operation on mem.

So, why not use flag_reg and thus make GCC thread-aware for this case?
I read the article suggested by Andrew Haley, its main point is that
the compiler should be made thread-aware.  Making all shared objects
volatile is an overkill, and is more a trick rather than a solution.


-- 
   Tomash Brechko


Re: From SSA back to GIMPLE.

2007-10-22 Thread Paolo Bonzini

J.C. Pizarro wrote:

>> Are they mixed into a single
>> variable declaration? Are they treated as separate variables and
>> handled later by the register allocator?


If possible, the former.  If not possible, they are kept as separate 
variables.  This happens if the subscripted variables have overlapping 
live ranges because of optimizations that were made on the SSA form.
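
The two cases can be pictured with a pair of hand-made C sketches.
These are illustrations only, not GCC dumps: the x_1/x_2/x_3 names
mimic SSA version subscripts, and the function names are made up here.

```c
/* Case 1: versions that never overlap.  In SSA form this is roughly
 *   x_1 = a;  x_2 = b;  x_3 = PHI(x_1, x_2);  return x_3 + 1;
 * No two versions are live at the same time, so when leaving SSA all
 * of them can go back into the single variable x. */
int pick(int a, int b)
{
    int x;
    if (a > b)
        x = a;
    else
        x = b;
    return x + 1;
}

/* Case 2, source form: y is just a copy of the first value of x. */
int overlap_source(int a)
{
    int x = a;
    int y = x;
    x = x * 2;
    return y + x;
}

/* Case 2 after copy propagation on SSA: y is replaced by x_1, so x_1
 * is still live when x_2 is defined.  The two versions overlap and
 * have to remain separate variables when leaving SSA. */
int overlap_after_ssa(int a)
{
    int x_1 = a;
    int x_2 = x_1 * 2;
    return x_1 + x_2;
}
```

In the first function every version can share one declaration; in the
last one both values are needed at the return, so they cannot.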



> Is not it easy to write 3 stages GENERIC->GIMPLE->RTL instead of 5 stages?
>
> Is meaningful the optimization of the complex bi-transformation
> GIMPLE->SSA->GIMPLE?


I don't know what you mean, but yes, there is value in going to SSA and 
back.  SSA makes global optimization much easier, and that's the main 
improvement introduced in GCC 4.0 and later refined.  Of course not all 
optimizations benefit from SSA, some (such as OpenMP implementation) 
only benefit from having a high-level intermediate representation 
(GIMPLE).  But most do, in one way or the other.


In the future, GCC might instead do GENERIC->GIMPLE->SSA->RTL, without 
going back to GIMPLE.  But the SSA step is there to stay.  :-)


Paolo


Re: one question: tree-ssa vs no tree-ssa? no such global optimization exists.

2007-10-22 Thread Paolo Bonzini



>> * Was it useful the implementation of the complicated tree-ssa code
>> waited for long time (many years)?
>>
>> * Was it better the optimization without tree-ssa code?
>
>   Why in a style like Yoda these questions you are asking?


He not speaks like Yoda, uses the order for words that is in Spanish (no 
habla como Yoda, utiliza la misma disposicion para las palabras que es 
in espanol).


Which begs the question, why does Yoda put words in the same order you'd 
use in Spanish.


Paolo



RE: one question: tree-ssa vs no tree-ssa? no such global optimization exists.

2007-10-22 Thread Dave Korn
On 22 October 2007 15:37, Paolo Bonzini wrote:

>>> * Was it useful the implementation of the complicated tree-ssa code
>>> waited for long time (many years)?
>>> 
>>> * Was it better the optimization without tree-ssa code?
>> 
>>   Why in a style like Yoda these questions you are asking?
> 
> He not speaks like Yoda, uses the order for words that is in Spanish (no
> habla como Yoda, utiliza la misma disposicion para las palabras que es
> in espanol).

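  ¡Sí, lo reconocí!  Fue un juego con palabras.  El punto mas importante eran
lo que he dicho acerca la forma árbol-SSA del IR de GCC.
  (Yes, I recognised it!  It was a play on words.  The most important point
was what I said about the tree-SSA form of GCC's IR.)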

> Which begs the question, why does Yoda puts word in the same order you'd
> use in Spanish.

  ¿Porque es Español?  (Yoda, significo, no JC!)
  (Because he is Spanish?  Yoda, I mean, not JC!)

cheers,
  DaveK
-- 
Can't think of a witty .sigline today



Re: From SSA back to GIMPLE.

2007-10-22 Thread Zdenek Dvorak
Dear Mr. Pizarro,

> Is not it easy to write 3 stages GENERIC->GIMPLE->RTL instead of 5 stages?
> 
> Is meaningful the optimization of the complex bi-transformation
> GIMPLE->SSA->GIMPLE?
> 
> Is more powerful GENERIC->GIMPLE->RTL + "trial-and-error" local optimization?
> 
>Sincerely, J.C. Pizarro

everyone else here is too polite to tell it to you, but could you please
shut up, until:

-- you learn at least basics of English grammar (so that we can actually
   understand what you are saying), and
-- at least something about gcc and compilers in general (so that what
   you say makes some sense)?

While I was mildly annoyed by your previous "contributions" to the
discussion in the gcc mailing list, I could tolerate those.  But
answering a seriously meant question of a beginner by this confusing
and completely irrelevant drivel is another thing.

Sincerely,

Zdenek Dvorak


Re: From SSA back to GIMPLE.

2007-10-22 Thread David Edelsohn
Please keep the discussion on a technical level and not about
someone's fluency with the English language.

Gracias, David



Re: Optimization of conditional access to globals: thread-unsafe?

2007-10-22 Thread Michael Matz
Hi,

On Mon, 22 Oct 2007, Tomash Brechko wrote:

> On Mon, Oct 22, 2007 at 14:53:41 +0100, Dave Korn wrote:
> > The optimisation the compiler is making here is a big win in normal
> > code, you wouldn't want to disable it unless absolutely necessary;
> > to be precise, you wouldn't want to automatically disable it for
> > every loop and variable in a program that used -fopenmp just because
> > /some/ of the variables in that program couldn't be safely accessed
> > that way.
> 
> I'd rather wish the optimization would be done differently.  Currently
> we have:
> 
>                                         mem -> reg;
>   loop                                  loop
>     if (condition)    => optimize =>      if (condition)
>       val -> mem;                           val -> reg;
>                                         reg -> mem;
> 
> 
> But it could use additional register and be:
> 
>   0 -> flag_reg;
>   loop
>     if (condition)
>       val -> reg;
>       1 -> flag_reg;
>   if (flag_reg == 1)
>     reg -> mem;

That could be done but would be beside the point.  You traded one 
conditional store for another one, so you've gained nothing in that 
transformation.  The point of this transformation is precisely to get rid 
of that conditional store (enabling, for instance, other transformations 
such as easier store sinking).  That sometimes gains _much_ performance, so 
it is something we want to do in all cases where it's possible.  You really 
have to protect your data access itself, or make those data accesses 
volatile; there's no way around this.
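
A minimal C sketch of the "make those data accesses volatile" option,
for illustration only.  The names count_if_acquired and acquires_so_far
are hypothetical, and res merely plays the role of a trylock-style
result (0 meaning the lock was taken).  Note the assumption: volatile
stops the compiler from inventing or dropping accesses to the counter,
but it does not make the increment atomic, so the update still has to
run with the lock held.

```c
/* Shared counter.  Every read and write of a volatile object is an
   observable access, so the compiler may neither drop nor invent one. */
static volatile int acquires_count = 0;

/* res stands in for a trylock-style result: 0 means the lock was taken. */
int count_if_acquired(int res)
{
    if (res == 0)
        acquires_count = acquires_count + 1;  /* one volatile load, one volatile store */
    return res;
}

/* Read the counter back (a single volatile load). */
int acquires_so_far(void)
{
    return acquires_count;
}
```

When res is non-zero the function performs no access to the counter at
all, which is the property the thread is arguing about.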


Ciao,
Michael.


RE: Optimization of conditional access to globals: thread-unsafe?

2007-10-22 Thread Dave Korn
On 22 October 2007 17:16, Michael Matz wrote:


>> I'd rather wish the optimization would be done differently.  Currently we
>> have: 
>> 
>>                                         mem -> reg;
>>   loop                                  loop
>>     if (condition)    => optimize =>      if (condition)
>>       val -> mem;                           val -> reg;
>>                                         reg -> mem;
>> 
>> 
>> But it could use additional register and be:
>> 
>>   0 -> flag_reg;
>>   loop
>>     if (condition)
>>       val -> reg;
>>       1 -> flag_reg;
>>   if (flag_reg == 1)
>>     reg -> mem;
> 
> That could be done but would be besides the point.  You traded one
> conditional store with another one, so you've gained nothing in that
> transformation.  

  Not quite: he's hoisted it (lowered it? sunk it?) out of the bottom of the
loop, so the test/branch/store only occurs once, and inside the loop there's
no memory access at all (which should be faster even than a load-cmove-store
with hot caches and no branches...)


cheers,
  DaveK
-- 
Can't think of a witty .sigline today



Re: Optimization of conditional access to globals: thread-unsafe?

2007-10-22 Thread Tomash Brechko
On Mon, Oct 22, 2007 at 18:15:35 +0200, Michael Matz wrote:
> > I'd rather wish the optimization would be done differently.  Currently
> > we have:
> > 
> >                                         mem -> reg;
> >   loop                                  loop
> >     if (condition)    => optimize =>      if (condition)
> >       val -> mem;                           val -> reg;
> >                                         reg -> mem;
> > 
> > 
> > But it could use additional register and be:
> > 
> >   0 -> flag_reg;
> >   loop
> >     if (condition)
> >       val -> reg;
> >       1 -> flag_reg;
> >   if (flag_reg == 1)
> >     reg -> mem;
> 
> That could be done but would be besides the point.  You traded one 
> conditional store with another one, so you've gained nothing in that 
> transformation.

Rather I traded possibly many conditional stores in a loop with one
conditional store outside the loop.  And this exactly coincides with
the point of discussion: you can't go further, when you replace
conditional store with unconditional one, you introduce the race that
wasn't in the original code.
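
The three loop shapes being compared can be sketched in C.  This is only
an illustration: mem, store_count, store_mem and the "vals[i] > 0"
condition are hypothetical stand-ins, and the explicit register copy
mimics by hand what the optimizer is described as doing.

```c
#include <stddef.h>

static int mem;          /* stands in for the global the thread calls "mem" */
static int store_count;  /* illustration only: counts writes to mem */

static void store_mem(int val)
{
    mem = val;
    ++store_count;
}

/* As written in the source: a conditional store inside the loop. */
void loop_as_written(const int *vals, size_t n)
{
    for (size_t i = 0; i < n; ++i)
        if (vals[i] > 0)
            store_mem(vals[i]);
}

/* Shape after the optimization under discussion: mem is loaded once,
   the loop works on a register copy, and the copy is stored back
   unconditionally, even if the condition never held. */
void loop_unconditional_store(const int *vals, size_t n)
{
    int reg = mem;
    for (size_t i = 0; i < n; ++i)
        if (vals[i] > 0)
            reg = vals[i];
    store_mem(reg);
}

/* The flag_reg alternative: no memory access inside the loop and a
   single conditional store after it. */
void loop_flagged_store(const int *vals, size_t n)
{
    int reg = 0;
    int flag = 0;
    for (size_t i = 0; i < n; ++i)
        if (vals[i] > 0) {
            reg = vals[i];
            flag = 1;
        }
    if (flag)
        store_mem(reg);
}
```

For an input in which the condition never holds, the first and third
versions never write mem, while the second writes back the value it
loaded; that write is the store that was not in the original code.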

Several people have already suggested using volatile for shared data.
Yes, it will help, because we know it disables all access
optimizations, including thread-unaware ones.  But I don't want to
disable _all_ optimizations; I would rather vote for thread-aware
optimizations.  There is no requirement in POSIX to make all shared
data volatile.  As the article referenced in the thread explains,
there is no agreement between POSIX and C/C++ wrt memory access.  But
should it be fixed in the compiler (as the article suggests), or should
all shared data in every threaded program be declared volatile, just
in case?  I have never seen the latter approach in any Open Source
project (though I didn't look for it specifically), and many of them
are considered quite portable.

Again, we are not discussing some particular code sample, and how it
might be fixed, but the problem in general.  Should GCC do
thread-unsafe optimizations, or not?


-- 
   Tomash Brechko


Re: Optimization of conditional access to globals: thread-unsafe?

2007-10-22 Thread Andrew Haley
Tomash Brechko writes:

 > 
 > Several people already suggested to use volatile for shared data.
 > Yes, it will help because we know it will disable all access
 > optimizations, including thread-unaware ones.  But I don't want to
 > disable _all_ optimizations, I rather vote for thread-aware
 > optimizations.

But your plan would disable optimizations even when it isn't necessary
to do so.  Only a small part of the data in a multi-threaded program
are shared.

 > There is no requirement in POSIX to make all shared data volatile.
 > As the article referenced in the thread explains, there is no
 > agreement between POSIX and C/C++ wrt memory access.  But should it
 > be fixed in the compiler (as article suggests), or should every
 > shared data in every threaded program be defined volatile, just for
 > the case?  I never seen latter approach in any Open Source project
 > (though didn't look for it specifically), and many of them are
 > considered quite portable.
 > 
 > Again, we are not discussing some particular code sample, and how it
 > might be fixed, but the problem in general.  Should GCC do
 > thread-unsafe optimizations, or not?

We do understand what you're saying, and simply repeating the same
thing doesn't help.

I think we should wait to see what the C++ working group comes up with
and consider implementing that, rather than some ad-hoc gcc-specific
proposal.

There's some discussion here:

http://www.artima.com/cppsource/threads_meeting.html

and here:

http://www.hpl.hp.com/personal/Hans_Boehm/c++mm/

Andrew.


Re: Optimization of conditional access to globals: thread-unsafe?

2007-10-22 Thread Tomash Brechko
On Mon, Oct 22, 2007 at 18:33:37 +0100, Andrew Haley wrote:
> We do understand what you're saying, and simply repeating the same
> thing doesn't help.
> 
> I think we should wait to see what the C++ working group comes up with
> and consider implementing that, rather than some ad-hoc gcc-specific
> proposal.

Aha, but repeating worked.  This is the first time someone agrees that
the problem lies not entirely in the programmer's code.  Thank you!
:))


-- 
   Tomash Brechko


Re: Optimization of conditional access to globals: thread-unsafe?

2007-10-22 Thread Andrew Haley
Tomash Brechko writes:
 > On Mon, Oct 22, 2007 at 18:33:37 +0100, Andrew Haley wrote:
 > > We do understand what you're saying, and simply repeating the same
 > > thing doesn't help.
 > > 
 > > I think we should wait to see what the C++ working group comes up with
 > > and consider implementing that, rather than some ad-hoc gcc-specific
 > > proposal.
 > 
 > Aha, but repeating worked.  This is the first time someone agrees that
 > the problem lies not entirely in the programmer's code.  Thank you!
 > :))

Err, not exactly.  :)

See http://www.hpl.hp.com/personal/Hans_Boehm/c++mm/why_undef.html

Andrew.


RE: Optimization of conditional access to globals: thread-unsafe?

2007-10-22 Thread Dave Korn
On 22 October 2007 18:34, Andrew Haley wrote:

>  > Again, we are not discussing some particular code sample, and how it
>  > might be fixed, but the problem in general.  Should GCC do
>  > thread-unsafe optimizations, or not?
> 
> We do understand what you're saying, and simply repeating the same
> thing doesn't help.

  Well, just to answer the question at face value, "Yes, of course it should,
because 99.9% of the time the fact of their thread-safety or otherwise is
irrelevant".

  The interesting question of course is how we can get the compiler to
recognize that 0.1% when thread-safety is not just relevant but vital.  That
may still end up requiring some kind of annotation (like 'volatile'), but
defining a clear memory model should allow the compiler to make inferences and
deductions for itself that save the programmer a lot of the work of specifying
what data needs to be thread-safe and when and where.


cheers,
  DaveK
-- 
Can't think of a witty .sigline today



Re: Optimization of conditional access to globals: thread-unsafe?

2007-10-22 Thread skaller

On Mon, 2007-10-22 at 12:09 +0200, Erik Trulsson wrote:

> My own conclusion from this discussion (and others) is that shared memory is
> a lousy paradigm for communication between different threads of execution,
> precisely because it is so hard to specify exactly what should happen or not
> happen in various situations. 

In the abstract, this isn't really the case. Exactly the same problems
occur with message passing as with shared memory, for the trivial
reason that you can view memory reads as sending an address as
a message followed by a reply of the data (similarly for writes).

The theorists working on message passing have great fun with
algorithms to ensure proper ordering .. which is just the
same problem as cache synchronisation.
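
The "memory access as a message exchange" view can be sketched in C.
Everything here is hypothetical and single-threaded: the request and
reply structs, the memory_server function and the 8-cell store are
illustrative names only, and a real system would carry the messages
over a channel rather than a function call.

```c
#include <stddef.h>

enum msg_kind { MSG_READ, MSG_WRITE };

/* A request names an address; a write also carries the data. */
struct request {
    enum msg_kind kind;
    size_t addr;
    int data;
};

/* The reply carries the data now stored at that address. */
struct reply {
    int ok;
    int data;
};

#define CELLS 8
static int cells[CELLS];  /* memory owned solely by the "server" */

/* Handle one message: an out-of-range address is rejected, a write
   updates the cell, and both kinds reply with the cell's contents. */
struct reply memory_server(struct request req)
{
    struct reply rep = { 0, 0 };
    if (req.addr >= CELLS)
        return rep;
    if (req.kind == MSG_WRITE)
        cells[req.addr] = req.data;
    rep.ok = 1;
    rep.data = cells[req.addr];
    return rep;
}
```

Once requests from several senders can be reordered in flight, the
ordering questions are the same ones that arise with shared memory.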

The real difference is scoping: with shared memory it is
easy to accidentally fail to synchronise, but synchronisation
is easy. With processes and message passing, simple jobs
are trivial and all the communication is explicit, but for
complex interactions it is a lot of work and also 
can be extremely inefficient.

The big advantage of processes and message passing is the
potential to scale to the whole universe, whereas shared
memory abstracted across networks is likely to be
extremely slow and hard to reason about even if someone
actually implemented it.

Just as an example, Erlang is dynamically typed, purely
functional, and uses processes and message passing with
no ordering guarantees .. however it allows you to read
messages out of order. What this means is if you want
to synchronise .. you have to write code to actually do
it, eg a double handshake... shared memory systems
do that kind of thing directly in hardware, so you can
sometimes work at a much higher level.



-- 
John Skaller 
Felix, successor to C++: http://felix.sf.net


Re: Optimization of conditional access to globals: thread-unsafe?

2007-10-22 Thread Tomash Brechko
On Mon, Oct 22, 2007 at 18:48:02 +0100, Andrew Haley wrote:
> Err, not exactly.  :)
> 
> See http://www.hpl.hp.com/personal/Hans_Boehm/c++mm/why_undef.html

Why, I'd say that page is about original races in the program, not
about what compiler should do with races that it introduces itself.

Still, "let's wait and see" is probably the best outcome that I can
expect from this discussion, so thanks anyway. ;)


-- 
   Tomash Brechko


Re: Optimization of conditional access to globals: thread-unsafe?

2007-10-22 Thread Andi Kleen
Andrew Haley <[EMAIL PROTECTED]> writes:

> Tomash Brechko writes:
>
>  > 
>  > Several people already suggested to use volatile for shared data.
>  > Yes, it will help because we know it will disable all access
>  > optimizations, including thread-unaware ones.  But I don't want to
>  > disable _all_ optimizations, I rather vote for thread-aware
>  > optimizations.
>
> But your plan would disable optimizations even when it isn't necessary
> to do so.  Only a small part of the data in a multi-threaded program
> are shared.

At least for current x86 it is dubious the cmov change on memory was actually
an improvement.

-Andi


Re: Optimization of conditional access to globals: thread-unsafe?

2007-10-22 Thread skaller

On Mon, 2007-10-22 at 18:32 +0400, Tomash Brechko wrote:

> But it could use additional register and be:
> 
>   0 -> flag_reg;
>   loop
>     if (condition)
>       val -> reg;
>       1 -> flag_reg;
>   if (flag_reg == 1)
>     reg -> mem;
> 

> So, why not use flag_reg and thus make GCC thread-aware for this case?

Registers are a limited resource.

-- 
John Skaller 
Felix, successor to C++: http://felix.sf.net


Re: Optimization of conditional access to globals: thread-unsafe?

2007-10-22 Thread Andrew Pinski
On 10/22/07, skaller <[EMAIL PROTECTED]> wrote:
> Registers are a limited resource.

Everything is limited, some processors are more limited than others  :).
Seriously, I think this should be discussed in a language standards
committee area rather than inside GCC's development since right now GCC
is correct.  I don't want to limit GCC's output to "thread safe"
optimizations.

In fact any optimization that changes order of loads/stores is not
thread safe.  So you just disabled every high level optimization.

-- Pinski


Re: df_insn_refs_record's handling of global_regs[]

2007-10-22 Thread Seongbae Park (박성배, 朴成培)
Hi Dave,

On x86-64, no regression in 4.2 with the patch.
So both 4.2 and mainline patches are OK.

I'd appreciate it if you can add the testcase
- it's up to you whether to add it in a separate patch or with this patch.
Thanks for fixing it.

Seongbae

On 10/19/07, Seongbae Park (박성배, 朴成培) <[EMAIL PROTECTED]> wrote:
> On 10/19/07, David Miller <[EMAIL PROTECTED]> wrote:
> > From: "Seongbae Park (박성배, 朴成培)" <[EMAIL PROTECTED]>
> > Date: Fri, 19 Oct 2007 17:25:14 -0700
> >
> > > If you're not in a hurry, can you wait
> > > till I run the regtest against 4.2 on x86-64 ?
> > > I've already discussed the patch with Kenny
> > > and we agreed that this is the right approach,
> > > but I'd like to see the clean regtest on x86 for both 4.2 and 4.3
> > > before I approve.
> > > Thanks,
> >
> > I am in no rush, please let me know if you want some help
> > tracking down the failure you are seeing.
> >
> > Since you say it is a libgomp failure... I wonder if some of
> > the atomic primitives need some side effect markings which
> > are missing and thus exposed by not clobbering global regs
> > at call sites any more.
>
> It looks like it's just a flaky test - it randomly fails on my test machine
> with or without the patch (for interested, it's omp_parse3.f90  with -O0).
> I haven't started 4.2 testing yet - I'll let you know when I get that done.


Re: From SSA back to GIMPLE.

2007-10-22 Thread skaller

On Mon, 2007-10-22 at 16:32 +0200, Paolo Bonzini wrote:

> I don't know what you mean, but yes, there is value in going to SSA and 
> back.  SSA makes global optimization much easier, and that's the main 
> improvement introduced in GCC 4.0 and later refined.  

IMHO gcc was pretty crappy until 4.0. Now it generates good code.
SSA is a robust representation which allows strong assurances
that vagaries of the way the user wrote the source won't
interfere with generating good assembler.

Still .. the Felix generated version of Ackermann's function
outperforms the almost identical C code by almost 2:1 on
AMD64 .. I have no idea why, although I know what matters:
the number of words pushed onto the stack each recursion
is the only thing that actually affects performance.

Gcc pointlessly unrolls the recursion -- this has no effect
on the number of words pushed. It also makes a fairly serious
mistake, in that the recursion calls the externally visible
function, which is ABI compliant. It should generate a non-recursive
wrapper, and then use a recursive inner function which uses 
an optimal but not necessarily ABI compliant interface.
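
The wrapper arrangement suggested above could look roughly like this in
C.  The names ack_inner and ackermann are illustrative, and whether a
given GCC version actually picks a cheaper internal calling convention
for the static worker is an assumption, not something this message
establishes.

```c
/* Inner worker: static, so it is not visible outside this translation
   unit and the compiler is not tied to the public ABI when calling it. */
static unsigned long ack_inner(unsigned long m, unsigned long n)
{
    if (m == 0)
        return n + 1;
    if (n == 0)
        return ack_inner(m - 1, 1);
    return ack_inner(m - 1, ack_inner(m, n - 1));
}

/* Externally visible, ABI-compliant entry point.  It does not recurse
   itself; all recursion happens in the worker. */
unsigned long ackermann(unsigned long m, unsigned long n)
{
    return ack_inner(m, n);
}
```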

-- 
John Skaller 
Felix, successor to C++: http://felix.sf.net


RE: From SSA back to GIMPLE.

2007-10-22 Thread Dave Korn
On 22 October 2007 19:32, skaller wrote:

> On Mon, 2007-10-22 at 16:32 +0200, Paolo Bonzini wrote:
> 
>> I don't know what you mean, but yes, there is value in going to SSA and
>> back.  SSA makes global optimization much easier, and that's the main
>> improvement introduced in GCC 4.0 and later refined.
> 
> IMHO gcc was pretty crappy until 4.0. 

  You dare to besmirch the hallowed memory of 2.95.4?  Prepare to die!



 ;-)

cheers,
  DaveK
-- 
Can't think of a witty .sigline today



modified x86 ABI

2007-10-22 Thread David Taylor
At EMC we have a version of GCC which targets the x86 with a non
standard ABI -- it produces code for 64 bit mode, but with types
having the 32 bit ABI sizes.  So, ints, longs, and pointers are 32
bits -- that is, it's ILP32 rather than LP64 -- but with the chip in
64 bit mode.

Actually, pointers are somewhat schizophrenic -- software 32 bits,
hardware 64 bits.
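
For readers unfamiliar with the terms, a small C sketch of how the two
data models differ.  The enum and function names are hypothetical; the
classification only looks at the sizes of int, long and pointers, and
says nothing about how the EMC port itself is implemented.

```c
#include <stddef.h>

enum data_model { MODEL_ILP32, MODEL_LP64, MODEL_OTHER };

/* Classify a data model from the byte sizes of int, long and pointers:
   ILP32 has all three at 32 bits, LP64 widens long and pointers to 64. */
enum data_model classify_data_model(size_t int_bytes, size_t long_bytes,
                                    size_t ptr_bytes)
{
    if (int_bytes == 4 && long_bytes == 4 && ptr_bytes == 4)
        return MODEL_ILP32;
    if (int_bytes == 4 && long_bytes == 8 && ptr_bytes == 8)
        return MODEL_LP64;
    return MODEL_OTHER;
}

/* The model of whatever compiler and target build this file. */
enum data_model host_data_model(void)
{
    return classify_data_model(sizeof(int), sizeof(long), sizeof(void *));
}
```

Under the ABI described here, host_data_model() would be expected to
report ILP32 even though the processor runs in 64 bit mode.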

Currently the changes are against 3.4.6 and are not yet
``productized''.

If this set of changes was cleaned up, finished, and made relative to
top of trunk rather than relative to 3.4.6, would people be interested
in them?

Put another way, should I bother to post them to gcc-patches (probably
3-6 months out) for possible inclusion into gcc?

Thanks.

Later,

David


What is a regression?

2007-10-22 Thread Jason Merrill
I think that the release process for recent releases has given undue 
priority to bugs marked as regressions.  I agree that it's important for 
things that worked in the previous release to keep working in the new 
release.  But the regression tag is used for much more trivial things.


For instance, Bug 32252 is an ice-on-valid bug in a new C++ feature, 
variadic templates.  But since 4.2 gave a syntax error instead of an 
ICE, this gets marked as a regression.


This seems wrong to me.  We should only use the regression tag for 
things that worked properly in the previous release and fail in the new 
release.  A change from rejects-valid to ice-on-valid is an extremely 
low priority for me, and should not affect the release schedule.  I 
would like to remove the regression tag entirely from such bugs.


Similarly, bugs marked as 4.1/4.2/4.3 regression don't seem like a high 
priority to me.  If a bug wasn't a blocker for 4.2, it shouldn't be a 
blocker for 4.3.  It makes sense to give such a bug a higher priority 
than it would normally (say, one point higher), but it seems to me that 
only regressions relative to the previous release series should actually 
be considered for release timing.


Incidentally, how are priorities assigned to bugs?  I don't see any 
guidelines on the website.


Jason


Re: From SSA back to GIMPLE.

2007-10-22 Thread J.C. Pizarro
2007/10/22, David Edelsohn <[EMAIL PROTECTED]> wrote:
> Please keep the discussion on a technical level and not about
> someone's fluency with the English language.
>
> Gracias, David
>
>

Thanks David,

i'm very bad english speaker but i'm a good person.

If SSA was made to permit to eliminate prematurely dead-code and to
optimize partially the register allocation then

why is hard to optimize unrolling loop, inlining code, instructions
scheduling, etc because of the SSA's presence?

Don't forget, "Premature optimization is the root of all evil".

J.C. Pizarro


Re: From SSA back to GIMPLE.

2007-10-22 Thread J.C. Pizarro
2007/10/22, Paolo Bonzini <[EMAIL PROTECTED]> wrote:
> J.C. Pizarro wrote:
> >  Are they mixed into a single
> >> variable declaration? Are they treated as separate variables and
> >> handled later by the register allocator?
>
> If possible, the former.  If not possible, they are kept as separate
> variables.  This happens if the subscripted variables have overlapping
> live ranges because of optimizations that were made on the SSA form.

Please, to use temporaly the SSA form for the analysis of live ranges
of the variables , but don't use it to generate SSA-optimized code.

>
> > Is not it easy to write 3 stages GENERIC->GIMPLE->RTL instead of 5 stages?
> >
> > Is meaningful the optimization of the complex bi-transformation
> > GIMPLE->SSA->GIMPLE?
>
> I don't know what you mean, but yes, there is value in going to SSA and
> back.  SSA makes global optimization much easier, and that's the main
> improvement introduced in GCC 4.0 and later refined.  Of course not all
> optimizations benefit from SSA, some (such as OpenMP implementation)
> only benefit from having a high-level intermediate representation
> (GIMPLE).  But most do, in one way or the other.

Wrong, SSA doesn't make global optimization much easier, no such
global optimization exists for SSA, and SSA makes harder the
reincorporation of others kinds of specific optimizers that they need
to optimize still more.

> In the future, GCC might instead do GENERIC->GIMPLE->SSA->RTL, without
> going back to GIMPLE.  But the SSA step is there to stay.  :-)

In the future, GCC will no be the best compiler, the best compiler
could be a powerful compiler with inferences's machines, learning
machines, logic machines, etc where the men don't think in the
specific algorithms.

>
> Paolo
>

J.C. Pizarro


Re: From SSA back to GIMPLE.

2007-10-22 Thread J.C. Pizarro
2007/10/22, Zdenek Dvorak <[EMAIL PROTECTED]> wrote:
> Dear Mr. Pizzaro,
>
> > Is not it easy to write 3 stages GENERIC->GIMPLE->RTL instead of 5 stages?
> >
> > Is meaningful the optimization of the complex bi-transformation
> > GIMPLE->SSA->GIMPLE?
> >
> > Is more powerful GENERIC->GIMPLE->RTL + "trial-and-error" local 
> > optimization?
> >
> >Sincerely, J.C. Pizarro
>
> everyone else here is too polite to tell it to you, but could you please
> shut up, until:
>
> -- you learn at least basics of English grammar (so that we can actually
>understand what you are saying), and
> -- at least something about gcc and compilers in general (so that what
>you say makes some sense)?
>
> While I was mildly annoyed by your previous "contributions" to the
> discussion in the gcc mailing list, I could tolerate those.  But
> answering a seriously meant question of a beginner by this confusing
> and completely irrelevant drivel is another thing.
>
> Sincerely,
>
> Zdenek Dvorak
>

Dear Zdenek Dvorak,

Why have i to shut up? Is it an order?

What is the big problem when i talk about the weakness of such thing?

Ahhh, it's a problem, then i shut up to solve this weakness's problem.

Do you want it Dvorak?

No problem man.

   J.C. Pizarro


Re: From SSA back to GIMPLE.

2007-10-22 Thread David Edelsohn
> J C Pizarro writes:

JC> In the future, GCC will no be the best compiler, the best compiler
JC> could be a powerful compiler with inferences's machines, learning
JC> machines, logic machines, etc where the men don't think in the
JC> specific algorithms.

There are a few research efforts that wish to experiment with
some of these technique, using GCC as a base.

David




Re: From SSA back to GIMPLE.

2007-10-22 Thread Jamie Lokier
Dave Korn wrote:
> On 22 October 2007 19:32, skaller wrote:
> 
> > On Mon, 2007-10-22 at 16:32 +0200, Paolo Bonzini wrote:
> > 
> >> I don't know what you mean, but yes, there is value in going to SSA and
> >> back.  SSA makes global optimization much easier, and that's the main
> >> improvement introduced in GCC 4.0 and later refined.
> > 
> > IMHO gcc was pretty crappy until 4.0. 
> 
>   You dare to besmirch the hallowed memory of 2.95.4?  Prepare to die!

Sadly, some of us are still using 2.95.3 for compatibility reasons...

-- Jamie


Re: From SSA back to GIMPLE.

2007-10-22 Thread J.C. Pizarro
2007/10/22, David Edelsohn <[EMAIL PROTECTED]> wrote:
> > J C Pizarro writes:
>
> JC> In the future, GCC will no be the best compiler, the best compiler
> JC> could be a powerful compiler with inferences's machines, learning
> JC> machines, logic machines, etc where the men don't think in the
> JC> specific algorithms.
>
> There are a few research efforts that wish to experiment with
> some of these technique, using GCC as a base.
>
> David

IMHO, in the future, GCC as a base an experimal compiler IS NOT good
because of enormeous complexities to design this optimizing compiler.

My reasons to select a good base are:

* the programming language to develop a complex optimizing compiler
MUST TO be high-level, more declarative than machine-imperative, OO,
GC'ed, polymorphic and easier to interact with A.I. agents (e.g. ala
ants colonies) to go storing better rules in the databases (to reuse
them later).

* the C programming language that is used to develop GCC is not
following above these principles.

   J.C. Pizarro


Re: From SSA back to GIMPLE.

2007-10-22 Thread Jamie Lokier
J.C. Pizarro wrote:
> IMHO, in the future, GCC as a base an experimal compiler IS NOT good
> because of enormeous complexities to design this optimizing compiler.
> 
> My reasons to select a good base are:
> 
> * the programming language to develop a complex optimizing compiler
> MUST TO be high-level, more declarative than machine-imperative, OO,
> GC'ed, polymorphic and easier to interact with A.I. agents (e.g. ala
> ants colonies) to go storing better rules in the databases (to reuse
> them later).
> 
> * the C programming language that is used to develop GCC is not
> following above these principles.

If you have a sufficiently good code-aware AI system (e.g. ants
colonies?!), then you can use it to _translate_ and _extract_ all the
interesting bits of GCC into its database, or even into your preferred
high-level language, and discard the fluff like C syntax and obsolete
rules.

If your AI isn't that good, I question whether it's good enough to do
the job you want it to :-)

-- Jamie


Re: From SSA back to GIMPLE.

2007-10-22 Thread Joe Buck
On Mon, Oct 22, 2007 at 09:48:24PM +0200, J.C. Pizarro wrote:
> why is hard to optimize unrolling loop, inlining code, instructions
> scheduling, etc because of the SSA's presence?

There's nothing about SSA that makes any of those things harder.

In any case, the use of SSA is fairly fundamental to GCC's current
architecture.  You have a habit of just showing up and telling
people to throw out all their work and start over, based on no
evidence.  That's why you often get a hostile reception.


gcc-4.1-20071022 is now available

2007-10-22 Thread gccadmin
Snapshot gcc-4.1-20071022 is now available on
  ftp://gcc.gnu.org/pub/gcc/snapshots/4.1-20071022/
and on various mirrors, see http://gcc.gnu.org/mirrors.html for details.

This snapshot has been generated from the GCC 4.1 SVN branch
with the following options: svn://gcc.gnu.org/svn/gcc/branches/gcc-4_1-branch 
revision 129563

You'll find:

gcc-4.1-20071022.tar.bz2  Complete GCC (includes all of below)

gcc-core-4.1-20071022.tar.bz2 C front end and core compiler

gcc-ada-4.1-20071022.tar.bz2  Ada front end and runtime

gcc-fortran-4.1-20071022.tar.bz2  Fortran front end and runtime

gcc-g++-4.1-20071022.tar.bz2  C++ front end and runtime

gcc-java-4.1-20071022.tar.bz2 Java front end and runtime

gcc-objc-4.1-20071022.tar.bz2 Objective-C front end and runtime

gcc-testsuite-4.1-20071022.tar.bz2  The GCC testsuite

Diffs from 4.1-20071015 are available in the diffs/ subdirectory.

When a particular snapshot is ready for public consumption the LATEST-4.1
link is updated and a message is sent to the gcc list.  Please do not use
a snapshot before it has been announced that way.


Re: What is a regression?

2007-10-22 Thread David Miller
From: Jason Merrill <[EMAIL PROTECTED]>
Date: Mon, 22 Oct 2007 15:42:50 -0400

> For instance, Bug 32252 is an ice-on-valid bug in a new C++ feature,
> variadic templates.  But since 4.2 gave a syntax error instead of an
> ICE, this gets marked as a regression.

I agree that the regression marker is questionable.

But wouldn't you agree that it's not all that great to ship a new
feature in GCC that users have already found ways to ICE?

The flip side of the coin is that the user has an equal chance
as before to deal with the situation, by not using the feature.

However, the difference that I see as important here is that what
was before a lack of functionality issue is now a quality issue.
And therefore we should really fix the ICE.


Re: df_insn_refs_record's handling of global_regs[]

2007-10-22 Thread David Miller
From: "Seongbae Park (박성배, 朴成培)" <[EMAIL PROTECTED]>
Date: Mon, 22 Oct 2007 11:31:18 -0700

> On x86-64, no regression in 4.2 with the patch.
> So both 4.2 and mainline patches are OK.

Thank you for doing this extra regression testing.

> I'd appreciate it if you can add the testcase
> - it's up to you whether to add it in a separate patch or with this patch.

I will work on this, good idea.

> Thanks for fixing it.

No problem, thanks for reviewing and testing!


GCC 4.1.1 unwind support for arm-none-linux-gnueabi

2007-10-22 Thread Franklin
Hi, list.

Right now I'm building a new toolchain using an old one provided by our vendor.  I 
have built binutils and gcc-4.1.1 successfully.  However, while building 
glibc-2.4 it always told me:

running configure fragment for nptl/sysdeps/pthread
checking for forced unwind support... no
configure: error: forced unwind support is required

I saw that gcc can be built with unwind support, but that it is turned off by
default.  In gcc/Makefile.in:

# Don't build libunwind by default.
LIBUNWIND =
LIBUNWINDDEP =
SHLIBUNWIND_LINK =
SHLIBUNWIND_INSTALL =

I tried to get the libunwind library source to build a ``system library'', but it 
does not support the ARM platform.

Could anyone please give me some hints?  I need pthreads support in glibc, but 
now I am stuck on this unwind issue.  Any suggestion is appreciated.

Thanks.


Regards,
Franklin




builtin_frame_address for stack pointer

2007-10-22 Thread skaller
hi, I have some code using __builtin_frame_address(0) to get
the current stack pointer in a 'portable' way.

Unfortunately, this appears not to work if -fomit-frame-pointer
is used on an x86. My system sets that automatically, since
the x86 is a bit short on registers and this is reputed
to help with optimisation.

This leaves one having to do processor dependent register
hackery to find the stack pointer. [I need a stack pointer
approximation to find roots for a garbage collector]
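
For reference, a hedged sketch of the two approaches in play: the GCC
builtin mentioned above, and the common fallback of using the address
of a local variable as an approximation of the stack pointer.  The
function names are hypothetical, and the second function deliberately
uses the local's address inside the function rather than returning it.

```c
#include <stdint.h>

/* GCC builtin: level 0 yields the frame address of the function that
   executes it.  Here that is this wrapper's own frame, so a real
   collector would more likely invoke the builtin at the point of
   interest instead of going through a helper. */
void *current_frame_address(void)
{
    return __builtin_frame_address(0);
}

/* Fallback: the address of a local approximates the current stack
   pointer.  Returns the distance in bytes between that local and a
   stack address supplied by the caller. */
uintptr_t stack_span_from(uintptr_t base)
{
    volatile char marker = 0;
    uintptr_t here = (uintptr_t)&marker;
    return here > base ? here - base : base - here;
}
```

A conservative collector could record such an address near the start
of main and scan from it to a fresh local in the function that
triggers a collection.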

Is there another way? If not, I am thinking of submitting a bug report
with 'feature request' status, to have a way to get the stack
pointer that works even if -fomit-frame-pointer is specified.
Any comments? Is it a reasonable request?

BTW: what happens on ia64 which has two? stacks?

-- 
John Skaller 
Felix, successor to C++: http://felix.sf.net


Re: What is a regression?

2007-10-22 Thread skaller

On Mon, 2007-10-22 at 15:42 -0400, Jason Merrill wrote:
> I think that the release process for recent releases has given undue 
> priority to bugs marked as regressions.  I agree that it's important for 
> things that worked in the previous release to keep working in the new 
> release.  But the regression tag is used for much more trivial things.
> 
> For instance, Bug 32252 is an ice-on-valid bug in a new C++ feature, 
> variadic templates.  But since 4.2 gave a syntax error instead of an 
> ICE, this gets marked as a regression.
> 
> This seems wrong to me.  We should only use the regression tag for 
> things that worked properly in the previous release and fail in the new 
> release. 

But Jason, the compiler worked properly in rejecting invalid syntax.
Now you're suggesting it fails to do so. This suggests a real regression
and a real bug: the new feature should have an enabling flag
that couldn't have been set before it was implemented, and without
that flag should create the same error in the current version.

Not arguing against your point in general but this particular
case appears to be mishandled and the regression genuine.

BTW: did WG21 already pass this proposal?

-- 
John Skaller 
Felix, successor to C++: http://felix.sf.net


Re: What is a regression?

2007-10-22 Thread Andrew MacLeod

Jason Merrill wrote:


> Similarly, bugs marked as 4.1/4.2/4.3 regression don't seem like a 
> high priority to me.  If a bug wasn't a blocker for 4.2, it shouldn't 
> be a blocker for 4.3.  It makes sense to give such a bug a higher 
> priority than it would normally (say, one point higher), but it seems 
> to me that only regressions relative to the previous release series 
> should actually be considered for release timing.




I think this is a very important point.  If it didn't block a previous 
release, it shouldn't block the current release. It doesn't mean it 
shouldn't get looked at, but it also shouldn't be a blocker.  I think 
the high priority regressions should be ones that are new to 4.3 because 
they have clearly been either introduced or exposed by this release and 
need to be dealt with. 


Andrew




Re: What is a regression?

2007-10-22 Thread Jason Merrill

David Miller wrote:
> But wouldn't you agree that it's not all that great to ship a new
> feature in GCC that users have already found ways to ICE?

Oh, absolutely, it's just not a regression.

Jason



Re: What is a regression?

2007-10-22 Thread Jason Merrill

skaller wrote:
> But Jason, the compiler worked properly in rejecting invalid syntax.
> Now you're suggesting it fails to do so.  This suggests a real regression
> and a real bug: the new feature should have an enabling flag
> that couldn't have been set before it was implemented, and without
> that flag should create the same error in the current version.

The current version gives an error "ISO C++ does not include variadic
templates" unless you enable C++0x mode.  And then crashes.

But in any case, nobody has code that relies on getting an error from a
previous version of the compiler that would be broken by moving to 4.3.
Only regressions on valid code seem serious enough to me to warrant
blocking a release.

> BTW: did WG21 already pass this proposal?

Yes, a few meetings back.

Jason



Re: What is a regression?

2007-10-22 Thread Andrew Pinski
On 10/22/07, Jason Merrill <[EMAIL PROTECTED]> wrote:
> David Miller wrote:
>
> > But wouldn't you agree that it's not all that great to ship a new
> > feature in GCC that users have already found ways to ICE?
>
> Oh, absolutely, it's just not a regression.

Except it is still ICEing.  There was a discussion before (I cannot
find it right now) that settled on treating the following changes as
regressions:
  anything to ICE (even after an error, though not a blocking one)
  works to wrong code
  rejects invalid to accepts valid
  accepts invalid to rejects valid
  diagnostic regressions

And there was a previous discussion about setting the priority too
(the release manager is the only one who should be setting it).

Thanks,
Andrew Pinski


Re: builtin_frame_address for stack pointer

2007-10-22 Thread Andrew Pinski
On 10/22/07, skaller <[EMAIL PROTECTED]> wrote:
> Unfortunately, this appears not to work if -fomit-frame-pointer
> is used on an x86.

What version of GCC?  Since this was fixed for 4.1.0, see
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=8335 .

Thanks,
Andrew Pinski


Re: What is a regression?

2007-10-22 Thread Andrew Pinski
On 10/22/07, Andrew MacLeod <[EMAIL PROTECTED]> wrote:
> I think this is a very important point.  If it didn't block a previous
> release, it shouldn't block the current release.

Yes it is, but if a bug that has only just been found to be a
regression would be considered a blocking one, it should block the
current release.  Likewise for all new incoming regressions.  Just
because we did not know about a regression before should not stop it
from being high priority.

Thanks,
Andrew Pinski


Re: What is a regression?

2007-10-22 Thread skaller

On Tue, 2007-10-23 at 00:00 -0400, Jason Merrill wrote:
> skaller wrote:
> > But Jason, the compiler worked properly in rejecting invalid syntax.
> > Now you're suggesting it fails to do so.  This suggests a real regression
> > and a real bug: the new feature should have an enabling flag
> > that couldn't have been set before it was implemented, and without
> > that flag should create the same error in the current version.
> 
> The current version gives an error "ISO C++ does not include variadic 
> templates" unless you enable C++0x mode.  And then crashes.

I see. So this is unpleasant, but technically it has met the
ISO conformance requirements in issuing the diagnostic.
A quick fix would be to abort() the compiler to prevent the ICE
in released versions (but developers would test with the abort
commented out whilst trying to get the implementation working).

> But in any case, nobody has code that relies on getting an error from a 
> previous version of the compiler that would be broken by moving to 4.3. 

Well it seems you're right given that 4.3 DOES issue a diagnostic,
so I guess I'd agree with you now given that data.

>   Only regressions on valid code seem serious enough to me to warrant 
> blocking a release.

I still think that is too strong a position. A good fraction
of compiler time is spent bugging out user code.. one could
even say the job of a compiler is not generating machine code,
but telling programmers they're idiots :)

Still, if you were to use "ISO C++ conformance model" as the
criterion, you could probably weaken your position and still
reduce the number of blocking "regressions": an ICE in the
case of experimental code could be allowed PROVIDED there
is a diagnostic WHEN one is actually required.

In fact, that may be TOO weak: people do rely on QOI diagnostics
beyond those required by ISO.

> > BTW: did WG21 already pass this proposal?
> 
> Yes, a few meetings back.

Not my preferred model but better than nothing.

-- 
John Skaller 
Felix, successor to C++: http://felix.sf.net


Re: What is a regression?

2007-10-22 Thread Ian Lance Taylor
Jason Merrill <[EMAIL PROTECTED]> writes:

> I think that the release process for recent releases has given undue
> priority to bugs marked as regressions.  I agree that it's important
> for things that worked in the previous release to keep working in the
> new release.  But the regression tag is used for much more trivial
> things.

We had a discussion of these sorts of issues at the GCC mini-summit
back in April.  We didn't come to any conclusions.  But there are some
notes here:

http://gcc.gnu.org/ml/gcc/2007-04/msg00676.html


The goal is presumably: how can we produce the highest quality
release?  I agree that our single-minded focus on regressions is
misleading.  For example, it might be better to produce a more nuanced
list of bugs which must be fixed, which should be fixed, and which we
would like to fix.  Then we can set numeric targets for each level, to
be achieved before the release.  However, this would require a lot
more effort.  It would be a lot of work for one person--and who would
that person be?  If many people did it, then how would we keep
consistency?

Ian


Re: builtin_frame_address for stack pointer

2007-10-22 Thread skaller

On Mon, 2007-10-22 at 22:58 -0700, Andrew Pinski wrote:
> On 10/22/07, skaller <[EMAIL PROTECTED]> wrote:
> > Unfortunately, this appears not to work if -fomit-frame-pointer
> > is used on an x86.
> 
> What version of GCC?  Since this was fixed for 4.1.0, see
> http://gcc.gnu.org/bugzilla/show_bug.cgi?id=8335 .
> 
> Thanks,
> Andrew Pinski

Ah .. I do not know, it is not me that is having this problem
but another developer. I'll ask. Thanks for the info.

BTW: you say this is fixed, I read the bug info but am still
unclear about the resolution.. will __builtin_frame_address(0)
always return an approximation to the stack pointer now,
even if -fomit-frame-pointer is set?
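
For concreteness, here is a minimal sketch of the kind of probe in
question, using the builtin named in the subject line,
__builtin_frame_address.  The function names and the noinline
attribute are illustrative assumptions, not anything taken from the
bug report, and the sketch does not by itself settle how a given GCC
version behaves under -fomit-frame-pointer; that would have to be
checked by building it both with and without the flag.

```c
#include <stddef.h>
#include <stdio.h>

/* Hypothetical helper: returns the frame address of this function.
   noinline (a GCC extension) keeps the call from being folded into
   the caller, so the value describes this function's own frame. */
__attribute__((noinline))
void *current_frame_address(void)
{
    /* Level 0 asks for the frame of the current function. */
    return __builtin_frame_address(0);
}

/* Hypothetical helper: prints the frame address next to the address
   of a local variable, so the two can be compared by eye. */
__attribute__((noinline))
void report_frame_address(FILE *out)
{
    volatile int local = 0;

    fprintf(out, "frame address: %p\n", __builtin_frame_address(0));
    fprintf(out, "local address: %p\n", (void *)&local);
}
```

Calling report_frame_address(stdout) from main() and comparing the
two printed addresses across builds gives a rough idea of how close
the returned value is to the stack pointer on a particular target.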



-- 
John Skaller 
Felix, successor to C++: http://felix.sf.net