df_insn_refs_record's handling of global_regs[]

2007-10-16 Thread David Miller

I have a bug I'm trying to investigate where, starting in gcc-4.2.x,
the loop invariant pass considers a computation involving a global
register variable as invariant across a call.  The basic structure
of the code is:

register unsigned long regvar asm ("foo");

func(arg)
{
  for (...) {
    call();
    *arg = expression(regvar);
  }
}

The code is built with "-ffixed-foo".  Actually the specific
case is 64-bit sparc, and the register being used is "%g5"
so global_regs[%g5], call_used_regs[%g5] and fixed_regs[%g5]
will all be true.

loop-invariant.c decides that expression(regvar) is invariant
and can be moved outside of the loop, which is an illegal
transformation because calls clobber global register variables.
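
To make the problem concrete, here is a small portable C model of the situation. It is only a sketch under stated assumptions: an ordinary global, regvar_model, stands in for the global register variable, and clobbering_call is a hypothetical stand-in for call(). Neither name comes from the real code, and the "expression" is simply a multiplication by two.

```c
#include <assert.h>

/* Stand-in for the global register variable (assumption: a plain global
   is enough to model "any call may change this value").  */
static unsigned long regvar_model;

/* Hypothetical callee: like any call, it may modify the global.  */
static void clobbering_call(void)
{
    regvar_model += 1;
}

/* Correct loop: regvar_model is re-read after every call.  */
unsigned long loop_reread(int iterations)
{
    assert(iterations >= 0);
    unsigned long last = 0;
    regvar_model = 0;
    for (int i = 0; i < iterations; i++) {
        clobbering_call();
        last = regvar_model * 2;   /* models expression(regvar) */
    }
    return last;
}

/* What the bad hoist effectively does: evaluate the expression once,
   before the loop, and reuse the stale result on every iteration.  */
unsigned long loop_hoisted(int iterations)
{
    assert(iterations >= 0);
    unsigned long last = 0;
    regvar_model = 0;
    unsigned long hoisted = regvar_model * 2;   /* moved out of the loop */
    for (int i = 0; i < iterations; i++) {
        clobbering_call();
        last = hoisted;
    }
    return last;
}
```

With three iterations the two functions disagree (6 versus 0 in this model), which is why the expression cannot be treated as loop invariant.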

When I noticed this I said to myself, "Hmmm, that's peculiar..."

I then checked to see what changed in the loop invariant pass between
gcc-4.1.x and gcc-4.2.x, and mainly it was changed to use the dataflow
layer.

So I started looking at how the dataflow layer handles global register
variables wrt. calls.

When a CALL is encountered, df_insn_refs_record() only marks them as
used by going:

for (i = 0; i < FIRST_PSEUDO_REGISTER; i++)
  if (global_regs[i])
    df_uses_record (...

I thought initially that this code should be using "df_defs_record
(...", but the next few lines take care of that issue by iterating
over df_invalidated_by_call, which is initialized with the contents
of regs_invalidated_by_call.  I validated that the global register
in use is included in the invalidation list by making some df_dump()
calls in a debugging session.

So the dataflow problem should show the loop-invariant pass that the
call inside the loop potentially changes 'regvar' and thus expressions
involving 'regvar' are not invariant.  But for some reason that isn't
happening.

I'm wondering if there is a bad interaction between how df-scan.c
cooks up these fake insn "locations" for things like
df_invalidated_by_call and how loop-invariant.c uses them.

In such cases, DF_REF_LOC will be &regno_reg_rtx[], and
loop-invariant.c will:

1) Record the DF_REF_LOCs, via record_uses(), in the ->pos
   member of its "struct use" data-structure.

2) Later write directly to *u->pos in move_invariant_reg().

But perhaps we shouldn't even get that far with such fake
location expressions.

I'm trying to debug this further, but if anyone sees anything obvious
or has some suggestions for debugging this, let me know.

Thanks.



gcc/doc/md.texi buglet?

2007-10-16 Thread Thomas Sailer
md.texi of mainline as of now states at line 4451ff:

@cindex @code{add@var{mode}cc} instruction pattern
@item @samp{add@var{mode}cc}
Similar to @samp{mov@var{mode}cc} but for conditional addition.  Conditionally
move operand 2 or (operands 2 + operand 3) into operand 0 according to the
comparison in operand 1.  If the comparison is true, operand 2 is moved into
operand 0, otherwise (operand 2 + operand 3) is moved.

Now isn't it the other way round?  That is, if the condition is true,
operand 2 + operand 3 is stored, and operand 2 otherwise?  That's at least
how I read the code in ifcvt.c, and that's what my experiments with that
pattern showed.
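
The two readings can be written down as a pair of C functions. This is only an illustrative sketch: the function names are made up here, neither is taken from GCC, and operands are modelled as plain long values.

```c
#include <stdbool.h>

/* Semantics as the quoted md.texi text describes them:
   true -> operand 2, false -> operand 2 + operand 3.  */
long addcc_as_documented(bool cond, long op2, long op3)
{
    return cond ? op2 : op2 + op3;
}

/* Semantics as read from ifcvt.c and seen in the experiments above:
   true -> operand 2 + operand 3, false -> operand 2.  */
long addcc_as_observed(bool cond, long op2, long op3)
{
    return cond ? op2 + op3 : op2;
}
```

For any nonzero operand 3 the two functions give different results for the same condition, so the documentation and the implementation cannot both be right.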

Tom




Machine dependent Tree optimization?

2007-10-16 Thread Bingfeng Mei
Hello,
I am working on porting GCC 4.2.1 to our VLIW processor.  Our No. 1
priority is code size.  I noticed the following code generation:

Source code:

  if (a == 0x1ff)
    c = a + b;
  return c;


After tree copy propagation:


foo (a, b, c)
{
:
  if (a_2 == 511) goto ; else goto ;

:;
  c_5 = b_4 + 511;

  # c_1 = PHI ;
:;
  return c_1;

}

It will generate the following assembly code for our processor:

tstieqw p0, r0, #0x1ff    // Compare r0 with 0x1ff and write result to a predicate
p0. addwi r2, r1, #0x1ff  // Predicated add
sbl [link]  :   movw r8, r2


In our processor, "p0. addwi r2, r1, #0x1ff" is a long instruction
(64-bit).

Ideally, I don't want this copy propagation if the immediate is outside
a certain range.  Then it would generate the following code:

tstieqw p0, r0, #0x1ff    // Compare r0 with 0x1ff and write result to a predicate
p0. addw r2, r1, r0       // Predicated add (32-bit instruction)
sbl [link]  :   movw r8, r2

It is going to save us four bytes. 
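
The size arithmetic can be sketched in C. The 4-byte and 8-byte encodings follow from the 32-bit and 64-bit instruction sizes mentioned above; the range of immediates that fits the short form is not stated in this message, so it is passed in as a parameter (max_short_imm), and all function names are hypothetical.

```c
#include <assert.h>

enum { SHORT_INSN_BYTES = 4, LONG_INSN_BYTES = 8 };

/* Size of a predicated add with an immediate operand (addwi).
   Assumption: immediates in 0..max_short_imm fit the short form.  */
int addwi_bytes(long imm, long max_short_imm)
{
    assert(max_short_imm >= 0);
    return (imm >= 0 && imm <= max_short_imm) ? SHORT_INSN_BYTES
                                              : LONG_INSN_BYTES;
}

/* Size of a predicated register-register add (addw).  */
int addw_bytes(void)
{
    return SHORT_INSN_BYTES;
}

/* Bytes saved by keeping the register operand instead of
   propagating the constant into the add.  */
int bytes_saved_without_propagation(long imm, long max_short_imm)
{
    return addwi_bytes(imm, max_short_imm) - addw_bytes();
}
```

With an assumed limit of 255, the constant 0x1ff gives a saving of 4 bytes, while a small constant such as 100 gives no saving, which is the case where the propagation is harmless.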

Of course, for processors without long/short instructions, this copy
propagation is beneficial for performance because it removes an unnecessary
dependency.  Therefore, whether to apply this copy propagation is machine
dependent to some degree.

What I do now is add some checks in tree-ssa-copy.c and tree-ssa-dom.c
for our target.  But this is not very clean.  My question is whether there
is a better way to implement such machine-dependent tree-level
optimization (like the hooks at the RTL level).  I believe there are other
processors that have a similar problem.  What is the common solution?


Thanks,
Bingfeng Mei

Broadcom UK



Problem with too many virtual operands ( tree-ssa-operands.c:484)

2007-10-16 Thread Pranav Bhandarkar
Hi,
In the attached testcase, due to an ivopts modification, the compiler
crashes in tree-ssa-operands.c while rewriting the uses, because
the number of virtual operands of the modified stmt is much greater
than the thresholds controlled by OP_SIZE_{1,2,3} in
tree-ssa-operands.c.

I went through
http://gcc.gnu.org/ml/gcc-patches/2006-12/msg01269.html

and it seemed to me that these values (OP_SIZE etc.) have been set quite
experimentally.  I am wondering if these values should be
increased.

I did increase these values and the attached testcase compiled fine.  I
examined the resulting ivopts dump and the modifications seem valid
to me.  I have attached two dumps (before ivopts, i.e. 105t.cunroll,
and after ivopts, i.e. 107t.ivopts).  Note that the post-ivopts dump was
generated only after changing OP_SIZE_3 to 700 (an arbitrarily high
value).

For a quick check of the dumps, note that
 D.1281_6 = Hoopster_ptr_17->Magic;

gets changed to

 D.1281_6 = MEM[index: ivtmp.840_5];

I am wondering if increasing OP_SIZE_{1,2,3} is the way to go.  I am
not fully convinced, because it means that the problem could hit again with
nastier code.

TIA,
Pranav


testcase-min.i
Description: Binary data


testcase-min.i.105t.cunroll
Description: Binary data


testcase-min.i.107t.ivopts
Description: Binary data


Re: Machine dependent Tree optimization?

2007-10-16 Thread Ian Lance Taylor
"Bingfeng Mei" <[EMAIL PROTECTED]> writes:

> Of course, for processors without long/short instructions, this copy
> propagation is beneficial for performance by reducing unnecessary
> dependency. Therefore, whether to apply this copy propagation is machine
> dependent to some degree.  
> 
> What I do now is to add some check in tree-ssa-copy.c and tree-ssa-dom.c
> for our target. But this is not very clean. My question is whether there
> is better way to implement such machine-dependent tree-level
> optimization (like hooks in RTL level).  I believe there are other
> processors that have the similar problem. What is common solution? 

This should normally be done at the RTL level by making long constants
more expensive in RTX_COSTS.  With luck that will let gcse pick this
up.

Ian


RE: Machine dependent Tree optimization?

2007-10-16 Thread Bingfeng Mei
Thanks.  Do you mean the TARGET_RTX_COSTS hook?  Actually, I have already
made long integer constants more expensive in the TARGET_RTX_COSTS function.
It does have an effect on other optimizations (e.g., the combine pass), but
it doesn't work in the example from my previous message.

if (INTVAL(x) >= 0 && INTVAL(x) <= 255) {
  *total = 1;
  return true;
}
*total = 4;
return true;
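
As a standalone illustration of the fragment above, the same rule can be written as an ordinary C function. This is a sketch, not the real hook: const_int_cost is a hypothetical name, and the constant is passed as a plain long rather than extracted from an rtx with INTVAL.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Cost rule from the fragment above: constants in 0..255 cost 1,
   all other constants cost 4.  Returns true to say the cost is final,
   as the fragment does.  */
bool const_int_cost(long value, int *total)
{
    assert(total != NULL);
    if (value >= 0 && value <= 255) {
        *total = 1;
        return true;
    }
    *total = 4;
    return true;
}
```

Under this rule the constant 511 from the example costs 4 and a constant such as 100 costs 1, so a pass that consults the cost has the information it needs to prefer the register form.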


Bingfeng Mei

-Original Message-
From: Ian Lance Taylor [mailto:[EMAIL PROTECTED] 
Sent: 16 October 2007 15:32
To: Bingfeng Mei
Cc: gcc@gcc.gnu.org
Subject: Re: Machine dependent Tree optimization?

"Bingfeng Mei" <[EMAIL PROTECTED]> writes:

> Of course, for processors without long/short instructions, this copy
> propagation is beneficial for performance by reducing unnecessary
> dependency. Therefore, whether to apply this copy propagation is
> machine dependent to some degree.
>
> What I do now is to add some check in tree-ssa-copy.c and
> tree-ssa-dom.c for our target. But this is not very clean. My question
> is whether there is better way to implement such machine-dependent
> tree-level optimization (like hooks in RTL level).  I believe there are
> other processors that have the similar problem. What is common solution?

This should normally be done at the RTL level by making long constants
more expensive in RTX_COSTS.  With luck that will let gcse pick this
up.

Ian




Library not loaded

2007-10-16 Thread Denis Tkachov

Hi all

I am having a problem starting my application, which builds successfully.  I
am using Boost to serialize/deserialize data.  I have linked the Boost library
and my project builds successfully, but I cannot run it.

Running the project (build&go in xcode) I receive this error:

dyld: Library not loaded: stage/lib/libboost_serialization-1_34_1.dylib
  Referenced from:
/Users/dtsachov/dev/evdp/temp/TestSerialization/build/Debug/TestSerialization
  Reason: image not found

Note that this is not a problem with Boost itself; it is a problem with
locating the library at runtime.  I had the same problem with two of my own
projects, where one uses the other: one is a command line tool and the other
is a library.  If the library is a dynamic library, I am able to build the
project but unable to run it, and I get the same error.  When I build my
library as a static library (not dynamic), I can run the application.

So I suspect this is a problem with locating the library at runtime.

Does anybody have an idea ?
Thank you in advance.




double gimplification in C++ FE

2007-10-16 Thread Aldy Hernandez
Hi Jason.  Hi folks.

I'm in the process of converting the C++ FE to tuples.  In doing so I
have noticed that the C++ FE will frequently gimplify bits of a tree,
and then expect gimplify_expr() to gimplify the rest.  This seems
redundant, as gimplify_expr() more often than not will gimplify the
entire tree structure, without regard to what parts the C++ FE already
gimplified.

For example, while gimplifying a TRY_BLOCK in C++, we do:

genericize_try_block (tree *stmt_p)
{
  tree body = TRY_STMTS (*stmt_p);
  tree cleanup = TRY_HANDLERS (*stmt_p);

  gimplify_stmt (&body);    <-- BOO HISS
  ...
  *stmt_p = build2 (TRY_CATCH_EXPR, void_type_node, body, cleanup);
}

Then, in gimplify.c:gimplify_expr():

case TRY_FINALLY_EXPR:
case TRY_CATCH_EXPR:
  gimplify_to_stmt_list (&TREE_OPERAND (*expr_p, 0));
  gimplify_to_stmt_list (&TREE_OPERAND (*expr_p, 1));
  ret = GS_ALL_DONE;
  break;

This behavior is common throughout C++.  The C++ FE calls gimplify_stmt
on bits of trees, but then gimplify_expr() has to gimplify again.

It seems to me that a better approach would be to pass the
tree structure as generic as we can (without calls to gimplify_stmt in
the C++ FE), and then let gimplify_expr do its job.

The reason I propose this is that with tuples, gimplify_expr()
accepts trees, not tuples.  So if we call gimplify_stmt in the C++ FE,
we end up with tuples, which cannot be passed to gimplify_expr.  So
it's better to leave things as generic trees, and let the gimplifier
proper do its magic.

Am I overlooking something?  Can I proceed with this approach?

Thanks.
Aldy


Re: double gimplification in C++ FE

2007-10-16 Thread Jason Merrill

Aldy Hernandez wrote:

> I'm in the process of converting the C++ FE to tuples.  In doing so I
> have noticed that the C++ FE will frequently gimplify bits of a tree,
> and then expect gimplify_expr() to gimplify the rest.  This seems
> redundant, as gimplify_expr() more often than not will gimplify the
> entire tree structure, without regard to what parts the C++ FE already
> gimplified.


Yes, the gimplifier often makes several passes over the same trees to 
get them completely lowered.  cp_gimplify_expr is a subroutine of the 
gimplifier.



> This behavior is common throughout C++.  The C++ FE calls gimplify_stmt
> on bits of trees, but then gimplify_expr() has to gimplify again.
>
> It seems to me that a better approach would be to pass the
> tree structure as generic as we can (without calls to gimplify_stmt in
> the C++ FE), and then let gimplify_expr do its job.


Sure.  Another alternative would be to leave the calls to gimplify_stmt 
(or probably change them to gimplify_to_stmt_list) and return 
GS_ALL_DONE from cp_gimplify_expr.


Jason


Bad unwinder data for __kernel_sigtramp_rt64 in PPC 64 vDSO corrupts Condition Register

2007-10-16 Thread Andrew Haley
The symptom is that if you segfault and then throw an exception in the
segfault handler call-saved fields in the Condition Register are
corrupted.

The reason is that the unwinder data for CR in the vDSO is wrong.  The
line that affects the CR is here in
arch/powerpc/kernel/vdso64/sigtramp.S:

  rsave (70, 38*RSIZE)  /* cr */

This restores a 64-bit register from offset 38 in the sigcontext
register save area to DWARF Column 70.  This much is correct...

Unfortunately, gcc saves and restores only the *least significant*
32-bit half of CR on the stack.  As this is a big-endian machine, the
result is that when unwinding __kernel_sigtramp_rt64 the correctly
saved CR is written to the upper half of the word on the stack, not
the lower half, and the saved CR is overwritten with zeroes.
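
The byte-order effect can be illustrated with a small portable C model. It is only a sketch: the 8-byte buffer stands in for one doubleword slot of the register save area, and store_be64 and load_be32 are hypothetical helper names. The layout is built explicitly, so the code behaves the same on any host.

```c
#include <stdint.h>

/* Write a 64-bit value into an 8-byte slot, most significant byte
   first, as a big-endian machine would.  */
void store_be64(unsigned char slot[8], uint64_t value)
{
    for (int i = 0; i < 8; i++)
        slot[i] = (unsigned char) (value >> (56 - 8 * i));
}

/* Read a 32-bit big-endian value starting at byte `offset` of the slot.
   The caller must pass an offset of 0..4.  */
uint32_t load_be32(const unsigned char slot[8], int offset)
{
    uint32_t v = 0;
    for (int i = 0; i < 4; i++)
        v = (v << 8) | slot[offset + i];
    return v;
}
```

If a CR value such as 0x2000 is stored as a doubleword, a 32-bit read at offset 0 of the slot sees only the zero high half; the meaningful bits are at offset 4.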

It is not immediately clear to me how to fix this: I think you would
need to find a DWARF expression that copies a halfword value.

Test case attached.  Tested on Kernel 2.6.18-8.1.10.el5.

Andrew.



#include <stdio.h>
#include <signal.h>

class SegfaultException
{
};

void catch_segv (int)
{
  throw new SegfaultException;
}

void
segfault (int *p)
{
  fprintf (stderr, "%n", *p);
}


int main(int argc, char **argv)
{
  unsigned long cr, cr2;
  __asm__ __volatile__
    ("mtcrf   8, %0" : : "r" (0x2000): "cr4");

  struct sigaction sa;
  sa.sa_handler = catch_segv;
  sigemptyset (&sa.sa_mask);
  sa.sa_flags = SA_NODEFER;
  sigaction (SIGSEGV, &sa, NULL);

  __asm__ __volatile__
    ("mfcr %0" : "=r" (cr));
  fprintf (stderr, "cr = 0x%x\n", cr & 0xfff000);

  try
    {
      segfault(NULL);
    }
  catch (SegfaultException *a)
    {
    }

  __asm__ __volatile__
    ("mfcr %0" : "=r" (cr));
  fprintf (stderr, "cr = 0x%x\n", cr & 0xfff000);

  return 0;
}


Re: double gimplification in C++ FE

2007-10-16 Thread Aldy Hernandez
> Yes, the gimplifier often makes several passes over the same trees to get 
> them completely lowered.  cp_gimplify_expr is a subroutine of the 
> gimplifier.

Good, I just wanted to make sure I wasn't off my rocker or anything.

> Sure.  Another alternative would be to leave the calls to gimplify_stmt (or 
> probably change them to gimplify_to_stmt_list) and return GS_ALL_DONE from 
> cp_gimplify_expr.

Yes, in a few places it definitely seems better to completely gimplify
the given statement and return GS_ALL_DONE.  Will do so when it's
easier.

Heads up.

Thanks.
Aldy


Re: Plans for Linux ELF "i686+" ABI ? Like SPARC V8+ ?

2007-10-16 Thread Michael Meissner
On Tue, Oct 16, 2007 at 12:53:13AM +0200, Andi Kleen wrote:
> > Actually no.  In 32-bit mode, double is aligned on a 4 byte boundary, not
> > an 8 byte boundary, unless you use -malign-double, which breaks the ABI.
> > This has been a 'feature' of the original AT&T 386 System V ABI that Linux
> > uses for 32-bit x86 processors.  With the SCO mess, it may be hard to ever
> > change that ABI
> 
> My gcc doesn't agree with you (I actually checked before posting)
> 
> ~> cat t.c
> 
> int main(void)
> {
> double x;
> printf("%d\n", __alignof__(x));
> return 0;
> }
> ~> gcc -m32 -o t t.c
> t.c: In function ‘main’:
> t.c:5: warning: incompatible implicit declaration of built-in function ‘printf’
> ~> ./t
> 8
> ~> 
> 

Doubles that are scalar variables are aligned on a 64-bit boundary, but doubles
that are within structures are only aligned to a 32-bit boundary, which comes
from the published i386 ABI from System V.  Here is the code in question from
gcc/config/i386/i386.h:

/* The published ABIs say that doubles should be aligned on word
   boundaries, so lower the alignment for structure fields unless
   -malign-double is set.  */

/* ??? Blah -- this macro is used directly by libobjc.  Since it
   supports no vector modes, cut out the complexity and fall back
   on BIGGEST_FIELD_ALIGNMENT.  */
#ifdef IN_TARGET_LIBS
#ifdef __x86_64__
#define BIGGEST_FIELD_ALIGNMENT 128
#else
#define BIGGEST_FIELD_ALIGNMENT 32
#endif
#else
#define ADJUST_FIELD_ALIGN(FIELD, COMPUTED) \
   x86_field_alignment (FIELD, COMPUTED)
#endif

And this is where we recompute the alignment in i386.c:

int
x86_field_alignment (tree field, int computed)
{
  enum machine_mode mode;
  tree type = TREE_TYPE (field);

  if (TARGET_64BIT || TARGET_ALIGN_DOUBLE)
    return computed;
  mode = TYPE_MODE (TREE_CODE (type) == ARRAY_TYPE
                    ? get_inner_array_type (type) : type);
  if (mode == DFmode || mode == DCmode
      || GET_MODE_CLASS (mode) == MODE_INT
      || GET_MODE_CLASS (mode) == MODE_COMPLEX_INT)
    return MIN (32, computed);
  return computed;
}
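
The effect of that function on structure layout can be sketched with a simplified, self-contained model. This is not the GCC code: model_field_alignment and field_offset are hypothetical names, the target flags are passed as plain booleans, and capped_mode stands for "the mode is DFmode, DCmode, or an integer mode". Alignments are in bits, as in the original.

```c
#include <assert.h>
#include <stdbool.h>

#define MIN(a, b) ((a) < (b) ? (a) : (b))

/* Simplified model of x86_field_alignment (alignments in bits).  */
int model_field_alignment(bool target_64bit, bool align_double,
                          bool capped_mode, int computed)
{
    if (target_64bit || align_double)
        return computed;
    if (capped_mode)
        return MIN(32, computed);
    return computed;
}

/* Byte offset of a field that follows prev_end bytes of earlier
   fields, rounded up to the field's alignment (given in bits).  */
int field_offset(int prev_end, int align_bits)
{
    assert(align_bits >= 8 && align_bits % 8 == 0);
    int align_bytes = align_bits / 8;
    return (prev_end + align_bytes - 1) / align_bytes * align_bytes;
}
```

For a structure such as struct { char c; double d; }, the model places d at offset 4 in the default 32-bit case and at offset 8 with -malign-double or in 64-bit mode, which matches the scalar-versus-field distinction described above.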


-- 
Michael Meissner, AMD
90 Central Street, MS 83-29, Boxborough, MA, 01719, USA
[EMAIL PROTECTED]




Re: Bad unwinder data for __kernel_sigtramp_rt64 in PPC 64 vDSO corrupts Condition Register

2007-10-16 Thread Jakub Jelinek
On Tue, Oct 16, 2007 at 06:02:13PM +0100, Andrew Haley wrote:
> The reason is that the unwinder data for CR in the vDSO is wrong.  The
> line that affects the CR is here in

According to __builtin_init_dwarf_reg_size_table on ppc64-linux
r0..r31, fp0..fp31, mq, lr, ctr, ap, vrsave, vscr, spe_acc, spefcsr, sfp
are 64-bit, v0..v31 128-bit and cr0..cr7, xer 32-bit.
So both kernel and gcc/config/rs6000/linux-unwind.h are wrong.

> arch/powerpc/kernel/vdso64/sigtramp.S:
> 
>   rsave (70, 38*RSIZE)/* cr */

This should just be changed to
/* Size of CR regs in DWARF unwind info.  */
#define CRSIZE  4
...
rsave (70, 38*RSIZE + (RSIZE - CRSIZE)) /* cr */

and similarly linux-unwind.h should do:

fs->regs.reg[R_CR2].loc.offset = (long) &regs->ccr - new_cfa;
/* CR? regs are just 32-bit and PPC is big-endian.  */
fs->regs.reg[R_CR2].loc.offset += sizeof (long) - 4;
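
The offset arithmetic in the proposed rsave line can be checked with a few lines of C. This is a sketch for illustration only: RSIZE is 8 in the kernel file and CRSIZE is the value proposed above, while the function names are made up.

```c
enum { RSIZE = 8, CRSIZE = 4 };

/* Offset used by the existing rsave line for CR: the start of the
   doubleword slot.  */
int cr_offset_old(void)
{
    return 38 * RSIZE;
}

/* Offset proposed above: skip the high-order half of the doubleword,
   so that a 32-bit access lands on the low-order word on a
   big-endian machine.  */
int cr_offset_new(void)
{
    return 38 * RSIZE + (RSIZE - CRSIZE);
}
```

The old offset is 304 and the new one is 308, i.e. the same slot but pointing at its second word.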

Jakub


Re: Bad unwinder data for __kernel_sigtramp_rt64 in PPC 64 vDSO corrupts Condition Register

2007-10-16 Thread Andrew Haley
Jakub Jelinek writes:
 > On Tue, Oct 16, 2007 at 06:02:13PM +0100, Andrew Haley wrote:
 > > The reason is that the unwinder data for CR in the vDSO is wrong.  The
 > > line that affects the CR is here in
 > 
 > According to __builtin_init_dwarf_reg_size_table on ppc64-linux
 > r0..r31, fp0..fp31, mq, lr, ctr, ap, vrsave, vscr, spe_acc, spefcsr, sfp
 > are 64-bit, v0..v31 128-bit and cr0..cr7, xer 32-bit.
 > So both kernel and gcc/config/rs6000/linux-unwind.h are wrong.
 > 
 > > arch/powerpc/kernel/vdso64/sigtramp.S:
 > > 
 > >   rsave (70, 38*RSIZE) /* cr */
 > 
 > This should just be changed to
 > /* Size of CR regs in DWARF unwind info.  */
 > #define CRSIZE   4
 > ...
 > rsave (70, 38*RSIZE + (RSIZE - CRSIZE))  /* cr */
 > 
 > and similarly linux-unwind.h should do:
 > 
 > fs->regs.reg[R_CR2].loc.offset = (long) &regs->ccr - new_cfa;
 > /* CR? regs are just 32-bit and PPC is big-endian.  */
 > fs->regs.reg[R_CR2].loc.offset += sizeof (long) - 4;

Won't this generate an alignment fault?

Andrew.


Re: Library not loaded

2007-10-16 Thread Jan-Benedict Glaw
On Tue, 2007-10-16 07:47:51 -0700, Denis Tkachov <[EMAIL PROTECTED]> wrote:
> I am having problem starting my application that is successfully built. I am
> using boost to serialize/deserialize data. I have link boost library and my
> project is built successfully, but I cannot run it.


This is most probably not a problem with GCC itself, but with using it
(and the linker), so your question would be better handled at
<[EMAIL PROTECTED]>.

> Running the project (build&go in xcode) I receive this error:
>
> dyld: Library not loaded: stage/lib/libboost_serialization-1_34_1.dylib
>   Referenced from:
> /Users/dtsachov/dev/evdp/temp/TestSerialization/build/Debug/TestSerialization
>   Reason: image not found
> 
> Note - that is not a problem with boost, that is some problem with linking
> to the library in runtime. I had the same problem with my 2 projects, one
> uses another - one is a command line tool and another is a library. If the
> library is dynamic library I am able to build the project but unable to run
> it, getting the same error. When I made my library static as library (not
> dynamic) I can run the application.

Some additional information would be useful, such as:

  * Which compiler and linker are used?  For which platform?  MacOS, I
    guess?
  * What does the final linker call look like?
  * What does your dynamic linker configuration look like?
    Specifically: Is your libboost_serialization-1_34_1 in the
    library search path?

> So I suggest this is some problem of locating the library in runtime.

Most probably.

> Does anybody have an idea ?

You need to tell the dynamic linker where to actually find the
libraries. "stage/lib/..." looks like a relative path, but to what
location?

MfG, JBG

-- 
  Jan-Benedict Glaw  [EMAIL PROTECTED]  +49-172-7608481
Signature of: Friends are relatives you make for yourself.
the second  :




Re: Bad unwinder data for __kernel_sigtramp_rt64 in PPC 64 vDSO corrupts Condition Register

2007-10-16 Thread Jakub Jelinek
On Tue, Oct 16, 2007 at 07:22:31PM +0100, Andrew Haley wrote:
>  > and similarly linux-unwind.h should do:
>  > 
>  > fs->regs.reg[R_CR2].loc.offset = (long) &regs->ccr - new_cfa;
>  > /* CR? regs are just 32-bit and PPC is big-endian.  */
>  > fs->regs.reg[R_CR2].loc.offset += sizeof (long) - 4;
> 
> Won't this generate an alignment fault?

Why?  The reg size is 32-bit, so it only should be read/written
as 32-bit value.
E.g. a brief look at _Unwind_RaiseException shows:
lwz 12,5656(1)
...
mtcrf 32,12  #,
ld 15,5368(1)#,
ld 16,5376(1)#,
mtcrf 16,12  #,
ld 17,5384(1)#,
ld 18,5392(1)#,
mtcrf 8,12   #,
so it shouldn't have any problems with 4 byte alignment (rather than 8 byte
alignment).

Jakub


Re: Bad unwinder data for __kernel_sigtramp_rt64 in PPC 64 vDSO corrupts Condition Register

2007-10-16 Thread Andrew Haley
Jakub Jelinek writes:
 > On Tue, Oct 16, 2007 at 07:22:31PM +0100, Andrew Haley wrote:
 > >  > and similarly linux-unwind.h should do:
 > >  > 
 > >  > fs->regs.reg[R_CR2].loc.offset = (long) &regs->ccr - new_cfa;
 > >  > /* CR? regs are just 32-bit and PPC is big-endian.  */
 > >  > fs->regs.reg[R_CR2].loc.offset += sizeof (long) - 4;
 > > 
 > > Won't this generate an alignment fault?
 > 
 > Why?  The reg size is 32-bit, so it only should be read/written
 > as 32-bit value.
 > E.g. a brief look at _Unwind_RaiseException shows:
 >  lwz 12,5656(1)
 > ...
 > mtcrf 32,12  #,
 > ld 15,5368(1)#,
 > ld 16,5376(1)#,
 > mtcrf 16,12  #,
 > ld 17,5384(1)#,
 > ld 18,5392(1)#,
 > mtcrf 8,12   #,
 > so it shouldn't have any problems with 4 byte alignment (rather than 8 byte
 > alignment).

Yes, I think it's OK.

I was thinking about uw_install_context_1, but that should be fine, as
it uses memcpy.
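
For readers wondering why memcpy settles the alignment question, here is a generic illustration. It is not the libgcc code: read_u32_at is a hypothetical helper that copies a 32-bit value out of a buffer at an arbitrary byte offset, which is well defined whatever the alignment of that address.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Copy a 32-bit value out of a buffer at an arbitrary byte offset.
   memcpy makes no alignment assumption about base + offset, so this
   is safe even when the address is not naturally aligned.  */
uint32_t read_u32_at(const unsigned char *base, size_t offset)
{
    uint32_t value;
    memcpy(&value, base + offset, sizeof value);
    return value;
}
```

A direct dereference through a cast pointer would not give the same guarantee; the copy through memcpy is what makes the odd offset harmless.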

Andrew.


How do I upgrade gcc/g++ on a Mac

2007-10-16 Thread Gordon Prieur

Hi,

   We have 2 different Macs, both running OS 10.4 but with different builds
of the same versions of gcc/g++/gdb.  With the older set of tools we see gdb
failures when debugging some C++ applications (faulty line table
information).  Debugging the same program with the later builds works fine.
We've tried the automatic update service on the Mac but it doesn't update
either g++ or gdb.

   Is there a way to get the latest gcc/g++/gdb on a Mac?  We're not
interested in hearing that we should upgrade to Leopard (we plan to :-)
because we're more interested in what we can tell our customers than in
anything else.  We provide a gdb-based C/C++ debugger in NetBeans 6.0 and
this Mac bug makes our debugger fail.  If we could tell customers how to
upgrade their compilers then that would resolve the issue.

Thanks,
Gordon



Re: How do I upgrade gcc/g++ on a Mac

2007-10-16 Thread Andreas Tobler

Gordon Prieur wrote:
>    We have 2 different Macs, both running 10.4 OS but different builds
> of the same versions of gcc/g++/gdb. With the older set of tools we see
> gdb failures debugging some C++ applications (faulty line table
> information). Debugging the same program on with the later builds, works
> fine. We've tried the automatic update service on the Mac but it doesn't
> update either g++ or gdb.
>
>    Is there a way to get the latest gcc/g++/gdb on a Mac? We're not
> interested in hearing that we should upgrade to Leopard (we plan to:-)
> because we're more interested in what we can tell our customers than how
> anything else. We provide a gdb-based C/C++ debugger in NetBeans 6.0 and
> this Mac bug makes our debugger fail. If we could tell customers how to
> upgrade their compilers then that would resolve the issue.


This is kind of off-topic for this list, but anyway: I'd update Xcode,
which provides the tools you're mentioning.  You can get it from Apple:
http://developer.apple.com/tools/download/

You may have to register; I don't know.
The current release for 10.4.10 is 2.4.1.

Andreas



Re: Library not loaded

2007-10-16 Thread Andreas Tobler

Denis Tkachov wrote:

> Hi all
>
> I am having problem starting my application that is successfully built. I am
> using boost to serialize/deserialize data. I have link boost library and my
> project is built successfully, but I cannot run it.
>
> Running the project (build&go in xcode) I receive this error:
>
> dyld: Library not loaded: stage/lib/libboost_serialization-1_34_1.dylib
>   Referenced from:
> /Users/dtsachov/dev/evdp/temp/TestSerialization/build/Debug/TestSerialization
>   Reason: image not found
>
> Note - that is not a problem with boost, that is some problem with linking
> to the library in runtime. I had the same problem with my 2 projects, one
> uses another - one is a command line tool and another is a library. If the
> library is dynamic library I am able to build the project but unable to run
> it, getting the same error. When I made my library static as library (not
> dynamic) I can run the application.
>
> So I suggest this is some problem of locating the library in runtime.
>
> Does anybody have an idea ?


You might set your DYLD_LIBRARY_PATH to point to your built lib.

With tcsh:
'setenv DYLD_LIBRARY_PATH /path-to-your-lib:$DYLD_LIBRARY_PATH'
This assumes that you already have a DYLD_LIBRARY_PATH set; otherwise leave
off the trailing ':$DYLD_LIBRARY_PATH'.


Andreas



RE: Plans for Linux ELF "i686+" ABI ? Like SPARC V8+ ?

2007-10-16 Thread Dave Korn
On 15 October 2007 23:53, Andi Kleen wrote:

> int main(void)
> {
> double x;
> printf("%d\n", __alignof__(x));
> return 0;
> }
> ~> gcc -m32 -o t t.c
> t.c: In function ‘main’:
> t.c:5: warning: incompatible implicit declaration of built-in function ‘printf’


  I find a call to an unprototyped stdargs function in the middle of a 
discussion about the niceties of ABIs amusingly ironic!  :-)


cheers,
  DaveK
-- 
Can't think of a witty .sigline today



Re: Bad unwinder data for __kernel_sigtramp_rt64 in PPC 64 vDSO corrupts Condition Register

2007-10-16 Thread Alan Modra
On Tue, Oct 16, 2007 at 08:21:55PM +0200, Jakub Jelinek wrote:
> On Tue, Oct 16, 2007 at 06:02:13PM +0100, Andrew Haley wrote:
> > The reason is that the unwinder data for CR in the vDSO is wrong.  The
> > line that affects the CR is here in

My fault.

> According to __builtin_init_dwarf_reg_size_table on ppc64-linux
> r0..r31, fp0..fp31, mq, lr, ctr, ap, vrsave, vscr, spe_acc, spefcsr, sfp
> are 64-bit, v0..v31 128-bit and cr0..cr7, xer 32-bit.
> So both kernel and gcc/config/rs6000/linux-unwind.h are wrong.
> 
> > arch/powerpc/kernel/vdso64/sigtramp.S:
> > 
> >   rsave (70, 38*RSIZE)  /* cr */
> 
> This should just be changed to
> /* Size of CR regs in DWARF unwind info.  */
> #define CRSIZE4
> ...
> rsave (70, 38*RSIZE + (RSIZE - CRSIZE))   /* cr */
> 
> and similarly linux-unwind.h should do:
> 
> fs->regs.reg[R_CR2].loc.offset = (long) &regs->ccr - new_cfa;
> /* CR? regs are just 32-bit and PPC is big-endian.  */
> fs->regs.reg[R_CR2].loc.offset += sizeof (long) - 4;

This looks good to me.  I don't think we can change the unwinder to
use a different size for cr as that would break unwinding through
normal stack frames that save cr.

-- 
Alan Modra
Australia Development Lab, IBM


Re: Bad unwinder data for __kernel_sigtramp_rt64 in PPC 64 vDSO corrupts Condition Register

2007-10-16 Thread Benjamin Herrenschmidt

On Wed, 2007-10-17 at 12:58 +0930, Alan Modra wrote:
> On Tue, Oct 16, 2007 at 08:21:55PM +0200, Jakub Jelinek wrote:
> > On Tue, Oct 16, 2007 at 06:02:13PM +0100, Andrew Haley wrote:
> > > The reason is that the unwinder data for CR in the vDSO is wrong.  The
> > > line that affects the CR is here in
> 
> My fault.
> 
> > According to __builtin_init_dwarf_reg_size_table on ppc64-linux
> > r0..r31, fp0..fp31, mq, lr, ctr, ap, vrsave, vscr, spe_acc, spefcsr, sfp
> > are 64-bit, v0..v31 128-bit and cr0..cr7, xer 32-bit.
> > So both kernel and gcc/config/rs6000/linux-unwind.h are wrong.
> > 
> > > arch/powerpc/kernel/vdso64/sigtramp.S:
> > > 
> > >   rsave (70, 38*RSIZE)/* cr */
> > 
> > This should just be changed to
> > /* Size of CR regs in DWARF unwind info.  */
> > #define CRSIZE  4
> > ...
> > rsave (70, 38*RSIZE + (RSIZE - CRSIZE)) /* cr */
> > 
> > and similarly linux-unwind.h should do:
> > 
> > fs->regs.reg[R_CR2].loc.offset = (long) &regs->ccr - new_cfa;
> > /* CR? regs are just 32-bit and PPC is big-endian.  */
> > fs->regs.reg[R_CR2].loc.offset += sizeof (long) - 4;
> 
> This looks good to me.  I don't think we can change the unwinder to
> use a different size for cr as that would break unwinding through
> normal stack frames that save cr.

So the kernel fix would look like this, right?  If you are OK with it, I'll
submit it tomorrow.

Index: linux-work/arch/powerpc/kernel/vdso64/sigtramp.S
===
--- linux-work.orig/arch/powerpc/kernel/vdso64/sigtramp.S   2007-10-17 13:32:49.0 +1000
+++ linux-work/arch/powerpc/kernel/vdso64/sigtramp.S   2007-10-17 13:34:18.0 +1000
@@ -134,13 +134,16 @@ V_FUNCTION_END(__kernel_sigtramp_rt64)
 9:
 
 /* This is where the pt_regs pointer can be found on the stack.  */
-#define PTREGS 128+168+56
+#define PTREGS 128+168+56
 
 /* Size of regs.  */
-#define RSIZE 8
+#define RSIZE  8
+
+/* Size of CR reg in DWARF unwind info. */
+#define CRSIZE 4
 
 /* This is the offset of the VMX reg pointer.  */
-#define VREGS 48*RSIZE+33*8
+#define VREGS  48*RSIZE+33*8
 
 /* Describe where general purpose regs are saved.  */
 #define EH_FRAME_GEN \
@@ -178,7 +181,7 @@ V_FUNCTION_END(__kernel_sigtramp_rt64)
   rsave (31, 31*RSIZE);                                \
   rsave (67, 32*RSIZE);/* ap, used as temp for nip */  \
   rsave (65, 36*RSIZE);/* lr */                        \
-  rsave (70, 38*RSIZE) /* cr */
+  rsave (70, 38*RSIZE + (RSIZE - CRSIZE)) /* cr */
 
 /* Describe where the FP regs are saved.  */
 #define EH_FRAME_FP \





Re: df_insn_refs_record's handling of global_regs[]

2007-10-16 Thread David Miller
From: David Miller <[EMAIL PROTECTED]>
Date: Tue, 16 Oct 2007 03:12:23 -0700 (PDT)

> I have a bug I'm trying to investigate where, starting in gcc-4.2.x,
> the loop invariant pass considers a computation involving a global
> register variable as invariant across a call.  The basic structure
> of the code is:

Here is the most simplified test case I could come up with;
compile it with "-m64 -Os" on sparc.  expression(regval) is
moved to before the loop by loop-invariant.

register unsigned long regval asm("g5");

extern void cond_resched(void);

unsigned int var;

void *expression(unsigned long regval)
{
  void *ret;

  __asm__("" : "=r" (ret) : "0" (&var));
  return ret + regval;
}

void func(void **pp)
{
  int i;

  for (i = 0; i < 56; i++) {
    cond_resched();
    *pp = expression(regval);
  }
}


Re: df_insn_refs_record's handling of global_regs[]

2007-10-16 Thread Seongbae Park (박성배, 朴成培)
On 10/16/07, David Miller <[EMAIL PROTECTED]> wrote:
> From: David Miller <[EMAIL PROTECTED]>
> Date: Tue, 16 Oct 2007 03:12:23 -0700 (PDT)
>
> > I have a bug I'm trying to investigate where, starting in gcc-4.2.x,
> > the loop invariant pass considers a computation involving a global
> > register variable as invariant across a call.  The basic structure
> > of the code is:
>
> Here is the most simplified test case I could come up with.
> Compile it with "-m64 -Os" on sparc; expression(regval) is
> moved to before the loop by the loop-invariant pass.
>
> register unsigned long regval asm("g5");
>
> extern void cond_resched(void);
>
> unsigned int var;
>
> void *expression(unsigned long regval)
> {
>   void *ret;
>
>   __asm__("" : "=r" (ret) : "0" (&var));
>   return ret + regval;
> }
>
> void func(void **pp)
> {
>   int i;
>
>   for (i = 0; i < 56; i++) {
>     cond_resched();
>     *pp = expression(regval);
>   }
> }

loop-invariant.c uses ud-chains.
So if there's something wrong with the chains,
it could go nuts.
Can you send me the RTL dump of the loop2_invariant pass?
-- 
#pragma ident "Seongbae Park, compiler, http://seongbae.blogspot.com";


Re: df_insn_refs_record's handling of global_regs[]

2007-10-16 Thread David Miller
From: "Seongbae Park (박성배, 朴成培)" <[EMAIL PROTECTED]>
Date: Tue, 16 Oct 2007 21:53:37 -0700

Annyoung haseyo, Park-sanseng-nim,

> loop-invariant.c uses ud-chains.
> So if there's something wrong with the chains,
> it could go nuts.
> Can you send me the RTL dump of the loop2_invariant pass?

I have found the problem, and yes it has to do with the ud chains.

Because global registers are only marked via df_invalidated_by_call,
they get the DF_REF_MAY_CLOBBER flag.

This flag causes the dataflow problem solver to not add the global
register definitions to the generator set.  Specifically I am
talking about the code in df_rd_bb_local_compute_process_def(), it
says:

if (!(DF_REF_FLAGS (def)
  & (DF_REF_MUST_CLOBBER | DF_REF_MAY_CLOBBER)))
  bitmap_set_bit (bb_info->gen, DF_REF_ID (def));

Global registers don't get clobbered by calls; they are potentially
set as a side effect of the call.  And they are set to valid
values we might actually depend upon as inputs later.
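
To make the effect of that test concrete, here is a minimal, self-contained sketch in C. It is not GCC code: the flag values, `struct def_ref`, and `compute_gen()` are hypothetical stand-ins that only imitate the quoted condition, with the gen set modelled as a plain bitmask.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins for DF_REF_MUST_CLOBBER / DF_REF_MAY_CLOBBER.
   The numeric values are illustrative only.  */
enum { FLAG_MUST_CLOBBER = 1 << 0, FLAG_MAY_CLOBBER = 1 << 1 };

/* A toy definition record: an id, the register it defines, and flags.  */
struct def_ref { int id; int regno; unsigned flags; };

/* Same shape as the quoted test: a def enters the gen set only when
   it carries neither clobber flag.  */
static bool def_enters_gen(const struct def_ref *d)
{
    return !(d->flags & (FLAG_MUST_CLOBBER | FLAG_MAY_CLOBBER));
}

/* Build a gen set as a bitmask, one bit per def id.  */
static unsigned compute_gen(const struct def_ref *defs, int n)
{
    unsigned gen = 0;
    for (int i = 0; i < n; i++) {
        assert(defs[i].id >= 0 && defs[i].id < 32);
        if (def_enters_gen(&defs[i]))
            gen |= 1u << defs[i].id;
    }
    return gen;
}
```

In this model, a def of a global register that is flagged as a may-clobber never sets its bit in the gen set, so later uses of that register are not linked to the call. The same def recorded with no flags does set its bit.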

I tried a potential fix, which is to change df_insn_refs_record(),
such that it handles global registers instead like this:

if (global_regs[i])
  df_ref_record (dflow, regno_reg_rtx[i], &regno_reg_rtx[i],
                 bb, insn, DF_REF_REG_DEF, 0, true);

and this made the illegal loop-invariant transformation no longer
occur in my test case.


Re: df_insn_refs_record's handling of global_regs[]

2007-10-16 Thread Seongbae Park (박성배, 朴成培)
On 10/16/07, David Miller <[EMAIL PROTECTED]> wrote:
> From: "Seongbae Park (박성배, 朴成培)" <[EMAIL PROTECTED]>
> Date: Tue, 16 Oct 2007 21:53:37 -0700
>
> Annyoung haseyo, Park-sanseng-nim,

:)

> > loop-invariant.c uses ud-chains.
> > So if there's something wrong with the chains,
> > it could go nuts.
> > Can you send me the RTL dump of the loop2_invariant pass?
>
> I have found the problem, and yes it has to do with the ud chains.
>
> Because global registers are only marked via df_invalidated_by_call,
> they get the DF_REF_MAY_CLOBBER flag.
>
> This flag causes the dataflow problem solver to not add the global
> register definitions to the generator set.  Specifically I am
> talking about the code in df_rd_bb_local_compute_process_def(), it
> says:
>
> if (!(DF_REF_FLAGS (def)
>   & (DF_REF_MUST_CLOBBER | DF_REF_MAY_CLOBBER)))
>   bitmap_set_bit (bb_info->gen, DF_REF_ID (def));
>
> Global registers don't get clobbered by calls; they are potentially
> set as a side effect of the call.  And they are set to valid
> values we might actually depend upon as inputs later.
>
> I tried a potential fix, which is to change df_insn_refs_record(),
> such that it handles global registers instead like this:
>
> if (global_regs[i])
>   df_ref_record (dflow, regno_reg_rtx[i], &regno_reg_rtx[i],
>                  bb, insn, DF_REF_REG_DEF, 0, true);
>
> and this made the illegal loop-invariant transformation no longer
> occur in my test case.

Did you replace the DF_REF_REG_USE with DEF ?
If so, that's not correct.  We need to add DEF as well as USE:

diff -r fd0f94fbe89d gcc/df-scan.c
--- a/gcc/df-scan.c Wed Oct 10 03:32:43 2007 +
+++ b/gcc/df-scan.c Tue Oct 16 22:52:44 2007 -0700
@@ -3109,8 +3109,13 @@ df_get_call_refs (struct df_collection_r
  so they are recorded as used.  */
   for (i = 0; i < FIRST_PSEUDO_REGISTER; i++)
 if (global_regs[i])
-  df_ref_record (collection_rec, regno_reg_rtx[i],
-NULL, bb, insn, DF_REF_REG_USE, flags);
+  {
+df_ref_record (collection_rec, regno_reg_rtx[i],
+  NULL, bb, insn, DF_REF_REG_USE, flags);
+df_ref_record (collection_rec, regno_reg_rtx[i],
+  NULL, bb, insn, DF_REF_REG_DEF, flags);
+  }
+

   is_sibling_call = SIBLING_CALL_P (insn);
   EXECUTE_IF_SET_IN_BITMAP (df_invalidated_by_call, 0, ui, bi)


Then, we'll need to change the df_invalidated_by_call loop
not to add global_regs[] again (with MAY_CLOBBER bits).
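
The two steps together (USE plus DEF for global registers, and no second MAY_CLOBBER def for them) can be sketched outside of GCC as follows. This is a toy model under stated assumptions: `NUM_HARD_REGS`, `struct ref`, and `record_call_refs()` are hypothetical names, and the two loops of the real code are folded into one for brevity.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define NUM_HARD_REGS 8   /* hypothetical; stands in for FIRST_PSEUDO_REGISTER */

enum ref_kind { REF_USE, REF_DEF };
enum { REF_MAY_CLOBBER = 1 << 0 };   /* illustrative flag value */

struct ref { int regno; enum ref_kind kind; unsigned flags; };

/* Record the refs for one call insn into OUT (capacity CAP) and
   return how many were written.  */
static size_t record_call_refs(const bool global_regs[NUM_HARD_REGS],
                               const bool invalidated_by_call[NUM_HARD_REGS],
                               struct ref *out, size_t cap)
{
    size_t n = 0;
    for (int i = 0; i < NUM_HARD_REGS; i++) {
        if (global_regs[i]) {
            /* The callee may read and may set a global register:
               record a plain USE and a plain DEF, no clobber flag.  */
            assert(n + 2 <= cap);
            out[n++] = (struct ref){ i, REF_USE, 0 };
            out[n++] = (struct ref){ i, REF_DEF, 0 };
        } else if (invalidated_by_call[i]) {
            /* Ordinary call-clobbered register: may-clobber DEF only.  */
            assert(n + 1 <= cap);
            out[n++] = (struct ref){ i, REF_DEF, REF_MAY_CLOBBER };
        }
    }
    return n;
}
```

With register 5 both global and call-invalidated, the sketch records exactly one unflagged DEF for it, which is the property the change is after.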
-- 
#pragma ident "Seongbae Park, compiler, http://seongbae.blogspot.com";


Re: df_insn_refs_record's handling of global_regs[]

2007-10-16 Thread David Miller
From: "Seongbae Park (박성배, 朴成培)" <[EMAIL PROTECTED]>
Date: Tue, 16 Oct 2007 22:56:49 -0700

> We need to add DEF as well as USE:
> 
> diff -r fd0f94fbe89d gcc/df-scan.c
> --- a/gcc/df-scan.c Wed Oct 10 03:32:43 2007 +
> +++ b/gcc/df-scan.c Tue Oct 16 22:52:44 2007 -0700
> @@ -3109,8 +3109,13 @@ df_get_call_refs (struct df_collection_r
>   so they are recorded as used.  */
>for (i = 0; i < FIRST_PSEUDO_REGISTER; i++)
>  if (global_regs[i])
> -  df_ref_record (collection_rec, regno_reg_rtx[i],
> -NULL, bb, insn, DF_REF_REG_USE, flags);
> +  {
> +df_ref_record (collection_rec, regno_reg_rtx[i],
> +  NULL, bb, insn, DF_REF_REG_USE, flags);
> +df_ref_record (collection_rec, regno_reg_rtx[i],
> +  NULL, bb, insn, DF_REF_REG_DEF, flags);
> +  }
> +
> 
>is_sibling_call = SIBLING_CALL_P (insn);
>EXECUTE_IF_SET_IN_BITMAP (df_invalidated_by_call, 0, ui, bi)
> 
> 
> Then, we'll need to change the df_invalidated_by_call loop
> not to add global_regs[] again (with MAY_CLOBBER bits).

Indeed.

I will do some regression testing of the following patch against
gcc-4.2.x:

--- ./gcc/df-scan.c.ORIG 2007-10-16 02:07:46.0 -0700
+++ ./gcc/df-scan.c 2007-10-16 23:00:32.0 -0700
@@ -1584,12 +1584,19 @@ df_insn_refs_record (struct dataflow *df
              so they are recorded as used.  */
           for (i = 0; i < FIRST_PSEUDO_REGISTER; i++)
             if (global_regs[i])
-              df_uses_record (dflow, &regno_reg_rtx[i],
-                              DF_REF_REG_USE, bb, insn,
-                              0);
+              {
+                df_uses_record (dflow, &regno_reg_rtx[i],
+                                DF_REF_REG_USE, bb, insn, 0);
+                df_ref_record (dflow, regno_reg_rtx[i], &regno_reg_rtx[i],
+                               bb, insn, DF_REF_REG_DEF, 0, true);
+              }
+
           EXECUTE_IF_SET_IN_BITMAP (df_invalidated_by_call, 0, ui, bi)
-            df_ref_record (dflow, regno_reg_rtx[ui], &regno_reg_rtx[ui], bb,
-                           insn, DF_REF_REG_DEF, DF_REF_MAY_CLOBBER, false);
+            {
+              if (!global_regs[ui])
+                df_ref_record (dflow, regno_reg_rtx[ui], &regno_reg_rtx[ui], bb,
+                               insn, DF_REF_REG_DEF, DF_REF_MAY_CLOBBER, false);
+            }
         }
 }
 



Re: Plans for Linux ELF "i686+" ABI ? Like SPARC V8+ ?

2007-10-16 Thread Seongbae Park (박성배, 朴成培)
On 10/14/07, Darryl L. Miles <[EMAIL PROTECTED]> wrote:
>
> Hello,
>
> On SPARC there is an ABI, V8+, which allows linking (and
> mixing) with the V8 ABI but makes use of features of 64bit UltraSparc CPUs
> (that were not available in the older 32bit only CPUs).  Admittedly
> looking at the way this works it could be said that Sun had a certain
> amount of forward thinking when they developed their 32bit ABI (this is
> not true of the 32bit Intel IA32 ABIs that exist).

Sun didn't have much forward thinking with their 32-bit ABI.
It's just that their 32-bit ISA was relatively amenable
to 64-bit extension with 32-bit ABI compatibility,
which cannot be said for IA32.

> Are there any plans for a new Intel IA32 ABI that is designed
> solely to run on 64bit capable Intel IA32 CPUs (that support EM64T) ?
> Userspace would have 32bit memory addressing, but access to more
> registers, better function call conventions, etc...
>
> This would be aimed to replace the existing i386/i686 ABIs on the
> desktop and would not be looking to maintain backward compatibility
> directly.
>
> My own anecdotal evidence is that I've been using a x86_64 distribution
> (with dual 64bit and 32bit runtime support) for a few years now and have
> found performance to be lacking in my two largest footprint applications
> (my browser and my development IDE totaling 5Gb of footprint between
> them).   I recently converted both these applications from their 64bit
> versions to 32bit (they are the only 32bit applications running) and the
> overall interactive performance has improved considerably possibly due
> to the reduced memory footprint alone, a 4.5 Gb footprint 64bit
> application is equivalent to a 2 Gb footprint 32bit application in these
> application domains.
>
> Maybe someone knows of a published white paper on the implications
> and benefits that a move in this direction would have.  Maybe
> just using the existing 64bit ABI with 32bit void pointers (and longs)
> is as good a specification as any.
>
> RFCs,
>
> Darryl

A more appropriate comparison is probably against MIPS,
with their two 32-bit ABIs (O32 and N32).
Essentially you're asking for an N32 equivalent.

My bet is that most people simply don't care enough about
the performance differential.
-- 
#pragma ident "Seongbae Park, compiler, http://seongbae.blogspot.com";