Re: How to activate instruction scheduling in GCC?

2007-08-01 Thread petruk_gile

Thanks .. your reply is really helpful  ...

Btw, I checked the MIPS backend in mips.c, but I can't find the definitions
of some functions, such as:

get_attr_hazard (), gen_hazard_nop (), etc.

Does anyone know where those functions are defined?




Ian Lance Taylor-3 wrote:
> 
> petruk_gile <[EMAIL PROTECTED]> writes:
> 
>> I'm a pure beginner in GCC, and currently working on a project to implement
>> instruction scheduling for a new DSP processor. This processor doesn't have
>> pipeline interlocks, so the compiler has to schedule the instructions
>> without relying on hardware help anymore.
>>
>> The problem is, I'm very much a beginner in GCC. I think scheduling in GCC
>> is activated by the INSN_SCHEDULING variable (in the automatically generated
>> file insn-attr.h), but I don't even know how to activate this variable.
> 
> INSN_SCHEDULING will automatically be turned on if you have any
> define_insn_reservation clauses in your CPU.md file.  See the
> "Processor pipeline description" documentation in the gcc internals
> manual.
> 
> That said, the gcc scheduler unfortunately does not work very well for
> processors which do not have hardware interlocks.  The scheduler will
> lay out the instructions more or less optimally.  But the scheduler
> has no ability to insert nops when they are required to satisfy
> interlock constraints.
> 
> I know of two workable approaches.  You can either insert the required
> nops in the TARGET_MACHINE_DEPENDENT_REORG pass or in the
> TARGET_ASM_FUNCTION_PROLOGUE hook.  I personally prefer the latter
> approach, as it takes effect after all other instruction rearrangement
> is complete, but there are existing backends which use the former.
> 
> For an example of inserting nops in TARGET_MACHINE_DEPENDENT_REORG,
> see the MIPS backend, specifically mips_avoid_hazards.  For an example
> of inserting nops in TARGET_ASM_FUNCTION_PROLOGUE, see the FRV
> backend, specifically frv_pack_insns.
> 
> Ian
> 
> 
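For anyone following along, the nop-insertion idea Ian describes can be
modelled outside GCC. What follows is only a toy sketch in Python, not GCC
code: it assumes a single-issue, in-order machine where each instruction has
a fixed result latency, and the names Insn, NOP and insert_hazard_nops are
made up for illustration. It pads the instruction stream with nops until
every source register is ready, which is roughly the job a hazard-avoidance
pass has to do when the hardware has no interlocks.

```python
from typing import List, NamedTuple, Optional, Tuple


class Insn(NamedTuple):
    name: str
    dest: Optional[str]      # register written, or None
    srcs: Tuple[str, ...]    # registers read
    latency: int = 1         # cycles until dest may be read


NOP = Insn("nop", None, ())


def insert_hazard_nops(insns: List[Insn]) -> List[Insn]:
    """Return a copy of insns padded with nops so that no instruction
    reads a register before its producer's latency has elapsed."""
    out: List[Insn] = []
    ready_at = {}            # register -> first cycle at which it may be read
    cycle = 0
    for insn in insns:
        # Earliest cycle at which all source operands are available.
        need = max((ready_at.get(r, 0) for r in insn.srcs), default=0)
        while cycle < need:
            out.append(NOP)
            cycle += 1
        out.append(insn)
        if insn.dest is not None:
            ready_at[insn.dest] = cycle + insn.latency
        cycle += 1
    return out
```

With a 3-cycle load feeding an add directly, the sketch emits two nops
between them; if an independent instruction already sits between the two,
only one nop is needed. A real pass would work on RTL insns and take the
latencies from the machine description rather than from a hand-written list.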

-- 
View this message in context: 
http://www.nabble.com/How-to-activate-instruction-scheduling-in-GCC--tf4167590.html#a11940780
Sent from the gcc - Dev mailing list archive at Nabble.com.



GCC 4.2.1 : testsuite says WARNING: program timed out

2007-08-01 Thread Dennis Clarke

Is there a way to allow the testsuite to just run regardless of how long it
takes?

I am getting "program timed out" warnings for multiple tests :

Running /export/home/dclarke/build/gcc-4.2.1/gcc/testsuite/gcc.c-torture/compile/compile.exp ...
WARNING: program timed out.
FAIL: gcc.c-torture/compile/20001226-1.c  -O1  (test for excess errors)
WARNING: program timed out.
FAIL: gcc.c-torture/compile/20001226-1.c  -O2  (test for excess errors)
WARNING: program timed out.
FAIL: gcc.c-torture/compile/20001226-1.c  -O3 -fomit-frame-pointer  (test for excess errors)
WARNING: program timed out.
FAIL: gcc.c-torture/compile/20001226-1.c  -O3 -g  (test for excess errors)
WARNING: program timed out.


-
Dennis Clarke



Re: GCC 4.2.1 : testsuite says WARNING: program timed out

2007-08-01 Thread David Daney

Dennis Clarke wrote:

> Is there a way to allow the testsuite to just run regardless of how long it
> takes?
>
> I am getting "program timed out" warnings for multiple tests :
>
> Running /export/home/dclarke/build/gcc-4.2.1/gcc/testsuite/gcc.c-torture/compile/compile.exp ...
> WARNING: program timed out.
> FAIL: gcc.c-torture/compile/20001226-1.c  -O1  (test for excess errors)
> WARNING: program timed out.
> FAIL: gcc.c-torture/compile/20001226-1.c  -O2  (test for excess errors)
> WARNING: program timed out.
> FAIL: gcc.c-torture/compile/20001226-1.c  -O3 -fomit-frame-pointer  (test for excess errors)
> WARNING: program timed out.
> FAIL: gcc.c-torture/compile/20001226-1.c  -O3 -g  (test for excess errors)
> WARNING: program timed out.

  
You need a faster computer.  Those tests take a long time.  On slow 
systems they take longer than the default testsuite timeout to compile.  
You can probably safely ignore timeouts for 20001226-1.c.


David Daney


Re: GCC 4.2.1 : testsuite says WARNING: program timed out

2007-08-01 Thread Dennis Clarke

> Dennis Clarke wrote:
>> Is there a way to allow the testsuite to just run regardless of how long it
>> takes?
>>
>> I am getting "program timed out" warnings for multiple tests :
>>
>> Running /export/home/dclarke/build/gcc-4.2.1/gcc/testsuite/gcc.c-torture/compile/compile.exp ...
>> WARNING: program timed out.
>> FAIL: gcc.c-torture/compile/20001226-1.c  -O1  (test for excess errors)
>> WARNING: program timed out.
>> FAIL: gcc.c-torture/compile/20001226-1.c  -O2  (test for excess errors)
>> WARNING: program timed out.
>> FAIL: gcc.c-torture/compile/20001226-1.c  -O3 -fomit-frame-pointer  (test for excess errors)
>> WARNING: program timed out.
>> FAIL: gcc.c-torture/compile/20001226-1.c  -O3 -g  (test for excess errors)
>> WARNING: program timed out.
>>
>>
> You need a faster computer.

  Trust me .. I know.  :-)

  I do have access to top of the line Sun gear but I am running this
  experiment and this bootstrap on this machine.

  Thus, the question stands :

Is there a way to allow the testsuite to just run regardless of
how long it takes?


Dennis



Re: How to activate instruction scheduling in GCC?

2007-08-01 Thread petruk_gile

Sorry, no need to bother with my last question any more: I found out that
those functions are (again) generated automatically from the machine
description file.



petruk_gile wrote:
> 
> Thanks .. your reply is really helpful  ...
> 
> Btw, I checked the MIPS backend at MIPS.c, but I can't find the definition
> of some functions such as: 
> 
> get_attr_hazard(), gen_hazard_nop (), etc. 
> 
> Anyone know where those functions defined? 
> 
> 
> 
> 
> Ian Lance Taylor-3 wrote:
>> 
>> petruk_gile <[EMAIL PROTECTED]> writes:
>> 
>>> I'm a pure beginner in GCC, and currently working on a project to implement
>>> instruction scheduling for a new DSP processor. This processor doesn't have
>>> pipeline interlocks, so the compiler has to schedule the instructions
>>> without relying on hardware help anymore.
>>>
>>> The problem is, I'm very much a beginner in GCC. I think scheduling in GCC
>>> is activated by the INSN_SCHEDULING variable (in the automatically generated
>>> file insn-attr.h), but I don't even know how to activate this variable.
>> 
>> INSN_SCHEDULING will automatically be turned on if you have any
>> define_insn_reservation clauses in your CPU.md file.  See the
>> "Processor pipeline description" documentation in the gcc internals
>> manual.
>> 
>> That said, the gcc scheduler unfortunately does not work very well for
>> processors which do not have hardware interlocks.  The scheduler will
>> lay out the instructions more or less optimally.  But the scheduler
>> has no ability to insert nops when they are required to satisfy
>> interlock constraints.
>> 
>> I know of two workable approaches.  You can either insert the required
>> nops in the TARGET_MACHINE_DEPENDENT_REORG pass or in the
>> TARGET_ASM_FUNCTION_PROLOGUE hook.  I personally prefer the latter
>> approach, as it takes effect after all other instruction rearrangement
>> is complete, but there are existing backends which use the former.
>> 
>> For an example of inserting nops in TARGET_MACHINE_DEPENDENT_REORG,
>> see the MIPS backend, specifically mips_avoid_hazards.  For an example
>> of inserting nops in TARGET_ASM_FUNCTION_PROLOGUE, see the FRV
>> backend, specifically frv_pack_insns.
>> 
>> Ian
>> 
>> 
> 
> 




RE: GCC 4.2.1 : testsuite says WARNING: program timed out

2007-08-01 Thread Rupert Wood
Dennis Clarke wrote:

>Is there a way to allow the testsuite to just run regardless of
>how long it takes?

I think you need to pass "set timeout -1" into dejagnu. I'd suggest a larger 
positive timeout instead.

I forget the correct way to do this - I used to end up editing the .exp files 
in /usr/share/dejagnu.

Rup.





Re: [RFC] Improve Tree-SSA if-conversion - convergence of efforts

2007-08-01 Thread Tehila Meyzels
"Daniel Berlin" <[EMAIL PROTECTED]> wrote on 31/07/2007 18:00:57:

>
> I agree with you for conditional stores/loads.

Great!

>
> The unconditional store/load stuff, however, is exactly what
> tree-ssa-sink was meant to do, and belongs there (this is #3 above).
> I'm certainly going to fight tooth and nail against trying to shoehorn
> unconditional store sinking into if-conv.

Sometimes, store sinking can cause performance degradations.
One reason for that is increased register pressure, due to the extended live
ranges of registers.
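
To make the register-pressure point concrete, here is a toy model in Python
(not GCC code). It assumes a value is live from its definition up to its
last use, and counts how many values are live between adjacent
instructions. The function name peak_pressure and the two instruction
sequences are invented for illustration only.

```python
def peak_pressure(insns):
    """insns is a list of (defs, uses) tuples of value names.
    A value is live from its definition up to its last use; return the
    largest number of values live between two adjacent instructions."""
    def_at, last_use = {}, {}
    for i, (defs, uses) in enumerate(insns):
        for v in defs:
            def_at.setdefault(v, i)
        for v in uses:
            last_use[v] = i
    peak = 0
    for i in range(len(insns) - 1):
        live = sum(1 for v in def_at
                   if def_at[v] <= i < last_use.get(v, def_at[v]))
        peak = max(peak, live)
    return peak


# Store placed right after the stored value is computed.
early_store = [
    (("a",), ()),        # a = ...
    ((), ("a",)),        # *p = a
    (("b",), ()),        # b = ...
    (("c",), ()),        # c = ...
    ((), ("b", "c")),    # if (b < c) ...
]

# Same code with the store sunk below the compare-and-branch.
sunk_store = [
    (("a",), ()),        # a = ...
    (("b",), ()),        # b = ...
    (("c",), ()),        # c = ...
    ((), ("b", "c")),    # if (b < c) ...
    ((), ("a",)),        # *p = a
]
```

In this model peak_pressure(early_store) is 2 and peak_pressure(sunk_store)
is 3, because "a" has to stay in a register across the definitions of "b"
and "c" once the store is moved down.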

In addition, when a store is followed by a branch, the result of store
sinking is a branch followed by a store.
On some architectures, the former can be executed in parallel, as opposed
to the latter.
Thus, in this case, store sinking is worth doing only when it helps
if-conversion get rid of the branch.

How do you suggest solving this problem if store sinking becomes part of
the tree-sink pass?

Another point, what about (unconditional) load hoisting:
It's surely not related to sink pass, right?

Tehila.



creating low gimple code for gimplify_omp_atomic_pipeline

2007-08-01 Thread Razya Ladelsky
Hi,

In order to generate code for omp_atomic, I use force_gimple_operand, which
calls gimplify_omp_atomic; in some cases it calls
gimplify_omp_atomic_pipeline, which expands the atomic operation into a loop
(implementing it with an atomic compare-and-swap primitive).
However, the cond_expr that is generated is structured, and needs to be
lowered.

Any suggestions on how to create low gimple code for the
gimplify_omp_atomic_pipeline cases?

Thanks,
Razya 


Re: AMD64 ABI compatibility

2007-08-01 Thread Kai Tietz
Hi Jan,

Jan Hubicka wrote on 31.07.2007 23:40:40:

> > Hi Kai,
> > 
> > so, could you resolve the remaining issues? Or have you kind of 
> > paused the project?
> > 
> > Cheers,
> > Nicolas
> > 
> > 
> > On Jul 12, 2007, at 2:14 , Kai Tietz wrote:
> > 
> > >Hi,
> > >
> > >I am nearly through :) The remaining macros left to be ported are
> > >REGPARM_MAX and SSE_REGPARM_MAX. The sysv_abi uses 6 regs and 8 sses,
> > >ms_abi uses 4 regs and 4 sse registers. The problem is for example
> > >the use in i386.md of SSE_REGPARM_MAX without any hint, how to choose
> > >the required abi. Do you have an idea how this could be done ?
> 
> This should not be difficult - ix86_regparm is used in
> ix86_function_regparm, init_cumulative_args, setup_incoming_varargs_64
> functions.  In all those cases you know the function declaration and
> thus you can take a look if it is call to different ABI and overwrite
> the value.
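
As a rough illustration of that suggestion (a Python sketch only, not the
actual i386 backend code): once the function declaration is at hand, the
register limits can be looked up per ABI instead of coming from one global
macro. The register counts are the ones quoted above; the function name
regparm_limits and the attribute spellings "ms_abi" and "sysv_abi" are
assumptions made for this example.

```python
# (integer registers, SSE registers) available for argument passing,
# using the counts quoted in this thread.
ABI_LIMITS = {"sysv": (6, 8), "ms": (4, 4)}


def regparm_limits(fn_attrs, default_abi="sysv"):
    """Pick the limits for one function.  fn_attrs is the set of attribute
    names on its declaration; an explicit ABI attribute (hypothetical
    spelling) overrides the target's default ABI."""
    if "ms_abi" in fn_attrs:
        abi = "ms"
    elif "sysv_abi" in fn_attrs:
        abi = "sysv"
    else:
        abi = default_abi
    return ABI_LIMITS[abi]
```

The point of the design is that every caller passes the declaration (here
reduced to its attribute set), so a call into a function using the other
ABI gets that ABI's limits.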

Ok, here is my update.

Cheers,
 i.A. Kai Tietz




Index: gcc/gcc/calls.c
===
--- gcc.orig/gcc/calls.c
+++ gcc/gcc/calls.c
@@ -1187,6 +1187,7 @@ initialize_argument_information (int num
 static int
 compute_argument_block_size (int reg_parm_stack_space,
 struct args_size *args_size,
+tree fndecl, // 
 int preferred_stack_boundary ATTRIBUTE_UNUSED)
 {
   int unadjusted_args_size = args_size->constant;
@@ -1224,7 +1225,7 @@ compute_argument_block_size (int reg_par
 
  /* The area corresponding to register parameters is not to count in
 the size of the block we need.  So make the adjustment.  */
- if (!OUTGOING_REG_PARM_STACK_SPACE)
+ if (!OUTGOING_REG_PARM_STACK_SPACE (fndecl))
args_size->var
  = size_binop (MINUS_EXPR, args_size->var,
ssize_int (reg_parm_stack_space));
@@ -1245,7 +1246,7 @@ compute_argument_block_size (int reg_par
   args_size->constant = MAX (args_size->constant,
 reg_parm_stack_space);
 
-  if (!OUTGOING_REG_PARM_STACK_SPACE)
+  if (!OUTGOING_REG_PARM_STACK_SPACE (fndecl))
args_size->constant -= reg_parm_stack_space;
 }
   return unadjusted_args_size;
@@ -2036,7 +2037,7 @@ expand_call (tree exp, rtx target, int i
   reg_parm_stack_space = REG_PARM_STACK_SPACE (fndecl);
 #endif
 
-  if (!OUTGOING_REG_PARM_STACK_SPACE && reg_parm_stack_space > 0 && PUSH_ARGS)
+  if (!OUTGOING_REG_PARM_STACK_SPACE (fndecl) && reg_parm_stack_space > 0 && PUSH_ARGS)
 must_preallocate = 1;
 
   /* Set up a place to return a structure.  */
@@ -2442,7 +2443,7 @@ expand_call (tree exp, rtx target, int i
  /* Since we will be writing into the entire argument area,
 the map must be allocated for its entire size, not just
 the part that is the responsibility of the caller.  */
- if (!OUTGOING_REG_PARM_STACK_SPACE)
+ if (!OUTGOING_REG_PARM_STACK_SPACE (fndecl))
needed += reg_parm_stack_space;
 
 #ifdef ARGS_GROW_DOWNWARD
@@ -2541,7 +2542,7 @@ expand_call (tree exp, rtx target, int i
{
  rtx push_size
= GEN_INT (adjusted_args_size.constant
-  + (OUTGOING_REG_PARM_STACK_SPACE ? 0
+  + (OUTGOING_REG_PARM_STACK_SPACE (fndecl) ? 0
  : reg_parm_stack_space));
  if (old_stack_level == 0)
{
@@ -2712,7 +2713,7 @@ expand_call (tree exp, rtx target, int i
   /* If register arguments require space on the stack and stack space
 was not preallocated, allocate stack space here for arguments
 passed in registers.  */
-  if (OUTGOING_REG_PARM_STACK_SPACE && !ACCUMULATE_OUTGOING_ARGS
+  if (OUTGOING_REG_PARM_STACK_SPACE (fndecl) && !ACCUMULATE_OUTGOING_ARGS
  && must_preallocate == 0 && reg_parm_stack_space > 0)
anti_adjust_stack (GEN_INT (reg_parm_stack_space));
 
@@ -3537,7 +3538,7 @@ emit_library_call_value_1 (int retval, r
   args_size.constant = MAX (args_size.constant,
reg_parm_stack_space);
 
-  if (!OUTGOING_REG

Re: GCC 4.2.1 : testsuite says WARNING: program timed out

2007-08-01 Thread Christian Joensson
2007/8/1, Rupert Wood <[EMAIL PROTECTED]>:
> Dennis Clarke wrote:
>
> >Is there a way to allow the testsuite to just run regardless of
> >how long it takes?
>
> I think you need to pass "set timeout -1" into dejagnu. I'd suggest a larger 
> positive timeout instead.
>
> I forget the correct way to do this - I used to end up editing the .exp files 
> in /usr/share/dejagnu.

That's right. However, I recall some issues with, e.g., the libstdc++
testsuite not using the system setting in, if memory serves me right,
remote.exp.

-- 
Cheers,

/ChJ


Re: RFC: RTL sharing between decls and instructions

2007-08-01 Thread Ian Lance Taylor
Richard Sandiford <[EMAIL PROTECTED]> writes:

> gcc/
>   * emit-rtl.c (reset_used_decls): Rename to...
>   (set_used_decls): ...this.  Set the used flag rather than clearing it.
>   (unshare_all_rtl_again): Update accordingly.  Set flags on argument
>   DECL_RTLs rather than resetting them.

This is OK if it passes testing.  Your argument sounds right to me.

Thanks.

Ian


Re: GCC 4.2.1 : testsuite says WARNING: program timed out

2007-08-01 Thread Rask Ingemann Lambertsen
On Wed, Aug 01, 2007 at 03:57:19AM -0400, Dennis Clarke wrote:

> WARNING: program timed out.
> FAIL: gcc.c-torture/compile/20001226-1.c  -O1  (test for excess errors)

   It's in the archives:
<http://gcc.gnu.org/ml/gcc/2006-09/msg00155.html>

-- 
Rask Ingemann Lambertsen


ICE on valid code, cse related

2007-08-01 Thread Pranav Bhandarkar
Hi,
I am working on a private port and getting an ICE on valid code. This
is mainly because of the following (which is part of the RTL dump of
the source file):

(insn 13 8 14 2 /fc3/testcases/reduce/testcase-min.i:8 (set (reg:SI 138)
(const_int 0 [0x0])) 44 {*movsi} (expr_list:REG_LIBCALL_ID
(const_int 0 [0x0])
(nil)))

(insn 14 13 15 2 /fc3/testcases/reduce/testcase-min.i:8 (set (reg:SI 1 $c1)
(reg/f:SI 112 *fp*)) 44 {*movsi} (expr_list:REG_LIBCALL_ID
(const_int 1 [0x1])
(insn_list:REG_LIBCALL 17 (nil))))

(insn 15 14 16 2 /fc3/testcases/reduce/testcase-min.i:8 (set (reg:SI 2 $c2)
(reg:SI 138)) 44 {*movsi} (expr_list:REG_LIBCALL_ID (const_int 1 [0x1])
(nil)))

(call_insn 16 15 18 2 /fc3/testcases/reduce/testcase-min.i:8 (parallel [
(call (mem:SI (symbol_ref:SI ("__floatsisf") [flags 0x41])
[0 S4 A32])
(const_int 0 [0x0]))
(use (const_int 0 [0x0]))
(clobber (reg:SI 31 $link))
]) 41 {*call_direct} (expr_list:REG_LIBCALL_ID (const_int 1 [0x1])
(expr_list:REG_EH_REGION (const_int -1 [0x])
(nil)))
(expr_list:REG_DEP_TRUE (use (reg:SI 2 $c2))
(expr_list:REG_DEP_TRUE (use (reg:SI 1 $c1))
(nil))))

(insn 18 16 17 2 /fc3/testcases/reduce/testcase-min.i:8 (clobber
(reg:SF 139)) -1 (expr_list:REG_LIBCALL_ID (const_int 1 [0x1])
(nil)))

(insn 17 18 19 2 /fc3/testcases/reduce/testcase-min.i:8 (set
(subreg:SI (reg:SF 139) 0)
(mem/c/i:SI (reg/f:SI 112 *fp*) [2 S4 A32])) 44 {*movsi}
(expr_list:REG_LIBCALL_ID (const_int 1 [0x1])
(insn_list:REG_RETVAL 14 (expr_list:REG_EQUAL (float:SF (reg:SI 138))
(nil)))))


Note the REG_EQUAL note of insn 17. cse tries to replace reg:SI 138
with a constant and because of insn 13, the note becomes (float:SF
(const_int 0)) which in turn cse converts into

REG_EQUAL (const_double:SF 0 [0x0] 0.0 [0x0.0p+0])

and when CONST_DOUBLE_LOW is done on the above, the compiler crashes -

" internal compiler error: RTL check: expected code 'const_double' and
mode 'VOID', have code 'const_double' and mode 'SF' in plus_constant,
at explow.c:103"

i.e., the compiler crashes after converting a const_int to an SFmode value.

Could this be a generic issue, or is it a problem with my backend (as in,
will I need to define movsf in my backend, which isn't defined at
present)?

Regret the rather verbose post.

Thanks in advance,
Pranav


Re: creating low gimple code for gimplify_omp_atomic_pipeline

2007-08-01 Thread Diego Novillo
On 8/1/07 8:07 AM, Razya Ladelsky wrote:

> Any suggestions on how to create low gimple code for 
> gimplify_omp_atomic_pipeline
> cases?

Interesting.  I think it's the first time we run into this problem.  I
don't see force_gimple_operand trying to emit low GIMPLE.  But we always
use it from the optimizers, so it should.

You cannot force the omp_atomic gimplifiers to emit low GIMPLE as those
are called by the GENERIC->GIMPLE conversion.  The easiest way to fix
this, I think, is to call lower_stmt() from force_gimple_operand() after
the call to gimplify_expr.  For this you'll need to set up a stmt
iterator on the resulting list of statements from gimplify_expr and call
lower_stmt on each of them (this should be implemented in gimple-low.c).

Longer term, I think we need to have an indicator of what level of
GIMPLE the function is in.  This way the various helpers like
force_gimple_operand can decide what to do.



RE: GCC 4.2.1 : testsuite says WARNING: program timed out

2007-08-01 Thread Dennis Clarke

> Dennis Clarke wrote:
>
>>Is there a way to allow the testsuite to just run regardless of
>>how long it takes?
>
> I think you need to pass "set timeout -1" into dejagnu. I'd suggest a larger
> positive timeout instead.
>
> I forget the correct way to do this - I used to end up editing the .exp
> files in /usr/share/dejagnu.

okay .. that sounds like a good hint.

Well .. the files in the default share/dejagnu directory look like so :


$ ls
baseboards     framework.exp  mondfe.exp   standard.exp   testglue.c
config         ftp.exp        remote.exp   stub-loader.c  tip.exp
debugger.exp   kermit.exp     rlogin.exp   target.exp     util-defs.exp
dejagnu.exp    libexec        rsh.exp      targetdb.exp   utils.exp
dg.exp         libgloss.exp   runtest.exp  telnet.exp     xsh.exp
$


somehow .. that can not be right.  Let's look in the GCC 4.2.1 objdir area
for files that end with .exp :

$ cd gcc-4.2.1-build
$ find . -type f | grep "\.exp"
./gcc/testsuite/gcc/site.exp
./gcc/site.exp
$

okay .. now we are getting somewhere.  Maybe :-\

$ cat ./gcc/testsuite/gcc/site.exp
## these variables are automatically generated by make ##
# Do not edit here. If you wish to override these values
# add them to the last section
set rootme "/opt/build/gcc-4.2.1-build/gcc"
set srcdir "/export/home/dclarke/build/gcc-4.2.1/gcc"
set host_triplet sparc-sun-solaris2.8
set build_triplet sparc-sun-solaris2.8
set target_triplet sparc-sun-solaris2.8
set target_alias sparc-sun-solaris2.8
set libiconv "/export/home/dclarke/local/lib/libiconv.so -R/export/home/dclarke/local/lib"
set CFLAGS ""
set CXXFLAGS ""
set HOSTCC "cc"
set HOSTCFLAGS "-g"
set TESTING_IN_BUILD_TREE 1
set HAVE_LIBSTDCXX_V3 1
set tmpdir /opt/build/gcc-4.2.1-build/gcc/testsuite/gcc
set srcdir "${srcdir}/testsuite"
## All variables above are generated by configure. Do Not Edit ##
$

there is not much there that looks helpful ... and both those files look to
be the same :

$ ls -li ./gcc/testsuite/gcc/site.exp ./gcc/site.exp
 263449 -rw-r--r--   1 dclarke  csw  759 Jul 31 20:58 ./gcc/site.exp
1282849 -rw-r--r--   1 dclarke  csw  763 Jul 31 20:58 ./gcc/testsuite/gcc/site.exp

$ diff ./gcc/testsuite/gcc/site.exp ./gcc/site.exp
17c17
< set tmpdir /opt/build/gcc-4.2.1-build/gcc/testsuite/gcc
---
> set tmpdir /opt/build/gcc-4.2.1-build/gcc/testsuite
$

great ... so then ... perhaps I do have to go back to the exp files in the
default dejagnu area ?

oh to heck with this ... perhaps I can tar up the whole objdir and move it
over to a 1.6GHz UltraSparc box and test it there .. but that defeats the
purpose.

Thanks for trying

Dennis


Re: [RFC] Improve Tree-SSA if-conversion - convergence of efforts

2007-08-01 Thread Daniel Berlin
On 8/1/07, Tehila Meyzels <[EMAIL PROTECTED]> wrote:
> "Daniel Berlin" <[EMAIL PROTECTED]> wrote on 31/07/2007 18:00:57:
>
> >
> > I agree with you for conditional stores/loads.
>
> Great!
>
> >
> > The unconditional store/load stuff, however, is exactly what
> > tree-ssa-sink was meant to do, and belongs there (this is #3 above).
> > I'm certainly going to fight tooth and nail against trying to shoehorn
> > unconditional store sinking into if-conv.
>
> Sometimes, store-sinking can cause performance degradations.
> One reason for that, is increasing register pressure, due to extending life
> range of registers.
>
> In addition, in case we have a store followed by a branch, store sinking
> result will be a branch followed by a store.
> On some architectures, the former can be executed in parallel, as opposed
> to the latter.
> Thus, in this case, it worth executing store-sinking only when it helps the
> if-conversion to get rid of the branch.
>

> How do you suggest to solve this problem, in case store-sinking will be
> part of the tree-sink pass?
>
Store sinking already *is* part of the tree-sink pass. It just only
sinks a small number of stores.
The solution to the problem that "sometimes you make things harder for
the target" is to fix that in the backend.  In this case, the
scheduler will take care of it.

All of our middle end optimizations will sometimes have bad effects
unless the backend fixes it up.  Trying to guess what is going to
happen 55 passes down the line is a bad idea unless you happen to be a
very good psychic.

As a general rule of thumb, we are happy to make the backend as target
specific and ask as many target questions as you like.  The middle
end, not so much.  There are very few passes in the middle end that
can/should/do ask anything about the target.  Store sinking is not one
of them, and I see no good reason it should be.

> Another point, what about (unconditional) load hoisting:
> It's surely not related to sink pass, right?
>
PRE already will hoist unconditional loads out of loops, and in places
where it will eliminate redundancy.

It could also hoist loads in non-redundancy situations; it is simply
the case that its current heuristic does not think this is a good
idea.

Thus, if you wanted to do unconditional load hoisting, the thing to do
is to make a function like do_regular_insertion in tree-ssa-pre.c, and
call it from insert_aux.

We already have another heuristic for partially antic fully available
expressions; see do_partial_partial_insertion.


The Linux binutils 2.17.50.0.18 is released

2007-08-01 Thread H.J. Lu
This is the beta release of binutils 2.17.50.0.18 for Linux, which is
based on binutils 2007 0731 in CVS on sourceware.org plus various
changes. It is purely for Linux.

All relevant patches in patches have been applied to the source tree.
You can take a look at patches/README to see what has been applied and
in what order the patches have been applied.

Starting from the 2.17.50.0.4 release, the default output section LMA
(load memory address) has changed for allocatable sections from being
equal to VMA (virtual memory address), to keeping the difference between
LMA and VMA the same as the previous output section in the same region.

For

.data.init_task : { *(.data.init_task) }

LMA of .data.init_task section is equal to its VMA with the old linker.
With the new linker, it depends on the previous output section. You
can use

.data.init_task : AT (ADDR(.data.init_task)) { *(.data.init_task) }

to ensure that LMA of .data.init_task section is always equal to its
VMA. The linker script in the older 2.6 x86-64 kernel depends on the
old behavior.  You can add AT (ADDR(section)) to force LMA of
.data.init_task section equal to its VMA. It will work with both old
and new linkers. The x86-64 kernel linker script in kernel 2.6.13 and
above is OK.
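
The LMA rule change can be worked through with a small model. This is a
simplified Python sketch, not the linker's actual algorithm: it ignores
memory regions and non-allocatable sections, the function name assign_lma is
invented, and the addresses are made up for illustration. It assumes that
the old linker sets LMA equal to VMA unless AT() is given, and that the new
linker carries over the LMA - VMA difference from the previous output
section.

```python
def assign_lma(sections, new_linker):
    """sections: list of (name, vma, at), where at is an explicit AT()
    load address or None.  Returns a dict mapping name to LMA."""
    lma_of = {}
    delta = 0                      # LMA - VMA of the previous output section
    for name, vma, at in sections:
        if at is not None:
            lma = at               # explicit AT() always wins
        elif new_linker:
            lma = vma + delta      # keep the previous section's offset
        else:
            lma = vma              # old default: LMA equals VMA
        lma_of[name] = lma
        delta = lma - vma
    return lma_of


KERNEL_VMA = 0xffffffff80000000    # made-up base address

layout = [
    (".text", KERNEL_VMA + 0x100000, 0x100000),
    (".data.init_task", KERNEL_VMA + 0x200000, None),
]

# Same layout with AT (ADDR(.data.init_task)) spelled out.
pinned = [
    (".text", KERNEL_VMA + 0x100000, 0x100000),
    (".data.init_task", KERNEL_VMA + 0x200000, KERNEL_VMA + 0x200000),
]
```

For layout, the old rule gives .data.init_task an LMA equal to its VMA,
while the new rule gives 0x200000. For pinned, both rules give the VMA,
which is why the explicit AT (ADDR(section)) form works with old and new
linkers alike.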

The new x86_64 assembler no longer accepts

monitor %eax,%ecx,%edx

You should use

monitor %rax,%ecx,%edx

or

monitor

which works with both old and new x86_64 assemblers. They should
generate the same opcode.

The new i386/x86_64 assemblers no longer accept instructions for moving
between a segment register and a 32bit memory location, i.e.,

movl (%eax),%ds
movl %ds,(%eax)

To generate instructions for moving between a segment register and a
16bit memory location without the 16bit operand size prefix, 0x66,

mov (%eax),%ds
mov %ds,(%eax)

should be used. It will work with both new and old assemblers. The
assembler starting from 2.16.90.0.1 will also support

movw (%eax),%ds
movw %ds,(%eax)

without the 0x66 prefix. Patches for 2.4 and 2.6 Linux kernels are
available at

http://www.kernel.org/pub/linux/devel/binutils/linux-2.4-seg-4.patch
http://www.kernel.org/pub/linux/devel/binutils/linux-2.6-seg-5.patch

The ia64 assembler is now defaulted to tune for Itanium 2 processors.
To build a kernel for Itanium 1 processors, you will need to add

ifeq ($(CONFIG_ITANIUM),y)
CFLAGS += -Wa,-mtune=itanium1
AFLAGS += -Wa,-mtune=itanium1
endif

to arch/ia64/Makefile in your kernel source tree.

Please report any bugs related to binutils 2.17.50.0.18 to [EMAIL PROTECTED]

and

http://www.sourceware.org/bugzilla/

Changes from binutils 2.17.50.0.17:

1. Update from binutils 2007 0731.
2. Switching from GPLv2 to GPLv3.
3. Add a new ELF linker option, --build-id, to generate a unique
per-binary identifier embedded in a note section.
4. Remove COFF/x86-64 from PE-COFF/x86-64.
5. Fix a "nm -l" crash on DWARF info. PR 4797.
6. Match symbol type when creating symbol alias in ELF shared library.
7. Fix addr2line on relocatable linux kernel. PR 4756.
8. Change disassembler to print addend as signed.
9. Support section alignment from 128 to 8192 bytes for PE-COFF.
10. Add attribute section to ELF linker.
11. Fix ELF linker to meet gABI alignment requirement. PR 4701.
12. Add support for reading in debug information via a .gnu_debuglink
section.
13. Fix string merge for ia64 linker. PR 4590.
14. Add --common to size to display total size for *COM* syms.
15. Fix "strip --strip-unneeded" on relocatable files. PR 4716.
16. Fix "objcopy/strip --only-keep-debug" for SHT_NOTE sections.
17. Fix objdump -S with unit-at-a-time.
18. Properly handle "-shared -pie" in linker. PR 4409.
19. Fix x86 disassembler in Intel mode for various SIMD instructions.
PRs 4667/4834.
20. Update x86-64 assembler to long nop sequence by default.
21. Fix --32 for x86-64 mingw assembler.
22. Fix a memory corruption in assembler.  PR 4722.
22. Properly support 64bit PE-COFF on hosts where long isn't 64bit.
23. Add #line in generated linker source files.
24. Fix linker crash on SIZEOF.  PR 4782.
27. Add CR16 support.
28. Add windmc tool for Windows.
29. Generate x86 instruction/register definitions from ascii tables.
30. Fix strip for Solaris. PR 4712.
31. Fix various mips bugs.
32. Fix various ppc bugs.
33. Fix various spu bugs.
34. Fix various xtensa bugs.

Changes from binutils 2.17.50.0.16:

1. Update from binutils 2007 0615.
2. Preserve section alignment for copy relocation.  PR 4504.
3. Properly fix regression with objcopy --only-keep-debug.  PR 4479.
4. Fix ELF eh frame handling.  PR 4497.
5. Fix ia64 string merge.  PR 4590.
5. Don't use PE target on EFI files nor EFI target on PE files.
6. Speed up linker with many input files.
7. Support cross compiling windres.  PR 2737.
8. Fix various windres bugs.
9. Fix various arm bugs.
10. Fix various m68k bugs.
11. Fix various mips bugs.
12. Fix various ppc bugs.
13. Fix various sparc bugs.
14

Re: ICE on valid code, cse related

2007-08-01 Thread Ian Lance Taylor
"Pranav Bhandarkar" <[EMAIL PROTECTED]> writes:

> Note the REG_EQUAL note of insn 17. cse tries to replace reg:SI 138
> with a constant and because of insn 13, the note becomes (float:SF
> (const_int 0)) which in turn cse converts into
> 
> REG_EQUAL (const_double:SF 0 [0x0] 0.0 [0x0.0p+0])

That seems OK at first glance.

> and when CONST_DOUBLE_LOW is done on the above, the compiler crashes -
> 
> " internal compiler error: RTL check: expected code 'const_double' and
> mode 'VOID', have code 'const_double' and mode 'SF' in plus_constant,
> at explow.c:103"
> 
> i.e the compiler is crashing after converting a const_int to an SFmode value.

Who is calling CONST_DOUBLE_LOW on this value?

Ian


Re: Workshop on GCC for Research in Embedded and Parallel Systems

2007-08-01 Thread Ayal Zaks
==
  CALL FOR PAPERS - One Final Week Extension
  GREPS '07
Workshop on GCC for Research in Embedded and Parallel Systems
   Brasov, Romania, September 16, 2007
  http://sysrun.haifa.il.ibm.com/hrl/greps2007/
 in conjunction with
   PACT '07
 http://pactconf.org
==

We are honored to have two prominent keynote speakers:

Paul H J Kelly
  Imperial College, London
  Title: GCC in software performance research: just plug in

Benoit Dupont de Dinechin
  ST Microelectronics, Grenoble, France
  Title: GCC for Embedded VLIW Processors: Why Not?

A final extension of one week has been granted: submissions are due August 7,
2007; they are to be 6-12 pages long, for review purposes.
The submission site is http://papers.haifa.il.ibm.com/greps2007/
For more details see http://sysrun.haifa.il.ibm.com/hrl/greps2007/

Early notification of the intention to participate (submit and/or attend)
would be helpful.

Important Dates
Papers due: August 7, 2007
Acceptance notices: August 13, 2007
Workshop date: September 16, 2007



Ayal Zaks/Haifa/IBM wrote on 26/06/2007 18:08:43:

> Acronymed GREPS (... just what you were looking for), is to be held on
> September 16 in Brasov, Romania, co-located with PACT. We'd like to bring this
> workshop to your attention; the submission site is now open until July 24,
> after the upcoming GCC Developers' Summit. For more details see
> http://sysrun.haifa.il.ibm.com/hrl/greps2007/
>
> Thanks, Albert and Ayal.



Re: [tuples] heads up. you need to specify --enable-checking

2007-08-01 Thread Diego Novillo
On 8/1/07 12:37 PM, Diego Novillo wrote:

> So, when configuring the branch make sure you specify --enable-checking.

Oh, never mind.  Andrew pointed out that it's much easier to just modify
version.c as we usually do on branches.  Silly me.

No need to explicitly --enable-checking now.  Apologies for the noise.


Re: ICE on valid code, cse related

2007-08-01 Thread Pranav Bhandarkar
> Who is calling CONST_DOUBLE_LOW on this value?
plus_constant calls CONST_DOUBLE_LOW on this value.

simplify_binary_operation_1 calls plus_constant ( while trying to
simplify PLUS on (const_double:SF 0 [0x0] 0.0 [0x0.0p+0]) & (const_int
-2147483648 [0x8000]) ), which in turn calls CONST_DOUBLE_LOW.

Thanks,
Pranav


[tuples] heads up. you need to specify --enable-checking

2007-08-01 Thread Diego Novillo

I just got tricked by my change to DEV-PHASE.  Since the branch no
longer says 'experimental' but it specifies the branch name and the
mainline merge revision number, configure is defaulting to
--enable-checking=release.

So, when configuring the branch make sure you specify --enable-checking.


Re: [RFC] Improve Tree-SSA if-conversion - convergence of efforts

2007-08-01 Thread Ayal Zaks
"Daniel Berlin" <[EMAIL PROTECTED]> wrote on 01/08/2007 18:27:35:

> On 8/1/07, Tehila Meyzels <[EMAIL PROTECTED]> wrote:
> > "Daniel Berlin" <[EMAIL PROTECTED]> wrote on 31/07/2007 18:00:57:
> >
> > >
> > > I agree with you for conditional stores/loads.
> >
> > Great!
> >
> > >
> > > The unconditional store/load stuff, however, is exactly what
> > > tree-ssa-sink was meant to do, and belongs there (this is #3 above).
> > > I'm certainly going to fight tooth and nail against trying to
> > > shoehorn unconditional store sinking into if-conv.
> >
> > Sometimes, store-sinking can cause performance degradations.
> > One reason for that is increased register pressure, due to extending
> > the live range of registers.
> >
> > In addition, in case we have a store followed by a branch, the result
> > of store sinking will be a branch followed by a store.
> > On some architectures, the former can be executed in parallel, as
> > opposed to the latter.
> > Thus, in this case, it is worth executing store-sinking only when it
> > helps the if-conversion to get rid of the branch.
> >
>
> > How do you suggest to solve this problem, in case store-sinking will be
> > part of the tree-sink pass?
> >
> Store sinking already *is* part of the tree-sink pass. It just only
> sinks a small number of stores.
> The solution to the problem that "sometimes you make things harder for
> the target" is to fix that in the backend.  In this case, the
> scheduler will take care of it.
>
> All of our middle end optimizations will sometimes have bad effects
> unless the backend fixes it up.  Trying to guess what is going to
> happen 55 passes down the line is a bad idea unless you happen to be a
> very good psychic.
>
> As a general rule of thumb, we are happy to make the backend as target
> specific and ask as many target questions as you like.  The middle
> end, not so much.  There are very few passes in the middle end that
> can/should/do ask anything about the target.  Store sinking is not one
> of them, and I see no good reason it should be.
>
> > Another point, what about (unconditional) load hoisting:
> > It's surely not related to sink pass, right?
> >
> PRE already will hoist unconditional loads out of loops, and in places
> where it will eliminate redundancy.
>
> It could also hoist loads in non-redundancy situations; it is simply
> the case that its current heuristic does not think this is a good
> idea.
>

Hoisting a non-redundant load speculatively above an if may indeed be a bad
idea, unless that if gets converted as a result (and possibly even then
...).  Are we in agreement then that unconditional load/store motion for
the sake of redundancy elimination continues to belong to PRE/tree-sink,
and that conditional load/store motion for the sake of conditional-branch
elimination better be coordinated by if-cvt?

Ayal.

> Thus, if you wanted to do unconditional load hoisting, the thing to do
> is to make a function like do_regular_insertion in tree-ssa-pre.c, and
> call it from insert_aux.
>
> We already have another heuristic for partially antic fully available
> expressions, see do_partial_partial_insertion



Re: printing cfg

2007-08-01 Thread Diego Novillo
On 8/1/07 3:03 PM, Bob Rossi wrote:

> Is there a way to make it show the actual expressions in the code
> instead?

Other than changing the code in tree-cfg.c:tree_cfg2vcg(), not really.
Also, this dump is fairly static in that it only happens right after the
CFG is built for the first time (before any optimizations).

> Also, is there a native way to display this information using
> dot instead?

Perhaps it would be easier to post-process the dumps that contain basic
block information (-fdump-tree-all-blocks).  I generally use the
attached script to get the CFG out of an arbitrary pass.  It's very
simplistic, but it could be adapted to do what you want.
#!/bin/sh
#
# (C) 2005 Free Software Foundation
# Contributed by Diego Novillo <[EMAIL PROTECTED]>.
#
# This script is Free Software, and it can be copied, distributed and
# modified as defined in the GNU General Public License.  A copy of
# its license can be downloaded from http://www.gnu.org/copyleft/gpl.html

if [ "$1" = "" ] ; then
    echo "usage: $0 file"
    echo
    echo "Generates a GraphViz .dot graph file from 'file'."
    echo "It assumes that 'file' has been generated with -fdump-tree-...-blocks"
    echo
    exit 1
fi

file=$1
out=$file.dot
echo "digraph cfg {"        > "$out"
echo "  node [shape=box]"   >> "$out"
echo '  size="11,8.5"'      >> "$out"
echo                        >> "$out"
# Keep only the block/edge comment lines, strip "[prob%]" and "(flags)",
# then emit one node per BLOCK line and one edge per predecessor.
(grep -E '# BLOCK|# PRED:|# SUCC:' "$file" |                            \
    sed -e 's:\[\([0-9\.%]*\)*\]::g;s:([a-z_,]*)::g' |                  \
    awk '{  #print $0;                                                  \
            if ($2 == "BLOCK")                                          \
                {                                                       \
                    bb = $3;                                            \
                    print "\t", bb, "[label=\"", bb, "\", style=filled, color=gray]";   \
                }                                                       \
            else if ($2 == "PRED:")                                     \
                {                                                       \
                    for (i = 3; i <= NF; i++)                           \
                        print "\t", $i, "->", bb, ";";                  \
                }                                                       \
        }') >> "$out"
echo "}"                    >> "$out"
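
For anyone who would rather not depend on grep/sed/awk, the same
transformation can be sketched in Python.  This is only a sketch and not
part of the original attachment: the function name cfg_to_dot is made up
here, and the "# BLOCK" / "# PRED:" line layout is assumed from what the
shell script greps for, not taken from GCC documentation.

```python
import re
import sys

# Strip "[50.0%]"-style probabilities and "(true,exec)"-style edge flags,
# mirroring the two sed expressions in the shell script.
_PROB = re.compile(r"\[[0-9.%]*\]")
_FLAGS = re.compile(r"\([a-z_,]*\)")


def cfg_to_dot(dump_text):
    """Return GraphViz source built from '# BLOCK' and '# PRED:' lines.

    Hypothetical helper: the dump layout is an assumption (see above).
    """
    out = ["digraph cfg {", "  node [shape=box]", '  size="11,8.5"', ""]
    bb = None  # name of the block whose header was seen most recently
    for line in dump_text.splitlines():
        fields = _FLAGS.sub("", _PROB.sub("", line)).split()
        if len(fields) < 3 or fields[0] != "#":
            continue
        if fields[1] == "BLOCK":
            bb = fields[2]
            out.append(f'\t"{bb}" [label="{bb}", style=filled, color=gray];')
        elif fields[1] == "PRED:" and bb is not None:
            # One edge per predecessor; "# SUCC:" lines are ignored, since
            # every edge already shows up as a PRED of its destination.
            for pred in fields[2:]:
                out.append(f'\t"{pred}" -> "{bb}";')
    out.append("}")
    return "\n".join(out) + "\n"


if __name__ == "__main__":
    with open(sys.argv[1]) as f:
        sys.stdout.write(cfg_to_dot(f.read()))
```

Unlike the shell version, this writes the graph to standard output
instead of file.dot, and it quotes node names so that ENTRY/EXIT and
numbered blocks are handled the same way by dot.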


printing cfg

2007-08-01 Thread Bob Rossi
Hi,

I'm trying to print the cfg so that I can visualize it. I have a simple
file,
  $ cat foo.c
  int
  foo (int param)
  {
    param++;
    if (param)
      param++;
    return param;
  }

I run the command,
  $ gcc -fdump-tree-vcg-blocks -c foo.c
and then I run,
  xvcg *.vcg
which displays a picture of the cfg. It appears for some reason, that
the expressions in the basic blocks just show things like,
  modify_expr (4)
  cond_expr (5)

Is there a way to make it show the actual expressions in the code
instead? Also, is there a native way to display this information using
dot instead?

Thanks,
Bob Rossi


Re: ICE on valid code, cse related

2007-08-01 Thread Ian Lance Taylor
"Pranav Bhandarkar" <[EMAIL PROTECTED]> writes:

> > Who is calling CONST_DOUBLE_LOW on this value?
> plus_constant calls CONST_DOUBLE_LOW on this value.
> 
> simplify_binary_operation_1 calls plus_constant ( while trying to
> simplify PLUS on (const_double:SF 0 [0x0] 0.0 [0x0.0p+0]) & (const_int
> -2147483648 [0x8000]) ), which in turn calls CONST_DOUBLE_LOW.

How can we have a PLUS on a CONST_DOUBLE and a CONST_INT?  That does
not make sense, as there is no MODE argument that could make this work
correctly.  From your description, MODE must be some integer mode, in
which case it is wrong to be using a CONST_DOUBLE in SFmode.

(I don't know where the bug is; I'm just trying to help pin it down.)

Ian


Re: [RFC] Improve Tree-SSA if-conversion - convergence of efforts

2007-08-01 Thread Daniel Berlin
On 8/1/07, Ayal Zaks <[EMAIL PROTECTED]> wrote:
> "Daniel Berlin" <[EMAIL PROTECTED]> wrote on 01/08/2007 18:27:35:
>
> > On 8/1/07, Tehila Meyzels <[EMAIL PROTECTED]> wrote:
> > > "Daniel Berlin" <[EMAIL PROTECTED]> wrote on 31/07/2007 18:00:57:
> > >
> > > >
> > > > I agree with you for conditional stores/loads.
> > >
> > > Great!
> > >
> > > >
> > > > The unconditional store/load stuff, however, is exactly what
> > > > tree-ssa-sink was meant to do, and belongs there (this is #3 above).
> > > > I'm certainly going to fight tooth and nail against trying to
> > > > shoehorn unconditional store sinking into if-conv.
> > >
> > > Sometimes, store-sinking can cause performance degradations.
> > > One reason for that is increased register pressure, due to extending
> > > the live range of registers.
> > >
> > > In addition, in case we have a store followed by a branch, the result
> > > of store sinking will be a branch followed by a store.
> > > On some architectures, the former can be executed in parallel, as
> > > opposed to the latter.
> > > Thus, in this case, it is worth executing store-sinking only when it
> > > helps the if-conversion to get rid of the branch.
> > >
> >
> > > How do you suggest to solve this problem, in case store-sinking will be
> > > part of the tree-sink pass?
> > >
> > Store sinking already *is* part of the tree-sink pass. It just only
> > sinks a small number of stores.
> > The solution to the problem that "sometimes you make things harder for
> > the target" is to fix that in the backend.  In this case, the
> > scheduler will take care of it.
> >
> > All of our middle end optimizations will sometimes have bad effects
> > unless the backend fixes it up.  Trying to guess what is going to
> > happen 55 passes down the line is a bad idea unless you happen to be a
> > very good psychic.
> >
> > As a general rule of thumb, we are happy to make the backend as target
> > specific and ask as many target questions as you like.  The middle
> > end, not so much.  There are very few passes in the middle end that
> > can/should/do ask anything about the target.  Store sinking is not one
> > of them, and I see no good reason it should be.
> >
> > > Another point, what about (unconditional) load hoisting:
> > > It's surely not related to sink pass, right?
> > >
> > PRE already will hoist unconditional loads out of loops, and in places
> > where it will eliminate redundancy.
> >
> > It could also hoist loads in non-redundancy situations; it is simply
> > the case that its current heuristic does not think this is a good
> > idea.
> >
>
> Hoisting a non-redundant load speculatively above an if may indeed be a bad
> idea, unless that if gets converted as a result (and possibly even then
> ...).  Are we in agreement then that unconditional load/store motion for
> the sake of redundancy elimination continues to belong to PRE/tree-sink,
> and that conditional load/store motion for the sake of conditional-branch
> elimination better be coordinated by if-cvt?
>

Yes.
My only issue here is duplication of code that exists in other passes,
not one of who/when/why things get done.

I.e., it is easier to use PRE's infrastructure to do the unconditional
load elimination, while still doing more than redundancy elimination
only when you will if-convert branches, than it would be to write a new
pass.  Your new pass would probably end up missing loads that PRE goes
to trouble to get, and would duplicate a lot of the safety computation
PRE already knows how to do.

Of course, if you only see yourself moving 1 or two loads per
function, it may be quicker to do just those in their own pass
controlled by ifcvt.  But if you are going to try to if-convert every
branch, and every load inside those branches, you really don't want to
have to make your own computation as efficient as PRE already makes it.

A similar situation exists for unconditional store sinking/tree-ssa-sink.


Re: AMD64 ABI compatibility

2007-08-01 Thread Nicolas Alt

Kai,
did you make your diff against the current CVS checkout or against
your first patch? Should your changes already work for some cases? I
would like to test if they produce the right instructions. However, I
do not have enough insight into gcc to work on it myself.

Thanks,
Nicolas


On Aug 1, 2007, at 5:48 , Kai Tietz wrote:


Hi Jan,

Jan Hubicka wrote on 31.07.2007 23:40:40:


Hi Kai,

so, could you resolve the remaining issues? Or have you kind of
paused the project?

Cheers,
Nicolas


On Jul 12, 2007, at 2:14 , Kai Tietz wrote:


Hi,

I am nearly through :) The remaining macros left to be ported are
REGPARM_MAX and SSE_REGPARM_MAX. The sysv_abi uses 6 regs and 8 sse
registers, ms_abi uses 4 regs and 4 sse registers. The problem is, for
example, the use in i386.md of SSE_REGPARM_MAX without any hint how to
choose the required abi. Do you have an idea how this could be done?


This should not be difficult - ix86_regparm is used in the
ix86_function_regparm, init_cumulative_args and setup_incoming_varargs_64
functions.  In all those cases you know the function declaration and
thus you can take a look if it is a call to a different ABI and overwrite
the value.


Ok, here is my update.

Cheers,
 i.A. Kai Tietz




Re: Semicolons at the end of member function definitions

2007-08-01 Thread Mark Mitchell
Volker Reichelt wrote:

> 2007-03-26  Dirk Mueller  <[EMAIL PROTECTED]>
> 
>* parser.c (cp_parser_member_declaration): Pedwarn
>about stray semicolons after member declarations.
> 

> It makes
> 
>   struct A
>   {
>     void foo() {};
>   }

That is indeed still legal in the current working draft.  (The reason
that I copied the grammar productions above the parser functions was so
that it would be easy to check things like this...)

> Therefore, IMHO the patch is wrong and should be reverted.

Yes, please go ahead and revert it.  And, if you have time, please add a
test-case specifically for this case.  The previous patch removed
semicolons from lots of valid code, but probably none of those test
cases were specifically for this case.

Thanks,

-- 
Mark Mitchell
CodeSourcery
[EMAIL PROTECTED]
(650) 331-3385 x713


missing libtool sources?

2007-08-01 Thread DJ Delorie

ltmain.sh starts with this line:

# Generated from ltmain.m4sh; do not edit by hand

but we don't seem to have ltmain.m4sh in the source tree.