How to handle loop iterator variable?

2008-05-09 Thread Sandeep Maram
Hi,

Consider 2 for loops as given below.

for (i = 1; i < N; i++)
  {
    a[i] = 1;
  }
for (i = 1; i < N; i++)
  {
    j = j + a[i];
  }

The corresponding GIMPLE code looks like this:

loop_2 (header = 6, latch = 7, niter = , upper_bound = 999, estimate = 999)
{
  bb_6 (preds = {bb_5 bb_7 }, succs = {bb_7 bb_8 })
  {
  :
# j_24 = PHI <0(5), j_12(7)>
# i_23 = PHI <1(5), i_13(7)>
# VUSE  { a }
D.1189_10 = a[i_23];
D.1190_11 = (unsigned int) D.1189_10;
j_12 = D.1190_11 + j_24;
i_13 = i_23 + 1;
if (i_13 <= 999)
  goto ;
else
  goto ;

  }
  bb_7 (preds = {bb_6 }, succs = {bb_6 })
  {
  :
goto ;

  }
}
loop_1 (header = 3, latch = 4, niter = , upper_bound = 999, estimate = 999)
{
  bb_3 (preds = {bb_4 bb_2 }, succs = {bb_4 bb_5 })
  {
  :
# a_25 = PHI 
# i_22 = PHI 
# a_19 = VDEF  { a }
a[i_22] = 1;
i_7 = i_22 + 1;
if (i_7 <= 999)
  goto ;
else
  goto ;

  }
  bb_4 (preds = {bb_3 }, succs = {bb_3 })
  {
  :
goto ;

  }
}

I have now moved all the statements from loop_2 into loop_1, i.e. from
bb_6 to bb_3, using the following code:

  block_stmt_iterator bsi_a, bsi_a_last, bsi_b, bsi_b_last, bsi;
  bsi_b = bsi_start (loop_b->header);
  bsi_b_last = bsi_last (loop_b->header);
  bsi_prev (&bsi_b_last);

  /* Transfer all the statements one by one.  */
  for (bsi = bsi_start (loop_a->header); !bsi_end_p (bsi);)
    {
      if ((TREE_CODE (bsi_stmt (bsi)) != COND_EXPR)
          && (TREE_CODE (bsi_stmt (bsi)) != LABEL_EXPR))
        {
          update_stmt (bsi_stmt (bsi));
          bsi_move_before (&bsi, &bsi_b_last);
          fprintf (stderr, " transferred one statement. \n ");
          fprintf (dump_file, " transferred one statement. \n ");
          update_stmt (bsi_stmt (bsi));
        }
      else
        {
          bsi_next (&bsi);
        }
    }


Now the GIMPLE code looks like this:

loop_2 (header = 6, latch = 7, niter = , upper_bound = 999, estimate = 999)
{
  bb_6 (preds = {bb_5 bb_7 }, succs = {bb_7 bb_8 })
  {
  :
# j_24 = PHI <0(5), j_12(7)>
# i_23 = PHI <1(5), i_13(7)>
if (i_13 <= 999)
  goto ;
else
  goto ;

  }
  bb_7 (preds = {bb_6 }, succs = {bb_6 })
  {
  :
goto ;

  }
}
loop_1 (header = 3, latch = 4, niter = , upper_bound = 999, estimate = 999)
{
  bb_3 (preds = {bb_4 bb_2 }, succs = {bb_4 bb_5 })
  {
  :
# a_25 = PHI 
# i_22 = PHI 
# a_19 = VDEF  { a }
a[i_22] = 1;
# VUSE  { a }
D.1189_10 = a[i_23];
D.1190_11 = (unsigned int) D.1189_10;
j_12 = D.1190_11 + j_24;
i_13 = i_23 + 1;
i_7 = i_22 + 1;
if (i_7 <= 999)
  goto ;
else
  goto ;

  }
  bb_4 (preds = {bb_3 }, succs = {bb_3 })
  {
  :
goto ;

  }
}

Now I get an internal compiler error saying
error: definition in block 6 does not dominate use in block 3
for SSA_NAME: i_23 in statement:
# VUSE 
D.1189_10 = a[i_23];

i_23 is the loop iterator of loop_2. i_22 is the loop iterator of loop_1.

How can I rename i_23 as i_22?

Thanks,
Sandeep.
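
At the source level, the transformation described in this message amounts to loop fusion. The following is only a sketch of the intended end result, not of the GIMPLE manipulation itself: N is assumed to be 1000 (matching the upper bound of 999 in the dump), j is assumed to be unsigned, and the function names are made up for illustration.

```c
#include <assert.h>
#include <stddef.h>

#define N 1000  /* assumed; the dump shows an upper bound of 999 */

/* Original form: two separate loops over the same index range. */
static unsigned int sum_unfused(int a[N])
{
    unsigned int j = 0;
    size_t i;

    for (i = 1; i < N; i++)
        a[i] = 1;
    for (i = 1; i < N; i++)
        j = j + (unsigned int)a[i];
    return j;
}

/* Fused form: the body of the second loop sits in the first loop and
   reads the array through the first loop's iterator. */
static unsigned int sum_fused(int a[N])
{
    unsigned int j = 0;
    size_t i;

    for (i = 1; i < N; i++) {
        a[i] = 1;
        j = j + (unsigned int)a[i];
    }
    return j;
}

/* Sanity check: both forms must compute the same sum. */
static void check_fusion(void)
{
    static int a[N], b[N];

    assert(sum_unfused(a) == sum_fused(b));
}
```

In the fused body there is a single induction variable, which is why every use of the second loop's iterator has to be rewritten to the first loop's iterator once the statements have been moved.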


Re: Division using FMAC, reciprocal estimates and Newton-Raphson - eg ia64, rs6000, SSE, ARM MaverickCrunch?

2008-05-09 Thread Paolo Bonzini



> I'd like to implement something similar for MaverickCrunch, using the
> integer 32-bit MAC functions, but there is no reciprocal estimate
> function on the MaverickCrunch.  I guess a lookup table could be
> implemented, but how many entries will need to be generated, and how
> accurate will it have to be to be IEEE754 compliant (in the swdiv routine)?


I think sh does something like that.  It is quite a mess, as it has half 
a dozen ways to implement division.


The idea is to use integer arithmetic to compute the right exponent, and 
the lookup table to estimate the mantissa.  I used something like this 
for square root:


1) shift the entire FP number by 1 to the right (logical right shift)
2) sum 0x20000000 so that the exponent is still offset by 64
3) extract the 8 bits from 14 to 22 and look them up in a 256-entry,
   32-bit table
4) sum the value (as a 32-bit integer!) with the content of the table
5) perform 2 Newton-Raphson iterations as necessary

example, 3.9921875

byte representation = 0x407F8000
shift right = 0x203FC000
sum = 0x403FC000
extract bits = 255
lookup table value = -4194312 = -0x400008
adjusted value = 16r3FFFBFF8, which is the square root

The table simply makes sure that if the rightmost 14 bits of the
mantissa are zero the return value is right.  By summing the content of
the lookup table, you can of course interpolate between the values.
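
The arithmetic of the worked example can be replayed in C. This is only a sketch under assumptions: IEEE 754 single precision, the one table entry quoted in the example (no table is generated here), and helper names invented for illustration. The constant 0x20000000 is the one implied by the step from 0x203FC000 to 0x403FC000.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Reinterpret a 32-bit pattern as a single-precision float. */
static float float_from_bits(uint32_t bits)
{
    float f;
    memcpy(&f, &bits, sizeof f);
    return f;
}

/* Reinterpret a single-precision float as its 32-bit pattern. */
static uint32_t bits_from_float(float f)
{
    uint32_t bits;
    memcpy(&bits, &f, sizeof bits);
    return bits;
}

/* Steps 1-2: logical right shift by one, then add the re-biasing constant. */
static uint32_t sqrt_seed(uint32_t bits)
{
    return (bits >> 1) + UINT32_C(0x20000000);
}

/* Step 3: 8-bit table index taken from bit 14 upward. */
static unsigned sqrt_table_index(uint32_t seed)
{
    return (unsigned)((seed >> 14) & 0xFFu);
}

/* Step 4: add the signed table entry as a plain 32-bit integer. */
static uint32_t sqrt_adjust(uint32_t seed, int32_t table_entry)
{
    return seed + (uint32_t)table_entry;
}

/* Replays the example for 3.9921875; the asserts document each
   intermediate value given in the message. */
static float sqrt_worked_example(void)
{
    uint32_t bits = bits_from_float(3.9921875f);
    assert(bits == UINT32_C(0x407F8000));

    uint32_t seed = sqrt_seed(bits);
    assert(seed == UINT32_C(0x403FC000));
    assert(sqrt_table_index(seed) == 255u);

    uint32_t adjusted = sqrt_adjust(seed, -4194312); /* entry from the example */
    assert(adjusted == UINT32_C(0x3FFFBFF8));
    return float_from_bits(adjusted);
}
```

Step 5, the Newton-Raphson refinement, is left out because the example input already lands on a table point.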


With a 12-bit table (i.e. 16 kilobytes instead of just one) you will 
only need 1 iteration.


The algorithm will have to be adjusted for reciprocal (subtracting the
FP number from 16r7F000000 or better 16r7EFFFFFF should do the trick for
the first two steps; and since you don't shift right by one you'll use
bits 15-23).


Here is a sample program to generate the table.  It's written in 
Smalltalk (sorry :-P), it should not be hard to understand (but remember 
that indices are 1-based).  To double check, the first entries of the 
table are 1 -32512 -64519 -96026.


| a int adj table |
table := ##(| table a val estim |  table := Array new: 256.
0 to: 255 do: [ :i |
   a := ByteArray new: 4.
   "Create number"
   a intAt: 1 put: (i bitShift: 15).
   a at: 1 put: 64.
   val := (a floatAt: 1) reciprocal.

   "Perform estimation"
   a intAt: 1 put: (16r7EFFFFFF - (a intAt: 1)).
   estim := a intAt: 1.

   "Compute delta with actual value and store it"
   a floatAt: 1 put: val.
   table at: i + 1 put: ((a intAt: 1) - estim)
].
table).

"Here we do the actual calculation. `self' is the number
 to be reciprocated."

a := ByteArray new: 4.
a floatAt: 1 put: self.

"Perform estimation as above"
int := 16r7EFFFFFF - (a intAt: 1).

"Extract bits 15-23 and access the table."
adj := table at: ((a intAt: 1) // 32768 \\ 256) + 1.

"Sum the delta and convert from 32-bit integer to float"
a intAt: 1 put: (int + adj).
^(a floatAt: 1)
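
For readers who do not read Smalltalk, here is a rough C transcription of the same idea. It is a sketch, not a tested libgcc routine: it assumes IEEE 754 single precision and positive, normal inputs, and all function names are invented here. It uses the full 32-bit constant 0x7EFFFFFF, which is the value that reproduces the first table entries quoted above (1, -32512, -64519, -96026). The refinement step is the standard Newton-Raphson update r * (2 - x * r), which the message mentions but does not spell out.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define RECIP_MAGIC UINT32_C(0x7EFFFFFF)

static float float_from_bits(uint32_t bits)
{
    float f;
    memcpy(&f, &bits, sizeof f);
    return f;
}

static uint32_t bits_from_float(float f)
{
    uint32_t bits;
    memcpy(&bits, &f, sizeof bits);
    return bits;
}

/* Fill the 256-entry correction table, one entry per value of the top
   eight mantissa bits, using sample points in [2, 4). */
static void build_recip_table(int32_t table[256])
{
    for (uint32_t i = 0; i < 256; i++) {
        uint32_t bits = UINT32_C(0x40000000) | (i << 15);
        float val = 1.0f / float_from_bits(bits);   /* exact reciprocal */
        int32_t estim = (int32_t)(RECIP_MAGIC - bits); /* raw estimate  */
        table[i] = (int32_t)bits_from_float(val) - estim;
    }
}

/* Integer-only estimate of 1/x: subtract from the magic constant, then
   add the table correction selected by bits 15..22 of x. */
static float recip_estimate(const int32_t table[256], float x)
{
    assert(x > 0.0f); /* the sketch ignores zero, negatives, NaN, infinity */

    uint32_t bits = bits_from_float(x);
    int32_t est = (int32_t)(RECIP_MAGIC - bits);
    int32_t adj = table[(bits >> 15) & 0xFFu];
    return float_from_bits((uint32_t)(est + adj));
}

/* One Newton-Raphson step for the reciprocal of x. */
static float recip_refine(float x, float r)
{
    return r * (2.0f - x * r);
}
```

A caller would build the table once, call recip_estimate, and then apply recip_refine once or twice depending on the accuracy required.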


> Also, where should I be sticking such an instruction / table?  Should I
> put it in the kernel, and trap an invalid instruction?  Alternatively,
> should I put it in libgcc


Yes, you could do this.

Paolo


RFC: Optimize caller-saved register

2008-05-09 Thread H.J. Lu
Hi,

Currently we save the entire register content for a caller-saved
register, even though only the lower 4/8 bytes are used, as in the case
of SSE math without the vectorizer. Is it possible to save only the used
portion of the register content for a caller-saved register?

Thanks.


H.J.


Re: How to handle loop iterator variable?

2008-05-09 Thread Sebastian Pop
On Fri, May 9, 2008 at 2:09 AM, Sandeep Maram <[EMAIL PROTECTED]> wrote:
> i_23 is the loop iterator of loop_2. i_22 is the loop iterator of loop_1.
>
> How can I rename i_23 as i_22?
>

In lambda-code.c:1858 you have some code that does a similar renaming:

FOR_EACH_IMM_USE_STMT (stmt, imm_iter, oldiv_def)
  FOR_EACH_IMM_USE_ON_STMT (use_p, imm_iter)
    propagate_value (use_p, newiv);

Sebastian
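
The effect of that renaming loop can be illustrated outside the compiler with a toy model. Nothing below is GCC API: the statement structure, the integer name ids and replace_uses are hypothetical, and the model scans every statement instead of walking an immediate-use list. It only shows what "replace every use of the old induction variable with the new one" means.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_USES 2

/* Toy statement: one defined name and up to MAX_USES used names, each
   name represented by its SSA version number.  Not a GCC structure. */
struct toy_stmt {
    int def;
    int uses[MAX_USES];
    size_t n_uses;
};

/* Rewrite every use of old_name to new_name and return the number of
   operands that were changed. */
static size_t replace_uses(struct toy_stmt *stmts, size_t n_stmts,
                           int old_name, int new_name)
{
    size_t replaced = 0;

    assert(stmts != NULL || n_stmts == 0);

    for (size_t s = 0; s < n_stmts; s++) {
        for (size_t u = 0; u < stmts[s].n_uses; u++) {
            if (stmts[s].uses[u] == old_name) {
                stmts[s].uses[u] = new_name;
                replaced++;
            }
        }
    }
    return replaced;
}
```

Modelling the moved statements that read i_23 and calling replace_uses with 23 and 22 leaves them reading i_22, which is the state the dominance check expects.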


gcc-4.4-20080509 is now available

2008-05-09 Thread gccadmin
Snapshot gcc-4.4-20080509 is now available on
  ftp://gcc.gnu.org/pub/gcc/snapshots/4.4-20080509/
and on various mirrors, see http://gcc.gnu.org/mirrors.html for details.

This snapshot has been generated from the GCC 4.4 SVN branch
with the following options: svn://gcc.gnu.org/svn/gcc/trunk revision 135128

You'll find:

gcc-4.4-20080509.tar.bz2            Complete GCC (includes all of below)
gcc-core-4.4-20080509.tar.bz2       C front end and core compiler
gcc-ada-4.4-20080509.tar.bz2        Ada front end and runtime
gcc-fortran-4.4-20080509.tar.bz2    Fortran front end and runtime
gcc-g++-4.4-20080509.tar.bz2        C++ front end and runtime
gcc-java-4.4-20080509.tar.bz2       Java front end and runtime
gcc-objc-4.4-20080509.tar.bz2       Objective-C front end and runtime
gcc-testsuite-4.4-20080509.tar.bz2  The GCC testsuite

Diffs from 4.4-20080502 are available in the diffs/ subdirectory.

When a particular snapshot is ready for public consumption the LATEST-4.4
link is updated and a message is sent to the gcc list.  Please do not use
a snapshot before it has been announced that way.


Re: RFH: Building and testing gimple-tuples-branch

2008-05-09 Thread Kaz Kojima
Diego Novillo <[EMAIL PROTECTED]> wrote:
> So, for folks with free cycles to spare, could you build the branch on 
> your favourite target and report bugs?  Bugzilla and/or email reports 
> are OK.  If you are creating a bugzilla report, please add my address to 
> the CC field.

With the attached patch, SH cross can be built for gimple-tuples-branch.
The testresult for sh4-unknown-linux-gnu cross x86-linux is:

  http://gcc.gnu.org/ml/gcc-testresults/2008-05/msg00801.html

Regards,
kaz
--
*  config/sh/sh.c (sh_gimplify_va_arg_expr): Change pre_p and
post_p types to gimple_seq *.

diff -uprN ORIG/gimple-tuples-branch/gcc/config/sh/sh.c LOCAL/gimple-tuples-branch/gcc/config/sh/sh.c
--- ORIG/gimple-tuples-branch/gcc/config/sh/sh.c	2008-05-09 17:39:13.000000000 +0900
+++ LOCAL/gimple-tuples-branch/gcc/config/sh/sh.c	2008-05-09 18:12:37.000000000 +0900
@@ -261,7 +261,7 @@ static bool sh_strict_argument_naming (C
 static bool sh_pretend_outgoing_varargs_named (CUMULATIVE_ARGS *);
 static tree sh_build_builtin_va_list (void);
 static void sh_va_start (tree, rtx);
-static tree sh_gimplify_va_arg_expr (tree, tree, tree *, tree *);
+static tree sh_gimplify_va_arg_expr (tree, tree, gimple_seq *, gimple_seq *);
 static bool sh_pass_by_reference (CUMULATIVE_ARGS *, enum machine_mode,
  const_tree, bool);
 static bool sh_callee_copies (CUMULATIVE_ARGS *, enum machine_mode,
@@ -7255,8 +7255,8 @@ find_sole_member (tree type)
 /* Implement `va_arg'.  */
 
 static tree
-sh_gimplify_va_arg_expr (tree valist, tree type, tree *pre_p,
-tree *post_p ATTRIBUTE_UNUSED)
+sh_gimplify_va_arg_expr (tree valist, tree type, gimple_seq *pre_p,
+gimple_seq *post_p ATTRIBUTE_UNUSED)
 {
   HOST_WIDE_INT size, rsize;
   tree tmp, pptr_type_node;


Deprecation?!

2008-05-09 Thread Dave Higginbotham
I'm getting a "warning: deprecated conversion from string constant to
‘char*’" message in g++ (GCC) 4.2.3 (Ubuntu 4.2.3-2ubuntu7).

I've always understood there is no such thing as deprecation in C++ (and
have been proud of this concept). What gives? Does the standards
committee allow deprecation - have I been wrong all along - or is the
Ubuntu team trying way too hard to be like Microsoft (who declares
deprecations in their latest compilers)?


Thanks,