fixed rot on the wide-int branch.

2013-08-20 Thread Kenneth Zadeck

Index: gcc/optabs.c
===
--- gcc/optabs.c(revision 201884)
+++ gcc/optabs.c(working copy)
@@ -867,7 +867,8 @@ expand_subword_shift (enum machine_mode
   outof_input, const1_rtx, 0, unsignedp, methods);
   if (shift_mask == BITS_PER_WORD - 1)
 {
-  tmp = immed_wide_int_const (wide_int::minus_one (op1_mode), op1_mode);
+  tmp = immed_wide_int_const
+(wide_int::minus_one (GET_MODE_PRECISION (op1_mode)), op1_mode);
   tmp = simplify_expand_binop (op1_mode, xor_optab, op1, tmp,
0, true, methods);
 }
Index: gcc/recog.c
===
--- gcc/recog.c(revision 201884)
+++ gcc/recog.c(working copy)
@@ -1187,8 +1187,7 @@ const_scalar_int_operand (rtx op, enum m
   /* Multiword partial int.  */
   HOST_WIDE_INT x
 = CONST_WIDE_INT_ELT (op, CONST_WIDE_INT_NUNITS (op) - 1);
-  return (wide_int::sext (x, prec & (HOST_BITS_PER_WIDE_INT - 1))
-  == x);
+  return (sext_hwi (x, prec & (HOST_BITS_PER_WIDE_INT - 1)) == x);
 }
 }
   return 1;



patch for wide-int branch

2013-08-22 Thread Kenneth Zadeck
cleaned up version of convert_modes that handles all constants in a 
uniform manner.


This is clean on x86-64.   Will test on other platforms tomorrow.

kenny

Index: gcc/expr.c
===
--- gcc/expr.c(revision 201884)
+++ gcc/expr.c(working copy)
@@ -710,64 +710,32 @@ convert_modes (enum machine_mode mode, e
   if (mode == oldmode)
 return x;

-  /* There is one case that we must handle specially: If we are
- converting a CONST_INT into a mode whose size is larger than
- HOST_BITS_PER_WIDE_INT and we are to interpret the constant as
- unsigned, gen_lowpart will do the wrong if the constant appears
- negative.  What we want to do is make the high-order word of the
- constant zero, not all ones.  */
-
-  if (unsignedp && GET_MODE_CLASS (mode) == MODE_INT
-  && GET_MODE_BITSIZE (mode) > HOST_BITS_PER_WIDE_INT
-  && CONST_INT_P (x) && INTVAL (x) < 0)
+  if (CONST_SCALAR_INT_P (x)
+  && GET_MODE_CLASS (mode) == MODE_INT
+  && (oldmode == VOIDmode || GET_MODE_CLASS (oldmode) == MODE_INT))
 {
-  wide_int val = std::make_pair (x, mode);
-  /* We need to zero extend VAL.  */
-  if (oldmode != VOIDmode)
-val = val.zext (GET_MODE_PRECISION (oldmode));
-
-  return immed_wide_int_const (val, mode);
+  wide_int w = std::make_pair (x, mode);
+  /* If the caller did not tell us the old mode, then there is
+ not much to do with respect to canonization.  */
+  if (oldmode != VOIDmode
+  && GET_MODE_PRECISION (mode) > GET_MODE_PRECISION (oldmode))
+w = w.ext (GET_MODE_PRECISION (oldmode), unsignedp ? UNSIGNED : SIGNED);

+  return immed_wide_int_const (w, mode);
 }

   /* We can do this with a gen_lowpart if both desired and current modes
  are integer, and this is either a constant integer, a register, or a
- non-volatile MEM.  Except for the constant case where MODE is no
- wider than HOST_BITS_PER_WIDE_INT, we must be narrowing the operand.  */

+ non-volatile MEM. */
+  if (GET_MODE_CLASS (mode) == MODE_INT
+  && GET_MODE_CLASS (oldmode) == MODE_INT
+  && GET_MODE_PRECISION (mode) <= GET_MODE_PRECISION (oldmode)
+  && ((MEM_P (x) && !MEM_VOLATILE_P (x) && direct_load[(int) mode])
+  || (REG_P (x)
+  && (!HARD_REGISTER_P (x)
+  || HARD_REGNO_MODE_OK (REGNO (x), mode))
+  && TRULY_NOOP_TRUNCATION_MODES_P (mode, GET_MODE (x)

-  if ((CONST_INT_P (x)
-   && GET_MODE_PRECISION (mode) <= HOST_BITS_PER_WIDE_INT)
-  || (GET_MODE_CLASS (mode) == MODE_INT
-  && GET_MODE_CLASS (oldmode) == MODE_INT
-  && (CONST_SCALAR_INT_P (x)
-  || (GET_MODE_PRECISION (mode) <= GET_MODE_PRECISION (oldmode)
-  && ((MEM_P (x) && ! MEM_VOLATILE_P (x)
-   && direct_load[(int) mode])
-  || (REG_P (x)
-  && (! HARD_REGISTER_P (x)
-  || HARD_REGNO_MODE_OK (REGNO (x), mode))
-  && TRULY_NOOP_TRUNCATION_MODES_P (mode,
-GET_MODE (x
-{
-  /* ?? If we don't know OLDMODE, we have to assume here that
- X does not need sign- or zero-extension.   This may not be
- the case, but it's the best we can do.  */
-  if (CONST_INT_P (x) && oldmode != VOIDmode
-  && GET_MODE_PRECISION (mode) > GET_MODE_PRECISION (oldmode))
-{
-  HOST_WIDE_INT val = INTVAL (x);
-
-  /* We must sign or zero-extend in this case.  Start by
- zero-extending, then sign extend if we need to.  */
-  val &= GET_MODE_MASK (oldmode);
-  if (! unsignedp
-  && val_signbit_known_set_p (oldmode, val))
-val |= ~GET_MODE_MASK (oldmode);
-
-  return gen_int_mode (val, mode);
-}
-
-  return gen_lowpart (mode, x);
-}
+   return gen_lowpart (mode, x);

   /* Converting from integer constant into mode is always equivalent to an
  subreg operation.  */



Re: wide-int branch now up for public comment and review

2013-08-23 Thread Kenneth Zadeck

On 08/23/2013 11:02 AM, Richard Sandiford wrote:

Hi Kenny,

This is the first time I've looked at the implementation of wide-int.h
(rather than just looking at the rtl changes, which as you know I like
in general), so FWIW here are some comments on wide-int.h.  I expect
a lot of them overlap with Richard B.'s comments.

I also expect many of them are going to be annoying, sorry, but this
first one definitely will.  The coding conventions say that functions
should be defined outside the class:

 http://gcc.gnu.org/codingconventions.html

and that opening braces should be on their own line, so most of the file
needs to be reformatted.  I went through and made that change with the
patch below, in the process of reading through.  I also removed "SGN
must be SIGNED or UNSIGNED." because it seemed redundant when those are
the only two values available.  The patch fixes a few other coding standard
problems and typos, but I've not made any actual code changes (or at least,
I didn't mean to).
I had started the file with the functions outside of the class and mike 
had started putting them in the class, and so i went with putting them in 
the class since many of them were one liners and having them out of 
line just doubled the size of everything.  however, we did not look at 
the coding conventions and that really settles the argument.  Thanks for 
doing this.

Does it look OK to install?

you can check it in.

I'm still unsure about these "infinite" precision types, but I understand
the motivation and I have no objections.  However:


 * Code that does widening conversions.  The canonical way that
   this is performed is to sign or zero extend the input value to
   the max width based on the sign of the type of the source and
   then to truncate that value to the target type.  This is in
   preference to using the sign of the target type to extend the
   value directly (which gets the wrong value for the conversion
   of large unsigned numbers to larger signed types).




I don't understand this particular reason.  Using the sign of the source
type is obviously right, but why does that mean we need "infinite" precision,
rather than just doubling the precision of the source?
in a sense, "infinite" does not really mean infinite, it really just 
means large enough so that you never lose any information from the 
top.  For widening, all that "infinite" really needs to be is one 
bit larger than the destination type.  We could have had an abi 
where you specified the precision of every operation based on the 
precisions of the inputs.  Instead, for these kinds of operations, we 
decided to sniff the port and determine a fixed width that was large 
enough for everything that was needed for the port.  We call that number 
infinite.  This sort of follows the convention that double-int was used 
with, where infinite was 128 bits, but with our design/implementation, we 
(hopefully) have no bugs where the size of the datatypes needed 
runs into implementation limits.



   * When a constant that has an integer type is converted to a
 wide-int it comes in with precision 0.  For these constants the
 top bit does accurately reflect the sign of that constant; this
 is an exception to the normal rule that the signedness is not
 represented.  When used in a binary operation, the wide-int
 implementation properly extends these constants so that they
 properly match the other operand of the computation.  This allows
  you to write:

tree t = ...
wide_int x = t + 6;

  assuming t is an int_cst.

This seems dangerous.  Not all code that uses "unsigned HOST_WIDE_INT"
actually wants it to be an unsigned value.  Some code uses it to avoid
the undefinedness of signed overflow.  So these overloads could lead
to us accidentally zero-extending what's conceptually a signed value
without any obvious indication that that's happening.  Also, hex constants
are unsigned int, but it doesn't seem safe to assume that 0x8000 was
meant to be zero-extended.

I realise the same thing can happen if you mix "unsigned int" with
HOST_WIDE_INT, but the point is that you shouldn't really do that
in general, whereas we're defining these overloads precisely so that
a mixture can be used.

I'd prefer some explicit indication of the sign, at least for anything
other than plain "int" (so that the compiler will complain about uses
of "unsigned int" and above).


There is a part of me that finds this scary and a part of me that feels 
that the concern is largely theoretical.  It does make it much easier 
to read and understand the code to be able to write "t + 6" rather than 
"wide_int (t) + wide_int::from_uhwi (6)", but of course you lose some 
control over how 6 is converted.



   Note that the bits above the precision are not defined and the
   algorithms used here are careful not to depend on their value.  In
   particular, values that come in from r

Re: wide-int branch now up for public comment and review

2013-08-24 Thread Kenneth Zadeck

On 08/24/2013 08:05 AM, Richard Sandiford wrote:

Richard Sandiford  writes:

I wonder how easy it would be to restrict this use of "zero precision"
(i.e. flexible precision) to those where primitive types like "int" are
used as template arguments to operators, and require a precision when
constructing a wide_int.  I wouldn't have expected "real" precision 0
(from zero-width bitfields or whatever) to need any special handling
compared to precision 1 or 2.

I tried the last bit -- requiring a precision when constructing a
wide_int -- and it seemed surprisingly easy.  What do you think of
the attached?  Most of the forced knock-on changes seem like improvements,
but the java part is a bit ugly.  I also went with "wide_int (0, prec).cmp"
for now, although I'd like to add static cmp, cmps and cmpu alongside
leu_p, etc., if that's OK.  It would then be possible to write
"wide_int::cmp (0, ...)" and avoid the wide_int construction altogether.

I wondered whether you might also want to get rid of the build_int_cst*
functions, but that still looks a long way off, so I hope using them in
these two places doesn't seem too bad.

This is just an incremental step.  I've also only run it through a
subset of the testsuite so far, but full tests are in progress...
So i am going to make two high level comments here and then i am going 
to leave the ultimate decision to the community.  (1) I am mildly in 
favor of leaving the prec 0 stuff the way that it is.  (2) My guess is that 
richi also will favor this.  My justification for (2) is that he had 
a lot of comments about this before he went on leave and this is 
substantially the way that it was when he left.  Also, remember that one 
of his biggest dislikes was having to specify precisions.


However, this question is really bigger than this branch, which is why i 
hope others will join in, because this really comes down to how we 
want the compiler to look when it is fully converted to c++.  It has 
taken me a while to get used to writing and reading code like this where 
there is a lot of c++ magic going on behind the scenes.  And with that 
magic comes the responsibility of the programmer to get it right.  There 
were/are a lot of people in the gcc community that did not want to go down 
the c++ pathway for exactly this reason.  However, i am being converted.


The rest of my comments are small comments about the patch, because some 
of it should be done no matter how the decision is made.

=
It is perfectly fine to add the static versions of the cmp functions and 
the usage of those functions in this patch looks perfectly reasonable.




Thanks,
Richard


Index: gcc/fold-const.c
===
--- gcc/fold-const.c2013-08-24 12:11:08.085684013 +0100
+++ gcc/fold-const.c2013-08-24 01:00:00.0 +0100
@@ -8865,15 +8865,16 @@ pointer_may_wrap_p (tree base, tree offs
if (bitpos < 0)
  return true;
  
+  int precision = TYPE_PRECISION (TREE_TYPE (base));

if (offset == NULL_TREE)
-wi_offset = wide_int::zero (TYPE_PRECISION (TREE_TYPE (base)));
+wi_offset = wide_int::zero (precision);
else if (TREE_CODE (offset) != INTEGER_CST || TREE_OVERFLOW (offset))
  return true;
else
  wi_offset = offset;
  
bool overflow;

-  wide_int units = wide_int::from_shwi (bitpos / BITS_PER_UNIT);
+  wide_int units = wide_int::from_shwi (bitpos / BITS_PER_UNIT, precision);
total = wi_offset.add (units, UNSIGNED, &overflow);
if (overflow)
  return true;
So this is a part of the code that really should have been using 
addr_wide_int rather than wide_int.  It is doing address arithmetic with 
bit positions.  Because of this, the calculations should have been done 
with a precision of 3 + what comes out of the 
type.  The addr_wide_int has a fixed precision that is guaranteed to be 
large enough for any address math on the machine.



Index: gcc/gimple-ssa-strength-reduction.c
===
--- gcc/gimple-ssa-strength-reduction.c 2013-08-24 12:11:08.085684013 +0100
+++ gcc/gimple-ssa-strength-reduction.c 2013-08-24 01:00:00.0 +0100
@@ -777,7 +777,6 @@ restructure_reference (tree *pbase, tree
  {
tree base = *pbase, offset = *poffset;
max_wide_int index = *pindex;
-  wide_int bpu = BITS_PER_UNIT;
tree mult_op0, t1, t2, type;
max_wide_int c1, c2, c3, c4;
  
@@ -786,7 +785,7 @@ restructure_reference (tree *pbase, tree

|| TREE_CODE (base) != MEM_REF
|| TREE_CODE (offset) != MULT_EXPR
|| TREE_CODE (TREE_OPERAND (offset, 1)) != INTEGER_CST
-  || !index.umod_floor (bpu).zero_p ())
+  || !index.umod_floor (BITS_PER_UNIT).zero_p ())
  return false;
  
t1 = TREE_OPERAND (base, 0);

@@ -822,7 +821,7 @@ restructure_reference (tree *pbase, tree
c2 = 0;
  }
  
-  c4 = index.udiv_floor (bpu);

+  c4 = index.udiv_floor (BITS_PER_UNIT);
  
this

Re: wide-int branch now up for public comment and review

2013-08-24 Thread Kenneth Zadeck

On 08/24/2013 02:16 PM, Florian Weimer wrote:

On 08/13/2013 10:57 PM, Kenneth Zadeck wrote:

1) The 4 files that hold the wide-int code itself.  You have seen a
lot of this code before except for the infinite precision
templates.  Also the classes are more C++ than C in their flavor.
In particular, the integration with trees is very tight in that an
int-cst or regular integers can be the operands of any wide-int
operation.


Are any of these conversions lossy?  Maybe some of these constructors 
should be made explicit?


It depends; there is nothing wrong with lossy conversions as long as you 
know what you are doing.


Re: wide-int branch now up for public comment and review

2013-08-24 Thread Kenneth Zadeck

fixed with the enclosed patch.

On 08/23/2013 11:02 AM, Richard Sandiford wrote:

/* Return true if THIS is negative based on the interpretation of SGN.
For UNSIGNED, this is always false.  This is correct even if
precision is 0.  */
inline bool
wide_int::neg_p (signop sgn) const

It seems odd that you have to pass SIGNED here.  I assume you were doing
it so that the caller is forced to confirm signedness in the cases where
a tree type is involved, but:

* neg_p kind of implies signedness anyway
* you don't require this for minus_one_p, so the interface isn't consistent
* at the rtl level signedness isn't a property of the "type" (mode),
   so it seems strange to add an extra hoop there




Index: gcc/ada/gcc-interface/decl.c
===
--- gcc/ada/gcc-interface/decl.c	(revision 201967)
+++ gcc/ada/gcc-interface/decl.c	(working copy)
@@ -7479,7 +7479,7 @@ annotate_value (tree gnu_size)
 	  tree op1 = TREE_OPERAND (gnu_size, 1);
 	  wide_int signed_op1
 	= wide_int::from_tree (op1).sforce_to_size (TYPE_PRECISION (sizetype));
-	  if (signed_op1.neg_p (SIGNED))
+	  if (signed_op1.neg_p ())
 	{
 	  op1 = wide_int_to_tree (sizetype, -signed_op1);
 	  pre_op1 = annotate_value (build1 (NEGATE_EXPR, sizetype, op1));
Index: gcc/c-family/c-ada-spec.c
===
--- gcc/c-family/c-ada-spec.c	(revision 201967)
+++ gcc/c-family/c-ada-spec.c	(working copy)
@@ -2197,7 +2197,7 @@ dump_generic_ada_node (pretty_printer *b
 	{
 	  wide_int val = node;
 	  int i;
-	  if (val.neg_p (SIGNED))
+	  if (val.neg_p ())
 	{
 	  pp_minus (buffer);
 	  val = -val;
Index: gcc/config/sparc/sparc.c
===
--- gcc/config/sparc/sparc.c	(revision 201967)
+++ gcc/config/sparc/sparc.c	(working copy)
@@ -10624,7 +10624,7 @@ sparc_fold_builtin (tree fndecl, int n_a
 	  overall_overflow |= overall_overflow;
 	  tmp = e0.add (tmp, SIGNED, &overflow);
 	  overall_overflow |= overall_overflow;
-	  if (tmp.neg_p (SIGNED))
+	  if (tmp.neg_p ())
 		{
 		  tmp = tmp.neg (&overflow);
 		  overall_overflow |= overall_overflow;
Index: gcc/expr.c
===
--- gcc/expr.c	(revision 201967)
+++ gcc/expr.c	(working copy)
@@ -6718,7 +6718,7 @@ get_inner_reference (tree exp, HOST_WIDE
   if (offset)
 {
   /* Avoid returning a negative bitpos as this may wreak havoc later.  */
-  if (bit_offset.neg_p (SIGNED))
+  if (bit_offset.neg_p ())
 {
 	  addr_wide_int mask
 	= addr_wide_int::mask (BITS_PER_UNIT == 8
Index: gcc/fold-const.c
===
--- gcc/fold-const.c	(revision 201967)
+++ gcc/fold-const.c	(working copy)
@@ -183,13 +183,13 @@ div_if_zero_remainder (const_tree arg1,
 	 precision by 1 bit, iff the top bit is set.  */
   if (sgn == UNSIGNED)
 	{
-	  if (warg1.neg_p (SIGNED))
+	  if (warg1.neg_p ())
 	warg1 = warg1.force_to_size (warg1.get_precision () + 1, sgn);
 	  sgn = SIGNED;
 	}
   else
 	{
-	  if (warg2.neg_p (SIGNED))
+	  if (warg2.neg_p ())
 	warg2 = warg2.force_to_size (warg2.get_precision () + 1, sgn2);
 	}
 }
@@ -979,7 +979,7 @@ int_const_binop_1 (enum tree_code code,
 
 case RSHIFT_EXPR:
 case LSHIFT_EXPR:
-  if (arg2.neg_p (SIGNED))
+  if (arg2.neg_p ())
 	{
 	  arg2 = -arg2;
 	  if (code == RSHIFT_EXPR)
@@ -999,7 +999,7 @@ int_const_binop_1 (enum tree_code code,
   
 case RROTATE_EXPR:
 case LROTATE_EXPR:
-  if (arg2.neg_p (SIGNED))
+  if (arg2.neg_p ())
 	{
 	  arg2 = -arg2;
 	  if (code == RROTATE_EXPR)
@@ -7180,7 +7180,7 @@ fold_plusminus_mult_expr (location_t loc
   /* As we canonicalize A - 2 to A + -2 get rid of that sign for
 	 the purpose of this canonicalization.  */
   if (TYPE_SIGN (TREE_TYPE (arg1)) == SIGNED
-	  && wide_int (arg1).neg_p (SIGNED)
+	  && wide_int (arg1).neg_p ()
 	  && negate_expr_p (arg1)
 	  && code == PLUS_EXPR)
 	{
@@ -12323,7 +12323,7 @@ fold_binary_loc (location_t loc,
 	  && TYPE_SIGN (type) == SIGNED
 	  && TREE_CODE (arg1) == INTEGER_CST
 	  && !TREE_OVERFLOW (arg1)
-	  && wide_int (arg1).neg_p (SIGNED)
+	  && wide_int (arg1).neg_p ()
 	  && !TYPE_OVERFLOW_TRAPS (type)
 	  /* Avoid this transformation if C is INT_MIN, i.e. C == -C.  */
 	  && !sign_bit_p (arg1, arg1))
Index: gcc/gimple-ssa-strength-reduction.c
===
--- gcc/gimple-ssa-strength-reduction.c	(revision 201967)
+++ gcc/gimple-ssa-strength-reduction.c	(working copy)
@@ -1824,7 +1824,7 @@ cand_abs_increment (slsr_cand_t c)
 {
   max_wide_int increment = cand_increment (c);
 
-  if (!address_arithmetic_p && increment.neg_p (SIGNED))
+  if (!address_arithmetic_p && increment.neg_p ())
 increment = -increment;
 
   return increment;
@@ -1872,7 +1872,7 @@

Re: wide-int branch now up for public comment and review

2013-08-24 Thread Kenneth Zadeck



The patch goes for (1) but (2) seems better to me, for a few reasons:

* As above, constants coming from rtl are already in the right form,
so if you create a wide_int from an rtx and only query it, no explicit
extension is needed.

* Things like logical operations and right shifts naturally preserve
the sign-extended form, so only a subset of write operations need
to take special measures.

right now the internals of wide-int do not keep the bits above the
precision clean.  as you point out, this could be fixed by changing
lshift, add, sub, mul, div (and anything else i have forgotten about)
and removing the code that cleans this up on exit.  I actually do not
really care which way we go here, but if we do go on keeping the bits
above the precision clean inside wide-int, we are going to have to clean
the bits in the constructors from rtl, or fix some/a lot of bugs.

But if you want to go with the stay-clean plan you have to start clean,
so at the rtl level this means copying.  and it is the not-copying trick
that pushed me in the direction we went.

At the tree level, this is not an issue.   There are no constructors for
tree-csts that do not have a type and before we copy the rep from the
wide-int to the tree, we clean the top bits.   (BTW - If i had to guess
what the bug is with the missing messages on the mips port, it would be
because the front ends HAD a bad habit of creating constants that did
not fit into a type and then later checking to see if there were any
interesting bits above the precision in the int-cst.  This now does not
work because we clean out those top bits on construction but it would
not surprise me if we missed the fixed point constant path.)   So at the
tree level, we could easily go either way here, but there is a cost at
the rtl level with doing (2).

TBH, I think we should do (2) and fix whatever bugs you saw with invalid
rtx constants.

luckily (or perhaps unluckily) if you try the test it fails pretty 
quickly - building gcclib on x86-64.  I have enclosed a patch to 
check this.  you can try it yourself and see if you really think this 
is the right path.


good luck, i fear you may need it.

on the other hand, if it is just a few quick bugs, then i will agree 
that (2) is the right path.


kenny


Index: gcc/wide-int.cc
===
--- gcc/wide-int.cc	(revision 201968)
+++ gcc/wide-int.cc	(working copy)
@@ -171,6 +171,10 @@ wide_int_ro::from_rtx (const rtx_mode_t
 case CONST_INT:
   result.val[0] = INTVAL (x);
   result.len = 1;
+
+  if (prec != HOST_BITS_PER_WIDE_INT)
+	gcc_assert (result.val[0] == sext_hwi (result.val[0], prec));
+
 #ifdef DEBUG_WIDE_INT
   debug_wh ("wide_int:: %s = from_rtx ("HOST_WIDE_INT_PRINT_HEX")\n",
 		result, INTVAL (x));


Re: wide-int branch now up for public comment and review

2013-08-25 Thread Kenneth Zadeck

On 08/25/2013 02:42 AM, Richard Sandiford wrote:

Kenneth Zadeck  writes:

On 08/24/2013 08:05 AM, Richard Sandiford wrote:

Richard Sandiford  writes:

I wonder how easy it would be to restrict this use of "zero precision"
(i.e. flexible precision) to those where primitive types like "int" are
used as template arguments to operators, and require a precision when
constructing a wide_int.  I wouldn't have expected "real" precision 0
(from zero-width bitfields or whatever) to need any special handling
compared to precision 1 or 2.

I tried the last bit -- requiring a precision when constructing a
wide_int -- and it seemed surprisingly easy.  What do you think of
the attached?  Most of the forced knock-on changes seem like improvements,
but the java part is a bit ugly.  I also went with "wide_int (0, prec).cmp"
for now, although I'd like to add static cmp, cmps and cmpu alongside
leu_p, etc., if that's OK.  It would then be possible to write
"wide_int::cmp (0, ...)" and avoid the wide_int construction altogether.

I wondered whether you might also want to get rid of the build_int_cst*
functions, but that still looks a long way off, so I hope using them in
these two places doesn't seem too bad.

This is just an incremental step.  I've also only run it through a
subset of the testsuite so far, but full tests are in progress...

So i am going to make two high level comments here and then i am going
to leave the ultimate decision to the community.   (1) I am mildly in
favor of leaving prec 0 stuff the way that it is (2) my guess is that
richi also will favor this.   My justification for (2) is because he had
a lot of comments about this before he went on leave and this is
substantially the way that it was when he left. Also, remember that one
of his biggest dislikes was having to specify precisions.

Hmm, but you seem to be talking about zero precision in general.
(I'm going to call it "flexible precision" to avoid confusion with
the zero-width bitfield stuff.)
I have tried to purge the zero width bitfield case from my mind. it was 
an ugly incident in the conversion.




Whereas this patch is specifically
about constructing flexible-precision _wide_int_ objects.  I think
wide_int objects should always have a known, fixed precision.
This is where we differ.  I do not.  The top level idea is really 
motivated by richi, but i have come to appreciate his criticism.  Much of 
the time, the specification of the precision is simply redundant and it 
glops up the code.



Note that fixed_wide_ints can still use primitive types in the
same way as before, since there the precision is inherent to the
fixed_wide_int.  The templated operators also work in the same
way as before.  Only the construction of wide_int proper is affected.

As it stands you have various wide_int operators that cannot handle two
flexible-precision inputs.  This means that innocent-looking code like:

   extern wide_int foo (wide_int);
   wide_int bar () { return foo (0); }

ICEs when combined with equally innocent-looking code like:

   wide_int foo (wide_int x) { return x + 1; }

So in practice you have to know when calling a function whether any
paths in that function will try applying an operator with a primitive type.
If so, you need to specify a precison when constructing the wide_int
argument.  If not you can leave it out.  That seems really unclean.
my wife, who is a lawyer, likes to quote an old British chancellor: 
"hard cases make bad law".
The fact that you occasionally have to specify one should not be 
justification for throwing out the entire thing.




The point of this template stuff is to avoid constructing wide_int objects
from primitive integers whereever possible.  And I think the fairly
small size of the patch shows that you've succeeded in doing that.
But I think we really should specify a precision in the handful of cases
where a wide_int does still need to be constructed directly from
a primitive type.

Thanks,
Richard

As i said earlier, let's see what others in the community feel about this.


wide-int branch

2013-08-25 Thread Kenneth Zadeck
cleaned up code to get around tree-vrp issue.   added some code that 
richard is going to play with to see how hard it would be to clean up 
rtl constants.


kenny
Index: gcc/wide-int.cc
===
--- gcc/wide-int.cc	(revision 201968)
+++ gcc/wide-int.cc	(working copy)
@@ -171,6 +171,12 @@ wide_int_ro::from_rtx (const rtx_mode_t
 case CONST_INT:
   result.val[0] = INTVAL (x);
   result.len = 1;
+
+#if 0
+  if (prec != HOST_BITS_PER_WIDE_INT)
+	gcc_assert (result.val[0] == sext_hwi (result.val[0], prec));
+#endif
+
 #ifdef DEBUG_WIDE_INT
   debug_wh ("wide_int:: %s = from_rtx ("HOST_WIDE_INT_PRINT_HEX")\n",
 		result, INTVAL (x));
Index: gcc/wide-int.h
===
--- gcc/wide-int.h	(revision 201968)
+++ gcc/wide-int.h	(working copy)
@@ -1843,7 +1843,6 @@ wide_int_ro::zforce_to_size (unsigned in
 inline HOST_WIDE_INT
 wide_int_ro::sign_mask () const
 {
-  int i = len - 1;
   if (precision < HOST_BITS_PER_WIDE_INT)
 {
   /* We don't allow a int:0 inside a struct to get this far,
@@ -1853,14 +1852,13 @@ wide_int_ro::sign_mask () const
 	  >> (HOST_BITS_PER_WIDE_INT - 1));
 }
 
-  /* VRP appears to be badly broken and this is a very ugly fix.  */
-  if (i >= 0)
-return val[i] >> (HOST_BITS_PER_WIDE_INT - 1);
-
-  gcc_unreachable ();
-#if 0
-  return val[len - 1] >> (HOST_BITS_PER_WIDE_INT - 1);
-#endif
+  /* TREE_VRP is not able to see that it is not possible for len to be
+ 0.  So without this test, it warns about this which causes
+ bootstrap failures.  */
+  if (len < 1)
+gcc_unreachable ();
+  else
+return val[len - 1] >> (HOST_BITS_PER_WIDE_INT - 1);
 }
 
 /* Return THIS & C.  */


Re: wide-int branch now up for public comment and review

2013-08-25 Thread Kenneth Zadeck

On 08/25/2013 06:55 PM, Mike Stump wrote:

On Aug 25, 2013, at 12:26 AM, Richard Sandiford  
wrote:

(2) Adding a new namespace, wi, for the operators.  So far this
just contains the previously-static comparison functions
and whatever else was needed to avoid cross-dependencies
between wi and wide_int_ro (except for the debug routines).

It seems reasonable; I don't see anything I object to.  Seems like most of the 
time, the code is shorter (though, you use wi, which is fairly short).  It 
doesn't seem any more complex, though, knowing how to spell the operation 
wide_int:: v wi:: is confusing on the client side.  I'm torn between this and 
the nice things that come with the patch.


(3) Removing the comparison member functions and using the static
ones everywhere.

I've love to have richi weigh in (or someone else that wants to play the role 
of C++ coding expert)…  I'd defer to them…


The idea behind using a namespace rather than static functions
is that it makes it easier to separate the core, tree and rtx bits.

Being able to separate core, tree and rtx bits gets a +1 in my book.  I do 
understand the beauty of this.


IMO wide-int.h shouldn't know about trees and rtxes, and all routines
related to them should be in tree.h and rtl.h instead.  But using
static functions means that you have to declare everything in one place.
Also, it feels odd for wide_int to be both an object and a home
of static functions that don't always operate on wide_ints, e.g. when
comparing a CONST_INT against 16.

Yes, though, does wi feel odd being a home for comparing a CONST_INT and 16?  
:-)
on the other hand, how else are you going to do this?  i have not seen 
anyone sign up to make an oo version of rtl, and even if they did that, 
the consts are just a small part of it.


i agree that it is odd, but then again, it is actually nice to have a 
largely similar interface for trees and rtl.





I realise I'm probably not being helpful here.

Iterating on how we want the code to look is reasonable.  Prettying it up 
where it needs it, is good.

Indeed, if the code is as you like, and as richi likes, well, then our mission 
is just about complete.  :-)  For this patch, I'd love to defer to richi (or 
someone that has a stronger opinion than I do) to say, better, worse…




wide-int branch updated

2013-08-26 Thread Kenneth Zadeck

fixed fits_uhwi_p.

tested on x86-64.

kenny
Index: gcc/wide-int.h
===
--- gcc/wide-int.h	(revision 201985)
+++ gcc/wide-int.h	(working copy)
@@ -1650,7 +1650,7 @@ wide_int_ro::fits_shwi_p () const
 inline bool
 wide_int_ro::fits_uhwi_p () const
 {
-  return len == 1 || (len == 2 && val[1] == 0);
+  return (len == 1 && val[0] >= 0) || (len == 2 && val[1] == 0);
 }
 
 /* Return the signed or unsigned min of THIS and C.  */


wide-int branch updated.

2013-08-27 Thread Kenneth Zadeck

removed all knowledge of SHIFT_COUNT_TRUNCATED from wide-int

both Richard Biener and Richard Sandiford had commented negatively about 
this.


fixed bug with wide-int::fits_uhwi_p.

kenny


Index: gcc/fold-const.c
===
--- gcc/fold-const.c	(revision 201985)
+++ gcc/fold-const.c	(working copy)
@@ -992,9 +992,9 @@ int_const_binop_1 (enum tree_code code,
 	/* It's unclear from the C standard whether shifts can overflow.
 	   The following code ignores overflow; perhaps a C standard
 	   interpretation ruling is needed.  */
-	res = op1.rshift (arg2, sign, GET_MODE_BITSIZE (TYPE_MODE (type)), TRUNC);
+	res = op1.rshift (arg2, sign, GET_MODE_BITSIZE (TYPE_MODE (type)));
   else
-	res = op1.lshift (arg2, GET_MODE_BITSIZE (TYPE_MODE (type)), TRUNC);
+	res = op1.lshift (arg2, GET_MODE_BITSIZE (TYPE_MODE (type)));
   break;
   
 case RROTATE_EXPR:
Index: gcc/simplify-rtx.c
===
--- gcc/simplify-rtx.c	(revision 201985)
+++ gcc/simplify-rtx.c	(working copy)
@@ -3507,9 +3507,7 @@ rtx
 simplify_const_binary_operation (enum rtx_code code, enum machine_mode mode,
  rtx op0, rtx op1)
 {
-#if TARGET_SUPPORTS_WIDE_INT == 0
   unsigned int width = GET_MODE_PRECISION (mode);
-#endif
 
   if (VECTOR_MODE_P (mode)
   && code != VEC_CONCAT
@@ -3787,40 +3785,45 @@ simplify_const_binary_operation (enum rt
 	  break;
 
 	case LSHIFTRT:
-	  if (wide_int (std::make_pair (op1, mode)).neg_p ())
-	return NULL_RTX;
-
-	  result = wop0.rshiftu (pop1, bitsize, TRUNC);
-	  break;
-	  
 	case ASHIFTRT:
-	  if (wide_int (std::make_pair (op1, mode)).neg_p ())
-	return NULL_RTX;
-
-	  result = wop0.rshifts (pop1, bitsize, TRUNC);
-	  break;
-	  
 	case ASHIFT:
-	  if (wide_int (std::make_pair (op1, mode)).neg_p ())
-	return NULL_RTX;
-
-	  result = wop0.lshift (pop1, bitsize, TRUNC);
-	  break;
-	  
 	case ROTATE:
-	  if (wide_int (std::make_pair (op1, mode)).neg_p ())
-	return NULL_RTX;
-
-	  result = wop0.lrotate (pop1);
-	  break;
-	  
 	case ROTATERT:
-	  if (wide_int (std::make_pair (op1, mode)).neg_p ())
-	return NULL_RTX;
+	  {
+	wide_int wop1 = pop1;
+	if (wop1.neg_p ())
+	  return NULL_RTX;
 
-	  result = wop0.rrotate (pop1);
-	  break;
+	if (SHIFT_COUNT_TRUNCATED)
+	  wop1 = wop1.umod_trunc (width); 
 
+	switch (code)
+	  {
+	  case LSHIFTRT:
+		result = wop0.rshiftu (wop1, bitsize);
+		break;
+		
+	  case ASHIFTRT:
+		result = wop0.rshifts (wop1, bitsize);
+		break;
+		
+	  case ASHIFT:
+		result = wop0.lshift (wop1, bitsize);
+		break;
+		
+	  case ROTATE:
+		result = wop0.lrotate (wop1);
+		break;
+		
+	  case ROTATERT:
+		result = wop0.rrotate (wop1);
+		break;
+
+	  default:
+		gcc_unreachable ();
+	  }
+	break;
+	  }
 	default:
 	  return NULL_RTX;
 	}
Index: gcc/wide-int.cc
===
--- gcc/wide-int.cc	(revision 201985)
+++ gcc/wide-int.cc	(working copy)
@@ -2404,7 +2404,12 @@ wide_int_ro::divmod_internal_2 (unsigned
 
 
 /* Do a truncating divide DIVISOR into DIVIDEND.  The result is the
-   same size as the operands.  SIGN is either SIGNED or UNSIGNED.  */
+   same size as the operands.  SIGN is either SIGNED or UNSIGNED.  If
+   COMPUTE_QUOTIENT is set, the quotient is computed and returned.  If
+   it is not set, the result is undefined.  If COMPUTE_REMAINDER is
+   set, the remainder is returned in REMAINDER.  If it is not set,
+   the remainder is undefined.  If OFLOW is not null, it is set to the
+   overflow value.  */
 wide_int_ro
 wide_int_ro::divmod_internal (bool compute_quotient,
 			  const HOST_WIDE_INT *dividend,
@@ -2470,6 +2475,10 @@ wide_int_ro::divmod_internal (bool compu
   quotient.precision = dividend_prec;
   remainder->precision = dividend_prec;
 
+  /* Initialize the incoming overflow if it has been provided.  */
+  if (oflow)
+*oflow = false;
+
   /* If overflow is set, just get out.  There will only be grief by
  continuing.  */
   if (overflow)
Index: gcc/wide-int.h
===
--- gcc/wide-int.h	(revision 201995)
+++ gcc/wide-int.h	(working copy)
@@ -259,19 +259,6 @@ const int addr_max_bitsize = 64;
 const int addr_max_precision
   = ((addr_max_bitsize + 4 + HOST_BITS_PER_WIDE_INT - 1) & ~(HOST_BITS_PER_WIDE_INT - 1));
 
-enum ShiftOp {
-  NONE,
-  /* There are two uses for the wide-int shifting functions.  The
- first use is as an emulation of the target hardware.  The
- second use is as service routines for other optimizations.  The
- first case needs to be identified by passing TRUNC as the value
- of ShiftOp so that shift amount is properly handled according to the
- SHIFT_COUNT_TRUNCATED flag.  For the second case, the shift
- amount is always truncated by the bytesize of the mode of
- THIS.  */
-  TRUNC
-};
-
 /*

Re: wide-int branch updated

2013-08-27 Thread Kenneth Zadeck
You are about an hour behind in reading your email; I had just 
committed a patch that is very close to this.


On 08/27/2013 02:31 PM, Richard Sandiford wrote:

Kenneth Zadeck  writes:

fixed fits_uhwi_p.

tested on x86-64.

kenny

Index: gcc/wide-int.h
===
--- gcc/wide-int.h  (revision 201985)
+++ gcc/wide-int.h  (working copy)
@@ -1650,7 +1650,7 @@ wide_int_ro::fits_shwi_p () const
  inline bool
  wide_int_ro::fits_uhwi_p () const
  {
-  return len == 1 || (len == 2 && val[1] == 0);
+  return (len == 1 && val[0] >= 0) || (len == 2 && val[1] == 0);
  }

With upper bits being undefined, it doesn't seem safe to check
val[0] or val[1] like this.   I was thinking along the lines of:

inline bool
wide_int_ro::fits_uhwi_p () const
{
   if (precision <= HOST_BITS_PER_WIDE_INT)
 return true;
   if (len == 1)
 return val[0] >= 0;
   if (precision < HOST_BITS_PER_WIDE_INT * 2)
 return ((unsigned HOST_WIDE_INT) val[1]
<< (HOST_BITS_PER_WIDE_INT * 2 - precision)) == 0;
   return val[1] == 0;
}

Since we don't have a sign, everything HWI-sized or smaller fits in a
uhwi without loss of precision.

I've tested the above on x86_64-linux-gnu FWIW, in case it looks OK.

Thanks,
Richard
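Richard's proposed check can be modelled outside GCC to see which (precision, len, val[]) combinations it accepts.  A minimal sketch, with `int64_t` standing in for HOST_WIDE_INT (so HOST_BITS_PER_WIDE_INT is 64); the function mirrors the version quoted above but is an illustration, not GCC's implementation:

```cpp
#include <cassert>
#include <cstdint>

// Model of the proposed fits_uhwi_p.  int64_t stands in for
// HOST_WIDE_INT; at most two elements are considered.
static const unsigned HBPWI = 64;

bool fits_uhwi_p (unsigned precision, unsigned len, const int64_t *val)
{
  if (precision <= HBPWI)
    return true;                    // HWI-sized or smaller always fits
  if (len == 1)
    return val[0] >= 0;             // sign-extended, so must be non-negative
  if (precision < HBPWI * 2)
    // Only the low (precision - 64) bits of val[1] are defined;
    // shift the undefined bits away before testing for zero.
    return ((uint64_t) val[1] << (HBPWI * 2 - precision)) == 0;
  return val[1] == 0;
}
```

The two-element case shows why the shift is needed: for a 70-bit value only bits 64..69 of `val[1]` are defined, and the left shift discards the undefined rest before the comparison.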




wide-int branch: cleaned up comparison functions.

2013-08-27 Thread Kenneth Zadeck
Removed the redundant implementations of several comparison functions by 
just forwarding the oo version to the static version.  Added static 
versions of cmp, cmpu and cmps.


kenny
Index: gcc/wide-int.h
===
--- gcc/wide-int.h	(revision 202033)
+++ gcc/wide-int.h	(working copy)
@@ -443,12 +443,18 @@ public:
 
   template 
   int cmp (const T &, signop) const;
+  template 
+  static int cmp (const T1 &, const T2 &, signop);
 
   template 
   int cmps (const T &) const;
+  template 
+  static int cmps (const T1 &, const T2 &);
 
   template 
   int cmpu (const T &) const;
+  template 
+  static int cmpu (const T1 &, const T2 &);
 
   bool only_sign_bit_p (unsigned int) const;
   bool only_sign_bit_p () const;
@@ -1137,38 +1143,7 @@ template 
 inline bool
 wide_int_ro::operator == (const T &c) const
 {
-  bool result;
-  HOST_WIDE_INT ws[WIDE_INT_MAX_ELTS];
-  const HOST_WIDE_INT *s;
-  unsigned int cl;
-  unsigned int p1, p2;
-
-  p1 = precision;
-
-  s = to_shwi1 (ws, &cl, &p2, c);
-  check_precision (&p1, &p2, true, false);
-
-  if (p1 == 0)
-/* There are prec 0 types and we need to do this to check their
-   min and max values.  */
-result = (len == cl) && (val[0] == s[0]);
-  else if (p1 < HOST_BITS_PER_WIDE_INT)
-{
-  unsigned HOST_WIDE_INT mask = ((HOST_WIDE_INT)1 << p1) - 1;
-  result = (val[0] & mask) == (s[0] & mask);
-}
-  else if (p1 == HOST_BITS_PER_WIDE_INT)
-result = val[0] == s[0];
-  else
-result = eq_p_large (val, len, p1, s, cl);
-
-  if (result)
-gcc_assert (len == cl);
-
-#ifdef DEBUG_WIDE_INT
-  debug_vwa ("wide_int_ro:: %d = (%s == %s)\n", result, *this, s, cl, p2);
-#endif
-  return result;
+  return wide_int_ro::eq_p (*this, c);
 }
 
 /* Return true if C1 == C2.  If both parameters have nonzero precisions,
@@ -1219,31 +1194,7 @@ template 
 inline bool
 wide_int_ro::lts_p (const T &c) const
 {
-  bool result;
-  HOST_WIDE_INT ws[WIDE_INT_MAX_ELTS];
-  const HOST_WIDE_INT *s;
-  unsigned int cl;
-  unsigned int p1, p2;
-
-  p1 = precision;
-  s = to_shwi1 (ws, &cl, &p2, c);
-  check_precision (&p1, &p2, false, true);
-
-  if (p1 <= HOST_BITS_PER_WIDE_INT
-  && p2 <= HOST_BITS_PER_WIDE_INT)
-{
-  gcc_assert (cl != 0);
-  HOST_WIDE_INT x0 = sext_hwi (val[0], p1);
-  HOST_WIDE_INT x1 = sext_hwi (s[0], p2);
-  result = x0 < x1;
-}
-  else
-result = lts_p_large (val, len, p1, s, cl, p2);
-
-#ifdef DEBUG_WIDE_INT
-  debug_vwa ("wide_int_ro:: %d = (%s lts_p %s\n", result, *this, s, cl, p2);
-#endif
-  return result;
+  return wide_int_ro::lts_p (*this, c);
 }
 
 /* Return true if C1 < C2 using signed comparisons.  */
@@ -1283,30 +1234,7 @@ template 
 inline bool
 wide_int_ro::ltu_p (const T &c) const
 {
-  bool result;
-  HOST_WIDE_INT ws[WIDE_INT_MAX_ELTS];
-  const HOST_WIDE_INT *s;
-  unsigned int cl;
-  unsigned int p1, p2;
-
-  p1 = precision;
-  s = to_shwi1 (ws, &cl, &p2, c);
-  check_precision (&p1, &p2, false, true);
-
-  if (p1 <= HOST_BITS_PER_WIDE_INT
-  && p2 <= HOST_BITS_PER_WIDE_INT)
-{
-  unsigned HOST_WIDE_INT x0 = zext_hwi (val[0], p1);
-  unsigned HOST_WIDE_INT x1 = zext_hwi (s[0], p2);
-  result = x0 < x1;
-}
-  else
-result = ltu_p_large (val, len, p1, s, cl, p2);
-
-#ifdef DEBUG_WIDE_INT
-  debug_vwa ("wide_int_ro:: %d = (%s ltu_p %s)\n", result, *this, s, cl, p2);
-#endif
-  return result;
+  return wide_int_ro::ltu_p (*this, c); 
 }
 
 /* Return true if C1 < C2 using unsigned comparisons.  */
@@ -1524,27 +1452,37 @@ wide_int_ro::ge_p (const T1 &c1, const T
 return geu_p (c1, c2);
 }
 
-/* Returns -1 if THIS < C, 0 if THIS == C and 1 if A > C using
+/* Returns -1 if THIS < C, 0 if THIS == C and 1 if THIS > C using
signed compares.  */
 template 
 inline int
 wide_int_ro::cmps (const T &c) const
 {
+  return wide_int_ro::cmps (*this, c);
+}
+
+/* Returns -1 if C1 < C2, 0 if C1 == C2 and 1 if C1 > C2 using
+   signed compares.  */
+template 
+inline int
+wide_int_ro::cmps (const T1 &c1, const T2 &c2)
+{
   int result;
-  HOST_WIDE_INT ws[WIDE_INT_MAX_ELTS];
-  const HOST_WIDE_INT *s;
-  unsigned int cl;
-  unsigned int prec;
+  HOST_WIDE_INT ws1[WIDE_INT_MAX_ELTS];
+  HOST_WIDE_INT ws2[WIDE_INT_MAX_ELTS];
+  const HOST_WIDE_INT *s1, *s2;  /* Returned data */
+  unsigned int cl1, cl2; /* array lengths  */
+  unsigned int p1, p2;   /* precisions */
 
-  s = to_shwi1 (ws, &cl, &prec, c);
-  if (prec == 0)
-prec = precision;
+  s1 = to_shwi1 (ws1, &cl1, &p1, c1);
+  s2 = to_shwi1 (ws2, &cl2, &p2, c2);
+  check_precision (&p1, &p2, false, true);
 
-  if (precision <= HOST_BITS_PER_WIDE_INT
-  && prec <= HOST_BITS_PER_WIDE_INT)
+  if (p1 <= HOST_BITS_PER_WIDE_INT
+  && p2 <= HOST_BITS_PER_WIDE_INT)
 {
-  HOST_WIDE_INT x0 = sext_hwi (val[0], precision);
-  HOST_WIDE_INT x1 = sext_hwi (s[0], prec);
+  HOST_WIDE_INT x0 = sext_hwi (s1[0], p1);
+  HOST_WIDE_INT x1 = 

Re: wide-int branch updated.

2013-08-28 Thread Kenneth Zadeck

On 08/28/2013 03:45 AM, Richard Biener wrote:

On Tue, 27 Aug 2013, Kenneth Zadeck wrote:


removed all knowledge of SHIFT_COUNT_TRUNCATED from wide-int

both Richard Biener and Richard Sandiford had commented negatively about
this.

fixed bug with wide-int::fits_uhwi_p.

  inline bool
  wide_int_ro::fits_uhwi_p () const
  {
-  return (len == 1 && val[0] >= 0) || (len == 2 && val[1] == 0);
+  return (precision <= HOST_BITS_PER_WIDE_INT)
+|| (len == 1 && val[0] >= 0)
+|| (len == 2 && (precision >= 2 * HOST_BITS_PER_WIDE_INT) && (val[1]
==
0))
+|| (len == 2 && (sext_hwi (val[1], precision &
(HOST_BITS_PER_WIDE_INT
- 1)) == 0));
  }

it now gets scary ;)  Still wrong for precision == 0?
No, because anything that comes in at precision 0 is already a canonized 
sign-extended number.  Precision 0 just means that it is safe at 
any precision.

;)

I wonder what it's semantic is ... in double_int we simply require
high == 0 (thus, negative numbers are not allowed).  with
precision <= HOST_BITS_PER_WIDE_INT you allow negative numbers.

Matching what double-int fits_uhwi does would be

(len == 1 && ((signed HOST_WIDE_INT)val[0]) >= 0)

It is signed, so I am matching this part.

|| (len == 2 && val[1] == 0)
so this does not work.  Say I had a 70-bit wide-int.  The bits 
above the precision are undefined, so I have to clear them out.  This is 
what the two len == 2 lines are for.  However, if the precision fills 
two HWIs or more, then we can do something this simple.
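The masking described here is what GCC's `sext_hwi`/`zext_hwi` helpers (hwint.h) do.  A self-contained sketch of their behavior, with `int64_t` in place of HOST_WIDE_INT; note it relies on arithmetic right shift of signed values, as GCC itself does:

```cpp
#include <cassert>
#include <cstdint>

// Sign/zero extension from the low PREC bits of a HWI.  For a 70-bit
// wide-int only bits 0..5 of the second element are defined, so they
// must be extended like this before comparing.
int64_t sext_hwi (int64_t x, unsigned prec)
{
  if (prec == 64)
    return x;
  unsigned shift = 64 - prec;
  // Shift the defined bits to the top, then arithmetic-shift back down.
  return (int64_t) ((uint64_t) x << shift) >> shift;
}

uint64_t zext_hwi (int64_t x, unsigned prec)
{
  if (prec == 64)
    return (uint64_t) x;
  return (uint64_t) x & (((uint64_t) 1 << prec) - 1);
}
```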


kenny

(I don't remember off-hand the signedness of val[], but eventually
you missed the conversion to signed)

Now, what double-int does is supposed to match
host_integerp (..., 1) which I think it does.

Richard.




Re: wide-int branch updated

2013-08-28 Thread Kenneth Zadeck

On 08/28/2013 03:50 AM, Richard Biener wrote:

On Tue, 27 Aug 2013, Richard Sandiford wrote:


Kenneth Zadeck  writes:

fixed fits_uhwi_p.

tested on x86-64.

kenny

Index: gcc/wide-int.h
===
--- gcc/wide-int.h  (revision 201985)
+++ gcc/wide-int.h  (working copy)
@@ -1650,7 +1650,7 @@ wide_int_ro::fits_shwi_p () const
  inline bool
  wide_int_ro::fits_uhwi_p () const
  {
-  return len == 1 || (len == 2 && val[1] == 0);
+  return (len == 1 && val[0] >= 0) || (len == 2 && val[1] == 0);
  }

With upper bits being undefined, it doesn't seem safe to check

Err - we're back at upper bits being undefined?  Ugh.  Then
having both fits_uhwi_p and fits_shwi_p doesn't make sense.


Yes, that is the problem.  Richard is going to look into what it might 
take to make rtl always have the upper bits canonized as sign 
extended.  If he/we can do this, then we will change everything so that 
it is always canonized above the precision.  Everyone thinks this is a 
good plan, assuming you do not have to rewrite every back end to do it.





val[0] or val[1] like this.   I was thinking along the lines of:

inline bool
wide_int_ro::fits_uhwi_p () const
{
   if (precision <= HOST_BITS_PER_WIDE_INT)
 return true;
   if (len == 1)
 return val[0] >= 0;
   if (precision < HOST_BITS_PER_WIDE_INT * 2)
 return ((unsigned HOST_WIDE_INT) val[1]
<< (HOST_BITS_PER_WIDE_INT * 2 - precision)) == 0;
   return val[1] == 0;
}

Since we don't have a sign, everything HWI-sized or smaller fits in a
uhwi without loss of precision.

Which then means fits_hwi_p () is simply

   if (precision <= HOST_BITS_PER_WIDE_INT)
 return true;
   zext ();  // ick, modification in a predicate!  stupid undefined bits!
   return len == 1 || (len == 2 && val[1] == 0);

We do not ever do this.  Where are you looking?


but if upper bits are undefined then the whole encoding and len
business is broken.

no it is not.


btw - welcome back.  we have missed you.

Richard.




Re: wide-int branch now up for public comment and review

2013-08-28 Thread Kenneth Zadeck



   Note that the bits above the precision are not defined and the
   algorithms used here are careful not to depend on their value.  In
   particular, values that come in from rtx constants may have random
   bits.

Which is a red herring.  It should be fixed.  I cannot even believe
that sentence given the uses of CONST_DOUBLE_LOW/CONST_DOUBLE_HIGH
or INTVAL/UINTVAL.  I don't see accesses masking out 'undefined' bits
anywhere.


Richi,

you asked for the entire patch as a branch so that you could look at the 
whole picture.  It is now time for you to do that.  I understand it is very 
large and it will take some time for you to get your head around the whole 
thing.  But remember this:


The vast majority of the clients are dealing with intermediate code that 
has explicit promotions.  Not only the rtl level, but the majority of 
the tree level, takes inputs that have explicit precisions that have been 
matched, and wants to see an explicit-precision result.  For this code, 
doing the fixed-precision thing, where you do not ever ask about what is 
behind the curtain, is a very good match.


However, there are parts of the compiler, all within the tree or gimple 
level, that do not have this view.  For these, there are two templates 
that export an interface behaving in a manner very similar to what 
double-int did when the precision was smaller than 128 bits.  (We 
discovered a large number of bugs when using double-int for TImode, 
because clients did make an infinite-precision assumption, but that is 
another story.)  All numbers are converted to signed numbers that are 
extended based on their input type, and the math is performed in a large 
enough field that they never push near the end.  We know what the 
end is because we sniff the port.
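The "large enough field" idiom can be sketched in miniature: if operands are capped at 32 bits and the math is done in 64, no single operation pushes near the end of the field.  The 32/64 split here is purely illustrative; the branch sizes the real field by sniffing the port:

```cpp
#include <cassert>
#include <cstdint>

// Operands of at most 32 bits are sign-extended into int64_t, so a
// sum or product of two of them can never overflow the field.
int64_t wide_add (int32_t a, int32_t b)
{
  return (int64_t) a + (int64_t) b;   // at most 33 significant bits
}

int64_t wide_mul (int32_t a, int32_t b)
{
  return (int64_t) a * (int64_t) b;   // at most 63 significant bits
}
```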


At that point we looked at the pass we were converting and used the 
implementation that matched the style of the code and the 
algorithm; i.e., we made no substantive changes.  As mentioned in my 
earlier mail, I plan to change tree-ssa-ccp in the future to use the 
fixed-precision form, but that change is motivated by being 
able to find more constants with the proper representation, not by what is 
beautiful.  This is the only case where I think the rep should be 
substantially changed.


I know you find the fixed-precision stuff unappealing.  But the truth 
is that you wanted us to make this patch so that it did as little damage 
to the way the compiler works as possible, and given that so much of the 
compiler actually does fixed-precision math, this is the path of least 
resistance.


If it is reasonable to change the rtl, we may change that, but the truth 
is that the clients never see this, so it is not as much of an issue as 
you are making it.  Now that you can see all of the clients, see this for 
yourself.


Kenny




Re: wide-int branch now up for public comment and review

2013-08-28 Thread Kenneth Zadeck

On 08/28/2013 12:45 PM, Mike Stump wrote:

On Aug 28, 2013, at 5:48 AM, Richard Biener  wrote:

Only if the precision is > HOST_BITS_PER_WIDE_INT.  If the precision
is HOST_BITS_PER_WIDE_INT then both are { -1U }.

That wasn't my understanding on how things work.

You are thinking about prec==0 numbers.  These are useful and important for 
ease of use of the wide-int package.  They allow one to do:

   wide_int w = …;

   w = w + 6;
   w = w - 3;
   w = w + (unsigned HOST_WIDE_INT)~0;

and extend the constant out to the precision of the other side.  This is a very 
narrow feature and not a general property of a wide_int.  In general, 
signedness of a wide_int is an external feature of wide_int.  We only permit 
prec==0 numbers for ease of use, and ease of use needs to track the sign, in 
the general case.  Now, one is free to have a precision that allows the sign to 
be stored, this is available to the user, if they want.  They merely are not 
forced to do this.  For example, RTL largely doesn't want or need a sign.


Right, that's what the constructors, from_* and to_* routines do.

I wonder where the from_tree and to_tree ones are?

tree t;
wide_int w = t;

wide_int_to_tree needs an additional type, so, the spelling is not as short out 
of necessity.

I made wide_int_to_tree a function that lives in tree.[ch], not a member 
function of wide-int.  This seemed to be consistent with the way other 
things were done.  If you want it to be a member function, that is 
certainly doable.




Are they
from_double_int / wide_int_to_tree (what's wide_int_to_infinite_tree?)

I think wide_int_to_infinite_tree is leftover junk.  I removed it:

diff --git a/gcc/wide-int.h b/gcc/wide-int.h
index 86be20a..83c2170 100644
--- a/gcc/wide-int.h
+++ b/gcc/wide-int.h
@@ -4203,8 +4203,6 @@ wide_int_ro::to_shwi2 (HOST_WIDE_INT *s ATTRIBUTE_UNUSED,
  /* tree related routines.  */
  
  extern tree wide_int_to_tree (tree type, const wide_int_ro &cst);

-extern tree wide_int_to_infinite_tree (tree type, const wide_int_ro &cst,
-  unsigned int prec);
  extern tree force_fit_type_wide (tree, const wide_int_ro &, int, bool);
  
  /* real related routines.  */






Re: wide-int branch now up for public comment and review

2013-08-28 Thread Kenneth Zadeck



   Note that the bits above the precision are not defined and the
   algorithms used here are careful not to depend on their value.  In
   particular, values that come in from rtx constants may have random
   bits.

Which is a red herring.  It should be fixed.  I cannot even believe
that sentence given the uses of CONST_DOUBLE_LOW/CONST_DOUBLE_HIGH
or INTVAL/UINTVAL.  I don't see accesses masking out 'undefined' bits
anywhere.

I can agree with you that this could be fixed, but it is not necessary 
to fix it.  The rtl level, and most of the tree level, has existed for a 
long time by doing math within the precision.


You do not see the masking at the rtl level because the masking is not 
necessary: if you never look beyond the precision, you just do not 
care.  There is the issue with divide and mod, and quite frankly the 
code on the trunk scares me to death.  My code at the rtl level makes 
sure that everything is clean, since I can see whether it is a udiv or a div, 
which is enough info to clean the value up.



I have a feeling I'm rehashing a past debate, sorry, but rtx constants can't
have random bits.  The upper bits must be a sign extension of the value.
There's exactly one valid rtx for each (value, mode) pair.  If you saw
something different then that sounds like a bug.  The rules should already
be fairly well enforced though, since something like (const_int 128) --
or (const_int 256) -- will not match a QImode operand.

See.  We're saved ;)
this is Richard's theory.  There is a block of code at wide-int.cc:175, 
currently ifdefed out, that checks whether the rtl consts are 
canonical.  If you bring it in, building the libraries fails very 
quickly.  The million-dollar question is whether this is the only bug or 
the first of a thousand.  Your comment about not seeing any 
masking cuts both ways: there is very little code on the way in, if you 
create ints with GEN_INT, that makes sure someone has done the right 
thing on that side.  So my guess is that there are a lot of failures, 
and a large number of them will be in the ports.
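For reference, the canonical-form rule that ifdefed-out block tests can be stated in isolation: a CONST_INT is valid for a PREC-bit mode only if it equals the sign extension of its own low PREC bits, which is what GCC's trunc_int_for_mode enforces.  A sketch with `int64_t` for HOST_WIDE_INT (an illustration, not GCC code):

```cpp
#include <cassert>
#include <cstdint>

// Canonize X to PREC bits: sign-extend from the low PREC bits, as
// trunc_int_for_mode does.  PREC is assumed to be 1..64.
int64_t canonize (int64_t x, unsigned prec)
{
  unsigned shift = 64 - prec;
  return (int64_t) ((uint64_t) x << shift) >> shift;
}

// Each (value, mode) pair has exactly one valid representation.
bool canonical_const_int_p (int64_t x, unsigned prec)
{
  return canonize (x, prec) == x;
}
```

This is why `(const_int 128)` cannot match a QImode operand: the canonical QImode form of that bit pattern is -128.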


If Richard is right, then there will be changes.  The internals of 
wide-int will be changed so that everything is maintained in canonical 
form, rather than doing the canonization only on the outside.  This will 
clean up things like fits_uhwi_p and a lot more of the wide-int internals.


But I think that if you want to change the compiler to use infinite-precision 
arithmetic, you really ought to have a better reason than that you 
think it is cleaner.  It buys you nothing for most of the 
compiler, it is slower, AND it is a much bigger change to the compiler 
than the one we want to make.




This is probably the part of the representation that I disagree most with.
There seem to be two main ways we could hande the extension to whole HWIs:

(1) leave the stored upper bits undefined and extend them on read
(2) keep the stored upper bits in extended form

The patch goes for (1) but (2) seems better to me, for a few reasons:

I agree whole-heartedly.
My statement is above: if we can do (2), then we should.  But I do not 
think it is worth several person-years to do so.



* As above, constants coming from rtl are already in the right form,
   so if you create a wide_int from an rtx and only query it, no explicit
   extension is needed.

* Things like logical operations and right shifts naturally preserve
   the sign-extended form, so only a subset of write operations need
   to take special measures.

* You have a public interface that exposes the underlying HWIs
   (which is fine with me FWIW), so it seems better to expose a fully-defined
   HWI rather than only a partially-defined HWI.

E.g. zero_p is:

   HOST_WIDE_INT x;

   if (precision && precision < HOST_BITS_PER_WIDE_INT)
 x = sext_hwi (val[0], precision);
   else if (len == 0)
 {
   gcc_assert (precision == 0);
   return true;
 }
   else
 x = val[0];

   return len == 1 && x == 0;

but I think it really ought to be just:

   return len == 1 && val[0] == 0;

Yes!
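The difference between the two zero_p variants quoted above can be made concrete.  A model, not GCC code, with `int64_t` standing in for HOST_WIDE_INT:

```cpp
#include <cassert>
#include <cstdint>

// With undefined bits above the precision, the low element must be
// re-extended before testing.
bool zero_p_undefined_bits (const int64_t *val, unsigned len,
                            unsigned precision)
{
  int64_t x = val[0];
  if (precision && precision < 64)
    {
      unsigned shift = 64 - precision;
      x = (int64_t) ((uint64_t) x << shift) >> shift;  // sext_hwi
    }
  return len == 1 && x == 0;
}

// With a canonical (always sign-extended) representation, zero has
// exactly one encoding, so the short test suffices.
bool zero_p_canonical (const int64_t *val, unsigned len)
{
  return len == 1 && val[0] == 0;
}
```

An 8-bit value stored with junk above bit 7 illustrates the gap: the undefined-bits version must mask before it can see that the value is zero, while the short test is only valid once the stored form is canonical.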

But then - what value does keeping track of a 'precision' have
in this case?  It seems to me it's only a "convenient carrier"
for

   wide_int x = wide-int-from-RTX (y);
   machine_mode saved_mode = mode-available? GET_MODE (y) : magic-mode;
   ... process x ...
   RTX = RTX-from-wide_int (x, saved_mode);

that is, wide-int doesn't do anything with 'precision' but you
can extract it later to not need to remember a mode you were
interested in?

Oh, and of course some operations require a 'precision', like rotate.
As I have said, when the operands and result are all precision-correct, 
as they generally are in both rtl and tree, the default fixed-precision 
interface is really very clean.  I know you object to some 
implementation issues, but this is why we put the branch up, so you can 
see what you get from the other side.




   When the precision is 0, all the bits in the LEN elements of
   VE

Re: wide-int branch now up for public comment and review

2013-08-29 Thread Kenneth Zadeck

On 08/29/2013 04:42 AM, Richard Biener wrote:

On Wed, 28 Aug 2013, Kenneth Zadeck wrote:


Note that the bits above the precision are not defined and the
algorithms used here are careful not to depend on their value.  In
particular, values that come in from rtx constants may have random
bits.

Which is a red herring.  It should be fixed.  I cannot even believe
that sentence given the uses of CONST_DOUBLE_LOW/CONST_DOUBLE_HIGH
or INTVAL/UINTVAL.  I don't see accesses masking out 'undefined' bits
anywhere.


I can agree with you that this could be fixed, but it is not necessary to
fix it.  The rtl level, and most of the tree level, has existed for a long time
by doing math within the precision.

You do not see the masking at the rtl level because the masking is not
necessary: if you never look beyond the precision, you just do not care.
There is the issue with divide and mod, and quite frankly the code on the trunk
scares me to death.  My code at the rtl level makes sure that everything is
clean, since I can see whether it is a udiv or a div, which is enough info
to clean the value up.


I have a feeling I'm rehashing a past debate, sorry, but rtx constants
can't
have random bits.  The upper bits must be a sign extension of the value.
There's exactly one valid rtx for each (value, mode) pair.  If you saw
something different then that sounds like a bug.  The rules should already
be fairly well enforced though, since something like (const_int 128) --
or (const_int 256) -- will not match a QImode operand.

See.  We're saved ;)

this is Richard's theory.  There is a block of code at wide-int.cc:175, currently
ifdefed out, that checks whether the rtl consts are canonical.  If you bring
it in, building the libraries fails very quickly.  The million-dollar
question is whether this is the only bug or the first of a thousand.  Your
comment about not seeing any masking cuts both ways: there is very little
code on the way in, if you create ints with GEN_INT, that makes sure someone has
done the right thing on that side.  So my guess is that there are a lot of
failures, and a large number of them will be in the ports.

Well, clearly RTL code _expects_ constants to be properly extended.  I
can reproduce the bug and I simply guess that's a matter of mismatching
modes here (or performing an implicit truncation without properly
extending the constant).

 at /space/rguenther/src/svn/wide-int/gcc/combine.c:10086
10086                 GEN_INT (count));
(gdb) l
10081
10082           mask_rtx = GEN_INT (nonzero_bits (varop, GET_MODE (varop)));
10083
10084           mask_rtx
10085             = simplify_const_binary_operation (code, result_mode, mask_rtx,
10086                                                GEN_INT (count));

uses of GEN_INT are frowned upon ... for exactly that reason - the
mask_rtx is not a proper RTL constant for SImode.
Over time, the GEN_INTs will go away at the portable rtl level as more 
of the code is transitioned to use wide-int.  The port story is not so 
good.  Any port that uses TImode or beyond will likely evolve to using 
wide-int for the math, unifying the cases between CONST_INT and 
CONST_WIDE_INT (the wide-int constructor from rtl takes either, and the 
conversion to rtl looks at the constant).  But a port like mips that 
has no TImode will likely never change.


Btw, all this isn't a reason to not have a well-defined wide-int
representation.  You just have (as you have for trees) to properly
canonize the representation at the time you convert from RTL
constants to wide-int constants.
The wide-int representation is completely well defined; I cannot see why 
you do not understand this.  You do not need to look above 
the precision, so there is nothing undefined!  I know this bothers 
you very badly, and I will agree that if it is easy to clean up rtl to 
match this, it would simplify the code somewhat.  But it is not 
undefined or random.  Furthermore, cleaning it up will not change the ABI, 
so if rtl gets cleaner, we can change this.



In the long run we want a uniform representation of constants
so we can do zero-copying - but it looks like we now have
three different representations - the tree one (sign or zero
extended dependent on sign), RTL (garbage as you show) and
wide-int (always sign-extended).
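The three encodings differ only in how the bits above the precision are filled.  For the 8-bit pattern 0x80 they can be compared with a small sketch; `int64_t` stands in for HOST_WIDE_INT and the helper name is invented:

```cpp
#include <cassert>
#include <cstdint>

// Encode the low PREC bits of BITS into a full HWI.  Trees extend
// according to the type's sign; wide-int always sign-extends; an
// un-canonized RTL constant could carry arbitrary upper bits instead.
int64_t encode (uint64_t bits, unsigned prec, bool sign_extend)
{
  uint64_t low = bits & (((uint64_t) 1 << prec) - 1);
  if (!sign_extend)
    return (int64_t) low;                       // tree, unsigned type
  unsigned shift = 64 - prec;
  return (int64_t) (low << shift) >> shift;     // tree signed / wide-int
}
```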
You wanted this to look more like double-int; now that it does, you 
see the flaws in it.  Do I ever get to win?  I do not think anyone 
would have been served well by trying to make trees have only signed 
constants in the way that double-int did.  I will admit that the 
implementation would be cleaner if we could just change everything else 
in the compiler to match a super-clean wide-int design.  I took the 
other approach and bottled as much of the ugliness as possible behind the api and 
tried to keep the rest of the compiler

Re: wide-int branch now up for public comment and review

2013-08-29 Thread Kenneth Zadeck

On 08/28/2013 05:08 AM, Richard Biener wrote:

On Sun, 25 Aug 2013, Mike Stump wrote:


On Aug 25, 2013, at 12:26 AM, Richard Sandiford  
wrote:

(2) Adding a new namespace, wi, for the operators.  So far this
just contains the previously-static comparison functions
and whatever else was needed to avoid cross-dependencies
between wi and wide_int_ro (except for the debug routines).

It seems reasonable; I don't see anything I object to.  Seems like most of the 
time, the code is shorter (though, you use wi, which is fairly short).  It 
doesn't seem any more complex, though, knowing how to spell the operation 
wide_int:: v wi:: is confusing on the client side.  I'm torn between this and 
the nice things that come with the patch.


(3) Removing the comparison member functions and using the static
ones everywhere.

I'd love to have richi weigh in (or someone else who wants to play the 
role of C++ coding expert); I'd defer to them.

Yeah - wi::lt (a, b) is much better than a.lt (b) IMHO.  It mimics how
the standard library works.


The idea behind using a namespace rather than static functions
is that it makes it easier to separate the core, tree and rtx bits.

Being able to separate core, tree and rtx bits gets a +1 in my book.  I
do understand the beauty of this.

Now, if you look back in discussions I wanted a storage
abstraction anyway.  Basically the interface is

class wide_int_storage
{
   int precision ();
   int len ();
   element_t get (unsigned);
   void set (unsigned, element_t);
};

and wide_int is then templated like

template 
class wide_int : public storage
{
};

where RTX / tree storage classes provide read-only access to their
storage and a rvalue integer rep to its value.
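A minimal sketch of that templated-storage idea; all names are invented for illustration, and the real proposal would add rtx- and tree-backed storages alongside an embedded one:

```cpp
#include <cassert>
#include <cstdint>

typedef int64_t element_t;

// An embedded storage policy: owns a small fixed array, exposes the
// read-only interface from the proposal (precision/len/get).
class embedded_storage
{
  element_t vals[2];
  unsigned nelts, prec;

public:
  embedded_storage (element_t v, unsigned p) : nelts (1), prec (p)
  { vals[0] = v; }
  unsigned precision () const { return prec; }
  unsigned len () const { return nelts; }
  element_t get (unsigned i) const { return vals[i]; }
};

// The integer class is templated on the storage policy; an RTX- or
// tree-backed storage could hand out the host object's bits without
// copying them.
template <typename storage>
class wint : public storage
{
public:
  wint (element_t v, unsigned p) : storage (v, p) {}
  // With a canonical representation, zero has exactly one encoding.
  bool zero_p () const { return this->len () == 1 && this->get (0) == 0; }
};
```

A user would write `wint<embedded_storage>` for an owned value, and a hypothetical `wint<rtx_storage>` to view a CONST_WIDE_INT in place.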

You can look at my example draft implementation I posted some
months ago.  But I'll gladly wiggle on the branch to make it
more like above (easy step one: don't access the wide-int members
directly but via accessor functions)
You are of course welcome to do this.  But there were two questions 
that I never got an answer to.


1) How does doing this help in any way?  Are there clients that will 
use it?


2) Isn't this just going to slow wide-int down?  It seems clear that 
there is a real performance cost: there is now going to be an 
extra step of indirection on each access to the underlying data structures.


I will point out that we have cut down on the copying by sharing the 
underlying value from trees or rtl when the value is short-lived.  But 
given that most constants take only a single HWI to represent, getting 
rid of the copying seems like a very small payback for the increase in 
access costs.



IMO wide-int.h shouldn't know about trees and rtxes, and all routines
related to them should be in tree.h and rtl.h instead.  But using
static functions means that you have to declare everything in one place.
Also, it feels odd for wide_int to be both an object and a home
of static functions that don't always operate on wide_ints, e.g. when
comparing a CONST_INT against 16.

Indeed - in my sample the wide-int-rtx-storage and wide-int-tree-storage
storage models were declared in rtl.h and tree.h and wide-int.h did
know nothing about them.

this could be done for us too, it has nothing to do with the storage model.

Yes, though, does wi feel odd being a home for comparing a CONST_INT and
16?  :-)


I realise I'm probably not being helpful here.

Iterating on how we want the code to look is reasonable.  Prettying
it up where it needs it is good.

Indeed, if the code is as you like, and as richi likes, well, then our
mission is just about complete.  :-)  For this patch, I'd love to defer
to richi (or someone that has a stronger opinion than I do) to say,
better, worse?

The comparisons?  Better.

richard s is making these static.

Thanks,
Richard.




wide-int branch: fixed mips regression

2013-08-29 Thread Kenneth Zadeck

Fixed

FAIL: gcc.dg/fixed-point/int-warning.c  (test for warnings, line 12)

on


 mips64-unknown-linux-gnu

Index: gcc/tree.c
===
--- gcc/tree.c	(revision 202088)
+++ gcc/tree.c	(working copy)
@@ -8531,11 +8531,11 @@ bool
 int_fits_type_p (const_tree c, const_tree type)
 {
   tree type_low_bound, type_high_bound;
-  bool ok_for_low_bound, ok_for_high_bound, unsc;
+  bool ok_for_low_bound, ok_for_high_bound;
   wide_int wc, wd;
+  signop sgn_c = TYPE_SIGN (TREE_TYPE (c));
 
   wc = c;
-  unsc = TYPE_UNSIGNED (TREE_TYPE (c));
 
 retry:
   type_low_bound = TYPE_MIN_VALUE (type);
@@ -8555,17 +8555,17 @@ retry:
   if (type_low_bound && TREE_CODE (type_low_bound) == INTEGER_CST)
 {
   wd = type_low_bound;
-  if (unsc != TYPE_UNSIGNED (TREE_TYPE (type_low_bound)))
+  if (sgn_c != TYPE_SIGN (TREE_TYPE (type_low_bound)))
 	{
-	  int c_neg = (!unsc && wc.neg_p ());
-	  int t_neg = (unsc && wd.neg_p ());
+	  int c_neg = (sgn_c == SIGNED && wc.neg_p ());
+	  int t_neg = (sgn_c == UNSIGNED && wd.neg_p ());
 
 	  if (c_neg && !t_neg)
 	return false;
 	  if ((c_neg || !t_neg) && wc.ltu_p (wd))
 	return false;
 	}
-  else if (wc.lt_p (wd, TYPE_SIGN (TREE_TYPE (type_low_bound))))
+  else if (wc.lt_p (wd, sgn_c))
 	return false;
   ok_for_low_bound = true;
 }
@@ -8576,17 +8576,17 @@ retry:
   if (type_high_bound && TREE_CODE (type_high_bound) == INTEGER_CST)
 {
   wd = type_high_bound;
-  if (unsc != TYPE_UNSIGNED (TREE_TYPE (type_high_bound)))
+  if (sgn_c != TYPE_SIGN (TREE_TYPE (type_high_bound)))
 	{
-	  int c_neg = (!unsc && wc.neg_p ());
-	  int t_neg = (unsc && wd.neg_p ());
+	  int c_neg = (sgn_c == SIGNED && wc.neg_p ());
+	  int t_neg = (sgn_c == UNSIGNED && wd.neg_p ());
 
 	  if (t_neg && !c_neg)
 	return false;
 	  if ((t_neg || !c_neg) && wc.gtu_p (wd))
 	return false;
 	}
-  else if (wc.gt_p (wd, TYPE_SIGN (TREE_TYPE (type_high_bound))))
+  else if (wc.gt_p (wd, sgn_c))
 	return false;
   ok_for_high_bound = true;
 }
@@ -8600,7 +8600,7 @@ retry:
   /* Perform some generic filtering which may allow making a decision
  even if the bounds are not constant.  First, negative integers
  never fit in unsigned types, */
-  if (TYPE_UNSIGNED (type) && !unsc && wc.neg_p ())
+  if (TYPE_UNSIGNED (type) && sgn_c == SIGNED && wc.neg_p ())
 return false;
 
   /* Second, narrower types always fit in wider ones.  */
@@ -8608,7 +8608,7 @@ retry:
 return true;
 
   /* Third, unsigned integers with top bit set never fit signed types.  */
-  if (! TYPE_UNSIGNED (type) && unsc && wc.neg_p ())
+  if (!TYPE_UNSIGNED (type) && sgn_c == UNSIGNED && wc.neg_p ())
 return false;
 
   /* If we haven't been able to decide at this point, there nothing more we
Index: gcc/fold-const.c
===
--- gcc/fold-const.c	(revision 202088)
+++ gcc/fold-const.c	(working copy)
@@ -1690,7 +1690,7 @@ fold_convert_const_int_from_fixed (tree
   /* Given a fixed-point constant, make new constant with new type,
  appropriately sign-extended or truncated.  */
   t = force_fit_type (type, wide_int::from_double_int (temp, 
-		   TYPE_PRECISION (type)), 
+		   HOST_BITS_PER_DOUBLE_INT), 
 		  -1,
 		  (temp.is_negative ()
 		   && (TYPE_UNSIGNED (type)


converting wide-int so that it uses its own type rather than hwi.

2013-08-30 Thread Kenneth Zadeck

richi,

on further thought, this is going to be a huge task.  The problem is at 
the edges.  Right now we share the rep of the array with the tree-csts 
and rtl (though the rtl sharing may go away to support canonization).  
So if the hwi rep inside of wide-int changes, then we will have to 
implement copying with reblocking at that interface, or push the type 
in there and change all of the fits_*hwi and to_*hwi interfaces to fit 
this different type.


I think I can get at least as good and perhaps better test coverage by 
changing the rep of hwi for a port.  There will also be fallout work 
there, but it will be productive, in that it is just changing code from 
only doing the hwi case to supporting all precisions.


Kenny


Re: [RFC] Changes to the wide-int classes

2013-09-01 Thread Kenneth Zadeck
My main discomfort is the double-int part.  At this point in time the 
only usage of double-int left on the branch is because the fixed ints 
use it as their underlying representation.  I think that rather than 
doing a lot of work to accommodate this, it would be better to just 
change the fixed representation.


There is no place for exactly two HWIs in the machine independent parts 
of the compiler, and the truth is that there is no (except for the fixed 
stuff) use of double ints remaining in the ports. Even if, by sniffing 
the ports, we end up with a flavor of wide-int that has two HWIs as 
its rep, it should not be called double-int or expose what its rep is.


I do understand that richi had requested this in some of the early 
discussions but given that there are virtually no clients for it left, 
lets let this die.


As for the rest of the patch, this is for the c++ experts to hash out.   
I have no problems with it but i am not an expert and so if the 
consensus likes it, then we can go in this direction.


Small bugs below this line:
The bottom of frag 3 of gcc/cp/init.c is wrong: you replaced 
rshift...lshift with lshift...lshift.


I will finish reading this tomorrow, but I wanted to get some comments 
in for the early shift.  I stopped reading at line 1275.


kenny

On 09/01/2013 03:21 PM, Richard Sandiford wrote:

This is an RFC and patch for an alternative way of organising the
wide-int classes, along the lines I mentioned earlier.  The main points
are below, each with a "headline" and a bit of extra waffle that can be
skipped if too long:

* As Richard requested, the main wide int class is parameterised by the
   storage:

 template <typename storage>
 class GTY(()) generic_wide_int : public storage

* As Richard also requested, double_int is now implemented in terms of
   the wide-int routines.

   This didn't work out quite as elegantly as I'd hoped due to conflicting
   requirements.  double_int is used in unions and so needs to be a POD,
   whereas the fancy things we want to allow for wide_int and fixed_wide_int
   mean that they need to have constructors.  The patch therefore keeps
   double_int as the basic storage class and defines double_int_ext as
   the wide-int class.  All the double_int methods therefore need to be
   kept, but are now simple wrappers around the wi:: routines.

   double_int_ext and fixed_wide_int  are
   assignment-compatible.

   This is just to show that it's possible though.  It probably isn't
   very efficient...

* wide-int.h no longer includes tree.h, rtl.h or double-int.h.

   The rtx and machine_mode routines are now in rtl.h, and the
   tree-related ones are in tree.h.  double-int.h now depends on
   wide-int.h, as described above.

* wide-int.h no longer includes tm.h.

   This is done by adding a new MAX_BITS_PER_UNIT to machmode.def,
   so that the definition of MAX_BITSIZE_MODE_ANY_MODE no longer relies on
   BITS_PER_UNIT.  Although I think we usually assume that BITS_PER_UNIT
   is a constant, that wouldn't necessarily be true if we ever did support
   multi-target compilers in future.  MAX_BITS_PER_UNIT is logically
   the maximum value of BITS_PER_UNIT for any compiled-in target and must
   be a constant.

* Precision 0 is no longer a special marker for primitive types like ints.
   It's only used for genuine 0-width integers.

* The wide-int classes are now relatively light-weight.  All the real
   work is done by wi:: routines.

   There are still operator methods for addition, multiplication, etc.,
   but they just forward to the associated wi:: routine.  I also reluctantly
   kept and_not and or_not as operator-like methods for now, although I'd
   like to get rid of them and just keep the genuine operators.  The problem
   is that I'd have liked the AND routine to be "wi::and", but of course that
   isn't possible with "and" being a keyword, so I went for "wi::bit_and"
   instead.  Same for "not" and "wi::bit_not", and "or" and "wi::bit_or".
   Then it seemed like the others should be bit_* too, and "wi::bit_and_not"
   just seems a bit unwieldy...

   Hmm, if we decide to forbid the use of "and" in gcc, perhaps we could
   #define it to something safe.  But that would probably be too confusing.
   I'm sure those who like stepping through gcc with gdb are going to hate
   this patch already, without that to make things worse...

* fixed_wide_int <N> now only has the number of HWIs required by N.
   This makes addr_wide_int significantly smaller.

* fixed_wide_int <N> doesn't have a precision field; the precision
   is always taken directly from the template argument.

   This isn't a win sizewise, since the length and precision fitted snugly
   into a HWI slot, but it means that checks for particular precisions
   can be resolved at compile time.  E.g. the fast single-HWI paths are
   now dropped when dealing with fixed_wide_ints.

* Each integer type is classified as one of: FLEXIBLE_PRECISION,
   CONST_PRECISION and VAR_PRECISION.

   FL

Re: [RFC] Changes to the wide-int classes

2013-09-02 Thread Kenneth Zadeck

On 09/02/2013 05:35 AM, Richard Biener wrote:

On Mon, 2 Sep 2013, Richard Sandiford wrote:


Kenneth Zadeck  writes:

There is no place for exactly two HWIs in the machine independent parts
of the compiler,

I totally agree.  In fact I was taking that so much for granted that
I didn't even think to add a rider about it, sorry.  I didn't mean
to imply that we should keep double_int around.

I think the reason for doing this is to prove that it can be done
(so that the wide_int code isn't too big a change for the tree level)
and to make it easier to merge the wide-int patches into trunk piecemeal
if we need to.

Note that changing the tree rep to non-double_int is easy.  Also note
that I only want 'double_int' to stay around to have a fast type
that can work on two HWIs for the code that need more precision
than a HWI.  The only important cases I know of are in
get_inner_reference and get_ref_base_and_extent and friends.  Those
are heavily used (and the only double-int function callers that
even show up in regular cc1 profiles).

So if wide_int_fixed<2> ('2' better be replaced with
'number-that-gets-me-twice-target-sizetype-precision')
works for those cases then fine (and we can drop double-ints).
This is what the addr_wide_int is for: it sniffs the port and is 
guaranteed to be big enough to hold the largest pointer, plus 4 bits 
(3 bits for bit position and 1 bit so that unsigned things come over 
with no loss); then that is rounded up to the next hwi.  (I will admit 
that the sniffing code needs a little work, but that can be fixed 
without changing the api.)
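The width computation being described can be sketched like this; the 4 extra bits and the round-up-to-a-whole-hwi step are from the text above, while the constants and the function name are purely illustrative, not sniffed from any real port:

```cpp
#include <cassert>

// Illustrative stand-in for the host HWI width.
const unsigned HOST_BITS_PER_WIDE_INT = 64;

// Take the widest pointer the target supports, add 4 bits (3 for bit
// positions plus 1 so unsigned values convert losslessly), and round
// up to a whole number of HWIs.
unsigned
addr_int_precision (unsigned max_pointer_bits)
{
  unsigned bits = max_pointer_bits + 4;
  return ((bits + HOST_BITS_PER_WIDE_INT - 1)
	  / HOST_BITS_PER_WIDE_INT) * HOST_BITS_PER_WIDE_INT;
}
```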


kenny



Re: [RFC] Changes to the wide-int classes

2013-09-04 Thread Kenneth Zadeck

Richi, and the rest of the community,

Richard Sandiford has proposed a set of patches that change the wide-int 
api in a significant way. We think that we really need some input from 
the community as to if this is what we want using C++ in gcc is going to 
look like. There are, as I see it, two issues that are raised by these 
patches:


1) Visibility. In my original wide-int patch, there were things that 
were public and things that were private. In general, the implementation 
details were private, but also the privacy was used to enforce the 
readonlyness of the representation.  A wide int is an object, and that 
object has a value that cannot be changed.  In Richard's restructuring, 
everything is visible in the wi namespace. The privacy of the internals 
can only be enforced by review of the maintainers.


It is possible to restore the privacy of the original patches, but this 
seems to involve using a private class inside the namespace and using 
C++ friends.


Adding visibility control adds extra code:

namespace wi {
  class impl {
  private:
    add_large () { }
    friend add (…);
  };
}

That is roughly one line per client that uses an internal routine (for 
the friend declaration) and five extra lines for the other stuff.
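A compilable version of that sketch, again with illustrative names (a real `add` would operate on wide-int internals rather than plain ints):

```cpp
#include <cassert>

namespace wi
{
  class impl
  {
    // Internal worker routine; private, so ordinary clients cannot
    // reach it.  Only the declared friends can.
    static int add_large (int a, int b) { return a + b; }
    friend int add (int, int);
  };

  // Public entry point: the one friend allowed inside impl.
  inline int add (int a, int b) { return impl::add_large (a, b); }
}
```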

Richard worries that using friends is not within the scope of 
acceptable C++ mechanisms to use within the gcc community.  We are, of 
course, open to other suggestions.


My personal view is that privacy is important and I do not want to let 
that go.  In past comments, Richi asked that we remove all of the public 
method calls that modified the internals of a wide-int.  We did this and 
we generally are happy that we did.  So the question is: is the privacy 
important, and if it is, what is the proper mechanism to implement it?


2) In my original patch, almost all of the operations either used 
operators or method calls. In richard's patch, those method calls have 
all become regular function calls. This is not just a syntactic change. 
The wide-int operations are fairly generic. The operations are not just 
defined on wide-ints but also on tree and rtx constants. In my patch the 
tree and rtx arguments were not converted to wide-ints.  Instead, the 
code looked inside of the representation of the tree and rtl and just 
operated on that. However, in my patch the first parameter had to be a 
wide-int to make the method call. By making everything a regular 
function call, richard's patch avoids having to make the wide-int object 
just to make the method call. This seems like a big win, but it means 
that the wide-int code does not "look like" the double-int code anymore. 
I am in favor of this change, but I (we) fear that there will be 
pushback from those who wanted this to look more oo-like, as double-int 
currently does.
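The practical difference can be sketched as follows; `fake_int_cst` is a stand-in for a tree/rtx constant, and the real code uses a traits layer to decompose operands rather than this single overload:

```cpp
#include <cassert>

// A stand-in for a tree/rtx integer constant.
struct fake_int_cst { long long low; };

namespace wi
{
  // Free function: reads the constant's representation in place.
  // With method syntax, a wide_int object would have to be
  // constructed just to compare a constant against 16; here no
  // temporary is needed.
  inline bool eq_p (const fake_int_cst &a, long long b)
  { return a.low == b; }
}
```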


Comments, suggestions, and questions are appreciated.

thanks,

kenny



Re: [wide-int] Fix mode choice in convert_modes

2013-09-09 Thread Kenneth Zadeck

On 09/09/2013 09:56 AM, Richard Sandiford wrote:

Richard Biener  writes:

On Mon, Sep 9, 2013 at 3:11 PM, Richard Sandiford
 wrote:

Enabling the CONST_INT assert showed that this branch-local code was
passing the wrong mode to std::make_pair.  The mode of X is OLDMODE,
and we're supposed to be converting it to MODE.

In the case where OLDMODE is VOIDmode, I think we have to assume
that every bit of the input is significant.

Tested on x86_64-linux-gnu.  OK to install?

This and the other patches that I just committed to trunk are enough
for a bootstrap and regression test to pass on x86_64-linux-gnu with
the assertion for a canonical CONST_INT enabled.  I think it would be
better to leave the assertion out until after the merge though,
so we can deal with any fallout separately.

If consuming code assumes CONST_INTs are canonical we better leave it
in though.

Yeah, I think the patch Kenny's working on is going to canonise CONST_INTs
via the scratch array, to avoid making any assumptions.  I think we should
keep that for the merge and only swap it for an assert later.

Thanks,
Richard

yes, this is what my patch does now.


patch to make wide-int assume that everything is canonical beyond the precision?

2013-09-09 Thread Kenneth Zadeck
I still have more testing to go on this, but it seems to be ok.  My 
current problem is that the branch seems to have a fair number of 
failures, so I want to get that cleaned up before I do further testing 
and commit this.


I left the old rtl constructors ifdefed out.  We are likely to want them 
back soon, given how well richard's work on cleaning up the rtl seems to 
have gone.


Kenny
Index: gcc/rtl.h
===
--- gcc/rtl.h	(revision 202389)
+++ gcc/rtl.h	(working copy)
@@ -1422,6 +1422,7 @@ wi::int_traits ::get_precisi
   return GET_MODE_PRECISION (x.second);
 }
 
+#if 0
 inline wi::storage_ref
wi::int_traits <rtx_mode_t>::decompose (HOST_WIDE_INT *,
 	unsigned int precision,
@@ -1437,13 +1438,72 @@ wi::int_traits ::decompose (
   return wi::storage_ref (&CONST_WIDE_INT_ELT (x.first, 0),
 			  CONST_WIDE_INT_NUNITS (x.first), precision);
   
+#if TARGET_SUPPORTS_WIDE_INT != 0
 case CONST_DOUBLE:
   return wi::storage_ref (&CONST_DOUBLE_LOW (x.first), 2, precision);
+#endif
   
 default:
   gcc_unreachable ();
 }
 }
+#else
+/* For now, assume that the storage is not canonical, i.e. that there
+   are bits above the precision that are not all zeros or all ones.
+   If this is fixed in rtl, then we will not need the calls to
+   force_to_size.  */
+inline wi::storage_ref
+wi::int_traits <rtx_mode_t>::decompose (HOST_WIDE_INT *scratch,
+	unsigned int precision,
+	const rtx_mode_t &x)
+{
+  int len;
+  int small_prec = precision & (HOST_BITS_PER_WIDE_INT - 1);
+  int blocks_needed = (precision + HOST_BITS_PER_WIDE_INT - 1) / HOST_BITS_PER_WIDE_INT;
+
+  gcc_checking_assert (precision == get_precision (x));
+  switch (GET_CODE (x.first))
+{
+case CONST_INT:
+  len = 1;
+  if (small_prec)
+	scratch[0] = sext_hwi (INTVAL (x.first), precision);
+  else
+	scratch = &INTVAL (x.first);
+  break;
+  
+case CONST_WIDE_INT:
+  len = CONST_WIDE_INT_NUNITS (x.first);
+  if (small_prec && blocks_needed == len - 1)
+	{
+	  int i;
+	  for (i = 0; i < len - 1; i++)
+	scratch[i] = CONST_WIDE_INT_ELT (x.first, i);
+	  scratch[len - 1] = sext_hwi (CONST_WIDE_INT_ELT (x.first, i), small_prec);
+	}
+  else
+	scratch = &CONST_WIDE_INT_ELT (x.first, 0);
+  break;
+  
+#if TARGET_SUPPORTS_WIDE_INT == 0
+case CONST_DOUBLE:
+  len = 2;
+  if (small_prec)
+	{
+	  scratch[0] = CONST_DOUBLE_LOW (x.first);
+	  scratch[1] = sext_hwi (CONST_DOUBLE_HIGH (x.first), small_prec);
+	}
+  else
+	scratch = &CONST_DOUBLE_LOW (x.first);
+  break;
+#endif  
+
+default:
+  gcc_unreachable ();
+}
+  return wi::storage_ref (scratch, len, precision);
+}
+#endif
 
 namespace wi
 {
Index: gcc/wide-int.cc
===
--- gcc/wide-int.cc	(revision 202389)
+++ gcc/wide-int.cc	(working copy)
@@ -48,6 +48,9 @@ static const HOST_WIDE_INT zeros[WIDE_IN
   (PREC ? (((PREC) + HOST_BITS_PER_WIDE_INT - 1) / HOST_BITS_PER_WIDE_INT) : 1)
 #define SIGN_MASK(X) (((HOST_WIDE_INT)X) >> (HOST_BITS_PER_WIDE_INT - 1))
 
+/* Return the value of VAL[I] if I < LEN, otherwise return 0 or -1
+   based on the top existing bit of VAL.  */
+
 static unsigned HOST_WIDE_INT
 safe_uhwi (const HOST_WIDE_INT *val, unsigned int len, unsigned int i)
 {
@@ -304,10 +307,10 @@ wi::force_to_size (HOST_WIDE_INT *val, c
   if (precision > xprecision)
 {
   /* Expanding.  */
-  unsigned int small_xprecision = xprecision % HOST_BITS_PER_WIDE_INT;
-
   if (sgn == UNSIGNED)
 	{
+	  unsigned int small_xprecision = xprecision % HOST_BITS_PER_WIDE_INT;
+
 	  if (small_xprecision && len == BLOCKS_NEEDED (xprecision))
 	val[len - 1] = zext_hwi (val[len - 1], small_xprecision);
 	  else if (val[len - 1] < 0)
@@ -320,11 +323,6 @@ wi::force_to_size (HOST_WIDE_INT *val, c
 		val[len++] = 0;
 	}
 	}
-  /* We have to do this because we cannot guarantee that there is
-	 not trash in the top block of an uncompressed value.  For a
-	 compressed value, all the bits are significant.  */
-  else if (small_xprecision && len == BLOCKS_NEEDED (xprecision))
-	val[len - 1] = sext_hwi (val[len - 1], small_xprecision);
 }
   else if (precision < xprecision)
 /* Contracting.  */
@@ -352,27 +350,18 @@ selt (const HOST_WIDE_INT *a, unsigned i
 	return 0;
 }
 
-  if (small_prec && index == blocks_needed - 1)
-{
-  /* The top block is partially outside of the precision.  */
-  if (sgn == SIGNED)
-	return sext_hwi (a[index], small_prec);
-  else
-	return zext_hwi (a[index], small_prec);
-}
-  return a[index];
+  if (sgn == UNSIGNED && small_prec && index == blocks_needed - 1)
+return zext_hwi (a[index], small_prec);
+  else
+return a[index];
 }
 
-/* Find the hignest bit represented in a wide int.  This will in
+/* Find the highest bit represented in a wide int.  This will in
general have the same value as the sign bit.  */

patch to canonize small wide-ints.

2013-09-17 Thread Kenneth Zadeck

Richi,

This patch canonizes the bits above the precision for wide ints with 
types or modes that are not a perfect multiple of HOST_BITS_PER_WIDE_INT.
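In concrete terms: for a precision that is not a multiple of HOST_BITS_PER_WIDE_INT, the bits of the top block above the precision become a sign extension of the value, so e.g. the 8-bit value 0xff is stored as the HWI -1.  A sketch of the extension helper (mirroring the `sext_hwi` used in the patches, with local copies of the constants so it stands alone):

```cpp
#include <cassert>

typedef long long HOST_WIDE_INT;
const int HOST_BITS_PER_WIDE_INT = 64;

// Sign-extend the low PREC bits of SRC to a full HWI.  This is the
// canonical form for the top block of a small wide-int.
inline HOST_WIDE_INT
sext_hwi (HOST_WIDE_INT src, unsigned int prec)
{
  if (prec == HOST_BITS_PER_WIDE_INT)
    return src;
  int shift = HOST_BITS_PER_WIDE_INT - prec;
  return (src << shift) >> shift;   // arithmetic shift sign-extends
}
```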


I expect that most of the changes in rtl.h will go away.   in 
particular, when we decide that we can depend on richard's patch to 
clean up rtl constants, then the only thing that will be left will be 
the addition of the TARGET_SUPPORTS_WIDE_INT test.


I do believe that there is one more conserved force in the universe than 
physicists generally consider: it is ugliness.  There is a lot of truth 
and beauty in the patch, but in truth there are a lot of places where 
the ugliness is just moved someplace else.


In the pushing-the-ugly-around dept, trees and wide-ints are not 
canonized the same way.  I spent several days going down the road of 
trying to have them be the same, but it got very ugly having 32-bit 
unsigned int csts have the upper 32 bits set.  So wide_int_to_tree and 
the wide-int constructors from tree-cst are now more complex.


I think that I am in favor of this patch, especially in conjunction with 
richard's cleanup, but only mildly.


There is also some cleanup where richard wanted the long lines addressed.

Ok to commit to the wide-int branch?

kenny

Index: gcc/emit-rtl.c
===
--- gcc/emit-rtl.c	(revision 202389)
+++ gcc/emit-rtl.c	(working copy)
@@ -579,8 +579,6 @@ immed_wide_int_const (const wide_int &v,
   if (len < 2 || prec <= HOST_BITS_PER_WIDE_INT)
 return gen_int_mode (v.elt (0), mode);
 
-  wide_int copy = v;
-  wi::clear_undef (copy, SIGNED);
 #if TARGET_SUPPORTS_WIDE_INT
   {
 unsigned int i;
@@ -599,12 +597,12 @@ immed_wide_int_const (const wide_int &v,
 CWI_PUT_NUM_ELEM (value, len);
 
 for (i = 0; i < len; i++)
-  CONST_WIDE_INT_ELT (value, i) = copy.elt (i);
+  CONST_WIDE_INT_ELT (value, i) = v.elt (i);
 
 return lookup_const_wide_int (value);
   }
 #else
-  return immed_double_const (copy.elt (0), copy.elt (1), mode);
+  return immed_double_const (v.elt (0), v.elt (1), mode);
 #endif
 }
 
Index: gcc/lto-streamer-in.c
===
--- gcc/lto-streamer-in.c	(revision 202389)
+++ gcc/lto-streamer-in.c	(working copy)
@@ -1273,7 +1273,7 @@ lto_input_tree_1 (struct lto_input_block
   for (i = 0; i < len; i++)
 	a[i] = streamer_read_hwi (ib);
   result = wide_int_to_tree (type, wide_int::from_array
- (a, len, TYPE_PRECISION (type), false));
+ (a, len, TYPE_PRECISION (type)));
   streamer_tree_cache_append (data_in->reader_cache, result, hash);
 }
   else if (tag == LTO_tree_scc)
Index: gcc/real.c
===
--- gcc/real.c	(revision 202389)
+++ gcc/real.c	(working copy)
@@ -2248,7 +2248,6 @@ real_from_integer (REAL_VALUE_TYPE *r, e
   /* Clear out top bits so elt will work with precisions that aren't
 	 a multiple of HOST_BITS_PER_WIDE_INT.  */
   val = wide_int::from (val, len, sgn);
-  wi::clear_undef (val, sgn);
   len = len / HOST_BITS_PER_WIDE_INT;
 
   SET_REAL_EXP (r, len * HOST_BITS_PER_WIDE_INT + e);
Index: gcc/rtl.h
===
--- gcc/rtl.h	(revision 202389)
+++ gcc/rtl.h	(working copy)
@@ -1422,6 +1422,7 @@ wi::int_traits ::get_precisi
   return GET_MODE_PRECISION (x.second);
 }
 
+#if 0
 inline wi::storage_ref
wi::int_traits <rtx_mode_t>::decompose (HOST_WIDE_INT *,
 	unsigned int precision,
@@ -1437,13 +1438,57 @@ wi::int_traits ::decompose (
   return wi::storage_ref (&CONST_WIDE_INT_ELT (x.first, 0),
 			  CONST_WIDE_INT_NUNITS (x.first), precision);
   
+#if TARGET_SUPPORTS_WIDE_INT != 0
 case CONST_DOUBLE:
   return wi::storage_ref (&CONST_DOUBLE_LOW (x.first), 2, precision);
+#endif
   
 default:
   gcc_unreachable ();
 }
 }
+#else
+/* For now, assume that the storage is not canonical, i.e. that there
+   are bits above the precision that are not all zeros or all ones.
+   If this is fixed in rtl, then we will not need the calls to
+   force_to_size.  */
+inline wi::storage_ref
+wi::int_traits <rtx_mode_t>::decompose (HOST_WIDE_INT *scratch,
+	unsigned int precision,
+	const rtx_mode_t &x)
+{
+  int len;
+  int small_prec = precision & (HOST_BITS_PER_WIDE_INT - 1);
+
+  gcc_checking_assert (precision == get_precision (x));
+  switch (GET_CODE (x.first))
+{
+case CONST_INT:
+  len = 1;
+  if (small_prec)
+	scratch[0] = sext_hwi (INTVAL (x.first), precision);
+  else
+	scratch = &INTVAL (x.first);
+  break;
+  
+case CONST_WIDE_INT:
+  len = CONST_WIDE_INT_NUNITS (x.first);
+  scratch = &CONST_WIDE_INT_ELT (x.first, 0);
+  break;
+  
+#if TARGET_SUPPORTS_WIDE_INT == 0
+case CONST_DOUBLE:
+  len = 2;
+  scratch = &CONST_DOUBLE_LOW (x.first);
+  break;
+#endif  
+
+default:
+  gcc_unreachable ()

Re: [wide-int] Fix mode choice in convert_modes

2013-09-18 Thread Kenneth Zadeck

this is fine with me.

kenny

On 09/18/2013 03:27 AM, Richard Sandiford wrote:

Ping

Richard Sandiford  writes:

Enabling the CONST_INT assert showed that this branch-local code was
passing the wrong mode to std::make_pair.  The mode of X is OLDMODE,
and we're supposed to be converting it to MODE.

In the case where OLDMODE is VOIDmode, I think we have to assume
that every bit of the input is significant.

Tested on x86_64-linux-gnu.  OK to install?

This and the other patches that I just committed to trunk are enough
for a bootstrap and regression test to pass on x86_64-linux-gnu with
the assertion for a canonical CONST_INT enabled.  I think it would be
better to leave the assertion out until after the merge though,
so we can deal with any fallout separately.

Thanks,
Richard


Index: gcc/expr.c
===
--- gcc/expr.c  2013-09-09 11:32:33.734617409 +0100
+++ gcc/expr.c  2013-09-09 11:45:27.381001353 +0100
@@ -714,13 +714,14 @@ convert_modes (enum machine_mode mode, e
&& GET_MODE_CLASS (mode) == MODE_INT
&& (oldmode == VOIDmode || GET_MODE_CLASS (oldmode) == MODE_INT))
  {
-  wide_int w = std::make_pair (x, mode);
/* If the caller did not tell us the old mode, then there is
-not much to do with respect to canonization.  */
-  if (oldmode != VOIDmode
- && GET_MODE_PRECISION (mode) > GET_MODE_PRECISION (oldmode))
-   w = wi::ext (w, GET_MODE_PRECISION (oldmode),
-unsignedp ? UNSIGNED : SIGNED);
+not much to do with respect to canonization.  We have to assume
+that all the bits are significant.  */
+  if (oldmode == VOIDmode)
+   oldmode = MAX_MODE_INT;
+  wide_int w = wide_int::from (std::make_pair (x, oldmode),
+  GET_MODE_PRECISION (mode),
+  unsignedp ? UNSIGNED : SIGNED);
return immed_wide_int_const (w, mode);
  }
  




Re: [wide-int] Fix LTO regression that I'd introduced

2013-09-19 Thread Kenneth Zadeck

this looks fine to me.
On 09/19/2013 02:56 PM, Richard Sandiford wrote:

It turns out that gcc20's version of binutils is too old for the LTO plugin,
so the tests I'd been running hadn't exercised it.  This patch fixes a
regression that Kenny pointed out.

The problem was that build_int_cst and build_int_cst_type were using
the signedness of the type to decide how the HWI should be extended,
whereas they're supposed to use sign extension regardless.

Tested on x86_64-linux-gnu, this time with trunk binutils.  OK for wide-int?

Thanks,
Richard


gcc/
* tree.h (wi::hwi): Delete.
* tree.c (build_int_cst, build_int_cst_type): Use wi::shwi.
(build_int_cstu): Use wi::uhwi.

Index: gcc/tree.h
===
--- gcc/tree.h  (revision 202746)
+++ gcc/tree.h  (working copy)
@@ -5206,8 +5206,6 @@
  
  namespace wi

  {
-  hwi_with_prec hwi (HOST_WIDE_INT, const_tree);
-
template 
bool fits_to_tree_p (const T &x, const_tree);
  
@@ -5216,12 +5214,6 @@

wide_int from_mpz (const_tree, mpz_t, bool);
  }
  
-inline wi::hwi_with_prec

-wi::hwi (HOST_WIDE_INT val, const_tree type)
-{
-  return hwi_with_prec (val, TYPE_PRECISION (type), TYPE_SIGN (type));
-}
-
  template 
  bool
  wi::fits_to_tree_p (const T &x, const_tree type)
Index: gcc/tree.c
===
--- gcc/tree.c  (revision 202746)
+++ gcc/tree.c  (working copy)
@@ -1056,13 +1056,13 @@
if (!type)
  type = integer_type_node;
  
-  return wide_int_to_tree (type, wi::hwi (low, type));

+  return wide_int_to_tree (type, wi::shwi (low, TYPE_PRECISION (type)));
  }
  
  tree

  build_int_cstu (tree type, unsigned HOST_WIDE_INT cst)
  {
-  return wide_int_to_tree (type, wi::hwi (cst, type));
+  return wide_int_to_tree (type, wi::uhwi (cst, TYPE_PRECISION (type)));
  }
  
  /* Create an INT_CST node with a LOW value sign extended to TYPE.  */

@@ -1071,7 +1071,7 @@
  build_int_cst_type (tree type, HOST_WIDE_INT low)
  {
gcc_assert (type);
-  return wide_int_to_tree (type, wi::hwi (low, type));
+  return wide_int_to_tree (type, wi::shwi (low, TYPE_PRECISION (type)));
  }
  
  /* Constructs tree in type TYPE from with value given by CST.  Signedness




wide-int: fix merge.

2013-09-23 Thread Kenneth Zadeck
Mike started a merge from trunk to the wide-int branch because the 
previous merge had happened when bootstrap was broken for ppc.


This patch fixes mike's partial merge and has been bootstrapped and 
tested on ppc.  I will pull the patch down and test it on x86-64 tonight.


kenny
Index: gcc/postreload.c
===
--- gcc/postreload.c	(revision 202811)
+++ gcc/postreload.c	(working copy)
@@ -305,7 +305,7 @@
 		case ZERO_EXTEND:
 		  result = wide_int (std::make_pair (this_rtx, GET_MODE (src)));
 		  if (GET_MODE_PRECISION (GET_MODE (src)) > GET_MODE_PRECISION (word_mode))
-		result = result.zext (GET_MODE_PRECISION (word_mode));
+		result = wi::zext (result, GET_MODE_PRECISION (word_mode));
 		  break;
 		case SIGN_EXTEND:
 		  result = wide_int (std::make_pair (this_rtx, GET_MODE (src)));
Index: gcc/fold-const.c
===
--- gcc/fold-const.c	(revision 202811)
+++ gcc/fold-const.c	(working copy)
@@ -9897,17 +9897,15 @@
 
 /*  Mask out the tz least significant bits of X of type TYPE where
 tz is the number of trailing zeroes in Y.  */
-static double_int
-mask_with_tz (tree type, double_int x, double_int y)
+static wide_int
+mask_with_tz (tree type, wide_int x, wide_int y)
 {
-  int tz = y.trailing_zeros ();
-
+  int tz = wi::ctz (y);
   if (tz > 0)
 {
-  double_int mask;
+  wide_int mask;
 
-  mask = ~double_int::mask (tz);
-  mask = mask.ext (TYPE_PRECISION (type), TYPE_UNSIGNED (type));
+  mask = wi::mask (tz, true, TYPE_PRECISION (type));
   return mask & x;
 }
   return x;
@@ -11276,7 +11274,7 @@
 	== INTEGER_CST)
 	  {
 	tree t = TREE_OPERAND (TREE_OPERAND (arg0, 0), 1);
-	double_int masked = mask_with_tz (type, c3, tree_to_double_int (t));
+	wide_int masked = mask_with_tz (type, c3, t);
 
 	try_simplify = (masked != c1);
 	  }
@@ -11670,32 +11668,14 @@
 	  && TREE_CODE (arg0) == MULT_EXPR
 	  && TREE_CODE (TREE_OPERAND (arg0, 1)) == INTEGER_CST)
 	{
-<<< .working
-	  int arg1tz = wi::ctz (TREE_OPERAND (arg0, 1));
-	  if (arg1tz > 0)
-	{
-	  wide_int arg1mask, masked;
-	  arg1mask = wi::mask (arg1tz, true, TYPE_PRECISION (type));
-	  masked = arg1mask & arg1;
-	  if (masked == 0)
-		return omit_two_operands_loc (loc, type, build_zero_cst (type),
-	  arg0, arg1);
-	  else if (masked != arg1)
-		return fold_build2_loc (loc, code, type, op0,
-	wide_int_to_tree (type, masked));
-	}
-===
-	  double_int masked
-	= mask_with_tz (type, tree_to_double_int (arg1),
-	tree_to_double_int (TREE_OPERAND (arg0, 1)));
+	  wide_int masked = mask_with_tz (type, arg1, TREE_OPERAND (arg0, 1));
 
-	  if (masked.is_zero ())
+	  if (masked == 0)
 	return omit_two_operands_loc (loc, type, build_zero_cst (type),
 	  arg0, arg1);
-	  else if (masked != tree_to_double_int (arg1))
+	  else if (masked != arg1)
 	return fold_build2_loc (loc, code, type, op0,
-	double_int_to_tree (type, masked));
->>> .merge-right.r202797
+	wide_int_to_tree (type, masked));
 	}
 
   /* For constants M and N, if M == (1LL << cst) - 1 && (N & M) == M,
Index: gcc/gimple-ssa-strength-reduction.c
===
--- gcc/gimple-ssa-strength-reduction.c	(revision 202811)
+++ gcc/gimple-ssa-strength-reduction.c	(working copy)
@@ -763,7 +763,7 @@
int (i * S).
Otherwise, just return double int zero.  */
 
-static double_int
+static max_wide_int
 backtrace_base_for_ref (tree *pbase)
 {
   tree base_in = *pbase;
@@ -771,19 +771,19 @@
 
   STRIP_NOPS (base_in);
   if (TREE_CODE (base_in) != SSA_NAME)
-return tree_to_double_int (integer_zero_node);
+return 0;
 
   base_cand = base_cand_from_table (base_in);
 
   while (base_cand && base_cand->kind != CAND_PHI)
 {
   if (base_cand->kind == CAND_ADD
-	  && base_cand->index.is_one ()
+	  && base_cand->index == 1
 	  && TREE_CODE (base_cand->stride) == INTEGER_CST)
 	{
 	  /* X = B + (1 * S), S is integer constant.  */
 	  *pbase = base_cand->base_expr;
-	  return tree_to_double_int (base_cand->stride);
+	  return base_cand->stride;
 	}
   else if (base_cand->kind == CAND_ADD
 	   && TREE_CODE (base_cand->stride) == INTEGER_CST
@@ -800,7 +800,7 @@
 	base_cand = NULL;
 }
 
-  return tree_to_double_int (integer_zero_node);
+  return 0;
 }
 
 /* Look for the following pattern:
Index: gcc/genpreds.c
===
--- gcc/genpreds.c	(revision 202811)
+++ gcc/genpreds.c	(working copy)
@@ -809,11 +809,7 @@
   if (is_const_int || is_const_dbl)
 {
   enum rtx_code appropriate_code
-#if TARGET_SUPPORTS_WIDE_INT
-	= is_const_int ? CONST_INT : CONST_WIDE_INT;
-#else
 	= is_const_int ? CONST_INT : CONST_DOUBLE;
-#endif

Re: patch to canonize small wide-ints.

2013-09-24 Thread Kenneth Zadeck

On 09/24/2013 09:39 AM, Richard Biener wrote:

On Tue, 17 Sep 2013, Kenneth Zadeck wrote:


Richi,

This patch canonizes the bits above the precision for wide ints with types or
modes that are not a perfect multiple of HOST_BITS_PER_WIDE_INT.

I expect that most of the changes in rtl.h will go away.   in particular, when
we decide that we can depend on richard's patch to clean up rtl constants,
then the only thing that will be left will be the addition of the
TARGET_SUPPORTS_WIDE_INT test.

I do believe that there is one more conserved force in the universe than what
physicists generally consider: it is ugliness.  There is a lot of truth and
beauty in the patch but in truth there are a lot of places where the ugliness
is just moved someplace else.

in the pushing-the-ugly-around dept, trees and wide-ints are not canonized the
same way.  I spent several days going down the road where I tried to have
them be the same, but it got very ugly having 32-bit unsigned int csts have
the upper 32 bits set.   So wide_int_to_tree and the wide-int constructors
from tree-cst are now more complex.

i think that i am in favor of this patch, especially in conjunction with
richard's cleanup, but only mildly.

There is also some cleanup where richard wanted the long lines addressed.

Ok to commit to the wide-int branch?

Looks good to me.

I'll be doing a separate review of the to/from tree parts when I
find time to do that, but that's unrelated to this patch.

Thanks,
Richard.
as i said on irc, after i check in a cleaned up version of this (for 
ppc) i will change the rep for unsigned tree-csts so that they have an 
extra block of zeros if their top bit is set.   this will clean that up 
somewhat.


thanks

kenny


Re: patch to canonize small wide-ints.

2013-09-24 Thread Kenneth Zadeck

On 09/24/2013 09:51 AM, Richard Biener wrote:

On Tue, 24 Sep 2013, Kenneth Zadeck wrote:


On 09/24/2013 09:39 AM, Richard Biener wrote:

On Tue, 17 Sep 2013, Kenneth Zadeck wrote:


Richi,

This patch canonizes the bits above the precision for wide ints with types
or
modes that are not a perfect multiple of HOST_BITS_PER_WIDE_INT.

I expect that most of the changes in rtl.h will go away.   in particular,
when
we decide that we can depend on richard's patch to clean up rtl constants,
then the only thing that will be left will be the addition of the
TARGET_SUPPORTS_WIDE_INT test.

I do believe that there is one more conserved force in the universe than
what physicists generally consider: it is ugliness.  There is a lot of
truth and beauty in the patch but in truth there are a lot of places
where the ugliness is just moved someplace else.

in the pushing-the-ugly-around dept, trees and wide-ints are not
canonized the same way.  I spent several days going down the road where
I tried to have them be the same, but it got very ugly having 32-bit
unsigned int csts have the upper 32 bits set.  So wide_int_to_tree and
the wide-int constructors from tree-cst are now more complex.

i think that i am in favor of this patch, especially in conjunction with
richard's cleanup, but only mildly.

There is also some cleanup where richard wanted the long lines addressed.

Ok to commit to the wide-int branch?

Looks good to me.

I'll be doing a separate review of the to/from tree parts when I
find time to do that, but that's unrelated to this patch.

Thanks,
Richard.

as i said on irc, after i check in a cleaned up version of this (for ppc) i
will change the rep for unsigned tree-csts so that they have an extra block of
zeros if their top bit is set.   this will clean that up somewhat.

Yes, this will give wide-ints a sign (as I originally proposed).  For
RTL constants we're going to "lie" and make all constants negative
that have their MSB set.

Richard.
it only sort of does. anyway it seems to be mostly consistent. As i 
said on irc, i am pleasantly surprised at the lack of damage to the 
ports by assuming that the values are always canonized.   Two large 
public ports and my private port required no changes.


wide-int: Re: patch to canonize small wide-ints.

2013-09-24 Thread Kenneth Zadeck

I just committed the patch to do this as revision 202871.

there are two changes from the earlier patch:

1) the addition of frag in loop-doloop.c.   This fixes an rtl 
canonization bug, but it is unique to the branch.   it did not cause a 
problem on x86 but did on ppc.
2) the code in rtl.h for converting rtl to wide-ints is now much 
simpler.   I now rely on richard's patch, which has been merged into 
this branch to canonize the rtl constants.   I have an assert to make 
sure that canonization is never violated.


I did not change the tree-cst constructor as richard and i had talked 
about on irc.  My next patch will be to change the rep of tree-cst so 
that unsigned types that have precisions of some multiple of HBPWI and 
have their top bit set will just carry an extra block of zeros.   That 
will force a rewrite of this constructor.


Kenny

On 09/24/2013 12:06 PM, Richard Sandiford wrote:

Richard Biener  writes:

On Tue, 24 Sep 2013, Kenneth Zadeck wrote:

On 09/24/2013 09:39 AM, Richard Biener wrote:

On Tue, 17 Sep 2013, Kenneth Zadeck wrote:


Richi,

This patch canonizes the bits above the precision for wide ints with types
or
modes that are not a perfect multiple of HOST_BITS_PER_WIDE_INT.

I expect that most of the changes in rtl.h will go away.   in particular,
when
we decide that we can depend on richard's patch to clean up rtl constants,
then the only thing that will be left will be the addition of the
TARGET_SUPPORTS_WIDE_INT test.

I do believe that there is one more conserved force in the universe than
what physicists generally consider: it is ugliness.  There is a lot of
truth and beauty in the patch but in truth there are a lot of places
where the ugliness is just moved someplace else.

in the pushing-the-ugly-around dept, trees and wide-ints are not
canonized the same way.  I spent several days going down the road where
I tried to have them be the same, but it got very ugly having 32-bit
unsigned int csts have the upper 32 bits set.  So wide_int_to_tree and
the wide-int constructors from tree-cst are now more complex.

i think that i am in favor of this patch, especially in conjunction with
richard's cleanup, but only mildly.

There is also some cleanup where richard wanted the long lines addressed.

Ok to commit to the wide-int branch?

Looks good to me.

I'll be doing a separate review of the to/from tree parts when I
find time to do that, but that's unrelated to this patch.

Thanks,
Richard.

as i said on irc, after i check in a cleaned up version of this (for ppc) i
will change the rep for unsigned tree-csts so that they have an extra block of
zeros if their top bit is set.   this will clean that up somewhat.

Yes, this will give wide-ints a sign (as I originally proposed).

No, nothing changes on that front.  An N-bit wide_int has no sign.
It looks the same whether it came from a signed tree constant,
an unsigned tree constant, or an rtx (which also has no sign).

All the posted patch does is change the internal representation of
excess bits, which are operationally don't-cares.  The semantics are
just the same as before.  And the point of the follow-up change that
Kenny mentioned is to avoid a copy when treating an N-bit INTEGER_CST
as a wider-than-N wide_int in cases where we're extending according to
TYPE_SIGN.  I.e. it's purely an optimisation.  The resulting wide_int
storage is exactly the same as before, we just do less work to get it.

Thanks,
Richard


Index: gcc/tree.c
===
--- gcc/tree.c	(revision 202811)
+++ gcc/tree.c	(working copy)
@@ -1112,7 +1112,6 @@ force_fit_type (tree type, const wide_in
 	  || (overflowable > 0 && sign == SIGNED))
 	{
 	  wide_int tmp = wide_int::from (cst, TYPE_PRECISION (type), sign);
-	  wi::clear_undef (tmp, sign);
 	  int l = tmp.get_len ();
 	  tree t = make_int_cst (l);
 	  if (l > 1)
@@ -1205,11 +1204,11 @@ wide_int_to_tree (tree type, const wide_
 }
 
   wide_int cst = wide_int::from (pcst, prec, sgn);
-  /* The following call makes sure that all tree-cst's are canonical.
- i.e. it really does sign or zero extend the top block of the
- value if the precision of the type is not an even multiple of the
- size of an HWI.  */
-  wi::clear_undef (cst, sgn);
+  int len = int (cst.get_len ());
+  int small_prec = prec & (HOST_BITS_PER_WIDE_INT - 1);
+  bool recanonize = sgn == UNSIGNED
+&& (prec + HOST_BITS_PER_WIDE_INT - 1) / HOST_BITS_PER_WIDE_INT == len
+&& small_prec;
 
   switch (TREE_CODE (type))
 {
@@ -1291,18 +1290,31 @@ wide_int_to_tree (tree type, const wide_
   t = TREE_VEC_ELT (TYPE_CACHED_VALUES (type), ix);
   if (t)
 	{
-	  /* Make sure no one is clobbering the shared constant.  */
+	  /* Make sure no one is clobbering the shared constant.  We
+	 must be careful here because tree-csts and wide-ints are
+	 not canonicali

Re: patch to fix

2012-10-03 Thread Kenneth Zadeck
i have already converted the vrp code, so i have some guess at what you 
are talking about.  (of course correct me if i am wrong).


the code that computes the range when two variables are multiplied 
together needs to do a multiplication that produces a result that is 
twice as wide as the inputs.


my library is able to do that with one catch (and this is a big catch): 
the target has to have an integer mode that is twice as big as the mode 
of the operands. The issue is that wide-ints actually carry around the 
mode of the value in order to get the bitsize and precision of the 
operands (it does not have the type, because this code has to work on 
both the rtl and tree levels and i generally do not want the signedness anyway).


my current code in vrp checks to see if such a mode exists and if it 
does, it produces the product.   if the mode does not exist, it returns 
bottom.   What this means is that for most (many or some) targets that 
have a TImode, the largest thing that particular vrp discovers ranges for 
is a DImode value.   We could get around this by defining the next 
larger mode than what the target really needs but i wonder how much 
mileage you are going to get out of that with really large numbers.


Of course you could have something else in mind.

kenny

On 10/03/2012 04:47 PM, Marc Glisse wrote:

On Wed, 3 Oct 2012, Kenneth Zadeck wrote:


The patch defines a new datatype, a 'wide_int' (defined in
wide-int.[ch], and this datatype will be used to perform all of the
integer constant math in the compiler.  Externally, wide-int is very
similar to double-int except that it does not have the limitation that
math must be done on exactly two HOST_WIDE_INTs.

Internally, a wide_int is a structure that contains a fixed sized
array of HOST_WIDE_INTs, a length field and a mode.  The size of the
array is determined at generation time by dividing the number of bits
of the largest integer supported on the target by the number of bits
in a HOST_WIDE_INT of the host.  Thus, with this format, any length of
integer can be supported on any host.


Hello,

did you consider making the size of wide_int a template parameter, now 
that we are using C++? All with a convenient typedef or macro so it 
doesn't show. I am asking because in vrp I do some arithmetic that 
requires 2*N+1 bits where N is the size of double_int.






Re: patch to fix constant math

2012-10-04 Thread Kenneth Zadeck

Let me talk about the mode here first.

What this interface/patch provides is a facility where the constant math 
that is done in optimizations is done exactly the way that it would be 
done on the target machine.   What we have now is a compiler that only 
does this if it is convenient to do on the host.   I admit that i care 
about this more than others right now, but if intel adds a couple of 
more instructions to their vector units, other people will start to 
really care about this issue.   If you take an OImode value with the 
current compiler and left shift it by 250 the middle end will say that 
the result is 0.   This is just wrong!!!


What this means is that the bitsize and precision of the operations need 
to be carried along when doing math.  when wide-int checks for overflow 
on the multiply or add, it is not checking whether the value overflowed 
in two HWIs, it is checking whether the add overflowed in the mode of the 
types that are represented on the target.   When we do a shift, we are not 
doing a shift within two HWIs, we are truncating the shift value (if 
this is appropriate) according to the bitsize and shifting according to 
the precision.


I think that an argument could be made that storing the mode should be 
changed to an explicit precision and bitsize.  (A possible other option 
would be to store a tree type, but this would make the usage at the rtl 
level very cumbersome since types are rare.) Aside from the work, you 
would not get much push back.


But the signedness is a different argument.   At the rtl level, the signedness 
is a matter of context.   (you could argue that this is a mistake and i 
would agree, but that is an even bigger change.)   But more to the 
point, at the tree level, there are a surprising number of places where 
the operation desired does not follow the sign of the types that were 
used to construct the constants.   Furthermore, not carrying the sign is 
more consistent with the double int code, which as you point out carries 
nothing.


As for splitting the patch out into smaller pieces, i am all for it.   
I have done this twice already and i could get the const_scalar_int_p 
patch out quickly.But you do not get too far along that before you 
are still left with a big patch.   I could split out wide-int.* and just 
commit those files with no clients as a first step.   My guess is that 
Richard Sandiford would appreciate that because while he has carefully 
checked the rtl stuff, i think that the code inside wide-int is not in 
his comfort zone of things he would approve.


As for your btw - i noticed this last night.   it is an artifact of the 
way i produced the patch and "responsible people have been sacked".   
However, it shows that you read the patch carefully, and i really 
appreciate that.   i owe you a beer (not that you need another at this 
time of year).


Kenny



On 10/04/2012 08:48 AM, Richard Guenther wrote:

On Wed, Oct 3, 2012 at 7:15 PM, Kenneth Zadeck  wrote:

The enclosed patch is the third of at least four patches that fix the
problems associated with supporting integers on the target that are
wider than two HOST_WIDE_INTs.

While GCC claims to support OI mode, and we have two public ports that
make minor use of this mode, in practice, compilation that uses OImode
commonly gives the wrong result or ices.  We have a private port
of GCC for an architecture that is further down the road to needing
comprehensive OImode and we have discovered that this is unusable. We
have decided to fix it in a general way that so that it is most
beneficial to the GCC community.  It is our belief that we are just a
little ahead of the X86 and the NEON and these patches will shortly be
essential.

The first two of these patches were primarily lexicographical and have
already been committed.  They transformed the uses of CONST_DOUBLE
so that it is easy to tell what the intended usage is.

The underlying structures in the next two patches are very general:
once they are added to the compiler, the compiler will be able to
support targets with any size of integer from hosts of any size
integer.

The patch enclosed deals with the portable RTL parts of the compiler.
The next patch, which is currently under construction deals with the
tree level.  However, this patch can be put on the trunk as is, and it
will alleviate many, but not all, of the current limitations in the rtl
parts of the compiler.

Some of the patch is conditional, depending on a port defining the
symbol 'TARGET_SUPPORTS_WIDE_INT' to be non zero.  Defining this
symbol to be non zero is declaring that the port has been converted to
use the new form of integer constants.  However, the patch is
completely backwards compatible to allow ports that do not need this
immediately to convert at their leisure.  The conversion process is
not difficult, but it does require some knowledge of the port, so we
are not volunteering to do this for all ports.

OVERVIEW OF THE PATCH:

The patch de

Re: patch to fix

2012-10-04 Thread Kenneth Zadeck

On 10/04/2012 09:17 AM, Marc Glisse wrote:

On Wed, 3 Oct 2012, Mike Stump wrote:

On Oct 3, 2012, at 1:47 PM, Marc Glisse  wrote:
did you consider making the size of wide_int a template parameter, 
now that we are using C++? All with a convenient typedef or macro so 
it doesn't show. I am asking because in vrp I do some arithmetic 
that requires 2*N+1 bits where N is the size of double_int.


No, not really.  I'd maybe answer it this way, we put in a type 
(singular) to support all integral constants in all languages on a 
port. Since we only needed 1, there was little need to templatize 
it.  By supporting all integral constants in all languages, there is 
little need for more.  If Ada say, wanted a 2048 bit integer, then, 
we just have it drop off the size it wants someplace and we would mix 
that in on a MAX(….) line, net result, the type we use would then 
directly support the needs of Ada.  If vpr wanted 2x of all existing 
modes, we could simply change the MAX equation and essentially double 
it; if people need that.  This comes at a cost, as the intermediate 
wide values are fixed size allocated (not variable); so these all 
would be larger.


And this cost could be eliminated by having a template wide_int_ so 
only the places that need it actually use the extra size ;-)


The space is not really an issue in most places since wide-ints tend to 
be short lived.  i guess vrp is slightly different because it creates a 
lot at once.  but then they go away.


However, the real question is what are you going to instantiate the 
template on?  What we do is look at the target and determine the 
largest type that the target supports and build a wide int type that 
supports that.how are you going to do better?   are you going to 
instantiate one for every type you see?   are these going to be static 
or dynamic?   The last line this email seems to imply that you were 
planning to "know" that __int128 was the largest integer that any target 
or front end could support.


and then what do you do for the parts of the compiler that have 
operations that take things of two different types, like shift?  The 
shift amount can be, and many times is, of a shorter type than what is 
being shifted.  Would these different length integers be represented with 
different instances from the same template?   I am not a c++ programmer 
and so all of this is a little new to me, but given a perspective of the 
rest of the compiler, this does not seem like the right way to go.





On Wed, 3 Oct 2012, Kenneth Zadeck wrote:

i have already converted the vrp code, so i have some guess at what 
you are talking about.  (of course correct me if i am wrong).


the code that computes the range when two variables are multiplied 
together needs to do a multiplication that produces a result that is 
twice as wide as the inputs.


Yes, exactly.

my library is able to do that with one catch (and this is a big 
catch): the target has to have an integer mode that is twice as big 
as the mode of the operands. The issue is that wide-ints actually 
carry around the mode of the value in order to get the bitsize and 
precision of the operands (it does not have the type, because this 
code has to work on both the rtl and tree levels and i generally do 
not want the signedness anyway).


my current code in vrp checks to see if such a mode exists and if it 
does, it produces the product.   if the mode does not exist, it 
returns bottom.   What this means is that for most (many or some) 
targets that have a TImode, the largest thing that particular vrp 
discovers ranges for is a DImode value.   We could get around this by 
defining the next larger mode than what the target really needs but i 
wonder how much mileage you are going to get out of that with really 
large numbers.


This will be for discussion when you submit that next patch, but 
currently VRP handles integers the same size as double_int. In 
particular, it handles __int128. I would be unhappy if introducing a 
larger bigint type in gcc made us regress there.


You are only happy now because you do not really understand the world 
around you.  This is not what your code does.   What your code does is 
that if the host is a 64 bit host you can handle __int128 and if your 
host is a 32 bit host you can handle an __int64.  If you are building a 
cross compiler from a 32 bit host to a 64 bit target, your pass is 
either going to get the wrong answer, give up, or ice.   There are 
currently parts of gcc that do each of these three "solutions" and my 
patch gets rid of these because it does the math as the target does the 
math, no matter what the target is.


The goal of my patch is to make gcc produce the same correct results no 
matter what types the target or host support.  The last thing we need 
is to have some optimization "knowing" what the limits of either of 
these are and hard-coding that in a set of templates that have been 
statically instantiated.





Re: patch to fix

2012-10-04 Thread Kenneth Zadeck
Actually richi, this code is "correct" for some broken definition of 
correct.


If all that is done is to convert the rtl parts of the compiler, then 
this code is the best you can do (of course an assertion that the length 
is not greater than 2 would be a useful addition).


The code in the follow-on patch converts the insides of a 
tree cst to look like a const wide int, i.e. an array of HWIs.   When 
that happens, this code looks completely different. But if you only 
convert the rtl level, at some point there is going to be an impedance 
mismatch and it is buried here.


I will point out that this is the fall out of trying to split things 
into a bunch of smaller patches that could in theory go in separately.


kenny






+/* Constructs tree in type TYPE from with value given by CST.  Signedness
+   of CST is assumed to be the same as the signedness of TYPE.  */
+
+tree
+wide_int_to_tree (tree type, const wide_int &cst)
+{
+  wide_int v;
+  if (TYPE_UNSIGNED (type))
+v = cst.zext (TYPE_PRECISION (type));
+  else
+v = cst.sext (TYPE_PRECISION (type));
+
+  return build_int_cst_wide (type, v.elt (0), v.elt (1));
+}

is surely broken.  A wide-int does not fit a double-int.  How are you
going to "fix" this?

Thanks,
Richard.


kenny




Re: patch to fix constant math

2012-10-04 Thread Kenneth Zadeck

On 10/04/2012 12:58 PM, Richard Guenther wrote:

On Thu, Oct 4, 2012 at 3:55 PM, Kenneth Zadeck  wrote:

Let me talk about the mode here first.

What this interface/patch provides is a facility where the constant math
that is done in optimizations is done exactly the way that it would be done
on the target machine.   What we have now is a compiler that only does this
if it is convenient to do on the host.   I admit that i care about this more
than others right now, but if intel adds a couple of more instructions to
their vector units, other people will start to really care about this issue.
If you take an OImode value with the current compiler and left shift it by
250 the middle end will say that the result is 0.   This is just wrong!!!

What this means is that the bitsize and precision of the operations need to
be carried along when doing math.  when wide-int checks for overflow on the
multiply or add, it is not checking whether the value overflowed in two HWIs,
it is checking whether the add overflowed in the mode of the types that are
represented on the target.   When we do a shift, we are not doing a shift
within two HWIs, we are truncating the shift value (if this is appropriate)
according to the bitsize and shifting according to the precision.

I think that an argument could be made that storing the mode should be
changed to an explicit precision and bitsize.  (A possible other option
would be to store a tree type, but this would make the usage at the rtl
level very cumbersome since types are rare.) Aside from the work, you would
not get much push back.

But the signedness is a different argument.   At the rtl level, the signedness is
a matter of context.   (you could argue that this is a mistake and i would
agree, but that is an even bigger change.)   But more to the point, at the
tree level, there are a surprising number of places where the operation
desired does not follow the sign of the types that were used to construct
the constants.   Furthermore, not carrying the sign is more consistent with
the double int code, which as you point out carries nothing.

Well, on RTL the signedness is on the operation (you have sdiv and udiv, etc.).
yes, there is a complete enough set of operations that allow you to 
specify the signedness where this matters.



double-int tries to present a sign-less twos-complement entity of size
2 * HOST_BITS_PER_WIDE_INT.  I think that is sensible and for
obvious reasons should not change.  Both tree and RTL rely on this.
What we do not want is that up to TImode you get an internal representation
done one way (twos-complement) and on OImode and larger you
suddenly get subtly different behavior.  That's a recipe for disaster.


This is the main difference between double-int and wide-int.  Wide int 
does the math the way the machine does it or the way the front end would 
expect it to be done.There is nothing about the host that is visible 
in the interfaces.


I reiterate, our world is already bigger than 128 bits and the intel 
world is likely to be soon.   Double int is stuck in a 64/128 bit world. 
these patches, which i admit are huge, are a way out of that box.




I'd like to clean up the interface to double-int some more (now with the
nice C++ stuff we have).  double-int should be pure twos-complement,
there should be no operations on double-ints that behave differently
when done signed or unsigned, instead we have signed and unsigned
versions of the operations (similar to how signedness is handled on
the RTL level).  With some trivial C++ fu you could have a
double_sint and double_uint type that would get rid of the bool
sign params we have to some functions (and then you could
write double_sint >> n using operator notation).


The problem is that size does matter.  wide int is effectively 
infinite precision twos complement.  In practice, we can get by by 
just looking at the bitsize and precision of the types/modes involved 
and this makes the implementation faster than true infinite precision.


I went down the road trying to fix all of the places where the compiler 
either iced or got the wrong answer.   I showed this to Sandiford and he 
talked me out of it.  He was right, it was a rat hole.  It could have 
been a smaller patch, but there were places where it was clearly 
going to take monumental work just to be able to back out and say that 
you had nothing.The number of places in the compiler where you 
compare against the largest and smallest representation of an integer is 
not small and some of them are buried very deep down chains that were 
not designed to say "i cannot answer that question".


I believe that i have all of the functionality of double int in wide 
int, it is just that the calls look different because there are not all of 
the interfaces that take two HWI's.  As mentioned before, all of the 
places where the overflow is computed for the purpose of asking if this 
is ok in two hwi's are gone.




I

Re: patch to fix

2012-10-04 Thread Kenneth Zadeck

There are a bunch of ways to skin the cat.

1) we can define the extra mode.
2) if we get rid of the mode inside the wide int and replace it with an 
explicit precision and bitsize, then we can just make the size of the 
buffer twice as big as the analysis of the modes indicates.
3) or we can leave your code in a form that uses 2 wide ints.   my 
current patch (which i have not gotten working yet) changes this to use 
the mul_full call, but it could be changed.   It is much simpler than 
the existing code.


i do not see how templates offer any solution at all.   the wide int 
code really needs to have something valid to indicate the length of the 
object, and if there is no mode that big, the code will ice.


my personal feeling is that range analysis is quite useful for small 
integers and not so much as the values get larger.   The only really 
large integer constants that you are likely to find in real code are 
encryption keys, like the dvd decoding keys, and if the key is chosen 
well there should be no useful optimization that you can perform on the 
code.   If this did not work on the largest modes, no one would ever 
notice.   i.e. i would bet that you never make a useful transformation 
on any integer that would not fit in an int32.


However, this is your pass, and i understand the principle of "never 
back down".


Kenny

On 10/04/2012 05:06 PM, Marc Glisse wrote:

On Wed, 3 Oct 2012, Kenneth Zadeck wrote:

i have already converted the vrp code, so i have some guess at what 
you are talking about.  (of course correct me if i am wrong).


the code that computes the range when two variables are multiplied 
together needs to do a multiplication that produces a result that is 
twice as wide as the inputs.


my library is able to do that with one catch (and this is a big 
catch): the target has to have an integer mode that is twice as big 
as the mode of the operands. The issue is that wide-ints actually 
carry around the mode of the value in order to get the bitsize and 
precision of the operands (it does not have the type, because this 
code has to work on both the rtl and tree levels and i generally do 
not want the signedness anyway).


Ah, after reading the whole thread, now I understand that it is 
because wide_int carries a mode that it makes little sense making it a 
template (sorry that it took me so long when the information was in 
your first answer). I understand that it would be inconvenient (much 
longer code) to have a base_wide_int that does just the arithmetic and 
a wrapper that contains the mode as well.


Your idea below to define dummy extra modes does bring the template 
idea back to the table though ;-)


my current code in vrp checks to see if such a mode exists and, if it 
does, it produces the product.   if the mode does not exist, it 
returns bottom.   What this means is that for most (many or some) 
targets that have a TImode, the largest thing that vrp can discover 
ranges for is a DImode value.   We could get around this by defining 
the next larger mode than what the target really needs, but i wonder 
how much mileage you are going to get out of that with really 
large numbers.


The current wrapping multiplication code in vrp works with a pair of 
double_int, so it should keep working with a pair of wide_int. I see 
now why wide_int doesn't allow to simplify the code, but it doesn't 
have to break.






Re: patch to fix constant math

2012-10-05 Thread Kenneth Zadeck

richi,

let me address a bunch of issues that are randomly spread thru the thread.

1) unlike the double int's current relationship to int cst, we do not 
currently wrap a wide-int into a CONST_WIDE_INT nor (in the patch that 
you have not seen) do we wrap a wide-int into the int cst.   wide-ints 
are designed to be short lived.   With the exception of the vrp pass, 
they are almost always allocated on the stack and thrown away 
quickly.   That way we can afford to grab a big buffer and play with it.


The rep that lives in the CONST_WIDE_INT and the int cst is a garbage 
collected array of HWIs that is only large enough to hold the actual 
value.   and it is almost always an array of length 1. This saves space 
and works because they are immutable.


This means that there is almost no storage bloat by going with this 
representation no matter how large the widest integer is on the platform.


2) wide-int is really just a generalized version of double-int. The abi 
is different because it does not have all of those calls that take two 
HWIs explicitly and because i was building it all at once LOOKING BOTH 
AT THE TREE AND RTL levels, it has a richer set of calls.   Aside from 
it carrying a mode around that it uses for precision and bitsize, it is 
really bare metal.   My problem is that i am still working on getting 
the tree level stuff working, and phase 1 is closing soon and i wanted 
to get my foot in the door.


3) I think that everyone agrees that having CONST_DOUBLE and CONST_INT 
not carrying around modes was a big mistake.   I toyed with the idea for 
a nanosecond of fixing that along with this, but as you point out the 
patches are already way too large and it would have meant that the burden 
of converting the ports would be larger.   BUT, as the note in the 
original email said, this patch moves significantly in the direction of 
being able to have a mode for the integers at the rtl level.   It does 
this by no longer storing integer values in CONST_DOUBLEs, so there is no 
longer the test of VOIDmode to see what the CONST_DOUBLE contains.
Also the patch contains a significant number of changes (in the machine 
independent rtl code) to the constructors that carry the mode, rather 
than using GEN_INT.


4) Having the mode (or at least the precision and bitsize) is crucial 
for the performance of wide-int.   All of the interesting functions have 
quick-out tests to use the bare metal operators if the precision fits in 
a HWI.  This way, 99% of the time, we are just doing c operators with a 
simple check.


kenny


On 10/05/2012 07:41 AM, Richard Guenther wrote:

On Fri, Oct 5, 2012 at 1:24 PM, Richard Sandiford
 wrote:

Richard Guenther  writes:

On Fri, Oct 5, 2012 at 11:55 AM, Richard Sandiford
 wrote:

Richard Guenther  writes:

As far as the wide_ints recording a mode or precision goes: we're in
the "lucky" position of having tried both options.  Trees record the
type (and thus precision) of all compile-time integer constants.
RTL doesn't.  And the RTL way is a right pain.  It leads to interfaces
in which rtx arguments often have to be accompanied by a mode.  And it
leads to clunky differences like those between simplify_unary_operation
(which takes two mode arguments) and simplify_binary_operation
(which with our current set of binary operations requires only one).

But the issue here is not that double_int doesn't record a mode
or precision (or a sign).  The issue is that CONST_DOUBLE and CONST_INT
don't!  The _tree_ INTEGER_CST contains of a type and a double-int.
I see double-int (and the proposed wide-int) as a common building-block
used for kind of "arbitrary precision" (as far as the target needs) integer
constants on both tree and RTL.  And having a common worker implementation
requires it to operate on something different than tree types or RTL mode
plus signedness.


To put it another way: every wide_int operation needs to know
the precision of its arguments and the precision of its result.

That's not true.  Every tree or RTL operation does, not every
wide_int operation.  double_int's are just twos-complement numbers
of precision 2 * HOST_BITS_PER_WIDE_INT.  wide_int's should
be just twos-complement numbers of precision len *
WHATEVER_HOST_COMPONENT_TYPE_IS_SUITABLE_FOR_A_FAST_IMPLEMENTATION.
Operations on double_int and wide_int are bare-metal,
in nearly all of the times you should use routines that do
proper sign-/zero-extending to a possibly smaller precision.  When
using bare-metal you are supposed to do that yourself.

Which is why I was suggesting to add double_sint, double_uint,
double_sintp (with precision), etc., wrappers and operations.


Not storing the precision in the wide_int is putting the onus
on the caller to keep track of the precision separately.

But that's a matter of providing something high-level ontop of
the bare-metal double-int/wide-int (something shared between
RTL and trees).  Duplicating information in the bare-metal
doesn't make sense (and no, INTE

Re: patch to fix constant math

2012-10-05 Thread Kenneth Zadeck


On 10/05/2012 10:36 AM, Richard Guenther wrote:

On Fri, Oct 5, 2012 at 4:14 PM, Richard Sandiford
 wrote:

Richard Guenther  writes:

On Fri, Oct 5, 2012 at 3:18 PM, Richard Sandiford
 wrote:

Richard Sandiford  writes:

How is CONST_WIDE_INT variable size?

It's just the usual trailing variable-length array thing.

Good.  Do you get rid of CONST_DOUBLE (for integers) at the same time?

Yeah.  I initially thought it might be OK to keep them and have
CONST_INT, integer CONST_DOUBLEs and CONST_WIDE_INT alongside
each other.  (The way the patch is structured means that the
choice of whether to keep integer CONST_DOUBLEs can be changed
very easily.)  But Kenny convinced me it was a bad idea.

Sorry to follow up on myself, but to clarify: I was talking about
converted targets here.  (As in, I originally thought even converted
targets could continue to use integer CONST_DOUBLEs.)

Unconverted targets continue to use CONST_DOUBLE.

Why is it that not all targets are "converted"?  What's the difficulty
with that?
I really do not like partially transitioning there.

The problem is that CONST_DOUBLE as it exists today has two meanings:
a floating-point meaning and an integer meaning.  Ports that handle
CONST_DOUBLEs are aware of this and expect the two things to have
the same rtx code.  Whereas in a converted port, the two things
have different rtx codes, and the integers have a different
representation from the current low/high pair.

So to take the first hit in config/alpha/alpha.c,
namely alpha_rtx_costs:

 case CONST_DOUBLE:
   if (x == CONST0_RTX (mode))
 *total = 0;
   else if ((outer_code == PLUS && add_operand (x, VOIDmode))
|| (outer_code == AND && and_operand (x, VOIDmode)))
 *total = 0;
   else if (add_operand (x, VOIDmode) || and_operand (x, VOIDmode))
 *total = 2;
   else
 *total = COSTS_N_INSNS (2);
   return true;

What could the patch do to make this work without modification?
The middle two cases are for integers, but the first and last
presumably apply to both integers and floats.

I didn't say it does not require changes, just that the transition should be
finished.  Some ports have little CONST_DOUBLE uses (which is what
I am grepping for), and if the max-wide-int-size matches that of the
current CONST_DOUBLE there is little chance for code generation
differences (in fact there should be none, correct?).

In the current patch no target is converted (maybe that's just going to
be 5/n?), I'd like to see at least one commonly tested target converted
and if we don't convert all targets another commonly tested target
to stay unconverted (just check gcc-testresults for which pair of targets
that would work).  Definitely at the end of stage3 we should have zero
unconverted targets (but I doubt converting them is a huge task - I have
converted the VECTOR_CST representation as well and we've had
the location + BLOCK merge and other changes affecting all targets).

Richard.
i will convert ppc if that is what it takes.   david's office is 4 aisles 
away and mike has a lot of experience on ppc also.  (this is unless 
richard is willing to do mips one afternoon.)   I have done my two 
private ports but i understand that those should not count.

Richard




Re: patch to fix constant math

2012-10-05 Thread Kenneth Zadeck

richard s,
there are two comments that i deferred to you.   that have the word 
richard in them,


richi,
thank, i will start doing this now.

kenny
On 10/05/2012 09:49 AM, Richard Guenther wrote:

On Fri, Oct 5, 2012 at 2:39 PM, Richard Guenther
 wrote:

Ok, I see where you are going.  Let me look at the patch again.

* The introduction and use of CONST_SCALAR_INT_P could be split out
   (obvious and good)

* DEF_RTL_EXPR(CONST_WIDE_INT, "const_wide_int", "", RTX_CONST_OBJ)
   defining that new RTX is orthogonal to providing the wide_int abstraction
   for operating on CONST_INT and CONST_DOUBLE, right?

@@ -144,6 +144,7 @@ static bool
  prefer_and_bit_test (enum machine_mode mode, int bitnum)
  {
bool speed_p;
+  wide_int mask = wide_int::set_bit_in_zero (bitnum, mode);

set_bit_in_zero ... why use this instead of
wide_int::zero (mode).set_bit (bitnum) that would match the old
double_int_zero.set_bit interface and be more composed of primitives?
adding something like this was just based on usage.   We do this 
operation all over the place, so why not make it concise and efficient? 
The api is very bottom up.   I looked at what the clients were doing all 
over the place and added those functions.


wide-int has both and_not and or_not.   double-int only has and_not, but 
there are a lot of places where you do or_nots, so we have or_not also.



if (and_test == 0)
  {
@@ -164,8 +165,7 @@ prefer_and_bit_test (enum machine_mode m
  }

/* Fill in the integers.  */
-  XEXP (and_test, 1)
-= immed_double_int_const (double_int_zero.set_bit (bitnum), mode);
+  XEXP (and_test, 1) = immed_wide_int_const (mask);

I suppose immed_wide_int_const may create a CONST_INT, a CONST_DOUBLE
or a CONST_WIDE_INT?
yes, on converted targets it builds const_ints and const_wide_ints and 
on unconverted targets it builds const_ints and const_doubles.   The 
reason i did not want to convert the targets is that the code that lives 
in the targets generally wants to use the values to create constants, and 
this code is very very very target specific.


+#if TARGET_SUPPORTS_WIDE_INT
+/* Returns 1 if OP is an operand that is a CONST_INT or CONST_WIDE_INT.  */
+int
+const_scalar_int_operand (rtx op, enum machine_mode mode)
+{

why is it necessary to condition anything on TARGET_SUPPORTS_WIDE_INT?
It seems testing/compile coverage of this code will be bad ...

Can't you simply map CONST_WIDE_INT to CONST_DOUBLE for targets
The accessors for the two are completely different.   const_doubles 
"know" that there are exactly two hwi's.   for const_wide_ints, you only 
know that the len is greater than 1.   anything with len 1 would be a 
CONST_INT.


In a modern c++ world, const_int would be a subtype of const_wide_int, 
but that is a huge patch.



not supporting CONST_WIDE_INT (what does it mean to "support"
CONST_WIDE_INT?  Does a target that supports CONST_WIDE_INT no
longer use CONST_DOUBLEs for integers?)

yes, that is exactly what it means.

+  if (!CONST_WIDE_INT_P (op))
+return 0;

hm, that doesn't match the comment (CONST_INT is supposed to return 1 as well).
The comment doesn't really tell what the function does it seems,

I need some context here to reply.


+  int prec = GET_MODE_PRECISION (mode);
+  int bitsize = GET_MODE_BITSIZE (mode);
+
+  if (CONST_WIDE_INT_NUNITS (op) * HOST_BITS_PER_WIDE_INT > bitsize)
+   return 0;

it's mode argument is not documented.  And this doesn't even try to see if
the constant would fit the mode anyway (like if it's zero).  Not sure what
the function is for.
I will upgrade the comments; they were inherited from the old version 
with the const_double changed to the const_wide_int.



+   {
+ /* Multiword partial int.  */
+ HOST_WIDE_INT x
+   = CONST_WIDE_INT_ELT (op, CONST_WIDE_INT_NUNITS (op) - 1);
+ return (wide_int::sext (x, prec & (HOST_BITS_PER_WIDE_INT - 1))
+ == x);

err - so now we have wide_int with a mode but we pass in another mode
and see if they have the same value?  Same value as-in signed value?
Which probably is why you need to rule out different size constants above?
Because you don't know whether to sign or zero extend?
no, it is worse than that.   I made the decision, which i think is 
correct, that we were not going to carry the mode inside 
const_wide_int.   The justification was that we do not do it for 
const_int.   There is a comment in the constructor for const_wide_int 
that says it would be so easy to just put this in.


But, this is an old api that did not change.   only the internals 
changed to support const_wide_int.




+/* Returns 1 if OP is an operand that is a CONST_WIDE_INT.  */
+int
+const_wide_int_operand (rtx op, enum machine_mode mode)
+{

similar comments, common code should be factored out at least.
Instead of conditioning a whole set of new function on supports-wide-int
please add cases to the existing functions to avoid diverging in pieces
that both kind of targ

Re: patch to fix constant math

2012-10-05 Thread Kenneth Zadeck


On 10/05/2012 01:29 PM, Richard Sandiford wrote:

>> +/* This is the maximal size of the buffer needed for dump.  */
>> +const int MAX = (MAX_BITSIZE_MODE_ANY_INT / 4
>> +		 + MAX_BITSIZE_MODE_ANY_INT / HOST_BITS_PER_WIDE_INT + 32);
>>
>> I'd prefer a target macro for this, not anything based off
>> MAX_BITSIZE_MODE_ANY_INT just to allow no-op transition of a
>> target to wide-ints (which IMHO should be the default for all targets,
>> simplifying the patch).

Kenny didn't comment on this, but I disagree.  Part of the process of
"converting" a port is to ween it off any assumptions about the number
of HWIs.

i also disagree.   my patch is littered with places like this.
consider:

@@ -5180,13 +4808,11 @@ static rtx
 simplify_immed_subreg (enum machine_mode outermode, rtx op,
enum machine_mode innermode, unsigned int byte)
 {
-  /* We support up to 512-bit values (for V8DFmode).  */
   enum {
-max_bitsize = 512,
 value_bit = 8,
 value_mask = (1 << value_bit) - 1
   };
-  unsigned char value[max_bitsize / value_bit];
+  unsigned char value [MAX_BITSIZE_MODE_ANY_MODE/value_bit];
   int value_start;
   int i;
   int elem;

Would you want do have a target macro for this also?
The code works no matter what. we do not need to be chasing this kind of 
problem all over the place.


kenny


Re: patch to fix constant math - first small patch

2012-10-05 Thread Kenneth Zadeck
this patch adds two groups of things to hwint.h that are needed for 
wide-int.[ch].


A new data type, the HOST_HALF_WIDE_INT (and all of its related 
macros).   This type is defined to be exactly half the width of a 
HOST_WIDE_INT.  This is used by the multiplication and division routines 
in wide-int.c.   This is the type for the "digits" that are multiplied 
together or divided, since you want to do the largest multiplication 
that you can that yields a HOST_WIDE_INT as the result.


The zext_hwi and sext_hwi are simple convenience functions on HWIs.   
These were moved from the wide-int.h in the original patch at the 
request of richi.


ok for commit?

Kenny



2012-10-5  Kenneth Zadeck 

* hwint.h (HOST_BITS_PER_HALF_WIDE_INT, HOST_HALF_WIDE_INT,
HOST_HALF_WIDE_INT_PRINT, HOST_HALF_WIDE_INT_PRINT_C,
HOST_HALF_WIDE_INT_PRINT_DEC, HOST_HALF_WIDE_INT_PRINT_DEC_C,
HOST_HALF_WIDE_INT_PRINT_UNSIGNED, HOST_HALF_WIDE_INT_PRINT_HEX,
HOST_HALF_WIDE_INT_PRINT_HEX_PURE): New symbols.
(sext_hwi, zext_hwi): New functions.

diff --git a/gcc/hwint.h b/gcc/hwint.h
index ca47148..07c4748 100644
--- a/gcc/hwint.h
+++ b/gcc/hwint.h
@@ -77,6 +77,40 @@ extern char sizeof_long_long_must_be_8[sizeof(long long) == 8 ? 1 : -1];
 # endif
 #endif
 
+/* Print support for half a host wide int.  */
+#define HOST_BITS_PER_HALF_WIDE_INT (HOST_BITS_PER_WIDE_INT / 2)
+#if HOST_BITS_PER_HALF_WIDE_INT == HOST_BITS_PER_LONG
+# define HOST_HALF_WIDE_INT long
+# define HOST_HALF_WIDE_INT_PRINT HOST_LONG_FORMAT
+# define HOST_HALF_WIDE_INT_PRINT_C "L"
+# define HOST_HALF_WIDE_INT_PRINT_DEC "%" HOST_HALF_WIDE_INT_PRINT "d"
+# define HOST_HALF_WIDE_INT_PRINT_DEC_C HOST_HALF_WIDE_INT_PRINT_DEC HOST_HALF_WIDE_INT_PRINT_C
+# define HOST_HALF_WIDE_INT_PRINT_UNSIGNED "%" HOST_HALF_WIDE_INT_PRINT "u"
+# define HOST_HALF_WIDE_INT_PRINT_HEX "%#" HOST_HALF_WIDE_INT_PRINT "x"
+# define HOST_HALF_WIDE_INT_PRINT_HEX_PURE "%" HOST_HALF_WIDE_INT_PRINT "x"
+#elif HOST_BITS_PER_HALF_WIDE_INT == HOST_BITS_PER_INT
+# define HOST_HALF_WIDE_INT int
+# define HOST_HALF_WIDE_INT_PRINT ""
+# define HOST_HALF_WIDE_INT_PRINT_C ""
+# define HOST_HALF_WIDE_INT_PRINT_DEC "%" HOST_HALF_WIDE_INT_PRINT "d"
+# define HOST_HALF_WIDE_INT_PRINT_DEC_C HOST_HALF_WIDE_INT_PRINT_DEC HOST_HALF_WIDE_INT_PRINT_C
+# define HOST_HALF_WIDE_INT_PRINT_UNSIGNED "%" HOST_HALF_WIDE_INT_PRINT "u"
+# define HOST_HALF_WIDE_INT_PRINT_HEX "%#" HOST_HALF_WIDE_INT_PRINT "x"
+# define HOST_HALF_WIDE_INT_PRINT_HEX_PURE "%" HOST_HALF_WIDE_INT_PRINT "x"
+#elif HOST_BITS_PER_HALF_WIDE_INT == HOST_BITS_PER_SHORT
+# define HOST_HALF_WIDE_INT short
+# define HOST_HALF_WIDE_INT_PRINT "h"
+# define HOST_HALF_WIDE_INT_PRINT_C ""
+# define HOST_HALF_WIDE_INT_PRINT_DEC "%" HOST_HALF_WIDE_INT_PRINT "d"
+# define HOST_HALF_WIDE_INT_PRINT_DEC_C HOST_HALF_WIDE_INT_PRINT_DEC HOST_HALF_WIDE_INT_PRINT_C
+# define HOST_HALF_WIDE_INT_PRINT_UNSIGNED "%" HOST_HALF_WIDE_INT_PRINT "u"
+# define HOST_HALF_WIDE_INT_PRINT_HEX "%#" HOST_HALF_WIDE_INT_PRINT "x"
+# define HOST_HALF_WIDE_INT_PRINT_HEX_PURE "%" HOST_HALF_WIDE_INT_PRINT "x"
+#else
+#error Please add support for HOST_HALF_WIDE_INT
+#endif
+
+
 #define HOST_WIDE_INT_1 HOST_WIDE_INT_C(1)
 
 /* This is a magic identifier which allows GCC to figure out the type
@@ -94,9 +128,13 @@ typedef HOST_WIDE_INT __gcc_host_wide_int__;
 # if HOST_BITS_PER_WIDE_INT == 64
 #  define HOST_WIDE_INT_PRINT_DOUBLE_HEX \
  "0x%" HOST_LONG_FORMAT "x%016" HOST_LONG_FORMAT "x"
+#  define HOST_WIDE_INT_PRINT_PADDED_HEX \
+ "%016" HOST_LONG_FORMAT "x"
 # else
 #  define HOST_WIDE_INT_PRINT_DOUBLE_HEX \
  "0x%" HOST_LONG_FORMAT "x%08" HOST_LONG_FORMAT "x"
+#  define HOST_WIDE_INT_PRINT_PADDED_HEX \
+ "%08" HOST_LONG_FORMAT "x"
 # endif
 #else
 # define HOST_WIDE_INT_PRINT HOST_LONG_LONG_FORMAT
@@ -104,6 +142,8 @@ typedef HOST_WIDE_INT __gcc_host_wide_int__;
   /* We can assume that 'long long' is at least 64 bits.  */
 # define HOST_WIDE_INT_PRINT_DOUBLE_HEX \
 "0x%" HOST_LONG_LONG_FORMAT "x%016" HOST_LONG_LONG_FORMAT "x"
+# define HOST_WIDE_INT_PRINT_PADDED_HEX \
+"%016" HOST_LONG_LONG_FORMAT "x"
 #endif /* HOST_BITS_PER_WIDE_INT == HOST_BITS_PER_LONG */
 
 #define HOST_WIDE_INT_PRINT_DEC "%" HOST_WIDE_INT_PRINT "d"
@@ -277,4 +317,32 @@ extern HOST_WIDE_INT pos_mul_hwi (HOST_WIDE_INT, HOST_WIDE_INT);
 extern HOST_WIDE_INT mul_hwi (HOST_WIDE_INT, HOST_WIDE_INT);
 extern HOST_WIDE_INT least_common_multiple (HOST_WIDE_INT, HOST_WIDE_INT);

Re: patch to fix constant math - second small patch

2012-10-05 Thread Kenneth Zadeck
This patch adds machinery to genmodes.c so that the largest possible sizes 
of various data structures can be determined at gcc build time.  These 
functions create the following symbols, available in insn-modes.h:

MAX_BITSIZE_MODE_INT - the bitsize of the largest int.
MAX_BITSIZE_MODE_PARTIAL_INT - the bitsize of the largest partial int.
MAX_BITSIZE_MODE_ANY_INT - the largest bitsize of any kind of int.
MAX_BITSIZE_MODE_ANY_MODE - the largest bitsize of any mode (also 
emitted by the patch below).
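For illustration, the generated fragment of insn-modes.h might look like this; the values are target dependent and the numbers shown here are only an assumption for a typical 64-bit target with TImode and 512-bit vector modes:

```c
/* Illustrative genmodes.c output (values are target dependent).  */
#define MAX_BITSIZE_MODE_INT 16*BITS_PER_UNIT
#define MAX_BITSIZE_MODE_PARTIAL_INT 1*BITS_PER_UNIT
#define MAX_BITSIZE_MODE_ANY_INT 16*BITS_PER_UNIT
#define MAX_BITSIZE_MODE_ANY_MODE 64*BITS_PER_UNIT
```

Consumers can then size buffers at compile time, as the simplify_immed_subreg hunk quoted earlier in this thread does with MAX_BITSIZE_MODE_ANY_MODE.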
Index: gcc/genmodes.c
===
--- gcc/genmodes.c	(revision 191978)
+++ gcc/genmodes.c	(working copy)
@@ -849,6 +849,38 @@ calc_wider_mode (void)
 
 #define print_closer() puts ("};")
 
+/* Compute the max bitsize of some of the classes of integers.  It may
+   be that there are needs for the other integer classes, and this
+   code is easy to extend.  */
+static void
+emit_max_int (void)
+{
+  unsigned int max, mmax;
+  struct mode_data *i;
+  int j;
+
+  puts ("");
+  for (max = 1, i = modes[MODE_INT]; i; i = i->next)
+if (max < i->bytesize)
+	max = i->bytesize;
+  printf ("#define MAX_BITSIZE_MODE_INT %d*BITS_PER_UNIT\n", max);
+  mmax = max;
+  for (max = 1, i = modes[MODE_PARTIAL_INT]; i; i = i->next)
+if (max < i->bytesize)
+	max = i->bytesize;
+  printf ("#define MAX_BITSIZE_MODE_PARTIAL_INT %d*BITS_PER_UNIT\n", max);
+  if (max > mmax)
+mmax = max;
+  printf ("#define MAX_BITSIZE_MODE_ANY_INT %d*BITS_PER_UNIT\n", mmax);
+
+  mmax = 0;
+  for (j = 0; j < MAX_MODE_CLASS; j++)
+for (i = modes[j]; i; i = i->next)
+  if (mmax < i->bytesize)
+	mmax = i->bytesize;
+  printf ("#define MAX_BITSIZE_MODE_ANY_MODE %d*BITS_PER_UNIT\n", mmax);
+}
+
 static void
 emit_insn_modes_h (void)
 {
@@ -913,6 +945,7 @@ enum machine_mode\n{");
 #endif
   printf ("#define CONST_MODE_IBIT%s\n", adj_ibit ? "" : " const");
   printf ("#define CONST_MODE_FBIT%s\n", adj_fbit ? "" : " const");
+  emit_max_int ();
   puts ("\
 \n\
 #endif /* insn-modes.h */");
2012-10-5  Kenneth Zadeck 

* genmodes.c (emit_max_int): New function.
(emit_insn_modes_h): Added call to emit_max_function.



Re: patch to fix constant math - third small patch

2012-10-06 Thread Kenneth Zadeck

This is the third patch in the series of patches to fix constant math.
this one changes some predicates at the rtl level to use the new 
predicate CONST_SCALAR_INT_P.

I did not include a few that were tightly intertwined with other changes.

Not all of these changes are strictly mechanical.   Richard, when 
reviewing this, had me make additional changes to remove what he thought 
were latent bugs at the rtl level.   However, it appears that the bugs 
were not latent.   I do not know what is going on here but i am smart 
enough to not look a gift horse in the mouth.


All of this was done on the same machine with no changes and identical 
configs.  It is an x86-64 with Ubuntu 12.04.


ok for commit?

in the logs below, gbBaseline is a trunk from friday and the gbWide is 
the same revision but with my patches.  Some of this like 
gfortran.dg/pr32627 is obviously flutter, but the rest does not appear 
to be.


=
heracles:~/gcc(13) gccBaseline/contrib/compare_tests 
gbBaseline/gcc/testsuite/gcc/gcc.log gbWide/gcc/testsuite/gcc/gcc.log

New tests that PASS:

gcc.dg/builtins-85.c scan-assembler mysnprintf
gcc.dg/builtins-85.c scan-assembler-not __chk_fail
gcc.dg/builtins-85.c (test for excess errors)


heracles:~/gcc(14) gccBaseline/contrib/compare_tests 
gbBaseline/gcc/testsuite/gfortran/gfortran.log 
gbWide/gcc/testsuite/gfortran/gfortran.log

New tests that PASS:

gfortran.dg/pr32627.f03  -O3 -fomit-frame-pointer -funroll-loops (test 
for excess errors)

gfortran.dg/pr32627.f03  -O3 -fomit-frame-pointer  (test for excess errors)
gfortran.dg/pr32627.f03  -Os  (test for excess errors)
gfortran.dg/pr32635.f  -O0  execution test
gfortran.dg/pr32635.f  -O0  (test for excess errors)
gfortran.dg/substr_6.f90  -O2  (test for excess errors)

Old tests that passed, that have disappeared: (Eeek!)

gfortran.dg/pr32627.f03  -O1  (test for excess errors)
gfortran.dg/pr32627.f03  -O3 -fomit-frame-pointer -funroll-all-loops 
-finline-functions  (test for excess errors)

gfortran.dg/pr32627.f03  -O3 -g  (test for excess errors)
gfortran.dg/substring_equivalence.f90  -O  (test for excess errors)
Using /home/zadeck/gcc/gccBaseline/gcc/testsuite/config/default.exp as 
tool-and-target-specific interface file.


=== g++ Summary ===

# of expected passes		49793
# of expected failures		284
# of unsupported tests		601

runtest completed at Fri Oct  5 16:10:20 2012
heracles:~/gcc(16) tail gbWide/gcc/testsuite/g++/g++.log
Using /usr/share/dejagnu/config/unix.exp as generic interface file for target.
Using /home/zadeck/gcc/gccWide/gcc/testsuite/config/default.exp as 
tool-and-target-specific interface file.


=== g++ Summary ===

# of expected passes		50472
# of expected failures		284
# of unsupported tests		613

runtest completed at Fri Oct  5 19:51:50 2012







Re: patch to fix constant math

2012-10-07 Thread Kenneth Zadeck


On 10/07/2012 08:47 AM, Richard Guenther wrote:

>> len seems redundant unless you want to optimize encoding.
>> len == (precision + HOST_BITS_PER_WIDE_INT - 1) / HOST_BITS_PER_WIDE_INT.
>
> that is exactly what we do.   However, we are optimizing time, not space.
> the value is always stored in compressed form, i.e the same representation
> as is used inside the CONST_WIDE_INTs and INT_CSTs.   this makes the
> transformation between them very fast.   And since we do this a lot, it
> needs to be fast.   So the len is the number of HWIs needed to represent the
> value, which is typically less than what would be implied by the precision.

But doesn't this require a sign?  Otherwise how do you encode TImode 0x?
Would len be 1 here?  (and talking about CONST_WIDE_INT vs CONST_INT,
wouldn't CONST_INT be used anyway for all ints we can encode "small"?  Or
is it (hopefully) the goal to replace CONST_INT with CONST_WIDE_INT
everywhere?)  Or do you say wide_int is always "signed', thus TImode 0x
needs len == 2 and an explicit zero upper HWI word?

the compression of this has len == 2 with the explicit upper being 0.


  Or do you say wide_int
is always "unsigned", thus TImode -1 needs len == 2?  Noting that double-ints
were supposed to be twos-complement (thus, 'unsigned') numbers having
implicit non-zero bits sounds error-prone.

That said - I don't see any value in optimizing storage for short-lived
(as you say) wide-ints (apart from limiting it to the mode size).  For
example mutating operations on wide-ints are not really possible
if you need to extend storage.
the compression used is independent of whether the value is signed or 
unsigned, but the compression "looks" signed.   i.e. you do not represent 
the upper hwis if they would just be a sign extension of the top hwi that 
is represented.   However, this does not imply anything about the math 
that was used to set those bits that way, and you can always go back to 
the full rep independent of the sign.


I do not have any long term plans to merge CONST_INT into 
CONST_WIDE_INT.   It would be a huge change in the ports and would be 
fairly space inefficient.  My guess is that in real code, less than 1% 
of real constants will have a length greater than 1 even on a 32 bit 
host.   CONST_INT is very space efficient. This could have been 
mentioned as part of Richard's response to your comment about the way we 
represent the CONST_WIDE_INTs.   In practice, there are almost none of 
them anyway.


In fact, you could argue that the tree level did it wrong (not that i am 
suggesting to change this).   But it makes me think what was going on 
when the decision to make TYPE_PRECISION be an INT_CST rather than just 
a HWI was made.   For that there is an implication that it could never 
take more than a HWI since no place in the code even checks 
TREE_INT_CST_HIGH for these.

>> +  enum Op {
>> +    NONE,
>>
>> we don't have sth like this in double-int.h, why was the double-int
>> mechanism not viable?

>
> i have chosen to use enums for things rather than passing booleans.

But it's bad to use an enum with 4 values for a thing that can only possibly
expect two.  You cannot optimize tests as easily.  Please retain the bool uns
parameters instead.

I am breaking it into two enums.


Re: patch to fix constant math

2012-10-07 Thread Kenneth Zadeck


On 10/07/2012 09:19 AM, Richard Guenther wrote:

>In fact, you could argue that the tree level did it wrong (not that i am
>suggesting to change this).   But it makes me think what was going on when
>the decision to make TYPE_PRECISION be an INT_CST rather than just a HWI was
>made.   For that there is an implication that it could never take more than
>a HWI since no place in the code even checks TREE_INT_CST_HIGH for these.

Well - on the tree level we now always have two HWIs for all INTEGER_CSTs.  If
we can, based on the size of the underlying mode, reduce that to one
HWI we already
win something.  If we add an explicit length to allow a smaller
encoding for larger modes
(tree_base conveniently has an available 'int' for this ...) then we'd
win in more cases.
Thus, is CONST_INT really necessarily better than optimized CONST_WIDE_INT
storage?
i have to admit that looking at these data structures gives me a 
headache.  This all looks like something that Rube Goldberg would have 
invented had he done object oriented design.  (Richard S did not know who 
Rube Goldberg was when i mentioned this name to him a few years ago, since 
this is an American thing, but the british had their own equivalent and 
I assume the germans do too.)


i did the first cut of changing the rtl level structure and Richard S 
threw up on it and suggested what is there now, which happily (for me) i 
was able to get mike to implement.


mike also did the tree level version of the data structures for me.   i 
will make sure he used that left over length field.


The bottom line is that you most likely just save the length, but that 
is a big percent of something this small.  Both of rtl ints have a mode, 
so if we can make that change later, it will be space neutral.


Re: patch to fix constant math - third small patch

2012-10-08 Thread Kenneth Zadeck

yes, my bad.   here it is with the patches.
On 10/06/2012 11:55 AM, Kenneth Zadeck wrote:

This is the third patch in the series of patches to fix constant math.
this one changes some predicates at the rtl level to use the new 
predicate CONST_SCALAR_INT_P.

I did not include a few that were tightly intertwined with other changes.

Not all of these changes are strictly mechanical.   Richard, when 
reviewing this, had me make additional changes to remove what he 
thought were latent bugs at the rtl level.   However, it appears that 
the bugs were not latent.   I do not know what is going on here but i 
am smart enough to not look a gift horse in the mouth.


All of this was done on the same machine with no changes and identical 
configs.  It is an x86-64 with Ubuntu 12.04.


ok for commit?

in the logs below, gbBaseline is a trunk from friday and the gbWide is 
the same revision but with my patches.  Some of this like 
gfortran.dg/pr32627 is obviously flutter, but the rest does not appear 
to be.


=
heracles:~/gcc(13) gccBaseline/contrib/compare_tests 
gbBaseline/gcc/testsuite/gcc/gcc.log gbWide/gcc/testsuite/gcc/gcc.log

New tests that PASS:

gcc.dg/builtins-85.c scan-assembler mysnprintf
gcc.dg/builtins-85.c scan-assembler-not __chk_fail
gcc.dg/builtins-85.c (test for excess errors)


heracles:~/gcc(14) gccBaseline/contrib/compare_tests 
gbBaseline/gcc/testsuite/gfortran/gfortran.log 
gbWide/gcc/testsuite/gfortran/gfortran.log

New tests that PASS:

gfortran.dg/pr32627.f03  -O3 -fomit-frame-pointer -funroll-loops (test 
for excess errors)
gfortran.dg/pr32627.f03  -O3 -fomit-frame-pointer  (test for excess 
errors)

gfortran.dg/pr32627.f03  -Os  (test for excess errors)
gfortran.dg/pr32635.f  -O0  execution test
gfortran.dg/pr32635.f  -O0  (test for excess errors)
gfortran.dg/substr_6.f90  -O2  (test for excess errors)

Old tests that passed, that have disappeared: (Eeek!)

gfortran.dg/pr32627.f03  -O1  (test for excess errors)
gfortran.dg/pr32627.f03  -O3 -fomit-frame-pointer -funroll-all-loops 
-finline-functions  (test for excess errors)

gfortran.dg/pr32627.f03  -O3 -g  (test for excess errors)
gfortran.dg/substring_equivalence.f90  -O  (test for excess errors)
Using /home/zadeck/gcc/gccBaseline/gcc/testsuite/config/default.exp as 
tool-and-target-specific interface file.


=== g++ Summary ===

# of expected passes            49793
# of expected failures          284
# of unsupported tests          601

runtest completed at Fri Oct  5 16:10:20 2012
heracles:~/gcc(16) tail gbWide/gcc/testsuite/g++/g++.log
Using /usr/share/dejagnu/config/unix.exp as generic interface file for target.
Using /home/zadeck/gcc/gccWide/gcc/testsuite/config/default.exp as 
tool-and-target-specific interface file.


=== g++ Summary ===

# of expected passes            50472
# of expected failures          284
# of unsupported tests          613

runtest completed at Fri Oct  5 19:51:50 2012







diff --git a/gcc/cfgexpand.c b/gcc/cfgexpand.c
index 299150e..0404605 100644
--- a/gcc/cfgexpand.c
+++ b/gcc/cfgexpand.c
@@ -3633,9 +3633,8 @@ expand_debug_locations (void)
 
 	gcc_assert (mode == GET_MODE (val)
 			|| (GET_MODE (val) == VOIDmode
-			&& (CONST_INT_P (val)
+			&& (CONST_SCALAR_INT_P (val)
 || GET_CODE (val) == CONST_FIXED
-|| CONST_DOUBLE_AS_INT_P (val) 
 || GET_CODE (val) == LABEL_REF)));
 	  }
 
diff --git a/gcc/combine.c b/gcc/combine.c
index 4e0a579..b531305 100644
--- a/gcc/combine.c
+++ b/gcc/combine.c
@@ -2617,16 +2617,19 @@ try_combine (rtx i3, rtx i2, rtx i1, rtx i0, int *new_direct_jump_p,
  constant.  */
   if (i1 == 0
   && (temp = single_set (i2)) != 0
-  && (CONST_INT_P (SET_SRC (temp))
-	  || CONST_DOUBLE_AS_INT_P (SET_SRC (temp)))
+  && CONST_SCALAR_INT_P (SET_SRC (temp))
   && GET_CODE (PATTERN (i3)) == SET
-  && (CONST_INT_P (SET_SRC (PATTERN (i3)))
-	  || CONST_DOUBLE_AS_INT_P (SET_SRC (PATTERN (i3))))
+  && CONST_SCALAR_INT_P (SET_SRC (PATTERN (i3)))
   && reg_subword_p (SET_DEST (PATTERN (i3)), SET_DEST (temp)))
 {
   rtx dest = SET_DEST (PATTERN (i3));
   int offset = -1;
   int width = 0;
+  
+  /* There are not explicit tests to make sure that this is not a
+	 float, but there is code here that would not be correct if it
+	 was.  */
+  gcc_assert (GET_MODE_CLASS (GET_MODE (SET_SRC (temp))) != MODE_FLOAT);
 
   if (GET_CODE (dest) == ZERO_EXTRACT)
 	{
@@ -5102,8 +5105,7 @@ subst (rtx x, rtx from, rtx to, int in_dest, int in_cond, int unique_copy)
 	  if (GET_CODE (new_rtx) == CLOBBER && XEXP (new_rtx, 0) == const0_rtx)
 		return new_rtx;
 
-	  if (GET_CODE (x) == SUBREG
-		  && (CONST_INT_P (new_rtx) || CONST_DOUBLE_AS_INT_P (new_rtx)))
+	  if (GET_CODE (x) == SUBREG && CONST_SCALAR_INT_P (new_rtx))
 		{
 		  enum machine_mode mode = GET_MODE (x);
 
@@ -7133,7 +7135,7 @@ make_extraction (

Re: patch to fix constant math - first small patch

2012-10-08 Thread Kenneth Zadeck

is this ok to commit with this change?

kenny
On 10/05/2012 08:14 PM, Joseph S. Myers wrote:

On Fri, 5 Oct 2012, Kenneth Zadeck wrote:


+# define HOST_HALF_WIDE_INT_PRINT "h"

This may cause problems on hosts not supporting %hd (MinGW?), and there's
no real need for using "h" here given the promotion of short to int; you
can just use "" (rather than e.g. needing special handling in xm-mingw32.h
like is done for HOST_LONG_LONG_FORMAT).





Re: wide int patch #6: Replacement of hwi extraction from int-csts.

2012-10-18 Thread Kenneth Zadeck


On 10/18/2012 06:22 AM, Richard Biener wrote:

On Wed, Oct 17, 2012 at 11:47 PM, Kenneth Zadeck
 wrote:

Richi,

I apologize for the size of this patch, but it only does one very small
thing, it is just that it does it all over the middle end.

This patch introduces a new api for extracting a signed or unsigned hwi from
an integer cst.  There had been an abi for doing this, but it has some
problems that were fixed here, and it was only used sometimes.

The new abi consists of 6 functions, three for testing if the constant in
the int cst will fit and three for actually pulling out the value.

The ones for testing are tree_fits_uhwi_p, tree_fits_shwi_p,
tree_fits_hwi_p.  The first two of these are unsigned and signed
versions, and the third one takes a boolean parameter which is true if
the value is positive.  This replaces host_integerp, which basically
has the same functionality as tree_fits_hwi_p.  The reason for
changing this is that there were about 400 calls to host_integerp and
all but 4 of them took a constant parameter.  These names are chosen
to be similar to the corresponding functions in wide-int and are more
mnemonic than the existing name since they use the more readable u and
s prefixes that a lot of other places do.

On the accessing side, there is tree_to_uhwi, tree_to_shwi, and tree_to_hwi.
The first two do checking when checking is enabled. The last does no
checking.

Just a quick note here - the changelog mentions tree_low_cst (as new
function!?) but not host_integerp.  You should probably get rid of both,
otherwise uses will creep back as people are familiar with them
(I'm not sure these changes for consistency are always good ...)

i will fix this.

I don't like that tree_to_hwi does not do any checking.  In fact I don't
like that it _exists_, after all it has a return type which signedness
does not magically change.  Unchecked conversions should use
TREE_LOW_CST.
The idea is that when wide-int goes in, there is actually no
TREE_INT_CST_LOW.  The concepts of low and high go out the door; the
int-cst will have an array in it that is big enough to hold the value,
so tree_to_hwi becomes shorthand for just accessing the lower element
of the array.


you could argue that you should say tree_fits_uhwi_p followed by 
tree_to_uhwi (and the same for signed).   This is an easy fix.   it just 
seemed a little redundant.


I should also point out that about 2/3 of this patch is going to die
as the rest of the wide int stuff goes in.  But I did not want to
submit a patch that only converted 1/3 of the cases.  The truth is
that most of these places where you are doing this conversion are just
because the people were too lazy to do the math at the full precision
of the double int.


Thus, my 2 cents before I really look at the patch (which will likely
be next week only, so maybe you can do a followup with the above
suggestions).

Thanks,
Richard.


It is expected that normally an unchecked access follows an explicit
check, or, if there is no explicit check, then one of the checked
accessors is used instead.  This is not the case for places that just
want to look at the bottom bits of some large number, such as several
places that compute alignment.  These just use the naked unchecked
access.

There were a lot of places that either did no checking or did the checking
inline.   This patch tracks down all of those places and they now use the
same abi.

There are three places where I found what appear to be bugs in the
code.  These are places where the existing code did an unsigned check
and then followed it with a signed access (or vice versa).  These
places are marked with comments that contain the string "THE NEXT LINE
IS POSSIBLY WRONG".  With some guidance from the reviewer, I will fix
these and remove the unacceptable comments.

Aside from that, this patch is very very boring because it just makes this
one transformation. Look for interesting stuff in tree.h.   Everything else
is just forcing everywhere to use a single abi.

This patch could go on with a little work without the other 5 patches or it
can wait for them.

kenny




Re: wide int patch #6: Replacement of hwi extraction from int-csts.

2012-10-18 Thread Kenneth Zadeck
You know richi, I did not know who I was actually talking to.  I said
"who is this Richard Beiner person?" and then I saw the email address.



On 10/18/2012 08:58 AM, Richard Biener wrote:

On Thu, Oct 18, 2012 at 2:52 PM, Kenneth Zadeck
 wrote:

On 10/18/2012 06:22 AM, Richard Biener wrote:

On Wed, Oct 17, 2012 at 11:47 PM, Kenneth Zadeck
 wrote:

Richi,

I apologize for the size of this patch, but it only does one very small
thing, it is just that it does it all over the middle end.

This patch introduces a new api for extracting a signed or unsigned hwi
from
an integer cst.  There had been an abi for doing this, but it has some
problems that were fixed here, and it was only used sometimes.

The new abi consists of 6 functions, three for testing if the constant in
the int cst will fit and three for actually pulling out the value.

The ones for testing are tree_fits_uhwi_p, tree_fits_shwi_p,
tree_fits_hwi_p.  The first two of these are unsigned and signed
versions, and the third one takes a boolean parameter which is true if
the value is positive.  This replaces host_integerp, which basically
has the same functionality as tree_fits_hwi_p.  The reason for
changing this is that there were about 400 calls to host_integerp and
all but 4 of them took a constant parameter.  These names are chosen
to be similar to the corresponding functions in wide-int and are more
mnemonic than the existing name since they use the more readable u and
s prefixes that a lot of other places do.

On the accessing side, there is tree_to_uhwi, tree_to_shwi, and
tree_to_hwi.
The first two do checking when checking is enabled. The last does no
checking.

Just a quick note here - the changelog mentions tree_low_cst (as new
function!?) but not host_integerp.  You should probably get rid of both,
otherwise uses will creep back as people are familiar with them
(I'm not sure these changes for consistency are always good ...)

i will fix this.

These are bugs in the changelog, not the code.  New changelog included.



I don't like that tree_to_hwi does not do any checking.  In fact I don't
like that it _exists_, after all it has a return type which signedness
does not magically change.  Unchecked conversions should use
TREE_LOW_CST.

the idea is that when wide-int goes in, there is actually no
TREE_INT_CST_LOW.   The concept of low and high go out the door. the int-cst
will have an array in it that is big enough to hold the value.
so tree_to_hwi becomes a short hand for just accessing the lower element of
the array.

you could argue that you should say tree_fits_uhwi_p followed by
tree_to_uhwi (and the same for signed).   This is an easy fix.   it just
seemed a little redundant.

Well, if you want raw access to the lower element (when do you actually
want that, when not in wide-int.c/h?) ... you still need to care about the
signedness of the result.  And tree_fits_uhwi_p does not return the
same as tree_fits_shwi_p all the time.

I don't see any goodness in tree_to_hwi nor tree_fits_hwi really.  Because
if you just access the lower word then that still has a sign (either
HOST_WIDE_INT or unsigned HOST_WIDE_INT).  We should get rid
of those places - can you enumerate them?  I think you said it was just
a few callers with variable signedness argument.
Note that tree_fits_hwi_p does check.  It just takes a parameter to
say whether it wants signed or unsigned checking (it is the old
host_integerp, repackaged).  You really do need this function as it is
for the 4 or 5 places it is called.  The parameter to it is typically,
but not always, the sign of the type of the int cst being passed to it.


It is the tree_to_hwi that is unchecked.  Most of the places are
identified with comments.  This survives the changelog.  (I happen to
be in the group of people who think changelogs are useless and that we
should do a better job of commenting the code.)


I do not know if this is sloppiness or not, but the signedness that is
checked rarely matches the sign of the variable that the value is
assigned to.  I found this quite frustrating when I was making the
patch, but this kind of thing is common in code where the original
writer "knew what (s)he was doing."  Unless you are doing comparisons
or shifting, the signedness of the target does not really make much
difference.


If you want me to change the sequences of explicit checking and
unchecked access to explicit checking followed by a checked access,
then I am happy to do this.


kenny

Richard.


I should also point out that about 2/3 of this patch is going to die
as the rest of the wide int stuff goes in.  But I did not want to
submit a patch that only converted 1/3 of the cases.  The truth is
that most of these places where you are doing this conversion are just
because the people were too lazy to do the math at the full precision
of the double int.


Thus, my 2 cents before I really look at the patch (which will likely
be next week only, so maybe you ca

Re: wide int patch #7: Replacement of INT_CST_LT and INT_CST_LT_UNSIGNED

2012-10-18 Thread Kenneth Zadeck
(((unsigned HOST_WIDE_INT) TREE_INT_CST_HIGH (A)		\
-< (unsigned HOST_WIDE_INT) TREE_INT_CST_HIGH (B))		\
-   || (((unsigned HOST_WIDE_INT) TREE_INT_CST_HIGH (A)		\
-	== (unsigned HOST_WIDE_INT) TREE_INT_CST_HIGH (B))	\
-   && TREE_INT_CST_LOW (A) < TREE_INT_CST_LOW (B)))
-
 struct GTY(()) tree_int_cst {
   struct tree_typed typed;
   double_int int_cst;
@@ -5689,6 +5677,9 @@ extern const_tree strip_invariant_refs (const_tree);
 extern tree lhd_gcc_personality (void);
 extern void assign_assembler_name_if_neeeded (tree);
 extern void warn_deprecated_use (tree, tree);
+extern HOST_WIDE_INT int_cst_sign_mask (const_tree, int);
+extern bool tree_int_cst_lts_p (const_tree, const_tree);
+extern bool tree_int_cst_ltu_p (const_tree, const_tree);
 
 
 /* In cgraph.c */
2012-10-18  Kenneth Zadeck 
* c-family/c-common.c (shorten_compare): Now uses
tree_int_cst_ltu_p and tree_int_cst_lts_p.
* cp/call.c (type_passed_as, convert_for_arg_passing): Ditto.
* cp/class.c (walk_subobject_offsets, walk_subobject_offsets,
end_of_class, include_empty_classes, layout_class_type): Ditto.
* cp/decl.c (compute_array_index_type): Ditto.
* fold-const.c (fold_relational_const): Ditto.
* java/expr.c (build_field_ref): Ditto.
* targhooks.c (default_cxx_get_cookie_size): Ditto.
* tree-ssa-uninit.c (is_value_included_in): Ditto.
* tree-vrp.c (operand_less_p): Ditto.
* tree.c (tree_int_cst_lt): Ditto.
(int_cst_sign_mask, tree_int_cst_lts_p)
(tree_int_cst_ltu_p): New functions.
* tree.h (INT_CST_LT, INT_CST_LT_UNSIGNED): Remove.


Re: wide int patch #7: Replacement of INT_CST_LT and INT_CST_LT_UNSIGNED

2012-10-19 Thread Kenneth Zadeck


On 10/19/2012 04:22 AM, Richard Biener wrote:

On Thu, Oct 18, 2012 at 7:32 PM, Kenneth Zadeck
 wrote:

This patch replaces all instances of  INT_CST_LT and INT_CST_LT_UNSIGNED
with
the new functions tree_int_cst_lts_p and tree_int_cst_ltu_p.   With the new
implementation of int_cst these functions will be too big to do inline.

These new function names are extremely confusing given that we already
have tree_int_cst_lt which does the right thing based on the signedness
of the INTEGER_CST trees.

The whole point of the macros was to be inlined and you break that.  That is,
for example

if (unsignedp && unsignedp0)
 {
- min_gt = INT_CST_LT_UNSIGNED (primop1, minval);
- max_gt = INT_CST_LT_UNSIGNED (primop1, maxval);
- min_lt = INT_CST_LT_UNSIGNED (minval, primop1);
- max_lt = INT_CST_LT_UNSIGNED (maxval, primop1);
+ min_gt = tree_int_cst_ltu_p (primop1, minval);
+ max_gt = tree_int_cst_ltu_p (primop1, maxval);
+ min_lt = tree_int_cst_ltu_p (minval, primop1);
+ max_lt = tree_int_cst_ltu_p (maxval, primop1);
 }
else
 {
- min_gt = INT_CST_LT (primop1, minval);
- max_gt = INT_CST_LT (primop1, maxval);
- min_lt = INT_CST_LT (minval, primop1);
- max_lt = INT_CST_LT (maxval, primop1);
+ min_gt = tree_int_cst_lts_p (primop1, minval);
...

could have just been

 min_gt = tree_int_cst_lt (primop1, minval);


without any sign check.

So if you think you need to kill the inlined variants please use the existing
tree_int_cst_lt instead.
No, they could not have.  tree_int_cst_lt uses the signedness of the
type to determine how to do the comparison.  These two functions, like
the macros they replace, force the comparison to be done independent
of the signedness of the type.


I do not know why we need to do this.  I am just applying a
plug-compatible replacement here.  I did not write this code, but I do
not think that I can just do as you say here.


Kenny


Thanks,
Richard.


This is a small patch that has no prerequisites.

Tested on x86-64.

kenny




Re: wide int patch #7: Replacement of INT_CST_LT and INT_CST_LT_UNSIGNED

2012-10-19 Thread Kenneth Zadeck


On 10/19/2012 07:13 AM, Richard Biener wrote:

On Fri, Oct 19, 2012 at 12:59 PM, Kenneth Zadeck
 wrote:

On 10/19/2012 04:22 AM, Richard Biener wrote:

On Thu, Oct 18, 2012 at 7:32 PM, Kenneth Zadeck
 wrote:

This patch replaces all instances of  INT_CST_LT and INT_CST_LT_UNSIGNED
with
the new functions tree_int_cst_lts_p and tree_int_cst_ltu_p.   With the
new
implementation of int_cst these functions will be too big to do inline.

These new function names are extremely confusing given that we already
have tree_int_cst_lt which does the right thing based on the signedness
of the INTEGER_CST trees.

The whole point of the macros was to be inlined and you break that.  That
is,
for example

 if (unsignedp && unsignedp0)
  {
- min_gt = INT_CST_LT_UNSIGNED (primop1, minval);
- max_gt = INT_CST_LT_UNSIGNED (primop1, maxval);
- min_lt = INT_CST_LT_UNSIGNED (minval, primop1);
- max_lt = INT_CST_LT_UNSIGNED (maxval, primop1);
+ min_gt = tree_int_cst_ltu_p (primop1, minval);
+ max_gt = tree_int_cst_ltu_p (primop1, maxval);
+ min_lt = tree_int_cst_ltu_p (minval, primop1);
+ max_lt = tree_int_cst_ltu_p (maxval, primop1);
  }
 else
  {
- min_gt = INT_CST_LT (primop1, minval);
- max_gt = INT_CST_LT (primop1, maxval);
- min_lt = INT_CST_LT (minval, primop1);
- max_lt = INT_CST_LT (maxval, primop1);
+ min_gt = tree_int_cst_lts_p (primop1, minval);
...

could have just been

  min_gt = tree_int_cst_lt (primop1, minval);


without any sign check.

So if you think you need to kill the inlined variants please use the
existing
tree_int_cst_lt instead.

no, they could not have.   tree_int_cst_lt uses the signedness of the type
to determine how to do the comparison.These two functions, as the macros
they replace, force the comparison to be done independent of the signedness
of the type.

Well, looking at the surrounding code it's indeed non-obvious that this would
be a 1:1 transform.  But then they should not compare _trees_ but instead
compare double-ints (or wide-ints).

That said, I still think having a tree_int_cst_lt[us]_p function is wrong.
tree INTEGER_CSTs have a sign.  (apart from that opinion we have
tree_int_cst_lt and you introduce tree_int_cst_ltu_p - consistent
would be tree_int_cst_ltu).

This reply applies just as much to this patch as to patch 6.
I morally agree with you 100%.  But the code does not agree with you.

On patch 6, there are about 450 places where we want to take the lower
hwi worth of bits out of an int cst.  Of those, only 5 places use the
function that allows the signedness to be passed in.  The rest make
the signedness decision based on the local code and completely ignore
any information in the type.  Of those 5 that do allow the signedness
to be passed in, only three actually pass it in based on the
signedness of the variable they are accessing.


I am sure that a lot of these are wrong, but I could not tell you
which are and which are not.  Luckily, a lot of this will go away with
the full wide-int code because I just do most of this math in the full
precision, so the issue never comes up.  But after I am finished,
there will still be a fair number of places that do this.  (Luckily, a
large number of them are pulling the number out and comparing it to
the precision of something, so this is likely to be harmless no matter
how the code is written.)


But to a large extent, you are shooting the messenger here, and not
the person who committed the crime.  I will be happy to add some
comments to point the clients of these to the one that looks at the
type.  In looking over the patch, the only obvious ones that could be
changed are the ones in tree-ssa-uninit.c and tree-vrp.c.  The one in
tree-vrp.c just looks like the person writing the code did not know
about tree_int_cst_lt and wrote the check out himself.  (I will fix
this in the tree-vrp patch that I am working on now.)  The one in
tree-ssa-uninit.c looks correct.


But beyond that, the rest are in the front ends, and so I think that
this is as good as you get out of me.


Kenny


I do not know why we need to do this.  I am just applying a plug compatible
replacement here. I did not write this code, but I do not think that i can
just do as you say here.

So use the double-int interface in the places you substituted your new
tree predicates.  Yes, you'll have to touch that again when converting to
wide-int - but if those places really want to ignore the sign of the tree
they should not use a tree interface.

Richard.


Kenny



Thanks,
Richard.


This is a small patch that has no prerequisites.

Tested on x86-64.

kenny






Re: wide int patch #7: Replacement of INT_CST_LT and INT_CST_LT_UNSIGNED

2012-10-19 Thread Kenneth Zadeck


On 10/19/2012 07:58 AM, Richard Biener wrote:

On Fri, Oct 19, 2012 at 1:49 PM, Kenneth Zadeck
 wrote:

On 10/19/2012 07:13 AM, Richard Biener wrote:

On Fri, Oct 19, 2012 at 12:59 PM, Kenneth Zadeck
 wrote:

On 10/19/2012 04:22 AM, Richard Biener wrote:

On Thu, Oct 18, 2012 at 7:32 PM, Kenneth Zadeck
 wrote:

This patch replaces all instances of  INT_CST_LT and
INT_CST_LT_UNSIGNED
with
the new functions tree_int_cst_lts_p and tree_int_cst_ltu_p.   With the
new
implementation of int_cst these functions will be too big to do inline.

These new function names are extremely confusing given that we already
have tree_int_cst_lt which does the right thing based on the signedness
of the INTEGER_CST trees.

The whole point of the macros was to be inlined and you break that.
That
is,
for example

  if (unsignedp && unsignedp0)
   {
- min_gt = INT_CST_LT_UNSIGNED (primop1, minval);
- max_gt = INT_CST_LT_UNSIGNED (primop1, maxval);
- min_lt = INT_CST_LT_UNSIGNED (minval, primop1);
- max_lt = INT_CST_LT_UNSIGNED (maxval, primop1);
+ min_gt = tree_int_cst_ltu_p (primop1, minval);
+ max_gt = tree_int_cst_ltu_p (primop1, maxval);
+ min_lt = tree_int_cst_ltu_p (minval, primop1);
+ max_lt = tree_int_cst_ltu_p (maxval, primop1);
   }
  else
   {
- min_gt = INT_CST_LT (primop1, minval);
- max_gt = INT_CST_LT (primop1, maxval);
- min_lt = INT_CST_LT (minval, primop1);
- max_lt = INT_CST_LT (maxval, primop1);
+ min_gt = tree_int_cst_lts_p (primop1, minval);
...

could have just been

   min_gt = tree_int_cst_lt (primop1, minval);


without any sign check.

So if you think you need to kill the inlined variants please use the
existing
tree_int_cst_lt instead.

no, they could not have.   tree_int_cst_lt uses the signedness of the
type
to determine how to do the comparison.These two functions, as the
macros
they replace, force the comparison to be done independent of the
signedness
of the type.

Well, looking at the surrounding code it's indeed non-obvious that this
would
be a 1:1 transform.  But then they should not compare _trees_ but instead
compare double-ints (or wide-ints).

That said, I still think having a tree_int_cst_lt[us]_p function is wrong.
tree INTEGER_CSTs have a sign.  (apart from that opinion we have
tree_int_cst_lt and you introduce tree_int_cst_ltu_p - consistent
would be tree_int_cst_ltu).

This reply applies just as much to this patch as to patch 6.
I morally agree with you 100%.  But the code does not agree with you.

On patch 6, there are about 450 places where we want to take the lower
hwi worth of bits out of an int cst.  Of those, only 5 places use the
function that allows the signedness to be passed in.  The rest make
the signedness decision based on the local code and completely ignore
any information in the type.  Of those 5 that do allow the signedness
to be passed in, only three actually pass it in based on the
signedness of the variable they are accessing.

I am sure that a lot of these are wrong, but I could not tell you
which are and which are not.  Luckily, a lot of this will go away with
the full wide-int code because I just do most of this math in the full
precision, so the issue never comes up.  But after I am finished,
there will still be a fair number of places that do this.  (Luckily, a
large number of them are pulling the number out and comparing it to
the precision of something, so this is likely to be harmless no matter
how the code is written.)

But to a large extent, you are shooting the messenger here, and not
the person who committed the crime.  I will be happy to add some
comments to point the clients of these to the one that looks at the
type.  In looking over the patch, the only obvious ones that could be
changed are the ones in tree-ssa-uninit.c and tree-vrp.c.  The one in
tree-vrp.c just looks like the person writing the code did not know
about tree_int_cst_lt and wrote the check out himself.  (I will fix
this in the tree-vrp patch that I am working on now.)  The one in
tree-ssa-uninit.c looks correct.

But beyond that, the rest are in the front ends, and so I think that
this is as good as you get out of me.

Well, if you transform bogus (by moral standards) code into other
bogus code the whole point of your patch is to exchange the names
of a set of macros / functions to another set of macros / functions.

I see no point in that then.

Leave broken code as-is.  The more often you touch broken code
and just mangle it in some way the harder it gets to get to the
point that would maybe reveal the real intent of the original code.

Sorry for the harsh words, but to take the example of
INT_CST_LT and INT_CST_LT_UNSIGNED -- those are remnants of
a world before double_ints existed.  All uses should have been
replaced by double_int usage by now; replacing them with something
tree-ish is the wrong way.

Re: patch to fix constant math - 4th patch - the wide-int class.

2012-10-23 Thread Kenneth Zadeck


On 10/23/2012 10:12 AM, Richard Biener wrote:

On Tue, Oct 9, 2012 at 5:09 PM, Kenneth Zadeck  wrote:

This patch implements the wide-int class.  This is a more general
version of the double-int class and is meant to be the eventual
replacement for that class.  The use of this class removes all
dependencies of the host from the target compiler's integer math.

I have made all of the changes I agreed to in the earlier emails.  In
particular, this class internally maintains a bitsize and precision
but not a mode.  The class is now neutral about modes and tree-types.
The functions that take modes or tree-types are just convenience
functions that translate the parameters into bitsize and precision,
and wherever there is a call that takes a mode, there is a
corresponding call that takes a tree-type.

All of the little changes that richi suggested have also been made.

The buffer size is now twice the size needed by the largest integer
mode.  This gives enough room for tree-vrp to do full multiplies on
any type that the target supports.

Tested on x86-64.

This patch depends on the first three patches.   I am still waiting on final
approval on the hwint.h patch.

Ok to commit?

diff --git a/gcc/wide-int.h b/gcc/wide-int.h
new file mode 100644
index 000..efd2c01
--- /dev/null
+++ b/gcc/wide-int.h
...
+#ifndef GENERATOR_FILE



The whole file is guarded with that ... why?  That is bound to be fragile once
use of wide-int spreads?  How do generator programs end up including
this file if they don't need it at all?
This is so that wide-int can be included at the level of the
generators.  There is some stuff that needs to see this type during
the build phase that cannot see the types that are included in
wide-int.h.

+#include "tree.h"
+#include "hwint.h"
+#include "options.h"
+#include "tm.h"
+#include "insn-modes.h"
+#include "machmode.h"
+#include "double-int.h"
+#include 
+#include "insn-modes.h"
+

That's a lot of tree and rtl dependencies.  double-int.h avoids these by
placing conversion routines in different headers or by only resorting to
types in coretypes.h.  Please try to reduce the above to a minimum.

+  HOST_WIDE_INT val[2 * MAX_BITSIZE_MODE_ANY_INT / HOST_BITS_PER_WIDE_INT];

are we sure this rounds properly?  Consider a port with max byte mode
size 4 on a 64bit host.
I do not believe that this can happen.  The core compiler includes
all modes up to TImode, so by default we already go up to 128 bits.

I still would like to have the ability to provide specializations of wide_int
for "small" sizes, thus ideally wide_int would be a template templated
on the number of HWIs in val.  Interface-wise wide_int<2> should be
identical to double_int, thus we should be able to do

typedef wide_int<2> double_int;
If you want to go down this path after the patches get in, go for it.
I see no use at all for this.  This was not meant to be a plug-in
replacement for double-int.  The goal of this patch is to get the
compiler to do the constant math the way that the target does it.  Any
such instantiation is by definition placing some predefined limit that
some target may not want.



in double-int.h and replace its implementation with a specialization of
wide_int.  Due to a number of divergences (double_int is not a subset
of wide_int) that doesn't seem easily possible (one reason is the
ShiftOp and related enums you use).  Of course wide_int is not a
template either.  For the hypothetical embedded target above we'd
end up using wide_int<1>, an even more trivial specialization.

I realize again this wide-int is not what your wide-int is (because you
add a precision member).  Still factoring out the "common"s of
wide-int and double-int into a wide_int_raw <> template should be
possible.

+class wide_int {
+  /* Internal representation.  */
+
+  /* VAL is set to a size that is capable of computing a full
+ multiplication on the largest mode that is represented on the
+ target.  The full multiplication is use by tree-vrp.  If
+ operations are added that require larger buffers, then VAL needs
+ to be changed.  */
+  HOST_WIDE_INT val[2 * MAX_BITSIZE_MODE_ANY_INT / HOST_BITS_PER_WIDE_INT];
+  unsigned short len;
+  unsigned int bitsize;
+  unsigned int precision;

The len, bitsize and precision members need documentation.  At least
one sounds redundant.

+ public:
+  enum ShiftOp {
+NONE,
NONE is never a descriptive name ... I suppose this is for arithmetic vs.
logical shifts?

suggest something

+/* There are two uses for the wide-int shifting functions.  The
+   first use is as an emulation of the target hardware.  The
+   second use is as service routines for other optimizations.  The
+   first case needs to be identified by passing TRUNC as the value
+   of ShiftOp so that shift amount is proper

Re: patch to fix constant math - 4th patch - the wide-int class.

2012-10-23 Thread Kenneth Zadeck


On 10/23/2012 02:38 PM, Lawrence Crowl wrote:

On 10/23/12, Kenneth Zadeck  wrote:

On 10/23/2012 10:12 AM, Richard Biener wrote:

+  inline bool minus_one_p () const;
+  inline bool zero_p () const;
+  inline bool one_p () const;
+  inline bool neg_p () const;

what's wrong with w == -1, w == 0, w == 1, etc.?

I would love to do this and you seem to be somewhat knowledgeable
of c++.  But i cannot for the life of me figure out how to do it.

Starting from the simple case, you write an operator ==.

as global operator:  bool operator == (wide_int w, int i);
as member operator:  bool wide_int::operator == (int i);

In the simple case,

bool operator == (wide_int w, int i)
{
   switch (i)
 {
   case -1: return w.minus_one_p ();
   case  0: return w.zero_p ();
   case  1: return w.one_p ();
   default: unexpected
 }
}

No, this seems wrong.  You do not want to write code that can only
fail at runtime unless there is a damn good reason to do that.

say i have a TImode number, which must be represented in 4 ints
on a 32 bit host (the same issue happens on 64 bit hosts, but
the examples are simpler on 32 bit hosts) and i compare it to -1.
The value that i am going to see as the argument of the function
is going to have the value 0x.  but the value that i have
internally is 128 bits.  do i take this and 0 or sign extend it?

What would you have done with w.minus_one_p ()?
the code "knows" that -1 is a negative number and it knows the precision
of w.  That is enough information.  So it logically builds a -1 that
has enough bits to do the conversion.




in particular if someone wants to compare a number to 0xdeadbeef i
have no idea what to do.  I tried defining two different functions,
one that took a signed and one that took an unsigned number but
then i wanted a cast in front of all the positive numbers.

This is where it does get tricky.  For signed arguments, you should sign
extend.  For unsigned arguments, you should not.  At present, we need
multiple overloads to avoid type ambiguities.

bool operator == (wide_int w, long long int i);
bool operator == (wide_int w, unsigned long long int i);
inline bool operator == (wide_int w, long int i)
   { return w == (long long int) i; }
inline bool operator == (wide_int w, unsigned long int i)
   { return w == (unsigned long long int) i; }
inline bool operator == (wide_int w, int i)
   { return w == (long long int) i; }
inline bool operator == (wide_int w, unsigned int i)
   { return w == (unsigned long long int) i; }

(There is a proposal before the C++ committee to fix this problem.)

Even so, there is room for potential bugs when wide_int does not
carry around whether or not it is signed.  The problem is that
regardless of what the programmer thinks of the sign of the wide int,
the comparison will use the sign of the int.
when they do we can revisit this.   but i looked at this and i said the 
potential bugs were not worth the effort.



If there is a way to do this, then i will do it, but it is going
to have to work properly for things larger than a HOST_WIDE_INT.

The long-term solution, IMHO, is to either carry the sign information
around in either the type or the class data.  (I prefer type, but
with a mechanism to carry it as data when needed.)  Such comparisons
would then require consistency in signedness between the wide int
and the plain int.
carrying the sign information is a non starter.  The rtl level does
not have it and the middle end violates it more often than not.  My
view was to design this having looked at all of the usage.  I have
basically converted the whole compiler before i released the abi.   I am 
still getting out the errors and breaking it up in reviewable sized 
patches, but i knew very very well who my clients were before i wrote 
the abi.



I know that double-int does some of this and it does not carry
around a notion of signedness either.  is this just code that has
not been fully tested or is there a trick in c++ that i am missing?

The double int class only provides == and !=, and only with other
double ints.  Otherwise, it has the same value query functions that
you do above.  In the case of double int, the goal was to simplify
use of the existing semantics.  If you are changing the semantics,
consider incorporating sign explicitly.


i have, and it does not work.


Re: patch to fix constant math - 4th patch - the wide-int class.

2012-10-23 Thread Kenneth Zadeck


On 10/23/2012 04:25 PM, Lawrence Crowl wrote:

On 10/23/12, Kenneth Zadeck  wrote:

On 10/23/2012 02:38 PM, Lawrence Crowl wrote:

On 10/23/12, Kenneth Zadeck  wrote:

On 10/23/2012 10:12 AM, Richard Biener wrote:

+  inline bool minus_one_p () const;
+  inline bool zero_p () const;
+  inline bool one_p () const;
+  inline bool neg_p () const;

what's wrong with w == -1, w == 0, w == 1, etc.?

I would love to do this and you seem to be somewhat knowledgeable
of c++.  But i cannot for the life of me figure out how to do it.

Starting from the simple case, you write an operator ==.

as global operator:  bool operator == (wide_int w, int i);
as member operator:  bool wide_int::operator == (int i);

In the simple case,

bool operator == (wide_int w, int i)
{
switch (i)
  {
case -1: return w.minus_one_p ();
case  0: return w.zero_p ();
case  1: return w.one_p ();
default: unexpected
  }
}

no, this seems wrong.  you do not want to write code that can only
fail at runtime unless there is a damn good reason to do that.

Well, that's because it's the oversimplified case.  :-)


say i have a TImode number, which must be represented in 4 ints
on a 32 bit host (the same issue happens on 64 bit hosts, but
the examples are simpler on 32 bit hosts) and i compare it to -1.
The value that i am going to see as the argument of the function
is going to have the value 0x.  but the value that i have
internally is 128 bits.  do i take this and 0 or sign extend it?

What would you have done with w.minus_one_p ()?

the code "knows" that -1 is a negative number and it knows the
precision of w.  That is enough information.  So it logically
builds a -1 that has enough bits to do the conversion.

And the code could also know that '-n' is a negative number and do
the identical conversion.  It would certainly be more difficult to
write and get all the edge cases.
I am not a c++ hacker.   if someone wants to go there later, we can 
investigate this.

but it seems like a can of worms right now.



in particular if someone wants to compare a number to 0xdeadbeef i
have no idea what to do.  I tried defining two different functions,
one that took a signed and one that took an unsigned number but
then i wanted a cast in front of all the positive numbers.

This is where it does get tricky.  For signed arguments, you should sign
extend.  For unsigned arguments, you should not.  At present, we need
multiple overloads to avoid type ambiguities.

bool operator == (wide_int w, long long int i);
bool operator == (wide_int w, unsigned long long int i);
inline bool operator == (wide_int w, long int i)
{ return w == (long long int) i; }
inline bool operator == (wide_int w, unsigned long int i)
{ return w == (unsigned long long int) i; }
inline bool operator == (wide_int w, int i)
{ return w == (long long int) i; }
inline bool operator == (wide_int w, unsigned int i)
{ return w == (unsigned long long int) i; }

(There is a proposal before the C++ committee to fix this problem.)

Even so, there is room for potential bugs when wide_int does not
carry around whether or not it is signed.  The problem is that
regardless of what the programmer thinks of the sign of the wide int,
the comparison will use the sign of the int.

when they do we can revisit this.   but i looked at this and i said the
potential bugs were not worth the effort.

I won't disagree.  I was answering what I thought were questions on
what was possible.


If there is a way to do this, then i will do it, but it is going
to have to work properly for things larger than a HOST_WIDE_INT.

The long-term solution, IMHO, is to either carry the sign information
around in either the type or the class data.  (I prefer type, but
with a mechanism to carry it as data when needed.)  Such comparisons
would then require consistency in signedness between the wide int
and the plain int.

carrying the sign information is a non starter.  The rtl level does
not have it and the middle end violates it more often than not.  My
view was to design this having looked at all of the usage.  I have
basically converted the whole compiler before i released the abi.   I am
still getting out the errors and breaking it up in reviewable sized
patches, but i knew very very well who my clients were before i wrote
the abi.

Okay.


I know that double-int does some of this and it does not carry
around a notion of signedness either.  is this just code that has
not been fully tested or is there a trick in c++ that i am missing?

The double int class only provides == and !=, and only with other
double ints.  Otherwise, it has the same value query functions that
you do above.  In the case of double int, the goal was to simplify
use of the existing semantics.  If you are changing the semantics,
consider incorporating sign explicitly.

i have, and it does not work.

Unfortunate.

There is certainly a desire here not to le

Re: patch to fix constant math - 4th patch - the wide-int class.

2012-10-25 Thread Kenneth Zadeck


On 10/25/2012 06:42 AM, Richard Biener wrote:

On Wed, Oct 24, 2012 at 7:23 PM, Mike Stump  wrote:

On Oct 24, 2012, at 2:43 AM, Richard Biener  wrote:

On Tue, Oct 23, 2012 at 6:12 PM, Kenneth Zadeck
 wrote:

On 10/23/2012 10:12 AM, Richard Biener wrote:

+  HOST_WIDE_INT val[2 * MAX_BITSIZE_MODE_ANY_INT /
HOST_BITS_PER_WIDE_INT];

are we sure this rounds properly?  Consider a port with max byte mode
size 4 on a 64bit host.

I do not believe that this can happen.   The core compiler includes all
modes up to TI mode, so by default we already up to 128 bits.

And mode bitsizes are always power-of-two?  I suppose so.

Actually, no, they are not.  Partial int modes can have bit sizes that are not
a power of two, and, if there isn't an int mode that is bigger, we'd want to
round up the partial int bit size.  Something like ((2 *
MAX_BITSIZE_MODE_ANY_INT + HOST_BITS_PER_WIDE_INT - 1) /
HOST_BITS_PER_WIDE_INT) should do it.


I still would like to have the ability to provide specializations of
wide_int
for "small" sizes, thus ideally wide_int would be a template templated
on the number of HWIs in val.  Interface-wise wide_int<2> should be
identical to double_int, thus we should be able to do

typedef wide_int<2> double_int;

If you want to go down this path after the patches get in, go for it.  I
see no use at all for this.
This was not meant to be a plug in replacement for double int.  The goal of
this patch is to get the compiler to do the constant math the way that the
target does it.   Any such instantiation is by definition placing some
predefined limit that some target may not want.

Well, what I don't really like is that we now have two implementations
of functions that perform integer math on two-HWI sized integers.  What
I also don't like too much is that we have two different interfaces to operate
on them!  Can't you see how I come to not liking this?  Especially the
latter …

double_int is logically dead.  Refactoring wide-int and double-int is a waste of 
time, as the time is better spent removing double-int from the compiler.  All 
the necessary semantics and code of double-int _has_ been refactored into 
wide-int already.  Changing wide-int in any way to vend anything to double-int 
is wrong, as once double-int is removed, then all the api changes to make 
double-int share from wide-int is wasted and must then be removed.  The path 
forward is the complete removal of double-int; it is wrong, has been wrong and 
always will be wrong, nothing can change that.

double_int, compared to wide_int, is fast and lean.  I doubt we will
get rid of it - you
will make compile-time math a _lot_ slower.  Just profile when you for example
change get_inner_reference to use wide_ints.

To be able to remove double_int in favor of wide_int requires _at least_
templating wide_int on 'len' and providing specializations for 1 and 2.

It might be a non-issue for math that operates on trees or RTXen due to
the allocation overhead we pay, but in recent years we transitioned important
paths away from using tree math to using double_ints _for speed reasons_.

Richard.
i do not know why you believe this about the speed. double int 
always does synthetic math since you do everything at 128 bit precision.


the thing about wide int is that, since it does math to the precision's
size, it almost never uses synthetic operations, since the sizes for
almost every instance can be done using the native math on the
machine.  almost every call has a check to see if the operation can be
done natively.  I seriously doubt that you are going to do TI mode
math much faster than i do it, and if you do, who cares.


the number of calls does not affect the performance in any negative way
and in fact is more efficient, since common things that require more than
one operation in double int are typically done in a single operation.




Re: GCC 4.8.0 Status Report (2012-10-29), Stage 1 to end soon

2012-10-30 Thread Kenneth Zadeck

jakub,

i am hoping to get the rest of my wide integer conversion posted by nov
5.  I am under some adverse conditions here: hurricane sandy hit here
pretty badly.  my house is hooked up to a small generator, and no one
has any power for miles around.


So far richi has promised to review them.  he has sent some comments,
but so far no reviews.  Some time after i get the first round of them
posted, i will do a second round that incorporates everyone's comments.


But i would like a little slack here if possible.  While this work is
a show stopper for my private port, the patches address serious problems
for many of the public ports, especially ones that have very flexible
vector units.  I believe that there is a significant set of latent
problems currently with the existing ports that use ti mode that these
patches will fix.


However, i will do everything in my power to get the first round of the 
patches posted by nov 5 deadline.


kenny

On 10/29/2012 01:56 PM, Jakub Jelinek wrote:

Status
==

I'd like to close the stage 1 phase of GCC 4.8 development
on Monday, November 5th.  If you have still patches for new features you'd
like to see in GCC 4.8, please post them for review soon.  Patches
posted before the freeze, but reviewed shortly after the freeze, may
still go in, further changes should be just bugfixes and documentation
fixes.


Quality Data


Priority  #   Change from Last Report
---   ---
P1   23   + 23
P2   77   +  8
P3   85   + 84
---   ---
Total   185   +115


Previous Report
===

http://gcc.gnu.org/ml/gcc/2012-03/msg00011.html

The next report will be sent by me again, announcing end of stage 1.




Re: GCC 4.8.0 Status Report (2012-10-29), Stage 1 to end soon

2012-10-31 Thread Kenneth Zadeck

Richi,

Let me explain to you what a broken api is.  I have spent the last week
screwing around with tree-vrp and as of last night i finally got it to
work.  In tree-vrp, it is clear that double-int is the precise
definition of a broken api.


The tree-vrp code uses an infinite-precision view of arithmetic.  However,
that infinite precision is implemented on top of a finite, CARVED IN
STONE, base that is, and without a patch like this always will be, 128
bits on an x86-64.  However, as was pointed out earlier, tree-vrp
needs 2 * the size of a type + 1 bit to work correctly.  Until
yesterday i did not fully understand the significance of that 1 bit.
what this means is that tree-vrp does not work on an x86-64 with _int128
variables.


There are no checks in tree-vrp to back off when it sees something too
large; tree-vrp simply gets the wrong answer.  To me, this is a broken
api and is GCC at its very worst.  The patches that required this
SHOULD HAVE NEVER GONE INTO GCC.  What you have with my patches is
someone who is willing to fix a large and complex problem that should
have been fixed years ago.


I understand that you do not like several aspects of the wide-int api
and i am willing to make some of those improvements.  However, what i
am worried about is that you are in some ways really attached to the
style of programming where everything is dependent on the size of a
HWI.  I will continue to push back on those comments but have been
working the rest in as i have been going along.


To answer your other question, it will be a significant problem if i
cannot get these patches in.  They are very prone to patch rot and my
customer wants a product without many patches to the base code.
Also, i fear that your real reason for wanting to wait is that you
really do not like the fact these patches get rid of double int and that
style of programming, and putting off that day serves no one well.


kenny









On 10/31/2012 05:59 AM, Richard Sandiford wrote:

Richard Biener  writes:

On Tue, Oct 30, 2012 at 10:05 PM, Kenneth Zadeck
  wrote:

jakub,

i am hoping to get the rest of my wide integer conversion posted by nov 5.
I am under some adverse conditions here: hurricane sandy hit here pretty
badly.  my house is hooked up to a small generator, and no one has any power
for miles around.

So far richi has promised to review them.   he has sent some comments, but
so far no reviews.Some time after i get the first round of them posted,
i will do a second round that incorporates everyones comments.

But i would like a little slack here if possible.While this work is a
show stopper for my private port, the patches address serious problems for
many of the public ports, especially ones that have very flexible vector
units.I believe that there are significant set of latent problems
currently with the existing ports that use ti mode that these patches will
fix.

However, i will do everything in my power to get the first round of the
patches posted by nov 5 deadline.

I suppose you are not going to merge your private port for 4.8 and thus
the wide-int changes are not a show-stopper for you.

That said, I considered the main conversion to be appropriate to be
defered for the next stage1.  There is no advantage in disrupting the
tree more at this stage.

I would like the wide_int class and rtl stuff to go in 4.8 though.
IMO it's a significant improvement in its own right, and Kenny
submitted it well before the deadline.

Richard





Re: patch to fix constant math - 4th patch - the wide-int class.

2012-10-31 Thread Kenneth Zadeck


On 10/31/2012 08:11 AM, Richard Biener wrote:

On Wed, Oct 31, 2012 at 1:05 PM, Richard Sandiford
 wrote:

Richard Biener  writes:

On Wed, Oct 31, 2012 at 11:43 AM, Richard Sandiford
 wrote:

Richard Biener  writes:

On Thu, Oct 25, 2012 at 12:55 PM, Kenneth Zadeck
 wrote:

On 10/25/2012 06:42 AM, Richard Biener wrote:

On Wed, Oct 24, 2012 at 7:23 PM, Mike Stump  wrote:

On Oct 24, 2012, at 2:43 AM, Richard Biener 
wrote:

On Tue, Oct 23, 2012 at 6:12 PM, Kenneth Zadeck
 wrote:

On 10/23/2012 10:12 AM, Richard Biener wrote:

+  HOST_WIDE_INT val[2 * MAX_BITSIZE_MODE_ANY_INT /
HOST_BITS_PER_WIDE_INT];

are we sure this rounds properly?  Consider a port with max byte mode
size 4 on a 64bit host.

I do not believe that this can happen.   The core compiler includes all
modes up to TI mode, so by default we already up to 128 bits.

And mode bitsizes are always power-of-two?  I suppose so.

Actually, no, they are not.  Partial int modes can have bit sizes that
are not a power of two, and, if there isn't an int mode that is bigger, we'd
want to round up the partial int bit size.  Something like ((2 *
MAX_BITSIZE_MODE_ANY_INT + HOST_BITS_PER_WIDE_INT - 1) /
HOST_BITS_PER_WIDE_INT) should do it.


I still would like to have the ability to provide specializations of
wide_int
for "small" sizes, thus ideally wide_int would be a template templated
on the number of HWIs in val.  Interface-wise wide_int<2> should be
identical to double_int, thus we should be able to do

typedef wide_int<2> double_int;

If you want to go down this path after the patches get in, go for it.
I
see no use at all for this.
This was not meant to be a plug in replacement for double int. This
goal of
this patch is to get the compiler to do the constant math the way that
the
target does it.   Any such instantiation is by definition placing some
predefined limit that some target may not want.

Well, what I don't really like is that we now have two implementations
of functions that perform integer math on two-HWI sized integers.  What
I also don't like too much is that we have two different interfaces to
operate
on them!  Can't you see how I come to not liking this?  Especially the
latter …

double_int is logically dead.  Refactoring wide-int and double-int is a
waste of time, as the time is better spent removing double-int from the
compiler.  All the necessary semantics and code of double-int _has_ been
refactored into wide-int already.  Changing wide-int in any way to vend
anything to double-int is wrong, as once double-int is removed,
then all the
api changes to make double-int share from wide-int is wasted and must then
be removed.  The path forward is the complete removal of double-int; it is
wrong, has been wrong and always will be wrong, nothing can change that.

double_int, compared to wide_int, is fast and lean.  I doubt we will
get rid of it - you
will make compile-time math a _lot_ slower.  Just profile when you for
example
change get_inner_reference to use wide_ints.

To be able to remove double_int in favor of wide_int requires _at least_
templating wide_int on 'len' and providing specializations for 1 and 2.

It might be a non-issue for math that operates on trees or RTXen due to
the allocation overhead we pay, but in recent years we transitioned
important
paths away from using tree math to using double_ints _for speed reasons_.

Richard.

i do not know why you believe this about the speed. double int always
does synthetic math since you do everything at 128 bit precision.

the thing about wide int, is that since it does math to the precision's
size, it almost never uses synthetic operations since the sizes for
almost every instance can be done using the native math on the machine.
almost every call has a check to see if the operation can be done natively.
I seriously doubt that you are going to do TI mode math much faster than i
do it and if you do who cares.

the number of calls does not affect the performance in any negative way and
in fact is more efficient since common things that require more than one
operation in double int are typically done in a single operation.

Simple double-int operations like

inline double_int
double_int::and_not (double_int b) const
{
   double_int result;
   result.low = low & ~b.low;
   result.high = high & ~b.high;
   return result;
}

are always going to be faster than conditionally executing only one operation
(but inside an offline function).

OK, this is really in reply to the 4.8 thing, but it felt more
appropriate here.

It's interesting that you gave this example, since before you were
complaining about too many fused ops.  Clearly this one could be
removed in favour of separate and() and not() operations, but why
not provide a fused one if there are clients who'll make use of it?

I was more concerned about fused operations that use precision
or bitsize as input.  That is for example


+  bool only_sign_

Re: GCC 4.8.0 Status Report (2012-10-29), Stage 1 to end soon

2012-10-31 Thread Kenneth Zadeck

jakub

my port has 256 bit integers.  They are done by strapping together all
of the elements of a vector unit.
if one looks at where intel is going, they are doing exactly the same
thing.  The difference is that they like to add the operations one at
a time rather than just do a clean implementation like we did.  Soon
they will get there, it is just a matter of time.


i understand the tree-vrp code well enough to say that this operation
does not work if you have timode, but i do not know how to translate
that back into c to generate a test case.  My patch to tree-vrp is
adaptable in that it looks at the types in the program and adjusts its
definition of infinite precision based on the code that it sees.  I can
point people to that code in tree-vrp and am happy to do that, but that
is not my priority now.


also, richi pointed out that there are places in the tree level constant 
propagators that require infinite precision so he is really the person 
who both should know about this and generate proper tests.


kenny

On 10/31/2012 09:55 AM, Jakub Jelinek wrote:

On Wed, Oct 31, 2012 at 09:44:50AM -0400, Kenneth Zadeck wrote:

The tree-vrp code uses an infinite-precision view of arithmetic.  However,
that infinite precision is implemented on top of a finite, CARVED IN
STONE, base that is and will always be without a patch like this,
128 bits on an x86-64.However, as was pointed out by earlier,
tree-vrp needs 2 * the size of a type + 1 bit to work correctly.
Until yesterday i did not fully understand the significance of that
1 bit.  what this means is that tree-vrp does not work on an x86-64
with _int128 variables.

If you see a VRP bug, please file a PR with a testcase, or point to existing
PR.  I agree with richi that it would be better to add a clean wide_int
implementation for 4.9, rather than rushing something in, introducing
lots of bugs, just for a port that hasn't been submitted, nor do I understand
why > int128_t integer types are so crucial to your port, the vector
support doesn't generally need very large integers, even if your
vector unit is 256-bit, 512-bit or larger.

Jakub




Re: patch to fix constant math - 4th patch - the wide-int class.

2012-10-31 Thread Kenneth Zadeck


On 10/31/2012 10:05 AM, Richard Biener wrote:

On Wed, Oct 31, 2012 at 2:54 PM, Kenneth Zadeck
 wrote:

On 10/31/2012 08:11 AM, Richard Biener wrote:

On Wed, Oct 31, 2012 at 1:05 PM, Richard Sandiford
 wrote:

Richard Biener  writes:

On Wed, Oct 31, 2012 at 11:43 AM, Richard Sandiford
 wrote:

Richard Biener  writes:

On Thu, Oct 25, 2012 at 12:55 PM, Kenneth Zadeck
 wrote:

On 10/25/2012 06:42 AM, Richard Biener wrote:

On Wed, Oct 24, 2012 at 7:23 PM, Mike Stump 
wrote:

On Oct 24, 2012, at 2:43 AM, Richard Biener

wrote:

On Tue, Oct 23, 2012 at 6:12 PM, Kenneth Zadeck
 wrote:

On 10/23/2012 10:12 AM, Richard Biener wrote:

+  HOST_WIDE_INT val[2 * MAX_BITSIZE_MODE_ANY_INT /
HOST_BITS_PER_WIDE_INT];

are we sure this rounds properly?  Consider a port with max byte
mode
size 4 on a 64bit host.

I do not believe that this can happen.   The core compiler
includes all
modes up to TI mode, so by default we already up to 128 bits.

And mode bitsizes are always power-of-two?  I suppose so.

Actually, no, they are not.  Partial int modes can have bit sizes
that are not a power of two, and, if there isn't an int mode that is
bigger, we'd want to round up the partial int bit size.  Something
like ((2 * MAX_BITSIZE_MODE_ANY_INT + HOST_BITS_PER_WIDE_INT - 1) /
HOST_BITS_PER_WIDE_INT) should do it.


I still would like to have the ability to provide
specializations of
wide_int
for "small" sizes, thus ideally wide_int would be a template
templated
on the number of HWIs in val.  Interface-wise wide_int<2> should
be
identical to double_int, thus we should be able to do

typedef wide_int<2> double_int;

If you want to go down this path after the patches get in, go for
it.
I
see no use at all for this.
This was not meant to be a plug in replacement for double int.
This
goal of
this patch is to get the compiler to do the constant math the way
that
the
target does it.   Any such instantiation is by definition placing
some
predefined limit that some target may not want.

Well, what I don't really like is that we now have two
implementations
of functions that perform integer math on two-HWI sized integers.
What
I also don't like too much is that we have two different
interfaces to
operate
on them!  Can't you see how I come to not liking this?  Especially
the
latter …

double_int is logically dead.  Refactoring wide-int and double-int
is a
waste of time, as the time is better spent removing double-int from
the
compiler.  All the necessary semantics and code of double-int _has_
been
refactored into wide-int already.  Changing wide-int in any way to
vend
anything to double-int is wrong, as once double-int is removed,
then all the
api changes to make double-int share from wide-int is wasted and
must then
be removed.  The path forward is the complete removal of
double-int; it is
wrong, has been wrong and always will be wrong, nothing can change
that.

double_int, compared to wide_int, is fast and lean.  I doubt we will
get rid of it - you
will make compile-time math a _lot_ slower.  Just profile when you
for
example
change get_inner_reference to use wide_ints.

To be able to remove double_int in favor of wide_int requires _at
least_
templating wide_int on 'len' and providing specializations for 1 and
2.

It might be a non-issue for math that operates on trees or RTXen due
to
the allocation overhead we pay, but in recent years we transitioned
important
paths away from using tree math to using double_ints _for speed
reasons_.

Richard.

i do not know why you believe this about the speed. double int
always
does synthetic math since you do everything at 128 bit precision.

the thing about wide int, is that since it does math to the
precision's
size, it almost never uses synthetic operations since the sizes
for
almost every instance can be done using the native math on the
machine.
almost every call has a check to see if the operation can be done
natively.
I seriously doubt that you are going to do TI mode math much faster
than i
do it and if you do who cares.

the number of calls does not affect the performance in any negative
way and
in fact is more efficient since common things that require more than
one
operation in double int are typically done in a single operation.

Simple double-int operations like

inline double_int
double_int::and_not (double_int b) const
{
double_int result;
result.low = low & ~b.low;
result.high = high & ~b.high;
return result;
}

are always going to be faster than conditionally executing only one
operation
(but inside an offline function).

OK, this is really in reply to the 4.8 thing, but it felt more
appropriate here.

It's interesting that you gave this example, since before you were
complaining about too many fused ops.  Clearly this one could be
removed in favour of separate and() and not() operations, but why
not provide a fused one if there are clients who'll make use of it?

I was more conce

Re: patch to fix constant math - 4th patch - the wide-int class.

2012-10-31 Thread Kenneth Zadeck


On 10/31/2012 09:54 AM, Richard Biener wrote:

On Wed, Oct 31, 2012 at 2:30 PM, Richard Sandiford
 wrote:

Richard Biener  writes:

But that means that wide_int has to model a P-bit operation as a
"normal" len*HOST_WIDE_INT operation and then fix up the result
after the fact, which seems unnecessarily convoluted.

It does that right now.  The operations are carried out in a loop
over len HOST_WIDE_INT parts, the last HWI is then special-treated
to account for precision/size.  (yes, 'len' is also used as optimization - the
fact that len ends up being mutable is another thing I dislike about
wide-int.  If wide-ints are cheap then all ops should be non-mutating
(at least to 'len')).

But the point of having a mutating len is that things like zero and -1
are common even for OImode values.  So if you're doing someting potentially
expensive like OImode multiplication, why do it to the number of
HOST_WIDE_INTs needed for an OImode value when the value we're
processing has only one significant HOST_WIDE_INT?

I don't propose doing that.  I propose that no wide-int member function
may _change_ its len (to something larger).  Only that way you can
avoid allocating wasted space for zero and -1.  That way also the
artificial limit on 2 * largest-int-mode-hwis goes.

it is now 4x not 2x to accommodate the extra bit in tree-vrp.

remember that the space burden is minimal.  wide-ints are not
persistent and there are never more than a handful at a time.



  I still don't
see why a full-precision 2*HOST_WIDE_INT operation (or a full-precision
X*HOST_WIDE_INT operation for any X) has any special meaning.

Well, the same reason as a HOST_WIDE_INT variable has a meaning.
We use it to constrain what we (efficiently) want to work on.  For example
CCP might iterate up to 2 * HOST_BITS_PER_WIDE_INT times when
doing bit-constant-propagation in loops (for TImode integers on a x86_64 host).

But what about targets with modes wider than TImode?  Would double_int
still be appropriate then?  If not, why does CCP have to use a templated
type with a fixed number of HWIs (and all arithmetic done to a fixed
number of HWIs) rather than one that can adapt to the runtime values,
like wide_int can?

Because nobody cares about accurate bit-tracking for modes larger than
TImode.  And because no convenient abstraction was available ;)
Yes, but tree-vrp does not even work for TImode, and there are no
tests to scale it back when it does see TImode.  I understand that
these can be added, but so far they have not been.


I would also point out that I was corrected on this point by (I believe)
Lawrence.  He points out that tree-vrp is still important for
converting signed to unsigned for larger modes.




Oh, and I don't necessarily see a use of double_int in its current form
but for an integer representation on the host that is efficient to manipulate
integer constants of a target-dependent size.  For example the target
detail that we have partial integer modes with bitsize > precision and that
the bits > precision apparently have a meaning when looking at the
bit-representation of a constant should not be part of the base class
of wide-int (I doubt it belongs to wide-int at all, but I guess you know more
about the reason we track bitsize in addition to precision - I think it's
abstraction at the wrong level, the tree level does fine without knowing
about bitsize).

TBH I'm uneasy about the bitsize thing too.  I think bitsize is only
tracked for shift truncation, and if so, I agree it makes sense
to do that separately.

So, can we please remove all traces of bitsize from wide-int then?


But anyway, this whole discussion seems to have reached a stalemate.
Or I suppose a de-facto rejection, since you're the only person in
a position to approve the thing :-)

There are many (silent) people that are able to approve the thing.  But the
point is I have too many issues with the current patch that I'm unable
to point at a specific thing I want Kenny to change after which the patch
would be fine.  So I rely on some guesswork from Kenny, taking my
advice ("leaner API", "less fused ops", "get rid of bitsize", "think of
abstracting the core HWI[len] operation", "there should be no tree or
RTL dependencies in the wide-int API") to produce an updated variant.
Which of course takes time, which of course crosses my vacation, which
in the end means it isn't going to make 4.8 (I _do_ like the idea of not
having a dependence on host properties for integer constant representation).

Btw, a good hint at what a minimal wide-int API would look like is if
you _just_ replace double-int users with it.  Then you obviously have to
implement only the double-int interface and conversion from/to double-int.

Richard.



Richard




Re: patch to fix constant math - 4th patch - the wide-int class.

2012-10-31 Thread Kenneth Zadeck


On 10/31/2012 08:44 AM, Richard Biener wrote:

On Wed, Oct 31, 2012 at 1:22 PM, Richard Sandiford
 wrote:

Richard Biener  writes:

On Wed, Oct 31, 2012 at 1:05 PM, Richard Sandiford
 wrote:

Richard Biener  writes:

On Wed, Oct 31, 2012 at 11:43 AM, Richard Sandiford
 wrote:

Richard Biener  writes:

On Thu, Oct 25, 2012 at 12:55 PM, Kenneth Zadeck
 wrote:

On 10/25/2012 06:42 AM, Richard Biener wrote:

On Wed, Oct 24, 2012 at 7:23 PM, Mike Stump
 wrote:

On Oct 24, 2012, at 2:43 AM, Richard Biener 
wrote:

On Tue, Oct 23, 2012 at 6:12 PM, Kenneth Zadeck
 wrote:

On 10/23/2012 10:12 AM, Richard Biener wrote:

+  HOST_WIDE_INT val[2 * MAX_BITSIZE_MODE_ANY_INT /
HOST_BITS_PER_WIDE_INT];

are we sure this rounds properly?  Consider a port with max byte mode
size 4 on a 64bit host.

I do not believe that this can happen.  The core compiler includes all
modes up to TImode, so by default we already go up to 128 bits.

And mode bitsizes are always power-of-two?  I suppose so.

Actually, no, they are not.  Partial int modes can have bit sizes that
are not power of two, and, if there isn't an int mode that is
bigger, we'd
want to round up the partial int bit size.  Something like ((2 *
MAX_BITSIZE_MODE_ANY_INT + HOST_BITS_PER_WIDE_INT - 1) /
HOST_BITS_PER_WIDE_INT should do it.


I still would like to have the ability to provide specializations of
wide_int
for "small" sizes, thus ideally wide_int would be a template
templated
on the number of HWIs in val.  Interface-wise wide_int<2> should be
identical to double_int, thus we should be able to do

typedef wide_int<2> double_int;

If you want to go down this path after the patches get in, go for it.
I
see no use at all for this.
This was not meant to be a plug-in replacement for double_int.  The
goal of this patch is to get the compiler to do the constant math the way that
the target does it.  Any such instantiation is by definition placing some
predefined limit that some target may not want.

Well, what I don't really like is that we now have two implementations
of functions that perform integer math on two-HWI sized integers.  What
I also don't like too much is that we have two different interfaces to
operate
on them!  Can't you see how I come to not liking this?  Especially the
latter …

double_int is logically dead.  Refactoring wide-int and double-int is a
waste of time, as the time is better spent removing double-int from the
compiler.  All the necessary semantics and code of double-int _has_ been
refactored into wide-int already.  Changing wide-int in any way to vend
anything to double-int is wrong, as once double-int is removed,
then all the
api changes to make double-int share from wide-int is wasted and
must then
be removed.  The path forward is the complete removal of
double-int; it is
wrong, has been wrong and always will be wrong, nothing can change that.

double_int, compared to wide_int, is fast and lean.  I doubt we will
get rid of it - you
will make compile-time math a _lot_ slower.  Just profile when you for
example
change get_inner_reference to use wide_ints.

To be able to remove double_int in favor of wide_int requires _at least_
templating wide_int on 'len' and providing specializations for 1 and 2.

It might be a non-issue for math that operates on trees or RTXen due to
the allocation overhead we pay, but in recent years we transitioned
important
paths away from using tree math to using double_ints _for speed reasons_.

Richard.

I do not know why you believe this about the speed.  double_int always
does synthetic math, since you do everything at 128-bit precision.

The thing about wide_int is that, since it does math to the precision's
size, it almost never uses synthetic operations, since the sizes for
almost every instance can be done using the native math on the machine.
Almost every call has a check to see if the operation can be done
natively.
I seriously doubt that you are going to do TImode math much faster than I
do it, and if you do, who cares.

The number of calls does not affect the performance in any
negative way, and in fact it is more efficient, since common things that
require more than one operation in double-int are typically done in a
single operation.

Simple double-int operations like

inline double_int
double_int::and_not (double_int b) const
{
   double_int result;
   result.low = low & ~b.low;
   result.high = high & ~b.high;
   return result;
}

are always going to be faster than conditionally executing only one
operation
(but inside an offline function).

OK, this is really in reply to the 4.8 thing, but it felt more
appropriate here.

It's interesting that you gave this example, since before you were
complaining about too many fused ops.  Clearly this one could be
removed in favour of separate and() and not() operations, but why
not provide a fused one if there are clients who'll make use of it?

I was more concerned about fused o

Re: GCC 4.8.0 Status Report (2012-10-29), Stage 1 to end soon

2012-10-31 Thread Kenneth Zadeck
I was not planning to do that mangling for 4.8.  My primary
justification for getting it in publicly now is that there are a large
number of places where the current compiler (both at the tree and rtl
levels) does not do the optimization if the value is larger than a single
HWI.  My code generalizes all of these places so that they do the
transformations independent of the size of the HWI.  (In some cases at
the rtl level, the transformations were only done on 32-bit or smaller
types, but I have seen nothing like that at the tree level.)  This
provides benefits for cross compilers and for ports that support TImode
now.


The fact that I have chosen to do it in such a way that we will never
have this problem again is the part of the patch that Richi seems to
object to.


We have patches that do the mangling for 256 for the front ends, but we
figured that we would post those for comments.  These are likely to be
controversial because they require extensions to the syntax to accept
large constants.


But there is no reason why the patches that fix the existing problems in 
a general way should not be considered for this release.


Kenny

On 10/31/2012 10:27 AM, Jakub Jelinek wrote:

On Wed, Oct 31, 2012 at 10:04:58AM -0400, Kenneth Zadeck wrote:

If one looks at where Intel is going, they are doing exactly the
same thing.  The difference is that they like to add the
operations one at a time rather than just do a clean implementation
like we did.  Soon they will get there; it is just a matter of
time.

All I see on Intel is whole vector register shifts (and like on many other
ports and/or/xor/andn could be considered whole register too).
And, even if your port has 256-bit integer arithmetics, there is no mangling
for __int256_t or similar, so I don't see how you can offer such data type
as supported in the 4.8 timeframe.

Jakub




Re: patch to fix constant math - 4th patch - the wide-int class.

2012-10-31 Thread Kenneth Zadeck


On 10/31/2012 09:30 AM, Richard Sandiford wrote:

Richard Biener  writes:

But that means that wide_int has to model a P-bit operation as a
"normal" len*HOST_WIDE_INT operation and then fix up the result
after the fact, which seems unnecessarily convoluted.

It does that right now.  The operations are carried out in a loop
over len HOST_WIDE_INT parts, the last HWI is then special-treated
to account for precision/size.  (yes, 'len' is also used as optimization - the
fact that len ends up being mutable is another thing I dislike about
wide-int.  If wide-ints are cheap then all ops should be non-mutating
(at least to 'len')).

But the point of having a mutating len is that things like zero and -1
are common even for OImode values.  So if you're doing something potentially
expensive like OImode multiplication, why do it to the number of
HOST_WIDE_INTs needed for an OImode value when the value we're
processing has only one significant HOST_WIDE_INT?
I think with a little thought I can add some special constructors and
get rid of the mutating aspects of the interface.





  I still don't
see why a full-precision 2*HOST_WIDE_INT operation (or a full-precision
X*HOST_WIDE_INT operation for any X) has any special meaning.

Well, the same reason as a HOST_WIDE_INT variable has a meaning.
We use it to constrain what we (efficiently) want to work on.  For example
CCP might iterate up to 2 * HOST_BITS_PER_WIDE_INT times when
doing bit-constant-propagation in loops (for TImode integers on a x86_64 host).

But what about targets with modes wider than TImode?  Would double_int
still be appropriate then?  If not, why does CCP have to use a templated
type with a fixed number of HWIs (and all arithmetic done to a fixed
number of HWIs) rather than one that can adapt to the runtime values,
like wide_int can?


Oh, and I don't necessarily see a use of double_int in its current form
but for an integer representation on the host that is efficient to manipulate
integer constants of a target-dependent size.  For example the target
detail that we have partial integer modes with bitsize > precision and that
the bits > precision apparently have a meaning when looking at the
bit-representation of a constant should not be part of the base class
of wide-int (I doubt it belongs to wide-int at all, but I guess you know more
about the reason we track bitsize in addition to precision - I think it's
abstraction at the wrong level, the tree level does fine without knowing
about bitsize).

TBH I'm uneasy about the bitsize thing too.  I think bitsize is only
tracked for shift truncation, and if so, I agree it makes sense
to do that separately.

But anyway, this whole discussion seems to have reached a stalemate.
Or I suppose a de-facto rejection, since you're the only person in
a position to approve the thing :-)

Richard




Re: GCC 4.8.0 Status Report (2012-10-29), Stage 1 to end soon

2012-10-31 Thread Kenneth Zadeck

Jakub,

it is hard from all of the threads to actually distill what the real 
issues are here.  So let me start from a clean slate and state them simply.


Richi has three primary objections:

1) that we can do all of this with a templated version of double-int.
2) that we should not be passing a precision and bitsize into the
interface.

3) that the interface is too large.

I have attached a fragment of my patch #5 to illustrate the main thrust 
of my patches and to illustrate the usefulness to gcc right now.


In the current trunk, we have code that does simplification when the
mode fits in an HWI, and we have code that does the simplification if the
mode fits in two HWIs.  If the mode does not fit in two HWIs, the code
does not do the simplification.


Thus here, and in a large number of other places, we have two copies of
the code.  Richi wants there to be multiple template instantiations of
double-int.  This means that we are now going to have to have 3 copies
of this code to support OImode on a 64-bit host and 4 copies on a 32-bit
host.


Further note that there are not as many cases for the 2*HWI code
as there are for the HWI case, and in general this is true throughout
the compiler.  (CLRSB is missing from the 2-HWI case in the patch.)  We
really did not write twice the code when we started supporting 2 HWIs; we
added about 1.5 times the code (simplify-rtx is better than most of the
rest of the compiler).  I am using the rtl level as an example here
because I have posted all of those patches, but the tree level is no
better.


I do not want to write this code a third time and certainly not a fourth
time.  Just fixing all of this is quite useful now: it fills in a lot
of gaps in our transformations and it removes many edge-case crashes,
because TImode really is lightly tested.  However, this patch becomes
crucial as the world gets larger.


Richi's second point is that we should be doing everything at "infinite
precision" and not passing in an explicit bitsize and precision.  That
works OK (sans the issues I raised with it in tree-vrp earlier) when the
largest precision on the machine fits in a couple of HWIs.  However,
for targets that have large integers, or for cross compilers, this becomes
expensive.  The idea behind my set of patches is that, for the
transformations that can work this way, we do the math in the precision
of the type or mode.  In general this means that almost all of the math
will be done quickly, even on targets that support really big
integers.  For passes like tree-vrp, the math will be done at some
multiple of the largest type seen in the actual program.  The amount
of the multiple is a function of the optimization, not the target or the
host.  Currently (on my home computer) the wide-int interface allows the
optimization to go to 4x the largest mode on the target.


I can get rid of this bound at the expense of doing an alloca rather
than stack-allocating a fixed-size structure.  However, given the
extremely heavy use of this interface, that does not seem like the best
of tradeoffs.


The truth is that the vast majority of the compiler actually wants to
see the math done the way that it is going to be done on the machine.
Tree-vrp and the gimple constant propagation do not.  But I have made
accommodations to handle both needs.  I believe that the reason that
double-int was never used at the rtl level is that it does not actually
do the math in a way that is useful to the target.


Richi's third objection is that the interface is too large.  I
disagree.  It was designed based on the actual usage of the
interface.  When I found places where I was writing the same code over
and over again, I put it in a function as part of the interface.  I
later went back and optimized many of these because this is a very
heavily used interface.  Richi has many other objections, but I have
agreed to fix almost all of them, so I am not going to address them here.


It really will be a huge burden to have to carry these patches until the
next revision.  We are currently in stage 1, and I believe that the minor
issues that Richi raises can be easily addressed.


kenny
@@ -1373,302 +1411,87 @@ simplify_const_unary_operation (enum rtx_code code, enum machine_mode mode,
   return CONST_DOUBLE_FROM_REAL_VALUE (d, mode);
 }
 
-  if (CONST_INT_P (op)
-  && width <= HOST_BITS_PER_WIDE_INT && width > 0)
+  if (CONST_SCALAR_INT_P (op) && width > 0)
 {
-  HOST_WIDE_INT arg0 = INTVAL (op);
-  HOST_WIDE_INT val;
+  wide_int result;
+  enum machine_mode imode = op_mode == VOIDmode ? mode : op_mode;
+  wide_int op0 = wide_int::from_rtx (op, imode);
+
+#if TARGET_SUPPORTS_WIDE_INT == 0
+  /* This assert keeps the simplification from producing a result
+	 that cannot be represented in a CONST_DOUBLE but a lot of
+	 upstream callers expect that this function never fails to
+	 simplify something and so you if you a

Re: patch to canonize small wide-ints.

2013-10-03 Thread Kenneth Zadeck

Richard and others,

This patch changes the way that unsigned trees are canonicalized.  As
with the last patch, I think that this patch is not that much of a step
forward.  While it is still true that the signed tree-csts match the
representation of wide-ints, unsigned tree-csts do not and require
conversion on the way in and out, as before.  The only thing is that
the conversion is different.  Again, if Richard wants it in then I will
commit it.


Kenny
Index: gcc/tree.c
===
--- gcc/tree.c	(revision 203039)
+++ gcc/tree.c	(working copy)
@@ -1187,10 +1187,10 @@ wide_int_to_tree (tree type, const wide_
   tree t;
   int ix = -1;
   int limit = 0;
-  int i;
+  unsigned int i;
 
   gcc_assert (type);
-  int prec = TYPE_PRECISION (type);
+  unsigned int prec = TYPE_PRECISION (type);
   signop sgn = TYPE_SIGN (type);
 
   /* Verify that everything is canonical.  */
@@ -1204,11 +1204,11 @@ wide_int_to_tree (tree type, const wide_
 }
 
   wide_int cst = wide_int::from (pcst, prec, sgn);
-  int len = int (cst.get_len ());
-  int small_prec = prec & (HOST_BITS_PER_WIDE_INT - 1);
+  unsigned int len = int (cst.get_len ());
+  unsigned int small_prec = prec & (HOST_BITS_PER_WIDE_INT - 1);
   bool recanonize = sgn == UNSIGNED
-&& (prec + HOST_BITS_PER_WIDE_INT - 1) / HOST_BITS_PER_WIDE_INT == len
-&& small_prec;
+&& small_prec
+&& (prec + HOST_BITS_PER_WIDE_INT - 1) / HOST_BITS_PER_WIDE_INT == len;
 
   switch (TREE_CODE (type))
 {
@@ -1235,7 +1235,7 @@ wide_int_to_tree (tree type, const wide_
 
 case INTEGER_TYPE:
 case OFFSET_TYPE:
-  if (TYPE_UNSIGNED (type))
+  if (TYPE_SIGN (type) == UNSIGNED)
 	{
 	  /* Cache 0..N */
 	  limit = INTEGER_SHARE_LIMIT;
@@ -1294,7 +1294,7 @@ wide_int_to_tree (tree type, const wide_
 	 must be careful here because tree-csts and wide-ints are
 	 not canonicalized in the same way.  */
 	  gcc_assert (TREE_TYPE (t) == type);
-	  gcc_assert (TREE_INT_CST_NUNITS (t) == len);
+	  gcc_assert (TREE_INT_CST_NUNITS (t) == (int)len);
 	  if (recanonize)
 	{
 	  len--;
@@ -1321,7 +1321,10 @@ wide_int_to_tree (tree type, const wide_
 	  TREE_VEC_ELT (TYPE_CACHED_VALUES (type), ix) = t;
 	}
 }
-  else if (cst.get_len () == 1)
+  else if (cst.get_len () == 1
+	   && (TYPE_SIGN (type) == SIGNED
+	   || recanonize
+	   || cst.elt (0) >= 0))
 {
   /* 99.99% of all int csts will fit in a single HWI.  Do that one
 	 efficiently.  */
@@ -1351,14 +1354,29 @@ wide_int_to_tree (tree type, const wide_
 	 for the gc to take care of.  There will not be enough of them
 	 to worry about.  */
   void **slot;
-  tree nt = make_int_cst (len);
-  TREE_INT_CST_NUNITS (nt) = len;
+  tree nt;
+  if (!recanonize
+	  && TYPE_SIGN (type) == UNSIGNED 
+	  && cst.elt (len - 1) < 0)
+	{
+	  unsigned int blocks_needed 
+	= (prec + HOST_BITS_PER_WIDE_INT - 1) / HOST_BITS_PER_WIDE_INT;
+
+	  nt = make_int_cst (blocks_needed + 1);
+	  for (i = len; i < blocks_needed; i++)
+	TREE_INT_CST_ELT (nt, i) = (HOST_WIDE_INT)-1;
+
+	  TREE_INT_CST_ELT (nt, blocks_needed) = 0;
+	}
+  else
+	nt = make_int_cst (len);
   if (recanonize)
 	{
 	  len--;
 	  TREE_INT_CST_ELT (nt, len) = zext_hwi (cst.elt (len), small_prec);
 	}
-  for (int i = 0; i < len; i++)
+	
+  for (i = 0; i < len; i++)
 	TREE_INT_CST_ELT (nt, i) = cst.elt (i);
   TREE_TYPE (nt) = type;
 
@@ -10556,7 +10574,8 @@ widest_int_cst_value (const_tree x)
 
 #if HOST_BITS_PER_WIDEST_INT > HOST_BITS_PER_WIDE_INT
   gcc_assert (HOST_BITS_PER_WIDEST_INT >= HOST_BITS_PER_DOUBLE_INT);
-  gcc_assert (TREE_INT_CST_NUNITS (x) <= 2);
+  gcc_assert (TREE_INT_CST_NUNITS (x) <= 2
+	  || (TREE_INT_CST_NUNITS (x) == 3 && TREE_INT_CST_ELT (x, 2) == 0));
   
   if (TREE_INT_CST_NUNITS (x) == 1)
 val = ((HOST_WIDEST_INT)val << HOST_BITS_PER_WIDE_INT) >> HOST_BITS_PER_WIDE_INT;
@@ -10565,7 +10584,8 @@ widest_int_cst_value (const_tree x)
 	<< HOST_BITS_PER_WIDE_INT);
 #else
   /* Make sure the sign-extended value will fit in a HOST_WIDE_INT.  */
-  gcc_assert (TREE_INT_CST_NUNITS (x) == 1);
+  gcc_assert (TREE_INT_CST_NUNITS (x) == 1
+	  || (TREE_INT_CST_NUNITS (x) == 2 && TREE_INT_CST_ELT (x, 1) == 0));
 #endif
 
   if (bits < HOST_BITS_PER_WIDEST_INT)
Index: gcc/tree.h
===
--- gcc/tree.h	(revision 203039)
+++ gcc/tree.h	(working copy)
@@ -3050,7 +3050,8 @@ cst_fits_shwi_p (const_tree x)
   if (TREE_CODE (x) != INTEGER_CST)
 return false;
 
-  return TREE_INT_CST_NUNITS (x) == 1;
+  return TREE_INT_CST_NUNITS (x) == 1
+|| (TREE_INT_CST_NUNITS (x) == 2 && TREE_INT_CST_ELT (x, 1) == 0);
 }
 
 /* Checks that X is integer constant that can be expressed in signed
@@ -3093,7 +3094,7 @@ tree_fits_uhwi_p (const_tree cst)
   /* For numbers of unsigned type that are longer than a HWI, if
 	 the top bit of the bottom word 

Re: patch to canonize small wide-ints.

2013-10-03 Thread Kenneth Zadeck

On 10/03/2013 10:29 AM, Richard Sandiford wrote:

Kenneth Zadeck  writes:

+  /* Got to be careful of precision 0 values.  */
+  if (precision)
+len = MIN (len, max_len);
+  if (TYPE_SIGN (TREE_TYPE (x)) == UNSIGNED)
  {
-  if (precision < HOST_BITS_PER_WIDE_INT
- && TYPE_SIGN (TREE_TYPE (x)) == UNSIGNED)
+  unsigned int small_prec = precision & (HOST_BITS_PER_WIDE_INT - 1);
+  if (small_prec)
{
- /* The rep of wide-int is signed, so if the value comes from
-an unsigned int_cst, we have to sign extend it to make it
-correct.  */
- scratch[0] = sext_hwi (val[0], precision);
- return wi::storage_ref (scratch, 1, precision);
+ /* We have to futz with this because the canonization for
+short unsigned numbers in wide-int is different from the
+canonized short unsigned numbers in the tree-cst.  */
+ if (len == max_len)
+   {
+ for (unsigned int i = 0; i < len - 1; i++)
+   scratch[i] = val[i];
+ scratch[len - 1] = sext_hwi (val[len - 1], precision);
+ return wi::storage_ref (scratch, len, precision);
+   }
}
-  /* Otherwise we can use the constant as-is when not extending.  */
-  return wi::storage_ref (val, len, precision);
+
+  unsigned int xprecision = get_precision (x);
+  len = wi::force_to_size (scratch, val, len, xprecision, precision, 
UNSIGNED);
+  return wi::storage_ref (scratch, len, precision);
  }
  
-  /* Widen the constant according to its sign.  */

-  len = wi::force_to_size (scratch, val, len, xprecision, precision,
-  TYPE_SIGN (TREE_TYPE (x)));
-  return wi::storage_ref (scratch, len, precision);
+  /* Signed and the rest of the unsigned cases are easy.  */
+  return wi::storage_ref (val, len, precision);

AFAICT, no unsigned cases fall through to the "rest of the unsigned cases"
statement.

You are seeing the result of me being corner cased to death.



Changing the representation of unsigned constants is only worthwhile
if we can avoid the force_to_size for some unsigned cases.  I think we can
avoid it for precision >= xprecision && !small_prec.  Either we should take
the hit of doing that comparison (but see below) or the change isn't
worthwhile.
I think that it is something closer to precision >= xprecision +
HOST_BITS_PER_WIDE_INT && ...
The problem is that the tree-cst may have one extra block beyond the
precision.


For my example, say HBPWI is 8, not 64.  The uint16 constant 0xffff must
be represented by 3 HWIs in the tree-cst as 00 ff ff.  This is why the
patch came out today rather than yesterday.

I can fire up a run with this, if you want.


I was thinking that we should always be able to use the constant as-is
for max_wide_int-based and addr_wide_int-based operations.  The small_prec
Again, you can get edge-cased to death here.  I think it would work
for max because that really is bigger than anything else, but it is
possible (though unlikely) to have something big converted to an address
by truncation.



case will get optimised away (because precision is a compile-time constant)
and the force_to_size shouldn't be necessary (because "max" and "addr"
should always be wide enough).  Could we assert for precision >= xprecision
instead of the force_to_size?  Or are there cases in which we implicitly
truncate tree constants for (variable-precision) wide_int-based operations?
Again, the max is safe by definition.  The addr is inherently unsafe
because I believe that some of the places where the values are converted
to addr-wide-int come from tree-csts where the type comes from the
source program.


Thanks,
Richard




Re: patch to canonize unsigned tree-csts

2013-10-03 Thread Kenneth Zadeck



I was thinking that we should always be able to use the constant as-is
for max_wide_int-based and addr_wide_int-based operations.  The small_prec

Again, you can get edge-cased to death here.  I think it would work
for max because that really is bigger than anything else, but it is
possible (though unlikely) to have something big converted to an address
by truncation.

But I'd have expected that conversion to be represented by an explicit
CONVERT_EXPR or NOP_EXPR.  It seems wrong to use addr_wide_int directly on
something that isn't bit- or byte-address-sized.  It'd be the C equivalent
of int + long -> int rather than the expected int + long -> long.

Same goes for wide_int.  If we're doing arithmetic at a specific
precision, it seems odd for one of the inputs to be wider and yet
not have an explicit truncation.
You miss the second reason why we needed addr-wide-int.  A large number
of the places where the addressing arithmetic is done are not "type
safe".  Only the gimple and rtl that is translated from the source
code is really type safe.  In passes like the strength reduction,
where they just "grab things from all over", the addr-wide-int or the
max-wide-int are safe-haven structures that are guaranteed to be large
enough to not matter.  So what I fear here is something like a very
wide loop counter being grabbed into some address calculation.



Thanks,
Richard




Re: patch to canonicalize tree-csts.

2013-10-04 Thread Kenneth Zadeck
So here is a patch with the change.  As before, bootstrapped and tested on 
x86-64.


On 10/03/2013 12:16 PM, Richard Sandiford wrote:

Kenneth Zadeck  writes:

Changing the representation of unsigned constants is only worthwhile
if we can avoid the force_to_size for some unsigned cases.  I think we can
avoid it for precision >= xprecision && !small_prec.  Either we should take
the hit of doing that comparison (but see below) or the change isn't
worthwhile.

i think that it is something closer to precision >= xprecision +
HOST_BITS_PER_WIDE_INT && ...
The problem is that the tree cst may have one extra block beyond the
precision.

Ah, OK.


I was thinking that we should always be able to use the constant as-is
for max_wide_int-based and addr_wide_int-based operations.  The small_prec

again, you can get edge cased to death here.  i think it would work
for max because that really is bigger than anything else, but it is
possible (though unlikely) to have something big converted to an address
by truncation.

But I'd have expected that conversion to be represented by an explicit
CONVERT_EXPR or NOP_EXPR.  It seems wrong to use addr_wide_int directly on
something that isn't bit- or byte-address-sized.  It'd be the C equivalent
of int + long -> int rather than the expected int + long -> long.

Same goes for wide_int.  If we're doing arithmetic at a specific
precision, it seems odd for one of the inputs to be wider and yet
not have an explicit truncation.

Thanks,
Richard


Index: gcc/tree.c
===
--- gcc/tree.c	(revision 203039)
+++ gcc/tree.c	(working copy)
@@ -1187,10 +1187,10 @@ wide_int_to_tree (tree type, const wide_
   tree t;
   int ix = -1;
   int limit = 0;
-  int i;
+  unsigned int i;
 
   gcc_assert (type);
-  int prec = TYPE_PRECISION (type);
+  unsigned int prec = TYPE_PRECISION (type);
   signop sgn = TYPE_SIGN (type);
 
   /* Verify that everything is canonical.  */
@@ -1204,11 +1204,11 @@ wide_int_to_tree (tree type, const wide_
 }
 
   wide_int cst = wide_int::from (pcst, prec, sgn);
-  int len = int (cst.get_len ());
-  int small_prec = prec & (HOST_BITS_PER_WIDE_INT - 1);
+  unsigned int len = int (cst.get_len ());
+  unsigned int small_prec = prec & (HOST_BITS_PER_WIDE_INT - 1);
   bool recanonize = sgn == UNSIGNED
-&& (prec + HOST_BITS_PER_WIDE_INT - 1) / HOST_BITS_PER_WIDE_INT == len
-&& small_prec;
+&& small_prec
+&& (prec + HOST_BITS_PER_WIDE_INT - 1) / HOST_BITS_PER_WIDE_INT == len;
 
   switch (TREE_CODE (type))
 {
@@ -1235,7 +1235,7 @@ wide_int_to_tree (tree type, const wide_
 
 case INTEGER_TYPE:
 case OFFSET_TYPE:
-  if (TYPE_UNSIGNED (type))
+  if (TYPE_SIGN (type) == UNSIGNED)
 	{
 	  /* Cache 0..N */
 	  limit = INTEGER_SHARE_LIMIT;
@@ -1294,7 +1294,7 @@ wide_int_to_tree (tree type, const wide_
 	 must be careful here because tree-csts and wide-ints are
 	 not canonicalized in the same way.  */
 	  gcc_assert (TREE_TYPE (t) == type);
-	  gcc_assert (TREE_INT_CST_NUNITS (t) == len);
+	  gcc_assert (TREE_INT_CST_NUNITS (t) == (int)len);
 	  if (recanonize)
 	{
 	  len--;
@@ -1321,7 +1321,10 @@ wide_int_to_tree (tree type, const wide_
 	  TREE_VEC_ELT (TYPE_CACHED_VALUES (type), ix) = t;
 	}
 }
-  else if (cst.get_len () == 1)
+  else if (cst.get_len () == 1
+	   && (TYPE_SIGN (type) == SIGNED
+	   || recanonize
+	   || cst.elt (0) >= 0))
 {
   /* 99.99% of all int csts will fit in a single HWI.  Do that one
 	 efficiently.  */
@@ -1351,14 +1354,29 @@ wide_int_to_tree (tree type, const wide_
 	 for the gc to take care of.  There will not be enough of them
 	 to worry about.  */
   void **slot;
-  tree nt = make_int_cst (len);
-  TREE_INT_CST_NUNITS (nt) = len;
+  tree nt;
+  if (!recanonize
+	  && TYPE_SIGN (type) == UNSIGNED 
+	  && cst.elt (len - 1) < 0)
+	{
+	  unsigned int blocks_needed 
+	= (prec + HOST_BITS_PER_WIDE_INT - 1) / HOST_BITS_PER_WIDE_INT;
+
+	  nt = make_int_cst (blocks_needed + 1);
+	  for (i = len; i < blocks_needed; i++)
+	TREE_INT_CST_ELT (nt, i) = (HOST_WIDE_INT)-1;
+
+	  TREE_INT_CST_ELT (nt, blocks_needed) = 0;
+	}
+  else
+	nt = make_int_cst (len);
   if (recanonize)
 	{
 	  len--;
 	  TREE_INT_CST_ELT (nt, len) = zext_hwi (cst.elt (len), small_prec);
 	}
-  for (int i = 0; i < len; i++)
+	
+  for (i = 0; i < len; i++)
 	TREE_INT_CST_ELT (nt, i) = cst.elt (i);
   TREE_TYPE (nt) = type;
 
@@ -10556,7 +10574,8 @@ widest_int_cst_value (const_tree x)
 
 #if HOST_BITS_PER_WIDEST_INT > HOST_BITS_PER_WIDE_INT
   gcc_assert (HOST_BITS_PER_WIDEST_INT >= HOST_BITS_PER_DOUBLE_INT);
-  gcc_assert (TREE_INT_CST_NUNITS (x) <= 2);
+  gcc_assert (TREE_INT_CST_NUNITS (x) <= 2
+	  || (TREE_INT_CST_NUNITS (x) == 3 && TREE_INT_CST_ELT (x

Re: patch to canonize unsigned tree-csts

2013-10-04 Thread Kenneth Zadeck

On 10/04/2013 01:00 PM, Richard Sandiford wrote:

I was hoping Richard would weigh in here.  In case not...

Kenneth Zadeck  writes:

I was thinking that we should always be able to use the constant as-is
for max_wide_int-based and addr_wide_int-based operations.  The small_prec

again, you can get edge cased to death here.  i think it would work
for max because that really is bigger than anything else, but it is
possible (though unlikely) to have something big converted to an address
by truncation.

But I'd have expected that conversion to be represented by an explicit
CONVERT_EXPR or NOP_EXPR.  It seems wrong to use addr_wide_int directly on
something that isn't bit- or byte-address-sized.  It'd be the C equivalent
of int + long -> int rather than the expected int + long -> long.

Same goes for wide_int.  If we're doing arithmetic at a specific
precision, it seems odd for one of the inputs to be wider and yet
not have an explicit truncation.

you miss the second reason why we needed addr-wide-int.   A large number
of the places where the addressing arithmetic is done are not "type
safe".  Only the gimple and rtl that is translated from the source
code is really type safe.  In passes like the strength reduction
where they just "grab things from all over", the addr-wide-int or the
max-wide-int are safe haven structures that are guaranteed to be large
enough to not matter.  So what i fear here is something like a very
wide loop counter being grabbed into some address calculation.

It still seems really dangerous to be implicitly truncating a wider type
to addr_wide_int.  It's not something we'd ever do in mainline, because
uses of addr_wide_int are replacing uses of double_int, and double_int
is our current maximum-width representation.

Using addr_wide_int rather than max_wide_int is an optimisation.
IMO part of implementing that optimisation should be to introduce
explicit truncations whenever we try to use addr_wide_int to operate
on inputs that might be wider than addr_wide_int.

So I still think the assert is the way to go.
addr_wide_int is not as much of an optimization as it is documentation 
of what you are doing - i.e. this is addressing arithmetic.  My 
justification for putting it in was that we wanted a sort of an abstract 
type to say that this was not just user math, it was addressing 
arithmetic and that the ultimate result is going to be slammed into a 
target pointer.


I was only using that as an example to try to indicate that I did not 
think that it was wrong if someone did truncate.   In particular, would 
you want the assert to be that the value was truncated or that the type 
of the value would allow numbers that would be truncated?   I actually 
think neither.


If a programmer uses a long long on a 32 bit machine for some index 
variable and slams that into a pointer, he either knows what he is doing 
or has made a mistake.  Do you really think that the compiler should ice?




Thanks,
Richard





Re: patch to canonize unsigned tree-csts

2013-10-15 Thread Kenneth Zadeck

i added the assertion that richard requested and tested this on x86-64.

committed as revision 203602.


On 10/06/2013 05:13 AM, Richard Sandiford wrote:

Kenneth Zadeck  writes:

On 10/04/2013 01:00 PM, Richard Sandiford wrote:

I was hoping Richard would weigh in here.  In case not...

Kenneth Zadeck  writes:

I was thinking that we should always be able to use the constant as-is
for max_wide_int-based and addr_wide_int-based operations.  The small_prec

again, you can get edge cased to death here.  i think it would work
for max because that really is bigger than anything else, but it is
possible (though unlikely) to have something big converted to an address
by truncation.

But I'd have expected that conversion to be represented by an explicit
CONVERT_EXPR or NOP_EXPR.  It seems wrong to use addr_wide_int directly on
something that isn't bit- or byte-address-sized.  It'd be the C equivalent
of int + long -> int rather than the expected int + long -> long.

Same goes for wide_int.  If we're doing arithmetic at a specific
precision, it seems odd for one of the inputs to be wider and yet
not have an explicit truncation.

you miss the second reason why we needed addr-wide-int.   A large number
of the places where the addressing arithmetic is done are not "type
safe".  Only the gimple and rtl that is translated from the source
code is really type safe.  In passes like the strength reduction
where they just "grab things from all over", the addr-wide-int or the
max-wide-int are safe haven structures that are guaranteed to be large
enough to not matter.  So what i fear here is something like a very
wide loop counter being grabbed into some address calculation.

It still seems really dangerous to be implicitly truncating a wider type
to addr_wide_int.  It's not something we'd ever do in mainline, because
uses of addr_wide_int are replacing uses of double_int, and double_int
is our current maximum-width representation.

Using addr_wide_int rather than max_wide_int is an optimisation.
IMO part of implementing that optimisation should be to introduce
explicit truncations whenever we try to use addr_wide_int to operate
on inputs that might be wider than addr_wide_int.

So I still think the assert is the way to go.

addr_wide_int is not as much of an optimization as it is documentation
of what you are doing - i.e. this is addressing arithmetic.  My
justification for putting it in was that we wanted a sort of an abstract
type to say that this was not just user math, it was addressing
arithmetic and that the ultimate result is going to be slammed into a
target pointer.

I was only using that as an example to try to indicate that I did not
think that it was wrong if someone did truncate.   In particular, would
you want the assert to be that the value was truncated or that the type
of the value would allow numbers that would be truncated?   I actually
think neither.

I'm arguing for:

 gcc_assert (precision >= xprecision);

in wi::int_traits ::decompose.

IIRC one of the reasons for wanting addr_wide_int rather than wide_int
was that we wanted a type that could handle both bit and byte sizes.
And we wanted to be able to convert between bits and bytes seamlessly.
That means that shifting is a valid operation for addr_wide_int.  But if
we also implicitly (and that's the key word) used addr_wide_int
directly on tree constants that are wider than addr_wide_int, and say
shifted the result right, the answer would be different from what you'd
get if you did the shift in max_wide_int.  That seems like new behaviour,
since all address arithmetic is effectively done to maximum precision on
mainline.  It's just that the maximum on mainline is rather small.

If code is looking through casts to see wider-than-addr_wide_int types,
I think it's reasonable for that code to have to explicitly force the
tree to addr_wide_int size, via addr_wide_int::from.  Leaving it implicit
seems too subtle and also means that every caller to wi::int_traits
::decompose does a check that is usually unnecessary.


If a programmer uses a long long on a 32 bit machine for some index
variable and slams that into a pointer, he either knows what he is doing
or has made a mistake.  Do you really think that the compiler should ice?

No, I'm saying that passes that operate on addr_wide_ints while "grabbing
trees from all over" (still not sure what that means in practice) should
explicitly mark places where a truncation is deliberately allowed.
Those places then guarantee that the dropped bits wouldn't affect any of
the later calculations, which is something only the pass itself can know.

We already forbid direct assignments like:

addr_wide_int x = max_wide_int(...);

at compile time, for similar reasons.

Thanks,
Richard


Index: gcc/tree.c
===
--- gcc/tree.c	(revisio

Re: [wide-int] int_traits

2013-10-16 Thread Kenneth Zadeck

On 10/16/2013 09:55 AM, Richard Biener wrote:

Speaking in terms of a patch:

Index: tree.h
===
--- tree.h  (revision 203690)
+++ tree.h  (working copy)
@@ -5184,14 +5184,11 @@ wi::int_traits ::decompose (
   / HOST_BITS_PER_WIDE_INT);
unsigned int xprecision = get_precision (x);
  
-  gcc_assert (precision >= xprecision);

+  gcc_checking_assert (precision >= xprecision && precision != 0);
  
-  /* Got to be careful of precision 0 values.  */

-  if (precision)
-len = MIN (len, max_len);
if (TYPE_SIGN (TREE_TYPE (x)) == UNSIGNED)
  {
-  unsigned int small_prec = precision & (HOST_BITS_PER_WIDE_INT - 1);
+  unsigned int small_prec = xprecision & (HOST_BITS_PER_WIDE_INT - 1);
if (small_prec)
 {
   /* We have to futz with this because the canonization for
@@ -5201,7 +5198,7 @@ wi::int_traits ::decompose (
 {
   for (unsigned int i = 0; i < len - 1; i++)
 scratch[i] = val[i];
- scratch[len - 1] = sext_hwi (val[len - 1], precision);
+ scratch[len - 1] = sext_hwi (val[len - 1], small_prec);
   return wi::storage_ref (scratch, len, precision);
 }
 }

the precision 0 thing is gone as far as I understand?
There is still the case where the front ends create types with precision 
0, and that causes a few odd cases, but the precision 0 that you refer 
to is gone.

Also sign-extending the upper hwi must be done with precision %
HOST_BITS_PER_WIDE_INT, no?  And it assumes that we only ever
have to extend the upper most HWI, thus 'len' is never too big.

yes

I note that we immediately return in the above case, so

   if (precision < xprecision + HOST_BITS_PER_WIDE_INT)
 {
   len = wi::force_to_size (scratch, val, len, xprecision,
precision, UNSIGNED);
   return wi::storage_ref (scratch, len, precision);
 }

applies only when the desired precision is a multiple of a HWI.

yes

I assume it adds that extra zero word in case the MSB is set,
right?

yes

I am confused about the condition written here and
how we look at precision and not xprecision when deciding how
to extend - given a 8-bit precision unsigned 0xff and precision == 64
we do not even consider sign-extending the value because we look
at precision and not xprecision.  Don't we need to look at
xprecision here?

I do not think so.   There are three cases here:

1) xprecision < precision:
say xprecision = 8 and precision = 32.

The number is between 0 and 255 and after canonicalization the number 
will be between 0 and 255.


2) xprecision == precision
not much to talk about here.

3) xprecision > precision

We need to lose the upper bits of the input and it does not matter what 
they were.  The only thing that we are concerned about is that the 
bits above what is now the sign bit pointed to by precision are matched 
in the bits above precision.



I do think we could change the "if (precision < xprecision + 
HWI)" to become "else if (...)", which will save a test.

After all an assert precision == xprecision
does not work in this routine.

Quickly execute.exp tested only.

To avoid code quality wreckage we have to implement a different way
of allocating the 'scratch' storage of wide_int_ref_storage
(well, ideally we wouldn't need it ...).  I thought of simply
allocating it via alloca (), but we need to do that in all
callers that build a wide_int_ref (eventually via a hidden
default arg).  But as that's a template specialization of
generic_wide_int ... so the option is to provide a function
wrapper inside wi:: for this, like
I want richard and mike to be the people who respond to the next 
point.  I am not a c++ person and all of this storage manager layer 
stuff gives me a headache.



template 
wide_int_ref
wi::ref (const T &t, HOST_WIDE_INT *scratch = XALLOCAVEC (HOST_WIDE_INT,
WIDE_INT_MAX_ELTS))
{
   return wide_int_ref (t, scratch);
}

and amend the storage constructors to get a scratch argument.

The reason for all this is that otherwise the wide_int_ref object
escapes when we pass the scratch storage to the workers in
int_traits .

Thanks,
Richard.




powerpc support for TARGET_SUPPORTS_WIDE_INT.

2013-10-16 Thread Kenneth Zadeck
This patch makes the PPC the first public target to use the 
CONST_WIDE_INT representation rather than CONST_DOUBLES to represent 
larger integers.   There are two  parts to this patch.   The changes to 
genrecog.c represent fixes that would be needed for any port that 
supports CONST_WIDE_INTS and the rest are changes specific to the rs6000 
port.   Bootstrapped and tested on a ppc64.


Committed as revision 204710.

Kenny
Index: gcc/genrecog.c
===
--- gcc/genrecog.c	(revision 203701)
+++ gcc/genrecog.c	(working copy)
@@ -588,6 +588,7 @@ validate_pattern (rtx pattern, rtx insn,
 		 && GET_CODE (src) != PC
 		 && GET_CODE (src) != CC0
 		 && !CONST_INT_P (src)
+		 && !CONST_WIDE_INT_P (src)
 		 && GET_CODE (src) != CALL)
 	  {
 	const char *which;
@@ -772,13 +773,14 @@ add_to_sequence (rtx pattern, struct dec
 
 	   We can optimize the generated code a little if either
 	   (a) the predicate only accepts one code, or (b) the
-	   predicate does not allow CONST_INT, in which case it
-	   can match only if the modes match.  */
+	   predicate does not allow CONST_INT or CONST_WIDE_INT,
+	   in which case it can match only if the modes match.  */
 	pred = lookup_predicate (pred_name);
 	if (pred)
 	  {
 		test->u.pred.data = pred;
-		allows_const_int = pred->codes[CONST_INT];
+		allows_const_int = (pred->codes[CONST_INT]
+|| pred->codes[CONST_WIDE_INT]);
 		if (was_code == MATCH_PARALLEL
 		&& pred->singleton != PARALLEL)
 		  message_with_line (pattern_lineno,
Index: gcc/recog.c
===
--- gcc/recog.c	(revision 203701)
+++ gcc/recog.c	(working copy)
@@ -1198,7 +1198,8 @@ const_scalar_int_operand (rtx op, enum m
 
 /* Returns 1 if OP is an operand that is a CONST_WIDE_INT of mode
MODE.  This most likely is not as useful as
-   const_scalar_int_operand, but is here for consistancy.  */
+   const_scalar_int_operand since it does not accept CONST_INTs, but
+   is here for consistancy.  */
 int
 const_wide_int_operand (rtx op, enum machine_mode mode)
 {
Index: gcc/config/rs6000/predicates.md
===
--- gcc/config/rs6000/predicates.md	(revision 203701)
+++ gcc/config/rs6000/predicates.md	(working copy)
@@ -19,7 +19,7 @@
 
 ;; Return 1 for anything except PARALLEL.
 (define_predicate "any_operand"
-  (match_code "const_int,const_double,const,symbol_ref,label_ref,subreg,reg,mem"))
+  (match_code "const_int,const_double,const_wide_int,const,symbol_ref,label_ref,subreg,reg,mem"))
 
 ;; Return 1 for any PARALLEL.
 (define_predicate "any_parallel_operand"
@@ -596,7 +596,7 @@ (define_predicate "easy_vector_constant_
 
 ;; Return 1 if operand is constant zero (scalars and vectors).
 (define_predicate "zero_constant"
-  (and (match_code "const_int,const_double,const_vector")
+  (and (match_code "const_int,const_double,const_wide_int,const_vector")
(match_test "op == CONST0_RTX (mode)")))
 
 ;; Return 1 if operand is 0.0.
@@ -790,7 +790,7 @@ (define_predicate "logical_operand"
 ;; Return 1 if op is a constant that is not a logical operand, but could
 ;; be split into one.
 (define_predicate "non_logical_cint_operand"
-  (and (match_code "const_int,const_double")
+  (and (match_code "const_int,const_wide_int")
(and (not (match_operand 0 "logical_operand"))
 	(match_operand 0 "reg_or_logical_cint_operand"
 
@@ -1058,7 +1058,7 @@ (define_predicate "current_file_function
 ;; Return 1 if this operand is a valid input for a move insn.
 (define_predicate "input_operand"
   (match_code "symbol_ref,const,reg,subreg,mem,
-	   const_double,const_vector,const_int")
+	   const_double,const_wide_int,const_vector,const_int")
 {
   /* Memory is always valid.  */
   if (memory_operand (op, mode))
@@ -1071,8 +1071,7 @@ (define_predicate "input_operand"
 
   /* Allow any integer constant.  */
   if (GET_MODE_CLASS (mode) == MODE_INT
-  && (GET_CODE (op) == CONST_INT
-	  || GET_CODE (op) == CONST_DOUBLE))
+  && CONST_SCALAR_INT_P (op))
 return 1;
 
   /* Allow easy vector constants.  */
@@ -,7 +1110,7 @@ (define_predicate "input_operand"
 ;; Return 1 if this operand is a valid input for a vsx_splat insn.
 (define_predicate "splat_input_operand"
   (match_code "symbol_ref,const,reg,subreg,mem,
-	   const_double,const_vector,const_int")
+	   const_double,const_wide_int,const_vector,const_int")
 {
   if (MEM_P (op))
 {
Index: gcc/config/rs6000/darwin.h
===
--- gcc/config/rs6000/darwin.h	(revision 203701)
+++ gcc/config/rs6000/darwin.h	(working copy)
@@ -421,3 +421,4 @@ do {	\
 /* So far, there is no rs6000_fold_builtin, if one is introduced, then
this will need to be modified similar to the x86 case.  */
 #define TARGET_FOLD_BUILTIN SUBTARGET_FOLD_BUILTIN
+
Index: gcc/config/r

Re: [wide-int] int_traits

2013-10-16 Thread Kenneth Zadeck

On 10/16/2013 02:45 PM, Richard Sandiford wrote:

Kenneth Zadeck  writes:

I note that we immediately return in the above case, so

if (precision < xprecision + HOST_BITS_PER_WIDE_INT)
  {
len = wi::force_to_size (scratch, val, len, xprecision,
precision, UNSIGNED);
return wi::storage_ref (scratch, len, precision);
  }

applies only when the desired precision is a multiple of a HWI.

yes

I assume it adds that extra zero word in case the MSB is set,
right?

yes

I am confused about the condition written here and
how we look at precision and not xprecision when deciding how
to extend - given a 8-bit precision unsigned 0xff and precision == 64
we do not even consider sign-extending the value because we look
at precision and not xprecision.  Don't we need to look at
xprecision here?

I do not think so.   There are three cases here:

1) xprecision < precision:
say xprecision = 8 and precision = 32.

The number is between 0 and 255 and after canonicalization the number
will be between 0 and 255.

2) xprecision == precision
not much to talk about here.

3) xprecision > precision

We need to lose the upper bits of the input and it does not matter what
they were.  The only thing that we are concerned about is that the
bits above what is now the sign bit pointed to by precision are matched
in the bits above precision.

(3) is gone with the assert though.

you seem to be correct here.

As mentioned in my message yesterday, I thought your new way of canonising
unsigned tree constants meant that there was always an upper zero bit.
Is that right?

i believe this is correct.

If so, xprecision < precision is a no-op, because the number always
has the right form for wider precisions.  The only difficult case is
xprecision == precision, since then we need to peel off any upper -1 HWIs.
say my HWI size is 8 bits (just to keep from typing a million 'f's).   if 
i have a 16 bit unsigned number that is all 1s, in the tree world it is 3 
hwis

0x00 0xff 0xff.

but inside regular wide int, it would take 1 hwi whose value is 0xff.
inside of max it would be the same as the tree, but then the test 
precision < xprecision + hbpwi never kicks in because precision is 
guaranteed to be huge.


inside of addr_wide_int, i think we tank with the assertion.

the case actually comes up on the ppc because they do a lot of 128 bit 
math.  I think i got thru the x86-64 without noticing this.





If that's right and xprecision == precision can use val with an adjusted len,
then...


After all an assert precision == xprecision
does not work in this routine.

Quickly execute.exp tested only.

To avoid code quality wreckage we have to implement a different way
of allocating the 'scratch' storage of wide_int_ref_storage
(well, ideally we wouldn't need it ...).  I thought of simply
allocating it via alloca (), but we need to do that in all
callers that build a wide_int_ref (eventually via a hidden
default arg).  But as that's a template specialization of
generic_wide_int ... so the option is to provide a function
wrapper inside wi:: for this, like

I want richard and mike to be the people who respond to the next
point.  I am not a c++ person and all of this storage manager layer
stuff gives me a headache.

...the use of the scratch array goes away completely for addr_wide_int
and max_wide_int.  If this code is on a hot path for wide_int then maybe
we should simply require that wide_int operations involving trees have
explicit extensions, like in rtl.  (I.e. only do the implicit extension
for addr_wide_int and max_wide_int.)

i do not understand how this is true.

I'm not opposed to going back to a separate scratch array if all else fails,
but it's really ugly, so I'd like to look at the alternatives first...

Thanks,
Richard




Re: patch to canonize unsigned tree-csts

2013-10-16 Thread Kenneth Zadeck

On 10/15/2013 02:30 PM, Richard Sandiford wrote:

Richard Sandiford  writes:

   if (small_prec)
 ;
   else if (precision == xprecision)
 while (len >= 0 && val[len - 1] == -1)
   len--;

Err, len > 0 obviously.
you were only close.  Patch tested on ppc and committed as revision 
203739.


Index: gcc/tree.c
===
--- gcc/tree.c  (revision 203701)
+++ gcc/tree.c  (working copy)
@@ -1204,7 +1204,7 @@ wide_int_to_tree (tree type, const wide_
 }

   wide_int cst = wide_int::from (pcst, prec, sgn);
-  unsigned int len = int (cst.get_len ());
+  unsigned int len = cst.get_len ();
   unsigned int small_prec = prec & (HOST_BITS_PER_WIDE_INT - 1);
   bool recanonize = sgn == UNSIGNED
 && small_prec
Index: gcc/tree.h
===
--- gcc/tree.h  (revision 203701)
+++ gcc/tree.h  (working copy)
@@ -5204,13 +5204,13 @@ wi::int_traits ::decompose (
  scratch[len - 1] = sext_hwi (val[len - 1], precision);
  return wi::storage_ref (scratch, len, precision);
}
-   }
-
-  if (precision < xprecision + HOST_BITS_PER_WIDE_INT)
-   {
- len = wi::force_to_size (scratch, val, len, xprecision, precision, UNSIGNED);
- return wi::storage_ref (scratch, len, precision);
-   }
+   }
+  /* We have to futz here because a large unsigned int with
+precision 128 may look (0x0 0xF...F 0xF...F) as a
+tree-cst and as (0xF...F) as a wide-int.  */
+  else if (precision == xprecision && len == max_len)
+while (len > 1 && val[len - 1] == (HOST_WIDE_INT)-1)
+  len--;
 }

   /* Signed and the rest of the unsigned cases are easy.  */



Re: [wide-int] int_traits

2013-10-17 Thread Kenneth Zadeck

On 10/17/2013 04:46 AM, Richard Biener wrote:

the case actually comes up on the ppc because they do a lot of 128 bit
math.  I think i got thru the x86-64 without noticing this.

Well, it'd be suspicious if we're directly using 128-bit numbers
in addr_wide_int.  The justification for the assertion was that we
should explicitly truncate to addr_wide_int when deliberately
ignoring upper bits, beyond bit or byte address width.  128 bits
definitely falls into that category on powerpc.

My question is whether with 8-bit HWI 0x00 0xff 0xff is a valid
wide-int value if it has precision 16.  AFAIK that is what the
code produces, but now Kenny says this is only for some kind
of wide-ints but not all?  That is, why is


The issue is not that the rules are different between the different 
flavors of wide int, it is that the circumstances are different.  The 
rule is that bits above the precision exist only if the precision is 
not an even multiple of the HBPWI.  In that case, the bits 
are always an extension of the sign bits.


max_wide_int and addr_wide_int have large enough precisions so that it 
is impossible to ever generate an unsigned number on the target that is 
large enough to ever run against the precision.  However, for the 
fixed precision case, you can, and at least on the ppc, you do, generate 
unsigned numbers that are big enough to have to be recanonicalized like 
this.



inline wi::storage_ref
wi::int_traits ::decompose (HOST_WIDE_INT *scratch,
 unsigned int precision, const_tree
x)
{
   unsigned int len = TREE_INT_CST_NUNITS (x);
   const HOST_WIDE_INT *val = (const HOST_WIDE_INT *) &TREE_INT_CST_ELT (x,
0);
   return wi::storage_ref (val, len, precision);
}

not a valid implementation together with making sure that the
INTEGER_CST tree rep has that extra word of zeros if required?
I would like to see us move in that direction (given that the
tree rep of INTEGER_CST has transitioned to variable-length already).

Btw, code such as

tree
wide_int_to_tree (tree type, const wide_int_ref &pcst)
{
...
   unsigned int small_prec = prec & (HOST_BITS_PER_WIDE_INT - 1);
   bool recanonize = sgn == UNSIGNED
 && small_prec
 && (prec + HOST_BITS_PER_WIDE_INT - 1) / HOST_BITS_PER_WIDE_INT ==
len;

definitely needs a comment.  I would have thought _all_ unsigned
numbers need re-canonicalization.  Well, maybe only if we're
forcing the extra zeros.

It will get a comment.

[This function shows another optimization issue:

 case BOOLEAN_TYPE:
   /* Cache false or true.  */
   limit = 2;
   if (wi::leu_p (cst, 1))
 ix = cst.to_uhwi ();

I would have expected cst <= 1 to be optimized to cst.len == 1 &&
cst.val[0] <= 1.  It expands to

:
   MEM[(long int *)&D.50698 + 16B] = 1;
   MEM[(struct wide_int_ref_storage *)&D.50698] = &MEM[(struct
wide_int_ref_storage *)&D.50698].scratch;
   MEM[(struct wide_int_ref_storage *)&D.50698 + 8B] = 1;
   MEM[(struct wide_int_ref_storage *)&D.50698 + 12B] = 32;
   _277 = MEM[(const struct wide_int_storage *)&cst + 260B];
   if (_277 <= 64)
 goto ;
   else
 goto ;

   :
   xl_491 = zext_hwi (1, 32);  // ok, checking enabled and thus out-of-line
   _494 = MEM[(const long int *)&cst];
   _495 = (long unsigned int) _494;
   yl_496 = zext_hwi (_495, _277);
   _497 = xl_491 < yl_496;
   goto ;

   :
   _503 = wi::ltu_p_large (&MEM[(struct wide_int_ref_storage
*)&D.50698].scratch, 1, 32, &MEM[(const struct wide_int_storage
*)&cst].val, len_274, _277);

this keeps D.50698 and cst un-SRAable - inline storage is problematic
for this reason.  But the representation should guarantee the
compare with a low precision (32 bit) constant is evaluatable
at compile-time if len of the larger value is > 1, no?

   :
   # _504 = PHI <_497(42), _503(43)>
   D.50698 ={v} {CLOBBER};
   if (_504 != 0)
 goto ;
   else
 goto ;

   :
   pretmp_563 = MEM[(const struct wide_int_storage *)&cst + 256B];
   goto  ();

   :
   _65 = generic_wide_int::to_uhwi (&cst, 0);
   ix_66 = (int) _65;
   goto ;

The question is whether we should try to optimize wide-int for
such cases or simply not use wi:leu_p (cst, 1) but rather

  if (cst.fits_uhwi_p () == 1 && cst.to_uhwi () < 1)

?
i find this ugly, but i see where you are coming from.   The problem is 
that both you and i know that the len has to be 1, but the optimizer 
does not.   This is a case where I think that we made a mistake getting 
rid of the wi::one_p, wi::zero_p and wi::minus_one_p.  The internals of 
one_p were return (len == 1 && val[0] == 1) and i think that is much 
nicer than what you put there.  On the other hand, it seems that a 
person more skilled than i am with c++ could specialize the comparisons 
with an integer constant, since i believe that that constant must fit in 
one hwi (I am a little concerned about large unsigned constants).



Thanks,
Richard






Re: [wide-int] int_traits

2013-10-17 Thread Kenneth Zadeck

On 10/17/2013 07:49 AM, Richard Biener wrote:

On Thu, 17 Oct 2013, Kenneth Zadeck wrote:


On 10/17/2013 04:46 AM, Richard Biener wrote:

the case actually comes up on the ppc because they do a lot of 128 bit
math.  I think i got thru the x86-64 without noticing this.

Well, it'd be suspicious if we're directly using 128-bit numbers
in addr_wide_int.  The justification for the assertion was that we
should explicitly truncate to addr_wide_int when deliberately
ignoring upper bits, beyond bit or byte address width.  128 bits
definitely falls into that category on powerpc.

My question is whether with 8-bit HWI 0x00 0xff 0xff is a valid
wide-int value if it has precision 16.  AFAIK that is what the
code produces, but now Kenny says this is only for some kind
of wide-ints but not all?  That is, why is

The issue is not that the rules are different between the different flavors of
wide int; it is that the circumstances are different.  The rule is that bits
above the precision exist only if the precision is not an even multiple of
HBPWI.  In that case, those bits are always an extension of the sign bits.

max_wide_int and addr_wide_int have large enough precisions so that it is
impossible to ever generate an unsigned number on the target that is large
enough to ever run against the precision.  However, for the fixed precision
case, you can, and at least on the ppc, you do, generate unsigned numbers that
are big enough to have to be recanonicalized like this.
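
The rule stated above - bits above the precision exist only when the precision is not an even multiple of HOST_BITS_PER_WIDE_INT, and those bits must then be a sign extension - can be sketched in a few lines of standalone C++ (toy code with our own names, not GCC's; HWI narrowed to int64_t for illustration):

```cpp
#include <cassert>
#include <cstdint>

typedef int64_t HWI;
const int HOST_BITS_PER_WIDE_INT = 64;

// Sign-extend X from its low PREC bits (PREC in 1..63).
HWI sext_hwi (HWI x, int prec)
{
  int shift = HOST_BITS_PER_WIDE_INT - prec;
  // Shift through unsigned to avoid UB on the left shift; the arithmetic
  // right shift of a negative value is what GCC's hosts provide.
  return (HWI) ((uint64_t) x << shift) >> shift;
}

// The topmost block VAL of a PREC-bit value is canonical iff the bits
// above the precision agree with the sign bit.
bool top_block_canonical_p (HWI val, int prec)
{
  int small_prec = prec % HOST_BITS_PER_WIDE_INT;
  if (small_prec == 0)
    return true;  // precision is an even multiple of HBPWI: no excess bits
  return sext_hwi (val, small_prec) == val;
}
```

So with a 16-bit precision, a top block of all ones (seen as -1) is canonical, while 0x000000000000ffff is not - its excess bits are zeros, not copies of the set sign bit.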


inline wi::storage_ref
wi::int_traits <const_tree>::decompose (HOST_WIDE_INT *scratch,
  unsigned int precision, const_tree
x)
{
unsigned int len = TREE_INT_CST_NUNITS (x);
const HOST_WIDE_INT *val = (const HOST_WIDE_INT *) &TREE_INT_CST_ELT (x,
0);
return wi::storage_ref (val, len, precision);
}

not a valid implementation together with making sure that the
INTEGER_CST tree rep has that extra word of zeros if required?
I would like to see us move in that direction (given that the
tree rep of INTEGER_CST has transitioned to variable-length already).

Btw, code such as

tree
wide_int_to_tree (tree type, const wide_int_ref &pcst)
{
...
unsigned int small_prec = prec & (HOST_BITS_PER_WIDE_INT - 1);
bool recanonize = sgn == UNSIGNED
  && small_prec
  && (prec + HOST_BITS_PER_WIDE_INT - 1) / HOST_BITS_PER_WIDE_INT ==
len;

definitely needs a comment.  I would have thought _all_ unsigned
numbers need re-canonicalization.  Well, maybe only if we're
forcing the extra zeros.

It will get a comment.

[This function shows another optimization issue:

  case BOOLEAN_TYPE:
/* Cache false or true.  */
limit = 2;
if (wi::leu_p (cst, 1))
  ix = cst.to_uhwi ();

I would have expected cst <= 1 be optimized to cst.len == 1 &&
cst.val[0] <= 1.  It expands to

:
MEM[(long int *)&D.50698 + 16B] = 1;
MEM[(struct wide_int_ref_storage *)&D.50698] = &MEM[(struct
wide_int_ref_storage *)&D.50698].scratch;
MEM[(struct wide_int_ref_storage *)&D.50698 + 8B] = 1;
MEM[(struct wide_int_ref_storage *)&D.50698 + 12B] = 32;
_277 = MEM[(const struct wide_int_storage *)&cst + 260B];
if (_277 <= 64)
  goto ;
else
  goto ;

:
xl_491 = zext_hwi (1, 32);  // ok, checking enabled and thus out-of-line
_494 = MEM[(const long int *)&cst];
_495 = (long unsigned int) _494;
yl_496 = zext_hwi (_495, _277);
_497 = xl_491 < yl_496;
goto ;

:
_503 = wi::ltu_p_large (&MEM[(struct wide_int_ref_storage
*)&D.50698].scratch, 1, 32, &MEM[(const struct wide_int_storage
*)&cst].val, len_274, _277);

this keeps D.50698 and cst un-SRAable - inline storage is problematic
for this reason.  But the representation should guarantee the
compare with a low precision (32 bit) constant is evaluatable
at compile-time if len of the larger value is > 1, no?

:
# _504 = PHI <_497(42), _503(43)>
D.50698 ={v} {CLOBBER};
if (_504 != 0)
  goto ;
else
  goto ;

:
pretmp_563 = MEM[(const struct wide_int_storage *)&cst + 256B];
goto  ();

:
_65 = generic_wide_int::to_uhwi (&cst, 0);
ix_66 = (int) _65;
goto ;

The question is whether we should try to optimize wide-int for
such cases or simply not use wi::leu_p (cst, 1) but rather

   if (cst.fits_uhwi_p () == 1 && cst.to_uhwi () < 1)

?

i find this ugly, but i see where you are coming from.   The problem is that
both you and i know that the len has to be 1, but the optimizer does not.
This is a case where I think that we made a mistake getting rid of the
wi::one_p, wi::zero_p and wi::minus_one_p.  The internals of one_p were return
(len == 1 && val[0] == 1) and i think that is much nicer than what you put
there.  On the other hand, it seems that a person more skilled than i am
with c++ could specia

Re: [wide-int] int_traits

2013-10-17 Thread Kenneth Zadeck

On 10/17/2013 08:29 AM, Richard Biener wrote:

On Thu, 17 Oct 2013, Richard Sandiford wrote:


Richard Biener  writes:

The new tree representation can have a length greater than max_len
for an unsigned tree constant that occupies a whole number of HWIs.
The tree representation of an unsigned 0x8000 is 0x00 0x80 0x00.
When extended to max_wide_int the representation is the same.
But a 2-HWI addr_wide_int would be 0x80 0x00, without the leading zero.
The MIN trims the length from 3 to 2 in the last case.

Oh, so it was the tree rep that changed?  _Why_ was it changed?
We still cannot use it directly from wide-int and the extra
word is redundant because we have access to TYPE_UNSIGNED (TREE_TYPE ()).

It means that we can now use the tree's HWI array directly, without any
copying, for addr_wide_int and max_wide_int.  The only part of decompose ()
that does a copy is the small_prec case, which is trivially compiled out
for addr_wide_int and max_wide_int.

" 2) addr_wide_int.  This is a fixed size representation that is
  guaranteed to be large enough to compute any bit or byte sized
  address calculation on the target.  Currently the value is 64 + 4
  bits rounded up to the next number even multiple of
  HOST_BITS_PER_WIDE_INT (but this can be changed when the first
  port needs more than 64 bits for the size of a pointer).

  This flavor can be used for all address math on the target.  In
  this representation, the values are sign or zero extended based
  on their input types to the internal precision.  All math is done
  in this precision and then the values are truncated to fit in the
  result type.  Unlike most gimple or rtl intermediate code, it is
  not useful to perform the address arithmetic at the same
  precision in which the operands are represented because there has
  been no effort by the front ends to convert most addressing
  arithmetic to canonical types.

  In the addr_wide_int, all numbers are represented as signed
  numbers.  There are enough bits in the internal representation so
  that no information is lost by representing them this way."

so I guess from that that addr_wide_int.get_precision is always
that "64 + 4 rounded up".  Thus decompose gets that constant precision
input and the extra zeros make the necessary extension always a no-op.
Aha.
it is until someone comes up with a port that this will not work for; 
then they will have to add some machinery to sniff the port and make 
this bigger.  I am hoping to be retired by the time this happens.
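
For concreteness, the "64 + 4 bits rounded up to the next even multiple of HOST_BITS_PER_WIDE_INT" rule quoted above works out as follows (a standalone sketch; the constant names here are ours, not GCC's):

```cpp
#include <cassert>

const int HOST_BITS_PER_WIDE_INT = 64;
// Bit/byte address width assumed per the quoted text: 64 address bits
// plus 4 bits to cover bit-level offsets.
const int ADDR_BITS_NEEDED = 64 + 4;

// Round up to the next even multiple of HOST_BITS_PER_WIDE_INT.
const int ADDR_MAX_PRECISION
  = ((ADDR_BITS_NEEDED + HOST_BITS_PER_WIDE_INT - 1)
     / HOST_BITS_PER_WIDE_INT) * HOST_BITS_PER_WIDE_INT;
```

With 64-bit HWIs this yields 128 bits, i.e. two HWIs per addr_wide_int.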

For max_wide_int the same rules apply, just its size is larger.

Ok.  So the reps are only canonical wide-int because we only
ever use them with precision > xprecision (maybe we should assert
that).

It is now asserted for (as of a few days ago).



Btw, we are not using them directly, but every time we actually
build a addr_wide_int / max_wide_int we copy them anyway:

/* Initialize the storage from integer X, in precision N.  */
template <int N>
template <typename T>
inline fixed_wide_int_storage <N>::fixed_wide_int_storage (const T &x)
{
   /* Check for type compatibility.  We don't want to initialize a
  fixed-width integer from something like a wide_int.  */
   WI_BINARY_RESULT (T, FIXED_WIDE_INT (N)) *assertion ATTRIBUTE_UNUSED;
   wide_int_ref xi (x, N);
   len = xi.len;
   for (unsigned int i = 0; i < len; ++i)
 val[i] = xi.val[i];
}




it avoids a 2nd copy though, which shows nicely what was rummaging in
my head for the last two days - that the int_traits <> abstraction
was somehow at the wrong level - it should have been traits that
are specific to the storage model?  or the above should use
int_traits<>::decompose manually with it always doing the
copy (that would also optimize away one copy and eventually
would make the extra zeros not necessary).
this came in with richard's storage manager patch.  In my older code, 
we tried and succeeded many times to just borrow the underlying rep.
I think that richard needs to work this out.

I originally thought that extra zeros get rid of all copying from trees
to all wide-int kinds.

What's the reason again to not use my original proposed encoding
of the MSB being the sign bit?  RTL constants simply are all signed
then.  Just you have to also sign-extend in functions like lts_p
as not all constants are sign-extended.  But we can use both tree
(with the now appended zero) and RTL constants representation
unchanged.
I am not following you here.   In trees, the msb is effectively a sign 
bit, even for unsigned numbers because we add that extra block.


but inside of wide int, we do not add extra blocks beyond the 
precision.   That would be messy for a lot of other reasons.





Re: [wide-int] int_traits

2013-10-17 Thread Kenneth Zadeck

On 10/17/2013 07:30 AM, Richard Biener wrote:

On Thu, 17 Oct 2013, Richard Sandiford wrote:


Richard Biener  writes:

On Thu, 17 Oct 2013, Richard Sandiford wrote:


Kenneth Zadeck  writes:

As mentioned in my message yesterday, I thought your new way of canonising
unsigned tree constants meant that there was always an upper zero bit.
Is that right?

i believe this is correct.

If so, xprecision < precision is a no-op, because the number always
has the right form for wider precisions.  The only difficult case is
xprecision == precision, since then we need to peel off any upper -1 HWIs.

say my HWI size is 8 bits (just to keep from typing a million 'f's.   if
i have a 16 bit unsigned number that is all 1s in the tree world it is 3
hwis
0x00 0xff 0xff.

but inside regular wide int, it would take 1 wide int whose value is 0xff.
inside of max it would be the same as the tree, but then the test
precision < xprecision + hbpwi never kicks in because precision is
guaranteed to be huge.

inside of addr_wide_int, i think we tank with the assertion.

It should be OK for addr_wide_int too.  The precision still fits 2 HWIs.
The initial length is greater than the maximum length of an addr_wide_int,
but your "len = MAX (len, max_len)" deals with that.

It's

   len = MIN (len, max_len)

Oops, yeah, I meant MIN, sorry.


which looked suspicious to me, but with precision >= xprecision
precision can only be zero if xprecision is zero which looked to
me like it cannot happen - or rather it should be fixed.

Despite the comment above the code, I don't think this MIN is there
for the zero-precision case.  I think it's there to handle the new
tree representation.

The new tree representation can have a length greater than max_len
for an unsigned tree constant that occupies a whole number of HWIs.
The tree representation of an unsigned 0x8000 is 0x00 0x80 0x00.
When extended to max_wide_int the representation is the same.
But a 2-HWI addr_wide_int would be 0x80 0x00, without the leading zero.
The MIN trims the length from 3 to 2 in the last case.
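
A toy model of that MIN trim, using 8-bit "HWIs" as in the example and val[0] as the least-significant block (GCC's order; the names here are ours):

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

typedef int8_t HWI;

// Trim a tree-style block array to at most MAX_LEN blocks, as the
// "len = MIN (len, max_len)" in decompose () does.
unsigned trim_len (const std::vector<HWI> &val, unsigned max_len)
{
  unsigned len = (unsigned) val.size ();
  return len < max_len ? len : max_len;
}
```

The unsigned 0x8000 tree rep "0x00 0x80 0x00" (stored LSB-first as {0x00, 0x80, 0x00}) has len 3; for a 2-HWI addr_wide_int, max_len is 2 and the leading zero block is dropped.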

Oh, so it was the tree rep that changed?  _Why_ was it changed?
We still cannot use it directly from wide-int and the extra
word is redundant because we have access to TYPE_UNSIGNED (TREE_TYPE ()).
  
It was changed because you asked me to change it.   However, you end up 
with special cases either way.

the case actually comes up on the ppc because they do a lot of 128 bit

math.  I think i got thru the x86-64 without noticing this.

Well, it'd be suspicious if we're directly using 128-bit numbers
in addr_wide_int.  The justification for the assertion was that we
should explicitly truncate to addr_wide_int when deliberately
ignoring upper bits, beyond bit or byte address width.  128 bits
definitely falls into that category on powerpc.

My question is whether with 8-bit HWI 0x00 0xff 0xff is a valid
wide-int value if it has precision 16.

No, for a 16-bit wide_int it should be 0xff.  0x00 0xff 0xff is correct
for any wide_int wider than 16 bits though.


AFAIK that is what the code produces,

In which case?  This is:

precision == 16
xprecision == 16
len == 3
max_len == 2

The MIN trims the len to 2 and then the loop Kenny added trims it
again to 1, so the "0x00 0xff 0xff" becomes "0xff".  The "0x00 0xff"
is still there in the array, but not used.
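
The second trim - the loop Kenny added - drops a top block that merely repeats the sign of the block below it, which is what turns "0x00 0xff 0xff" into "0xff" at precision 16.  A standalone sketch under the same 8-bit-HWI toy model (our names, not GCC's code):

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

typedef int8_t HWI;

// Drop top blocks that are redundant under sign-extension: a block is
// redundant iff it consists entirely of copies of the sign bit of the
// block below it.  val[0] is the least-significant block.
unsigned canonical_len (const std::vector<HWI> &val, unsigned len)
{
  while (len > 1)
    {
      HWI top = val[len - 1];
      HWI below = val[len - 2];
      HWI sign = below < 0 ? (HWI) -1 : (HWI) 0;
      if (top != sign)
        break;
      --len;
    }
  return len;
}
```

For the all-ones 16-bit value (LSB-first {0xff, 0xff}, after the MIN trim) this yields len 1; for signed 0x8000 ({0x00, 0x80}) the top block 0x80 is not a sign copy of 0x00, so len stays 2.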


but now Kenny says this is only for some kind
of wide-ints but not all?  That is, why is

inline wi::storage_ref
wi::int_traits <const_tree>::decompose (HOST_WIDE_INT *scratch,
 unsigned int precision, const_tree
x)
{
   unsigned int len = TREE_INT_CST_NUNITS (x);
   const HOST_WIDE_INT *val = (const HOST_WIDE_INT *) &TREE_INT_CST_ELT (x,
0);
   return wi::storage_ref (val, len, precision);
}

not a valid implementation together with making sure that the
INTEGER_CST tree rep has that extra word of zeros if required?

The fundamental problem here is that we're trying to support two cases:

(a) doing N-bit arithmetic in cases where the inputs have N bits
(b) doing N-bit arithmetic in cases where the inputs have fewer than N bits
 and are extended according to TYPE_SIGN.

Let's assume 32-bit HWIs.  The 16-bit (4-hex-digit) constant 0x8000 is
0x8000 regardless of whether the type is signed or unsigned.  But if it's
extended to 32 bits you get two different numbers, 0x00008000 and 0xffff8000,
depending on the sign.

So for one value of the "precision" parameter (i.e. xprecision), signed
and unsigned constants produce the same number.  But for another value
of the "precision" parameter (those greater than xprecision), signed and
unsigned constants produce different numbers.  Yet at the moment the tree
constant has a single representation.
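
The 0x8000 example can be checked mechanically (plain C++, nothing GCC-specific):

```cpp
#include <cassert>
#include <cstdint>

// Zero-extend a 16-bit value to 32 bits: upper bits become zeros.
uint32_t zext16 (uint16_t x)
{
  return (uint32_t) x;
}

// Sign-extend a 16-bit value to 32 bits: upper bits copy bit 15.
uint32_t sext16 (uint16_t x)
{
  return (uint32_t) (int32_t) (int16_t) x;
}
```

zext16 (0x8000) is 0x00008000 while sext16 (0x8000) is 0xffff8000, so the same 16-bit bit pattern names two different 32-bit numbers depending on the sign.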

But a correctly extended one, up to its len!  (as opposed to RTL)


So I think the possibilities are:

(1) Use the represent

Re: [wide-int] int_traits

2013-10-17 Thread Kenneth Zadeck

On 10/17/2013 09:16 AM, Richard Biener wrote:

On Thu, 17 Oct 2013, Kenneth Zadeck wrote:


On 10/17/2013 08:29 AM, Richard Biener wrote:

On Thu, 17 Oct 2013, Richard Sandiford wrote:


Richard Biener  writes:

The new tree representation can have a length greater than max_len
for an unsigned tree constant that occupies a whole number of HWIs.
The tree representation of an unsigned 0x8000 is 0x00 0x80 0x00.
When extended to max_wide_int the representation is the same.
But a 2-HWI addr_wide_int would be 0x80 0x00, without the leading
zero.
The MIN trims the length from 3 to 2 in the last case.

Oh, so it was the tree rep that changed?  _Why_ was it changed?
We still cannot use it directly from wide-int and the extra
word is redundant because we have access to TYPE_UNSIGNED (TREE_TYPE
()).

It means that we can now use the tree's HWI array directly, without any
copying, for addr_wide_int and max_wide_int.  The only part of decompose
()
that does a copy is the small_prec case, which is trivially compiled out
for addr_wide_int and max_wide_int.

" 2) addr_wide_int.  This is a fixed size representation that is
   guaranteed to be large enough to compute any bit or byte sized
   address calculation on the target.  Currently the value is 64 + 4
   bits rounded up to the next number even multiple of
   HOST_BITS_PER_WIDE_INT (but this can be changed when the first
   port needs more than 64 bits for the size of a pointer).

   This flavor can be used for all address math on the target.  In
   this representation, the values are sign or zero extended based
   on their input types to the internal precision.  All math is done
   in this precision and then the values are truncated to fit in the
   result type.  Unlike most gimple or rtl intermediate code, it is
   not useful to perform the address arithmetic at the same
   precision in which the operands are represented because there has
   been no effort by the front ends to convert most addressing
   arithmetic to canonical types.

   In the addr_wide_int, all numbers are represented as signed
   numbers.  There are enough bits in the internal representation so
   that no information is lost by representing them this way."

so I guess from that that addr_wide_int.get_precision is always
that "64 + 4 rounded up".  Thus decompose gets that constant precision
input and the extra zeros make the necessary extension always a no-op.
Aha.

it is until someone comes up with a port that this will not work for, then
they will have to add some machinery to sniff the port and make this bigger.
I am hoping to be retired by the time this happens.

For max_wide_int the same rules apply, just its size is larger.

Ok.  So the reps are only canonical wide-int because we only
ever use them with precision > xprecision (maybe we should assert
that).

It is now asserted for (as of a few days ago).


Btw, we are not using them directly, but every time we actually
build a addr_wide_int / max_wide_int we copy them anyway:

/* Initialize the storage from integer X, in precision N.  */
template <int N>
template <typename T>
inline fixed_wide_int_storage <N>::fixed_wide_int_storage (const T &x)
{
/* Check for type compatibility.  We don't want to initialize a
   fixed-width integer from something like a wide_int.  */
WI_BINARY_RESULT (T, FIXED_WIDE_INT (N)) *assertion ATTRIBUTE_UNUSED;
wide_int_ref xi (x, N);
len = xi.len;
for (unsigned int i = 0; i < len; ++i)
  val[i] = xi.val[i];
}



it avoids a 2nd copy though, which shows nicely what was rummaging in
my head for the last two days - that the int_traits <> abstraction
was somehow at the wrong level - it should have been traits that
are specific to the storage model?  or the above should use
int_traits<>::decompose manually with it always doing the
copy (that would also optimize away one copy and eventually
would make the extra zeros not necessary).

this came in with richard's storage manager patch.  In my older code, we
tried and succeeded many times to just borrow the underlying rep.  I think
that richard needs to work this out.

I originally thought that extra zeros get rid of all copying from trees
to all wide-int kinds.

What's the reason again to not use my original proposed encoding
of the MSB being the sign bit?  RTL constants simply are all signed
then.  Just you have to also sign-extend in functions like lts_p
as not all constants are sign-extended.  But we can use both tree
(with the now appended zero) and RTL constants representation
unchanged.

I am not following you here.   In trees, the msb is effectively a sign bit,
even for unsigned numbers because we add that extra block.

but inside of wide int, we do not add extra blocks beyond the precision.
That would be messy for a lot of other reasons.

Can you elaborate?  It would make tree and RTX reps directly usable,
only wide-int-to-tr

Re: [wide-int] int_traits

2013-10-17 Thread Kenneth Zadeck

On 10/17/2013 09:48 AM, Richard Biener wrote:

On Thu, 17 Oct 2013, Richard Sandiford wrote:


Richard Biener  writes:

On Thu, 17 Oct 2013, Richard Sandiford wrote:


Richard Biener  writes:

The new tree representation can have a length greater than max_len
for an unsigned tree constant that occupies a whole number of HWIs.
The tree representation of an unsigned 0x8000 is 0x00 0x80 0x00.
When extended to max_wide_int the representation is the same.
But a 2-HWI addr_wide_int would be 0x80 0x00, without the leading zero.
The MIN trims the length from 3 to 2 in the last case.

Oh, so it was the tree rep that changed?  _Why_ was it changed?
We still cannot use it directly from wide-int and the extra
word is redundant because we have access to TYPE_UNSIGNED (TREE_TYPE ()).

It means that we can now use the tree's HWI array directly, without any
copying, for addr_wide_int and max_wide_int.  The only part of decompose ()
that does a copy is the small_prec case, which is trivially compiled out
for addr_wide_int and max_wide_int.

" 2) addr_wide_int.  This is a fixed size representation that is
  guaranteed to be large enough to compute any bit or byte sized
  address calculation on the target.  Currently the value is 64 + 4
  bits rounded up to the next number even multiple of
  HOST_BITS_PER_WIDE_INT (but this can be changed when the first
  port needs more than 64 bits for the size of a pointer).

  This flavor can be used for all address math on the target.  In
  this representation, the values are sign or zero extended based
  on their input types to the internal precision.  All math is done
  in this precision and then the values are truncated to fit in the
  result type.  Unlike most gimple or rtl intermediate code, it is
  not useful to perform the address arithmetic at the same
  precision in which the operands are represented because there has
  been no effort by the front ends to convert most addressing
  arithmetic to canonical types.

  In the addr_wide_int, all numbers are represented as signed
  numbers.  There are enough bits in the internal representation so
  that no information is lost by representing them this way."

so I guess from that that addr_wide_int.get_precision is always
that "64 + 4 rounded up".  Thus decompose gets that constant precision
input and the extra zeros make the necessary extension always a no-op.
Aha.

For max_wide_int the same rules apply, just its size is larger.

Ok.  So the reps are only canonical wide-int because we only
ever use them with precision > xprecision (maybe we should assert
that).

No, we allow precision == xprecision for addr_wide_int and max_wide_int too.
But all we do in that case is trim the length.

Requiring precision > xprecision was option (5) from my message.


Btw, we are not using them directly, but every time we actually
build a addr_wide_int / max_wide_int we copy them anyway:

/* Initialize the storage from integer X, in precision N.  */
template <int N>
template <typename T>
inline fixed_wide_int_storage <N>::fixed_wide_int_storage (const T &x)
{
   /* Check for type compatibility.  We don't want to initialize a
  fixed-width integer from something like a wide_int.  */
   WI_BINARY_RESULT (T, FIXED_WIDE_INT (N)) *assertion ATTRIBUTE_UNUSED;
   wide_int_ref xi (x, N);
   len = xi.len;
   for (unsigned int i = 0; i < len; ++i)
 val[i] = xi.val[i];
}

Are you saying that:

  max_wide_int x = (tree) y;

should just copy a pointer?  Then what about:

No, it should do a copy.  But only one - which it does now with
the append-extra-zeros-in-tree-rep, I was just thinking how to
avoid it when not doing that which means adjusting the rep in
the copy we need to do anyway.  If fixed_wide_int_storage is
really the only reason to enlarge the tree rep.
  

  max_wide_int z = x;

Should that just copy a pointer too, or is it OK for the second constructor
to do a copy?

In most cases we only use the fixed_wide_int to store the result of the
computation.  We don't use the above constructor for things like:

  max_wide_int x = wi::add ((tree) y, (int) z);

wi::add uses y and z without going through max_wide_int.

Yes, I am aware of that.


What's the reason again to not use my original proposed encoding
of the MSB being the sign bit?  RTL constants simply are all signed
then.  Just you have to also sign-extend in functions like lts_p
as not all constants are sign-extended.  But we can use both tree
(with the now appended zero) and RTL constants representation
unchanged.

The answer's the same as always.  RTL constants don't have a sign.
Any time we extend an RTL constant, we need to say whether the extension
should be sign extension or zero extension.  So the natural model for
RTL is that an SImode addition is done to SImode width, not some
arbitrary wider width.

RTL constants are sign-extended (whether you call them then "signed"
is up to you).  They have a precision.  This is how they a

Re: [wide-int] int_traits

2013-10-17 Thread Kenneth Zadeck

On 10/17/2013 10:05 AM, Richard Sandiford wrote:

Richard Biener  writes:

What's the reason again to not use my original proposed encoding
of the MSB being the sign bit?  RTL constants simply are all signed
then.  Just you have to also sign-extend in functions like lts_p
as not all constants are sign-extended.  But we can use both tree
(with the now appended zero) and RTL constants representation
unchanged.

The answer's the same as always.  RTL constants don't have a sign.
Any time we extend an RTL constant, we need to say whether the extension
should be sign extension or zero extension.  So the natural model for
RTL is that an SImode addition is done to SImode width, not some
arbitrary wider width.

RTL constants are sign-extended (whether you call them then "signed"
is up to you).  They have a precision.  This is how they are
valid reps for wide-ints, and that doesn't change.

I was saying that if we make not _all_ wide-ints sign-extended
then we can use the tree rep as-is.  We'd then have the wide-int
rep being either zero or sign extended but not arbitrary random
bits outside of the precision (same as the tree rep).

OK, but does that have any practical value over leaving them arbitrary,
as in Kenny's original implementation?  Saying that upper bits can be
either signs or zeros means that readers wouldn't be able to rely on
one particular extension (so would have to do the work that they did
in Kenny's implementation).  At the same time it means that writers
can't leave the upper bits in an arbitrary state, so would have to
make sure that they are signs or zeros (presumably having a free
choice of which).

Thanks,
Richard

not so easy here.   We would still need to clean things up when we 
convert back to tree-cst or rtl, at least for the short precisions.
Otherwise, the parts of the compiler that look inside of trees, in 
particular the hash functions, have to clean the value themselves.


As much as my arm still hurts from richi forcing me to change this, it 
is true that it seems to have been the right thing to do.


the change for the wider unsigned trees perhaps should be backed out.

kenny



Re: [wide-int] int_traits

2013-10-18 Thread Kenneth Zadeck

On 10/18/2013 04:45 AM, Richard Biener wrote:

On Thu, 17 Oct 2013, Mike Stump wrote:


On Oct 17, 2013, at 1:46 AM, Richard Biener  wrote:

[This function shows another optimization issue:

case BOOLEAN_TYPE:
  /* Cache false or true.  */
  limit = 2;
  if (wi::leu_p (cst, 1))
ix = cst.to_uhwi ();

I would have expected cst <= 1 be optimized to cst.len == 1 &&
cst.val[0] <= 1.

So lts_p has the same problem, and is used just below.  If we do:

@@ -1461,12 +1470,22 @@ wi::ne_p (const T1 &x, const T2 &y)
  inline bool
  wi::lts_p (const wide_int_ref &x, const wide_int_ref &y)
  {
-  if (x.precision <= HOST_BITS_PER_WIDE_INT
-  && y.precision <= HOST_BITS_PER_WIDE_INT)
-return x.slow () < y.slow ();
-  else
-return lts_p_large (x.val, x.len, x.precision, y.val, y.len,
-   y.precision);
+  // We optimize w < N, where N is 64 or fewer bits.
+  if (y.precision <= HOST_BITS_PER_WIDE_INT)
+{
+  // If it fits directly into a shwi, we can compare directly.
+  if (wi::fits_shwi_p (x))
+   return x.slow () < y.slow ();
+  // If it doesn't fit, then it must be more negative than any
+  // value in y, and hence smaller.
+  if (neg_p (x, SIGNED))
+   return true;
+  // If it is positive, then it must be larger than any value in y,
+  // and hence greater than.
+  return false;
+}
+  return lts_p_large (x.val, x.len, x.precision, y.val, y.len,
+ y.precision);
  }

then for:

bool MGEN(wide_int cst) {
   return wi::lts_p (cst, 100);
}

we generate:

	movl	520(%rsp), %eax
	cmpl	$1, %eax
	je	.L5283
	subl	$1, %eax
	movq	8(%rsp,%rax,8), %rax
	shrq	$63, %rax
	ret
	.p2align 4,,10
	.p2align 3
.L5283:
	cmpq	$99, 8(%rsp)
	setle	%al
	ret

which as you can see, never calls lts_p_large, and hence, if that was the only 
reason the storage escapes, then it should be able to optimize better.  In the 
above code, minimal, no function calls, and the 100 is propagated all the way 
down into the cmpq.  Now, the reason why we should do it this way, most of the 
time, most types are 64 bits or less.  When that is true, then it will never 
call into lts_p_large.

If the above 100 is changed to l, from a parameter long l, then the code is the 
same except the last part is:

	cmpq	8(%rsp), %rdi
	setg	%al
	ret

If we have two wide_int's, then the code does this:

	.cfi_startproc
	subq	$8, %rsp
	.cfi_def_cfa_offset 16
	movl	1052(%rsp), %r9d
	movl	1048(%rsp), %r8d
	movl	532(%rsp), %edx
	movl	528(%rsp), %esi
	cmpl	$64, %r9d
	ja	.L5281
	cmpl	$1, %esi
	je	.L5284
	subl	$1, %esi
	movq	16(%rsp,%rsi,8), %rax
	addq	$8, %rsp
	.cfi_remember_state
	.cfi_def_cfa_offset 8
	shrq	$63, %rax
	ret
	.p2align 4,,10
	.p2align 3
.L5281:
	.cfi_restore_state
	leaq	536(%rsp), %rcx
	leaq	16(%rsp), %rdi
	call	_ZN2wi11lts_p_largeEPKljjS1_jj
	addq	$8, %rsp
	.cfi_remember_state
	.cfi_def_cfa_offset 8
	ret
	.p2align 4,,10
	.p2align 3
.L5284:
	.cfi_restore_state
	movq	536(%rsp), %rax
	cmpq	%rax, 16(%rsp)
	setl	%al
	addq	$8, %rsp
	.cfi_def_cfa_offset 8
	ret
	.cfi_endproc

which is still pretty nice.

Does this look ok?  Kenny, can you double check the cases, think I have them 
right, but?  a double check would be good.

That works for me.

i talked to mike about this last night, but did not follow up with an 
email until now.   The problem is that this code is wrong!!!   He is 
working to fix that and so i would expect something from him later 
(tomorrow for those in europe).   The problem is if the variable that 
comes in is an unsigned HWI that has the top bit set.   In that case, 
the wide-int canonicalized version would have two hwis, the top one 
being a 0 block, and so you cannot do the fast path test coming in on 
that even though its precision fits.
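
The failure mode Kenny describes can be sketched standalone (a toy model with 64-bit HWIs; the names are ours, not GCC's): an unsigned HOST_WIDE_INT with the top bit set needs a second, zero block in canonical form, so len == 1 cannot be assumed merely because the precision fits in one HWI.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

typedef int64_t HWI;

// Canonicalize an unsigned 64-bit value into sign-extended HWI blocks,
// LSB first.  If the top bit is set, the single block would read as a
// negative number, so a leading zero block is required - giving len 2.
std::vector<HWI> canonize_uhwi (uint64_t x)
{
  std::vector<HWI> val;
  val.push_back ((HWI) x);
  if ((HWI) x < 0)  // top bit set: needs a zero block above it
    val.push_back (0);
  return val;
}
```

Any fast path guarded only by "precision <= HOST_BITS_PER_WIDE_INT" would mishandle such a value, which is why the y.len == 1 check matters.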

the representation should guarantee the
compare with a low precision (32 bit) constant is evaluatable
at compile-time if len of the larger value is > 1, no?

I agree, though, I didn't check the symmetric case, as constants and
smaller values can be put on the right by convention.


The question is whether we should try to optimize wide-int for
such cases or simply not use wi::leu_p (cst, 1) but rather

if (cst.fits_uhwi_p () == 1 && cst.to_uhwi () < 1)

?

Since we can do an excellent job with the simple interface, I'd argue
for the simple interface and the simple change to do the conditional
better.  I'd rather up the ante on meta programming to get any
performance we want, retaining the s

Re: [wide-int] int_traits

2013-10-18 Thread Kenneth Zadeck

Richi,

Do you want me to back out the patch that changes the rep for unsigned 
tree-csts?


kenny


Re: [wide-int] int_traits

2013-10-19 Thread Kenneth Zadeck

On 10/19/2013 05:01 AM, Richard Sandiford wrote:

Mike Stump  writes:

+  // We optimize x < y, where y is 64 or fewer bits.
+  // We have to be careful to not allow comparison to a large positive
+  // unsigned value like 0x8000000000000000, those would be encoded
+  // with a y.len == 2.
+  if (y.precision <= HOST_BITS_PER_WIDE_INT
+  && y.len == 1)

I don't get this.  If y.precision <= HOST_BITS_PER_WIDE_INT then
y.len must be 1.  I realise that tree constants can be stored with
TREE_INT_CST_NUNITS > TYPE_PRECISION / HOST_BITS_PER_WIDE_INT
(so that extensions beyond TYPE_PRECISION are free).  But the
wide-int code is shielded from that by the ::decompose routine.
A wide int never has len > precision / HOST_BITS_PER_WIDE_INT.

Thanks,
Richard
I think that part of this is that neither mike nor i really understand 
how this stuff works anymore.


in the old version where we had precision 0, we would wait to 
canonicalize a constant or a simple integer until we saw what the 
precision of the other operand was.   That was what precision 0 meant.
it was kind of like what you are now proposing with this new trait, but 
for the reason that we actually did not know what to do, rather than 
from some concern about escapement.


What i do not understand is what you get when you bring in an integer 
variable that is an unsigned HOST_WIDE_INT with the top bit set.   In 
the precision 0 days, if the prec of the other side was 64 or less, the 
incoming integer took 1 hwi, and if the precision was larger, it took 
two hwis.  The canonicalization happened late enough so that there was 
never a question.


Here we are trying to do this at compile time to avoid the escape.  This 
is why my emails about this have continued to talk about the unsigned 
HOST_WIDE_INT as a boundary case.  It is clear that if the value is a 
constant, then you should be able to see at compile time whether the top 
bit is set, but if the value is a simple integer variable, you should 
still be able to do the non-escaping test as long as the type is signed 
HOST_WIDE_INT or smaller.


I think that Mike certainly has not captured this yet.   But those are 
the issues as I see them.






Re: [wide-int] Go back to having undefined exccess bits on read

2013-10-19 Thread Kenneth Zadeck

On 10/19/2013 02:41 PM, Mike Stump wrote:

On Oct 19, 2013, at 7:22 AM, Richard Sandiford  
wrote:

As discussed, this patch effectively goes back to your original idea of
having excess upper bits in a HWI being undefined on read (at least as
the default assumption).  wide_int itself still ensures that the excess
bits are stored as signs though.
Or we could decide that it isn't worth the hassle and just leave excess
upper bits as undefined on write too, which really is going back to your
original model. :-)

I don't get it.  If the bits are undefined upon read, then, they should be 
undefined upon write.  It doesn't make any sense to me to have them be 
undefined upon read and defined upon write.  To me, one describes the semantics 
of the data, then once that is done, all else falls out from that.  If no one 
is allowed to deviate their behavior upon a bit (that bit is don't care), then, 
trivially reads need to zap it out, and writes can write anything into it.
No, they should not be undefined on write.  If they are undefined on 
write, then it makes tree_fits* (the successor functions to 
host_integerp) much more complex.


kenny


Re: [wide-int] int_traits

2013-10-19 Thread Kenneth Zadeck

On 10/19/2013 10:18 AM, Richard Sandiford wrote:

Kenneth Zadeck  writes:

On 10/19/2013 05:01 AM, Richard Sandiford wrote:

Mike Stump  writes:

+  // We optimize x < y, where y is 64 or fewer bits.
+  // We have to be careful to not allow comparison to a large positive
+  // unsigned value like 0x8000, those would be encoded
+  // with a y.len == 2.
+  if (y.precision <= HOST_BITS_PER_WIDE_INT
+  && y.len == 1)

I don't get this.  If y.precision <= HOST_BITS_PER_WIDE_INT then
y.len must be 1.  I realise that tree constants can be stored with
TREE_INT_CST_NUNITS > TYPE_PRECISION / HOST_BITS_PER_WIDE_INT
(so that extensions beyond TYPE_PRECISION are free).  But the
wide-int code is shielded from that by the ::decompose routine.
A wide int never has len > precision / HOST_BITS_PER_WIDE_INT.

Thanks,
Richard

I think that part of this is that neither mike or i really understand
how this stuff works anymore.

in the old version where we had precision 0, we would wait to
canonicalize a constant or a simple integer until we saw what the
precision of the other operand was.   That was what precison 0 meant.
it was kind of like what you are now proposing with this new trait, but
for the reason that we actually did not know what to do than some
concern about escapement.

What i do not understand is what happens what do you get when you bring
in an integer variable that is an unsigned HOST_WIDE_INT with the top
bit set.   In the precision 0 days, if the prec of the other side was 64
or less, the incoming integer took 1 hwi and if the precision was
larger, it took two hwis.  The canonicalization happened late enough so
that there was never a question.

Ah, I think I know what you mean.  The original implementation was:

   template <typename T>
   static inline const HOST_WIDE_INT*
   to_shwi1 (HOST_WIDE_INT *s, unsigned int *l, unsigned int *p, const T& x)
   {
 s[0] = x;
 if (signedp(x)
 || sizeof (T) < sizeof (HOST_WIDE_INT)
 || ! top_bit_set (x))
   {
 *l = 1;
   }
 else
   {
 s[1] = 0;
 *l = 2;
   }
 *p = 0;
 return s;
   }

   void
   wide_int_ro::check_precision (unsigned int *p1, unsigned int *p2,
 bool check_equal ATTRIBUTE_UNUSED,
 bool check_zero ATTRIBUTE_UNUSED)
   {
 gcc_checking_assert ((!check_zero) || *p1 != 0 || *p2 != 0);

 if (*p1 == 0)
   *p1 = *p2;

 if (*p2 == 0)
   *p2 = *p1;

 gcc_checking_assert ((!check_equal) || *p1 == *p2);
   }

   /* Return true if C1 < C2 using signed comparisons.  */
   template <typename T1, typename T2>
   static inline bool
   lts_p (const T1 &c1, const T2 &c2)
   {
 bool result;
 HOST_WIDE_INT ws1[WIDE_INT_MAX_ELTS];
 HOST_WIDE_INT ws2[WIDE_INT_MAX_ELTS];
 const HOST_WIDE_INT *s1, *s2;  /* Returned data */
 unsigned int cl1, cl2; /* array lengths  */
 unsigned int p1, p2;   /* precisions */
 
 s1 = to_shwi1 (ws1, &cl1, &p1, c1);

 s2 = to_shwi1 (ws2, &cl2, &p2, c2);
 check_precision (&p1, &p2, false, true);
 
 if (p1 <= HOST_BITS_PER_WIDE_INT
 && p2 <= HOST_BITS_PER_WIDE_INT)
   {
 HOST_WIDE_INT x0 = sext_hwi (s1[0], p1);
 HOST_WIDE_INT x1 = sext_hwi (s2[0], p2);
 result = x0 < x1;
   }
 else
   result = lts_p_large (s1, cl1, p1, s2, cl2, p2);
 
#ifdef DEBUG_WIDE_INT

 debug_vaa ("wide_int_ro:: %d = (%s lts_p %s\n", result, s1, cl1, p1, s2, cl2, p2);
#endif
 return result;
   }
You need to be careful about asserting too much from the old code.  The 
timeline was:


1) we developed the stuff on x86-64
2) you did your patch
3) we ported everything to ppc and our private port.

I really only became very sensitive to this issue during step 3, because 
the x86 does not exhibit these bugs.




So if you had a 128-bit wide_int and T == unsigned HOST_WIDE_INT,
this lts_p would zero-extend the unsigned HOST_WIDE_INT to 128 bits and
then do a signed comparison.

The thing here is that the "check_equal" argument is false.
So if instead you were comparing a 128-bit wide_int with a 64-bit tree
constant, lts_p would treat that tree constant as a signed 64-bit number,
even if it was TYPE_UNSIGNED.  Similarly if you were comparing a 128-bit
tree constant and a 64-bit tree constant.  You also allowed a comparison
of a 128-bit wide_int with a 64-bit rtx, again treating the 64-bit rtx
as signed.
I do not think that this is what check_equal meant, because the 0 
precision was a wild card.  The 0 precision allowed the values to come 
in from simple vars and constants and be converted on the fly.   
However, as I said above, I am not sure how well this worked, since the 
testing for the wide stuff was so thin.

So when doing the wi:: conversion, I'd interpreted the desired semantics
for lts_p as being "treat bot

Re: [wide-int] Reduce the size of the scratch arrays

2013-10-20 Thread Kenneth Zadeck

On 10/20/2013 06:30 AM, Richard Sandiford wrote:

If yesterday's patch goes in, we'll only need the scratch array for
HWI inputs.  We can therefore shrink it to 2 HWIs.

Tested on x86_64-linux-gnu.  OK for wide-int?

Thanks,
Richard


Index: gcc/wide-int.h
===
--- gcc/wide-int.h  2013-10-19 20:08:15.571987579 +0100
+++ gcc/wide-int.h  2013-10-19 20:09:12.299436972 +0100
@@ -778,7 +778,7 @@ struct wide_int_ref_storage : public wi:
  private:
/* Scratch space that can be used when decomposing the original integer.
   It must live as long as this object.  */
-  HOST_WIDE_INT scratch[WIDE_INT_MAX_ELTS];
+  HOST_WIDE_INT scratch[2];
  
  public:

template 
We now allow partial-int precisions of any size; this will break 
that.


kenny


Re: [wide-int] Make trees more like rtxes

2013-10-23 Thread Kenneth Zadeck

On 10/23/2013 08:13 AM, Richard Biener wrote:

On Wed, 23 Oct 2013, Richard Sandiford wrote:


Richard Biener  writes:

The patch does that by adding:

   wi::address (t)

for when we want to extend tree t to addr_wide_int precision and:

   wi::extend (t)

for when we want to extend it to max_wide_int precision.  (Better names
welcome.)  These act just like addr_wide_int (t) and max_wide_int (t)
would on current sources, except that they use the tree representation
directly, so there's no copying.

Good.  Better names - ah well, wi::to_max_wide_int (t) and
wi::to_addr_wide_int (t)?  Btw, "addr_wide_int" is an odd name as it
has at least the precision of the maximum _bit_ offset possible, right?
So more like [bit_]offset_wide_int?  Or just [bit_]offset_int?
And then wi::to_offset (t) and wi::to_max (t)?

offset_int, max_int, wi::to_offset and wi::to_max sound OK to me.
Kenny?  Mike?


Most of the patch is mechanical and many of the "wi::address (...)"s
and "wi::extend (...)"s reinstate "addr_wide_int (...)"s and
"max_wide_int (...)"s from the initial implementation.  Sorry for the
run-around on this.

One change I'd like to point out though is:

@@ -7287,7 +7287,9 @@ native_encode_int (const_tree expr, unsi
for (byte = 0; byte < total_bytes; byte++)
  {
int bitpos = byte * BITS_PER_UNIT;
-  value = wi::extract_uhwi (expr, bitpos, BITS_PER_UNIT);
+  /* Extend EXPR according to TYPE_SIGN if the precision isn't a whole
+number of bytes.  */
+  value = wi::extract_uhwi (wi::extend (expr), bitpos, BITS_PER_UNIT);
  
if (total_bytes > UNITS_PER_WORD)

{

I think this preserves the existing trunk behaviour but I wasn't sure
whether it was supposed to work like that or whether upper bits should
be zero.

I think the upper bits are undefined, the trunk native_interpret_int
does

   result = double_int::from_buffer (ptr, total_bytes);

   return double_int_to_tree (type, result);

where the call to double_int_to_tree re-extends according to the types
precision and sign.  wide_int_to_tree doesn't though?

This is native_encode_int rather than native_interpret_int though.

Yes, I was looking at the matched interpret variant though to see
what we do.

The wide_int_to_tree really needs to canonicalize the value before 
making it into a tree.
The calls to tree_fits*_p (the successors to host_integerp) depend on 
this being clean.
Otherwise, these functions would have to clean the short integers 
themselves, and they get called all over the place.





AIUI it's used for VIEW_CONVERT_EXPRs, so I thought the upper bits
might get used.

Yeah, that might happen, but still relying on the upper bits in any
way would be brittle here.

Richard.




Re: [wide-int] More optimisations

2013-10-29 Thread Kenneth Zadeck

On 10/29/2013 08:43 AM, Richard Sandiford wrote:

Richard Biener  writes:

On Sun, 27 Oct 2013, Richard Sandiford wrote:

This patch adds some more optimisations to the wi:: comparison functions.
It uses the:

   #define CONSTANT(X) (__builtin_constant_p (X) && (X))

idiom that was mentioned before, except that I thought CONSTANT would be
too easily confused with CONSTANT_P, so I went for CAN_TELL instead.
Better names welcome.

STATIC_CONSTANT_P similar to static_assert?

Sounds good.


The changes are:

- Add a fast path to eq_p for when one of the inputs isn't sign-extended.
   This includes code to handle compile-time 0 specially.

- Add the opposite optimisation to Mike's lts_p change, if we can tell at
   compile time that it applies.

I think the cases Mike added should only be enabled when we can figure
them out at compile-time, too.

Well, the idea with most of these functions was to handle the simple
"everything is one HWI" cases inline.  For operations like addition this
needs to be based on the precision, since that's the only cheap way of
knowing whether the result is also a single HWI.  But for logic operations
like AND and OR, it can be based on whether the inputs are single HWIs,
since single-HWI inputs give a single-HWI output.  This catches more cases.

lts_p is a bit like AND and OR in this respect.  fits_shwi_p is the
same as "len == 1", which is simple to check and is often true.  If we
restrict the optimisation to STATIC_CONSTANT_P we'll end up using the
out-of-line comparison for all TREE_INT_CST_LTs, even the int-int ones
that we currently handle inline.


This has always been my way of looking at it.

kenny

Thanks,
Richard





Re: [wide-int] Update main comment

2013-10-29 Thread Kenneth Zadeck

On 10/29/2013 06:37 PM, Richard Sandiford wrote:

This patch tries to update the main wide_int comment to reflect the current
implementation.

- bitsizetype is TImode on x86_64 and others, so I don't think it's
   necessarily true that all offset_ints are signed.  (widest_int are
   though.)
I am wondering if this is too conservative an interpretation.  I 
believe that they are TImode because that is the next thing after 
DImode, and so they wanted to accommodate the 3 extra bits.  Certainly 
there is no x86 that is able to address more than 64 bits.


Aside from that, all of this looks OK to me.


- As discussed in the early threads, I think the first reason for
   using widest_int is bogus.  Extensions should be done in the sign
   of source regardless of which wide_int type you're using.
   Extending directly to another wide_int is fine.

- offset_int now only contains the HWIs that it needs, rather than
   the full MAX_BITSIZE_MODE_ANY_INT / HOST_BITS_PER_WIDE_INT.

- offset_int and widest_int no longer store the precision.

- Precision 0 is gone (both as a constant marker and for zero-width
   bitfields -- thanks Richard).

Does it look OK?

Thanks,
Richard


Index: gcc/wide-int.h
===
--- gcc/wide-int.h  (revision 204183)
+++ gcc/wide-int.h  (working copy)
@@ -70,28 +70,18 @@
   been no effort by the front ends to convert most addressing
   arithmetic to canonical types.
  
- In the offset_int, all numbers are represented as signed numbers.

- There are enough bits in the internal representation so that no
- infomation is lost by representing them this way.
-
   3) widest_int.  This representation is an approximation of
   infinite precision math.  However, it is not really infinite
   precision math as in the GMP library.  It is really finite
   precision math where the precision is 4 times the size of the
   largest integer that the target port can represent.
  
- Like the offset_ints, all numbers are inherently signed.

+ widest_int is supposed to be wider than any number that it needs to
+ store, meaning that there is always at least one leading sign bit.
+ All widest_int values are therefore signed.
  
   There are several places in the GCC where this should/must be used:
  
- * Code that does widening conversions.  The canonical way that

-   this is performed is to sign or zero extend the input value to
-   the max width based on the sign of the type of the source and
-   then to truncate that value to the target type.  This is in
-   preference to using the sign of the target type to extend the
-   value directly (which gets the wrong value for the conversion
-   of large unsigned numbers to larger signed types).
-
   * Code that does induction variable optimizations.  This code
 works with induction variables of many different types at the
 same time.  Because of this, it ends up doing many different
@@ -122,17 +112,17 @@
 two, the default is the prefered representation.
  
 All three flavors of wide_int are represented as a vector of

-   HOST_WIDE_INTs.  The vector contains enough elements to hold a
-   value of MAX_BITSIZE_MODE_ANY_INT / HOST_BITS_PER_WIDE_INT which is
-   a derived for each host/target combination.  The values are stored
-   in the vector with the least significant HOST_BITS_PER_WIDE_INT
-   bits of the value stored in element 0.
+   HOST_WIDE_INTs.  The default and widest_int vectors contain enough elements
+   to hold a value of MAX_BITSIZE_MODE_ANY_INT bits.  offset_int contains only
+   enough elements to hold ADDR_MAX_PRECISION bits.  The values are stored
+   in the vector with the least significant HOST_BITS_PER_WIDE_INT bits
+   in element 0.
  
-   A wide_int contains three fields: the vector (VAL), precision and a

-   length (LEN).  The length is the number of HWIs needed to
-   represent the value.  For the widest_int and the offset_int,
-   the precision is a constant that cannot be changed.  For the
-   default wide_int, the precision is set from the constructor.
+   The default wide_int contains three fields: the vector (VAL),
+   the precision and a length (LEN).  The length is the number of HWIs
+   needed to represent the value.  widest_int and offset_int have a
+   constant precision that cannot be changed, so they only store the
+   VAL and LEN fields.
  
 Since most integers used in a compiler are small values, it is

 generally profitable to use a representation of the value that is
@@ -143,75 +133,90 @@
 as long as they can be reconstructed from the top bit that is being
 represented.
  
+   The precision and length of a wide_int are always greater than 0.

+   Any bits in a wide_int above the precision are sign-extended from the
+   most significant bit.  For example, a 4-bit value 0x8 is represented as
+   VAL = { 0xf...fff8 }.  However, as an o

Re: [wide-int] More optimisations

2013-10-29 Thread Kenneth Zadeck

On 10/29/2013 08:43 AM, Richard Sandiford wrote:

Richard Biener  writes:

On Sun, 27 Oct 2013, Richard Sandiford wrote:

This patch adds some more optimisations to the wi:: comparison functions.
It uses the:

   #define CONSTANT(X) (__builtin_constant_p (X) && (X))

idiom that was mentioned before, except that I thought CONSTANT would be
too easily confused with CONSTANT_P, so I went for CAN_TELL instead.
Better names welcome.

STATIC_CONSTANT_P similar to static_assert?

Sounds good.


The changes are:

- Add a fast path to eq_p for when one of the inputs isn't sign-extended.
   This includes code to handle compile-time 0 specially.

- Add the opposite optimisation to Mike's lts_p change, if we can tell at
   compile time that it applies.

I think the cases Mike added should only be enabled when we can figure
them out at compile-time, too.

Well, the idea with most of these functions was to handle the simple
"everything is one HWI" cases inline.  For operations like addition this
needs to be based on the precision, since that's the only cheap way of
knowing whether the result is also a single HWI.  But for logic operations
like AND and OR, it can be based on whether the inputs are single HWIs,
since single-HWI inputs give a single-HWI output.  This catches more cases.

lts_p is a bit like AND and OR in this respect.  fits_shwi_p is the
same as "len == 1", which is simple to check and is often true.  If we
restrict the optimisation to STATIC_CONSTANT_P we'll end up using the
out-of-line comparison for all TREE_INT_CST_LTs, even the int-int ones
that we currently handle inline.

Thanks,
Richard

Yes, this is a very boring and mechanical patch, and I found nothing to 
talk about.


kenny

