On Thu, 17 Oct 2013, Richard Sandiford wrote:

> Kenneth Zadeck <zad...@naturalbridge.com> writes:
> >> As mentioned in my message yesterday, I thought your new way of canonising
> >> unsigned tree constants meant that there was always an upper zero bit.
> >> Is that right?
> > I believe this is correct.
> >> If so, xprecision < precision is a no-op, because the number always
> >> has the right form for wider precisions.  The only difficult case is
> >> xprecision == precision, since then we need to peel off any upper -1 HWIs.
> > Say my HWI size is 8 bits (just to keep from typing a million 'f's).
> > If I have a 16-bit unsigned number that is all 1s, in the tree world
> > it is 3 HWIs:
> > 0x00 0xff 0xff.
> >
> > But inside regular wide_int, it would take one HWI whose value is 0xff.
> > Inside of max_wide_int it would be the same as the tree, but then the
> > test precision < xprecision + HOST_BITS_PER_WIDE_INT never kicks in
> > because precision is guaranteed to be huge.
> >
> > Inside of addr_wide_int, I think we tank with the assertion.
> 
> It should be OK for addr_wide_int too.  The precision still fits 2 HWIs.
> The initial length is greater than the maximum length of an addr_wide_int,
> but your "len = MAX (len, max_len)" deals with that.

It's

  len = MIN (len, max_len)

which looked suspicious to me, but with precision >= xprecision,
precision can only be zero if xprecision is zero, which looks to me
like it cannot happen - or rather, if it can, it should be fixed.
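
To make Richard's point concrete with Kenny's numbers, a standalone
sketch (not GCC code - 8-bit "HWIs" and names of my own, mirroring
the example above):

#include <algorithm>
#include <cassert>
#include <cstdint>
#include <cstdio>

int
main ()
{
  /* An unsigned 16-bit all-ones constant as the tree provides it,
     low-order block first (Kenny's 0x00 0xff 0xff, written high-order
     first).  */
  const uint8_t tree_val[] = { 0xff, 0xff, 0x00 };
  unsigned int len = 3;          /* TREE_INT_CST_NUNITS */
  unsigned int xprecision = 16;  /* precision of the tree constant */
  unsigned int precision = 16;   /* an addr_wide_int-like reader */
  unsigned int max_len = 2;      /* blocks that fit such a reader */

  assert (precision >= xprecision && xprecision != 0);
  len = std::min (len, max_len); /* the MIN from the patch: len becomes 2 */

  /* Harmless here: blocks beyond ceil (16 / 8) == 2 carry no value
     bits at precision 16, so { 0xff, 0xff } still reads as 0xffff.
     The zero block only matters for a reader at wider precision.  */
  for (unsigned int i = 0; i < len; i++)
    printf ("block %u: 0x%02x\n", i, (unsigned int) tree_val[i]);
  return 0;
}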

> > The case actually comes up on the ppc because they do a lot of 128-bit
> > math.  I think I got through the x86-64 testing without noticing this.
> 
> Well, it'd be suspicious if we're directly using 128-bit numbers
> in addr_wide_int.  The justification for the assertion was that we
> should explicitly truncate to addr_wide_int when deliberately
> ignoring upper bits, beyond bit or byte address width.  128 bits
> definitely falls into that category on powerpc.

My question is whether, with 8-bit HWIs, 0x00 0xff 0xff is a valid
wide-int value if it has precision 16.  AFAIK that is what the
code produces, but now Kenny says this is only for some kinds
of wide-ints but not all?  That is, why is

inline wi::storage_ref
wi::int_traits <const_tree>::decompose (HOST_WIDE_INT *scratch,
                                        unsigned int precision,
                                        const_tree x)
{
  unsigned int len = TREE_INT_CST_NUNITS (x);
  const HOST_WIDE_INT *val
    = (const HOST_WIDE_INT *) &TREE_INT_CST_ELT (x, 0);
  return wi::storage_ref (val, len, precision);
}

not a valid implementation together with making sure that the
INTEGER_CST tree rep has that extra word of zeros if required?
I would like to see us move in that direction (given that the
tree rep of INTEGER_CST has transitioned to variable-length already).
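
Concretely, the extra-zero-word invariant could look like this
standalone sketch (again 8-bit blocks and names of my own, not actual
GCC code; assumes a non-empty block vector):

#include <cstdint>
#include <vector>

/* Store unsigned constants with an extra zero block whenever the top
   block has its sign bit set, so that readers may sign-extend
   blindly.  */
static std::vector<uint8_t>
canonize_unsigned (std::vector<uint8_t> blocks)
{
  /* Strip zero blocks already implied by sign-extension.  */
  while (blocks.size () > 1
         && blocks.back () == 0
         && !(blocks[blocks.size () - 2] & 0x80))
    blocks.pop_back ();
  /* Force the extra zero word for "negative-looking" top blocks.  */
  if (blocks.back () & 0x80)
    blocks.push_back (0);
  return blocks;
}

With that invariant guaranteed by the tree rep, the decompose above
can hand out the array unchanged: 16-bit unsigned all-ones is stored
as { 0xff, 0xff, 0x00 } (low-order first) and reads correctly at any
precision >= 16.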

Btw, code such as

tree
wide_int_to_tree (tree type, const wide_int_ref &pcst)
{
...
  unsigned int small_prec = prec & (HOST_BITS_PER_WIDE_INT - 1);
  bool recanonize = sgn == UNSIGNED
    && small_prec
    && (prec + HOST_BITS_PER_WIDE_INT - 1) / HOST_BITS_PER_WIDE_INT == len;

definitely needs a comment.  I would have thought _all_ unsigned
numbers need re-canonicalization.  Well, maybe only if we're
forcing the extra zeros.
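
My reading of what the flag guards, as a standalone sketch (8-bit
blocks, names of my own): only when the precision ends mid-block does
the top block contain excess bits to clear, which would explain why
not every unsigned number takes this path.

#include <cstdint>

/* Zero-extend the top block from small_prec bits, like zext_hwi does
   for a HWI.  Here small_prec == prec % 8; when it is 0 the precision
   fills the block and nothing needs clearing.  */
static uint8_t
zext_top_block (uint8_t top, unsigned int small_prec)
{
  if (small_prec == 0)
    return top;                             /* full block, already canonical */
  return top & ((1u << small_prec) - 1);    /* e.g. 4 bits: 0xff -> 0x0f */
}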

[This function shows another optimization issue:

    case BOOLEAN_TYPE:
      /* Cache false or true.  */
      limit = 2;
      if (wi::leu_p (cst, 1))
        ix = cst.to_uhwi ();

I would have expected cst <= 1 to be optimized to cst.len == 1
&& cst.val[0] <= 1.  It expands to

<L27>:
  MEM[(long int *)&D.50698 + 16B] = 1;
  MEM[(struct wide_int_ref_storage *)&D.50698] = &MEM[(struct wide_int_ref_storage *)&D.50698].scratch;
  MEM[(struct wide_int_ref_storage *)&D.50698 + 8B] = 1;
  MEM[(struct wide_int_ref_storage *)&D.50698 + 12B] = 32;
  _277 = MEM[(const struct wide_int_storage *)&cst + 260B];
  if (_277 <= 64)
    goto <bb 42>;
  else
    goto <bb 43>;

  <bb 42>:
  xl_491 = zext_hwi (1, 32);  // ok, checking enabled and thus out-of-line
  _494 = MEM[(const long int *)&cst];
  _495 = (long unsigned int) _494;
  yl_496 = zext_hwi (_495, _277);
  _497 = xl_491 < yl_496;
  goto <bb 44>;

  <bb 43>:
  _503 = wi::ltu_p_large (&MEM[(struct wide_int_ref_storage *)&D.50698].scratch,
                          1, 32,
                          &MEM[(const struct wide_int_storage *)&cst].val,
                          len_274, _277);

this keeps D.50698 and cst un-SRAable - inline storage is problematic
for this reason.  But the representation should guarantee that the
compare with a low-precision (32-bit) constant can be evaluated at
compile time when the len of the larger value is > 1, no?  (See the
sketch after the dump below.)

  <bb 44>:
  # _504 = PHI <_497(42), _503(43)>
  D.50698 ={v} {CLOBBER};
  if (_504 != 0)
    goto <bb 45>;
  else
    goto <bb 46>;

  <bb 45>:
  pretmp_563 = MEM[(const struct wide_int_storage *)&cst + 256B];
  goto <bb 229> (<L131>);

  <bb 46>:
  _65 = generic_wide_int<wide_int_storage>::to_uhwi (&cst, 0);
  ix_66 = (int) _65;
  goto <bb 91>;
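
For the len-based shortcut I mean, a hedged sketch (8-bit blocks,
names of my own, not the wi:: code): for a bound below the block's
sign bit, any canonical value with len > 1 is at least 0x80 and so
never below the bound, and otherwise the low block decides.

#include <cassert>
#include <cstdint>

static bool
ltu_small_p (const uint8_t *val, unsigned int len, uint8_t bound)
{
  assert (bound < 0x80);
  if (len > 1)
    return false;    /* canonical multi-block value >= 0x80 > bound */
  if (val[0] & 0x80)
    return false;    /* zero-extends to >= 0x80 at any precision */
  return val[0] < bound;
}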

The question is whether we should try to optimize wide-int for
such cases or simply not use wi::leu_p (cst, 1) but rather

 if (cst.fits_uhwi_p () && cst.to_uhwi () <= 1)

?
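
Assuming fits_uhwi_p () stays inline, the BOOLEAN_TYPE case would
then read roughly:

    case BOOLEAN_TYPE:
      /* Cache false or true.  */
      limit = 2;
      if (cst.fits_uhwi_p () && cst.to_uhwi () <= 1)
        ix = cst.to_uhwi ();

which the compiler should have little trouble reducing to the
cst.len == 1 && cst.val[0] <= 1 form above.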


> Thanks,
> Richard
> 
> 

-- 
Richard Biener <rguent...@suse.de>
SUSE / SUSE Labs
SUSE LINUX Products GmbH - Nuernberg - AG Nuernberg - HRB 16746
GF: Jeff Hawn, Jennifer Guild, Felix Imendörffer
