On Thu, 17 Oct 2013, Richard Sandiford wrote:
> Kenneth Zadeck <[email protected]> writes:
> >> As mentioned in my message yesterday, I thought your new way of canonising
> >> unsigned tree constants meant that there was always an upper zero bit.
> >> Is that right?
> > i believe this is correct.
> >> If so, xprecision < precision is a no-op, because the number always
> >> has the right form for wider precisions. The only difficult case is
> >> xprecision == precision, since then we need to peel off any upper -1 HWIs.
> > Say my HWI size is 8 bits (just to keep from typing a million 'f's).
> > If I have a 16-bit unsigned number that is all 1s, in the tree world
> > it is 3 HWIs:
> > 0x00 0xff 0xff.
> >
> > But inside regular wide_int, it would take 1 wide_int whose value is
> > 0xff.  Inside of max_wide_int it would be the same as the tree, but
> > then the test precision < xprecision + hbpwi never kicks in because
> > precision is guaranteed to be huge.
> >
> > Inside of addr_wide_int, I think we tank with the assertion.
>
> It should be OK for addr_wide_int too. The precision still fits 2 HWIs.
> The initial length is greater than the maximum length of an addr_wide_int,
> but your "len = MAX (len, max_len)" deals with that.
It's

  len = MIN (len, max_len)

which looked suspicious to me, but with precision >= xprecision,
precision can only be zero if xprecision is zero, which looks to
me like it cannot happen - or rather it should be fixed if it can.
> > the case actually comes up on the ppc because they do a lot of 128 bit
> > math. I think i got thru the x86-64 without noticing this.
>
> Well, it'd be suspicious if we're directly using 128-bit numbers
> in addr_wide_int. The justification for the assertion was that we
> should explicitly truncate to addr_wide_int when deliberately
> ignoring upper bits, beyond bit or byte address width. 128 bits
> definitely falls into that category on powerpc.
My question is whether with 8-bit HWI 0x00 0xff 0xff is a valid
wide-int value if it has precision 16.  AFAIK that is what the
code produces, but now Kenny says this is only for some kinds
of wide-ints but not all?  That is, why is
  inline wi::storage_ref
  wi::int_traits <const_tree>::decompose (HOST_WIDE_INT *scratch,
					  unsigned int precision,
					  const_tree x)
  {
    unsigned int len = TREE_INT_CST_NUNITS (x);
    const HOST_WIDE_INT *val
      = (const HOST_WIDE_INT *) &TREE_INT_CST_ELT (x, 0);
    return wi::storage_ref (val, len, precision);
  }
not a valid implementation together with making sure that the
INTEGER_CST tree rep has that extra word of zeros if required?
I would like to see us move in that direction (given that the
tree rep of INTEGER_CST has transitioned to variable-length already).
Btw, code such as
  tree
  wide_int_to_tree (tree type, const wide_int_ref &pcst)
  {
  ...
    unsigned int small_prec = prec & (HOST_BITS_PER_WIDE_INT - 1);
    bool recanonize = sgn == UNSIGNED
		      && small_prec
		      && (prec + HOST_BITS_PER_WIDE_INT - 1)
			 / HOST_BITS_PER_WIDE_INT == len;
definitely needs a comment. I would have thought _all_ unsigned
numbers need re-canonicalization. Well, maybe only if we're
forcing the extra zeros.
[This function shows another optimization issue:
    case BOOLEAN_TYPE:
      /* Cache false or true.  */
      limit = 2;
      if (wi::leu_p (cst, 1))
	ix = cst.to_uhwi ();

I would have expected cst <= 1 to be optimized to cst.len == 1
&& cst.val[0] <= 1.  Instead it expands to
<L27>:
  MEM[(long int *)&D.50698 + 16B] = 1;
  MEM[(struct wide_int_ref_storage *)&D.50698] = &MEM[(struct wide_int_ref_storage *)&D.50698].scratch;
  MEM[(struct wide_int_ref_storage *)&D.50698 + 8B] = 1;
  MEM[(struct wide_int_ref_storage *)&D.50698 + 12B] = 32;
  _277 = MEM[(const struct wide_int_storage *)&cst + 260B];
  if (_277 <= 64)
    goto <bb 42>;
  else
    goto <bb 43>;

<bb 42>:
  xl_491 = zext_hwi (1, 32);  // ok, checking enabled and thus out-of-line
  _494 = MEM[(const long int *)&cst];
  _495 = (long unsigned int) _494;
  yl_496 = zext_hwi (_495, _277);
  _497 = xl_491 < yl_496;
  goto <bb 44>;

<bb 43>:
  _503 = wi::ltu_p_large (&MEM[(struct wide_int_ref_storage *)&D.50698].scratch, 1, 32, &MEM[(const struct wide_int_storage *)&cst].val, len_274, _277);
this keeps D.50698 and cst un-SRAable - inline storage is problematic
for this reason. But the representation should guarantee the
compare with a low precision (32 bit) constant is evaluatable
at compile-time if len of the larger value is > 1, no?
<bb 44>:
  # _504 = PHI <_497(42), _503(43)>
  D.50698 ={v} {CLOBBER};
  if (_504 != 0)
    goto <bb 45>;
  else
    goto <bb 46>;

<bb 45>:
  pretmp_563 = MEM[(const struct wide_int_storage *)&cst + 256B];
  goto <bb 229> (<L131>);

<bb 46>:
  _65 = generic_wide_int<wide_int_storage>::to_uhwi (&cst, 0);
  ix_66 = (int) _65;
  goto <bb 91>;
The question is whether we should try to optimize wide-int for
such cases or simply not use wi::leu_p (cst, 1) but rather

  if (cst.fits_uhwi_p () && cst.to_uhwi () <= 1)

?
> Thanks,
> Richard
>
>
--
Richard Biener <[email protected]>
SUSE / SUSE Labs
SUSE LINUX Products GmbH - Nuernberg - AG Nuernberg - HRB 16746
GF: Jeff Hawn, Jennifer Guild, Felix Imendörffer