On Thu, 17 Oct 2013, Richard Sandiford wrote:
> Kenneth Zadeck <zad...@naturalbridge.com> writes:
> >> As mentioned in my message yesterday, I thought your new way of
> >> canonising unsigned tree constants meant that there was always an
> >> upper zero bit.  Is that right?
> > i believe this is correct.
> >> If so, xprecision < precision is a no-op, because the number always
> >> has the right form for wider precisions.  The only difficult case is
> >> xprecision == precision, since then we need to peel off any upper -1
> >> HWIs.
> > say my HWI size is 8 bits (just to keep from typing a million 'f's).
> > if i have a 16 bit unsigned number that is all 1s, in the tree world
> > it is 3 hwis:
> > 0x00 0xff 0xff.
> >
> > but inside regular wide int, it would take 1 hwi whose value is 0xff.
> > inside of max it would be the same as the tree, but then the test
> > precision < xprecision + hbpwi never kicks in because precision is
> > guaranteed to be huge.
> >
> > inside of addr_wide_int, i think we tank with the assertion.
>
> It should be OK for addr_wide_int too.  The precision still fits 2 HWIs.
> The initial length is greater than the maximum length of an addr_wide_int,
> but your "len = MAX (len, max_len)" deals with that.

It's

  len = MIN (len, max_len)

which looked suspicious to me, but with precision >= xprecision,
precision can only be zero if xprecision is zero, which looked to me
like it cannot happen - or rather, it should be fixed.

> > the case actually comes up on the ppc because they do a lot of 128 bit
> > math.  I think i got thru the x86-64 without noticing this.
>
> Well, it'd be suspicious if we're directly using 128-bit numbers
> in addr_wide_int.  The justification for the assertion was that we
> should explicitly truncate to addr_wide_int when deliberately
> ignoring upper bits, beyond bit or byte address width.  128 bits
> definitely falls into that category on powerpc.

My question is whether, with an 8-bit HWI, 0x00 0xff 0xff is a valid
wide-int value if it has precision 16.  AFAIK that is what the code
produces, but now Kenny says this is only for some kinds of wide-int
but not all?  That is, why is

inline wi::storage_ref
wi::int_traits <const_tree>::decompose (HOST_WIDE_INT *scratch,
                                        unsigned int precision,
                                        const_tree x)
{
  unsigned int len = TREE_INT_CST_NUNITS (x);
  const HOST_WIDE_INT *val
    = (const HOST_WIDE_INT *) &TREE_INT_CST_ELT (x, 0);
  return wi::storage_ref (val, len, precision);
}

not a valid implementation, together with making sure that the
INTEGER_CST tree rep has that extra word of zeros if required?  I
would like to see us move in that direction (given that the tree rep
of INTEGER_CST has transitioned to variable-length already).
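To make the representational difference concrete, here is a toy model
of the canonization rule as I understand it (my sketch, not the GCC
wide-int code; canonical_len is a made-up helper and signed char
stands in for an 8-bit HWI).  The signed canonical form drops a high
block that merely repeats the sign of the block below it; the tree
rep of an unsigned constant instead keeps an explicit zero block on
top, which is why the same 16-bit all-ones value has len 1 in one
world and len 3 in the other:

#include <stdio.h>

/* Drop high blocks that are just the sign-extension of the block
   below them; what remains is the canonical (minimal) length.  */
static unsigned int
canonical_len (const signed char *val, unsigned int len)
{
  while (len > 1 && val[len - 1] == (val[len - 2] < 0 ? -1 : 0))
    len--;
  return len;
}

int
main (void)
{
  signed char wide_int_rep[] = { -1, -1 };   /* 0xff 0xff, LSB first */
  signed char tree_rep[] = { -1, -1, 0 };    /* 0x00 0xff 0xff */
  /* Prints "1 3": sign-extension recovers the all-ones value from a
     single block, but the forced zero block on top is not redundant
     under the signed rule, so the tree form keeps all three.  */
  printf ("%u %u\n", canonical_len (wide_int_rep, 2),
          canonical_len (tree_rep, 3));
  return 0;
}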
Btw, code such as

tree
wide_int_to_tree (tree type, const wide_int_ref &pcst)
{
...
  unsigned int small_prec = prec & (HOST_BITS_PER_WIDE_INT - 1);
  bool recanonize = sgn == UNSIGNED
    && small_prec
    && (prec + HOST_BITS_PER_WIDE_INT - 1) / HOST_BITS_PER_WIDE_INT == len;

definitely needs a comment.  I would have thought _all_ unsigned
numbers need re-canonicalization.  Well, maybe only if we're forcing
the extra zeros.
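For what it's worth, here is the kind of comment I would expect,
based purely on my reading of the condition (a draft, not the
author's intent - whether it matches the intended semantics is
exactly what needs confirming):

  /* The tree representation forces an upper zero bit / extra zero
     HWIs on unsigned constants, while the wide-int representation
     is sign-extended from the precision.  So if the value is
     unsigned, its precision is not a whole number of HWIs, and its
     len reaches the partial top HWI, that top HWI may carry
     sign-extension bits above the precision and has to be zero
     extended again before building the INTEGER_CST.  */
  bool recanonize = sgn == UNSIGNED
    && small_prec
    && (prec + HOST_BITS_PER_WIDE_INT - 1) / HOST_BITS_PER_WIDE_INT == len;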
[This function shows another optimization issue:

    case BOOLEAN_TYPE:
      /* Cache false or true.  */
      limit = 2;
      if (wi::leu_p (cst, 1))
        ix = cst.to_uhwi ();

I would have expected cst <= 1 to be optimized to cst.len == 1
&& cst.val[0] <= 1.  It expands to

<L27>:
  MEM[(long int *)&D.50698 + 16B] = 1;
  MEM[(struct wide_int_ref_storage *)&D.50698]
      = &MEM[(struct wide_int_ref_storage *)&D.50698].scratch;
  MEM[(struct wide_int_ref_storage *)&D.50698 + 8B] = 1;
  MEM[(struct wide_int_ref_storage *)&D.50698 + 12B] = 32;
  _277 = MEM[(const struct wide_int_storage *)&cst + 260B];
  if (_277 <= 64)
    goto <bb 42>;
  else
    goto <bb 43>;

<bb 42>:
  xl_491 = zext_hwi (1, 32);  // ok, checking enabled and thus out-of-line
  _494 = MEM[(const long int *)&cst];
  _495 = (long unsigned int) _494;
  yl_496 = zext_hwi (_495, _277);
  _497 = xl_491 < yl_496;
  goto <bb 44>;

<bb 43>:
  _503 = wi::ltu_p_large (&MEM[(struct wide_int_ref_storage *)&D.50698].scratch,
                          1, 32,
                          &MEM[(const struct wide_int_storage *)&cst].val,
                          len_274, _277);

This keeps D.50698 and cst un-SRAable - inline storage is problematic
for this reason.  But the representation should guarantee that the
compare with a low-precision (32-bit) constant is evaluatable at
compile time if the len of the larger value is > 1, no?

<bb 44>:
  # _504 = PHI <_497(42), _503(43)>
  D.50698 ={v} {CLOBBER};
  if (_504 != 0)
    goto <bb 45>;
  else
    goto <bb 46>;

<bb 45>:
  pretmp_563 = MEM[(const struct wide_int_storage *)&cst + 256B];
  goto <bb 229> (<L131>);

<bb 46>:
  _65 = generic_wide_int<wide_int_storage>::to_uhwi (&cst, 0);
  ix_66 = (int) _65;
  goto <bb 91>;

The question is whether we should try to optimize wide-int for such
cases or simply not use wi::leu_p (cst, 1) but rather

  if (cst.fits_uhwi_p () && cst.to_uhwi () <= 1)

?

> Thanks,
> Richard

--
Richard Biener <rguent...@suse.de>
SUSE / SUSE Labs
SUSE LINUX Products GmbH - Nuernberg - AG Nuernberg - HRB 16746
GF: Jeff Hawn, Jennifer Guild, Felix Imendörffer