Richard Biener <richard.guent...@gmail.com> writes:
> @@ -958,6 +961,12 @@ streamer_write_tree_header (struct output_block
> *ob, tree expr)
>      streamer_write_uhwi (ob, BINFO_N_BASE_BINFOS (expr));
>    else if (TREE_CODE (expr) == CALL_EXPR)
>      streamer_write_uhwi (ob, call_expr_nargs (expr));
> +  else if (CODE_CONTAINS_STRUCT (code, TS_INT_CST))
> +    {
> +      gcc_checking_assert (TREE_INT_CST_NUNITS (expr));
> +      streamer_write_uhwi (ob, TREE_INT_CST_NUNITS (expr));
> +      streamer_write_uhwi (ob, TREE_INT_CST_EXT_NUNITS (expr));
> +    }
>
> isn't ext_nunits always either nunits or nunits + 1?  So it should be
> sufficient to write a single bit in the tree_base bitpack and only
> write the nunits that are required for the actual allocation in
> write_tree_header, not both.
ext_nunits can be > nunits + 1.  E.g. imagine a 256-bit all-1s unsigned
integer.  As a 256-bit number it's simply { -1 }, with all other bits
being sign-extensions of the -1 HWI.  As a >256-bit number it's
{ -1, -1, -1, -1, 0 }, assuming 64-bit HWIs.

> index 4a24c66..f3e0ffe 100644
> --- a/gcc/tree-vrp.c
> +++ b/gcc/tree-vrp.c
> @@ -1141,15 +1142,7 @@ operand_less_p (tree val, tree val2)
>  {
>    /* LT is folded faster than GE and others.  Inline the common case.  */
>    if (TREE_CODE (val) == INTEGER_CST && TREE_CODE (val2) == INTEGER_CST)
> -    {
> -      if (TYPE_UNSIGNED (TREE_TYPE (val)))
> -        return INT_CST_LT_UNSIGNED (val, val2);
> -      else
> -        {
> -          if (INT_CST_LT (val, val2))
> -            return 1;
> -        }
> -    }
> +    return INT_CST_LT (val, val2);
>    else
>      {
>        tree tcmp;
>
> Note that INT_CST_LT and INT_CST_LT_UNSIGNED were supposed to
> be very fast.  Now you made them
>
> #define INT_CST_LT(A, B) (wi::lts_p (wi::to_widest (A), wi::to_widest (B)))
>
> which a) doesn't look the same to me and b) hides complexity.

Not sure what complexity you mean here: to_widest just reads the
constant in its "infinite-precision" form.

I'm not sure:

> So why not write
>
>   if (TREE_CODE (val) == INTEGER_CST && TREE_CODE (val2) == INTEGER_CST)
>     if (TYPE_UNSIGNED (...)
>       return wi::ltu_p (val, val2);
>     else
>       return wi::lts_p (val, val2);
>
> ?

...this would be faster in practice, since the fast path in lts_p is
length-based and can handle small integers quickly whatever their
precision.  The second form means that VAL and VAL2 must have the same
precision.  I don't know whether that's a good thing or not in this
instance.
> @@ -6966,12 +6975,7 @@ tree_int_cst_lt (const_tree t1, const_tree t2)
>  int
>  tree_int_cst_compare (const_tree t1, const_tree t2)
>  {
> -  if (tree_int_cst_lt (t1, t2))
> -    return -1;
> -  else if (tree_int_cst_lt (t2, t1))
> -    return 1;
> -  else
> -    return 0;
> +  return wi::cmps (wi::to_widest (t1), wi::to_widest (t2));
>  }
>
>  /* Return true if T is an INTEGER_CST whose numerical value (extended
>
> How does this work for comparing UHWI_MAX to 0 for example?  Looking at
>
> template <typename T1, typename T2>
> inline int
> wi::cmps (const T1 &x, const T2 &y)
> {
>   unsigned int precision = get_binary_precision (x, y);
>   WIDE_INT_REF_FOR (T1) xi (x, precision);
>   WIDE_INT_REF_FOR (T2) yi (y, precision);
>   if (precision <= HOST_BITS_PER_WIDE_INT)
>     {
>       HOST_WIDE_INT xl = xi.to_shwi ();
>       HOST_WIDE_INT yl = yi.to_shwi ();
>       if (xl < yl)
>         return -1;
>
> this seems to happily return -1 instead of 1?  Similar issues elsewhere
> where you change compares to unconditionally perform signed compares.
> (it's at least non-obvious to me).
>
> Ah, the to_widest () will make it XImode * 4 extended and thus
> get_precision returns 2048 (or so) so we don't take the shortcut
> (which means it's slow).
>
> Shouldn't the shortcuts be taken on 'len' == 1 rather than
> precision <= HWI?

I suppose we could use the same approach as for lts_p, if that seems OK
to you.

> Side-note: we have too many ways to compare trees / integers

Do you mean in wide-int, or in the tree routines?  At the wide-int level
there are only really two ways: compare trees as sequences of bits
(ignoring their sign) and compare trees as "infinite-precision" numbers.

Thanks,
Richard