Richard Biener <rguent...@suse.de> writes:
> When investigating the code-bloat / slowness of wide-ints I noticed
> that when creating a wide_int_ref that operation (optionally, but
> in practice almost always) gets a precision input that is ultimately
> determined by get_*_result of the appropriate traits.  I don't
> see code dealing with precision mismatches, but instead for
> example sub:
>
> wi::sub (const T1 &x, const T2 &y)
> {
>   WI_BINARY_RESULT_VAR (result, val, T1, x, T2, y);
>   unsigned int precision = get_precision (result);
>   wide_int_ref xi (x, precision);
>   wide_int_ref yi (y, precision);
>
> and
>
> template <typename T1, typename T2>
> inline wide_int
> wi::int_traits <wide_int_storage>::get_binary_result (const T1 &x, const T2 &y)
> {
>   /* This shouldn't be used for two flexible-precision inputs.  */
>   STATIC_ASSERT (wi::int_traits <T1>::precision_type != FLEXIBLE_PRECISION
>                  || wi::int_traits <T2>::precision_type != FLEXIBLE_PRECISION);
>   if (wi::int_traits <T1>::precision_type == FLEXIBLE_PRECISION)
>     return wide_int::create (wi::get_precision (y));
>   else
>     return wide_int::create (wi::get_precision (x));
> }
>
> suggest that we simply take the precision of 'x', even if for example
> the precision of 'y' is larger.
>
> I'd like to get rid of the precision argument to decompose (because
> it complicates things and looks like a compound operation).

decompose is where the check for mismatched precisions happens though:

template <typename storage>
inline wi::storage_ref
wi::int_traits < generic_wide_int <storage> >::
decompose (HOST_WIDE_INT *, unsigned int precision,
           const generic_wide_int <storage> &x)
{
  gcc_checking_assert (precision == x.get_precision ());
  return wi::storage_ref (x.get_val (), x.get_len (), precision);
}

I really don't want to lose it, because it's important for type safety
in the variable-width wide_int.

For addr_wide_int and max_wide_int, the assert is compiled away
because of the way that binary_traits is defined.

> What was the reason to allow truncation / extension here (at least
> the 'tree' decompose variant asserts that we don't truncate) given
> that the canonical representation should guarantee "proper"
> extension automagically?  (well, maybe I'm still not 100% familiar
> with the wide-int representation)

Well, we need to allow extension for addr_wide_int and max_wide_int
to work, since the idea is that we can do:

   max_wide_int x = (tree) y;

even though "y" is usually going to be less precise than max_wide_int.

As mentioned in the other message, the problem case in terms of canonical
trees vs canonical wide_ints is when the top bit of an unsigned tree constant
is set.  E.g. suppose we have a 32-bit HWI and a 60-bit unsigned tree constant:

   ?8000000 00000000

If the canonical tree representation matches the representation of a 60-bit
wide_int, the "?" ought to be "f".  If the canonical tree representation
matches the representation of a >60-bit wide_int extended according to
TYPE_SIGN, the "?" ought to be 0.  Either way, the decompose routine has
to do work for one case and not the other.

If instead we have a 32-bit HWI and a 64-bit unsigned tree constant:

   80000000 00000000

and the canonical tree representation is for >64-bit wide_int extended
according to TYPE_SIGN, we need an extra zero HWI.  The decompose routine
then needs to trim the length for 64-bit wide_ints.  If instead the canonical
tree representation is for a 64-bit wide_int, the decompose routine needs
to add the zero HWI for >64-bit wide_ints.

My understanding is that with Kenny's patch, we now canonise tree constants
for the wider-precision case.

> I guess
>
> Index: wide-int.h
> ===================================================================
> --- wide-int.h  (revision 203690)
> +++ wide-int.h  (working copy)
> @@ -790,7 +790,8 @@ inline wide_int_ref_storage::wide_int_re
>  template <typename T>
>  inline wide_int_ref_storage::wide_int_ref_storage (const T &x,
>                                                    unsigned int precision)
> -  : storage_ref (wi::int_traits <T>::decompose (scratch, precision, x))
> +  : storage_ref (wi::int_traits <T>::decompose (scratch,
> +                                               wi::get_precision (x), x))
>  {
>  }
>
> should tell me reasons enough ...?  But looking at 'sub' again,
> what's the "users" way of specifying the desired precision for the
> result?  Wasn't it that the precision of both operands were
> supposed to match (apart from the "special" precisions which are
> dealt with in the traits?)?  Ok, testsuite fails in obvious way
> with the above, even with just short-cutting precision in the
> tree decompose.  It seems that the RTX decompose can
> assert that precision == get_precision (x), but not the tree
> one.  That's weird.

Not really.  The difference is that rtl constants don't have a sign,
so can't be implicitly extended to wider precisions.  There's no sensible
meaning for:

    max_wide_int x = (rtx) y;

Whereas tree constants do have a sign and so the meaning of:

    max_wide_int x = (tree) y;

is more obvious.  That's why the rtl decompose routine has a stricter assert.

If you're saying that even trees should have explicit extensions
when used with addr_wide_int and max_wide_int, then that would be
fine with me, but I thought you wanted to avoid that.

> Btw, in wi::int_traits <const_tree>::decompose you seem to assume
> that only "small_prec" integers can have non-mode precision.  But
> instead you can perfectly well have a 113 bit integer type via
> bitfields and thus 113 bit integer operations.  Also...
>
>   gcc_assert (precision >= xprecision);
> ...
>   /* Got to be careful of precision 0 values.  */
>   if (precision)
>     len = MIN (len, max_len);
> ...
>       if (precision < xprecision + HOST_BITS_PER_WIDE_INT)
>         {
>
> what's this all about?

That's a question for Kenny, but I think this part was covered by
your later message.

> Where exactly are we making use of the fact that wide-ints are
> sign-extended (are they?)?

Yeah, len < max_len wide-ints are implicitly sign-extended to
the full precision.

We're using the representation in things like wi::neg_p() (to give
just one example).

> I think we should be able to drop
> the re-canonicalization of tree constants given that we only
> ever combine two tree constants or two rtx constants (hopefully).

Do you mean re-canonicalisation in decompose?  That's not a trees
vs. rtl thing.  It's a "doing arithmetic in the precision of the inputs"
vs. "doing arithmetic in a wider precision than the inputs" thing.

Thanks,
Richard
