On Fri, 23 Aug 2013, Richard Sandiford wrote:

> Hi Kenny,
>
> This is the first time I've looked at the implementation of wide-int.h
> (rather than just looking at the rtl changes, which as you know I like
> in general), so FWIW here are some comments on wide-int.h.  I expect
> a lot of them overlap with Richard B.'s comments.
>
> I also expect many of them are going to be annoying, sorry, but this
> first one definitely will.  The coding conventions say that functions
> should be defined outside the class:
>
>   http://gcc.gnu.org/codingconventions.html
>
> and that opening braces should be on their own line, so most of the
> file needs to be reformatted.  I went through and made that change
> with the patch below, in the process of reading through.  I also
> removed "SGN must be SIGNED or UNSIGNED." because it seemed redundant
> when those are the only two values available.  The patch fixes a few
> other coding-standard problems and typos, but I've not made any actual
> code changes (or at least, I didn't mean to).
>
> Does it look OK to install?
>
> I'm still unsure about these "infinite" precision types, but I
> understand the motivation and I have no objections.  However:
>
> >     * Code that does widening conversions.  The canonical way that
> >       this is performed is to sign or zero extend the input value to
> >       the max width based on the sign of the type of the source and
> >       then to truncate that value to the target type.  This is in
> >       preference to using the sign of the target type to extend the
> >       value directly (which gets the wrong value for the conversion
> >       of large unsigned numbers to larger signed types).
>
> I don't understand this particular reason.  Using the sign of the
> source type is obviously right, but why does that mean we need
> "infinite" precision, rather than just doubling the precision of
> the source?
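As a stand-alone illustration of the conversion rule quoted above -
extend according to the sign of the *source* type, then truncate to
the target - here is what the two orders do for a large unsigned value
(a sketch with plain host integers, not wide-int code):

  #include <cstdint>
  #include <cassert>

  int main ()
  {
    /* Convert unsigned 32-bit 0x80000000 to a signed 64-bit type.  */
    uint32_t src = 0x80000000u;

    /* Canonical way: extend by the sign of the SOURCE type (unsigned,
       so zero-extend), then truncate to the target width.  */
    int64_t good = (int64_t) (uint64_t) src;

    /* Extending by the sign of the TARGET type instead sign-extends
       bit 31 and gives the wrong value.  */
    int64_t bad = (int64_t) (int32_t) src;

    assert (good == INT64_C (0x80000000));
    assert (bad == -INT64_C (0x80000000));
    return 0;
  }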
The comment indeed looks redundant - of course it is not correct to
extend the value "directly".

> >     * When a constant that has an integer type is converted to a
> >       wide-int it comes in with precision 0.  For these constants
> >       the top bit does accurately reflect the sign of that constant;
> >       this is an exception to the normal rule that the signedness is
> >       not represented.  When used in a binary operation, the
> >       wide-int implementation properly extends these constants so
> >       that they properly match the other operand of the computation.
> >       This allows you to write:
> >
> >           tree t = ...
> >           wide_int x = t + 6;
> >
> >       assuming t is an int_cst.
>
> This seems dangerous.  Not all code that uses "unsigned HOST_WIDE_INT"
> actually wants it to be an unsigned value.  Some code uses it to avoid
> the undefinedness of signed overflow.  So these overloads could lead
> to us accidentally zero-extending what's conceptually a signed value
> without any obvious indication that that's happening.  Also, hex
> constants are unsigned int, but it doesn't seem safe to assume that
> 0x80000000 was meant to be zero-extended.
>
> I realise the same thing can happen if you mix "unsigned int" with
> HOST_WIDE_INT, but the point is that you shouldn't really do that
> in general, whereas we're defining these overloads precisely so that
> a mixture can be used.
>
> I'd prefer some explicit indication of the sign, at least for anything
> other than plain "int" (so that the compiler will complain about uses
> of "unsigned int" and above).

I prefer the automatic promotion - it is exactly what regular C types
do.  Now, consider

  wide_int x = ... construct 5 with precision 16 ...
  wide_int y = x + 6;

Here '6' is 'int' (precision 32), but with wide-int we treat it as
precision '0' ('infinite').  For x + 6 we then _truncate_ its precision
to that of 'x' (?) - not exactly matching C behavior (where we'd
promote 'x' to 'int', perform the add and then truncate to the
precision of 'y' - which for wide-int gets its precision from the
result of x + 6).  Mimicking C would support dropping those 'require
equal precision' asserts, but it would also require us to properly
track a sign to be able to promote properly (or, as I argued all the
time, to always properly sign-extend values so we effectively have
infinite precision anyway).

The fits_uhwi_p implementation changes scare me off that "upper bits
are undefined" thing a lot again ...  (I hate introducing
'undefinedness' into the compiler by 'design')

> >   Note that the bits above the precision are not defined and the
> >   algorithms used here are careful not to depend on their value.  In
> >   particular, values that come in from rtx constants may have random
> >   bits.

Which is a red herring.  It should be fixed.  I cannot even believe
that sentence given the uses of CONST_DOUBLE_LOW/CONST_DOUBLE_HIGH or
INTVAL/UINTVAL.  I don't see accesses masking out 'undefined' bits
anywhere.

> I have a feeling I'm rehashing a past debate, sorry, but rtx constants
> can't have random bits.  The upper bits must be a sign extension of
> the value.  There's exactly one valid rtx for each (value, mode) pair.
> If you saw something different then that sounds like a bug.  The rules
> should already be fairly well enforced though, since something like
> (const_int 128) -- or (const_int 256) -- will not match a QImode
> operand.

See.  We're saved ;)
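For concreteness, the canonical-form rule being discussed - the upper
bits must be a sign extension of the value, so there is exactly one
valid rtx per (value, mode) pair - can be illustrated stand-alone like
this (a sketch only: sext_hwi mirrors the GCC helper of the same name,
everything else is made up for the example):

  #include <cstdint>
  #include <cassert>

  typedef int64_t HOST_WIDE_INT;

  /* Sign-extend the low PREC bits of SRC: move them to the top of
     the HWI, then arithmetic-shift back down so bit PREC-1 fills
     the upper bits.  */
  static HOST_WIDE_INT
  sext_hwi (HOST_WIDE_INT src, unsigned int prec)
  {
    unsigned int shift = 64 - prec;
    return (HOST_WIDE_INT) ((uint64_t) src << shift) >> shift;
  }

  int main ()
  {
    /* For QImode (8 bits) the bit pattern 0x80 canonicalizes to -128,
       so (const_int 128) is not a valid QImode constant ...  */
    assert (sext_hwi (0x80, 8) == -128);
    /* ... while 42 is already in canonical form for QImode.  */
    assert (sext_hwi (42, 8) == 42);
    return 0;
  }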
> This is probably the part of the representation that I disagree with
> most.  There seem to be two main ways we could handle the extension
> to whole HWIs:
>
>   (1) leave the stored upper bits undefined and extend them on read
>   (2) keep the stored upper bits in extended form
>
> The patch goes for (1) but (2) seems better to me, for a few reasons:

I agree whole-heartedly.

> * As above, constants coming from rtl are already in the right form,
>   so if you create a wide_int from an rtx and only query it, no
>   explicit extension is needed.
>
> * Things like logical operations and right shifts naturally preserve
>   the sign-extended form, so only a subset of write operations need
>   to take special measures.
>
> * You have a public interface that exposes the underlying HWIs
>   (which is fine with me FWIW), so it seems better to expose a
>   fully-defined HWI rather than only a partially-defined HWI.
>
> E.g. zero_p is:
>
>   HOST_WIDE_INT x;
>
>   if (precision && precision < HOST_BITS_PER_WIDE_INT)
>     x = sext_hwi (val[0], precision);
>   else if (len == 0)
>     {
>       gcc_assert (precision == 0);
>       return true;
>     }
>   else
>     x = val[0];
>
>   return len == 1 && x == 0;
>
> but I think it really ought to be just:
>
>   return len == 1 && val[0] == 0;

Yes!  But then - what value does keeping track of a 'precision' have
in this case?  It seems to me it's only a "convenient carrier" for

  wide_int x = wide-int-from-RTX (y);
  machine_mode saved_mode = mode-available? GET_MODE (y) : magic-mode;
  ... process x ...
  RTX = RTX-from-wide_int (x, saved_mode);

that is, wide-int doesn't do anything with 'precision', but you can
extract it later so you don't need to remember the mode you were
interested in?  Oh, and of course some operations require a
'precision', like rotate.

> >   When the precision is 0, all the bits in the LEN elements of
> >   VEC are significant with no undefined bits.  Precisionless
> >   constants are limited to being one or two HOST_WIDE_INTs.  When
> >   two are used the upper value is 0, and the high order bit of the
> >   first value is set.  (Note that this may need to be generalized
> >   if it is ever necessary to support 32-bit HWIs again.)
>
> I didn't understand this.  When are two HOST_WIDE_INTs needed for
> "precision 0"?

For the wide_int containing unsigned HOST_WIDE_INT ~0.  As we
sign-extend the representation (heh, yes, we do or should!) we require
an extra HWI to store the fact that ~0 is unsigned.

> The main thing that's changed since the early patches is that we now
> have a mixture of wide-int types.  This seems to have led to a lot of
> boiler-plate forwarding functions (or at least it felt like that while
> moving them all out of the class).  And that in turn seems to be
> because you're trying to keep everything as member functions.  E.g. a
> lot of the forwarders are from a member function to a static function.
>
> Wouldn't it be better to have the actual classes be light-weight,
> with little more than accessors, and do the actual work with
> non-member template functions?  There seem to be 3 grades of wide-int:
>
>   (1) read-only, constant precision (from int, etc.)
>   (2) read-write, constant precision (fixed_wide_int)
>   (3) read-write, variable precision (wide_int proper)
>
> but we should be able to hide that behind templates, with compiler
> errors if you try to write to (1), etc.

Yeah, I'll probably try to clean up the implementation once I've got
past recovering from two months without GCC ...

Richard.
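To make the "three grades" idea above concrete, here is a toy sketch
(every name is invented; nothing here is taken from the actual patch)
of light-weight classes whose work is done by one non-member template,
where using the read-only grade as a result fails to compile:

  #include <cstdint>
  #include <cassert>

  typedef int64_t HOST_WIDE_INT;

  /* Grade 1: read-only, constant precision (wraps a host constant;
     note it has no mutator at all).  */
  struct hwi_view
  {
    HOST_WIDE_INT val;
    HOST_WIDE_INT get () const { return val; }
  };

  /* Grade 2: read-write, precision fixed at compile time.  */
  template <unsigned int PREC>
  struct fixed_int
  {
    HOST_WIDE_INT val;
    HOST_WIDE_INT get () const { return val; }
    void set (HOST_WIDE_INT v) { val = v; }
  };

  /* Grade 3: read-write, precision chosen at run time.  */
  struct var_int
  {
    HOST_WIDE_INT val;
    unsigned int precision;
    HOST_WIDE_INT get () const { return val; }
    void set (HOST_WIDE_INT v) { val = v; }
  };

  /* The actual work lives in a non-member template instead of
     per-class member functions, so no forwarders are needed.  Using
     a grade-1 type as the result is a compile error: no set ().  */
  template <typename R, typename A, typename B>
  void
  add (R &result, const A &a, const B &b)
  {
    result.set (a.get () + b.get ());
  }

  int main ()
  {
    hwi_view six = { 6 };
    var_int x = { 5, 16 };
    var_int y = { 0, 16 };
    add (y, x, six);            /* fine: y is read-write */
    assert (y.get () == 11);
    /* add (six, x, y);   -- would not compile: hwi_view has no set () */
    return 0;
  }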