Re: Rework subreg_get_info

Richard Sandiford Wed, 16 Nov 2016 04:34:33 -0800

Eric Botcazou <ebotca...@adacore.com> writes:
>> This isn't intended to change the behaviour, just rewrite the
>> existing logic in a different (and hopefully clearer) way.
>
> Yes, I agree that it's an improvement.  A few remarks below.
>
>> diff --git a/gcc/rtlanal.c b/gcc/rtlanal.c
>> index ca6cced..7c0acf5 100644
>> --- a/gcc/rtlanal.c
>> +++ b/gcc/rtlanal.c
>> @@ -3601,31 +3601,28 @@ subreg_get_info (unsigned int xregno, machine_mode
>> xmode, unsigned int offset, machine_mode ymode,
>>               struct subreg_info *info)
>>  {
>> -  int nregs_xmode, nregs_ymode;
>> -  int mode_multiple, nregs_multiple;
>> -  int offset_adj, y_offset, y_offset_adj;
>> -  int regsize_xmode, regsize_ymode;
>> -  bool rknown;
>> +  unsigned int nregs_xmode, nregs_ymode;
>> 
>>    gcc_assert (xregno < FIRST_PSEUDO_REGISTER);
>> 
>> -  rknown = false;
>> +  unsigned int xsize = GET_MODE_SIZE (xmode);
>> +  unsigned int ysize = GET_MODE_SIZE (ymode);
>> +  bool rknown = false;
>> 
>>    /* If there are holes in a non-scalar mode in registers, we expect
>> -     that it is made up of its units concatenated together.  */
>> +     that it is made up of its units concatenated together.  Each scalar
>> +     unit occupies at least one register.  */
>
> Why using "scalar unit" while the first sentence uses "unit"?  Are they 
> different units?


No, the same.  I should probably have updated both.  I added "scalar"
to make it clear that we were talking about units in the GET_MODE_NUNITS
sense rather than the UNITS_PER_WORD sense (i.e. number of scalar values
rather than "unit" as an abstraction of "byte").

> What's the status of the new sentence?  Assertion?  
> Expectation?

Assertion.  It was a necessary but not sufficient condition to satisfy
the pre-patch:

      gcc_assert (hard_regno_nregs[xregno][xmode]
                  == (hard_regno_nregs[xregno][xmode_unit]
                      * GET_MODE_NUNITS (xmode)));

> I'd also make the first sentence grammatical.

Well, I think it's probably grammatical, but how about:

  If the register representation of a non-scalar mode has holes in it,
  we expect the scalar units to be concatenated together, with the holes
  distributed evenly among the scalar units.  Each scalar unit must occupy
  at least one register.

>> @@ -3651,18 +3646,17 @@ subreg_get_info (unsigned int xregno, machine_mode
>> xmode, nregs_ymode = hard_regno_nregs[xregno][ymode];
>> 
>>    /* Paradoxical subregs are otherwise valid.  */
>> -  if (!rknown
>> -      && offset == 0
>> -      && GET_MODE_PRECISION (ymode) > GET_MODE_PRECISION (xmode))
>> +  if (!rknown && offset == 0 && ysize > xsize)
>>      {
>>        info->representable_p = true;
>>        /* If this is a big endian paradoxical subreg, which uses more
>>       actual hard registers than the original register, we must
>>       return a negative offset so that we find the proper highpart
>>       of the register.  */
>> -      if (GET_MODE_SIZE (ymode) > UNITS_PER_WORD
>> -      ? REG_WORDS_BIG_ENDIAN : BYTES_BIG_ENDIAN)
>> -    info->offset = nregs_xmode - nregs_ymode;
>> +      if (REG_WORDS_BIG_ENDIAN != WORDS_BIG_ENDIAN && ysize >
>> UNITS_PER_WORD +       ? REG_WORDS_BIG_ENDIAN
>> +      : byte_lowpart_offset (ymode, xmode) != 0)
>> +    info->offset = (int) nregs_xmode - (int) nregs_ymode;
>>        else
>>      info->offset = 0;
>>        info->nregs = nregs_ymode;
>
> This part is not OK, it turns straightforward code into convoluted code.

This actually was one of the more important changes :-)  In combination
with later patches, the idea is to move away from UNITS_PER_WORD tests
when endianness is regular (all big or all little) and only do them
when the distinction between bytes and words makes a real difference.

The specific motivating examples were SVE predicate registers, which
occupy VL*2 bytes for some runtime VL.  They are smaller than a word
when VL<4, word-sized when VL==4, and bigger than a word when VL>4.
We therefore can't calculate:

  GET_MODE_SIZE (ymode) > UNITS_PER_WORD

at compile time.  This is one of the patches that avoids forcing the
issue unless the answer really matters.

>> @@ -3717,7 +3703,7 @@ subreg_get_info (unsigned int xregno, machine_mode
>> xmode, info->representable_p = true;
>>        info->nregs = nregs_ymode;
>>        info->offset = offset / regsize_ymode;
>> -      gcc_assert (info->offset + info->nregs <= nregs_xmode);
>> +      gcc_assert (info->offset + nregs_ymode <= nregs_xmode);
>>        return;
>>      }
>>      }
>
> This part is not OK, the assertion is intended to be on INFO.

This was a cheap way of avoiding having to add a cast.  I'll add the
cast instead.

>> @@ -3736,47 +3722,39 @@ subreg_get_info (unsigned int xregno, machine_mode
>> xmode, }
>>      }
>> 
>> -  /* This should always pass, otherwise we don't know how to verify
>> -     the constraint.  These conditions may be relaxed but
>> -     subreg_regno_offset would need to be redesigned.  */
>> -  gcc_assert ((GET_MODE_SIZE (xmode) % GET_MODE_SIZE (ymode)) == 0);
>> +  /* Set NUM_BLOCKS to the number of independently-representable YMODE
>> +     values there are in (reg:XMODE XREGNO).  We can view the register
>> +     as consisting of this number of independent "blocks", where each
>> +     block occupies NREGS_YMODE registers and contains exactly one
>> +     representable YMODE value.  */
>>    gcc_assert ((nregs_xmode % nregs_ymode) == 0);
>> +  unsigned int num_blocks = nregs_xmode / nregs_ymode;
>
> I find the "exactly" slightly confusing here, because it can make you think 
> that it contains the number of bytes of a YMODE value; a possible better 
> wording would be "a single" instead of "exactly one".

OK.

>> -  if (WORDS_BIG_ENDIAN != REG_WORDS_BIG_ENDIAN
>> -      && GET_MODE_SIZE (xmode) > UNITS_PER_WORD)
>> -    {
>> -      HOST_WIDE_INT xsize = GET_MODE_SIZE (xmode);
>> -      HOST_WIDE_INT ysize = GET_MODE_SIZE (ymode);
>> -      HOST_WIDE_INT off_low = offset & (ysize - 1);
>> -      HOST_WIDE_INT off_high = offset & ~(ysize - 1);
>> -      offset = (xsize - ysize - off_high) | off_low;
>> -    }
>> -  /* The XMODE value can be seen as a vector of NREGS_XMODE
>> -     values.  The subreg must represent a lowpart of given field.
>> -     Compute what field it is.  */
>> -  offset_adj = offset;
>> -  offset_adj -= subreg_lowpart_offset (ymode,
>> -                                   mode_for_size (GET_MODE_BITSIZE 
> (xmode)
>> -                                                  / nregs_xmode,
>> -                                                  MODE_INT, 0));
>> -
>> -  /* Size of ymode must not be greater than the size of xmode.  */
>> -  mode_multiple = GET_MODE_SIZE (xmode) / GET_MODE_SIZE (ymode);
>> -  gcc_assert (mode_multiple != 0);
>> -
>> -  y_offset = offset / GET_MODE_SIZE (ymode);
>> -  y_offset_adj = offset_adj / GET_MODE_SIZE (ymode);
>> -  nregs_multiple = nregs_xmode / nregs_ymode;
>> -
>> -  gcc_assert ((offset_adj % GET_MODE_SIZE (ymode)) == 0);
>> -  gcc_assert ((mode_multiple % nregs_multiple) == 0);
>> +  /* Calculate the number of bytes in each block.  This must always
>> +     be exact, otherwise we don't know how to verify the constraint.
>> +     These conditions may be relaxed but subreg_regno_offset would
>> +     need to be redesigned.  */
>> +  gcc_assert ((xsize % num_blocks) == 0);
>> +  unsigned int bytes_per_block = xsize / num_blocks;
>> +
>> +  /* Get the number of the first block that contains the subreg and the
>> byte +     offset of the subreg from the start of that block.  */
>> +  unsigned int block_number = offset / bytes_per_block;
>> +  unsigned int subblock_offset = offset % bytes_per_block;
>> 
>>    if (!rknown)
>>      {
>> -      info->representable_p = (!(y_offset_adj % (mode_multiple /
>> nregs_multiple))); +      /* Only the lowpart of each block is
>> representable.  */
>> +      info->representable_p
>> +    = (subblock_offset
>> +       == subreg_size_lowpart_offset (ysize, bytes_per_block));
>>        rknown = true;
>>      }
>> -  info->offset = (y_offset / (mode_multiple / nregs_multiple)) *
>> nregs_ymode; +
>
> This part is OK.
>
>> +  if (WORDS_BIG_ENDIAN != REG_WORDS_BIG_ENDIAN)
>> +    info->offset = (num_blocks - block_number - 1) * nregs_ymode;
>> +  else
>> +    info->offset = block_number * nregs_ymode;
>>    info->nregs = nregs_ymode;
>>  }
>
> Why was the test on the mode size dropped?

For WORDS_BIG_ENDIAN != REG_WORDS_BIG_ENDIAN targets?  In practice
the old code didn't handle the case in which a single word spans more
than one register: if xmode was bigger than a word, ymode was smaller
than a word, and the number of registers in a ymode was smaller than
the number of registers in a word, we would need to take "normal"
endianness into account to resolve the subword register offset while
using REG_WORDS_BIG_ENDIAN for the word component.  Instead the old
code reversed the endianness relative the size of ymode, regardless of
whether ymode was bigger than a word or smaller than a word.  In other
words, the assumption seems to have been that REG_WORDS_BIG_ENDIAN is
effectively "endianness across multiple registers" and there is no need
to subdivide register offsets into words and subwords.

In practice that was OK, since AFAICT no target with WORDS_BIG_ENDIAN !=
REG_WORDS_BIG_ENDIAN had subword-sized registers.  This in turn means
that "block endianness" is always word endianness for these targets.

But I suppose we should have an assert, even though it was already
an implicit assumption... :-)

Thanks,
Richard

Re: Rework subreg_get_info

Reply via email to