Re: [Patch, RTL] Eliminate redundant vec_select moves.

H.J. Lu Sat, 14 Dec 2013 08:33:03 -0800

On Wed, Dec 11, 2013 at 7:49 AM, Richard Sandiford
<rdsandif...@googlemail.com> wrote:
> "H.J. Lu" <hjl.to...@gmail.com> writes:
>> On Wed, Dec 11, 2013 at 1:13 AM, Richard Sandiford
>> <rdsandif...@googlemail.com> wrote:
>>> Richard Henderson <r...@redhat.com> writes:
>>>> On 12/10/2013 10:44 AM, Richard Sandiford wrote:
>>>>> Sorry, I don't understand.  I never said it was invalid.  I said
>>>>> (subreg:SF (reg:V4SF X) 1) was invalid if (reg:V4SF X) represents
>>>>> a single register.  On a little-endian target, the offset cannot be
>>>>> anything other than 0 in that case.
>>>>>
>>>>> So the CANNOT_CHANGE_MODE_CLASS code above seems to be checking for
>>>>> something that is always invalid, regardless of the target.  That kind
>>>>> of situation should be rejected by target-independent code instead.
>>>>
>>>> But, we want to disable the subreg before we know whether or not (reg:V4SF X)
>>>> will be allocated to a single hard register.  That is something that we can't
>>>> know in target-independent code before register allocation.
>>>
>>> I was thinking that if we've got a class, we've also got things like
>>> CLASS_MAX_NREGS.  Maybe that doesn't cope with padding properly though.
>>> But even in the padding cases an offset-based check in C_C_M_C could
>>> be derived from other information.
>>>
>>> subreg_get_info handles padding with:
>>>
>>>       nregs_xmode = HARD_REGNO_NREGS_WITH_PADDING (xregno, xmode);
>>>       if (GET_MODE_INNER (xmode) == VOIDmode)
>>>         xmode_unit = xmode;
>>>       else
>>>         xmode_unit = GET_MODE_INNER (xmode);
>>>       gcc_assert (HARD_REGNO_NREGS_HAS_PADDING (xregno, xmode_unit));
>>>       gcc_assert (nregs_xmode
>>>                   == (GET_MODE_NUNITS (xmode)
>>>                       * HARD_REGNO_NREGS_WITH_PADDING (xregno, xmode_unit)));
>>>       gcc_assert (hard_regno_nregs[xregno][xmode]
>>>                   == (hard_regno_nregs[xregno][xmode_unit]
>>>                       * GET_MODE_NUNITS (xmode)));
>>>
>>>       /* You can only ask for a SUBREG of a value with holes in the middle
>>>          if you don't cross the holes.  (Such a SUBREG should be done by
>>>          picking a different register class, or doing it in memory if
>>>          necessary.)  An example of a value with holes is XCmode on 32-bit
>>>          x86 with -m128bit-long-double; it's represented in 6 32-bit registers,
>>>          3 for each part, but in memory it's two 128-bit parts.
>>>          Padding is assumed to be at the end (not necessarily the 'high part')
>>>          of each unit.  */
>>>       if ((offset / GET_MODE_SIZE (xmode_unit) + 1
>>>            < GET_MODE_NUNITS (xmode))
>>>           && (offset / GET_MODE_SIZE (xmode_unit)
>>>               != ((offset + GET_MODE_SIZE (ymode) - 1)
>>>                   / GET_MODE_SIZE (xmode_unit))))
>>>         {
>>>           info->representable_p = false;
>>>           rknown = true;
>>>         }
>>>
>>> and I wouldn't really want to force targets to individually reproduce
>>> that kind of logic at the class level.  If the worst comes to the worst
>>> we could cache the difficult cases.
>>>
>>
>> In my case, x86 CANNOT_CHANGE_MODE_CLASS only needs to know whether
>> the subreg byte is zero or not.  It doesn't care about mode padding.
>> You are concerned that the information passed to
>> CANNOT_CHANGE_MODE_CLASS is too expensive for the target to
>> process.  That isn't the case for x86.
>
> No, I'm concerned that by going this route, we're forcing every target
> (or at least every target with wider-than-word registers, which is most
> of the common ones) to implement the same target-independent restriction.
> This is not an x86-specific issue.
>
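
For reference, the hole-crossing test in the subreg_get_info excerpt
quoted above can be tried in isolation.  This is only a rough sketch:
the function name and its plain-integer parameters are hypothetical
stand-ins for GET_MODE_SIZE (xmode_unit), GET_MODE_NUNITS (xmode) and
GET_MODE_SIZE (ymode); it is not code from GCC.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical helper, not part of GCC.  Return true if a subreg of
   YSIZE bytes at byte OFFSET starts in a unit other than the last one
   and ends in a different unit, i.e. it would cross the padding at the
   end of a unit.  UNIT_SIZE is the size in bytes of one unit of the
   outer value and NUNITS is the number of units.  */
static bool
sk_subreg_crosses_hole (unsigned int offset, unsigned int ysize,
                        unsigned int unit_size, unsigned int nunits)
{
  assert (unit_size > 0 && ysize > 0);

  unsigned int first_unit = offset / unit_size;
  unsigned int last_unit = (offset + ysize - 1) / unit_size;

  return first_unit + 1 < nunits && first_unit != last_unit;
}
```

Taking the XCmode example from the comment (two 16-byte units), a
4-byte subreg at offset 0 stays inside the first unit, while an 8-byte
subreg at offset 12 would start in the first unit and end in the
second, so it is flagged.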


It may not be x86-specific.  However, the decision is made
based on enum reg_class:

/* Return true if the registers in REGCLASS cannot represent the change
   from mode FROM at offset SUBREG_BYTE to mode TO.  */

bool
ix86_cannot_change_mode_class (enum machine_mode from,
                               unsigned int subreg_byte,
                               enum machine_mode to,
                               enum reg_class regclass)
{
  if (from == to)
    return false;

  /* x87 registers can't do subreg at all, as all values are reformatted
     to extended precision.  */
  if (MAYBE_FLOAT_CLASS_P (regclass))
    return true;

  if (MAYBE_SSE_CLASS_P (regclass) || MAYBE_MMX_CLASS_P (regclass))
    {
      /* Vector registers do not support QI or HImode loads.  If we don't
         disallow a change to these modes, reload will assume it's ok to
         drop the subreg from (subreg:SI (reg:HI 100) 0).  This affects
         the vec_dupv4hi pattern.  */
      if (GET_MODE_SIZE (from) < 4)
        return true;

      /* Vector registers do not support subreg with nonzero offsets, which
         are otherwise valid for integer registers.  */
      if (subreg_byte != 0 && GET_MODE_SIZE (to) < GET_MODE_SIZE (from))
        return true;
    }

  return false;
}

We check subreg_byte only for SSE or MMX register classes.
We could add a target-independent hook or add subreg_byte to
CANNOT_CHANGE_MODE_CLASS like my patch does.
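
To make the effect of the subreg_byte check concrete, here is a
minimal standalone sketch of the same decision logic.  Everything in
it is an assumption made for illustration: the sk_ enums, the size
table and the function name are hypothetical stand-ins for
enum machine_mode, enum reg_class and GET_MODE_SIZE, and the MMX case
is folded into the SSE class.  It is not the patch itself.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins for a few machine modes and their sizes in
   bytes; GCC's real enum machine_mode is far larger.  */
enum sk_mode { SK_HI, SK_SI, SK_DI, SK_SF, SK_V4SF, SK_NUM_MODES };
static const unsigned int sk_mode_size[SK_NUM_MODES] = { 2, 4, 8, 4, 16 };

/* Hypothetical stand-ins for register classes.  */
enum sk_reg_class { SK_GENERAL_REGS, SK_X87_REGS, SK_SSE_REGS };

/* Mirror the structure of ix86_cannot_change_mode_class above, using
   the simplified types.  Return true if the mode change is rejected.  */
static bool
sk_cannot_change_mode_class (enum sk_mode from, unsigned int subreg_byte,
                             enum sk_mode to, enum sk_reg_class rclass)
{
  assert (from < SK_NUM_MODES && to < SK_NUM_MODES);

  if (from == to)
    return false;

  /* x87 registers reject every mode change.  */
  if (rclass == SK_X87_REGS)
    return true;

  if (rclass == SK_SSE_REGS)
    {
      /* No changes from modes narrower than 4 bytes.  */
      if (sk_mode_size[from] < 4)
        return true;

      /* No narrowing subreg at a nonzero byte offset.  */
      if (subreg_byte != 0 && sk_mode_size[to] < sk_mode_size[from])
        return true;
    }

  return false;
}
```

With these definitions, (subreg:SF (reg:V4SF X) 0) in the SSE class is
accepted and (subreg:SF (reg:V4SF X) 4) is rejected, while
(subreg:SI (reg:DI X) 4) in the general class is still accepted.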


-- 
H.J.
