Re: RFA: tweak integer type used for memcpy folding

Richard Biener Tue, 22 Apr 2014 02:11:34 -0700

On Tue, Apr 22, 2014 at 10:43 AM, Richard Sandiford
<rdsandif...@googlemail.com> wrote:
> Richard Biener <richard.guent...@gmail.com> writes:
>> On Sat, Apr 19, 2014 at 9:51 AM, Richard Sandiford
>> <rdsandif...@googlemail.com> wrote:
>>> wide-int fails to build libitm because of a bad interaction between:
>>>
>>> /* Keep the OI and XI modes from confusing the compiler into thinking
>>>    that these modes could actually be used for computation.  They are
>>>    only holders for vectors during data movement.  */
>>> #define MAX_BITSIZE_MODE_ANY_INT (128)
>>>
>>> and the memcpy folding code:
>>>
>>>       /* Make sure we are not copying using a floating-point mode or
>>>          a type whose size possibly does not match its precision.  */
>>>       if (FLOAT_MODE_P (TYPE_MODE (desttype))
>>>           || TREE_CODE (desttype) == BOOLEAN_TYPE
>>>           || TREE_CODE (desttype) == ENUMERAL_TYPE)
>>>         {
>>>           /* A more suitable int_mode_for_mode would return a vector
>>>              integer mode for a vector float mode or a integer complex
>>>              mode for a float complex mode if there isn't a regular
>>>              integer mode covering the mode of desttype.  */
>>>           enum machine_mode mode = int_mode_for_mode (TYPE_MODE (desttype));
>>>           if (mode == BLKmode)
>>>             desttype = NULL_TREE;
>>>           else
>>>             desttype = build_nonstandard_integer_type (GET_MODE_BITSIZE 
>>> (mode),
>>>                                                        1);
>>>         }
>>>       if (FLOAT_MODE_P (TYPE_MODE (srctype))
>>>           || TREE_CODE (srctype) == BOOLEAN_TYPE
>>>           || TREE_CODE (srctype) == ENUMERAL_TYPE)
>>>         {
>>>           enum machine_mode mode = int_mode_for_mode (TYPE_MODE (srctype));
>>>           if (mode == BLKmode)
>>>             srctype = NULL_TREE;
>>>           else
>>>             srctype = build_nonstandard_integer_type (GET_MODE_BITSIZE 
>>> (mode),
>>>                                                       1);
>>>         }
>>>
>>> The failure occurs for complex long double, which we try to copy as
>>> a 256-bit integer type (OImode).
>>>
>>> This patch tries to do what the comment suggests by introducing a new
>>> form of int_mode_for_mode that replaces vector modes with vector modes
>>> and complex modes with complex modes.  The fallback case of using a
>>> MODE_INT is limited by MAX_FIXED_MODE_SIZE, so can never go above
>>> 128 bits on x86_64.
>>>
>>> The question then is what to do about 128-bit types for i386.
>>> MAX_FIXED_MODE_SIZE is 64 there, which says that int128_t shouldn't be
>>> used for optimisation.  However, gcc.target/i386/pr49168-1.c only passes
>>> for -m32 -msse2 because we use int128_t to copy a float128_t.
>>>
>>> I handled that by allowing MODE_VECTOR_INT to be used instead of
>>> MODE_INT if the mode size is greater than MAX_FIXED_MODE_SIZE,
>>> even if the original type wasn't a vector.
>>
>> Hmm.  Sounds reasonable unless there are very weird targets that
>> cannot efficiently load/store vectors unaligned but can handle
>> efficient load/store of unaligned scalars.
>
> Yeah, in general there's no guarantee that even int_mode_for_mode
> will return a mode with the same alignment as the original.  Callers
> need to check that (like the memcpy folder does).
>
>>> It might be that other callers to int_mode_for_mode should use
>>> the new function too, but I'll look at that separately.
>>>
>>> I used the attached testcase (with printfs added to gcc) to check that
>>> the right modes and types were being chosen.  The patch fixes the
>>> complex float and complex double cases, since the integer type that we
>>> previously picked had a larger alignment than the original complex type.
>>
>> As of complex int modes - are we positively sure that targets even
>> try to do sth "optimal" for loads/stores of those?
>
> Complex modes usually aren't handled directly by .md patterns,
> either int or float.  They're really treated as a pair of values.
> So IMO it still makes sense to fold this case.
>
>>> One possibly subtle side-effect of FLOAT_MODE_P (TYPE_MODE (desttype))
>>> is that vectors are copied as integer vectors if the target supports
>>> them directly but are copied as float vectors otherwise, since in the
>>> latter case the mode will be BLKmode.  E.g. the 1024-bit vectors in the
>>> test are copied as vector floats and vector doubles both before and
>>> after the patch.
>>
>> That wasn't intended ... the folding should have failed if we can't
>> copy using an integer mode ...
>
> Does that mean that the fold give up if TYPE_MODE is BLKmode?
> I can do that as a separate patch if so.


Looking at the code again it should always choose an integer mode/type
via setting desttype/srctype to NULL for BLKmode and

      if (!srctype)
        srctype = desttype;
      if (!desttype)
        desttype = srctype;
      if (!srctype)
        return NULL_TREE;

no?  Thus if we can't get a integer type for either src or dest then
we fail.  But we should never end up with srctype or desttype
being a float mode.  No?

Richard.


> Thanks,
> Richard

Re: RFA: tweak integer type used for memcpy folding

Reply via email to