On Thu, Aug 11, 2016 at 6:12 PM, Uros Bizjak <ubiz...@gmail.com> wrote:
> On Thu, Aug 11, 2016 at 5:51 PM, H.J. Lu <hjl.to...@gmail.com> wrote:
>
>>>>>>>>> Use TImode for piecewise move in 64-bit mode.  When vector register
>>>>>>>>> is used for piecewise move, we don't increase stack_alignment_needed
>>>>>>>>> since vector register spill isn't required for piecewise move.  Since
>>>>>>>>> stack_realign_needed is set to true by checking
>>>>>>>>> stack_alignment_estimated set by pseudo vector register usage, we also
>>>>>>>>> need to check stack_realign_needed to eliminate frame pointer.
>>>>>>>>
>>>>>>>> Why only in 64-bit mode? We can use SSE moves also in 32-bit mode.
>>>>>>>
>>>>>>> I will extend it to 32-bit mode.
>>>>>>
>>>>>> It doesn't work in 32-bit mode due to
>>>>>>
>>>>>> #define MAX_FIXED_MODE_SIZE GET_MODE_BITSIZE (TARGET_64BIT ? TImode : DImode):
>>>>>>
>>>>>> /export/build/gnu/gcc/build-x86_64-linux/gcc/xgcc
>>>>>> -B/export/build/gnu/gcc/build-x86_64-linux/gcc/ -O2
>>>>>> -fno-asynchronous-unwind-tables -m32 -S -o x.s x.i
>>>>>> x.i: In function ‘foo’:
>>>>>> x.i:6:10: internal compiler error: in by_pieces_ninsns, at expr.c:799
>>>>>>    return __builtin_mempcpy (dst, src, 32);
>>>>>>           ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
>>>>>
>>>>> This happens since by_pieces_ninsns determines widest mode by calling
>>>>> widest_*INT*_mode_for_size, while moves can also use vector-mode
>>>>> moves. This is an infrastructure problem, and will bite you on 64bit
>>>>> targets when MOVE_MAX_PIECES returns OImode or XImode size.
>>>>
>>>> I opened:
>>>>
>>>> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=74113
>>>>
>>>>> +#define MOVE_MAX_PIECES \
>>>>> +  ((TARGET_64BIT \
>>>>> +    && TARGET_SSE2 \
>>>>> +    && TARGET_SSE_UNALIGNED_LOAD_OPTIMAL \
>>>>> +    && TARGET_SSE_UNALIGNED_STORE_OPTIMAL) ? 16 : UNITS_PER_WORD)
>>>>>
>>>>> The above part is OK with an appropriate ??? comment, describing the
>>>>> infrastructure limitation. Also, please use GET_MODE_SIZE (TImode)
>>>>> instead of magic constant.
>>>>>
>>>>> Can you please submit the realignment patch as a separate follow-up
>>>>> patch? Let's keep two issues separate.
>>>>>
>>>>> Uros.
>>>>
>>>> Here is the updated patch.  OK for trunk?
>>>
>>> OK, but please do not yet introduce:
>>>
>>> +/* No need to dynamically realign the stack here.  */
>>> +/* { dg-final { scan-assembler-not "and\[^\n\r]*%\[re\]sp" } } */
>>> +/* Nor use a frame pointer.  */
>>> +/* { dg-final { scan-assembler-not "%\[re\]bp" } } */
>>>
>>> in the testcases. This should be part of a followup patch.
>>
>> This is what I checked in.
>
> Playing a bit with a patched gcc, I found no stack realignment insns
> in the assembly of the provided testcases. However, if
> -mincoming-stack-boundary=3 is added, then no vector instructions are
> generated (and also no realignment insns).

Ah yes, the STV pass is disabled for -mincoming-stack-boundary={2,3}.

It looks like we don't need an extra realignment patch.

Uros.
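
For readers who want to repeat the experiment above, here is a minimal
sketch of the kind of testcase under discussion. It is a reconstruction,
not the actual x.i: only the __builtin_mempcpy line appears in the ICE
output quoted in the thread, so the function signature and parameter
types are assumptions.

```c
/* Hypothetical reconstruction of the testcase; only the
   __builtin_mempcpy call is taken from the thread, the
   signature is assumed.  */

/* Copy a fixed 32 bytes.  With a constant size this small, GCC may
   expand the copy inline "by pieces" instead of calling the library.
   Like mempcpy, this returns a pointer just past the last byte
   written, i.e. dst + 32.  */
void *
foo (void *dst, const void *src)
{
  return __builtin_mempcpy (dst, src, 32);
}
```

Compiling this with -O2 -S, once as-is and once with
-mincoming-stack-boundary=3 added, and then looking in the assembly for
vector moves and for "and ..., %rsp" / "and ..., %esp" realignment
instructions, should be one way to check the observation above. The
result will depend on the compiler revision and target flags.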
