On Wed, Aug 10, 2016 at 6:44 PM, H.J. Lu <hjl.to...@gmail.com> wrote:

>>>> Use TImode for piecewise move in 64-bit mode.  When vector register
>>>> is used for piecewise move, we don't increase stack_alignment_needed
>>>> since vector register spill isn't required for piecewise move.  Since
>>>> stack_realign_needed is set to true by checking stack_alignment_estimated
>>>> set by pseudo vector register usage, we also need to check
>>>> stack_realign_needed to eliminate frame pointer.
>>>
>>> Why only in 64-bit mode? We can use SSE moves also in 32-bit mode.
>>
>> I will extend it to 32-bit mode.
>
> It doesn't work in 32-bit mode due to
>
> #define MAX_FIXED_MODE_SIZE GET_MODE_BITSIZE (TARGET_64BIT ? TImode : DImode):
>
> /export/build/gnu/gcc/build-x86_64-linux/gcc/xgcc
> -B/export/build/gnu/gcc/build-x86_64-linux/gcc/ -O2
> -fno-asynchronous-unwind-tables -m32 -S -o x.s x.i
> x.i: In function ‘foo’:
> x.i:6:10: internal compiler error: in by_pieces_ninsns, at expr.c:799
>    return __builtin_mempcpy (dst, src, 32);
>           ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

This happens because by_pieces_ninsns determines the widest mode by
calling widest_*INT*_mode_for_size, while piecewise moves can also use
vector modes. This is an infrastructure problem, and it will also bite
you on 64-bit targets when MOVE_MAX_PIECES returns an OImode or XImode
size.
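
To illustrate the mismatch, here is a toy model. It is not GCC's code:
every name in it (toy_mode, toy_widest_mode, toy_count_moves) is
hypothetical, and the mode table is an assumed target whose widest
integer move is 4 bytes but which also has a 16-byte vector move. It
only shows how an integer-only mode search never sees the vector mode
that the advertised piece size relies on.

```c
#include <stddef.h>

/* Toy model only: these names are hypothetical and do not exist in GCC.  */
enum toy_class { TOY_INT, TOY_VECTOR };

struct toy_mode
{
  const char *name;
  enum toy_class cls;
  unsigned size;		/* Size in bytes.  */
};

/* Modes for which the assumed target has a move instruction.  */
static const struct toy_mode toy_modes[] = {
  { "QI", TOY_INT, 1 },
  { "HI", TOY_INT, 2 },
  { "SI", TOY_INT, 4 },
  { "V16QI", TOY_VECTOR, 16 },
};

/* Return the widest mode strictly smaller than MAX_SIZE bytes, or NULL
   if there is none.  If INT_ONLY is nonzero, vector modes are skipped.  */
static const struct toy_mode *
toy_widest_mode (unsigned max_size, int int_only)
{
  const struct toy_mode *best = NULL;
  size_t n = sizeof toy_modes / sizeof toy_modes[0];

  for (size_t i = 0; i < n; i++)
    {
      const struct toy_mode *m = &toy_modes[i];
      if (int_only && m->cls != TOY_INT)
	continue;
      if (m->size < max_size && (best == NULL || m->size > best->size))
	best = m;
    }
  return best;
}

/* Count the moves needed to copy LEN bytes with pieces of at most
   MAX_PIECE bytes, always trying the widest remaining mode first.  */
static unsigned
toy_count_moves (unsigned len, unsigned max_piece, int int_only)
{
  unsigned n_insns = 0;
  unsigned max_size = max_piece + 1;

  while (max_size > 1 && len > 0)
    {
      const struct toy_mode *m = toy_widest_mode (max_size, int_only);
      if (m == NULL)
	break;
      n_insns += len / m->size;
      len %= m->size;
      max_size = m->size;
    }
  return n_insns;
}
```

With a 16-byte piece limit and a 32-byte copy, the integer-only search
in this model falls back to 4-byte moves, while a search that also
considers vector modes uses the 16-byte move, so the two disagree about
what the move sequence looks like.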

+#define MOVE_MAX_PIECES \
+  ((TARGET_64BIT \
+    && TARGET_SSE2 \
+    && TARGET_SSE_UNALIGNED_LOAD_OPTIMAL \
+    && TARGET_SSE_UNALIGNED_STORE_OPTIMAL) ? 16 : UNITS_PER_WORD)

The above part is OK with an appropriate ??? comment describing the
infrastructure limitation. Also, please use GET_MODE_SIZE (TImode)
instead of the magic constant 16.
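
Roughly what I have in mind is sketched below. This is only an
illustration of the two requests, not a tested hunk; the comment
wording is an assumption and should be adjusted to describe the
limitation accurately. It is a fragment for the target header and does
not build on its own.

```c
/* ??? by_pieces_ninsns selects the widest mode with
   widest_int_mode_for_size, so only integer modes are considered,
   although piecewise moves could also use vector modes.  Until that
   infrastructure limitation is addressed, limit the piece size to
   TImode, and only in 64-bit mode.  */
#define MOVE_MAX_PIECES \
  ((TARGET_64BIT \
    && TARGET_SSE2 \
    && TARGET_SSE_UNALIGNED_LOAD_OPTIMAL \
    && TARGET_SSE_UNALIGNED_STORE_OPTIMAL) \
   ? GET_MODE_SIZE (TImode) : UNITS_PER_WORD)
```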

Can you please submit the realignment patch as a separate follow-up
patch? Let's keep the two issues separate.

Uros.
