On Wed, Aug 10, 2016 at 6:44 PM, H.J. Lu <hjl.to...@gmail.com> wrote:
>>>> Use TImode for piecewise move in 64-bit mode.  When vector register
>>>> is used for piecewise move, we don't increase stack_alignment_needed
>>>> since vector register spill isn't required for piecewise move.  Since
>>>> stack_realign_needed is set to true by checking stack_alignment_estimated
>>>> set by pseudo vector register usage, we also need to check
>>>> stack_realign_needed to eliminate frame pointer.
>>>
>>> Why only in 64-bit mode? We can use SSE moves also in 32-bit mode.
>>
>> I will extend it to 32-bit mode.
>
> It doesn't work in 32-bit mode due to
>
> #define MAX_FIXED_MODE_SIZE GET_MODE_BITSIZE (TARGET_64BIT ? TImode : DImode):
>
> /export/build/gnu/gcc/build-x86_64-linux/gcc/xgcc
> -B/export/build/gnu/gcc/build-x86_64-linux/gcc/ -O2
> -fno-asynchronous-unwind-tables -m32 -S -o x.s x.i
> x.i: In function ‘foo’:
> x.i:6:10: internal compiler error: in by_pieces_ninsns, at expr.c:799
>    return __builtin_mempcpy (dst, src, 32);
>           ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

This happens because by_pieces_ninsns determines the widest mode by
calling widest_*INT*_mode_for_size, while moves can also use vector
modes. This is an infrastructure problem, and it will also bite you on
64-bit targets when MOVE_MAX_PIECES returns the size of OImode or XImode.

+#define MOVE_MAX_PIECES \
+  ((TARGET_64BIT \
+    && TARGET_SSE2 \
+    && TARGET_SSE_UNALIGNED_LOAD_OPTIMAL \
+    && TARGET_SSE_UNALIGNED_STORE_OPTIMAL) ? 16 : UNITS_PER_WORD)

The above part is OK with an appropriate ??? comment describing the
infrastructure limitation. Also, please use GET_MODE_SIZE (TImode)
instead of the magic constant.

Can you please submit the realignment patch as a separate follow-up
patch? Let's keep the two issues separate.

Uros.