On Thu, Aug 11, 2016 at 5:26 PM, H.J. Lu <hjl.to...@gmail.com> wrote:
> On Thu, Aug 11, 2016 at 1:16 AM, Uros Bizjak <ubiz...@gmail.com> wrote:
>> On Wed, Aug 10, 2016 at 6:44 PM, H.J. Lu <hjl.to...@gmail.com> wrote:
>>
>>>>>> Use TImode for piecewise move in 64-bit mode. When vector register
>>>>>> is used for piecewise move, we don't increase stack_alignment_needed
>>>>>> since vector register spill isn't required for piecewise move. Since
>>>>>> stack_realign_needed is set to true by checking stack_alignment_estimated
>>>>>> set by pseudo vector register usage, we also need to check
>>>>>> stack_realign_needed to eliminate frame pointer.
>>>>>
>>>>> Why only in 64-bit mode? We can use SSE moves also in 32-bit mode.
>>>>
>>>> I will extend it to 32-bit mode.
>>>
>>> It doesn't work in 32-bit mode due to
>>>
>>> #define MAX_FIXED_MODE_SIZE GET_MODE_BITSIZE (TARGET_64BIT ? TImode :
>>> DImode):
>>>
>>> /export/build/gnu/gcc/build-x86_64-linux/gcc/xgcc
>>> -B/export/build/gnu/gcc/build-x86_64-linux/gcc/ -O2
>>> -fno-asynchronous-unwind-tables -m32 -S -o x.s x.i
>>> x.i: In function ‘foo’:
>>> x.i:6:10: internal compiler error: in by_pieces_ninsns, at expr.c:799
>>> return __builtin_mempcpy (dst, src, 32);
>>> ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
>>
>> This happens since by_pieces_ninsns determines widest mode by calling
>> widest_*INT*_mode_for_size, while moves can also use vector-mode
>> moves. This is an infrastructure problem, and will bite you on 64bit
>> targets when MOVE_MAX_PIECES returns OImode or XImode size.
>
> I opened:
>
> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=74113
>
>> +#define MOVE_MAX_PIECES \
>> + ((TARGET_64BIT \
>> + && TARGET_SSE2 \
>> + && TARGET_SSE_UNALIGNED_LOAD_OPTIMAL \
>> + && TARGET_SSE_UNALIGNED_STORE_OPTIMAL) ? 16 : UNITS_PER_WORD)
>>
>> The above part is OK with an appropriate ??? comment, describing the
>> infrastructure limitation. Also, please use GET_MODE_SIZE (TImode)
>> instead of magic constant.
>>
>> Can you please submit the realignment patch as a separate follow-up
>> patch? Let's keep two issues separate.
>>
>> Uros.
>
> Here is the updated patch. OK for trunk?

OK, but please do not yet introduce:

+/* No need to dynamically realign the stack here. */
+/* { dg-final { scan-assembler-not "and\[^\n\r]*%\[re\]sp" } } */
+/* Nor use a frame pointer. */
+/* { dg-final { scan-assembler-not "%\[re\]bp" } } */

in the testcases. This should be part of a followup patch.

Thanks,
Uros.
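
For readers following the thread, the selection logic of the proposed MOVE_MAX_PIECES can be modelled outside GCC roughly as below. This is only a sketch under stated assumptions, not the committed patch: `struct sim_target`, `SIM_TIMODE_SIZE`, `SIM_UNITS_PER_WORD` and `sim_move_max_pieces` are hypothetical stand-ins for the back-end flags and macros quoted above, and the sizes assume the usual x86 values (a 16-byte TImode, 8-byte words in 64-bit mode, 4-byte words in 32-bit mode). The named constant takes the place of the magic 16, in the spirit of the GET_MODE_SIZE (TImode) request.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the target flags that the real macro reads
   from GCC's i386 back end (TARGET_64BIT, TARGET_SSE2, ...).  */
struct sim_target
{
  bool is_64bit;
  bool sse2;
  bool unaligned_load_optimal;
  bool unaligned_store_optimal;
};

/* Sizes in bytes.  TImode is a 128-bit integer mode; on x86 a word is
   8 bytes in 64-bit mode and 4 bytes in 32-bit mode.  */
#define SIM_TIMODE_SIZE 16
#define SIM_UNITS_PER_WORD(t) ((t)->is_64bit ? 8 : 4)

/* ??? Models the proposed MOVE_MAX_PIECES.  The 64-bit restriction is
   there because by_pieces_ninsns only looks for the widest integer
   mode, which MAX_FIXED_MODE_SIZE caps at DImode in 32-bit mode.  */
static int
sim_move_max_pieces (const struct sim_target *t)
{
  if (t->is_64bit
      && t->sse2
      && t->unaligned_load_optimal
      && t->unaligned_store_optimal)
    return SIM_TIMODE_SIZE;
  return SIM_UNITS_PER_WORD (t);
}
```

With all four flags set the function returns the TImode size; clearing any one of them falls back to the word size, which is the behaviour the quoted macro expresses with `? 16 : UNITS_PER_WORD`.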