On 04/08/16 12:00, Wilco Dijkstra wrote: > This patch adds legitimize_address_displacement hook so that stack accesses > with large offsets are split into a more efficient sequence. Byte and > halfword > accesses use a 4KB range, wider accesses use a 16KB range to maximise the > available addressing range and increase opportunities to share the base > address. > > int f(int x) > { > int arr[8192]; > arr[4096] = 0; > arr[6000] = 0; > arr[7000] = 0; > arr[8191] = 0; > return arr[x]; > } > > Now generates: > > sub sp, sp, #32768 > add x1, sp, 16384 > str wzr, [x1] > str wzr, [x1, 7616] > str wzr, [x1, 11616] > str wzr, [x1, 16380] > ldr w0, [sp, w0, sxtw 2] > add sp, sp, 32768 > ret > > instead of: > > sub sp, sp, #32768 > mov x2, 28000 > add x1, sp, 16384 > mov x3, 32764 > str wzr, [x1] > mov x1, 24000 > add x1, sp, x1 > str wzr, [x1] > add x1, sp, x2 > str wzr, [x1] > add x1, sp, x3 > str wzr, [x1] > ldr w0, [sp, w0, sxtw 2] > add sp, sp, 32768 > ret > > Bootstrap, GCC regression OK.
OK. But please enhance the comment with some explanation as to WHY you've chosen to use just two base pairings rather than separate bases for each access size. R. > > ChangeLog: > 2016-08-04 Wilco Dijkstra <wdijk...@arm.com> > > gcc/ > * config/aarch64/aarch64.c (aarch64_legitimize_address_displacement): > New function. > (TARGET_LEGITIMIZE_ADDRESS_DISPLACEMENT): Define. > -- > > diff --git a/gcc/config/aarch64/aarch64.c b/gcc/config/aarch64/aarch64.c > index > a0e7680ad0946a27d95a67a9892bb7e264a90451..7bf12475494fb004f5a92445ae31fdc52af43c3b > 100644 > --- a/gcc/config/aarch64/aarch64.c > +++ b/gcc/config/aarch64/aarch64.c > @@ -4132,6 +4132,19 @@ aarch64_legitimate_address_p (machine_mode mode, rtx x, > return aarch64_classify_address (&addr, x, mode, outer_code, strict_p); > } > > +/* Split an out-of-range address displacement into a base and offset. > + Use 4KB range for 1- and 2-byte accesses and a 16KB range otherwise. */ > + > +static bool > +aarch64_legitimize_address_displacement (rtx *disp, rtx *off, machine_mode > mode) > +{ > + HOST_WIDE_INT mask = GET_MODE_SIZE (mode) < 4 ? 0xfff : 0x3fff; > + > + *off = GEN_INT (INTVAL (*disp) & ~mask); > + *disp = GEN_INT (INTVAL (*disp) & mask); > + return true; > +} > + > /* Return TRUE if rtx X is immediate constant 0.0 */ > bool > aarch64_float_const_zero_rtx_p (rtx x) > @@ -14096,6 +14109,10 @@ aarch64_optab_supported_p (int op, machine_mode > mode1, machine_mode, > #undef TARGET_LEGITIMATE_CONSTANT_P > #define TARGET_LEGITIMATE_CONSTANT_P aarch64_legitimate_constant_p > > +#undef TARGET_LEGITIMIZE_ADDRESS_DISPLACEMENT > +#define TARGET_LEGITIMIZE_ADDRESS_DISPLACEMENT \ > + aarch64_legitimize_address_displacement > + > #undef TARGET_LIBGCC_CMP_RETURN_MODE > #define TARGET_LIBGCC_CMP_RETURN_MODE aarch64_libgcc_cmp_return_mode > >