On 07/13/16 11:14, Kyrill Tkachov wrote:
Hi all,
The most common way to load and store a TImode value on aarch64 is to
perform an LDP/STP of two X-registers.
This is the *movti_aarch64 pattern in aarch64.md.
There is a bug in the logic in aarch64_classify_address where it
validates the offset in an address used
to load a TImode value. It passes TImode down to the
aarch64_offset_7bit_signed_scaled_p check, which rejects
offsets that are not a multiple of the mode size of TImode (16).
However, this is too conservative: X-reg LDP/STP
instructions accept immediate offsets that are a multiple of 8.
Also, considering that the definition of
aarch64_offset_7bit_signed_scaled_p is:

  return (offset >= -64 * GET_MODE_SIZE (mode)
          && offset < 64 * GET_MODE_SIZE (mode)
          && offset % GET_MODE_SIZE (mode) == 0);
I think the range check may even be wrong for TImode, as this will
accept offsets in the range [-1024, 1024)
(as long as they are a multiple of 16),
whereas X-reg LDP/STP instructions only accept offsets in the range
[-512, 512).
So, since the check is for an X-reg LDP/STP address, we should be
passing down DImode.
This patch does that and enables more aggressive generation of REG+IMM
addressing modes for 64-bit-aligned
TImode values, eliminating many address calculation instructions.
For the testcase in the patch we currently generate:
bar:
        add     x1, x1, 8
        add     x0, x0, 8
        ldp     x2, x3, [x1]
        stp     x2, x3, [x0]
        ret
whereas with this patch we generate:
bar:
        ldp     x2, x3, [x1, 8]
        stp     x2, x3, [x0, 8]
        ret
Bootstrapped and tested on aarch64-none-linux-gnu.
Ok for trunk?
LGTM
--
Evandro Menezes