On 07/13/16 11:14, Kyrill Tkachov wrote:
Hi all,

The most common way to load and store a TImode value on aarch64 is to perform an LDP/STP of two X-registers.
This is the *movti_aarch64 pattern in aarch64.md.
There is a bug in the logic in aarch64_classify_address where it validates the offset in the address used to load a TImode value. It passes TImode down to the aarch64_offset_7bit_signed_scaled_p check, which rejects offsets that are not a multiple of the mode size of TImode (16). However, this is too conservative: X-register LDP/STP
instructions accept immediate offsets that are any multiple of 8.

Also, considering that the definition of aarch64_offset_7bit_signed_scaled_p is:
  return (offset >= -64 * GET_MODE_SIZE (mode)
      && offset < 64 * GET_MODE_SIZE (mode)
      && offset % GET_MODE_SIZE (mode) == 0);

I think the range check may even be wrong for TImode, as it will accept offsets in the range [-1024, 1024)
(as long as they are a multiple of 16),
whereas X-reg LDP/STP instructions only accept offsets in the range [-512, 512). Since the check is for an X-reg LDP/STP address, we should be passing down DImode.

This patch does that, enabling more aggressive generation of REG+IMM addressing modes for 64-bit-aligned
TImode values and eliminating many address calculation instructions.
For the testcase in the patch we currently generate:
bar:
        add     x1, x1, 8
        add     x0, x0, 8
        ldp     x2, x3, [x1]
        stp     x2, x3, [x0]
        ret

whereas with this patch we generate:
bar:
        ldp     x2, x3, [x1, 8]
        stp     x2, x3, [x0, 8]
        ret

Bootstrapped and tested on aarch64-none-linux-gnu.

Ok for trunk?

LGTM

--
Evandro Menezes
