Missing patch.

On Wed, May 28, 2014 at 3:02 PM, bin.cheng <[email protected]> wrote:
> Hi,
> I was surprised that GCC didn't support addressing modes like
> [REG+OFF]/[REG+REG] for the ldr/str instructions in vectorization
> scenarios.  The generated assembly is poor because every address
> expression has to be computed outside of the memory reference.  The root
> cause is that aarch64 effectively rejects reg-index (and const-offset)
> addressing modes in aarch64_classify_address and in miscellaneous simd
> patterns.
>
> By fixing this issue, the performance of fp benchmarks improves
> noticeably.  It also helps vectorized int cases.
>
> The patch passes bootstrap and regression test on aarch64/little-endian.
> It also passes regression test on aarch64/big-endian except for the test
> "gcc.target/aarch64/vect-mull.c".  I analyzed the failure and now believe
> it reveals a latent bug in the vectorizer on aarch64/big-endian.  The
> analysis is posted at
> https://gcc.gnu.org/ml/gcc-patches/2014-05/msg00182.html.
>
> So is it OK?
>
> Thanks,
> bin
>
>
> 2014-05-28  Bin Cheng  <[email protected]>
>
>         * config/aarch64/aarch64.c (aarch64_classify_address)
>         (aarch64_legitimize_reload_address): Support full addressing modes
>         for vector modes.
>         * config/aarch64/aarch64-simd.md (mov<mode>, movmisalign<mode>)
>         (*aarch64_simd_mov<mode>, *aarch64_simd_mov<mode>): Relax
>         predicates.
--
Best Regards.
Index: gcc/config/aarch64/aarch64-simd.md
===================================================================
--- gcc/config/aarch64/aarch64-simd.md	(revision 210319)
+++ gcc/config/aarch64/aarch64-simd.md	(working copy)
@@ -19,8 +19,8 @@
 ;; <http://www.gnu.org/licenses/>.
 
 (define_expand "mov<mode>"
-  [(set (match_operand:VALL 0 "aarch64_simd_nonimmediate_operand" "")
-	(match_operand:VALL 1 "aarch64_simd_general_operand" ""))]
+  [(set (match_operand:VALL 0 "nonimmediate_operand" "")
+	(match_operand:VALL 1 "general_operand" ""))]
   "TARGET_SIMD"
   "
   if (GET_CODE (operands[0]) == MEM)
@@ -29,8 +29,8 @@
 )
 
 (define_expand "movmisalign<mode>"
-  [(set (match_operand:VALL 0 "aarch64_simd_nonimmediate_operand" "")
-	(match_operand:VALL 1 "aarch64_simd_general_operand" ""))]
+  [(set (match_operand:VALL 0 "nonimmediate_operand" "")
+	(match_operand:VALL 1 "general_operand" ""))]
   "TARGET_SIMD"
 {
   /* This pattern is not permitted to fail during expansion: if both arguments
@@ -91,9 +91,9 @@
 )
 
 (define_insn "*aarch64_simd_mov<mode>"
-  [(set (match_operand:VD 0 "aarch64_simd_nonimmediate_operand"
+  [(set (match_operand:VD 0 "nonimmediate_operand"
 		"=w, m, w, ?r, ?w, ?r, w")
-	(match_operand:VD 1 "aarch64_simd_general_operand"
+	(match_operand:VD 1 "general_operand"
 		"m, w, w, w, r, r, Dn"))]
   "TARGET_SIMD
    && (register_operand (operands[0], <MODE>mode)
@@ -119,9 +119,9 @@
 )
 
 (define_insn "*aarch64_simd_mov<mode>"
-  [(set (match_operand:VQ 0 "aarch64_simd_nonimmediate_operand"
+  [(set (match_operand:VQ 0 "nonimmediate_operand"
		"=w, m, w, ?r, ?w, ?r, w")
-	(match_operand:VQ 1 "aarch64_simd_general_operand"
+	(match_operand:VQ 1 "general_operand"
		"m, w, w, w, r, r, Dn"))]
   "TARGET_SIMD
    && (register_operand (operands[0], <MODE>mode)
Index: gcc/config/aarch64/aarch64.c
===================================================================
--- gcc/config/aarch64/aarch64.c	(revision 210319)
+++ gcc/config/aarch64/aarch64.c	(working copy)
@@ -3075,11 +3075,11 @@ aarch64_classify_address (struct aarch64_address_i
   enum rtx_code code = GET_CODE (x);
   rtx op0, op1;
   bool allow_reg_index_p =
-    outer_code != PARALLEL && GET_MODE_SIZE(mode) != 16;
-
+    outer_code != PARALLEL && (GET_MODE_SIZE (mode) != 16
+			       || aarch64_vector_mode_supported_p (mode));
   /* Don't support anything other than POST_INC or REG addressing for
      AdvSIMD.  */
-  if (aarch64_vector_mode_p (mode)
+  if (aarch64_vect_struct_mode_p (mode)
       && (code != POST_INC && code != REG))
     return false;
 
@@ -4010,7 +4010,7 @@ aarch64_legitimize_reload_address (rtx *x_p,
   rtx x = *x_p;
 
   /* Do not allow mem (plus (reg, const)) if vector mode.  */
-  if (aarch64_vector_mode_p (mode)
+  if (aarch64_vect_struct_mode_p (mode)
       && GET_CODE (x) == PLUS
       && REG_P (XEXP (x, 0))
      && CONST_INT_P (XEXP (x, 1)))
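
For readers following along, below is a minimal, made-up C example of the kind
of loop the quoted message refers to; the function name, flags, and comments
are illustrative only and are not taken from the patch or its testcases.

/* Hypothetical illustration, not part of the patch.  Compiled with
   something like -O3 on aarch64, the loop below is vectorized into
   128-bit (e.g. V4SF) loads and stores.  Without const-offset or
   reg-index addressing for vector modes, each of those ldr/str q
   instructions needs its address computed by a separate add; with the
   patch, the offset or index can be folded into the memory operand.  */

#include <stddef.h>

void
saxpy (float *restrict y, const float *restrict x, float a, size_t n)
{
  for (size_t i = 0; i < n; i++)
    y[i] = a * x[i] + y[i];
}

The exact code generated will of course depend on the compiler revision and
tuning options used.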
