https://gcc.gnu.org/bugzilla/show_bug.cgi?id=82518

ktkachov at gcc dot gnu.org changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |ktkachov at gcc dot gnu.org

--- Comment #50 from ktkachov at gcc dot gnu.org ---
(In reply to Wilco from comment #49)
> AArch64 does this:
>
> (define_expand "vec_store_lanesoi<mode>"
>   [(set (match_operand:OI 0 "aarch64_simd_struct_operand" "=Utv")
>         (unspec:OI [(match_operand:OI 1 "register_operand" "w")
>                     (unspec:VQ [(const_int 0)] UNSPEC_VSTRUCTDUMMY)]
>                    UNSPEC_ST2))]
>   "TARGET_SIMD"
> {
>   if (BYTES_BIG_ENDIAN)
>     {
>       rtx tmp = gen_reg_rtx (OImode);
>       rtx mask = aarch64_reverse_mask (<MODE>mode, <nunits>);
>       emit_insn (gen_aarch64_rev_reglistoi (tmp, operands[1], mask));
>       emit_insn (gen_aarch64_simd_st2<mode> (operands[0], tmp));
>     }
>   else
>     emit_insn (gen_aarch64_simd_st2<mode> (operands[0], operands[1]));
>   DONE;
> })
>
> ARM seems to be missing the swap:
>
> (define_expand "vec_store_lanesoi<mode>"
>   [(set (match_operand:OI 0 "neon_struct_operand")
>         (unspec:OI [(match_operand:OI 1 "s_register_operand")
>                     (unspec:VQ2 [(const_int 0)] UNSPEC_VSTRUCTDUMMY)]
>                    UNSPEC_VST2))]
>   "TARGET_NEON")
>
> So clearly looks like a backend issue.

Indeed, and arm is missing the equivalent logic, including the reverse_mask,
rev_reglist etc.

For GCC 8 and the branches the least invasive fix would be to return false for
BYTES_BIG_ENDIAN in arm_array_mode_supported_p. That will disable the use of the
vec_load, vec_store lanes on big-endian.

Vectorisation on arm NEON is already severely restricted (look at all the
patterns in neon.md gated on !BYTES_BIG_ENDIAN) and the vec_load/store_lanes has
never worked correctly on that target as far as I can see, so switching it off
properly is not a radical change.

At some point we'll want to take a holistic approach for NEON big-endian and fix
up (and document!) the lane-ordering everywhere, but the priority at this stage
is to fix the wrong-code in a not-too-invasive way.
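
To make the proposed gating concrete, here is a minimal standalone sketch of the idea, not the actual arm.c code or a patch. Everything in it is a stand-in chosen for illustration: the globals `target_neon` and `bytes_big_endian` model the TARGET_NEON and BYTES_BIG_ENDIAN macros, `is_neon_vector_mode` is a hypothetical helper that replaces the backend's real mode checks, and the 2..4 element range is an assumption based on NEON's two-, three- and four-vector structure loads and stores.

```c
#include <stdbool.h>

/* Stand-ins for the TARGET_NEON and BYTES_BIG_ENDIAN target macros
   (assumption: modelled as plain globals so the sketch compiles alone). */
static bool target_neon = true;
static bool bytes_big_endian = false;

/* Hypothetical helper: treat 64-bit and 128-bit vectors as NEON modes.
   The real hook takes a machine_mode and uses the backend's own checks. */
static bool
is_neon_vector_mode (int mode_bits)
{
  return mode_bits == 64 || mode_bits == 128;
}

/* Model of an array_mode_supported_p-style hook with the proposed gate. */
static bool
array_mode_supported_p (int mode_bits, unsigned long nelems)
{
  /* Proposed fix: refuse array modes on big-endian, so the vectoriser
     never tries the vec_load_lanes / vec_store_lanes patterns there. */
  if (bytes_big_endian)
    return false;

  /* Assumed little-endian behaviour: arrays of 2..4 NEON vectors. */
  if (target_neon && is_neon_vector_mode (mode_bits)
      && nelems >= 2 && nelems <= 4)
    return true;

  return false;
}
```

With the gate in place the hook answers false for every query on big-endian, so the lane patterns that lack the register-list reversal are never selected; little-endian behaviour is unchanged.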