https://gcc.gnu.org/bugzilla/show_bug.cgi?id=82518

ktkachov at gcc dot gnu.org changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |ktkachov at gcc dot gnu.org

--- Comment #50 from ktkachov at gcc dot gnu.org ---
(In reply to Wilco from comment #49)
> AArch64 does this:
> 
> (define_expand "vec_store_lanesoi<mode>"
>   [(set (match_operand:OI 0 "aarch64_simd_struct_operand" "=Utv")
>         (unspec:OI [(match_operand:OI 1 "register_operand" "w")
>                     (unspec:VQ [(const_int 0)] UNSPEC_VSTRUCTDUMMY)]
>                    UNSPEC_ST2))]
>   "TARGET_SIMD"
> {
>   if (BYTES_BIG_ENDIAN)
>     {
>       rtx tmp = gen_reg_rtx (OImode);
>       rtx mask = aarch64_reverse_mask (<MODE>mode, <nunits>);
>       emit_insn (gen_aarch64_rev_reglistoi (tmp, operands[1], mask));
>       emit_insn (gen_aarch64_simd_st2<mode> (operands[0], tmp));
>     }
>   else
>     emit_insn (gen_aarch64_simd_st2<mode> (operands[0], operands[1]));
>   DONE;
> })
> 
> ARM seems to be missing the swap:
> 
> (define_expand "vec_store_lanesoi<mode>"
>   [(set (match_operand:OI 0 "neon_struct_operand")
>         (unspec:OI [(match_operand:OI 1 "s_register_operand")
>                     (unspec:VQ2 [(const_int 0)] UNSPEC_VSTRUCTDUMMY)]
>                    UNSPEC_VST2))]
>   "TARGET_NEON")
> 
> So clearly looks like a backend issue.

Indeed, the arm backend is missing the equivalent logic, including the
reverse_mask and rev_reglist pieces.

For GCC 8 and the branches the least invasive fix would be to return false for
BYTES_BIG_ENDIAN in arm_array_mode_supported_p. That will disable the use of
the vec_load_lanes and vec_store_lanes patterns on big-endian. Vectorisation on
arm NEON is already severely restricted there (look at all the patterns in
neon.md gated on !BYTES_BIG_ENDIAN), and vec_load/store_lanes have never worked
correctly on that target as far as I can see, so switching them off properly is
not a radical change.
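
To illustrate the shape of that change, here is a minimal standalone sketch. It
is not the actual arm.c code: struct target_state, the function name
array_mode_supported_p_sketch, the valid_neon_vector_mode flag and the 2..4
element bound are all hypothetical stand-ins for the backend's real macros and
mode checks. It only shows the early return on big-endian.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for target state.  In the real backend,
   BYTES_BIG_ENDIAN and TARGET_NEON are target macros, not fields.  */
struct target_state
{
  bool bytes_big_endian;
  bool target_neon;
};

/* Simplified model of an array-mode-supported hook.  The parameters
   and the 2..4 element bound are assumptions made for illustration.  */
static bool
array_mode_supported_p_sketch (const struct target_state *t,
                               bool valid_neon_vector_mode,
                               unsigned nelems)
{
  assert (t != 0);

  /* Proposed change: report no array modes on big-endian, so the
     vectoriser never tries the load/store-lanes patterns there.  */
  if (t->bytes_big_endian)
    return false;

  /* Otherwise keep accepting small arrays of NEON vectors.  */
  return t->target_neon && valid_neon_vector_mode
         && nelems >= 2 && nelems <= 4;
}
```

With this guard the little-endian answer is unchanged, while every query on a
big-endian target returns false regardless of mode or element count.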

At some point we'll want to take a holistic approach to NEON big-endian and
fix up (and document!) the lane ordering everywhere, but the priority at this
stage is to fix the wrong-code bug in a not-too-invasive way.
