https://gcc.gnu.org/bugzilla/show_bug.cgi?id=82518

--- Comment #46 from Jakub Jelinek <jakub at gcc dot gnu.org> ---
I wonder whether these two stores:
  vect_array.11[0] = vect_vec_iv_.7_45;
  vect_array.11[1] = vect__4.8_48;
shouldn't use indices [1] and [0] instead on armeb; otherwise we end up with:
(insn 35 37 38 5 (set (subreg:V4SI (reg:OI 155 [ vect_array.11 ]) 0)
        (reg:V4SI 110 [ vect_vec_iv_.7 ])) "pr82518.c":8 939 {*neon_movv4si}
     (nil))
(insn 38 35 41 5 (set (subreg:V4SI (reg:OI 155 [ vect_array.11 ]) 16)
        (plus:V4SI (reg:V4SI 110 [ vect_vec_iv_.7 ])
            (reg:V4SI 171))) "pr82518.c":8 998 {*addv4si3_neon}
     (nil))
(insn 41 38 39 5 (set (reg:V4SI 110 [ vect_vec_iv_.7 ])
        (plus:V4SI (reg:V4SI 110 [ vect_vec_iv_.7 ])
            (reg:V4SI 169))) 998 {*addv4si3_neon}
     (nil))
(insn 39 41 43 5 (set (mem:OI (post_inc:SI (reg:SI 152 [ ivtmp.31 ])) [2 MEM[(int *)vectp_p.9_49]+0 S32 A32])
        (unspec:OI [
                (reg:OI 155 [ vect_array.11 ])
                (unspec:V4SI [
                        (const_int 0 [0])
                    ] UNSPEC_VSTRUCTDUMMY)
            ] UNSPEC_VST2)) "pr82518.c":8 2396 {neon_vst2v4si}
     (expr_list:REG_INC (reg:SI 152 [ ivtmp.31 ])
        (nil)))
where pseudo 110 is vect_vec_iv_.7_45 ({i, i + 1, i + 2, i + 3}) and
insn 38 adds {1, 1, 1, 1} to it.  Whether this is correct really depends on
what exactly the neon_vst2v4si instruction does on armeb.  The generated
assembly is:
        vmov.i32        q10, #4  @ v4si
        vmov.i32        q9, #1  @ v4si
...
        vldr    d16, .L19
        vldr    d17, .L19+8
.L4:
        vadd.i32        q11, q8, q9
        vst1.64 {d16-d17}, [sp:64]
        vadd.i32        q8, q8, q10
        vstr    d22, [sp, #16]
        vstr    d23, [sp, #24]
        vld1.64 {d22-d25}, [sp:64]
        vst2.32 {d22-d25}, [r3]!
If it works as on armel, except that the elements of the vectors are
byte-swapped, then the indices should be [1] and [0].
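
To make the question concrete, here is a minimal C sketch of the armel-style
view of a two-vector interleaving store.  It is only a model: the helper name
interleave2 and the LANES constant are made up for illustration, and it does
not claim to describe what vst2.32 actually does on armeb, which is exactly
the open question above.

```c
#define LANES 4

/* Hypothetical model of a 2-way interleaving store as seen on armel:
   element k of the first vector goes to out[2*k], element k of the
   second vector goes to out[2*k + 1].  This is an assumption used for
   illustration, not a statement about armeb behaviour.  */
static void
interleave2 (const int *first, const int *second, int *out)
{
  for (int k = 0; k < LANES; k++)
    {
      out[2 * k] = first[k];
      out[2 * k + 1] = second[k];
    }
}
```

With first = {0, 1, 2, 3} (the induction vector for i = 0) and
second = {1, 2, 3, 4} (the same plus {1, 1, 1, 1}), this model stores
0 1 1 2 2 3 3 4.  Swapping the two arguments, which corresponds to using
indices [1] and [0], stores 1 0 2 1 3 2 4 3 instead.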
