https://gcc.gnu.org/bugzilla/show_bug.cgi?id=74585
--- Comment #4 from Richard Biener <rguenth at gcc dot gnu.org> ---
Bad case:

(mem/c:BLK (plus:DI (reg/f:DI 150 virtual-stack-vars)
        (const_int 64 [0x40])) [1 a+0 S64 A128])

#0  set_decl_rtl (t=<parm_decl 0x2aaaac199000 a>, x=0x2aaaac1ae960)
    at /space/rguenther/src/svn/trunk/gcc/emit-rtl.c:1282
#1  0x00000000008ad5ca in set_rtl (t=<parm_decl 0x2aaaac199000 a>,
    x=0x2aaaac1ae960)
    at /space/rguenther/src/svn/trunk/gcc/cfgexpand.c:302
#2  0x00000000008b0236 in set_parm_rtl (parm=<parm_decl 0x2aaaac199000 a>,
    x=0x2aaaac1ae960)
    at /space/rguenther/src/svn/trunk/gcc/cfgexpand.c:1275
#3  0x0000000000a7477b in assign_parm_setup_block (all=0x7fffffffd5c0,
    parm=<parm_decl 0x2aaaac199000 a>, data=0x7fffffffd540)
    at /space/rguenther/src/svn/trunk/gcc/function.c:3109
#4  0x0000000000a76f92 in assign_parms (
    fndecl=<function_decl 0x2aaaac171a00 test_vecd8_rotate_left>)
    at /space/rguenther/src/svn/trunk/gcc/function.c:3775
#5  0x0000000000a7afc6 in expand_function_start (
    subr=<function_decl 0x2aaaac171a00 test_vecd8_rotate_left>)
    at /space/rguenther/src/svn/trunk/gcc/function.c:5211

Good case:

(mem/c:BLK (plus:DI (reg/f:DI 150 virtual-stack-vars)
        (const_int 128 [0x80])) [1 a+0 S64 A128])

so no difference apart from the stack offset (64 vs. 128).  We have in the
good case

;; _1 = a.vx0;

(insn 17 16 18 (set (reg:V2DF 176 [ _1 ])
        (vec_select:V2DF (mem/c:V2DF (plus:DI (reg/f:DI 150 virtual-stack-vars)
                    (const_int 128 [0x80])) [2 a.vx0+0 S16 A128])
            (parallel [ (const_int 1 [0x1]) (const_int 0 [0]) ]))) t.c:18 -1
     (nil))

and in the bad case

;; a$vx0_22 = MEM[(struct *)&a];

(insn 17 16 18 (set (reg:V2DF 191 [ a$vx0 ])
        (vec_select:V2DF (mem/c:V2DF (plus:DI (reg/f:DI 150 virtual-stack-vars)
                    (const_int 64 [0x40])) [1 MEM[(struct *)&a]+0 S16 A128])
            (parallel [ (const_int 1 [0x1]) (const_int 0 [0]) ]))) -1
     (nil))

again almost the same: besides the stack offset and the pseudo number, only
the MEM attributes (alias set 2 with a.vx0 vs. alias set 1 with
MEM[(struct *)&a]) and the missing source location differ.

Parameter setup in the bad case:

(insn 2 15 3 2 (set (reg:V4SI 183)
        (reg:V4SI 79 2 [ a ])) t.c:14 -1
     (nil))
(insn 3 2 4 2 (set (reg:V4SI 184)
        (reg:V4SI 80 3 [ a+16 ])) t.c:14 -1
     (nil))
(insn 4 3 5 2 (set (reg:V4SI 185)
        (reg:V4SI 81 4 [ a+32 ])) t.c:14 -1
     (nil))
(insn 5 4 6 2 (set (reg:V4SI 186)
        (reg:V4SI 82 5 [ a+48 ])) t.c:14 -1
     (nil))
(insn 6 5 7 2 (set (reg:V4SI 187)
        (vec_select:V4SI (reg:V4SI 183)
            (parallel [ (const_int 2 [0x2]) (const_int 3 [0x3]) (const_int 0 [0]) (const_int 1 [0x1]) ]))) t.c:14 -1
     (nil))
(insn 7 6 8 2 (set (mem/c:V4SI (plus:DI (reg/f:DI 150 virtual-stack-vars)
                (const_int 64 [0x40])) [1 a+0 S16 A128])
        (vec_select:V4SI (reg:V4SI 187)
            (parallel [ (const_int 2 [0x2]) (const_int 3 [0x3]) (const_int 0 [0]) (const_int 1 [0x1]) ]))) t.c:14 -1
     (nil))
(insn 8 7 9 2 (set (reg:V4SI 188)
        (vec_select:V4SI (reg:V4SI 184)
            (parallel [ (const_int 2 [0x2]) (const_int 3 [0x3]) (const_int 0 [0]) (const_int 1 [0x1]) ]))) t.c:14 -1
     (nil))
(insn 9 8 10 2 (set (mem/c:V4SI (plus:DI (reg/f:DI 150 virtual-stack-vars)
                (const_int 80 [0x50])) [1 a+16 S16 A128])
        (vec_select:V4SI (reg:V4SI 188)
            (parallel [ (const_int 2 [0x2]) (const_int 3 [0x3]) (const_int 0 [0]) (const_int 1 [0x1]) ]))) t.c:14 -1
     (nil))
(insn 10 9 11 2 (set (reg:V4SI 189)
        (vec_select:V4SI (reg:V4SI 185)
            (parallel [ (const_int 2 [0x2]) (const_int 3 [0x3]) (const_int 0 [0]) (const_int 1 [0x1]) ]))) t.c:14 -1
     (nil))
(insn 11 10 12 2 (set (mem/c:V4SI (plus:DI (reg/f:DI 150 virtual-stack-vars)
                (const_int 96 [0x60])) [1 a+32 S16 A128])
        (vec_select:V4SI (reg:V4SI 189)
            (parallel [ (const_int 2 [0x2]) (const_int 3 [0x3]) (const_int 0 [0]) (const_int 1 [0x1]) ]))) t.c:14 -1
     (nil))
(insn 12 11 13 2 (set (reg:V4SI 190)
        (vec_select:V4SI (reg:V4SI 186)
            (parallel [ (const_int 2 [0x2]) (const_int 3 [0x3]) (const_int 0 [0]) (const_int 1 [0x1]) ]))) t.c:14 -1
     (nil))
(insn 13 12 14 2 (set (mem/c:V4SI (plus:DI (reg/f:DI 150 virtual-stack-vars)
                (const_int 112 [0x70])) [1 a+48 S16 A128])
        (vec_select:V4SI (reg:V4SI 190)
            (parallel [ (const_int 2 [0x2]) (const_int 3 [0x3]) (const_int 0 [0]) (const_int 1 [0x1]) ]))) t.c:14 -1
     (nil))
(note 14 13 17 2 NOTE_INSN_FUNCTION_BEG)

and in the good case:

(insn 2 15 3 2 (set (reg:V4SI 168)
        (reg:V4SI 79 2 [ a ])) t.c:14 -1
     (nil))
(insn 3 2 4 2 (set (reg:V4SI 169)
        (reg:V4SI 80 3 [ a+16 ])) t.c:14 -1
     (nil))
(insn 4 3 5 2 (set (reg:V4SI 170)
        (reg:V4SI 81 4 [ a+32 ])) t.c:14 -1
     (nil))
(insn 5 4 6 2 (set (reg:V4SI 171)
        (reg:V4SI 82 5 [ a+48 ])) t.c:14 -1
     (nil))
(insn 6 5 7 2 (set (reg:V4SI 172)
        (vec_select:V4SI (reg:V4SI 168)
            (parallel [ (const_int 2 [0x2]) (const_int 3 [0x3]) (const_int 0 [0]) (const_int 1 [0x1]) ]))) t.c:14 -1
     (nil))
(insn 7 6 8 2 (set (mem/c:V4SI (plus:DI (reg/f:DI 150 virtual-stack-vars)
                (const_int 128 [0x80])) [1 a+0 S16 A128])
        (vec_select:V4SI (reg:V4SI 172)
            (parallel [ (const_int 2 [0x2]) (const_int 3 [0x3]) (const_int 0 [0]) (const_int 1 [0x1]) ]))) t.c:14 -1
     (nil))
(insn 8 7 9 2 (set (reg:V4SI 173)
        (vec_select:V4SI (reg:V4SI 169)
            (parallel [ (const_int 2 [0x2]) (const_int 3 [0x3]) (const_int 0 [0]) (const_int 1 [0x1]) ]))) t.c:14 -1
     (nil))
(insn 9 8 10 2 (set (mem/c:V4SI (plus:DI (reg/f:DI 150 virtual-stack-vars)
                (const_int 144 [0x90])) [1 a+16 S16 A128])
        (vec_select:V4SI (reg:V4SI 173)
            (parallel [ (const_int 2 [0x2]) (const_int 3 [0x3]) (const_int 0 [0]) (const_int 1 [0x1]) ]))) t.c:14 -1
     (nil))
(insn 10 9 11 2 (set (reg:V4SI 174)
        (vec_select:V4SI (reg:V4SI 170)
            (parallel [ (const_int 2 [0x2]) (const_int 3 [0x3]) (const_int 0 [0]) (const_int 1 [0x1]) ]))) t.c:14 -1
     (nil))
(insn 11 10 12 2 (set (mem/c:V4SI (plus:DI (reg/f:DI 150 virtual-stack-vars)
                (const_int 160 [0xa0])) [1 a+32 S16 A128])
        (vec_select:V4SI (reg:V4SI 174)
            (parallel [ (const_int 2 [0x2]) (const_int 3 [0x3]) (const_int 0 [0]) (const_int 1 [0x1]) ]))) t.c:14 -1
     (nil))
(insn 12 11 13 2 (set (reg:V4SI 175)
        (vec_select:V4SI (reg:V4SI 171)
            (parallel [ (const_int 2 [0x2]) (const_int 3 [0x3]) (const_int 0 [0]) (const_int 1 [0x1]) ]))) t.c:14 -1
     (nil))
(insn 13 12 14 2 (set (mem/c:V4SI (plus:DI (reg/f:DI 150 virtual-stack-vars)
                (const_int 176 [0xb0])) [1 a+48 S16 A128])
        (vec_select:V4SI (reg:V4SI 175)
            (parallel [ (const_int 2 [0x2]) (const_int 3 [0x3]) (const_int 0 [0]) (const_int 1 [0x1]) ]))) t.c:14 -1
     (nil))
(note 14 13 17 2 NOTE_INSN_FUNCTION_BEG)

again exactly the same, modulo pseudo register numbers (183-190 vs. 168-175)
and the stack slot offset (64 vs. 128).  There must be downstream effects
that cause the whole issue during RTL optimization.
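
The comparisons in this comment were done by eye. As a minimal sketch, not part of the original analysis, such dump comparisons could be mechanized by renaming pseudo registers in order of first appearance and rebasing the virtual-stack-vars offsets before comparing. The helper name normalize_rtl and both regular expressions are hypothetical; they rely on the assumption that, as in the dumps quoted here, a pseudo register number is followed directly by ")" or " [" while hard and virtual registers are followed by a name.

```python
import re

# Heuristic (assumption): in these dumps a pseudo prints as "(reg:MODE N)" or
# "(reg:MODE N [ name ])", while hard/virtual registers print a name after N.
PSEUDO_RE = re.compile(r"\((reg(?:/\w+)*:\w+) (\d+)(?=\)| \[)")
# Matches the constant offset added to virtual-stack-vars.
STACK_RE = re.compile(r"(virtual-stack-vars\) \(const_int )(-?\d+) \[0x[0-9a-f]+\]")


def normalize_rtl(dump):
    """Return the dump with pseudos renamed P0, P1, ... and stack offsets rebased."""
    text = " ".join(dump.split())  # collapse all whitespace and line breaks
    names = {}

    def rename(match):
        # First appearance of a pseudo decides its canonical name.
        name = names.setdefault(match.group(2), "P%d" % len(names))
        return "(%s %s" % (match.group(1), name)

    text = PSEUDO_RE.sub(rename, text)
    offsets = [int(m.group(2)) for m in STACK_RE.finditer(text)]
    base = min(offsets) if offsets else 0
    # Express every stack offset relative to the smallest one seen.
    return STACK_RE.sub(
        lambda m: "%sbase+%d" % (m.group(1), int(m.group(2)) - base), text)


# Insns 2 and 7 from the bad and good parameter setup quoted above.
BAD = """
(insn 2 15 3 2 (set (reg:V4SI 183) (reg:V4SI 79 2 [ a ])) t.c:14 -1 (nil))
(insn 7 6 8 2 (set (mem/c:V4SI (plus:DI (reg/f:DI 150 virtual-stack-vars)
 (const_int 64 [0x40])) [1 a+0 S16 A128]) (vec_select:V4SI (reg:V4SI 187)
 (parallel [ (const_int 2 [0x2]) (const_int 3 [0x3]) (const_int 0 [0])
 (const_int 1 [0x1]) ]))) t.c:14 -1 (nil))
"""
GOOD = """
(insn 2 15 3 2 (set (reg:V4SI 168) (reg:V4SI 79 2 [ a ])) t.c:14 -1 (nil))
(insn 7 6 8 2 (set (mem/c:V4SI (plus:DI (reg/f:DI 150 virtual-stack-vars)
 (const_int 128 [0x80])) [1 a+0 S16 A128]) (vec_select:V4SI (reg:V4SI 172)
 (parallel [ (const_int 2 [0x2]) (const_int 3 [0x3]) (const_int 0 [0])
 (const_int 1 [0x1]) ]))) t.c:14 -1 (nil))
"""

if __name__ == "__main__":
    print(normalize_rtl(BAD) == normalize_rtl(GOOD))
```

Anything that still differs after this normalization (for example the alias set and MEM expression of insn 17) is a real difference between the two cases. Feeding the normalized text to a line-based diff after re-splitting it per insn would probably be more practical for full dumps.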