https://gcc.gnu.org/bugzilla/show_bug.cgi?id=107916
Richard Biener <rguenth at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
            Version|unknown                     |13.0

--- Comment #5 from Richard Biener <rguenth at gcc dot gnu.org> ---
Vector lowering only lowers "operations"; it doesn't touch data transfer,
which means {reg,mem} <-> {mem,reg} copies, even if performed as part of
PHI node copies.  In the end this means values in unsupported vector modes
will be expanded to stack variables.

Note there's a later forwprop which will deal with the loads/stores in most
cases (but that's really an afterthought); nothing handles the (loop) PHI
node case, so we end up with

  <bb 4> [local count: 955630225]:
  # c_15 = PHI <c_12(4), { 0, 0, 0, 0, 0, 0, 0, 0 }(3)>
  # i_17 = PHI <i_13(4), 0(3)>
  _4 = BIT_FIELD_REF <c_15, 128, 0>;
  _6 = _4 + _5;
  _18 = BIT_FIELD_REF <c_15, 128, 128>;
  _19 = _14 + _18;
  c_12 = {_6, _19};
  i_13 = i_17 + 1;
  if (n_7(D) != i_13)
    goto <bb 4>; [89.00%]
  else
    goto <bb 5>; [11.00%]

  <bb 5> [local count: 118111600]:
  # c_16 = PHI <c_12(4), { 0, 0, 0, 0, 0, 0, 0, 0 }(2)>

Vector lowering would need to work more like Complex lowering to improve
things here.  I'm not sure if stmt-by-stmt lowering of PHIs and other
reg-reg copies will give the desired results (esp. when backedges are
involved).
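
For illustration only, here is a minimal C sketch of the kind of source that carries a wide vector around a loop, next to a hand-split version in which the accumulator is already two 128-bit halves. This is not the testcase from the bug: the element type (8 x 32-bit int), the function names sum_whole and sum_split, and the use of memcpy for loads are all assumptions chosen to resemble the eight-element constant and the 128-bit BIT_FIELD_REFs in the dump. Whether GCC produces exactly the GIMPLE above for it depends on the target flags and compiler version.

```c
#include <string.h>

/* Assumed types: 256-bit and 128-bit vectors of int, via the GNU vector
   extension (GCC and Clang).  Not taken from the bug report. */
typedef int v8si __attribute__((vector_size(32)));
typedef int v4si __attribute__((vector_size(16)));

/* Whole-vector accumulator: c lives across the loop backedge as a single
   256-bit value, comparable to c_15 / c_12 in the dump. */
void sum_whole(const int *a, int n, int out[8])
{
    v8si c = { 0, 0, 0, 0, 0, 0, 0, 0 };
    for (int i = 0; i < n; i++) {
        v8si x;
        memcpy(&x, a + 8 * i, sizeof x);   /* unaligned-safe load */
        c += x;
    }
    memcpy(out, &c, sizeof c);
}

/* Hand-split accumulator: the loop-carried state is two 128-bit halves,
   roughly what splitting the PHI into per-half PHIs would amount to. */
void sum_split(const int *a, int n, int out[8])
{
    v4si lo = { 0, 0, 0, 0 };
    v4si hi = { 0, 0, 0, 0 };
    for (int i = 0; i < n; i++) {
        v4si xl, xh;
        memcpy(&xl, a + 8 * i, sizeof xl);       /* low four lanes */
        memcpy(&xh, a + 8 * i + 4, sizeof xh);   /* high four lanes */
        lo += xl;
        hi += xh;
    }
    memcpy(out, &lo, sizeof lo);
    memcpy(out + 4, &hi, sizeof hi);
}
```

Both functions compute the same eight sums; the difference is only in whether the loop-carried value is one 256-bit object or two 128-bit ones. Comparing the optimized dumps of the two (for example with -fdump-tree-optimized on a target without 256-bit vector support) is one way to see what the PHI handling costs.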