https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99863
--- Comment #2 from Richard Biener <rguenth at gcc dot gnu.org> ---
Also time to fix this stupid veclower behavior:

   _7 = (unsigned int) _14;
   _5 = {_7, _7, _7, _7, _7, _7, _7, _7, _7, _7, _7, _7, _7, _7, _7, _7};
-  v512u32_0_16 = _5 * v512u32_0_15(D);
+  _53 = {_7, _7, _7, _7, _7, _7, _7, _7, _7, _7, _7, _7, _7, _7, _7, _7};
+  _54 = BIT_FIELD_REF <_53, 32, 0>;
+  _55 = BIT_FIELD_REF <v512u32_0_15(D), 32, 0>;
+  _56 = _54 * _55;
+  _57 = {_7, _7, _7, _7, _7, _7, _7, _7, _7, _7, _7, _7, _7, _7, _7, _7};
+  _58 = BIT_FIELD_REF <_57, 32, 32>;
+  _59 = BIT_FIELD_REF <v512u32_0_15(D), 32, 32>;
+  _60 = _58 * _59;
...

instead of

   _5 = {_7, _7, _7, _7, _7, _7, _7, _7, _7, _7, _7, _7, _7, _7, _7, _7};
-  v512u32_0_16 = _5 * v512u32_0_15(D);
   _54 = BIT_FIELD_REF <_5, 32, 0>;
...

or even better

   _5 = {_7, _7, _7, _7, _7, _7, _7, _7, _7, _7, _7, _7, _7, _7, _7, _7};
-  v512u32_0_16 = _5 * v512u32_0_15(D);
+  _55 = BIT_FIELD_REF <v512u32_0_15(D), 32, 0>;
+  _56 = _7 * _55;

It's all "fixed" later by CSE, of course.

The error might involve the clever handling of

-  v256u8_r_17 = _8 + _19;
+  _117 = BIT_FIELD_REF <_8, 64, 0>;
+  _118 = BIT_FIELD_REF <_19, 64, 0>;
+  _119 = _117 ^ _118;
+  _120 = _118 & 9187201950435737471;
+  _121 = _117 & 9187201950435737471;
+  _122 = _119 & 9259542123273814144;
+  _123 = _120 + _121;
+  _124 = _122 ^ _123;
+  _125 = BIT_FIELD_REF <_8, 64, 64>;
+  _126 = BIT_FIELD_REF <_19, 64, 64>;
...
+  _149 = {_124, _132, _140, _148};
+  _150 = VIEW_CONVERT_EXPR<v256u8>(_149);
+  v256u8_r_17 = _150;

It's enough to use -fdisable-tree-forwprop4 (the forwprop pass after veclower)
to make the problem show up, so it might as well be an RTL optimization issue.
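
For readers following the v256u8 lowering above, here is a minimal C sketch of what one 64-bit word of that sequence computes: eight independent byte additions carried out in a single integer add. The two decimal constants in the dump are 0x7f7f7f7f7f7f7f7f and 0x8080808080808080. The function and macro names (swar_add_u8x8, ref_add_u8x8, self_check, LOW7, HIGH1) are hypothetical and chosen only for illustration; this mirrors the GIMPLE statements _119 to _124 and is not the veclower implementation itself, nor does it reproduce the bug.

```c
#include <assert.h>
#include <stdint.h>

#define LOW7  UINT64_C(0x7f7f7f7f7f7f7f7f) /* 9187201950435737471 in the dump */
#define HIGH1 UINT64_C(0x8080808080808080) /* 9259542123273814144 in the dump */

/* Add eight packed bytes lane-wise, following _119 .. _124 above. */
static uint64_t swar_add_u8x8(uint64_t a, uint64_t b)
{
    uint64_t x    = a ^ b;       /* _119: carry-less sum of every bit      */
    uint64_t lo_b = b & LOW7;    /* _120: low 7 bits of each byte of b     */
    uint64_t lo_a = a & LOW7;    /* _121: low 7 bits of each byte of a     */
    uint64_t top  = x & HIGH1;   /* _122: carry-less sum of the top bits   */
    uint64_t sum  = lo_b + lo_a; /* _123: carries stay inside each byte    */
    return top ^ sum;            /* _124: fold the top bits back in        */
}

/* Straightforward per-byte reference used only for comparison. */
static uint64_t ref_add_u8x8(uint64_t a, uint64_t b)
{
    uint64_t r = 0;
    for (int i = 0; i < 8; i++) {
        uint8_t s = (uint8_t)((a >> (8 * i)) + (b >> (8 * i)));
        r |= (uint64_t)s << (8 * i);
    }
    return r;
}

/* Compare both versions on a pseudo-random sequence (simple LCG). */
static void self_check(void)
{
    uint64_t a = UINT64_C(0x0123456789abcdef);
    uint64_t b = UINT64_C(0xfedcba9876543210);
    for (int i = 0; i < 1000; i++) {
        assert(swar_add_u8x8(a, b) == ref_add_u8x8(a, b));
        a = a * UINT64_C(6364136223846793005) + UINT64_C(1442695040888963407);
        b = b * UINT64_C(2862933555777941757) + UINT64_C(3037000493);
    }
}
```

Masking off the top bit of every byte before the add guarantees that a carry can never cross a byte boundary; the top bits are then combined separately with XOR. The lowered GIMPLE repeats this for each of the four 64-bit words of the 256-bit vector and reassembles the result with the constructor and VIEW_CONVERT_EXPR.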