https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102583
Richard Biener <rguenth at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|UNCONFIRMED                 |NEW
   Last reconfirmed|                            |2021-10-04
     Ever confirmed|0                           |1

--- Comment #1 from Richard Biener <rguenth at gcc dot gnu.org> ---
We're lacking stripping of a load that's only used partially, I guess.
forwprop1 generates

  _1 = *srcp_11(D);
  _2 = BIT_FIELD_REF <_1, 32, 128>;
  _3 = (float) _2;
  _4 = BIT_FIELD_REF <_1, 32, 160>;
  _5 = (float) _4;
  _6 = BIT_FIELD_REF <_1, 32, 192>;
  _7 = (float) _6;
  _8 = BIT_FIELD_REF <_1, 32, 224>;
  _9 = (float) _8;
  _12 = VEC_PERM_EXPR <_1, _1, { 4, 5, 6, 7, 4, 5, 6, 7 }>;
  _15 = BIT_FIELD_REF <_12, 128, 0>;
  _16 = (vector(4) float) _15;

and the second forwprop then sees

  _1 = *srcp_3(D);
  _4 = VEC_PERM_EXPR <_1, _1, { 4, 5, 6, 7, 4, 5, 6, 7 }>;
  _5 = BIT_FIELD_REF <_4, 128, 0>;
  _6 = (vector(4) float) _5;

In a single-use chain we probably want to try swapping the BIT_FIELD_REF
and the VEC_PERM_EXPR so that we expose the BIT_FIELD_REF directly to the
load, which we'd already handle.
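
To illustrate why the swap is valid for the chain above, here is a small Python model of the two operations. It is only a sketch and not GCC code: the helper names (vec_perm, bit_field_ref, fold_bit_field_ref_of_perm) are hypothetical, vectors are modelled as lists of 32-bit lanes, and the fold assumes both VEC_PERM_EXPR operands are the same vector, as in the dump.

```python
ELEM_BITS = 32  # assumption: 32-bit lanes, matching the dump above


def vec_perm(v0, v1, sel):
    """Model of VEC_PERM_EXPR <v0, v1, sel>: index i selects from v0 ++ v1."""
    both = list(v0) + list(v1)
    return [both[i % len(both)] for i in sel]


def bit_field_ref(v, size_bits, offset_bits):
    """Model of BIT_FIELD_REF <v, size, offset> for lane-aligned ranges only."""
    assert size_bits % ELEM_BITS == 0 and offset_bits % ELEM_BITS == 0
    first = offset_bits // ELEM_BITS
    return v[first:first + size_bits // ELEM_BITS]


def fold_bit_field_ref_of_perm(sel, nelts, size_bits, offset_bits):
    """Hypothetical fold for BIT_FIELD_REF <VEC_PERM_EXPR <v, v, sel>, size, offset>.

    Returns the bit offset into v that yields the same lanes, or None when
    the selected lanes are not a consecutive run of v.
    """
    first = offset_bits // ELEM_BITS
    count = size_bits // ELEM_BITS
    # Both operands are the same vector, so reduce each index modulo nelts.
    lanes = [i % nelts for i in sel[first:first + count]]
    if not lanes or any(b != a + 1 for a, b in zip(lanes, lanes[1:])):
        return None
    return lanes[0] * ELEM_BITS


# The chain seen by the second forwprop, with made-up lane values.
src = [10, 11, 12, 13, 14, 15, 16, 17]
sel = [4, 5, 6, 7, 4, 5, 6, 7]

via_perm = bit_field_ref(vec_perm(src, src, sel), 128, 0)
new_offset = fold_bit_field_ref_of_perm(sel, len(src), 128, 0)
direct = bit_field_ref(src, 128, new_offset)
```

In this model the fold rewrites the extraction to BIT_FIELD_REF <_1, 128, 128>, which reads the upper half of the loaded vector directly and leaves the permute unused. A selector whose extracted lanes are not consecutive is left alone (the function returns None).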