https://gcc.gnu.org/bugzilla/show_bug.cgi?id=78205
Bug ID: 78205
Summary: BB vectorization confused by too large load groups
Product: gcc
Version: 7.0
Status: UNCONFIRMED
Keywords: missed-optimization
Severity: normal
Priority: P3
Component: tree-optimization
Assignee: unassigned at gcc dot gnu.org
Reporter: rguenth at gcc dot gnu.org
Blocks: 53947
Target Milestone: ---

double x, a[4], b[4], c[5];

void foo ()
{
  a[0] = c[0];
  a[1] = c[1];
  a[2] = c[0];
  a[3] = c[1];
  b[0] = c[2];
  b[1] = c[3];
  b[2] = c[2];
  b[3] = c[3];
  x = c[4];
}

If you comment out the load from c[4], the testcase is vectorized. If not,
we run into

  /* ??? The following is overly pessimistic (as well as the loop
     case above) in the case we can statically determine the excess
     elements loaded are within the bounds of a decl that is accessed.
     Likewise for BB vectorizations using masked loads is a possibility.  */
  if (bb_vinfo && slp_perm && group_size % nunits != 0)
    {
      dump_printf_loc (MSG_MISSED_OPTIMIZATION, vect_location,
                       "BB vectorization with gaps at the end of a load "
                       "is not supported\n");
      return false;
    }

I do have a hackish patch.

Referenced Bugs:

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=53947
[Bug 53947] [meta-bug] vectorizer missed-optimizations
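
To illustrate why the extra load from c[4] matters, the quoted condition can be restated as a standalone predicate. This is only a sketch: bb_load_rejected is a hypothetical helper, not a GCC function, and the lane count of 2 used in the note below assumes 128-bit vectors of double; the vectorizer may pick a different vector size on a given target.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical helper, not part of GCC: restates the quoted test
   "bb_vinfo && slp_perm && group_size % nunits != 0".
   group_size is the number of elements in the load group,
   nunits the number of lanes in the chosen vector type.  */
bool
bb_load_rejected (bool is_bb_vect, bool has_load_perm,
                  unsigned group_size, unsigned nunits)
{
  assert (nunits > 0);
  return is_bb_vect && has_load_perm && group_size % nunits != 0;
}
```

Under these assumptions, the loads from c[0] through c[4] form a group of 5 elements, so bb_load_rejected (true, true, 5, 2) is true and the basic block is not vectorized. With the load from c[4] commented out the group shrinks to 4 elements, 4 % 2 is 0, and the check passes.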