Hi,

In tree-vect-loop.c, vectorization is limited to loops that have exactly 2 BBs:

    /* Inner-most loop.  We currently require that the number of BBs is
       exactly 2 (the header and latch).  Vectorizable inner-most loops
       look like this:

                      (pre-header)
                         |
                        header <--------+
                         | |            |
                         | +--> latch --+
                         |
                      (exit-bb)  */

    if (loop->num_nodes != 2)
      {
        if (dump_enabled_p ())
          dump_printf_loc (MSG_MISSED_OPTIMIZATION, vect_location,
                           "not vectorized: control flow in loop.");
        return NULL;
      }

Any insights into why the limit is set to 2? We found that removing this limit actually improves performance for many applications.

Thanks,
Dehao
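
For illustration, here is a minimal sketch of the kind of loop the check above is concerned with. The function names (clamp_branchy, clamp_select) are hypothetical and do not come from GCC. The first loop has a branch in its body, so its CFG contains more than the header and latch unless an earlier pass flattens the branch. The second computes the same result with a conditional expression and no branch in the source; whether either one is actually vectorized depends on the GCC version, flags, and target, so treat this as an assumption-laden example rather than a statement about what the vectorizer does.

```c
#include <stddef.h>

/* Hypothetical example: the "if" adds control flow inside the loop body,
   so the loop has more than two basic blocks (header, latch, and the
   conditional store) unless the branch is if-converted first. */
void clamp_branchy(int *a, size_t n, int limit)
{
    for (size_t i = 0; i < n; i++) {
        if (a[i] > limit)
            a[i] = limit;
    }
}

/* Same result written as a select: every iteration stores a value, and
   the choice is made by a conditional expression instead of a branch. */
void clamp_select(int *a, size_t n, int limit)
{
    for (size_t i = 0; i < n; i++)
        a[i] = a[i] > limit ? limit : a[i];
}
```

Compiling both with -O3 and -fopt-info-vec-missed should show whether the "not vectorized: control flow in loop." message is emitted for either of them on a given compiler build.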