https://gcc.gnu.org/bugzilla/show_bug.cgi?id=121120

Richard Biener <rguenth at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|UNCONFIRMED                 |NEW
             Blocks|                            |53947
          Component|c++                         |tree-optimization
     Ever confirmed|0                           |1
            Version|unknown                     |16.0
   Last reconfirmed|                            |2025-07-16
           Severity|normal                      |enhancement

--- Comment #1 from Richard Biener <rguenth at gcc dot gnu.org> ---
Confirmed.  For 'f' we have

  <bb 2> [local count: 1073741824]:
  _4 = x_2(D)->a;
  _5 = y_3(D)->a;
  if (_4 == _5)
    goto <bb 3>; [50.00%]
  else
    goto <bb 5>; [50.00%]

  <bb 5> [local count: 536870912]:
  goto <bb 4>; [100.00%]

  <bb 3> [local count: 536870912]:
  _6 = x_2(D)->b;
  _7 = y_3(D)->b;
  _9 = _6 == _7;

  <bb 4> [local count: 1073741824]:
  # _8 = PHI <0(5), _9(3)>

This isn't currently handled.  For 'f2' we see

  <bb 2> [local count: 1073741824]:
  _4 = x_2(D)->a;
  _5 = y_3(D)->a;
  if (_4 == _5)
    goto <bb 3>; [50.00%]
  else
    goto <bb 5>; [50.00%]

  <bb 5> [local count: 536870912]:
  goto <bb 4>; [100.00%]

  <bb 3> [local count: 536870912]:
  _8 = MEM[(int *)x_2(D) + 8B];
  _9 = MEM[(int *)x_2(D) + 12B];
  _10 = MEM[(int *)y_3(D) + 8B];
  _11 = MEM[(int *)y_3(D) + 12B];
  _16 = _9 == _11;
  _17 = _8 == _10;
  _15 = _16 & _17;

  <bb 4> [local count: 1073741824]:
  # _7 = PHI <0(5), _15(3)>

This isn't fully handled either, but we do try to vectorize the _16 & _17
reduction.  That attempt also fails, because of the mixed types involved.
When making all elements int64_t we fail to vectorize the non-loop code
because x86 doesn't implement a bit-and reduction (and for cost reasons we
do not consider open-coding it).

That said, for the control-flow part there is another, duplicate bug report.
For mixed types, SLP build would need to either detect this as a memory
compare or widen to a common type.


Referenced Bugs:

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=53947
[Bug 53947] [meta-bug] vectorizer missed-optimizations
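
For illustration only, here is a minimal sketch of what "detect this as a memory compare" could mean at the source level. The original testcase is not quoted in the comment above, so the struct `S` and the functions `equal_fields` and `equal_bytes` are hypothetical names, and the layout (one 64-bit field followed by two 32-bit fields) is only an assumption loosely based on the offsets in the 'f2' dump. The byte-wise form is equivalent to the field-wise one only because this layout has no padding and holds integers only.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Hypothetical layout, NOT taken from the bug report's testcase:
// a 64-bit field at offset 0, then 32-bit fields at offsets 8 and 12.
struct S {
    std::int64_t a;
    std::int32_t b;
    std::int32_t c;
};

// The byte-wise comparison below is only valid if there is no padding.
static_assert(sizeof(S) == sizeof(std::int64_t) + 2 * sizeof(std::int32_t),
              "sketch assumes S has no padding bytes");

// Field-wise comparison with short-circuit evaluation; this shape produces
// the kind of control flow (branch plus PHI) discussed in the comment.
bool equal_fields(const S* x, const S* y) {
    return x->a == y->a && x->b == y->b && x->c == y->c;
}

// Same result expressed as one comparison of the object's bytes,
// which sidesteps the mixed element types entirely.
bool equal_bytes(const S* x, const S* y) {
    return std::memcmp(x, y, sizeof(S)) == 0;
}

// Small self-check showing that both forms agree on a few inputs.
void self_check() {
    S p{1, 2, 3}, q{1, 2, 3}, r{1, 2, 4};
    assert(equal_fields(&p, &q) == equal_bytes(&p, &q));
    assert(equal_fields(&p, &r) == equal_bytes(&p, &r));
}
```

Note that the byte-wise form reads all fields unconditionally, whereas the short-circuit form stops at the first mismatch. That is harmless here because both pointers refer to complete objects.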