https://gcc.gnu.org/bugzilla/show_bug.cgi?id=61194
--- Comment #12 from Richard Biener <rguenth at gcc dot gnu.org> ---
void bar2()
{
  for (int i=0; i<1024; ++i)
    {
      k[i] = x[i]>0;
      j[i] = w[i]<0;
      z[i] = ( k[i] & j[i]) ? z[i] : y[i];
    }
}

has similar issues (non-single-uses due to CSE and propagating from the
conversion sources):

  _5 = x[i_20];
  _6 = _5 > 0.0;
  _7 = (int) _6;
  k[i_20] = _7;
  _9 = w[i_20];
  _10 = _9 < 0.0;
  _11 = (int) _10;
  j[i_20] = _11;
  _18 = _10 & _6;
  iftmp.0_14 = z[i_20];
  iftmp.0_15 = y[i_20];
  iftmp.0_2 = _18 ? iftmp.0_14 : iftmp.0_15;
  z[i_20] = iftmp.0_2;

This is generally caused by optimizing code to use smaller precisions.

So I think we need a more general solution for this than just the 2nd
patch I attached (which I won't pursue - I figure the first one would be
way more useful as it results in the same result for your initial large
testcase where the 2nd patch doesn't make a difference).
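
For anyone wanting to reproduce this, here is a self-contained sketch of
the bar2 testcase. The array declarations are not part of the comment
above, so they are assumptions: float for x, w, y and z (the dump only
shows comparisons against 0.0, so double would fit as well) and int for
k and j (the dump converts the comparison results with (int)). The N
macro is likewise only there for illustration.

```c
/* Assumed declarations: the comment does not show them.  The element
   types are guessed from the dump (x and w are compared against 0.0,
   k and j are stored through an (int) conversion).  */
#define N 1024

float x[N], w[N], y[N], z[N];
int k[N], j[N];

void bar2(void)
{
  for (int i = 0; i < N; ++i)
    {
      k[i] = x[i] > 0;   /* comparison result stored as int */
      j[i] = w[i] < 0;   /* likewise */
      /* Keep z[i] when both comparisons hold, otherwise take y[i].  */
      z[i] = (k[i] & j[i]) ? z[i] : y[i];
    }
}
```

Compiling with something like "gcc -std=c99 -O3 -fdump-tree-optimized"
should make it possible to look at the GIMPLE, though the dump quoted
above may come from a different pass. The point to look for is that the
boolean comparison results (_6 and _10 in the dump) each have two uses:
one feeding the (int) conversion that is stored to k[] or j[], and one
feeding the & that controls the conditional.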