https://gcc.gnu.org/bugzilla/show_bug.cgi?id=85466
--- Comment #14 from Daniel Elliott <cpphackster at gmail dot com> --- I had a response from Chandler Carruth on Twitter, who told me that in the benchmark the computation was being hoisted out of the loop, which is why clang was faster. He also said that the no-conditional version was not vectorized.
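To illustrate the hoisting problem, here is a minimal sketch. It is not the benchmark from this bug: `sum_squares`, `bench_invariant` and `bench_varying` are hypothetical names made up for the example. In the first harness the timed call sees identical inputs on every repetition, so an optimizer that can inline it and prove it has no side effects may compute it once and reuse the result. Whether a given compiler actually does this depends on the version and flags.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical kernel standing in for the code under test.
inline std::uint64_t sum_squares(const std::vector<std::uint32_t>& v) {
    std::uint64_t acc = 0;
    for (std::uint32_t x : v) {
        acc += static_cast<std::uint64_t>(x) * x;
    }
    return acc;
}

// Naive harness: sum_squares(v) has the same inputs on every repetition,
// so the call is loop-invariant and a compiler is free to hoist it out,
// leaving a loop that only adds a constant.
inline std::uint64_t bench_invariant(const std::vector<std::uint32_t>& v,
                                     std::size_t reps) {
    std::uint64_t total = 0;
    for (std::size_t i = 0; i < reps; ++i) {
        total += sum_squares(v);
    }
    return total;
}

// Variant: one element changes on each repetition, so the result depends
// on i and the kernel has to be re-evaluated. Assumes v is non-empty.
inline std::uint64_t bench_varying(std::vector<std::uint32_t>& v,
                                   std::size_t reps) {
    std::uint64_t total = 0;
    for (std::size_t i = 0; i < reps; ++i) {
        v[i % v.size()] += 1;
        total += sum_squares(v);
    }
    return total;
}
```

Timing both harnesses with each compiler would be one way to check whether a gap comes from the kernel itself or from the harness being optimized away. Returning `total` and using it in the caller also keeps the loop from being removed as dead code.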