https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91861
Bug ID: 91861
Summary: invalid vectorization of isless, islessequal, etc.
Product: gcc
Version: 9.2.0
Status: UNCONFIRMED
Keywords: missed-optimization, wrong-code
Severity: normal
Priority: P3
Component: target
Assignee: unassigned at gcc dot gnu.org
Reporter: kretz at kde dot org
Target Milestone: ---
Target: x86_64-*-*, i?86-*-*

Test case (cf. https://godbolt.org/z/z3TH9F):

    #include <cmath>

    using V [[gnu::vector_size(16)]] = float;

    V f(V x, V y) {
      int r [[gnu::vector_size(16)]];
      for (int i = 0; i < 4; ++i) {
        r[i] = -std::isless(x[i], y[i]);
      }
      return reinterpret_cast<V>(r);
    }

Using `-O3`, the `std::isless` calls are vectorized to a cmpnltps instruction
which will raise FE_INVALID if one of the arguments is NaN. However, the math.h
compare functions are not allowed to raise FP exceptions.

There's also a missed optimization here: Starting with AVX, one of the quiet
compare instructions can be used. E.g. translate isless to cmpp[sd] with
predicate LT_OQ (0x11).

Without AVX, it's possible to use CMPORDPS and PCMPGTD:

    isless(x, y) =>
      xi = reinterpret as int vector(x)
      yi = reinterpret as int vector(y)
      xp = xi < 0 ? -(xi & 0x7fffffff) : xi
      yp = yi < 0 ? -(yi & 0x7fffffff) : yi
      cmpord(x, y) && (xp < yp)

Whether that's faster than a loop over ucomiss is still to be shown.
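
For illustration, the non-AVX sequence above can be modelled for a single lane in portable C++. This is only a sketch of the idea, not what GCC emits or should emit: the names `ordered_key`, `isless_via_int` and `leaves_invalid_clear` are hypothetical, `std::isunordered` stands in for CMPORDPS, and the signed integer `<` stands in for PCMPGTD with swapped operands. It assumes `float` is IEEE-754 binary32.

```cpp
#include <cfenv>
#include <cmath>
#include <cstdint>
#include <cstring>

// Hypothetical helper: map the sign-magnitude bit pattern of a binary32 value
// to a signed integer whose ordering matches the numeric ordering.
// Mirrors `xp = xi < 0 ? -(xi & 0x7fffffff) : xi`; -0.0 and +0.0 both map to 0.
inline std::int32_t ordered_key(float v) {
    std::int32_t bits;
    std::memcpy(&bits, &v, sizeof bits);  // reinterpret as int, no FP operation
    return bits < 0 ? -(bits & 0x7fffffff) : bits;
}

// One-lane model of `cmpord(x, y) && (xp < yp)`.
inline bool isless_via_int(float x, float y) {
    // Quiet unordered check; plays the role of CMPORDPS for quiet NaNs.
    const bool ordered = !std::isunordered(x, y);
    // Integer comparison; cannot raise a floating-point exception.
    return ordered && ordered_key(x) < ordered_key(y);
}

// Hypothetical check: true if evaluating isless_via_int(x, y) leaves
// FE_INVALID clear, which is the property the report is about.
inline bool leaves_invalid_clear(float x, float y) {
    std::feclearexcept(FE_INVALID);
    volatile bool r = isless_via_int(x, y);
    (void)r;
    return std::fetestexcept(FE_INVALID) == 0;
}
```

The model says nothing about performance; whether the vector form beats a loop over ucomiss remains open, as noted above.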