https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108953
--- Comment #11 from Jakub Jelinek <jakub at gcc dot gnu.org> ---
The description of the pass feels like it is doing invalid transformations. Transforming such comparisons into memcmp (provided there is no padding and all comparisons are 8-bit or little endian) is fine, but unless it is proven that parts of the structs can't trap, I don't see how e.g. the following could be turned into, say, unaligned 64-bit loads and a comparison of those. That said, clang++ trunk doesn't optimize any of that at all, not even to memcmp.

#include <compare>
struct S { char a, b; auto operator<=>(const S &) const = default; };
struct T { struct S c; char d, e; auto operator<=>(const T &) const = default; };
struct U { struct T f; char g, h, i, j; auto operator<=>(const U &) const = default; };
bool foo (U *p, U *q)
{
  if (p->f.c.a != q->f.c.a) return false;
  if (p->f.c.b != q->f.c.b) return false;
  if (p->f.d != q->f.d) return false;
  if (p->f.e != q->f.e) return false;
  if (p->g != q->g) return false;
  if (p->h != q->h) return false;
  if (p->i != q->i) return false;
  if (p->j != q->j) return false;
  return true;
}
std::strong_ordering bar (U *p, U *q) { return *p <=> *q; }
bool baz (U *p, U *q) { return *p == *q; }

Now, for the defaulted operator== or <=> perhaps loading all the bytes is ok, but for user-written operators what clang is doing in #c0 feels very dangerous, with a high chance of breaking real-world code. Not all code has all the bytes of its structures allocated, for example because a member is a union whose alternatives have different sizes and the allocation is sized with offsetof + sizeof, etc. So, unless all the loads are aligned to their size, one can't assume that, say, g in the #c0 testcase will be mapped when a is. The GCC sources themselves are just one of many examples where that is violated.
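To make the concern about unmapped bytes concrete, here is a minimal sketch of the difference between the early-exit comparison in foo and a merged wide load. It uses raw byte buffers instead of struct U so that the example itself stays well defined. The helper names bytewise_equal8 and wide_equal8 and the bytes_read counter are hypothetical, introduced only for illustration; they are not taken from clang or GCC.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>

// Hypothetical stand-in for foo: compares 8 one-byte fields in order and
// stops at the first mismatch. bytes_read reports how many bytes were
// loaded from each side before returning.
bool bytewise_equal8(const unsigned char *p, const unsigned char *q,
                     std::size_t &bytes_read)
{
    bytes_read = 0;
    for (std::size_t i = 0; i < 8; ++i) {
        ++bytes_read;
        if (p[i] != q[i])
            return false;   // later bytes are never touched
    }
    return true;
}

// Hypothetical "merged" form: one unaligned 8-byte load per side.
// It always reads all 8 bytes, so it is only valid when all 8 bytes
// of both objects are known to be accessible.
bool wide_equal8(const unsigned char *p, const unsigned char *q)
{
    std::uint64_t a, b;
    std::memcpy(&a, p, sizeof a);
    std::memcpy(&b, q, sizeof b);
    return a == b;
}
```

With two 2-byte buffers that differ in their first byte, bytewise_equal8 loads exactly one byte from each and returns false, which is fine. Calling wide_equal8 on the same buffers would read 6 bytes past the end of each. When all 8 bytes are accessible, the two functions agree, which is why the transformation is only safe once that accessibility has been proven.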