https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99785
--- Comment #7 from Mike Hommey <mh+gcc at glandium dot org> ---
It's worth noting that the clang variant of the code makes use of __builtin_shufflevector, which the gcc variant doesn't (per https://searchfox.org/mozilla-central/source/gfx/wr/swgl/src/vector_type.h), so the build time comparison might be influenced by that. clang does manage to inline blend_pixels, though, and the resulting code is much smaller than what GCC produces.
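
For context, here is a minimal sketch of the two shuffle builtins involved. It is not taken from vector_type.h: the names float4, int4 and reverse4 are made up for illustration. The clang branch uses __builtin_shufflevector with constant indices, while the other branch uses GCC's __builtin_shuffle with an integer mask vector (GCC 12 and later also accept __builtin_shufflevector, as far as I know).

```c
/* Hypothetical 4-lane vector types built on the GCC/clang vector_size
   extension; these names are illustrative, not swgl's. */
typedef float float4 __attribute__((vector_size(16)));
typedef int int4 __attribute__((vector_size(16)));

/* Reverse the lanes of a 4-float vector. */
static inline float4 reverse4(float4 v) {
#if defined(__clang__)
    /* clang: the lane indices are compile-time constants passed as
       extra arguments; indices 0..3 select from the first operand. */
    return __builtin_shufflevector(v, v, 3, 2, 1, 0);
#else
    /* GCC: the lane indices are given as an integer vector with the
       same lane count and lane width as the input vector. */
    const int4 mask = {3, 2, 1, 0};
    return __builtin_shuffle(v, mask);
#endif
}
```

Calling reverse4 on {1, 2, 3, 4} should give {4, 3, 2, 1} on either compiler. Whether the two forms generate equivalent code for the swgl sources is a separate question that this sketch does not answer.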