https://gcc.gnu.org/bugzilla/show_bug.cgi?id=70157
--- Comment #4 from Uroš Bizjak <ubizjak at gmail dot com> ---
(In reply to H.J. Lu from comment #3)
> (In reply to Uroš Bizjak from comment #2)
> > (In reply to H.J. Lu from comment #1)
> > > It is due to TARGET_SSE_TYPELESS_STORES.
> >
> > This is by design, movaps/movups is one byte shorter than movdqa/movdqu.
>
>         movdqu  y(%rip), %xmm0  <<<< Shouldn't movups be used here?
>         movups  %xmm0, x(%rip)

No, we only have typeless stores. __float128 is otherwise handled like TImode
move; it is not a vector of SF or DFmode values.
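
For reference, the "one byte shorter" remark quoted above can be checked against the legacy (non-VEX) SSE encodings. The sketch below is only an illustration, not part of GCC: the array and function names are made up for this example, and it uses the register-to-register load forms with ModRM byte 0xC1 (%xmm1 to %xmm0) so that no displacement bytes are involved. The store opcodes (0F 29, 0F 11, 66 0F 7F, F3 0F 7F) differ in length in the same way.

```c
#include <assert.h>
#include <stddef.h>

/* Legacy SSE encodings of "<insn> %xmm1, %xmm0" (ModRM = 0xC1).
   Names are hypothetical and exist only for this illustration. */
static const unsigned char enc_movaps[] = { 0x0F, 0x28, 0xC1 };       /* no prefix   */
static const unsigned char enc_movups[] = { 0x0F, 0x10, 0xC1 };       /* no prefix   */
static const unsigned char enc_movdqa[] = { 0x66, 0x0F, 0x6F, 0xC1 }; /* 0x66 prefix */
static const unsigned char enc_movdqu[] = { 0xF3, 0x0F, 0x6F, 0xC1 }; /* 0xF3 prefix */

/* Bytes saved by using movaps instead of movdqa. */
size_t saved_bytes_aligned(void)
{
    return sizeof enc_movdqa - sizeof enc_movaps;
}

/* Bytes saved by using movups instead of movdqu. */
size_t saved_bytes_unaligned(void)
{
    return sizeof enc_movdqu - sizeof enc_movups;
}

/* Sanity check: the typeless forms lack the mandatory prefix byte. */
void check_encoding_sizes(void)
{
    assert(saved_bytes_aligned() == 1);
    assert(saved_bytes_unaligned() == 1);
}
```

The difference comes entirely from the mandatory 0x66 or 0xF3 prefix on the integer forms, which the movaps/movups forms do not carry.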