https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98774
--- Comment #3 from Ivan Sorokin <vanyacpp at gmail dot com> ---
(In reply to Hongtao.liu from comment #1)
> It's fixed in current trunk https://godbolt.org/z/63576n

I can confirm that GCC trunk now uses packed multiplication (mulpd), although it
uses it somewhat inefficiently. The original program contains 8 multiplications.
Clang does 4 packed multiplications, while GCC trunk does 6.
https://godbolt.org/z/EabPxT
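
To illustrate only the arithmetic behind the counts above: assuming 128-bit SSE2 vectors, one mulpd multiplies two doubles, so 8 independent double multiplications need at least 4 packed multiplications. The function below (mul8) is a hypothetical stand-in, not the test case from this report, and no claim is made about the code any compiler emits for it.

```cpp
#include <array>
#include <cstddef>

// Hypothetical kernel, NOT the program from this bug report.
// It performs 8 independent double multiplications. With 2 doubles
// per 128-bit register, the minimum is 8 / 2 = 4 packed multiplies.
std::array<double, 8> mul8(const std::array<double, 8>& a,
                           const std::array<double, 8>& b)
{
    std::array<double, 8> r{};
    for (std::size_t i = 0; i != 8; ++i)
        r[i] = a[i] * b[i];  // one scalar multiplication per element
    return r;
}
```

Seen this way, 6 packed multiplications for 8 scalar ones means some vector lanes are computing values that are either redundant or unused.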