https://gcc.gnu.org/bugzilla/show_bug.cgi?id=96827
Bug ID: 96827
Summary: __m128i from _mm_set_epi32 is backwards with -O3
Product: gcc
Version: 10.2.1
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: c
Assignee: unassigned at gcc dot gnu.org
Reporter: gcc at froghat dot ca
Target Milestone: ---

Created attachment 49142
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=49142&action=edit
minimalish preprocessed sources that compiles to differently behaved programs based on optimization level using gcc 10

Hi. The attached file reproduces an issue where an __m128i value appears backwards on gcc versions 10 and later with -O3. I don't know what's going on, but the compiled program behaves differently from the one produced by other compilers, older versions of gcc, or gcc 10 at lower optimization levels.

The output I expect is "6 4 2 0"; with gcc 10 -O3 I get "0 2 4 6".

There are a couple of modifications to the source that make the issue go away, like adding...

> if (dude_[0] == 1234) dude_[0]--;

...after the for loop (but not before). Or using a loop conditional of `i < 2` instead of `i < 3`. And again, this happens with -O3 but not with -O2.

clang 10 and apparently gcc 5 and 8 both give the expected output.
But this is not the case with the gcc 10 from my system, nor with a gcc compiled from a recent checkout (82030d51017323c5706d58d8c8626324ece007e4).

My system gcc:

gcc version 10.2.1 20200723 (Red Hat 10.2.1-1) (GCC)
Target: x86_64-redhat-linux
Configured with: ../configure --enable-bootstrap --enable-languages=c,c++,fortran,objc,obj-c++,ada,go,d,lto --prefix=/usr --mandir=/usr/share/man --infodir=/usr/share/info --with-bugurl=http://bugzilla.redhat.com/bugzilla --enable-shared --enable-threads=posix --enable-checking=release --enable-multilib --with-system-zlib --enable-__cxa_atexit --disable-libunwind-exceptions --enable-gnu-unique-object --enable-linker-build-id --with-gcc-major-version-only --with-linker-hash-style=gnu --enable-plugin --enable-initfini-array --with-isl --enable-offload-targets=nvptx-none --without-cuda-driver --enable-gnu-indirect-function --enable-cet --with-tune=generic --with-arch_32=i686 --build=x86_64-redhat-linux

A gcc from a recent git checkout:

Target: x86_64-pc-linux-gnu
Configured with: /home/sqwishy/src/gcc/configure --enable-languages=c --disable-multilib
gcc version 11.0.0 20200827 (experimental) (GCC)

This is the command and output of compiling the attached file (min.i) with the version of gcc I built from a recent checkout.

> ./usr/local/bin/gcc -v -O3 -Wall -Wextra -fno-strict-aliasing -fwrapv
> -fno-aggressive-loop-optimizations min.i
Using built-in specs.
COLLECT_GCC=./usr/local/bin/gcc
COLLECT_LTO_WRAPPER=/opt/gcc-inst/usr/local/bin/../libexec/gcc/x86_64-pc-linux-gnu/11.0.0/lto-wrapper
Target: x86_64-pc-linux-gnu
Configured with: /home/sqwishy/src/gcc/configure --enable-languages=c --disable-multilib
Thread model: posix
Supported LTO compression algorithms: zlib zstd
gcc version 11.0.0 20200827 (experimental) (GCC)
COLLECT_GCC_OPTIONS='-v' '-O3' '-Wall' '-Wextra' '-fno-strict-aliasing' '-fwrapv' '-fno-aggressive-loop-optimizations' '-mtune=generic' '-march=x86-64' '-dumpdir' 'a-'
/opt/gcc-inst/usr/local/bin/../libexec/gcc/x86_64-pc-linux-gnu/11.0.0/cc1 -fpreprocessed min.i -quiet -dumpdir a- -dumpbase min.i -dumpbase-ext .i -mtune=generic -march=x86-64 -O3 -Wall -Wextra -version -fno-strict-aliasing -fwrapv -fno-aggressive-loop-optimizations -o /tmp/ccUHO9zb.s
GNU C17 (GCC) version 11.0.0 20200827 (experimental) (x86_64-pc-linux-gnu)
compiled by GNU C version 11.0.0 20200827 (experimental), GMP version 6.1.2, MPFR version 4.0.2-p9, MPC version 1.1.0, isl version none
GGC heuristics: --param ggc-min-expand=30 --param ggc-min-heapsize=4096
GNU C17 (GCC) version 11.0.0 20200827 (experimental) (x86_64-pc-linux-gnu)
compiled by GNU C version 11.0.0 20200827 (experimental), GMP version 6.1.2, MPFR version 4.0.2-p9, MPC version 1.1.0, isl version none
GGC heuristics: --param ggc-min-expand=30 --param ggc-min-heapsize=4096
Compiler executable checksum: 04849eadbd492e04db2b98f5ed9cc34b
COLLECT_GCC_OPTIONS='-v' '-O3' '-Wall' '-Wextra' '-fno-strict-aliasing' '-fwrapv' '-fno-aggressive-loop-optimizations' '-mtune=generic' '-march=x86-64' '-dumpdir' 'a-'
as -v --64 -o /tmp/ccMwZ4Yb.o /tmp/ccUHO9zb.s
GNU assembler version 2.34 (x86_64-redhat-linux) using BFD version version 2.34-4.fc32
COMPILER_PATH=/opt/gcc-inst/usr/local/bin/../libexec/gcc/x86_64-pc-linux-gnu/11.0.0/:/opt/gcc-inst/usr/local/bin/../libexec/gcc/
LIBRARY_PATH=/opt/gcc-inst/usr/local/bin/../lib/gcc/x86_64-pc-linux-gnu/11.0.0/:/opt/gcc-inst/usr/local/bin/../lib/gcc/:/opt/gcc-inst/usr/local/bin/../lib/gcc/x86_64-pc-linux-gnu/11.0.0/../../../../lib64/:/lib/../lib64/:/usr/lib/../lib64/:/opt/gcc-inst/usr/local/bin/../lib/gcc/x86_64-pc-linux-gnu/11.0.0/../../../:/lib/:/usr/lib/
COLLECT_GCC_OPTIONS='-v' '-O3' '-Wall' '-Wextra' '-fno-strict-aliasing' '-fwrapv' '-fno-aggressive-loop-optimizations' '-mtune=generic' '-march=x86-64' '-dumpdir' 'a.'
/opt/gcc-inst/usr/local/bin/../libexec/gcc/x86_64-pc-linux-gnu/11.0.0/collect2 -plugin /opt/gcc-inst/usr/local/bin/../libexec/gcc/x86_64-pc-linux-gnu/11.0.0/liblto_plugin.so -plugin-opt=/opt/gcc-inst/usr/local/bin/../libexec/gcc/x86_64-pc-linux-gnu/11.0.0/lto-wrapper -plugin-opt=-fresolution=/tmp/ccF6LwQe.res -plugin-opt=-pass-through=-lgcc -plugin-opt=-pass-through=-lgcc_s -plugin-opt=-pass-through=-lc -plugin-opt=-pass-through=-lgcc -plugin-opt=-pass-through=-lgcc_s --eh-frame-hdr -m elf_x86_64 -dynamic-linker /lib64/ld-linux-x86-64.so.2 /lib/../lib64/crt1.o /lib/../lib64/crti.o /opt/gcc-inst/usr/local/bin/../lib/gcc/x86_64-pc-linux-gnu/11.0.0/crtbegin.o -L/opt/gcc-inst/usr/local/bin/../lib/gcc/x86_64-pc-linux-gnu/11.0.0 -L/opt/gcc-inst/usr/local/bin/../lib/gcc -L/opt/gcc-inst/usr/local/bin/../lib/gcc/x86_64-pc-linux-gnu/11.0.0/../../../../lib64 -L/lib/../lib64 -L/usr/lib/../lib64 -L/opt/gcc-inst/usr/local/bin/../lib/gcc/x86_64-pc-linux-gnu/11.0.0/../../.. /tmp/ccMwZ4Yb.o -lgcc --push-state --as-needed -lgcc_s --pop-state -lc -lgcc --push-state --as-needed -lgcc_s --pop-state /opt/gcc-inst/usr/local/bin/../lib/gcc/x86_64-pc-linux-gnu/11.0.0/crtend.o /lib/../lib64/crtn.o
COLLECT_GCC_OPTIONS='-v' '-O3' '-Wall' '-Wextra' '-fno-strict-aliasing' '-fwrapv' '-fno-aggressive-loop-optimizations' '-mtune=generic' '-march=x86-64' '-dumpdir' 'a.'
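
For reference, here is a minimal sketch of the lane order that `_mm_set_epi32` is documented to produce. This is not the attached min.i and does not reproduce the miscompile; it only illustrates what "backwards" means for the expected "6 4 2 0". The helper names `store_lanes` and `print_high_to_low` are hypothetical, and the choice of 6, 4, 2, 0 as arguments is an assumption made to match the expected output quoted in the report.

```c
#include <emmintrin.h>  /* SSE2: __m128i, _mm_set_epi32, _mm_storeu_si128 */
#include <stdio.h>

/* Hypothetical helper: copy the four 32-bit lanes of v into out[0..3],
   lowest lane first (memory order on little-endian x86). */
void store_lanes(__m128i v, int out[4])
{
    _mm_storeu_si128((__m128i *)out, v);
}

/* Hypothetical helper: print the lanes from highest to lowest, which is
   the same order in which _mm_set_epi32 takes its arguments. */
void print_high_to_low(__m128i v)
{
    int lanes[4];
    store_lanes(v, lanes);
    printf("%d %d %d %d\n", lanes[3], lanes[2], lanes[1], lanes[0]);
}
```

`_mm_set_epi32(e3, e2, e1, e0)` places its last argument in the lowest lane, so `store_lanes(_mm_set_epi32(6, 4, 2, 0), lanes)` should leave `lanes` as {0, 2, 4, 6}, and `print_high_to_low` on the same value should print "6 4 2 0" at any optimization level.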