https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106902

            Bug ID: 106902
           Summary: Program compiled with -O3 -mfma produces different result
           Product: gcc
           Version: 12.2.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: tree-optimization
          Assignee: unassigned at gcc dot gnu.org
          Reporter: jhllawrence963 at gmail dot com
  Target Milestone: ---

Created attachment 53560
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=53560&action=edit
Sample C++ program

Compiling the attached sample program with g++ -mfma -O3 and executing it
produces the wrong output starting with GCC version 11.1. The expected output
is approximately 0.905017, but the actual output is -415762. GCC 10.4 and
earlier work as expected. Compiling with other optimization flags, or with
-mno-fma, also works as expected.

About the program: It starts with an array of 1s, performs a local average for
each element, then prints one result from the middle of the array. The
algorithm has been reduced to remove code that is not needed to reproduce the
bug, which is why the expected output is not exactly 1. The sample still
contains some code that is not relevant to the bug, but removing it makes the
bug no longer reproducible. The relevant parts have been marked with "FIXME"
comments.

I'm not 100% certain, but there appears to be some loss of precision which
gets compounded because the result of one loop iteration is used as an input
to the following iterations. The program output becomes more incorrect as the
input array size increases.

GCC Version:

$ gcc -v
Using built-in specs.
COLLECT_GCC=gcc
COLLECT_LTO_WRAPPER=/usr/lib/gcc/x86_64-pc-linux-gnu/12.2.0/lto-wrapper
Target: x86_64-pc-linux-gnu
Configured with: /build/gcc/src/gcc/configure --enable-languages=c,c++,ada,fortran,go,lto,objc,obj-c++,d --enable-bootstrap --prefix=/usr --libdir=/usr/lib --libexecdir=/usr/lib --mandir=/usr/share/man --infodir=/usr/share/info --with-bugurl=https://bugs.archlinux.org/ --with-build-config=bootstrap-lto --with-linker-hash-style=gnu --with-system-zlib --enable-__cxa_atexit --enable-cet=auto --enable-checking=release --enable-clocale=gnu --enable-default-pie --enable-default-ssp --enable-gnu-indirect-function --enable-gnu-unique-object --enable-libstdcxx-backtrace --enable-link-serialization=1 --enable-linker-build-id --enable-lto --enable-multilib --enable-plugin --enable-shared --enable-threads=posix --disable-libssp --disable-libstdcxx-pch --disable-werror
Thread model: posix
Supported LTO compression algorithms: zlib zstd
gcc version 12.2.0 (GCC)