https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102649

Bug ID: 102649
Summary: GCC 9.3.1 LTO bug -- incorrect function call, bad stack arguments pushed
Product: gcc
Version: 9.3.1
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: lto
Assignee: unassigned at gcc dot gnu.org
Reporter: davidhaufegcc at gmail dot com
CC: marxin at gcc dot gnu.org
Target Milestone: ---

Hello,

We witnessed incorrect application behavior in a large binary built using LTO. We identified the issue by stepping through the binary one assembly instruction at a time.

We have a function with 21 parameters. The function is called from many call-sites. In the instance that is not working properly, the C++ caller passes a hard-coded integer '0' for a parameter that is passed on the stack (i.e. not in a register). GCC ends up generating two versions of the called function under LTO: a version that takes this integer parameter, and one that optimizes out the need for this integer to be passed at all, as it is a hard-coded 0.

The issue is that the caller still pushes the integer 0 argument onto the stack. The callee does not expect the caller to have done this, and so reads its stack arguments from slots that are offset by this extra stack argument.

This issue was complicated to track down because some time later in our codebase, unrelated classes/files in the same static library as the caller were touched, and the bug has since stopped. Rolling back in Git, we can reproduce the bug over about 10 check-ins of unrelated code, and then an unrelated change causes the bug to stop. From that point on, GCC generates the proper stack argument passing for the optimized function.

Compile flag investigation:

All builds were done with -O3 -flto -fno-fat-lto-objects -ffast-math -funroll-loops

Disabling LTO -- bug does not present itself

With LTO on, we decomposed -ffast-math into its individual flags. If we leave all -ffast-math flags on but disable -freciprocal-math, the bug does not present itself.
The code in question doesn't have any division anywhere around it. We speculate that disabling -freciprocal-math, or the codebase generally changing, made the bug disappear simply because it changes the global state of the compilation. This made us very nervous, as there is no way to anticipate this bug going forward.

We are using the devtoolset-9 (GCC 9.3.1) centos7/rh7 package. Moving to the devtoolset-10 (GCC 10.2.1) package "fixes" the issue with the same code and build flags. devtoolset-8 (GCC 8.3.1) does not present the bug either.

Our concern is that the bug is not actually fixed, and that moving between versions of GCC is like changing our codebase by 10 unrelated check-ins or disabling -freciprocal-math: it simply changes the state of the compilation. The bug may or may not be fixed.

I would like to help in any way I can. This build generates a binary that is 200MB without debug symbols. It is a lot of code. I do not think we can create a smaller test case showing this behavior. I thought about doing a bisect of the GCC repo, but even that might just be changing the state of GCC and not actually showing that the bug is fixed.

It is a concerning bug. I can try to provide any further information that would be useful.

Thanks,
Dave Haufe

$ ./gcc -v
Using built-in specs.
COLLECT_GCC=./gcc
COLLECT_LTO_WRAPPER=/opt/rh/devtoolset-9/root/usr/libexec/gcc/x86_64-redhat-linux/9/lto-wrapper
Target: x86_64-redhat-linux
Configured with: ../configure --enable-bootstrap --enable-languages=c,c++,fortran,lto --prefix=/opt/rh/devtoolset-9/root/usr --mandir=/opt/rh/devtoolset-9/root/usr/share/man --infodir=/opt/rh/devtoolset-9/root/usr/share/info --with-bugurl=http://bugzilla.redhat.com/bugzilla --enable-shared --enable-threads=posix --enable-checking=release --enable-multilib --with-system-zlib --enable-__cxa_atexit --disable-libunwind-exceptions --enable-gnu-unique-object --enable-linker-build-id --with-gcc-major-version-only --with-linker-hash-style=gnu --with-default-libstdcxx-abi=gcc4-compatible --enable-plugin --enable-initfini-array --with-isl=/builddir/build/BUILD/gcc-9.3.1-20200408/obj-x86_64-redhat-linux/isl-install --disable-libmpx --enable-gnu-indirect-function --with-tune=generic --with-arch_32=x86-64 --build=x86_64-redhat-linux
Thread model: posix
gcc version 9.3.1 20200408 (Red Hat 9.3.1-2) (GCC)

$ cat /etc/*release*
CentOS Linux release 7.9.2009 (Core)
Derived from Red Hat Enterprise Linux 7.9 (Source)
cat: /etc/lsb-release.d: Is a directory
NAME="CentOS Linux"
VERSION="7 (Core)"
ID="centos"
ID_LIKE="rhel fedora"
VERSION_ID="7"
PRETTY_NAME="CentOS Linux 7 (Core)"
ANSI_COLOR="0;31"
CPE_NAME="cpe:/o:centos:centos:7"
HOME_URL="https://www.centos.org/"
BUG_REPORT_URL="https://bugs.centos.org/"
CENTOS_MANTISBT_PROJECT="CentOS-7"
CENTOS_MANTISBT_PROJECT_VERSION="7"
REDHAT_SUPPORT_PRODUCT="centos"
REDHAT_SUPPORT_PRODUCT_VERSION="7"
CentOS Linux release 7.9.2009 (Core)
CentOS Linux release 7.9.2009 (Core)
cpe:/o:centos:centos:7

Example of .cpp file compile with args

g++ -m64 -std=c++17 -Wsuggest-override -Wduplicated-cond -Wduplicated-branches -Wcast-qual -Wmissing-include-dirs -Wall -Werror -Wextra -fno-strict-aliasing -ggdb -frecord-gcc-switches -I. -I...... -O3 -flto -fno-fat-lto-objects -ffast-math -funroll-loops -c ServiceThread.cpp -o release/gcc/ServiceThread.o

Example of final link

g++ -Werror -Wl,--fatal-warnings release/gcc/main.o ...many *.a libs ... -lcap -lnuma -lpthread -lrt -ldl -lutil -lstdc++ -lstdc++fs -lm -lcrypto -lz -flto=4 -O3 -ffast-math -funroll-loops -o ./release/gcc/app
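
For illustration only, the call shape described in this report can be sketched as below. This is a hypothetical sketch, not the actual code and not a reproducer: the names `weighted_sum`, `call_with_constant_zero`, `expected_sum` and `self_check` are invented, the parameter types are assumed to be plain integers, and the choice of the tenth parameter for the literal 0 is arbitrary. It only shows a 21-parameter function, a call site passing a literal 0 in a stack-passed position, and a runtime check of the kind that would notice arguments being read from shifted stack slots.

```cpp
#include <cassert>

// Hypothetical stand-in for the 21-parameter function from the report.
// Each argument gets a distinct weight, so an argument read from the wrong
// stack slot changes the result.
// noinline is a GCC/Clang attribute; it is used here only so that a real
// call with real argument passing is emitted.
__attribute__((noinline))
long weighted_sum(long a1, long a2, long a3, long a4, long a5, long a6,
                  long a7, long a8, long a9, long a10, long a11, long a12,
                  long a13, long a14, long a15, long a16, long a17, long a18,
                  long a19, long a20, long a21)
{
    const long args[21] = {a1,  a2,  a3,  a4,  a5,  a6,  a7,
                           a8,  a9,  a10, a11, a12, a13, a14,
                           a15, a16, a17, a18, a19, a20, a21};
    long sum = 0;
    for (int i = 0; i < 21; ++i)
        sum += (i + 1) * args[i];
    return sum;
}

// Call site of the kind described: a literal 0 for the tenth parameter.
// Under the x86-64 System V ABI the first six integer arguments travel in
// registers, so the tenth one is passed on the stack.
long call_with_constant_zero(long base)
{
    return weighted_sum(base + 1,  base + 2,  base + 3,  base + 4,
                        base + 5,  base + 6,  base + 7,  base + 8,
                        base + 9,  0,         base + 11, base + 12,
                        base + 13, base + 14, base + 15, base + 16,
                        base + 17, base + 18, base + 19, base + 20,
                        base + 21);
}

// Reference value computed without going through the 21-argument call.
long expected_sum(long base)
{
    long sum = 0;
    for (long i = 1; i <= 21; ++i)
        if (i != 10)                 // the tenth argument is the literal 0
            sum += i * (base + i);
    return sum;
}

// Runtime self-check: fails if the callee sees shifted arguments.
void self_check(long base)
{
    assert(call_with_constant_zero(base) == expected_sum(base));
}
```

A check like `self_check` only helps when assertions are enabled (no `-DNDEBUG`) and when the affected call site is the one being exercised; in a small standalone file the compiler is unlikely to make the same cloning decisions as in the full LTO build.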