https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100799
Bug ID: 100799
Summary: Stackoverflow in optimized code on PPC
Product: gcc
Version: 10.3.0
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: fortran
Assignee: unassigned at gcc dot gnu.org
Reporter: alexander.gr...@tu-dresden.de
Target Milestone: ---

Created attachment 50879
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=50879&action=edit
Disassembly of dbgebal_ in debug and release modes

Quick summary of the use case: When using FlexiBLAS with OpenBLAS I noticed corruption of the parameters passed to OpenBLAS functions. FlexiBLAS basically provides a BLAS interface where each function is a stub that forwards its arguments to a real BLAS library such as OpenBLAS.

Example:

    void FC_GLOBAL(dgebal,DGEBAL)(char* job, blasint* n, double* a, blasint* lda,
                                  blasint* ilo, blasint* ihi, double* scale, blasint* info)
    {
        void (*fn) (void* job, void* n, void* a, void* lda,
                    void* ilo, void* ihi, void* scale, void* info);
        fn = current_backend->lapack.dgebal.f77_blas_function;
        fn((void*) job, (void*) n, (void*) a, (void*) lda,
           (void*) ilo, (void*) ihi, (void*) scale, (void*) info);
        return;
    }

    void dgebal(char* job, blasint* n, double* a, blasint* lda,
                blasint* ilo, blasint* ihi, double* scale, blasint* info)
        __attribute__((alias(MTS(FC_GLOBAL(dgebal,DGEBAL)))));

Because of the alias, and because the real BLAS library is loaded after FlexiBLAS, calls from one OpenBLAS function to another OpenBLAS function are also routed through FlexiBLAS.

I noticed that the parameter "N" used at
https://github.com/xianyi/OpenBLAS/blob/v0.3.15/lapack-netlib/SRC/dgeev.f#L369
gets corrupted during the call at
https://github.com/xianyi/OpenBLAS/blob/v0.3.15/lapack-netlib/SRC/dgeev.f#L363
I traced this to the following sequence: FlexiBLAS saves the register that holds N on the stack, calls the OpenBLAS DGEBAL, and restores the register afterwards, but DGEBAL changes the stack entry the value is restored from.

So the actual bug here is that GCC generates code for DGEBAL that writes outside of its allocated stack.

The disassembly of the dgebal_ function shows "stdu r1,-368(r1)" in the prologue and "std r25,440(r1)" later; the latter is the instruction that overwrites the register saved by the calling function. As far as I can tell, an offset of 440 from r1 is invalid because it is larger than the 368 bytes "allocated" by the stdu.

The line reported by GDB for the overwriting instruction is
https://github.com/xianyi/OpenBLAS/blob/v0.3.15/lapack-netlib/SRC/dgebal.f#L328

The command used to compile the file is:

    gfortran -fno-math-errno -Wall -frecursive -fno-optimize-sibling-calls -m64 -fopenmp -fPIC -O2 -fno-fast-math -mcpu=power9 -mtune=power9 -DUSE_OPENMP -fopenmp -fno-optimize-sibling-calls -g -c -o dgebal.o dgebal.f

Replacing "-O2" with "-Og" changes the prologue to "stdu r1,-336(r1)", and the largest offset used for a std relative to r1 is then 328. The object file built this way works with FlexiBLAS, hence I suspect an optimization issue that leads to more spills without updating the stack size.

Reproduced with GCC 10.2.0, 10.3.0, 11.1.0