https://gcc.gnu.org/bugzilla/show_bug.cgi?id=118348
Bug ID: 118348
Summary: [SVE] HACCKernels seems to miscompile with VLS SVE after 0c5c0c959c2e592b84739f19ca771fa69eb8dfee
Product: gcc
Version: 15.0
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: target
Assignee: unassigned at gcc dot gnu.org
Reporter: prathamesh3492 at gcc dot gnu.org
Target Milestone: ---

Hi,
HACCKernels (https://git.cels.anl.gov/hacc/HACCKernels) seems to be miscompiled and fails with a "bus error" after the following commit:
https://gcc.gnu.org/git/gitweb.cgi?p=gcc.git;h=0c5c0c959c2e592b84739f19ca771fa69eb8dfee

with the following options:
-O3 -ffast-math -fopenmp -mcpu=neoverse-v2 -msve-vector-bits=128
and OMP_NUM_THREADS=1

Running under gdb shows:
#0  0x003974873e78c382 in ?? ()
#1  0x0000000000401398 in _Z3runPFviPfS_S_S_fffffRfS0_S0_EPKc._omp_fn.0(void) () at main.cpp:159
#2  0x3ebecda63dc7e8f3 in ?? ()
Backtrace stopped: previous frame identical to this frame (corrupt stack?)

which likely indicates stack corruption, and compiling with -fstack-protector-strong shows:
Maximum OpenMP Threads: 1
Iterations: 2000
*** stack smashing detected ***: terminated
Aborted

The omp clone of the run function begins with the following instructions:
Dump of assembler code for function _Z3runPFviPfS_S_S_fffffRfS0_S0_EPKc._omp_fn.0(void):
=> 0x0000000000401020 <+0>:     stp     x29, x30, [sp, #-224]!
   0x0000000000401024 <+4>:     mov     x29, sp

After the stp instruction, sp is 0xffffffffed00; *sp holds the saved x29 and *(sp + 8) holds the saved x30.
Setting a watchpoint on 0xffffffffed00 shows that the saved values of x29 and x30 get overwritten in _Z19GravityForceKernel4iPfS_S_S_fffffRfS0_S0_ at the following st1w instruction:
   0x00000000004018dc <+316>:   st1w    {z31.s}, p5, [x9, #-1, mul vl]
=> 0x00000000004018e0 <+320>:   whilelo p6.s, w8, w0

z31: {0x3e7da4b63f191827, 0x3dbd38643de2b7de}

with 0xffffffffed00 overwritten by the lower half of z31 (0x3e7da4b63f191827) and 0xffffffffed08 overwritten by the upper half (0x3dbd38643de2b7de).

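For reference, the address arithmetic behind this observation can be sketched as below. This is only an illustration, not part of the reproducer: the helper names `sve_mul_vl_address` and `overlaps_frame_record` are hypothetical, and the x9 value 0xffffffffed10 is inferred from the overwritten addresses, not read from the debugger. It assumes the 16-byte vector length implied by -msve-vector-bits=128, and that the st1w footprint for .s elements is one full vector.

```cpp
#include <cassert>
#include <cstdint>

// Effective address of an SVE "[Xn, #imm, mul vl]" access whose memory
// footprint is one full vector (as for st1w with .s elements):
// address = base + imm * VL-in-bytes.
constexpr std::uint64_t sve_mul_vl_address(std::uint64_t base, int imm,
                                           unsigned vl_bytes) {
    return base + static_cast<std::uint64_t>(
                      static_cast<std::int64_t>(imm) *
                      static_cast<std::int64_t>(vl_bytes));
}

// True if a store of store_bytes at store_addr touches the 16-byte frame
// record (saved x29 at sp, saved x30 at sp + 8) written by the stp.
constexpr bool overlaps_frame_record(std::uint64_t store_addr,
                                     unsigned store_bytes, std::uint64_t sp) {
    return store_addr < sp + 16 && sp < store_addr + store_bytes;
}

constexpr unsigned kVlBytes = 128 / 8;                // -msve-vector-bits=128
constexpr std::uint64_t kSp = 0xffffffffed00ULL;      // sp after the stp (from the report)
constexpr std::uint64_t kX9 = 0xffffffffed10ULL;      // ASSUMPTION: inferred, not observed

// With imm = -1 the store lands exactly on the saved x29/x30 pair.
static_assert(sve_mul_vl_address(kX9, -1, kVlBytes) == kSp,
              "store address should equal sp");
static_assert(overlaps_frame_record(sve_mul_vl_address(kX9, -1, kVlBytes),
                                    kVlBytes, kSp),
              "store should clobber the frame record");
```

Under these assumptions a single 16-byte store at x9 - 16 covers both 8-byte slots, which matches the two 64-bit halves of z31 landing at 0xffffffffed00 and 0xffffffffed08.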
Backtrace after st1w thus shows:
#0  0x00000000004018e0 in GravityForceKernel<4, PolyCoefficients4> (n=619, x=0x433580, y=0x433f40, z=0x434900, mass=0x4352c0,
    x0=<optimized out>, y0=<optimized out>, z0=<optimized out>, MaxSepSqrd=<optimized out>, SofteningLenSqrd=<optimized out>,
    ax=@0xffffffffedcc: 0, ay=@0xffffffffedc8: 0, az=@0xffffffffedc4: 0) at GravityForceKernel.cpp:118
#1  GravityForceKernel4 (n=n@entry=619, x=x@entry=0x433580, y=y@entry=0x433f40, z=z@entry=0x434900, mass=mass@entry=0x4352c0,
    x0=<optimized out>, y0=<optimized out>, z0=<optimized out>, MaxSepSqrd=<optimized out>, SofteningLenSqrd=<optimized out>,
    ax=@0xffffffffedcc: 0, ay=@0xffffffffedc8: 0, az=@0xffffffffedc4: 0) at GravityForceKernel.cpp:132
#2  0x0000000000401398 in _Z3runPFviPfS_S_S_fffffRfS0_S0_EPKc._omp_fn.0(void) () at main.cpp:159
#3  0x3dbd38643de2b7de in ?? ()
Backtrace stopped: previous frame identical to this frame (corrupt stack?)

Thanks,
Prathamesh