https://gcc.gnu.org/bugzilla/show_bug.cgi?id=118348

            Bug ID: 118348
           Summary: [SVE] HACCKernels seems to miscompile with VLS SVE
                    after 0c5c0c959c2e592b84739f19ca771fa69eb8dfee
           Product: gcc
           Version: 15.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: target
          Assignee: unassigned at gcc dot gnu.org
          Reporter: prathamesh3492 at gcc dot gnu.org
  Target Milestone: ---

Hi,
HACCKernels (https://git.cels.anl.gov/hacc/HACCKernels) appears to be
miscompiled and fails with a "bus error" after the following commit:
https://gcc.gnu.org/git/gitweb.cgi?p=gcc.git;h=0c5c0c959c2e592b84739f19ca771fa69eb8dfee

when built with the following options: -O3 -ffast-math -fopenmp
-mcpu=neoverse-v2 -msve-vector-bits=128, and run with OMP_NUM_THREADS=1.

Running under gdb shows:
#0  0x003974873e78c382 in ?? ()
#1  0x0000000000401398 in _Z3runPFviPfS_S_S_fffffRfS0_S0_EPKc._omp_fn.0(void) () at main.cpp:159
#2  0x3ebecda63dc7e8f3 in ?? ()
Backtrace stopped: previous frame identical to this frame (corrupt stack?)

This likely indicates stack corruption. Compiling with
-fstack-protector-strong shows:
Maximum OpenMP Threads: 1
Iterations: 2000
*** stack smashing detected ***: terminated
Aborted

The OpenMP clone of the run function begins with the following instructions:
Dump of assembler code for function _Z3runPFviPfS_S_S_fffffRfS0_S0_EPKc._omp_fn.0(void):
=> 0x0000000000401020 <+0>:     stp     x29, x30, [sp, #-224]!
   0x0000000000401024 <+4>:     mov     x29, sp

After the stp instruction, sp is 0xffffffffed00: x29 is stored at sp and x30
at sp + 8.

Setting a watchpoint on 0xffffffffed00 shows that the saved values of x29 and
x30 get overwritten in _Z19GravityForceKernel4iPfS_S_S_fffffRfS0_S0_ by the
following st1w instruction:
   0x00000000004018dc <+316>:   st1w    {z31.s}, p5, [x9, #-1, mul vl]
=> 0x00000000004018e0 <+320>:   whilelo p6.s, w8, w0

z31: {0x3e7da4b63f191827, 0x3dbd38643de2b7de}
0xffffffffed00 is overwritten by the lower half of z31 (0x3e7da4b63f191827) and
0xffffffffed08 by the upper half (0x3dbd38643de2b7de).

The backtrace after the st1w therefore shows:
#0  0x00000000004018e0 in GravityForceKernel<4, PolyCoefficients4> (n=619, x=0x433580, y=0x433f40, z=0x434900,
    mass=0x4352c0, x0=<optimized out>, y0=<optimized out>, z0=<optimized out>, MaxSepSqrd=<optimized out>,
    SofteningLenSqrd=<optimized out>, ax=@0xffffffffedcc: 0, ay=@0xffffffffedc8: 0, az=@0xffffffffedc4: 0)
    at GravityForceKernel.cpp:118
#1  GravityForceKernel4 (n=n@entry=619, x=x@entry=0x433580, y=y@entry=0x433f40, z=z@entry=0x434900,
    mass=mass@entry=0x4352c0, x0=<optimized out>, y0=<optimized out>, z0=<optimized out>, MaxSepSqrd=<optimized out>,
    SofteningLenSqrd=<optimized out>, ax=@0xffffffffedcc: 0, ay=@0xffffffffedc8: 0, az=@0xffffffffedc4: 0)
    at GravityForceKernel.cpp:132
#2  0x0000000000401398 in _Z3runPFviPfS_S_S_fffffRfS0_S0_EPKc._omp_fn.0(void) () at main.cpp:159
#3  0x3dbd38643de2b7de in ?? ()
Backtrace stopped: previous frame identical to this frame (corrupt stack?)

Thanks,
Prathamesh
