[Bug tree-optimization/119351] New: [15 Regression] Wrong code in GROMACS for AArch64 generic SVE VLS target

ktkachov at gcc dot gnu.org via Gcc-bugs Tue, 18 Mar 2025 02:02:30 -0700

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=119351


            Bug ID: 119351
           Summary: [15 Regression] Wrong code in GROMACS for AArch64
                    generic SVE VLS target
           Product: gcc
           Version: 15.0
            Status: UNCONFIRMED
          Keywords: wrong-code
          Severity: normal
          Priority: P3
         Component: tree-optimization
          Assignee: unassigned at gcc dot gnu.org
          Reporter: ktkachov at gcc dot gnu.org
                CC: acoplan at gcc dot gnu.org, tnfchris at gcc dot gnu.org
  Target Milestone: ---
            Target: aarch64

Tamar and I have been discussing this offline, but now that we have a
reproducer with all public sources, here's a bug report.

We see GROMACS giving an internal error when built for -march=armv9-a.
To reproduce you need to get GROMACS from:
https://gitlab.com/gromacs/gromacs.git

and use branch v2025.0-1

You may need to patch one of the header files trivially to make it build:
--- a/src/gromacs/mdtypes/energyhistory.h
+++ b/src/gromacs/mdtypes/energyhistory.h
@@ -51,6 +51,7 @@

 #include <memory>
 #include <vector>
+#include <cstdint>

Build it with cmake with the following options:
CMAKE_OPTIONS="-DGMX_OPENMP=ON -DGMX_CYCLE_SUBCOUNTERS=ON \
               -DGMX_BUILD_OWN_FFTW=ON \
               -DGMX_GPU=OFF -DCMAKE_BUILD_TYPE=Release \
               -DGMX_DOUBLE=OFF -DGMX_CYCLE_SUBCOUNTERS=ON \
               -DGMX_PREFER_STATIC_LIBS=OFF -DGMX_INSTALL_NBLIB_API=OFF \
               -DGMXAPI=OFF \
               -DCMAKE_C_FLAGS="-march=armv9-a" \
               -DCMAKE_CXX_FLAGS="-march=armv9-a" \
               -DCMAKE_C_COMPILER=$COMPILERBIN \
               -DCMAKE_CXX_COMPILER=$COMPILERXXBIN \
               -DGMX_SIMD_ARM_SVE_LENGTH=128 \
               -DGMX_SIMD=ARM_SVE -DGMX_USE_NVTX=ON \
               -DGMX_MPI=OFF"

This builds GROMACS for -march=armv9-a and 128-bit VLS SVE.

The run command can be:
$PATH_TO_GMX_BUILD/gmx mdrun -v -resethway -noconfout -pin on -ntmpi 1 \
    -ntomp 24 -nsteps 4000 -nb cpu -s benchmark

where "benchmark" is a benchmark.tpr input file that you can get from various
sources, for example https://www.hecbiosim.ac.uk/benchmark-files/gromacs.tar.gz
(you can use the 20k-atoms one)

The error at the bottom looks like:
Source file: src/gromacs/mdlib/sim_util.cpp (line 574)
Function:    void gmx::checkPotentialEnergyValidity(int64_t, const
gmx_enerdata_t&, const t_inputrec&)

Internal error (bug):
Step 100: The total potential energy is -nan, which is not finite. The LJ and
electrostatic contributions to the energy are 0 and 0, respectively. A
non-finite potential energy can be caused by overlapping interactions in
bonded interactions or very large or Nan coordinate values. Usually this is
caused by a badly- or non-equilibrated initial configuration, incorrect
interactions or parameters in the topology.

For more information and tips for troubleshooting, please check the GROMACS
website at https://manual.gromacs.org/current/user-guide/run-time-errors.html

I've bisected this to g:68326d5d1a593dc0bf098c03aac25916168bc5a9.
Before that commit the above command runs successfully.
Note that this triggers with -march=armv9-a in -DCMAKE_C_FLAGS and
-DCMAKE_CXX_FLAGS. Using values like -mcpu=neoverse-v2 doesn't trigger it.
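
For anyone wanting to re-check the bisection or try other -march/-mcpu values,
the pass/fail decision can be scripted. The following is only a sketch and not
the script used for the bisection above: the function name classify_mdrun_log
and the log file name are hypothetical, and it assumes the mdrun output
(stdout and stderr) has been redirected to a file. The exit codes follow the
"git bisect run" convention (0 = good, 1 = bad, 125 = skip).

```shell
#!/bin/sh
# Hypothetical helper: classify one captured mdrun log for `git bisect run`.
# Returns 0 (good), 1 (bad: non-finite energy reported), 125 (skip: no log).
classify_mdrun_log() {
    log="$1"
    # Without a log the build or run did not get far enough to judge.
    [ -f "$log" ] || return 125
    # Failing runs print this text from checkPotentialEnergyValidity.
    if grep -q "which is not finite" "$log"; then
        return 1
    fi
    return 0
}
```

A possible use, after rebuilding GCC and GROMACS at each bisection step, is to
run the mdrun command above with "> mdrun.log 2>&1" appended and then call
"classify_mdrun_log mdrun.log" as the last command of the bisect script.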
