https://gcc.gnu.org/bugzilla/show_bug.cgi?id=119325
Tobias Burnus <burnus at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
            Summary|[15 Regression]             |[15 Regression]
                   |libgomp.c/simd-math-1.c     |libgomp.c/simd-math-1.c
                   |(gcn offloading): timeout   |(gcn offloading): timeout
                   |(for fmodf, remainderf)     |(for fmodf, remainderf)
                   |since                       |since
                   |r15-7284-g6b56e645a7b481    |r15-7257-g54bdeca3c62144

--- Comment #4 from Tobias Burnus <burnus at gcc dot gnu.org> ---
Update:

* When including newlib in the build (i.e. when bisecting properly), the
fail-causing commit is r15-7257-g54bdeca3c62144

commit 54bdeca3c6214485d15454df30183a56ad3e473b
Author: Richard Biener
Date:   Tue Jan 28 16:20:30 2025 +0100

    middle-end/118684 - wrongly aligned stack local during expansion

* The testcase has an inconsistency: 'b' is not listed in any map clause.
This does not seem to affect the fail, as there is then an implicit
'map(tofrom: b)'. Still, it seems cleaner to add it explicitly (macro
definition for TEST_FUN2):

-    _Pragma ("omp target parallel for simd map(to:a) map(from:res)") \
+    _Pragma ("omp target parallel for simd map(to:a,b) map(from:res)") \

* * *

Reduced example, still using offloading. I tried -O1, but that makes the
example pass again.

----------------------------
#include <math.h>

static volatile int idx = 0;

void
test_fmodf (void)
{
  float res[512], a[512], b[512];
  for (int i = 0; i < 512; i++)
    {
      a[i] = -10.0 + ((10.0 - -10.0) / 512) * i;
      b[i] = 100.0 + ((-25.0 - 100.0) / 512) * i;
    }
#pragma omp target parallel for simd map(to:a,b) map(from:res)
  for (int i = 0; i < 512; i++)
    res[i] = fmodf (a[i], b[i]);
  __builtin_printf ("%f\n", res[idx]);
}

int
main (void)
{
  test_fmodf ();
}

* * *

If I compile the program, either the reduced or the full one, directly for
the offload target (w/o specifying '-fopenmp'), it WORKS.
Namely, I tried (gcn compiler):

$build/gcc/xgcc -B $build/gcc -lm -L $build/amdgcn-amdhsa/gfx908/newlib/ \
  -march=gfx908 -I $inst/amdgcn-amdhsa/include/ -O2 -ftree-vectorize -fno-math-errno -fopenmp-simd

LD_LIBRARY_PATH=/opt/rocm/lib .../accel/amdgcn-amdhsa/gcn-run ./a.out

* * *

For the OpenMP build, comparing a.xamdgcn-amdhsa.mkoffload.2.s shows no
differences, whereas gfx908/newlib/libm/machine/amdgcn/libm_a-v64sf_fmod.s
does differ.

The code uses v64sf_fmodf, v32sf_fmodf and fmodf.