https://gcc.gnu.org/bugzilla/show_bug.cgi?id=119325
Tobias Burnus <burnus at gcc dot gnu.org> changed:
What |Removed |Added
----------------------------------------------------------------------------
Summary|[15 Regression] |[15 Regression]
|libgomp.c/simd-math-1.c |libgomp.c/simd-math-1.c
|(gcn offloading): timeout |(gcn offloading): timeout
|(for fmodf, remainderf) |(for fmodf, remainderf)
|since |since
|r15-7284-g6b56e645a7b481 |r15-7257-g54bdeca3c62144
--- Comment #4 from Tobias Burnus <burnus at gcc dot gnu.org> ---
Update:
* When newlib is included in the build (i.e. when bisecting properly), the
failure-causing commit is r15-7257-g54bdeca3c62144:
commit 54bdeca3c6214485d15454df30183a56ad3e473b
Author: Richard Biener
Date: Tue Jan 28 16:20:30 2025 +0100
middle-end/118684 - wrongly aligned stack local during expansion
* The testcase has an inconsistency that does not seem to affect the failure:
'b' is not listed in any map clause, so it gets an implicit 'map(tofrom: b)'.
Still, it seems cleaner to map it explicitly (macro definition for TEST_FUN2):
- _Pragma ("omp target parallel for simd map(to:a) map(from:res)") \
+ _Pragma ("omp target parallel for simd map(to:a,b) map(from:res)") \
* * *
Reduced example, still using offloading.
(I also tried -O1, but with it the example no longer fails.)
----------------------------
#include <math.h>
static volatile int idx = 0;
void test_fmodf (void) {
  float res[512], a[512], b[512];
  for (int i = 0; i < 512; i++) {
    a[i] = -10.0 + ((10.0 - -10.0) / 512) * i;
    b[i] = 100.0 + ((-25.0 - 100.0) / 512) * i;
  }
#pragma omp target parallel for simd map(to:a,b) map(from:res)
  for (int i = 0; i < 512; i++)
    res[i] = fmodf (a[i], b[i]);
  __builtin_printf ("%f\n", res[idx]);
}
int main (void) { test_fmodf (); }
* * *
If I compile the program, either the reduced or the full one, directly for
the offload target (i.e. without specifying '-fopenmp'), it WORKS.
Namely, I tried (gcn compiler):
$build/gcc/xgcc -B $build/gcc -lm -L $build/amdgcn-amdhsa/gfx908/newlib/ \
-march=gfx908 -I $inst/amdgcn-amdhsa/include/ -O2 -ftree-vectorize \
-fno-math-errno -fopenmp-simd
LD_LIBRARY_PATH=/opt/rocm/lib .../accel/amdgcn-amdhsa/gcn-run ./a.out
* * *
For the OpenMP build, comparing a.xamdgcn-amdhsa.mkoffload.2.s shows no
differences, whereas gfx908/newlib/libm/machine/amdgcn/libm_a-v64sf_fmod.s
does differ. The code uses v64sf_fmodf, v32sf_fmodf and fmodf.