https://gcc.gnu.org/bugzilla/show_bug.cgi?id=55522
Brendan Dolan-Gavitt <brendandg at nyu dot edu> changed:
What |Removed |Added
----------------------------------------------------------------------------
CC| |brendandg at nyu dot edu
--- Comment #19 from Brendan Dolan-Gavitt <brendandg at nyu dot edu> ---
I read through the crtfastmath.c implementations for the other affected targets
and confirmed in this thread that they all set flush-to-zero:
https://threadreaderapp.com/thread/1567612053363347461.html
I agree that there should be a way for a shared library to link crtfastmath.o
if it wants that behavior. But is there a reason -l:crtfastmath.o isn't
sufficient in that case? Why does it need to be enabled automatically when
-Ofast, -ffast-math, or -funsafe-math-optimizations is turned on?
The other note I would add is that in multi-threaded applications,
crtfastmath.o already does not behave as intended: FTZ/DAZ will only be set in
the CPU state of the thread that loaded the shared library. It's hard to
imagine a case where a user wants individual threads to have different FTZ/DAZ
settings (unless they explicitly manage that by hand). Example:
$ cat baz.c
#include <stdio.h>
#include <unistd.h>
#include <dlfcn.h>
#include <pthread.h>
void loadlib() {
    void *handle = dlopen("./gofast.so", RTLD_LAZY);
    if (!handle) {
        fprintf(stderr, "dlopen: %s\n", dlerror());
    }
}
#define MXCSR_DAZ (1 << 6) /* Enable denormals are zero mode */
#define MXCSR_FTZ (1 << 15) /* Enable flush to zero mode */
void printftz(int i) {
    unsigned int mxcsr = __builtin_ia32_stmxcsr();
    printf("[%d] mxcsr.FTZ = %d, mxcsr.DAZ = %d\n", i, !!(mxcsr & MXCSR_FTZ),
           !!(mxcsr & MXCSR_DAZ));
}
void *thread(void *arg) {
    // Thread 0 loads the library; every thread then prints its own MXCSR bits
    int i = *(int *)arg;
    if (i == 0) loadlib();
    sleep(1);
    printftz(i);
    return NULL;
}
int main(int argc, char **argv) {
    // Create 4 threads
    pthread_t threads[4];
    int tids[4];
    for (int i = 0; i < 4; i++) {
        tids[i] = i;
        pthread_create(&threads[i], NULL, thread, &tids[i]);
    }
    // Wait for all threads to finish
    for (int i = 0; i < 4; i++) {
        pthread_join(threads[i], NULL);
    }
    return 0;
}
$ touch gofast.c
$ gcc -Ofast -fpic -shared gofast.c -o gofast.so
$ gcc -pthread baz.c -o baz -ldl
$ ./baz
[3] mxcsr.FTZ = 0, mxcsr.DAZ = 0
[0] mxcsr.FTZ = 1, mxcsr.DAZ = 1
[2] mxcsr.FTZ = 0, mxcsr.DAZ = 0
[1] mxcsr.FTZ = 0, mxcsr.DAZ = 0
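For comparison, here is a rough sketch of what "managing it by hand" could look
like: each thread sets FTZ/DAZ in its own MXCSR before doing any work, instead
of relying on a constructor that runs in whichever thread happened to load the
library. This assumes x86 with SSE and the _mm_getcsr/_mm_setcsr intrinsics
from <xmmintrin.h>; the names enable_ftz_daz, worker,
count_threads_with_ftz_daz and MAX_WORKERS are made up for illustration and are
not part of GCC or crtfastmath.c.

```c
#include <pthread.h>
#include <xmmintrin.h> /* _mm_getcsr, _mm_setcsr (x86 SSE only) */

#define MXCSR_DAZ (1 << 6)  /* Enable denormals are zero mode */
#define MXCSR_FTZ (1 << 15) /* Enable flush to zero mode */
#define MAX_WORKERS 16      /* hypothetical cap, for this sketch only */

/* Hypothetical helper: turn on FTZ/DAZ for the calling thread only. */
static void enable_ftz_daz(void) {
    _mm_setcsr(_mm_getcsr() | MXCSR_FTZ | MXCSR_DAZ);
}

/* Each worker sets the bits itself, then records its own MXCSR value. */
static void *worker(void *arg) {
    unsigned int *mxcsr_out = arg;
    enable_ftz_daz();
    *mxcsr_out = _mm_getcsr();
    return NULL;
}

/* Starts up to n workers; returns how many ended with both FTZ and DAZ set. */
static int count_threads_with_ftz_daz(int n) {
    pthread_t threads[MAX_WORKERS];
    unsigned int mxcsr[MAX_WORKERS];
    int started = 0, count = 0;

    if (n > MAX_WORKERS) n = MAX_WORKERS;
    for (int i = 0; i < n; i++) {
        if (pthread_create(&threads[started], NULL, worker, &mxcsr[started]) != 0)
            break; /* stop on the first failure; join what was started */
        started++;
    }
    for (int i = 0; i < started; i++) {
        pthread_join(threads[i], NULL);
        if ((mxcsr[i] & MXCSR_FTZ) && (mxcsr[i] & MXCSR_DAZ))
            count++;
    }
    return count;
}
```

Built with gcc -pthread, calling count_threads_with_ftz_daz(4) from main should
return 4 on x86-64, since every worker sets the bits in its own CPU state
regardless of which thread loaded which library.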