https://gcc.gnu.org/bugzilla/show_bug.cgi?id=55522

Richard Biener <rguenth at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |fweimer at redhat dot com

--- Comment #20 from Richard Biener <rguenth at gcc dot gnu.org> ---
(In reply to Brendan Dolan-Gavitt from comment #19)
> I read through the crtfastmath.c implementations for the other affected
> targets and confirmed that they do all set flush-to-zero in this thread:
> 
> https://threadreaderapp.com/thread/1567612053363347461.html
> 
> I agree that there should be a way for a shared library to link
> crtfastmath.o if it wants that behavior. But is there a reason
> -l:crtfastmath.o isn't sufficient in that case? Why does it need to be
> enabled automatically when -Ofast/-ffast-math/-funsafe-math-optimizations
> are turned on?

Most of the "globbing" into -ffast-math/-Ofast comes from the rules for
SPEC CPU 2006 base flags, which IIRC limited the number of flags allowed
(that's no longer a requirement for SPEC CPU 2017).  And of course users
will not know of the individual flags but are likely not interested in
denormals when using -ffast-math.

> The other note I would add is that in multi-threaded applications,
> crtfastmath.o is already not behaving as intended: FTZ/DAZ will only be set
> in the CPU state of the thread that loaded the shared library; it's hard to
> imagine a case where a user wants individual threads to have different
> FTZ/DAZ (unless they explicitly manage that by hand). Example:

[...]

Yeah.  Not sure how often dynamic objects are opened from within threads
though.  That said, one way to enforce "consistency" at least would be to
save/restore the FP state around dlopen() so that shared objects loaded
after program startup would not affect FP state at all?  The same could
be done for shared objects loaded at program startup of course.

The other way around would possibly be to make the CTOR __tls, which
should eventually force all threads to change their FP state(?).
