http://gcc.gnu.org/bugzilla/show_bug.cgi?id=53513
--- Comment #6 from Rich Felker <bugdal at aerifal dot cx> ---
On Sun, Mar 16, 2014 at 11:32:21PM +0000, olegendo at gcc dot gnu.org wrote:
> If it's OK for a temporary mode switch to clobber other FPSCR bits (such as in
> the PR = single mode above), it should also be OK to load the FPSCR value from
> a thread local variable inside the thread-control-block:
>
>     mov.l   @(<disp>, gbr),r0   // r0 = address to fpscr value for a
>                                 // particular mode setting
>     lds.l   @r0+,fpscr          // mode switch

IMO this is an ugly hack that shouldn't be taken. It has lots of complex
interactions with other things: signal handlers, the ucontext.h
functions, fenv, pthread, etc. that could probably be achieved correctly
if somebody wanted to spend the effort on it, but it would be ugly and
SH-specific and honestly we already have a shortage of people willing to
spend time fixing SH problems without introducing even more work.

> This would require that any FPSCR setting change is also propagated to the TLS
> variables. E.g. setting the rounding mode would have to update FPSCR mode
> values in all such TLS variables.
> I guess that it would be useful to be able to select an FPSCR value for at
> least all combinations of FR and SZ bit settings, in other words having a TLS
> __fpscr_values array with 4 entries. However, it would make things such as
> toggling FPSCR.FR via frchg inefficient due to the required updates of the TLS
> variables. Other setting changes such as denorm or rounding mode are probably
> not so critical.

If I'm not mistaken, the toggle approach will be efficient without this
TLS hack once it's implemented, right? I don't think it makes sense to
introduce a hack for just a short-term mitigation of the performance
regression. If you think the short-term fix for this issue is too
costly, the proper solution is probably to add a -m option to turn it
back off (using the old __fpscr_values approach).
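
For readers who want the bookkeeping of the quoted TLS proposal spelled out, here is a rough sketch in portable C11. It is not code from GCC or any libc. The names tls_fpscr_values, tls_fpscr_init and tls_fpscr_set_rounding are hypothetical, and the bit positions (RM in bits 0-1, SZ in bit 20, FR in bit 21) are assumed from the SH-4 FPSCR layout. The array only models the per-thread table in memory; nothing here touches the real FPSCR register.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed SH-4 FPSCR layout: RM = bits 0-1, SZ = bit 20, FR = bit 21. */
#define FPSCR_RM_MASK 0x3u
#define FPSCR_SZ      (1u << 20)
#define FPSCR_FR      (1u << 21)

/* Hypothetical per-thread table: one FPSCR image per FR/SZ combination.
   Index bit 0 selects SZ, index bit 1 selects FR. */
static _Thread_local uint32_t tls_fpscr_values[4];

/* Fill the table from a base FPSCR image, forcing FR/SZ per entry. */
static void tls_fpscr_init(uint32_t base)
{
    base &= ~(FPSCR_FR | FPSCR_SZ);
    for (unsigned i = 0; i < 4; i++) {
        uint32_t v = base;
        if (i & 1u) v |= FPSCR_SZ;
        if (i & 2u) v |= FPSCR_FR;
        tls_fpscr_values[i] = v;
    }
}

/* Changing the rounding mode has to rewrite every entry, which is the
   propagation cost discussed in the quoted text. */
static void tls_fpscr_set_rounding(uint32_t rm)
{
    assert(rm <= FPSCR_RM_MASK);
    for (unsigned i = 0; i < 4; i++)
        tls_fpscr_values[i] = (tls_fpscr_values[i] & ~FPSCR_RM_MASK) | rm;
}
```

Even in this toy form, one logical mode change turns into four stores per thread, and a real implementation would additionally have to keep the table consistent across signal handlers, the ucontext.h functions and fenv, which is the maintenance burden objected to above.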