I'm planning to support some new instructions found in recent SPARC CPUs. Specifically, VIS 3.0 adds a series of "X and halve" floating-point instructions, where X is either "add" or "subtract".
There are also variants that negate the result. They operate similarly to FMA in that all of the operations are performed first and rounding occurs only once, at the very end. The advantage of these "halve" variants is that, because the chip carries out the calculation with more precision internally, the final result cannot overflow.

Does any other CPU support this, and in particular, is anything like it already supported in GCC? If not, does anyone have suggestions on how to model this? It'd be really disappointing to have to unspec these things and only be able to access them using builtins.
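
For reference, here is a rough C sketch of the single-precision semantics as I understand them. The function names are made up for illustration and are not the instruction mnemonics or any existing GCC builtin. The sketch assumes round-to-nearest and an IEEE binary64 "double", which should be wide enough that converting back to float acts as the only rounding step. It does not try to capture details such as the sign of a zero result or NaN handling.

```c
#include <assert.h>
#include <float.h>

/* Naive version: the float sum is rounded (and may overflow to infinity)
   before it is halved.  The volatile store forces the intermediate to
   float even on targets that evaluate in wider precision. */
float naive_add_halve(float a, float b)
{
    volatile float sum = a + b;
    return sum * 0.5f;
}

/* Hypothetical model of "add and halve": compute in double, halve
   (exact, a power-of-two scale), then round to float once. */
float model_add_halve(float a, float b)
{
    return (float)(((double)a + (double)b) * 0.5);
}

/* Hypothetical model of "subtract and halve". */
float model_sub_halve(float a, float b)
{
    return (float)(((double)a - (double)b) * 0.5);
}

/* Hypothetical model of the negated "add and halve" variant. */
float model_neg_add_halve(float a, float b)
{
    return -model_add_halve(a, b);
}

/* Small self-check showing the overflow difference. */
void add_halve_selfcheck(void)
{
    /* The naive sum overflows to +infinity, which stays infinite. */
    assert(naive_add_halve(FLT_MAX, FLT_MAX) > FLT_MAX);
    /* The modeled instruction returns the exact average. */
    assert(model_add_halve(FLT_MAX, FLT_MAX) == FLT_MAX);
    assert(model_sub_halve(1.0f, 3.0f) == -1.0f);
    assert(model_neg_add_halve(1.0f, 3.0f) == -2.0f);
}
```

The FLT_MAX case is the interesting one: a separate add followed by a multiply by 0.5 gives infinity, while the fused form gives FLT_MAX, so the two are not interchangeable and a plain (mult (plus a b) 0.5) pattern would presumably need to express that the intermediate is not rounded.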