https://gcc.gnu.org/bugzilla/show_bug.cgi?id=66382
--- Comment #7 from Michael Meissner <meissner at gcc dot gnu.org> ---

I do not think it is worthwhile to expand the IEEE 128-bit software emulation routines at the point of call in ISA 2.07 (power8), because a lot of processing goes on in the emulation library.

If people were motivated, we could replace the soft-fp functions with tuned versions written for the Power8 ISA. However, I don't think it is going to be an easy task. I'll list the things that come to mind first. They might be doable, but it comes down to a time/effort calculation of whether the return on investment is worth it.

The first issue is that the next ISA (3.0) already has support for doing IEEE 128-bit floating point in hardware, including the various rounding modes, etc. If you configure and build gcc with an assembler that supports the ISA 3.0 instruction set, it will add IFUNC support so that when a program is run on ISA 3.0 hardware, it will automatically use a version of __addkf3 that uses the xsaddqp instruction instead of doing the emulation. So this effort would only benefit the current generation of hardware.

The next issue is that right now you cannot do 64-bit scalar integer arithmetic in the VSX unit. At present, the compiler does not allow DImode into the Altivec registers, yet the only instructions for 64-bit integer arithmetic use an Altivec encoding and operate only on Altivec registers (vaddudm, etc.). I am working on patches to allow DImode variables in Altivec registers (and later SImode, HImode, QImode). My first run with the patch shows one benchmark 2% faster (perlbench) and one 2% slower (omnetpp), but I feel it needs a lot more tuning. At the moment that work is lower priority, as making __float128 _Complex work is my highest priority (obviously other people could take up this work).

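To give a feel for the kind of 64-bit integer work the emulation depends on, here is a minimal sketch of a 128-bit addition built from two 64-bit halves with manual carry propagation. This is not the libgcc soft-fp code; the type u128_pair and the functions add128 and add128_example are hypothetical names made up for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper type: a 128-bit unsigned value held as two
   64-bit halves, roughly how a pair of DImode values would look. */
typedef struct {
    uint64_t hi;
    uint64_t lo;
} u128_pair;

/* Add two 128-bit values using only 64-bit integer operations.
   The carry out of the low half is detected by unsigned wraparound:
   if the sum is smaller than one of its operands, it overflowed. */
u128_pair add128(u128_pair a, u128_pair b)
{
    u128_pair r;
    uint64_t carry;

    r.lo = a.lo + b.lo;
    carry = (r.lo < a.lo) ? 1u : 0u;
    r.hi = a.hi + b.hi + carry;   /* overflow past 128 bits is discarded */
    return r;
}

/* Usage check: a carry out of the low half must reach the high half. */
void add128_example(void)
{
    u128_pair a = { 0x1u, UINT64_MAX };
    u128_pair b = { 0x2u, 0x1u };
    u128_pair r = add128(a, b);

    assert(r.hi == 0x4u);
    assert(r.lo == 0x0u);
}
```

Every step here is a scalar 64-bit add or compare, which is the sort of operation that currently has to stay in the GPRs.
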
After allowing DImode into Altivec registers, and perhaps doing 64-bit arithmetic via the 64-bit integer vector instructions, another issue is that the cycle time of the vector unit is 1/2 that of the GPR unit, so it will need a lot of tuning.

I don't think the ISA 2.07 instruction set is general enough to do the inserts and extracts of 128-bit values that you would need for packing and unpacking IEEE 128-bit floating point values. ISA 3.0 has all of this support, including specialized instructions to extract/set the exponent or mantissa, but then it also has the hardware support for IEEE 128-bit floating point.

Loads and stores are also problematic in ISA 2.07, given that there is no d-form addressing for Altivec registers. So if you spill a DImode value held in an Altivec register, you need to load the offset into a GPR to do the memory operation.
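
For reference, the packing and unpacking mentioned above amounts to the following field manipulation on the IEEE 754 binary128 layout (1 sign bit, 15 exponent bits, 112 fraction bits). This is only a sketch using scalar shifts and masks on two 64-bit halves; it says nothing about how the same steps would be encoded with ISA 2.07 vector instructions. The names ieee128_bits, ieee128_fields, unpack128, pack128 and unpack128_example are hypothetical and are not taken from soft-fp.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical raw view of a binary128 value as two 64-bit halves.
   hi holds the sign, the exponent, and the top 48 fraction bits. */
typedef struct {
    uint64_t hi;
    uint64_t lo;
} ieee128_bits;

/* Hypothetical unpacked form. */
typedef struct {
    unsigned sign;      /* 1 bit */
    unsigned exponent;  /* 15 bits, biased by 16383 */
    uint64_t frac_hi;   /* upper 48 bits of the 112-bit fraction */
    uint64_t frac_lo;   /* lower 64 bits of the fraction */
} ieee128_fields;

#define FRAC_HI_MASK UINT64_C(0x0000FFFFFFFFFFFF)

/* Split the raw bits into sign, biased exponent, and fraction. */
ieee128_fields unpack128(ieee128_bits v)
{
    ieee128_fields f;

    f.sign     = (unsigned)(v.hi >> 63);
    f.exponent = (unsigned)((v.hi >> 48) & 0x7FFFu);
    f.frac_hi  = v.hi & FRAC_HI_MASK;
    f.frac_lo  = v.lo;
    return f;
}

/* Reassemble the raw bits from the separate fields. */
ieee128_bits pack128(ieee128_fields f)
{
    ieee128_bits v;

    v.hi = ((uint64_t)(f.sign & 1u) << 63)
         | ((uint64_t)(f.exponent & 0x7FFFu) << 48)
         | (f.frac_hi & FRAC_HI_MASK);
    v.lo = f.frac_lo;
    return v;
}

/* Usage check: -1.5 is sign 1, exponent 16383, fraction 0.5. */
void unpack128_example(void)
{
    ieee128_bits minus_1_5 = { UINT64_C(0xBFFF800000000000), 0 };
    ieee128_fields f = unpack128(minus_1_5);
    ieee128_bits back = pack128(f);

    assert(f.sign == 1u);
    assert(f.exponent == 16383u);
    assert(f.frac_hi == UINT64_C(0x0000800000000000));
    assert(f.frac_lo == 0);
    assert(back.hi == minus_1_5.hi && back.lo == minus_1_5.lo);
}
```

In the GPRs these are cheap shift and mask operations; the concern above is about doing the equivalent on values that live in Altivec registers.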