On Tue, 2014-07-29 at 10:11 -0700, Mike Stump wrote:
> On Jul 29, 2014, at 7:56 AM, Peter Bergner <berg...@vnet.ibm.com> wrote:
> > Currently, the IBM long double routines in libgcc use a union to construct
> > a long double from two double values.  This causes horrific code generation
> > that copies the two doubles from the FP registers over to GPRs and back
> > again, giving us two loads and two stores, which leads to two load-hit-store
> > hazards.
> 
> Gosh, it’s too bad we don’t have any sort of technology to optimize moving 
> data around.

Well, the problem is that we're trying to move it around at all, when we'd
really like the data to stay in the FP registers the entire time.  Unions
and structs that are the same size as TImode/TFmode/TDmode (16 bytes) are
always converted to TImode, and that is what ends up causing the whole
fp -> int -> fp shuffle which leads to crappy code.  On power8, where we
have int <-> fp reg copy instructions, it's better than the copy through
the stack frame, but even that is unnecessary.
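
For anyone following along, here is a rough sketch of the shape of the
union idiom being discussed.  It is not the actual libgcc source: the
names dd_pair, dd_union and pack_via_union are made up for illustration,
and a 16-byte struct stands in for the IBM long double member so that it
builds on any target.  It only shows the source-level pattern, and will
not necessarily reproduce the code generation problem off powerpc.

```c
/* Hypothetical stand-in for the 16-byte IBM long double value.  On
   powerpc with IBM long double, this member would be `long double`.  */
struct dd_pair
{
  double hi;
  double lo;
};

/* Hypothetical name: a 16-byte union used to glue two doubles together.
   A union of this size is the kind of object the mail above says gets
   converted to TImode.  */
union dd_union
{
  double dval[2];
  struct dd_pair whole;
};

/* Build the 16-byte value by storing each double into the union and
   reading the whole thing back out.  */
static struct dd_pair
pack_via_union (double hi, double lo)
{
  union dd_union u;
  u.dval[0] = hi;
  u.dval[1] = lo;
  return u.whole;
}
```

The two stores into dval[] followed by a read of the whole object are
the part that matters: both halves start out in FP registers, and the
goal is for them to stay there.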

Peter


