On Mon, 7 Oct 2013, Jan Hubicka wrote:

> > On Fri, 4 Oct 2013, Jan Hubicka wrote:
> >
> > > Hi,
> > > this patch makes -Ofast also imply -mfpmath=sse.  It is an important
> > > win on SPECfP (2000 and 2006), even though, for example, the following
> > >
> > > float a(float b)
> > > {
> > >   return b+10;
> > > }
> > >
> > > results in the somewhat ridiculous
> > >
> > > a:
> > > .LFB0:
> > >         .cfi_startproc
> > >         subl    $4, %esp
> > >         .cfi_def_cfa_offset 8
> > >         movss   .LC0, %xmm0
> > >         addss   8(%esp), %xmm0
> > >         movss   %xmm0, (%esp)
> > >         flds    (%esp)
> > >         addl    $4, %esp
> > >         .cfi_def_cfa_offset 4
> > >         ret
> > >
> > > I wonder if we can at least get rid of the redundant stack alignment
> > > on ESP...
> > >
> > > Bootstrapped/regtested x86_64-linux, will commit it on the weekend if
> > > there are no complaints.  I wonder if -ffast-math should do the same -
> > > it is documented as enabling an explicit set of options, but that can
> > > be changed I guess.
> >
> > I wonder if we can restrict -mfpmath=sse to local functions where we can
>
> We can, but why?  Parameters are passed in memory, which is equally bad
> for 387 and SSE.  Only return values are passed in registers, and one
> extra reload per function is not that expensive, except for functions
> containing almost nothing, which should be inlined if they are local.

Ah, I forgot that detail.  Still, going through the FP stack for return
values is bad.

> In the meantime I (partially, since megrez stopped producing 32-bit
> spec2k6 results) benchmarked -mfpmath=sse,387 and it does not seem to be
> a loss anymore.  So perhaps we can give it a try?

Not sure ... I would guess that it's not a win on any recent architecture
(and LRA is probably not well-prepared here either).

> > change the ABI ... (do we change the local functions ABI with
> > -mfpmath=sse?)
>
> We don't.  It is probably quite easy to default to sse_regparm and change
> the return value type.
> I will look into it.

Thanks.  That's independent of enabling -mfpmath=sse at -Ofast, of course.

Richard.