Hi,

On Mon, 15 Aug 2011, Richard Guenther wrote:

> > Adding -msse4, we now generate branchless code using roundsd:
> >
> > .LFB0:
> >         .cfi_startproc
> >         movsd   .LC0(%rip), %xmm2
> >         movapd  %xmm0, %xmm1
> >         andpd   %xmm2, %xmm1
> >         andnpd  %xmm0, %xmm2
> >         addsd   .LC1(%rip), %xmm1
> >         roundsd $1, %xmm1, %xmm1
> >         orpd    %xmm2, %xmm1
> >         movapd  %xmm1, %xmm0
> >         ret
>
> Hm, why do we need the sign-copy?  If I read the docs correctly
> we can simply use roundsd directly, no?

round-half-away-from-zero breaks your neck.  round[ps][sd] only supports
the usual four IEEE rounding modes.


Ciao,
Michael.