Hi,

On Mon, 15 Aug 2011, Michael Matz wrote:

> > > .LFB0:
> > >         .cfi_startproc
> > >         movsd   .LC0(%rip), %xmm2
> > >         movapd  %xmm0, %xmm1
> > >         andpd   %xmm2, %xmm1
> > >         andnpd  %xmm0, %xmm2
> > >         addsd   .LC1(%rip), %xmm1
> > >         roundsd $1, %xmm1, %xmm1
> > >         orpd    %xmm2, %xmm1
> > >         movapd  %xmm1, %xmm0
> > >         ret
> >
> > Hm, why do we need the sign-copy?  If I read the docs correctly
> > we can simply use roundsd directly, no?
>
> round-half-away-from-zero breaks your neck.  round[ps][sd] only supports
> the usual four IEEE rounding modes.

But you should be able to apply the sign to the 0.5, which wouldn't
require building the absolute value of the input:

  round(x) = trunc(x + (copysign (0.5, x)))

which should roughly be expanded to:

        movsd   signbits(%rip), %xmm1
        andpd   %xmm0, %xmm1            # sign bit of x
        movsd   nextof0.5(%rip), %xmm2  # largest double below 0.5
        orpd    %xmm1, %xmm2            # copysign (nextof0.5, x)
        addsd   %xmm2, %xmm0
        roundsd $3, %xmm0, %xmm0        # imm8 3 = truncate toward zero
        ret

That is one logical operation fewer (and one move fewer, because I chose
a better register assignment).

Ciao,
Michael.
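
A rough C sketch of the same identity, for experimenting outside the
compiler. It is only an illustration under a few assumptions: the names
round_via_signed_half, trunc_toward_zero and NEXT_BELOW_HALF are made up
here, the bit operations mimic the andpd/orpd pair, and the integer cast
is a stand-in for roundsd in truncate mode (it does not keep the sign of
a negative zero result). The final step has to truncate toward zero,
since a floor would turn -2.4 into -3.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Largest double below 0.5, playing the role of the "nextof0.5" constant. */
#define NEXT_BELOW_HALF 0x1.fffffffffffffp-2

/* Hypothetical stand-in for roundsd in truncate mode.  Anything with
   |v| >= 2^52 is already integral, and NaN fails both comparisons, so
   those values pass through unchanged.  The cast loses the sign of a
   negative zero result. */
static double trunc_toward_zero(double v)
{
    if (!(v > -0x1p52 && v < 0x1p52))
        return v;
    return (double)(int64_t)v;
}

/* Round half away from zero by moving the sign of x onto the constant
   instead of taking the absolute value of x first. */
static double round_via_signed_half(double x)
{
    const uint64_t sign_mask = UINT64_C(0x8000000000000000); /* "signbits" */
    const double half = NEXT_BELOW_HALF;
    uint64_t x_bits, half_bits;
    double signed_half;

    memcpy(&x_bits, &x, sizeof x_bits);
    memcpy(&half_bits, &half, sizeof half_bits);
    half_bits |= x_bits & sign_mask;              /* andpd, then orpd */
    memcpy(&signed_half, &half_bits, sizeof signed_half);

    return trunc_toward_zero(x + signed_half);    /* addsd, then roundsd */
}
```

The constant just below 0.5 matters for inputs such as
0x1.fffffffffffffp-2 itself: adding a plain 0.5 would round the sum up
to 1.0 before the truncation, whereas with the smaller constant the sum
stays below 1.0 and truncates to 0.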