Re: [PATCH, i386]: Expand round(a) = sgn(a) * floor(fabs(a) + 0.5) using SSE4 ROUND insn

Michael Matz Mon, 15 Aug 2011 08:25:53 -0700

Hi,

On Mon, 15 Aug 2011, Michael Matz wrote:


> > > .LFB0:
> > >        .cfi_startproc
> > >        movsd   .LC0(%rip), %xmm2
> > >        movapd  %xmm0, %xmm1
> > >        andpd   %xmm2, %xmm1
> > >        andnpd  %xmm0, %xmm2
> > >        addsd   .LC1(%rip), %xmm1
> > >        roundsd $1, %xmm1, %xmm1
> > >        orpd    %xmm2, %xmm1
> > >        movapd  %xmm1, %xmm0
> > >        ret
> > 
> > Hm, why do we need the sign-copy?  If I read the docs correctly
> > we can simply use roundsd directly, no?
> 
> round-half-away-from-zero breaks your neck.  round[ps][sd] only supports 
> the usual four IEEE rounding modes.

But you should be able to apply the sign to the 0.5 instead, which avoids 
building the absolute value of the input:

round(x) = trunc(x + (copysign (0.5, x)))

which should roughly be expanded to:

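       movsd   signbits(%rip), %xmm1    # sign-bit mask
       andpd   %xmm0, %xmm1             # xmm1 = sign of x
       movsd   nextof0.5(%rip), %xmm2   # the 0.5 constant
       orpd    %xmm1, %xmm2             # xmm2 = copysign (0.5, x)
       addsd   %xmm2, %xmm0             # scalar add: x + copysign (0.5, x)
       roundsd $3, %xmm0, %xmm0         # imm 3 = truncate ($1 would be floor)
       ret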

This needs one logical operation fewer (and one move fewer, because I chose 
a better register assignment).


Ciao,
Michael.
