Hi,

On Mon, 15 Aug 2011, Richard Guenther wrote:

> > Adding -msse4, we now generate branchless code using roundsd:
> >
> > .LFB0:
> >        .cfi_startproc
> >        movsd   .LC0(%rip), %xmm2
> >        movapd  %xmm0, %xmm1
> >        andpd   %xmm2, %xmm1
> >        andnpd  %xmm0, %xmm2
> >        addsd   .LC1(%rip), %xmm1
> >        roundsd $1, %xmm1, %xmm1
> >        orpd    %xmm2, %xmm1
> >        movapd  %xmm1, %xmm0
> >        ret
> 
> Hm, why do we need the sign-copy?  If I read the docs correctly
> we can simply use roundsd directly, no?

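round() has to round halfway cases away from zero, and that is what breaks 
your neck.  round[ps][sd] only supports the usual four IEEE rounding modes 
(to nearest even, down, up, toward zero), and none of them does that.  Hence 
the sign-copy: take |x|, add the constant, round down, put the sign back.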


Ciao,
Michael.
