Prakash Punnoor wrote:
Why is movaps (SSE, floating-point data) used for the store instead of movdqa
(SSE2, integer data)? Bug or feature? It is used even when compiling with -O0.
Testing further: -march=k8 seems to cause this. Leaving it out, movdqa is
used, so I guess it is a feature.
Th
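A minimal sketch for reproducing the observation above, assuming GCC on x86-64. The function name store_four_ints is made up for illustration and does not come from the thread. Both instructions write the same 128 bits to an aligned address; movaps has the shorter encoding (0f 29, versus 66 0f 7f for movdqa), so choosing it for a store is plausibly a tuning decision rather than a bug, though the thread excerpt does not confirm the reason.

```c
#include <emmintrin.h>  /* SSE2 intrinsics: __m128i, _mm_set_epi32, _mm_store_si128 */
#include <stdint.h>

/* Hypothetical helper: write four 32-bit integers to dst with one aligned
 * 128-bit integer store. dst must be 16-byte aligned. */
void store_four_ints(int32_t *dst, int32_t a, int32_t b, int32_t c, int32_t d)
{
    /* _mm_set_epi32 takes the highest lane first, so a lands in dst[0]. */
    __m128i v = _mm_set_epi32(d, c, b, a);

    /* Integer-typed aligned store; the compiler is free to emit movdqa or
     * movaps here, since both store the same bits. */
    _mm_store_si128((__m128i *)dst, v);
}
```

Compiling this file with "gcc -O2 -S" and again with "gcc -O2 -march=k8 -S" and comparing the store instruction in the two outputs should show whether your GCC version behaves as described.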
On Saturday 23 February 2008 Prakash Punnoor wrote:
> On Saturday 23 February 2008 Uros Bizjak wrote:
> > Hello!
> >
> > > f7: 0f 7f 5c 24 f0 movq %mm3,-0x10(%rsp)
> > > fc: 0f 7f 54 24 f8 movq %mm2,-0x8(%rsp)
> > > 101: 48 8b 5c
On Saturday 23 February 2008 Uros Bizjak wrote:
> Hello!
>
> > f7: 0f 7f 5c 24 f0 movq %mm3,-0x10(%rsp)
> > fc: 0f 7f 54 24 f8 movq %mm2,-0x8(%rsp)
> > 101: 48 8b 5c 24 f8 mov -0x8(%rsp),%rbx
> > 106: 48 89 5c 38 40 mov %
Hello!
f7: 0f 7f 5c 24 f0 movq %mm3,-0x10(%rsp)
fc: 0f 7f 54 24 f8 movq %mm2,-0x8(%rsp)
101: 48 8b 5c 24 f8 mov -0x8(%rsp),%rbx
106: 48 89 5c 38 40 mov %rbx,0x40(%rax,%rdi,1)
10b: 48 8b 5c 24 f0 mov -0x10(%rsp),%rbx
110:
Hi,
I am playing with the following code (from ffmpeg), translated to intrinsics:
Original code:
#define MOVQ_ZERO(regd) __asm __volatile ("pxor %%" #regd ", %%" #regd ::)
void diff_pixels_mmx(char *block, const uint8_t *s1, const uint8_t *s2, long stride)
{
long offset = -128;
MOVQ_ZERO(