------- Comment #7 from uros at kss-loka dot si  2006-08-17 07:21 -------
(In reply to comment #6)

> I think that remaining time difference is due to strange loop above innermost:

... due to strange _header_ above innermost loop ...

The problem is that we load zero into sum in both arms of the "if".

This is what I get in .099t.optimized (using gcc-4.2 -O2 -fno-ivopts):

<L1>:;
  r.0 = (unsigned int) r;
  D.1556 = r.0 * 4;
  rowR = *((int *) D.1556 + row);
  rowRp1 = *((int *) D.1556 + row + 4B);
  if (rowR < rowRp1) goto <L41>; else goto <L42>;

<L42>:;
  sum = 0.0;
  goto <bb 5> (<L4>);

<L41>:;
  i = rowR;
  sum = 0.0;

The assignment to sum should be hoisted above the "if", so that zero is loaded only once...
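To make the intended transformation concrete, here is a minimal C sketch of the two shapes. It is not the testcase from the PR: the function names, the val array, and the CSR-style loop body are assumptions for illustration. Only row, r, rowR, rowRp1, i and sum come from the dump above.

```c
/* Hypothetical helpers, not the PR testcase: only row, r, rowR, rowRp1,
   i and sum are taken from the dump; val and the loop body are assumed. */

/* Shape seen in the .optimized dump: zero is stored to sum in both arms. */
double row_sum_two_arms(const int *row, const double *val, unsigned int r)
{
    int rowR = row[r];
    int rowRp1 = row[r + 1];
    double sum;

    if (rowR < rowRp1) {
        int i = rowR;
        sum = 0.0;              /* zero load, taken arm */
        for (; i < rowRp1; i++)
            sum += val[i];
    } else {
        sum = 0.0;              /* zero load, fall-through arm */
    }
    return sum;
}

/* Desired shape: one assignment to sum, hoisted above the "if". */
double row_sum_hoisted(const int *row, const double *val, unsigned int r)
{
    int rowR = row[r];
    int rowRp1 = row[r + 1];
    double sum = 0.0;           /* single zero load */

    if (rowR < rowRp1) {
        for (int i = rowR; i < rowRp1; i++)
            sum += val[i];
    }
    return sum;
}
```

Both functions compute the same value; the only difference is where sum is initialized, which is the part the tree optimizers currently leave duplicated.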

With SSE, the zero load somehow gets CSE'd during the RTL passes:

.L8:
        movl 20(%ebp), %edx
        movapd  %xmm2, %xmm1
        movl (%edx,%ebx,4), %eax
        movl 4(%edx,%ebx,4), %ecx
        cmpl %ecx, %eax
        jge .L11
        movl %eax, %edx
        .p2align 4,,7
.L12:


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=21676
