[Bug ada/29543] New: Ada produces substantially slower code than FORTRAN for identical inputs - looping over double subscripted arrays

jeff at thecreems dot com Sat, 21 Oct 2006 19:00:38 -0700

I understand that comparing very small benchmarks like this can be misleading,
but I believe I have looked at this closely enough to say that it demonstrates
a general problem and not a narrow performance issue.


The attached test case contains a FORTRAN and an Ada program that are
equivalent (within their matrix multiply loop). The Ada one runs about 2x
slower, with about 3x the number of machine instructions in the inner loop.
(Note that this is with Ada run time checks disabled.)

I dumped the optimized trees (the original tree of the Ada version was
difficult to read because some of its node types are not known to the pretty
printer). The Ada tree is certainly a mess compared to the FORTRAN version.

The core of the FORTRAN code looks like

   do I = 1,N
      do J = 1,N
         sum = 0.0
         do R = 1,N
            sum = sum + A(I,R)*B(R,J)
         end do
         C(I,J) = sum
      end do
   end do


With the resulting optimized tree fragment (of the innermost loop) being

<L25>:;
  sum = MEM[base: (real4 *) ivtmp.97] * MEM[base: (real4 *) pretmp.81, index: (real4 *) ivtmp.161 + (real4 *) ivtmp.94, step: 4B, offset: 4B] + sum;
  ivtmp.94 = ivtmp.94 + 1;
  ivtmp.97 = ivtmp.97 + ivtmp.157;
  if (ivtmp.94 == (<unnamed type>) D.1273) goto <L29>; else goto <L25>;

While the core of the Ada code looks like:

   for I in A'range(1) loop
      for J in A'range(2) loop
         Sum := 0.0;
         for R in A'range(2) loop
            Sum := Sum + A(I,R)*B(R,J);
         end loop;
         C(I,J) := Sum;
      end loop;
   end loop;

With the resulting optimized tree fragment of the innermost loop being:

<L15>:;
  D.2370 = (*D.2277)[pretmp.627]{lb: tst_array__L_3__T16b___L sz: pretmp.709 * 4}[(<unnamed type>) r]{lb: tst_array__L_4__T17b___L sz: 4};

<bb 51>:
  temp.721 = D.2344->LB0;

<bb 52>:
  temp.720 = D.2344->UB1;

<bb 53>:
  temp.719 = D.2344->LB1;

<bb 54>:
  j.73 = (<unnamed type>) j;
  D.2373 = (*D.2298)[(<unnamed type>) r]{lb: temp.721 sz: MAX_EXPR <(temp.720 + 1 - temp.719) * 4, 0> + 3 & -4}[j.73]{lb: temp.719 sz: 4};

<bb 55>:
  D.2374 = D.2370 * D.2373;

<bb 56>:
  sum = D.2374 + sum;

<bb 57>:
  if (r == tst_array__L_4__T17b___U) goto <L17>; else goto <L16>;

<L16>:;
  r = r + 1;
  goto <bb 50> (<L15>);

Now, I'll be the first to admit that I know very little about the innards of
compiler technology, but that tree looks like a horrible mess. It is no wonder
the resulting assembly is such a mess.
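
To make the difference between the two dumps concrete, here is a rough C model
of the two inner loops. It is only a sketch: the bounds2d struct, the function
names, and the row-major layout are assumptions made for illustration (the
field names are borrowed from the D.2344->LB0/LB1/UB1 loads above), and it
assumes square matrices with the same bounds in both dimensions. The first
function rebuilds each element offset from the bounds on every iteration, as
the Ada tree appears to; the second reads the bounds once and then steps two
indices by fixed strides, similar to the ivtmp variables in the FORTRAN tree.

```c
#include <stddef.h>

/* Hypothetical bounds descriptor; field names follow the Ada dump,
   the layout itself is an assumption. */
typedef struct {
    int LB0, UB0;   /* bounds of the first dimension  */
    int LB1, UB1;   /* bounds of the second dimension */
} bounds2d;

/* Elements per row; the clamp mirrors MAX_EXPR <..., 0> in the dump. */
static size_t row_len(const bounds2d *d)
{
    int n = d->UB1 - d->LB1 + 1;
    return n > 0 ? (size_t)n : 0;
}

/* Shape of the Ada tree: bounds are re-read and both element offsets
   are rebuilt from scratch on every trip through the loop. */
float inner_recompute(const float *a, const float *b,
                      const bounds2d *d, int i, int j)
{
    float sum = 0.0f;
    for (int r = d->LB1; r <= d->UB1; r++) {
        size_t len = row_len(d);  /* recomputed each iteration */
        float x = a[(size_t)(i - d->LB0) * len + (size_t)(r - d->LB1)];
        float y = b[(size_t)(r - d->LB0) * len + (size_t)(j - d->LB1)];
        sum += x * y;
    }
    return sum;
}

/* Shape of the FORTRAN tree: bounds are read once, then two running
   indices are advanced by constant strides. */
float inner_strength_reduced(const float *a, const float *b,
                             const bounds2d *d, int i, int j)
{
    size_t len = row_len(d);
    size_t ia = (size_t)(i - d->LB0) * len;  /* start of row i of A  */
    size_t ib = (size_t)(j - d->LB1);        /* top of column j of B */
    float sum = 0.0f;
    for (size_t k = 0; k < len; k++) {
        sum += a[ia] * b[ib];
        ia += 1;     /* next element in the row  */
        ib += len;   /* next element in the column */
    }
    return sum;
}
```

Both functions compute the same sum. A C compiler may well hoist the repeated
work out of the first loop on its own, so this shows the shape of the two
trees and is not a timing comparison.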

I am attaching a tar file that has the complete source for the Ada and the
FORTRAN version.


-- 
           Summary: Ada produces substantially slower code than FORTRAN for
                    identical inputs - looping over double subscripted
                    arrays
           Product: gcc
           Version: unknown
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: ada
        AssignedTo: unassigned at gcc dot gnu dot org
        ReportedBy: jeff at thecreems dot com
 GCC build triplet: i686-pc-linux-gnu


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=29543
