[RFC] Change (flatten) representation of memory references

Richard Guenther Mon, 04 Feb 2008 06:19:12 -0800

Following the old discussions at

  http://gcc.gnu.org/ml/gcc/2007-04/msg00096.html
  and http://gcc.gnu.org/ml/gcc/2007-04/msg00096.html


I'd like to get the ball rolling and start implementing a
unified flattened memory access operation for 4.4.  Following
my earlier proposal and keeping in mind the requirements
fulfilled by Zdenek's approach, I came up with

  MEM_REF ( base, offset, alias_set )
  INDIRECT_MEM_REF ( base, offset, alias_set )

which carry the following information:

 - the BASE object with its type attached
 - a gimple val OFFSET
 - an alias_set_type alias_set, for TBAA and to be used in place
   of the RTL MEM_REF tree
 - the access type on the MEM_REF itself

This is a compact representation for all our memory reference trees.
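
As a rough illustration of the four pieces of information listed above,
the operands can be modelled in plain C.  Everything here (struct
toy_mem_ref, its field names, toy_load) is hypothetical and is not GCC's
tree layout; the base is stood in for by an address, the access type by
its size, and the MEM_REF / INDIRECT_MEM_REF distinction is not modelled.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical model of the proposed operands; not GCC's tree layout.  */
struct toy_mem_ref {
    const void *base;    /* BASE object; its address stands in for the decl */
    size_t offset;       /* OFFSET in bytes (a gimple val in the proposal) */
    int alias_set;       /* alias set number, carried along for TBAA */
    size_t access_size;  /* size of the access type on the reference */
};

/* Read the bytes designated by REF into OUT.  Only base, offset and the
   access size are needed to perform the access; the alias set is purely
   information for the optimizers.  */
static void toy_load(const struct toy_mem_ref *ref, void *out)
{
    assert(ref->base != NULL);
    memcpy(out, (const char *)ref->base + ref->offset, ref->access_size);
}
```

With an int array arr as base, a reference with offset 2 * sizeof(int)
and access size sizeof(int) reads arr[2].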

Multi-dimensional array accesses like a.x[i][j].y would be represented
as (following Zdenek's notation)

  idx_1 = IDX ( offsetof (x) + offsetof (y), j, sizeof (a.x[][j]) )
  idx_2 = IDX ( idx_1, i, sizeof (a.x[i]) )
  MEM_REF ( a, idx_2, 3)

introducing a new indexing operation IDX ( offset, idx, step ) effectively
capturing the information Zdenek keeps in the vector of indices in the
memory reference itself.  The result is of course offset + idx * step,
so this is a multiply-add expression with non-commutative multiplication
(as we want to preserve what is index and what is step).
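
The arithmetic behind the a.x[i][j].y example can be checked with a small
C sketch.  The struct layout (struct elem, struct outer, the 4x5 bounds)
and the helper names are assumptions made up for this illustration; IDX
is written as an ordinary function returning offset + idx * step in
bytes.

```c
#include <assert.h>
#include <stddef.h>

/* IDX (offset, idx, step) = offset + idx * step, all in bytes.  */
static size_t IDX(size_t offset, size_t idx, size_t step)
{
    return offset + idx * step;
}

/* Hypothetical layout giving meaning to a.x[i][j].y.  */
struct elem  { int pad; double y; };
struct outer { char tag; struct elem x[4][5]; };

/* Byte offset of a.x[i][j].y built from two chained IDX operations.  */
static size_t offset_of_y(size_t i, size_t j)
{
    /* Constant part plus the innermost index j, stepping by one element.  */
    size_t idx_1 = IDX(offsetof(struct outer, x) + offsetof(struct elem, y),
                       j, sizeof(struct elem));
    /* Outer index i, stepping by one whole row a.x[i].  */
    size_t idx_2 = IDX(idx_1, i, sizeof(struct elem[5]));
    return idx_2;
}

/* The same offset as the compiler computes it for the source expression.  */
static size_t reference_offset(const struct outer *a, size_t i, size_t j)
{
    return (size_t)((const char *)&a->x[i][j].y - (const char *)a);
}
```

For every valid (i, j) the two functions agree, which is all the chained
IDX form is meant to guarantee.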

The advantage of this form is that the individual IDXes can easily be
CSEd and moved out of loops in which they are invariant.  The offset
member of MEM_REF is not constrained to use IDX operations, but they can
be reconstructed by an analysis phase from pointer arithmetic as well.
For example,

  for (i=0; i<n; ++i)
    for (j=0; j<m; ++j)
      *(p + j * n + i) = *(q + j * n + i);

can be transformed to

  atmp_1 = (T (*)[m][n])p;
  atmp_2 = (T (*)[m][n])q;
  for (i=0; i<n; ++i)
    {
      idx_1 = IDX (0, i, sizeof(T));
      for (j=0; j<m; ++j)
        {
          idx_2 = IDX (idx_1, j, sizeof(T)*n);
          tmp_1 = INDIRECT_MEM_REF (atmp_2, idx_2, 3);
          INDIRECT_MEM_REF (atmp_1, idx_2, 3) = tmp_1;
        }
    }

capturing the domain we loop over in the VLA type of atmp_1 and atmp_2
(that is, in the object, not in the access as with Zdenek's proposal).
Note that at this point (and not only to replace RTL MEM_REF) having the
alias set encoded in the access is crucial, since, depending on the FE,
it may no longer be reconstructable from the base object type and the
access type (?).
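
A runnable C sketch of the same idea: one function uses the pointer
arithmetic p + j * n + i directly, the other computes byte offsets with
IDX and hoists the i term out of the inner loop.  The element type
(T = int), the function names and the use of memcpy for the flattened
accesses are assumptions of this sketch; the steps are chosen so that
both forms address the same element (i steps by sizeof(T), j by
sizeof(T) * n).

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

typedef int T;  /* element type; an assumption for this sketch */

/* IDX (offset, idx, step) = offset + idx * step, all in bytes.  */
static size_t IDX(size_t offset, size_t idx, size_t step)
{
    return offset + idx * step;
}

/* Original form: pointer arithmetic in the loop body.  */
static void copy_pointer_form(T *p, const T *q, size_t n, size_t m)
{
    for (size_t i = 0; i < n; ++i)
        for (size_t j = 0; j < m; ++j)
            *(p + j * n + i) = *(q + j * n + i);
}

/* Flattened form: byte offsets built from IDX operations.  */
static void copy_flattened_form(T *p, const T *q, size_t n, size_t m)
{
    assert(p != NULL && q != NULL);
    char *dst = (char *)p;
    const char *src = (const char *)q;
    for (size_t i = 0; i < n; ++i) {
        /* Depends only on i, so it is invariant in the j loop.  */
        size_t idx_1 = IDX(0, i, sizeof(T));
        for (size_t j = 0; j < m; ++j) {
            size_t idx_2 = IDX(idx_1, j, sizeof(T) * n);
            T tmp_1;
            memcpy(&tmp_1, src + idx_2, sizeof tmp_1);  /* load via q */
            memcpy(dst + idx_2, &tmp_1, sizeof tmp_1);  /* store via p */
        }
    }
}
```

Both functions copy all n * m elements of q to the same positions in p.
The shared idx_2 for the load and the store is what a CSE pass would
find when the two references use the same index computation.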


Thoughts?

Thanks,
Richard.
