Re: How could I get alias set information from data_reference_p

Tobias Grosser Wed, 15 Jul 2009 12:16:59 -0700

On Wed, 2009-07-15 at 13:26 +0200, Richard Guenther wrote:
> On Wed, Jul 15, 2009 at 1:00 PM, Tobias
> Grosser<gros...@fim.uni-passau.de> wrote:
> > On Tue, 2009-07-14 at 23:34 +0200, Richard Guenther wrote:
> >> On Tue, Jul 14, 2009 at 6:08 PM, Sebastian Pop<seb...@gmail.com> wrote:
> >> > On Tue, Jul 14, 2009 at 11:03, Sebastian Pop<seb...@gmail.com> wrote:
> >> >>> Why do you need alias-set numbers?
> >> >>
> >> >> We want to represent the alias set information as an extra subscript
> >> >> on memory accesses: for example,
> >> >>
> >> >> if we have A[10] and supposing that A is in alias set 6, this would be
> >> >> represented as "memory_access[6][10]".
> >> >>
> >> >> For B[3][4] with base array B in alias set 7 and 8, we would represent
> >> >> this as "memory_access[7][3][4] or memory_access[8][3][4]".
> >> >>
> >> >> The data dependence test that we currently have in the graphite branch
> >> >> would work with alias sets represented by an extra dimension to the
> >> >> array dimensions.
> >> >
> >> > And by the way, right now we consider that all the data refs alias each
> >> > other, that means that we set the alias dimension of the memory
> >> > accesses to 1 for every data reference.  This makes the test very
> >> > conservative: in the example above, with the current implementation,
> >> > we would have:
> >> >
> >> > A: memory_access[1][10]
> >> > B: memory_access[1][3][4]
> >>
> >> Well, so you really want to have sort-of the base objects partitioned
> >> into must(!)-conflict sets?  Thus,
> >
> > Is there anything like must-conflict? I thought the alias oracle just
> > returns may conflict relations.
> 
> True.  This is why I ask ;)
> 
> > This information should be passed to graphite without loosing any
> > information.
> >
> >>   (*p)[1][2]
> >>   (*q)[1][3]
> >>
> >> is only correct if the resulting accesses may not alias?  (that is,
> >> p == q?)
> >
> > What do you mean by is correct?
> 
> I try to ask what you are going to do with the computed alias-set
> numbers and what is the property required for the alias-set numbers
> so that their use do not result in invalid transformations from graphite.
> 
> So, what does the 1 in (*p)[1][2] semantically mean?
> 
> >>
> >> No, we don't have such information readily available.  You have
> >> to compute it.  Likely by building a complete conflict map of
> >> the bases of all memory references and partitioning it.
> >>
> >> Which doesn't sound like a great idea - it's quadratic.  Thus, can't
> >> you represent this in a more sparse way?
> >
> > It is quadratic in what? In the number of data references?
> 
> Yes.  You need to build the full conflict map (or a conflict graph,
> as Li proposes in his final algorithm).
>
> > If the only way we can get the information if two data references may
> > alias is asking the oracle, I do not think there is an alternative. We
> > have to get at least this information. So there will be nb_of_drs^2
> > calls to the alias oracle. Or is there any alternative?
> I don't know.  It depends on how you are going to use the information.


The idea is that we want to get rid of the need to call a function, to
know if two data references alias, but - as Sebastian explained - to map
them to virtual memory locations.

Lets say we have some data references A, B, C, D, E, F, G that alias
like this:

A -- B    F -- G
|  / |
| /  |
C -- D -- E   

What I would like is to create groups of the data references that may
touch the same memory location. So every pair of data references that
may alias has to share at least one alias set.

The easiest solution is to put all of them in one big set. This is
always correct, but very conservative.

AS1 = {A,B,C,D,E,F,G}

The next step would be to make groups of all connected components of the
graph.

AS1 = {A,B,C,D,E}
AS2 = {F,G}

The most exact solution I can think of is to take all maximal complete
sub graphs. This means all sub graphs where there is between every two
elements E1, E2 an edge.

AS1 = {A,B,C}
AS2 = {B,C,D}
AS3 = {D,E}
AS4 = {F,G}

So a statement

D[i][4j-5] = B [2j];

has these effects:

may-write [2] [i] [4j-5]
may-write [3] [i] [4j-5]
read [1] [2j]
read [2] [2j]

Another statement might be

... = E[5i][5i+j]

read [3] [5i] [5i+j]

The information is used in our data dependency checks to check and
calculate which dependencies exist.

There we have conflict equalities for every subscript. Lets look at the
possible conflict in between the write to D and the read of E.

It has to fulfill these conditions for some iterations (i,j) and (i',j')

(s0 == 2 == 3 == s0' and
s1 == i == 5i' == s1' and
s2 == 4j-5 == 5i'+j' == s2')

or 

(s0 == 3 == 3 == s0' and
s1 == i == 5i' == s1' and
s2 == 4j-5 == 5i'+j' == s2')

The first set is impossible as they are part of different alias sets and
therefor s0 != s0', however the second one might be possible e.g. for
j = 2, i = 0, i' = 0, j' = 3

The exact solution can be calculated by our linear programming solver
taking into concern even more restrictions.


> > I think the only way to get to less complexity is to limit the
> > information passed to graphite. E.g. we can put everything in the same
> > alias set as we do now. This is very conservative but definitely faster.
> >
> > I tend to take the high complexity at the moment, as I do not think we
> > get many SCoPs that are too big to handle and passing all the details
> > allows Graphite to do more optimizations. However we can switch to the
> > old approach if the number of data references passes a certain limit, so
> > gcc's speed will not suffer.
> 
> A note on Lis final graph algorithm.  I don't understand why you want
> to allow data-references to be part of multiple alias-sets?  (Of course
> I don't know how you are going to use the alias-sets ...)

Just to pass more information to Graphite. The easiest example might be
something like

A -- B -- C

if we have

AS1 = {A,B}
AS2 = {B,C}

we know that A and C do not alias and therefore do not have any
dependencies. If we put A,B,C in one big set we can not say this.


> IMHO what you want is to assign a different alias-set to all partitions
> of the graph.  Where for two partitions P1 and P2 there is no edge
> between them in the conflict graph.
> 
> That may of course be just as precise as using a single alias set.
> 
> Note that you likely want to distinguish read-read conflicts from
> true or anti dependences, no?  Depending on how you are going
> to use them again ...

Sure. Actually we even want to be able to distinguish between read,
write and may-write accesses.

At the moment we do not take advantage of the difference in between
write and may-write, but as soon as we do this there might show up more
interesting problems, as it is preferable to be sure a write happened as
it removes a lot of dependencies.

Tobi

Re: How could I get alias set information from data_reference_p

Reply via email to