Hi,

I tried the "m*n example" with my prototype code generator for fully
parametric domains (which is not publicly available, yet) - sorry for
the delayed reply.

To be precise, the input is the iteration domain
  0 <= j <= m-1
  0 <= i <= n-1
for iterators j,i and scattering
  k = j + m*i

(I based the domains at 0 (instead of 1) for easier analysis of the output.)

As has been said in the discussion, the desired code for this input is

      for (k=0; k<=n*m-1; k++) {
        j = k%m;
        i = k/m;
        S1(j,i);
      }

but we are not getting this code (yet?). Here are my observations. My
prototype code generator for fully parametric input generates the
following code (for the general case, I'm omitting the case distinctions
and code for special cases like m=1 etc.):

      for (k=0; k<=n*m-1; k++) {
        for (j=max(0,k+(-n+1)*m); j<=min(k,m-1); j++) {
          for (i=ceild(k-j,m); i<=floord(k-j,m); i++) {
            S1(j,i);
          }
        }
      }

This is obviously sub-optimal as the loop on j tries every value for j
between 0 and m-1 and then the loop on i checks whether k-j is a
multiple of m. To get a more reasonable code, we would have to infer
from the condition in the loop on i (that m must divide k-j) that the
loop on j should be executed with a stride of m (starting at a suitable
value for j). From there, using the bounds of the loop on j, we could
infer that the loop on j has only one iteration, leading to  j = k%m,
i = (k-j)/m. Finally (by using the bounds on j again), we could infer
that  i=k/m.

This required reasoning relies on the properties of integers and is, of
course, more difficult in the parametric case. I tried to run the
example with m=9 through cloog-isl and what I got is

      for (k=0;k<=9*n-1;k++) {
        for (j=max(0,k-9*n+9);j<=min(8,k);j++) {
          if ((k+8*j)%9 == 0) {
            S1(j,(k-j)/9);
          }
        }
      }

so cloog-isl does not seem to perform the required reasoning, either.
(Or did I miss a switch to turn this on?)

With cloog-isl, the situation is much better if the loop on j becomes
the innermost loop (i.e., j and i are exchanged):

      for (k=0;k<=9*n-1;k++) {
        i = floord(k,9);
        S1(i,k-9*i);
      }

My prototype for the fully parametric case does not fully profit from
the exchange of the inner loops and generates

      for (k=0; k<=n*m-1; k++) {
        for (i=max(0,ceild(k-m+1,m)); i<=min(floord(k,m),n-1); i++) {
          for (j=max(k-m*i,0); j<=min(k-m*i,m-1); j++) {
            S1(i,j);
          }
        }
      }

which shows the "integer" problem that it does optimize the loop bounds
sufficiently. For example, it does not recognize that in the upper bound
of i, floord(k,m) is always less than or equal to n-1 (it fails because
it only checks k/m vs. n-1 but in the rationals k/m can be greater than
n-1). Therefore, it misses the chance to detect that i = floord(k,m) etc.

So, there are obviously some "integer" challenges to generating the good
code desired in this example. At the moment, I'm thinking about which
tests/heuristics to implement in my prototype to take advantage/care of
integrality.

Thanks for this "food for thought"!

Regards,
Armin

Reply via email to