On IPA PTA field sensitivity and pointer expression (part 2)

Erick Ochoa Mon, 28 Sep 2020 05:25:46 -0700

Hi,

previously I sent an e-mail inquiring about the state of points-to information of structure variables allocated in the heap. It was brought to my attention that heap variables do not have a size to model and therefore IPA-PTA is not able to provide field sensitivity.

I now understand better how field sensitivity is modeled in IPA-PTA and the way size is needed in order to compute the correct solution. However, I am now trying to compute the points-to analysis for pointer expressions for stack allocated struct variables. I am trying to answer the question:

What does `temp->f1` point to, for the following simple example without heap allocated memory?


```c
struct A { char* f0; char *f1; struct A *f2;};

int __GIMPLE(startwith("ipa-pta"))
main (int argc, char * * argv)
{
  struct A p1;
  char * pc;
  char c;
  char *cast;
  struct A*temp;
  char *temp2;
  int i;
  int _27;

  i_15 = 1;
  pc = &c;
  p1.f1 = pc;
  p1.f2 = &p1;
  _27 = 0;
  cast = pc;
  temp = p1.f2;
  temp2 = temp->f1;
  return _27;
}
```

There are two questions I have regarding this example. The first one is that IPA-PTA will determine that temp2 points to { c p1 } while I think it should only point to { c }, and I'm trying to understand why. The second is that I am still unsure how to get points-to information for pointer expressions like temp->f1.


Details:

IPA-PTA correctly points out that the structure p1 and the structure pointer temp can both point to { c p1 }:


```
c = { }
p1 = { c p1 } same as temp_33
temp_33 = { c p1 }
```

I believe this is because p1 is the whole struct variable, and temp_33 is also modeling the whole struct variable (in other words, *temp_33+64 points to c and *temp_33+128 points to p1; note that nothing is in field f0).


However, in the case of temp2, we have the following points-to information:


```
temp2_34 = { c p1 }
```

which I believe is an over-approximation. Looking at the constraints generated, we see that temp2_34 was assigned the following constraint:


temp2_34 = *temp_33 + 64

And that means that the function do_sd_constraint should have been used to compute the correct points-to information. Looking at the function, and adding some print statements, it is clear to me that the problem with this imprecision is that temp_33 may point to { c } in its second field. However, isn't GCC supposed to take field information into account in this case? I believe that in order to make this more precise we need to change the get_varinfo API to something that takes offsets into account and gets the solution for pointer expressions.


Instead of this line:
          else if (v->may_have_pointers
                   && add_graph_edge (graph, lhs, t))
            flag |= bitmap_ior_into (sol, get_varinfo (t)->solution);

something like:

          else if (v->may_have_pointers
                   && add_graph_edge (graph, lhs, t))
            flag |= bitmap_ior_into (sol, get_varinfo (t, roffset)->solution);

This seems to me to be an already known issue, and it might be described accurately by this comment:

  TODO: Adding offsets to pointer-to-structures can be handled (IE not punted
  on and turned into anything), but isn't.  You can just see what offset
  inside the pointed-to struct it's going to access.

So, I just want to confirm: does this comment refer concretely to what I'm trying to do? And does this mean that, in order to get an API similar to what I described, I would need to create new constraint variables (one new constraint variable for each field in all pointer-to-struct variables)?


Thanks!
