> On 05-Aug-2021, at 4:56 AM, David Malcolm <dmalc...@redhat.com> wrote:
> 
> On Wed, 2021-08-04 at 21:32 +0530, Ankur Saini wrote:
> 
> [...snip...]
>> 
>> - From observation, a typical vfunc call that isn't devirtualised by
>> the compiler's front end looks something like this:
>> "OBJ_TYPE_REF(_2;(struct A)a_ptr_5(D)->0) (a_ptr_5(D))"
>> where "a_ptr_5(D)" is the pointer that is being used to call the virtual
>> function.
>> 
>> - We can access its region to see the type of the object the
>> pointer is actually pointing to.
>> 
>> - This is then used to find a call with the DECL_CONTEXT of the object
>> among all the possible targets of that polymorphic call.
> 
> [...]
> 
>> 
>> Patch file ( prototype ) : 
>> 
> 
>> +  /* Call is possibly a polymorphic call.
>> +  
>> +     In such a case, use devirtualisation tools to find 
>> +     possible callees of this function call.  */
>> +  
>> +  function *fun = get_current_function ();
>> +  gcall *stmt  = const_cast<gcall *> (call);
>> +  cgraph_edge *e = cgraph_node::get (fun->decl)->get_edge (stmt);
>> +  if (e->indirect_info->polymorphic)
>> +  {
>> +    void *cache_token;
>> +    bool final;
>> +    vec <cgraph_node *> targets
>> +      = possible_polymorphic_call_targets (e, &final, &cache_token, true);
>> +    if (!targets.is_empty ())
>> +      {
>> +        tree most_propbable_taget = NULL_TREE;
>> +        if(targets.length () == 1)
>> +                return targets[0]->decl;
>> +    
>> +        /* From the current state, check which subclass the pointer that
>> +           is being used for this polymorphic call points to, and use it
>> +           to filter out the correct function call.  */
>> +        tree t_val = gimple_call_arg (call, 0);
> 
> Maybe rename to "this_expr"?
> 
> 
>> +        const svalue *sval = get_rvalue (t_val, ctxt);
> 
> and "this_sval"?

ok

> 
> ...assuming that that's what the value is.
> 
> Probably should reject the case where there are zero arguments.

Ideally, the call should always have at least one argument: the pointer used
to call the function.

For example, if the function is called like this:

a_ptr->foo(arg);  // where foo() is a virtual function and a_ptr is a
                  // pointer to an object of a subclass.

I saw that its GIMPLE representation is as follows:

OBJ_TYPE_REF(_2;(struct A)a_ptr_5(D)->0) (a_ptr_5, arg);
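
The same idea can be seen at the source level. The following is only a
sketch: the bodies of A and B and both helper functions are made up for
illustration. It shows that a virtual member call can be written with the
object pointer spelled out as the first argument, which mirrors the
"(a_ptr_5, arg)" argument list in the dump above.

```cpp
#include <functional>

// Hypothetical classes; only the names A, a_ptr, foo and arg come from
// the example above.
struct A
{
  virtual ~A () = default;
  virtual int foo (int arg) { return arg; }
};

struct B : A
{
  int foo (int arg) override { return arg + 100; }
};

// Ordinary member-call syntax: the object pointer is implicit.
int call_member_syntax (A *a_ptr, int arg)
{
  return a_ptr->foo (arg);
}

// The same call with the object pointer passed explicitly as the first
// argument.  Virtual dispatch still applies, so B::foo runs when a_ptr
// points to a B.
int call_explicit_this (A *a_ptr, int arg)
{
  return std::mem_fn (&A::foo) (a_ptr, arg);
}
```

Both helpers reach the same override for a given object, so the first
argument alone carries the information needed to tell which subclass is
involved.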

> 
> 
>> +
>> +        const region *reg
>> +          = [&]()->const region *
>> +              {
>> +                switch (sval->get_kind ())
>> +                  {
>> +                    case SK_INITIAL:
>> +                      {
>> +                        const initial_svalue *initial_sval
>> +                          = sval->dyn_cast_initial_svalue ();
>> +                        return initial_sval->get_region ();
>> +                      }
>> +                      break;
>> +                    case SK_REGION:
>> +                      {
>> +                        const region_svalue *region_sval 
>> +                          = sval->dyn_cast_region_svalue ();
>> +                        return region_sval->get_pointee ();
>> +                      }
>> +                      break;
>> +
>> +                    default:
>> +                      return NULL;
>> +                  }
>> +              } ();
> 
> I think the above should probably be a subroutine.
> 
> That said, it's not clear to me what it's doing, or that this is correct.


Sorry, I think I should have explained it earlier.

Let's take an example code snippet:

Derived d;
Base *base_ptr;
base_ptr = &d;
base_ptr->foo();        // where foo() is a virtual function

This generates the following GIMPLE dump:

Derived::Derived (&d);
base_ptr_6 = &d.D.3779;
_1 = base_ptr_6->_vptr.Base;
_2 = _1 + 8;
_3 = *_2;
OBJ_TYPE_REF(_3;(struct Base)base_ptr_6->1) (base_ptr_6);

Here, instead of trying to extract the virtual pointer from the call and see
which subclass it belongs to, I found it simpler to extract the actual pointer
used to call the function (which, from observation, is always the first
argument of the call) and use the region model at that point to figure out the
type of the object it actually points to, and ultimately the subclass whose
function is being called. :)
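
For reference, here is a self-contained version of the four-line snippet
above. The class bodies, the return values and the wrapper function
call_through_base () are assumptions added only to make it compile; the
original test program may declare other members.

```cpp
// Hypothetical class definitions; only the names Base, Derived, foo, d
// and base_ptr come from the snippet above.
struct Base
{
  virtual int foo () { return 1; }
};

struct Derived : Base
{
  int foo () override { return 2; }
};

// Wraps the four-line snippet so that it can be compiled and called.
int call_through_base ()
{
  Derived d;
  Base *base_ptr;
  base_ptr = &d;
  return base_ptr->foo ();      // dispatches to Derived::foo at run time
}
```

The static type of base_ptr is "Base *", but the object it points to is a
Derived, which is the fact the region model is being asked to recover.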

Now let me try to explain how I actually implemented it (a lot of the
assumptions here are based on observation, so please correct me wherever you
think I have misinterpreted something or forgotten a special case):

- Once it is confirmed that the call we are dealing with is a polymorphic
call (via the cgraph edge representing the call), I use
"possible_polymorphic_call_targets ()" from ipa-utils.h (defined in
ipa-devirt.c) to get the possible callees of that call.

  function *fun = get_current_function ();
  gcall *stmt  = const_cast<gcall *> (call);
  cgraph_edge *e = cgraph_node::get (fun->decl)->get_edge (stmt);
  if (e->indirect_info->polymorphic)
  {
    void *cache_token;
    bool final;
    vec <cgraph_node *> targets
      = possible_polymorphic_call_targets (e, &final, &cache_token, true);

- Now, if the list contains more than one target, I make use of the current
enode's region model to get more information about the pointer that was used
to call the function.

        /* Here I extract the pointer that was used to call the function,
           which, from observation, is always the zeroth argument of the
           call.  */
        tree t_val = gimple_call_arg (call, 0);
        const svalue *sval = get_rvalue (t_val, ctxt);

- In all the examples I used, the pointer is represented as a region_svalue or
as an initial_svalue (I think initial_svalue is the case where the pointer is
taken as a parameter of the current function and the analyzer is analysing a
top-level call to this function).

Here are some examples of both, where I used __analyzer_describe () to show
this:
 . (https://godbolt.org/z/Mqs8oM6ff)
 . (https://godbolt.org/z/z4sfTM3f5)

        /* Here I extract the region that the pointer is pointing to.  As
           both cases return a (const region *), I used a lambda to get it
           (if you want, I can turn this into a separate function to make
           it more readable).  */

        const region *reg
          = [&]()->const region *
              {
                switch (sval->get_kind ())
                  {
                    case SK_INITIAL:
                      {
                        const initial_svalue *initial_sval
                          = sval->dyn_cast_initial_svalue ();
                        return initial_sval->get_region ();
                      }
                      break;
                    case SK_REGION:
                      {
                        const region_svalue *region_sval 
                          = sval->dyn_cast_region_svalue ();
                        return region_sval->get_pointee ();
                      }
                      break;

                    default:
                      return NULL;
                  }
              } ();

        gcc_assert (reg);

        /* Now that I have the region, I try to get the type of the object
           it is holding and put it in ‘known_possible_subclass_type’.  */

        tree known_possible_subclass_type;
        known_possible_subclass_type = reg->get_type ();
        if (reg->get_kind () == RK_FIELD)
          {
             const field_region* field_reg = reg->dyn_cast_field_region ();
             known_possible_subclass_type 
               = DECL_CONTEXT (field_reg->get_field ());
          }

        /* After that, I iterate over the entire array of possible targets
           to find the function whose scope ( DECL_CONTEXT (fn_decl) ) is
           the same as the type of the object that the pointer is actually
           pointing to.  */

        for (cgraph_node *x : targets)
          {
            if (DECL_CONTEXT (x->decl) == known_possible_subclass_type)
              most_propbable_taget = x->decl;
          }
        return most_propbable_taget;
      }
   }
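
To make the control flow above easier to follow outside of GCC, here is a
standalone toy model of the same steps. Everything in it (mock_region,
mock_svalue, mock_target, region_for_this, pick_target) is a hypothetical
stand-in and not one of the analyzer's real classes. It also differs from the
patch in two deliberate ways: the region lookup is a named helper rather than
a lambda, and an unhandled svalue kind makes the lookup give up instead of
hitting an assertion.

```cpp
#include <string>
#include <vector>

// Hypothetical stand-ins; none of these are GCC's real types.
enum class sval_kind { initial, region, other };

struct mock_region
{
  std::string type_name;      // plays the role of reg->get_type ()
  bool is_field;              // plays the role of RK_FIELD
  std::string field_context;  // plays DECL_CONTEXT (field_reg->get_field ())
};

struct mock_svalue
{
  sval_kind kind;
  const mock_region *reg;     // region (initial) or pointee (region)
};

struct mock_target
{
  std::string context;        // plays DECL_CONTEXT (x->decl)
  std::string name;
};

// Named helper replacing the lambda: returns the region behind the
// "this" value, or nullptr for kinds this sketch does not handle.
const mock_region *
region_for_this (const mock_svalue &this_sval)
{
  switch (this_sval.kind)
    {
    case sval_kind::initial:
    case sval_kind::region:
      return this_sval.reg;
    default:
      return nullptr;
    }
}

// Picks the target whose context matches the type the region holds.
// Returns nullptr when nothing can be decided.
const mock_target *
pick_target (const std::vector<mock_target> &targets,
             const mock_svalue &this_sval)
{
  if (targets.empty ())
    return nullptr;
  if (targets.size () == 1)
    return &targets[0];

  const mock_region *reg = region_for_this (this_sval);
  if (!reg)
    return nullptr;           // give up rather than assert

  // For a field region (such as a base sub-object), use the type that
  // contains the field instead of the field's own type.
  const std::string &known_type
    = reg->is_field ? reg->field_context : reg->type_name;

  for (const mock_target &t : targets)
    if (t.context == known_type)
      return &t;
  return nullptr;
}
```

With a region describing a Base sub-object inside a Derived, and targets
Base::foo and Derived::foo, pick_target () selects Derived::foo; with an
svalue of an unhandled kind it returns nullptr.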

I tested it on all of the test programs I created, and so far the analyzer
correctly determines the call in every case. I am currently creating more
tests (including multiple kinds of inheritance) to see how successful this
implementation is.
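
One possible shape for a multiple-inheritance test is sketched below. The
class names, return values and helper functions are all made up for
illustration. The case of interest is the call through "Right *": with two
polymorphic bases, the pointer refers to a second base sub-object rather
than to the start of the complete object, so it is worth checking that the
lookup still finds the right subclass.

```cpp
// Hypothetical test classes; none of these names come from the patch.
struct Left
{
  virtual ~Left () = default;
  virtual int left_id () { return 1; }
};

struct Right
{
  virtual ~Right () = default;
  virtual int right_id () { return 2; }
};

struct Both : Left, Right
{
  int left_id () override { return 10; }
  int right_id () override { return 20; }
};

// Call through a pointer to the first base.
int call_via_left ()
{
  Both b;
  Left *p = &b;
  return p->left_id ();
}

// Call through a pointer to the second base.
int call_via_right ()
{
  Both b;
  Right *p = &b;
  return p->right_id ();
}
```

In both helpers the call should resolve to the override in Both.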

> 
> I'm guessing that you need to see if
>  *((void **)this)
> is a vtable pointer (or something like that), and, if so, which class
> it is for.
> 
> Is there a way of getting the vtable pointer as an svalue?
> 
>> +        gcc_assert (reg);
>> +
>> +        tree known_possible_subclass_type;
>> +        known_possible_subclass_type = reg->get_type ();
>> +        if (reg->get_kind () == RK_FIELD)
>> +          {
>> +             const field_region* field_reg = reg->dyn_cast_field_region ();
>> +             known_possible_subclass_type 
>> +               = DECL_CONTEXT (field_reg->get_field ());
>> +          }
>> +
>> +        for (cgraph_node *x : targets)
>> +          {
>> +            if (DECL_CONTEXT (x->decl) == known_possible_subclass_type)
>> +              most_propbable_taget = x->decl;
>> +          }
>> +        return most_propbable_taget;
>> +      }
>> +   }
>> +
>>   return NULL_TREE;
>> }
> 
> Dave
> 
> 

Thanks 
- Ankur
