> On 05-Aug-2021, at 4:56 AM, David Malcolm <dmalc...@redhat.com> wrote:
>
> On Wed, 2021-08-04 at 21:32 +0530, Ankur Saini wrote:
>
> [...snip...]
>>
>> - From observation, a typical vfunc call that isn't devirtualised by
>> the compiler's front end looks something like this
>> "OBJ_TYPE_REF(_2;(struct A)a_ptr_5(D)->0) (a_ptr_5(D))"
>> where "a_ptr_5(D)" is pointer that is being used to call the virtual
>> function.
>>
>> - We can access it's region to see what is the type of the object the
>> pointer is actually pointing to.
>>
>> - This is then used to find a call with DECL_CONTEXT of the object
>> from the all the possible targets of that polymorphic call.
>
> [...]
>
>>
>> Patch file ( prototype ) :
>>
>
>> + /* Call is possibly a polymorphic call.
>> +
>> + In such case, use devirtisation tools to find
>> + possible callees of this function call. */
>> +
>> + function *fun = get_current_function ();
>> + gcall *stmt = const_cast<gcall *> (call);
>> + cgraph_edge *e = cgraph_node::get (fun->decl)->get_edge (stmt);
>> + if (e->indirect_info->polymorphic)
>> + {
>> + void *cache_token;
>> + bool final;
>> + vec <cgraph_node *> targets
>> + = possible_polymorphic_call_targets (e, &final, &cache_token, true);
>> + if (!targets.is_empty ())
>> + {
>> + tree most_propbable_taget = NULL_TREE;
>> + if(targets.length () == 1)
>> + return targets[0]->decl;
>> +
>> + /* From the current state, check which subclass the pointer that
>> + is being used to this polymorphic call points to, and use to
>> + filter out correct function call. */
>> + tree t_val = gimple_call_arg (call, 0);
>
> Maybe rename to "this_expr"?
>
>
>> + const svalue *sval = get_rvalue (t_val, ctxt);
>
> and "this_sval"?
ok
>
> ...assuming that that's what the value is.
>
> Probably should reject the case where there are zero arguments.
Ideally it should always have at least one argument: the pointer used to
call the function.
For example, if the function is called like this:
  a_ptr->foo(arg); // where foo() is a virtual function and a_ptr is a
                   // pointer to an object of a subclass
I saw that its GIMPLE representation is as follows:
  OBJ_TYPE_REF(_2;(struct A)a_ptr_5(D)->0) (a_ptr_5, arg);
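The observation that the object pointer travels as the leading argument can
also be seen at the source level. The following is only a small stand-alone
sketch, not analyzer code: struct B and both helper functions are hypothetical
names introduced for illustration. It calls the same virtual function once
with member syntax and once through std::invoke, where the object pointer is
passed explicitly as the first argument.

```cpp
#include <functional>

struct A {
  virtual ~A() = default;
  virtual int foo(int arg) { return arg; }
};

// Hypothetical subclass, added only so that the dynamic type differs
// from the static type of the pointer.
struct B : A {
  int foo(int arg) override { return arg + 100; }
};

// Ordinary member-call syntax: the object pointer is implicit.
int call_member_syntax(A *a_ptr, int arg) {
  return a_ptr->foo(arg);
}

// Same call with the object pointer written out as the first argument.
// A pointer to a virtual member function still dispatches virtually.
int call_explicit_object(A *a_ptr, int arg) {
  return std::invoke(&A::foo, a_ptr, arg);
}
```

Both helpers reach B::foo when handed a pointer to a B, which matches the
shape of the GIMPLE call where a_ptr_5 appears before arg.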
>
>
>> +
>> + const region *reg
>> + = [&]()->const region *
>> + {
>> + switch (sval->get_kind ())
>> + {
>> + case SK_INITIAL:
>> + {
>> + const initial_svalue *initial_sval
>> + = sval->dyn_cast_initial_svalue ();
>> + return initial_sval->get_region ();
>> + }
>> + break;
>> + case SK_REGION:
>> + {
>> + const region_svalue *region_sval
>> + = sval->dyn_cast_region_svalue ();
>> + return region_sval->get_pointee ();
>> + }
>> + break;
>> +
>> + default:
>> + return NULL;
>> + }
>> + } ();
>
> I think the above should probably be a subroutine.
>
> That said, it's not clear to me what it's doing, or that this is correct.
Sorry, I should have explained it earlier.
Let's take an example code snippet:
  Derived d;
  Base *base_ptr;
  base_ptr = &d;
  base_ptr->foo(); // where foo() is a virtual function
This generates the following GIMPLE dump:
  Derived::Derived (&d);
  base_ptr_6 = &d.D.3779;
  _1 = base_ptr_6->_vptr.Base;
  _2 = _1 + 8;
  _3 = *_2;
  OBJ_TYPE_REF(_3;(struct Base)base_ptr_6->1) (base_ptr_6);
Here, instead of trying to extract the virtual pointer from the call and work
out which subclass it belongs to, I found it simpler to extract the pointer
that is used to call the function (which, from observation, is always the
first argument of the call) and use the region model at that point to figure
out the type of the object it actually points to. That in turn gives the
subclass whose function is being called here. :)
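For reference, the Base/Derived snippet above can be completed into a small
stand-alone program. This is only a sketch: the class bodies and the
call_through_base() helper are assumptions made for illustration, since the
mail shows only the four statements. It makes the expected behaviour concrete:
the static type of the pointer is Base, but the call must resolve to the
override in Derived.

```cpp
#include <string>

struct Base {
  virtual ~Base() = default;
  virtual std::string foo() const { return "Base::foo"; }
};

struct Derived : Base {
  std::string foo() const override { return "Derived::foo"; }
};

// Mirrors the snippet from the mail: a Base pointer that points at a
// Derived object, used to make a virtual call.
std::string call_through_base() {
  Derived d;
  Base *base_ptr;
  base_ptr = &d;
  return base_ptr->foo();
}
```

The target that runs at run time is the one the analyzer should pick, so a
correct resolution of this call is Derived::foo rather than Base::foo.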
Now let me try to explain how I actually implemented it (a lot of the
assumptions here are based on observation, so please correct me wherever you
think I misinterpreted something or forgot about a special case):
- Once it is confirmed that the call we are dealing with is a polymorphic
call (via the cgraph edge representing the call), I use
"possible_polymorphic_call_targets ()" from ipa-utils.h (defined in
ipa-devirt.c) to get the possible callees of that call.
  function *fun = get_current_function ();
  gcall *stmt = const_cast<gcall *> (call);
  cgraph_edge *e = cgraph_node::get (fun->decl)->get_edge (stmt);
  if (e->indirect_info->polymorphic)
    {
      void *cache_token;
      bool final;
      vec <cgraph_node *> targets
        = possible_polymorphic_call_targets (e, &final, &cache_token, true);
- Now, if the list contains more than one target, I make use of the current
enode's region model to get more information about the pointer that was used
to call the function.
      /* Here I extract the pointer that was used to call the function,
         which, from observation, is always the zeroth argument of the
         call.  */
      tree t_val = gimple_call_arg (call, 0);
      const svalue *sval = get_rvalue (t_val, ctxt);
- In all the examples I used, the pointer is represented either as a
region_svalue or as an initial_svalue (I think initial_svalue is the case
where the pointer is a parameter of the current function and the analyzer is
analysing a top-level call to this function).
Here are some examples of both, where I used __analyzer_describe () to show
this:
. (https://godbolt.org/z/Mqs8oM6ff)
. (https://godbolt.org/z/z4sfTM3f5)
      /* Here I extract the region that the pointer is pointing to. As both
         cases return a (const region *), I used a lambda to get it (if you
         want, I can turn this into a separate function to make it more
         readable).  */
      const region *reg
        = [&]()->const region *
            {
              switch (sval->get_kind ())
                {
                  case SK_INITIAL:
                    {
                      const initial_svalue *initial_sval
                        = sval->dyn_cast_initial_svalue ();
                      return initial_sval->get_region ();
                    }
                    break;
                  case SK_REGION:
                    {
                      const region_svalue *region_sval
                        = sval->dyn_cast_region_svalue ();
                      return region_sval->get_pointee ();
                    }
                    break;
                  default:
                    return NULL;
                }
            } ();
      gcc_assert (reg);
      /* Now that I have the region, I get the type of the object it is
         holding and put it in ‘known_possible_subclass_type’.  */
      tree known_possible_subclass_type;
      known_possible_subclass_type = reg->get_type ();
      if (reg->get_kind () == RK_FIELD)
        {
          const field_region* field_reg = reg->dyn_cast_field_region ();
          known_possible_subclass_type
            = DECL_CONTEXT (field_reg->get_field ());
        }
      /* After that I iterate over the entire array of possible targets to
         find the function whose scope ( DECL_CONTEXT (fn_decl) ) is the same
         as the type of the object that the pointer is actually pointing
         to.  */
      for (cgraph_node *x : targets)
        {
          if (DECL_CONTEXT (x->decl) == known_possible_subclass_type)
            most_propbable_taget = x->decl;
        }
      return most_propbable_taget;
}
}
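The selection step at the end can be modelled outside of GCC. What follows is
a minimal stand-alone sketch under stated assumptions, not analyzer code:
Target and pick_target() are hypothetical names, a string stands in for the
class that DECL_CONTEXT would return, and the sketch returns the first match
where the prototype keeps the last one (the two agree when each class
contributes at most one candidate).

```cpp
#include <string>
#include <vector>

// Hypothetical stand-in for a cgraph_node: the method's name plus the
// class that declares it (the role DECL_CONTEXT plays in the prototype).
struct Target {
  std::string context;
  std::string name;
};

// Returns the candidate declared in `pointee_type`, or nullptr if no
// candidate matches. A lone candidate is returned without consulting
// the type, as in the prototype's length () == 1 shortcut.
const Target *pick_target(const std::vector<Target> &targets,
                          const std::string &pointee_type) {
  if (targets.empty())
    return nullptr;
  if (targets.size() == 1)
    return &targets[0];
  for (const Target &t : targets)
    if (t.context == pointee_type)
      return &t;
  return nullptr;
}
```

In this model an exact match on the declaring class finds nothing when the
pointee's class inherits the function without overriding it, so that may be a
case worth covering in the additional tests.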
I tested it on all of the test programs I created, and so far the analyzer
determines the call correctly in all of them. I am currently creating more
tests (including several kinds of inheritance) to see how well this
implementation holds up.
>
> I'm guessing that you need to see if
> *((void **)this)
> is a vtable pointer (or something like that), and, if so, which class
> it is for.
>
> Is there a way of getting the vtable pointer as an svalue?
>
>> + gcc_assert (reg);
>> +
>> + tree known_possible_subclass_type;
>> + known_possible_subclass_type = reg->get_type ();
>> + if (reg->get_kind () == RK_FIELD)
>> + {
>> + const field_region* field_reg = reg->dyn_cast_field_region ();
>> + known_possible_subclass_type
>> + = DECL_CONTEXT (field_reg->get_field ());
>> + }
>> +
>> + for (cgraph_node *x : targets)
>> + {
>> + if (DECL_CONTEXT (x->decl) == known_possible_subclass_type)
>> + most_propbable_taget = x->decl;
>> + }
>> + return most_propbable_taget;
>> + }
>> + }
>> +
>> return NULL_TREE;
>> }
>
> Dave
>
>
Thanks
- Ankur