Hi,

On Thu, Dec 11 2025, Jan Hubicka wrote:
>> Hi,
>> 
>> this patch extends the ipa-devirt pass with the ability to add
>> speculative calls for indirect calls where the target was loaded from
>> a record_type data structure and we have seen writes of address of
>> only one particular function to the same offset of that record
>> type (including static initializers).
>> 
>> The idea is basically taken from Christoph Müllner's patch from 2022
>> (https://gcc.gnu.org/pipermail/gcc-patches/2022-November/605934.html)
>> but I have re-worked it so that it stores the required information in
>> GC memory (which I belive is necessary) and does not require an
>> additional pass through gimple statements of all functions because it
>> uses the analysis phase of ipa-cp/ipa-fnsummary (which was an approach
>> we agreed on with Honza).
>> 
>> It also performs simple verification that the collected types match
>> the type of the record field.  We could verify that the function
>> determined as the likely target matches the call statement
>> expectations, but for that we would need to stream both types which is
>> something I decided not to do.
>> 
>> Bootsstrapped and LTO-bootstrapped and tested on x86_64-linux.  OK for
>> master?
>> 
>> Thanks,
>> 
>> Martin
>> 
>> 
>> gcc/ChangeLog:
>> 
>> 2025-10-24  Martin Jambor  <[email protected]>
>> 
>>      PR ipa/107666
>>      * cgraph.h (cgraph_simple_indirect_info): New fields rec_type and
>>      fld_offset.
>>      * ipa-prop.h (ipa_analyze_var_static_initializer): Declare.
>>      (ipa_dump_noted_record_fnptrs): Likewise.
>>      (ipa_debug_noted_record_fnptrs): Likewise.
>>      (ipa_single_noted_fnptr_in_record): Likewise.
>>      (ipa_free_noted_fnptr_calls): Likewise.
>>      * ipa-cp.cc (ipcp_generate_summary): Call
>>      ipa_analyze_var_static_initializer on each varbool node with a static
>>      initializer.
>>      * ipa-devirt.cc (struct devirt_stats): New type.
>>      (devirt_target_ok_p): New function.
>>      (ipa_devirt): Move statistics counters to the new structure.  dump
>>      noted function pointers stored in records.  Check for edge hotness
>>      first and for odr_types only for polymorphic edges.  Moved a number of
>>      checks to devirt_target_ok_p.  Also add speculative direct calls for
>>      non-polymorphic indirect ones when ipa_single_noted_fnptr_in_record
>>      finds a likely target.  Call ipa_free_noted_fnptr_calls.
>>      * ipa-prop.cc (noted_fnptr_store): New type.
>>      (struct noted_fnptr_hasher): Likewise.
>>      (noted_fnptr_hasher::hash): New function.
>>      (noted_fnptr_hasher::equal): Likewise.
>>      (noted_fnptrs_in_records): New.
>>      (is_func_ptr_from_record): New function.
>>      (ipa_analyze_indirect_call_uses): Also simple create indirect info
>>      structures with fnptr_loaded_from_record set.
>>      (note_fnptr_in_record): New function.
>>      (ipa_dump_noted_record_fnptrs): Likewise.
>>      (ipa_debug_noted_record_fnptrs): Likewise.
>>      (ipa_single_noted_fnptr_in_record): Likewise.
>>      (ipa_free_noted_fnptr_calls): Likewise.
>>      (ipa_analyze_stmt_uses): Also look for stroes of function pointers to
>>      record structures.
>>      (ipa_analyze_var_static_initializer): New function.
>>      (ipa_write_indirect_edge_info): Also stream fnptr_loaded_from_record
>>      indirec infos.
>>      (ipa_read_indirect_edge_info): Likewise.
>>      (ipa_prop_write_jump_functions): Also stream the contents of
>>      noted_fnptrs_in_records.
>>      (ipa_prop_read_section): Likewise.
>> 
>> gcc/testsuite/ChangeLog:
>> 
>> 2025-10-24  Martin Jambor  <[email protected]>
>> 
>>      * gcc.dg/lto/fnptr-from-rec-1_0.c: New test.
>>      * gcc.dg/lto/fnptr-from-rec-1_1.c: Likewise.
>>      * gcc.dg/lto/fnptr-from-rec-2_0.c: Likewise.
>>      * gcc.dg/lto/fnptr-from-rec-2_1.c: Likewise.
>>      * gcc.dg/lto/fnptr-from-rec-3_0.c: Likewise.
>>      * gcc.dg/lto/fnptr-from-rec-3_1.c: Likewise.
>> 
>> Co-authored by: Christoph Müllner <[email protected]>
>> ---
>>  gcc/cgraph.h                                  |  13 +-
>>  gcc/ipa-cp.cc                                 |   4 +
>>  gcc/ipa-devirt.cc                             | 195 +++++++----
>>  gcc/ipa-prop.cc                               | 330 +++++++++++++++++-
>>  gcc/ipa-prop.h                                |   5 +
>>  gcc/testsuite/gcc.dg/lto/fnptr-from-rec-1_0.c |  41 +++
>>  gcc/testsuite/gcc.dg/lto/fnptr-from-rec-1_1.c |  19 +
>>  gcc/testsuite/gcc.dg/lto/fnptr-from-rec-2_0.c |  43 +++
>>  gcc/testsuite/gcc.dg/lto/fnptr-from-rec-2_1.c |  22 ++
>>  gcc/testsuite/gcc.dg/lto/fnptr-from-rec-3_0.c |  43 +++
>>  gcc/testsuite/gcc.dg/lto/fnptr-from-rec-3_1.c |  19 +
>>  11 files changed, 651 insertions(+), 83 deletions(-)
>>  create mode 100644 gcc/testsuite/gcc.dg/lto/fnptr-from-rec-1_0.c
>>  create mode 100644 gcc/testsuite/gcc.dg/lto/fnptr-from-rec-1_1.c
>>  create mode 100644 gcc/testsuite/gcc.dg/lto/fnptr-from-rec-2_0.c
>>  create mode 100644 gcc/testsuite/gcc.dg/lto/fnptr-from-rec-2_1.c
>>  create mode 100644 gcc/testsuite/gcc.dg/lto/fnptr-from-rec-3_0.c
>>  create mode 100644 gcc/testsuite/gcc.dg/lto/fnptr-from-rec-3_1.c
>> 
>> diff --git a/gcc/cgraph.h b/gcc/cgraph.h
>> index c1329015342..062ad0f65ad 100644
>> --- a/gcc/cgraph.h
>> +++ b/gcc/cgraph.h
>> @@ -1746,19 +1746,30 @@ class GTY((tag ("CIIK_SIMPLE")))
>>  public:
>>    cgraph_simple_indirect_info (int flags)
>>      : cgraph_indirect_call_info (CIIK_SIMPLE, flags), offset (0),
>> -    agg_contents (false), member_ptr (false), by_ref (false),
>> +    rec_type (NULL_TREE), fld_offset (0), agg_contents (false),
>> +    member_ptr (false), fnptr_loaded_from_record (false), by_ref (false),
>>      guaranteed_unmodified (false)
>>      {}
>>  
>>    /* When agg_content is set, an offset where the call pointer is located
>>       within the aggregate.  */
>>    HOST_WIDE_INT offset;
>> +  /* Only meaningful if fnptr_loaded_from_record is set.  Then it contains 
>> the
>> +     type of the record from which the target of the call was loaded. */
>> +  tree rec_type;
>> +  /* Only meaningful if fnptr_loaded_from_record is set.  Then it contains 
>> the
>> +     offset in bytes within the type above from which the target of the call
>> +     was loaded.  */
>> +  unsigned fld_offset;
>
> I think this is exlusive with aggregate handling, so perhaps offset, can
> be (ab)used for both.

Unfortunately no, the offset is already being used to hold information
about possible C++ member pointers (and the values can be different too,
though I have forgotten what the situation was).

>> +/* If REF is a memory access that loads a function pointer (but not a method
>> +   pointer) from a RECORD_TYPE, return true and store the type of the 
>> RECORD to
>> +   *REC_TYPE and the byte offset of the field to *FLD_OFFSET.  Otherwise 
>> return
>> +   false.  OHS es the "other hand side" which is used to check type
>> +   compatibility with field in question, when possible.  */
>> +
>> +static bool
>> +is_func_ptr_from_record (tree ref, tree *rec_type, unsigned *fld_offset,
>> +                     tree ohs)
>> +{
>> +  if (!POINTER_TYPE_P (TREE_TYPE (ref))
>> +      || TREE_CODE (TREE_TYPE (TREE_TYPE (ref))) != FUNCTION_TYPE)
>> +    return false;
>> +
>> +  if (TREE_CODE (ref) == COMPONENT_REF
>> +      && TREE_CODE (TREE_TYPE (TREE_OPERAND (ref, 0))) == RECORD_TYPE)
>> +    {
>> +      gcc_assert (POINTER_TYPE_P (TREE_TYPE (ohs)));
>> +      ohs = TREE_TYPE (TREE_TYPE (ohs));
>> +      tree ftype = TREE_TYPE (TREE_OPERAND (ref, 1));
>> +      if (!POINTER_TYPE_P (ftype))
>> +    return false;
>> +      ftype = TREE_TYPE (ftype);
>> +      if (!types_compatible_p (ohs, ftype))
>> +    return false;
>> +
>> +      tree tree_off = bit_position (TREE_OPERAND (ref, 1));
>> +      if (!tree_fits_shwi_p (tree_off))
>> +    return false;
>> +      HOST_WIDE_INT bit_offset = tree_to_shwi (tree_off);
>> +      if (bit_offset % BITS_PER_UNIT)
>> +    return false;
>> +      HOST_WIDE_INT unit_offset = bit_offset / BITS_PER_UNIT;
>> +      if (unit_offset > UINT_MAX)
>> +    return false;
>> +      *rec_type = TREE_TYPE (TREE_OPERAND (ref, 0));
>> +      *fld_offset = unit_offset;
> Isn't this mostly what get_addr_base_and_unit_offset does?

This is different from get_ref_base_and_offset in that it only looks at
the GIMPLE outermost reference (so in a.b.c.f it would only inspect
c.f).  This is on purpose as I expect this to what we want more often (I
do not have any data to back this up, though) and we are not trying to
be somehow smart and record it for all types (but in that case we would
need our own expression scanner doing that anyway).

>> +      return true;
>> +    }
>> +  else if (TREE_CODE (ref) == MEM_REF
>> +       && POINTER_TYPE_P (TREE_TYPE (TREE_OPERAND (ref, 0)))
>> +       && (TREE_CODE (TREE_TYPE (TREE_TYPE (TREE_OPERAND (ref, 0))))
>> +           == RECORD_TYPE))
>> +    {
>> +      HOST_WIDE_INT unit_offset = tree_to_shwi (TREE_OPERAND (ref, 1));
>> +      if (unit_offset > UINT_MAX)
>> +    return false;
>> +      *rec_type = TREE_TYPE (TREE_TYPE (TREE_OPERAND (ref, 0)));
>> +      *fld_offset = unit_offset;
>> +      return true;
>> +    }
>> +  return false;
>> +}
>> +
>
> Patch is OK.

Thanks a lot.  I'm re-testing it now and plan to commit it soon.

Martin

Reply via email to