Invoking a hook in c_parser_postfix_expression catches
__builtin_offsetof, but expressions such as "((size_t)&((struct
S*)0)->m)" are still lost irreversibly. For the plugin I'm writing, I
need to get them all.
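
To make the two cases concrete, here is a minimal sketch. struct S and
the function names are made up for illustration and are not part of the
patch. The first function uses the builtin that the parser hook would
catch; the second uses the hand-rolled idiom, which ISO C does not
guarantee but which GCC is expected to fold to the same constant:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical struct, used only for this illustration.  */
struct S
{
  char pad;
  int m;
};

/* Case 1: the builtin form, folded by the C front end while parsing.  */
size_t
offset_builtin (void)
{
  return __builtin_offsetof (struct S, m);
}

/* Case 2: the offsetof-like idiom from this thread, folded right after
   parsing.  Not guaranteed by ISO C, but accepted by GCC.  */
size_t
offset_handrolled (void)
{
  return (size_t) &((struct S *) 0)->m;
}

/* Returns nonzero when both forms agree with the standard macro.  */
int
offsets_agree (void)
{
  return offset_builtin () == offset_handrolled ()
         && offset_builtin () == offsetof (struct S, m);
}
```

After folding, a plugin sees only the integer constant in both
functions, with no remaining reference to struct S or m.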

On Mon, Oct 20, 2025 at 3:16 PM Richard Biener
<[email protected]> wrote:
>
> On Mon, Oct 20, 2025 at 1:23 PM Jasper Niebuhr <[email protected]> wrote:
> >
> > That makes total sense. The COMPONENT_REFs that I know are being
> > folded are either in __builtin_offsetof or offsetof-like constructs,
> > e.g. "((size_t)&((struct S*)0)->m)". The former case calls
> > fold_offsetof immediately, while parsing. Expressions using the latter
> > are not folded during parsing, but right after. At the moment, the
> > folding logic for this still ends up calling fold_offsetof to do the
> > job.
> >
> > Technically, I could invoke a callback from inside fold_offsetof.
> > However, I find it somewhat fragile to assume that future refactorings
> > > of the folder will always continue to route offsetof-like folding
> > through fold_offsetof(). The logic could easily be inlined or
> > delegated elsewhere. I suppose an accompanying test case with a
> > minimal plugin could ensure that the callback is actually invoked in
> > all relevant cases and would catch any future change that bypasses
> > this path.
>
> So is it good enough to invoke an existing hook before this early
> folding (I suppose from c_parser_postfix_expression)?
>
> > Does that sound good?
> >
> > On Mon, Oct 20, 2025 at 11:19 AM Richard Biener
> > <[email protected]> wrote:
> > >
> > > On Mon, Oct 20, 2025 at 10:36 AM Jasper Niebuhr
> > > <[email protected]> wrote:
> > > >
> > > > I'm not aware of any mechanisms, or a foundation for introducing a new
> > > > mechanism that allows associating original trees with folded constant
> > > > expressions, without the risk of those being lost during further
> > > > folding.
> > > >
> > > > That said, I see your point about a new global plugin event being
> > > > heavy-weight in terms of API surface and maintenance. I could make
> > > > this a front-end-local callback. For example, the C front-end could
> > > > expose an internal registration function, implemented in c-common.cc
> > > > and declared only in c-common.h.
> > > >
> > > > Since c-common.h is not included in the public plugin headers, this
> > > > wouldn't become part of the documented plugin API. Plugins that really
> > > > need this would have to declare the function prototype themselves and
> > > > link against the C front end. In other words, this would be a private,
> > > > opt-in mechanism rather than a public API commitment, so it shouldn’t
> > > > constrain future refactoring.
> > > >
> > > > If that seems reasonable, I can prepare a v2 implementing it this way.
> > >
> > > > I guess my thinking is more along the lines of this being far too
> > > > low-level a point to do any interception.  As for catching expressions
> > > > pre-folding with a front-end-specific hook, I would suggest researching
> > > > a direction that puts a hook at the point where the parser finishes
> > > > parsing certain constructs - I'm not sure what exactly we are looking
> > > > at, possibly {r,l}values?  A point before semantic analysis, which is
> > > > what breaks your case as far as I understand.
> > >
> > > Richard.
> > >
> > > > On Mon, Oct 20, 2025 at 9:22 AM Richard Biener
> > > > <[email protected]> wrote:
> > > > >
> > > > > On Sat, Oct 18, 2025 at 3:49 PM York Jasper Niebuhr
> > > > > <[email protected]> wrote:
> > > > > >
> > > > > > > This patch adds the PLUGIN_BUILD_COMPONENT_REF callback, which is invoked
> > > > > > > by the C front end when a COMPONENT_REF node is built. The callback
> > > > > > > receives a pointer to the COMPONENT_REF tree (of type 'tree *'). Plugins
> > > > > > > may replace the node by assigning through the pointer, but any
> > > > > > > replacement must be type-compatible with the original node.
> > > > > >
> > > > > > > The callback allows plugins to observe or instrument struct member
> > > > > > > accesses that would otherwise be lost due to folding before the earliest
> > > > > > > possible plugin pass or hook. In particular, the fold_offsetof
> > > > > > > functionality removes all traces of type and member information in
> > > > > > > offsetof-like trees, leaving only an integer constant for plugins to
> > > > > > > inspect.
> > > > > >
> > > > > > > A considered alternative was to disable fold_offsetof altogether.
> > > > > > > However, that prevents offsetof expressions from qualifying as
> > > > > > > constant-expressions; for example, static assertions can no longer be
> > > > > > > evaluated if they contain non-folded offsetof expressions. The callback
> > > > > > > provides fine-grained control over individual COMPONENT_REFs instead of
> > > > > > > universally changing folding behavior.
> > > > >
> > > > > > I think a hook on COMPONENT_REF building is quite heavy-weight.  IMO
> > > > > > folding required for constant-expression-ness diagnosing might be better
> > > > > > off exposing both the folding result and the original tree somehow?
> > > > >
> > > > > > > A typical use case would be to replace a select set of COMPONENT_REF
> > > > > > > nodes with type-compatible expressions calling a placeholder function,
> > > > > > > e.g. __deferred_offsetof(type, member). These calls cannot be folded
> > > > > > > away and thus remain available for plugin analysis in later passes.
> > > > > > > Offsets not of interest can be left untouched, preserving their const
> > > > > > > qualification and use in static assertions.
> > > > > >
> > > > > > > Allowing PLUGIN_BUILD_COMPONENT_REF to alter COMPONENT_REF nodes required
> > > > > > > minor adjustments to fold_offsetof, which assumes a specific input
> > > > > > > format. Code paths that cannot guarantee that format should now use
> > > > > > > fold_offsetof_maybe(), which attempts to fold normally but, on failure,
> > > > > > > casts the unfolded expression to the desired output type.
> > > > > >
> > > > > > > If the callback is not used to alter COMPONENT_REF trees, there is **no
> > > > > > > change** in GCC’s behavior.
> > > > > >
> > > > > > Signed-off-by: York Jasper Niebuhr <[email protected]>
> > > > > >
> > > > > > ---
> > > > > > >  gcc/c-family/c-common.cc | 48 +++++++++++++++++++++++++++++++---------
> > > > > >  gcc/c-family/c-common.h  |  3 ++-
> > > > > >  gcc/c/c-parser.cc        |  2 +-
> > > > > >  gcc/c/c-typeck.cc        | 12 ++++++++++
> > > > > >  gcc/doc/plugins.texi     |  6 +++++
> > > > > >  gcc/plugin.cc            |  2 ++
> > > > > >  gcc/plugin.def           |  6 +++++
> > > > > >  7 files changed, 66 insertions(+), 13 deletions(-)
> > > > > >
> > > > > > diff --git a/gcc/c-family/c-common.cc b/gcc/c-family/c-common.cc
> > > > > > index 587d76461e9..d34edfaa688 100644
> > > > > > --- a/gcc/c-family/c-common.cc
> > > > > > +++ b/gcc/c-family/c-common.cc
> > > > > > @@ -7076,43 +7076,48 @@ c_common_to_target_charset (HOST_WIDE_INT c)
> > > > > >     the whole expression.  Return the folded result.  */
> > > > > >
> > > > > >  tree
> > > > > > -fold_offsetof (tree expr, tree type, enum tree_code ctx)
> > > > > > > +fold_offsetof (tree expr, tree type, enum tree_code ctx, bool may_fail)
> > > > > >  {
> > > > > >    tree base, off, t;
> > > > > >    tree_code code = TREE_CODE (expr);
> > > > > > +
> > > > > >    switch (code)
> > > > > >      {
> > > > > >      case ERROR_MARK:
> > > > > >        return expr;
> > > > > >
> > > > > >      case VAR_DECL:
> > > > > > > -      error ("cannot apply %<offsetof%> to static data member %qD", expr);
> > > > > > > +      if (!may_fail)
> > > > > > > +       error ("cannot apply %<offsetof%> to static data member %qD", expr);
> > > > > >        return error_mark_node;
> > > > > >
> > > > > >      case CALL_EXPR:
> > > > > >      case TARGET_EXPR:
> > > > > > > -      error ("cannot apply %<offsetof%> when %<operator[]%> is overloaded");
> > > > > > > +      if (!may_fail)
> > > > > > > +       error ("cannot apply %<offsetof%> when %<operator[]%> is overloaded");
> > > > > >        return error_mark_node;
> > > > > >
> > > > > >      case NOP_EXPR:
> > > > > >      case INDIRECT_REF:
> > > > > >        if (!TREE_CONSTANT (TREE_OPERAND (expr, 0)))
> > > > > >         {
> > > > > > > -         error ("cannot apply %<offsetof%> to a non constant address");
> > > > > > > +         if (!may_fail)
> > > > > > > +           error ("cannot apply %<offsetof%> to a non constant address");
> > > > > >           return error_mark_node;
> > > > > >         }
> > > > > >        return convert (type, TREE_OPERAND (expr, 0));
> > > > > >
> > > > > >      case COMPONENT_REF:
> > > > > > -      base = fold_offsetof (TREE_OPERAND (expr, 0), type, code);
> > > > > > > +      base = fold_offsetof (TREE_OPERAND (expr, 0), type, code, may_fail);
> > > > > >        if (base == error_mark_node)
> > > > > >         return base;
> > > > > >
> > > > > >        t = TREE_OPERAND (expr, 1);
> > > > > >        if (DECL_C_BIT_FIELD (t))
> > > > > >         {
> > > > > > -         error ("attempt to take address of bit-field structure "
> > > > > > -                "member %qD", t);
> > > > > > +         if (!may_fail)
> > > > > > +           error ("attempt to take address of bit-field structure "
> > > > > > +                  "member %qD", t);
> > > > > >           return error_mark_node;
> > > > > >         }
> > > > > > >        off = size_binop_loc (input_location, PLUS_EXPR, DECL_FIELD_OFFSET (t),
> > > > > > > @@ -7121,7 +7126,7 @@ fold_offsetof (tree expr, tree type, enum tree_code ctx)
> > > > > >        break;
> > > > > >
> > > > > >      case ARRAY_REF:
> > > > > > -      base = fold_offsetof (TREE_OPERAND (expr, 0), type, code);
> > > > > > > +      base = fold_offsetof (TREE_OPERAND (expr, 0), type, code, may_fail);
> > > > > >        if (base == error_mark_node)
> > > > > >         return base;
> > > > > >
> > > > > > @@ -7178,17 +7183,38 @@ fold_offsetof (tree expr, tree type, enum 
> > > > > > tree_code ctx)
> > > > > >      case COMPOUND_EXPR:
> > > > > >        /* Handle static members of volatile structs.  */
> > > > > >        t = TREE_OPERAND (expr, 1);
> > > > > > -      gcc_checking_assert (VAR_P (get_base_address (t)));
> > > > > > -      return fold_offsetof (t, type);
> > > > > > +      if (!VAR_P (get_base_address (t)))
> > > > > > +       return error_mark_node;
> > > > > > +      return fold_offsetof (t, type, ERROR_MARK, may_fail);
> > > > > >
> > > > > >      default:
> > > > > > -      gcc_unreachable ();
> > > > > > +      return error_mark_node;
> > > > > >      }
> > > > > >
> > > > > >    if (!POINTER_TYPE_P (type))
> > > > > >      return size_binop (PLUS_EXPR, base, convert (type, off));
> > > > > >    return fold_build_pointer_plus (base, off);
> > > > > >  }
> > > > > > +
> > > > > > > +/* Tries folding expr using fold_offsetof.  On success, the folded offsetof
> > > > > > > +   is returned.  On failure, the original expr is wrapped in an ADDR_EXPR
> > > > > > > +   and converted to the desired expression type.  The resulting expression
> > > > > > > +   may or may not be constant!  */
> > > > > > +
> > > > > > +tree
> > > > > > +fold_offsetof_maybe (tree expr, tree type)
> > > > > > +{
> > > > > > > +  /* expr might not have the correct structure, thus folding may fail.  */
> > > > > > +  tree maybe_folded = fold_offsetof (expr, type, ERROR_MARK, true);
> > > > > > +  if (maybe_folded != error_mark_node)
> > > > > > +    return maybe_folded;
> > > > > > +
> > > > > > +  tree ptr_type = build_pointer_type (TREE_TYPE (expr));
> > > > > > +  tree ptr = build1 (ADDR_EXPR, ptr_type, expr);
> > > > > > +
> > > > > > +  return fold_convert (type, ptr);
> > > > > > +}
> > > > > > +
> > > > > >
> > > > > > >  /* *PTYPE is an incomplete array.  Complete it with a domain based on
> > > > > > >     INITIAL_VALUE.  If INITIAL_VALUE is not present, use 1 if DO_DEFAULT
> > > > > > diff --git a/gcc/c-family/c-common.h b/gcc/c-family/c-common.h
> > > > > > index ea6c2975056..70fcfeb6661 100644
> > > > > > --- a/gcc/c-family/c-common.h
> > > > > > +++ b/gcc/c-family/c-common.h
> > > > > > @@ -1174,7 +1174,8 @@ extern bool c_dump_tree (void *, tree);
> > > > > >  extern void verify_sequence_points (tree);
> > > > > >
> > > > > >  extern tree fold_offsetof (tree, tree = size_type_node,
> > > > > > -                          tree_code ctx = ERROR_MARK);
> > > > > > > +                          tree_code ctx = ERROR_MARK, bool may_fail = false);
> > > > > > +extern tree fold_offsetof_maybe (tree, tree = size_type_node);
> > > > > >
> > > > > >  extern int complete_array_type (tree *, tree, bool);
> > > > > >  extern void complete_flexible_array_elts (tree);
> > > > > > diff --git a/gcc/c/c-parser.cc b/gcc/c/c-parser.cc
> > > > > > index 22ec0f849b7..6a8a5d58e6d 100644
> > > > > > --- a/gcc/c/c-parser.cc
> > > > > > +++ b/gcc/c/c-parser.cc
> > > > > > @@ -11823,7 +11823,7 @@ c_parser_postfix_expression (c_parser 
> > > > > > *parser)
> > > > > >             location_t end_loc = c_parser_peek_token 
> > > > > > (parser)->get_finish ();
> > > > > >             c_parser_skip_until_found (parser, CPP_CLOSE_PAREN,
> > > > > >                                        "expected %<)%>");
> > > > > > -           expr.value = fold_offsetof (offsetof_ref);
> > > > > > +           expr.value = fold_offsetof_maybe (offsetof_ref);
> > > > > >             set_c_expr_source_range (&expr, loc, end_loc);
> > > > > >           }
> > > > > >           break;
> > > > > > diff --git a/gcc/c/c-typeck.cc b/gcc/c/c-typeck.cc
> > > > > > index 55d896e02df..aff6dce36fb 100644
> > > > > > --- a/gcc/c/c-typeck.cc
> > > > > > +++ b/gcc/c/c-typeck.cc
> > > > > > @@ -55,6 +55,7 @@ along with GCC; see the file COPYING3.  If not see
> > > > > >  #include "realmpfr.h"
> > > > > >  #include "tree-pretty-print-markup.h"
> > > > > >  #include "gcc-urlifier.h"
> > > > > > +#include "plugin.h"
> > > > > >
> > > > > > >  /* Possible cases of implicit conversions.  Used to select diagnostic messages
> > > > > > >     and control folding initializers in convert_for_assignment.  */
> > > > > > > @@ -133,6 +134,7 @@ static int lvalue_or_else (location_t, const_tree, enum lvalue_use);
> > > > > >  static void record_maybe_used_decl (tree);
> > > > > >  static bool comptypes_internal (const_tree, const_tree,
> > > > > >                                 struct comptypes_data *data);
> > > > > > +
> > > > > >
> > > > > > >  /* Return true if EXP is a null pointer constant, false otherwise.  */
> > > > > > >
> > > > > > > @@ -3174,6 +3176,16 @@ build_component_ref (location_t loc, tree datum, tree component,
> > > > > >           else if (TREE_DEPRECATED (subdatum))
> > > > > >             warn_deprecated_use (subdatum, NULL_TREE);
> > > > > >
> > > > > > +      tree pre_cb_type = TREE_TYPE (ref);
> > > > > > > +      if (invoke_plugin_callbacks (PLUGIN_BUILD_COMPONENT_REF, &ref)
> > > > > > +             == PLUGEVT_SUCCESS
> > > > > > +             && !comptypes (TREE_TYPE (ref), pre_cb_type))
> > > > > > +       {
> > > > > > +         error_at (EXPR_LOCATION (ref),
> > > > > > +                   "PLUGIN_BUILD_COMPONENT_REF callback returned"
> > > > > > +                   " expression of incompatible type");
> > > > > > +       }
> > > > > > +
> > > > > >           datum = ref;
> > > > > >
> > > > > >           field = TREE_CHAIN (field);
> > > > > > diff --git a/gcc/doc/plugins.texi b/gcc/doc/plugins.texi
> > > > > > index c11167a34ef..312f178fab4 100644
> > > > > > --- a/gcc/doc/plugins.texi
> > > > > > +++ b/gcc/doc/plugins.texi
> > > > > > @@ -222,6 +222,12 @@ enum plugin_event
> > > > > >       ana::plugin_analyzer_init_iface *.  */
> > > > > >    PLUGIN_ANALYZER_INIT,
> > > > > >
> > > > > > > +  /* Called by the C front end when a COMPONENT_REF node is built.  The
> > > > > > > +     callback receives a pointer to the COMPONENT_REF tree (of type 'tree *').
> > > > > > > +     Plugins may replace the node by assigning through the pointer, but any
> > > > > > > +     replacement must be type-compatible with the original node.  */
> > > > > > +  PLUGIN_BUILD_COMPONENT_REF,
> > > > > > +
> > > > > > >    PLUGIN_EVENT_FIRST_DYNAMIC    /* Dummy event used for indexing callback
> > > > > >                                     array.  */
> > > > > >  @};
> > > > > > diff --git a/gcc/plugin.cc b/gcc/plugin.cc
> > > > > > index 0de2cc2dd2c..975e8c4e291 100644
> > > > > > --- a/gcc/plugin.cc
> > > > > > +++ b/gcc/plugin.cc
> > > > > > @@ -500,6 +500,7 @@ register_callback (const char *plugin_name,
> > > > > >        case PLUGIN_NEW_PASS:
> > > > > >        case PLUGIN_INCLUDE_FILE:
> > > > > >        case PLUGIN_ANALYZER_INIT:
> > > > > > +      case PLUGIN_BUILD_COMPONENT_REF:
> > > > > >          {
> > > > > >            struct callback_info *new_callback;
> > > > > >            if (!callback)
> > > > > > @@ -581,6 +582,7 @@ invoke_plugin_callbacks_full (int event, void 
> > > > > > *gcc_data)
> > > > > >        case PLUGIN_NEW_PASS:
> > > > > >        case PLUGIN_INCLUDE_FILE:
> > > > > >        case PLUGIN_ANALYZER_INIT:
> > > > > > +      case PLUGIN_BUILD_COMPONENT_REF:
> > > > > >          {
> > > > > > >            /* Iterate over every callback registered with this event and
> > > > > >               call it.  */
> > > > > > diff --git a/gcc/plugin.def b/gcc/plugin.def
> > > > > > index 94e012a1e00..b0335178762 100644
> > > > > > --- a/gcc/plugin.def
> > > > > > +++ b/gcc/plugin.def
> > > > > > @@ -103,6 +103,12 @@ DEFEVENT (PLUGIN_INCLUDE_FILE)
> > > > > >     ana::plugin_analyzer_init_iface *.  */
> > > > > >  DEFEVENT (PLUGIN_ANALYZER_INIT)
> > > > > >
> > > > > > +/* Called by the C front end when a COMPONENT_REF node is built.
> > > > > > > +   The callback receives a pointer to the COMPONENT_REF tree (of type 'tree *').
> > > > > > > +   Plugins may replace the node by assigning through the pointer, but any
> > > > > > +   replacement must be type-compatible with the original node.  */
> > > > > > +DEFEVENT (PLUGIN_BUILD_COMPONENT_REF)
> > > > > > +
> > > > > > >  /* When adding a new hard-coded plugin event, don't forget to edit in
> > > > > >     file plugin.cc the functions register_callback and
> > > > > >     invoke_plugin_callbacks_full accordingly!  */
> > > > > > --
> > > > > > 2.43.0
> > > > > >
