Re: [PATCH v4 2/6] dwarf: create annotation DIEs for btf tags

Richard Biener Wed, 02 Jul 2025 00:36:44 -0700

On Tue, Jul 1, 2025 at 11:20 PM David Faust <david.fa...@oracle.com> wrote:
>
>
>
> On 7/1/25 01:02, Richard Biener wrote:
> > On Mon, Jun 30, 2025 at 9:12 PM David Faust <david.fa...@oracle.com> wrote:
> >>
> >>
> >>
> >> On 6/30/25 06:11, Richard Biener wrote:
> >>>> +static void
> >>>> +gen_btf_decl_tag_dies (tree t, dw_die_ref target, dw_die_ref 
> >>>> context_die)
> >>>> +{
> >>>> +  if (t == NULL_TREE || !DECL_P (t) || !target)
> >>>> +    return;
> >>>> +
> >>>> +  tree attr = lookup_attribute ("btf_decl_tag", DECL_ATTRIBUTES (t));
> >>>> +  if (attr == NULL_TREE)
> >>>> +    return;
> >>>> +
> >>>> +  gen_btf_tag_dies (attr, target, context_die);
> >>>> +
> >>>> +  /* Strip the decl tag attribute once we have created the annotation 
> >>>> DIEs
> >>>> +     to avoid attempting process it multiple times.  Global variable
> >>>> +     declarations may reach this function more than once.  */
> >>>> +  DECL_ATTRIBUTES (t)
> >>>> +    = remove_attribute ("btf_decl_tag", DECL_ATTRIBUTES (t));
> >>> I do not like modifying trees as part of dwarf2out.  You should be able to
> >>> see whether a DIE already has the respective attribute applied?
> >>
> >> Yes, you're right. For decl_tag the case is simple and better handled by
> >> consulting the hash table. Simple fix and this remove_attribute can be
> >> deleted.
> >>
> >> Understood re: modifying trees in dwarf2out. I agree it's not ideal.
> >>
> >> For this case the remove_attribute can be deleted. For the two below,
> >> one is already immediately restored and the other could be as well so
> >> that there are no lasting changes in the tree at all.
> >>
> >> I will explain the reasoning some more below.
> >>
> >>>
> >>>> +}
> >>>> +
> >>>>  /* Given a pointer to an arbitrary ..._TYPE tree node, return a 
> >>>> debugging
> >>>>     entry that chains the modifiers specified by CV_QUALS in front of the
> >>>>     given type.  REVERSE is true if the type is to be interpreted in the
> >>>> @@ -13674,6 +13894,7 @@ modified_type_die (tree type, int cv_quals, bool 
> >>>> reverse,
> >>>>    tree item_type = NULL;
> >>>>    tree qualified_type;
> >>>>    tree name, low, high;
> >>>> +  tree tags;
> >>>>    dw_die_ref mod_scope;
> >>>>    struct array_descr_info info;
> >>>>    /* Only these cv-qualifiers are currently handled.  */
> >>>> @@ -13783,10 +14004,62 @@ modified_type_die (tree type, int cv_quals, 
> >>>> bool reverse,
> >>>>           dquals &= cv_qual_mask;
> >>>>           if ((dquals & ~cv_quals) != TYPE_UNQUALIFIED
> >>>>               || (cv_quals == dquals && DECL_ORIGINAL_TYPE (name) != 
> >>>> type))
> >>>> -           /* cv-unqualified version of named type.  Just use
> >>>> -              the unnamed type to which it refers.  */
> >>>> -           return modified_type_die (DECL_ORIGINAL_TYPE (name), 
> >>>> cv_quals,
> >>>> -                                     reverse, context_die);
> >>>> +           {
> >>>> +             tree dtags = lookup_attribute ("btf_type_tag",
> >>>> +                                            TYPE_ATTRIBUTES (dtype));
> >>>> +             if ((tags = lookup_attribute ("btf_type_tag",
> >>>> +                                           TYPE_ATTRIBUTES (type)))
> >>>> +                 && !attribute_list_equal (tags, dtags))
> >>>> +               {
> >>>> +                 /* Use of a typedef with additional btf_type_tags.
> >>>> +                    Create a new typedef DIE to which we can attach the
> >>>> +                    additional type_tag DIEs without disturbing other 
> >>>> users of
> >>>> +                    the underlying typedef.  */
> >>>> +                 dw_die_ref mod_die = modified_type_die (dtype, 
> >>>> cv_quals,
> >>>> +                                                         reverse, 
> >>>> context_die);
> >>>> +                 mod_die = clone_die (mod_die);
> >>>> +                 add_child_die (comp_unit_die (), mod_die);
> >>>> +                 if (!lookup_type_die (type))
> >>>> +                   equate_type_number_to_die (type, mod_die);
> >>>> +
> >>>> +                 /* 'tags' is an accumulated list of type_tag attributes
> >>>> +                    for the typedef'd type on both sides of the typedef.
> >>>> +                    'dtags' is the set of type_tag attributes only 
> >>>> appearing
> >>>> +                    in the typedef itself.
> >>>> +                    Find the set of type_tags only on the _use_ of the
> >>>> +                    typedef, i.e. (tags - dtags).  By construction these
> >>>> +                    additional type_tags have been chained onto the 
> >>>> head of
> >>>> +                    the attribute list of the original typedef.  */
> >>>> +                 tree t = tags;
> >>>> +                 bool altered_chain = false;
> >>>> +                 while (t)
> >>>> +                   {
> >>>> +                     if (TREE_CHAIN (t) == dtags)
> >>>> +                       {
> >>>> +                         TREE_CHAIN (t) = NULL_TREE;
> >>>> +                         altered_chain = true;
> >>>> +                         break;
> >>>> +                       }
> >>>> +                     t = TREE_CHAIN (t);
> >>>> +                   }
> >>> And this might be part of that and should thus split out, or be part
> >>> of gen_btf_type_tag_dies itself?  I expected the hashtable to
> >>> cover the DIEs we have for existing chains of tags?
> >>
> >> OK, this can move into gen_btf_type_tag_dies.
> >>
> >> The reason the hash table doesn't help us here is that the uncertainty
> >> is in the attribute list itself.
> >>
> >> Specifically, if the type of a variable is a typedef with an additional
> >> type attribute applied, then it is unclear from just the tree node of
> >> the type of the variable whether that attribute is "new" on the use of
> >> the typedef, or is part of the typedef itself.
> >>
> >> For example:
> >>
> >>   typedef int __attribute__((btf_type_tag ("tag1"))) foo;
> >>   foo __attribute__((btf_type_tag ("tag2"))) x;
> >>
> >> For the typedef, we create a DW_TAG_typedef DIE with name "foo", with
> >> AT_type referring to a base type DIE for an integer type with "tag1"
> >> attached via AT_GNU_annotation.  So far so good.
> >>
> >> When generating dwarf for variable 'x', the tree node we have for
> >> the type of 'x' is an integer type node with name "foo" but the attribute
> >> list contains _both_ "tag2" and "tag1".  This is what I mean with the
> >> comment in the code above about the "accumulated list" from both sides
> >> of the typedef.  It happens because we have a variant type node of
> >> the named type foo created, in order to be a distinct node for
> >> (foo + "tag2") for the type of 'x'.
> >
> > So I view DW_TAG_GNU_annotation as a way to represent attributes
> > on types and decls going forward (eventually DWARF will gain sth
> > standard here for [[..]] - who knows).  As you say we treat attributed
> > types as variants, I do not see why the attribute list of the
> > variant type for 'x' has both tags?
> AFAIU, the attribute list has both tags just because there is not a
> distinct type node for the typedef, rather it adds a name to the
> integer type node.  So there isn't a place in the tree currently to
> put the attribute to distinguish it being on one side of the typedef
> or the other, they all get lumped into one attribute list.


I see.  So

typedef int __attribute__((btf_type_tag ("tag1"))) foo;
typedef int __attribute__((btf_type_tag ("tag2"))) bar;

gets us two identical tagged types?  Or is it that

foo __attribute__((btf_type_tag ("tag2"))) x;

fails to generate a type variant for applying the attribute to?  And
thus a followup

foo y;

will also be tag2-tagged?

It feels like we must get one of those cases "wrong" if it is like
you say?

> It's also possible that I am doing something wrong in the FE
> attribute handler, but I'm not sure how else the attributes could
> be represented in the tree.

I'll see in the other patch if I spot something.

> > And even if it has, I'd simply let
> > it go and have DW_TAG_GNU_annotation with both tags on the
> > type DIE and one with just 'tag1' on the typedef DIE for 'foo'.
>
> OK, I can do that.  For the current uses of type_tag in the kernel
> BPF subsystem, I don't think this case is very relevant.
>
> >
> > What about
> >
> > typedef int __attribute__((btf_type_tag ("tag1"))) foo;
> > foo __attribute__((btf_type_tag ("tag2"), btftype_tag ("tag1"))) x;
> >
> > in your scheme?
>
> The resulting tree node for the type of x here is actually identical
> to what we get in the above example when x only has "tag2". We can't
> distinguish the two cases from the tree.
>
> If we have both, e.g.
>
> typedef int __attribute__((btf_type_tag ("tag1"))) foo;
> foo __attribute__((btf_type_tag ("tag2"), btf_type_tag ("tag1"))) x;
> foo __attribute__((btf_type_tag ("tag2"))) y;
>
> Then both var_decl nodes for 'x' and 'y' share exactly the same
> node as their type, an integer_type node named "foo" with exactly
> one each of "tag1" and "tag2" in the attribute list.
>
> I think that the current attribute handling will not add an attribute
> with the same name and value more than once to a given type's
> attribute list.  That affects us here because there isn't a distinct
> attribute list for the typedef. So any variant of "foo" just has
> one attribute list that lumps together all the relevant attributes,
> which also ends up in effect deduplicating them.

OK, I see.  I think given it's imperfect on the tree side we should
(at least initially) not worry about the DWARF side getting things
"wrong".

> >
> > I think dwarf2out needs to emit what is on the actual tree types and
> > decls, not what we reverse engineer would be likely present on the
> > source.  If what the trees represent does not agree with what we like
> > to be present in the DWARF we should fix the tree side.
>
> That makes sense to me.  I do think that this means putting information
> in the tree that isn't necessarily relevant from program-semantics or
> C type system view, but allows us to more accurately reflect source
> constructs in DWARF.
>
> Modifying the tree nodes like this will probably ultimately be the right
> approach. No objection there, nor to working on the implementation.
>
> I am worried though that it could take quite some time to design and fully
> implement. The kernel BPF subsystem has real need for these tags in some
> form, it's been a couple of years now and they are piling up hacks to work
> around GCC not supporting type_tag and decl_tag yet.
>
> So, if the overall design is OK (with a 2nd ack) it would be great if
> we can give them something that works, and then work on improving and
> extending it.

I think the overall design is OK.

> >
> >> IOW, 'dtags' holds "tag1" since it comes from the typedef itself,
> >> and 'tags' holds both "tag1" and "tag2".  We need to do some extra work
> >> looking at the tree to figure out which type_tags are actually on the
> >> _use_ of the typedef vs. part of the typdef itself, in order to correctly
> >> decide which chains of tags should be associated to which DIE.
> >>
> >> Note that the attribute chain in the tree is immediately restored
> >> after generating the tag dies...
> >>>
> >>>> +
> >>>> +                 /* Now generate the type_tag DIEs only for the new
> >>>> +                    type_tags appearing in the use of the typedef, and
> >>>> +                    attach them to the cloned typedef DIE.  */
> >>>> +                 gen_btf_type_tag_dies (type, tags, mod_die, 
> >>>> context_die);
> >>>> +
> >>>> +                 /* Restore the list of type_tags.  In principle, the 
> >>>> same
> >>>> +                    attribute list could be shared by unrelated types.  
> >>>> */
> >>>> +                 if (altered_chain)
> >>>> +                   TREE_CHAIN (t) = dtags;
> >>
> >> ...here.  So there are no lasting changes in the tree.
> >>
> >> It could be done w/o altering the tree at all by duplicating the relevant
> >> attribute list and operating on that, but it seems a little wasteful.
> >>
> >>>> +
> >>>> +                 return mod_die;
> >>>> +               }
> >>>> +             /* cv-unqualified version of named type.  Just use
> >>>> +                the unnamed type to which it refers.  */
> >>>> +             return modified_type_die (DECL_ORIGINAL_TYPE (name), 
> >>>> cv_quals,
> >>>> +                                       reverse, context_die);
> >>>> +           }
> >>>>           /* Else cv-qualified version of named type; fall through.  */
> >>>>         }
> >>>>      }
> >>>> @@ -13887,6 +14160,17 @@ modified_type_die (tree type, int cv_quals, 
> >>>> bool reverse,
> >>>>             first_quals |= dwarf_qual_info[i].q;
> >>>>           }
> >>>>      }
> >>>> +  else if ((tags = lookup_attribute ("btf_type_tag", TYPE_ATTRIBUTES 
> >>>> (type))))
> >>>> +    {
> >>>> +      /* Remove type tags before the recursive call to avoid processing 
> >>>> the
> >>>> +        same attribute multiple times.  */
> >>>> +      TYPE_ATTRIBUTES (type) = remove_attribute ("btf_type_tag",
> >>>> +                                                TYPE_ATTRIBUTES (type));
> >>> See above.
> >>
> >> The reason for this one is simply the issue of recursion.
> >>
> >> Doing remove_attribute here is analogous to the cv-qual handling using
> >> get_nearest_type_subqualifiers and recursing with the lesser qualified
> >> type. In this case the lesser qualified type is the same type without
> >> the type_tag attribute(s)
> >
> > But don't we have a variant type without the attribute?  That's the basic
> > premise of how modified_type_die works.  If we cannot fit attribute
> > handling into this scheme then it's not the appropriate "target"?  Maybe
> > we need to force the attribute to be always applied to a decl then?
> > Or make sure there's a type variant with all CV qualifiers but no attributes
> > present.  Like for
> >
> > const int __attribute__((btf_type_tag("tag1"))) a;
> >
> > make sure we have type trees for 'int' (main variant - always present),
> > 'const int' (possibly not present variant tree right now?) and
> > 'const int __attribute__((btf_type_tag("tag1")))' (the TREE_TYPE of 'a')?
>
> Hmm... right. There should always be a non-attributed variant of the type
> somewhere. Rather than remove the attribute and recurse to generate
> DWARF for it, it would be better to directly locate that variant and
> ensure DWARF is generated for it first.  If the non-attributed type is
> used, then we duplicate its DIE to use here and attach the annotations.
>
> But, if the non-attributed variant is not used elsewhere in the TU, then
> maybe by this point it doesn't exist to find. It sounds a little familiar,
> maybe I ran into this originally which led to the remove_attribute...

Quite possibly.

> >
> > Otherwise the attribute handling would need to be integrated into the
> > "CV" qualification handing of dwarf2out.
> >
> >> I think the alternative would be to split the type_tags into a separate
> >> argument to modified_type_die, similar to cv_quals, which seemed like
> >> a much more invasive set of changes to make.
> >
> > It might be the proper thing if the attribute-less type tree isn't in
> > the variant
> > set.  That is, I'd pass the whole TYPE_ATTRIBUTES as separate qualification
> > from cv_quals and to be handled/stripped first, forcing a separate type DIE
> > to not interfere with cv_quals handling.
>
> Maybe this will be the best option for now then.
>
> Either way, I agree that the remove_attribute is not good and
> should not be necessary.  I will look into this more and find a way
> to get rid of it.

Thanks!

> >
> > This is after all laying ground for TYPE_ATTRIBUTES encoding (currently
> > nothing in dwarf2out looks at those, almost nothing at DECL_ATTRIBUTES).
> >
> > Richard.
>
> Thank you for the reviews and feedback so far, it's very helpful :)
>
> >
> >>>
> >>>> +      dw_die_ref mod_die = modified_type_die (type, cv_quals, reverse,
> >>>> +                                             context_die);
> >>>> +      gen_btf_type_tag_dies (type, tags, mod_die, context_die);
> >>
> >> It should work fine to save TYPE_ATTRIBUTES above and restore it to the
> >> original here after the recursive call so that ultimately the tree
> >> is unchanged.
> >>
> >> Would the same approach with that modification be OK?
> >>
> >>>> +      return mod_die;
> >>>> +    }
> >>>>    else if (code == POINTER_TYPE || code == REFERENCE_TYPE)
> >>>>      {
> >>>>        dwarf_tag tag = DW_TAG_pointer_type;
> >>>> @@ -22770,6 +23054,7 @@ gen_array_type_die (tree type, dw_die_ref 
> >>>> context_die)
> >>
>

Re: [PATCH v4 2/6] dwarf: create annotation DIEs for btf tags

Reply via email to