On 7/1/25 01:02, Richard Biener wrote:
> On Mon, Jun 30, 2025 at 9:12 PM David Faust <david.fa...@oracle.com> wrote:
>>
>>
>>
>> On 6/30/25 06:11, Richard Biener wrote:
>>>> +static void
>>>> +gen_btf_decl_tag_dies (tree t, dw_die_ref target, dw_die_ref context_die)
>>>> +{
>>>> + if (t == NULL_TREE || !DECL_P (t) || !target)
>>>> + return;
>>>> +
>>>> + tree attr = lookup_attribute ("btf_decl_tag", DECL_ATTRIBUTES (t));
>>>> + if (attr == NULL_TREE)
>>>> + return;
>>>> +
>>>> + gen_btf_tag_dies (attr, target, context_die);
>>>> +
>>>> + /* Strip the decl tag attribute once we have created the annotation DIEs
>>>> + to avoid attempting process it multiple times. Global variable
>>>> + declarations may reach this function more than once. */
>>>> + DECL_ATTRIBUTES (t)
>>>> + = remove_attribute ("btf_decl_tag", DECL_ATTRIBUTES (t));
>>> I do not like modifying trees as part of dwarf2out. You should be able to
>>> see whether a DIE already has the respective attribute applied?
>>
>> Yes, you're right. For decl_tag the case is simple and better handled by
>> consulting the hash table. Simple fix and this remove_attribute can be
>> deleted.
>>
>> Understood re: modifying trees in dwarf2out. I agree it's not ideal.
>>
>> For this case the remove_attribute can be deleted. For the two below,
>> one is already immediately restored and the other could be as well so
>> that there are no lasting changes in the tree at all.
>>
>> I will explain the reasoning some more below.
>>
>>>
>>>> +}
>>>> +
>>>> /* Given a pointer to an arbitrary ..._TYPE tree node, return a debugging
>>>> entry that chains the modifiers specified by CV_QUALS in front of the
>>>> given type. REVERSE is true if the type is to be interpreted in the
>>>> @@ -13674,6 +13894,7 @@ modified_type_die (tree type, int cv_quals, bool
>>>> reverse,
>>>> tree item_type = NULL;
>>>> tree qualified_type;
>>>> tree name, low, high;
>>>> + tree tags;
>>>> dw_die_ref mod_scope;
>>>> struct array_descr_info info;
>>>> /* Only these cv-qualifiers are currently handled. */
>>>> @@ -13783,10 +14004,62 @@ modified_type_die (tree type, int cv_quals, bool
>>>> reverse,
>>>> dquals &= cv_qual_mask;
>>>> if ((dquals & ~cv_quals) != TYPE_UNQUALIFIED
>>>> || (cv_quals == dquals && DECL_ORIGINAL_TYPE (name) != type))
>>>> - /* cv-unqualified version of named type. Just use
>>>> - the unnamed type to which it refers. */
>>>> - return modified_type_die (DECL_ORIGINAL_TYPE (name), cv_quals,
>>>> - reverse, context_die);
>>>> + {
>>>> + tree dtags = lookup_attribute ("btf_type_tag",
>>>> + TYPE_ATTRIBUTES (dtype));
>>>> + if ((tags = lookup_attribute ("btf_type_tag",
>>>> + TYPE_ATTRIBUTES (type)))
>>>> + && !attribute_list_equal (tags, dtags))
>>>> + {
>>>> + /* Use of a typedef with additional btf_type_tags.
>>>> + Create a new typedef DIE to which we can attach the
>>>> + additional type_tag DIEs without disturbing other
>>>> users of
>>>> + the underlying typedef. */
>>>> + dw_die_ref mod_die = modified_type_die (dtype, cv_quals,
>>>> + reverse,
>>>> context_die);
>>>> + mod_die = clone_die (mod_die);
>>>> + add_child_die (comp_unit_die (), mod_die);
>>>> + if (!lookup_type_die (type))
>>>> + equate_type_number_to_die (type, mod_die);
>>>> +
>>>> + /* 'tags' is an accumulated list of type_tag attributes
>>>> + for the typedef'd type on both sides of the typedef.
>>>> + 'dtags' is the set of type_tag attributes only
>>>> appearing
>>>> + in the typedef itself.
>>>> + Find the set of type_tags only on the _use_ of the
>>>> + typedef, i.e. (tags - dtags). By construction these
>>>> + additional type_tags have been chained onto the head
>>>> of
>>>> + the attribute list of the original typedef. */
>>>> + tree t = tags;
>>>> + bool altered_chain = false;
>>>> + while (t)
>>>> + {
>>>> + if (TREE_CHAIN (t) == dtags)
>>>> + {
>>>> + TREE_CHAIN (t) = NULL_TREE;
>>>> + altered_chain = true;
>>>> + break;
>>>> + }
>>>> + t = TREE_CHAIN (t);
>>>> + }
>>> And this might be part of that and should thus split out, or be part
>>> of gen_btf_type_tag_dies itself? I expected the hashtable to
>>> cover the DIEs we have for existing chains of tags?
>>
>> OK, this can move into gen_btf_type_tag_dies.
>>
>> The reason the hash table doesn't help us here is that the uncertainty
>> is in the attribute list itself.
>>
>> Specifically, if the type of a variable is a typedef with an additional
>> type attribute applied, then it is unclear from just the tree node of
>> the type of the variable whether that attribute is "new" on the use of
>> the typedef, or is part of the typedef itself.
>>
>> For example:
>>
>> typedef int __attribute__((btf_type_tag ("tag1"))) foo;
>> foo __attribute__((btf_type_tag ("tag2"))) x;
>>
>> For the typedef, we create a DW_TAG_typedef DIE with name "foo", with
>> AT_type referring to a base type DIE for an integer type with "tag1"
>> attached via AT_GNU_annotation. So far so good.
>>
>> When generating dwarf for variable 'x', the tree node we have for
>> the type of 'x' is an integer type node with name "foo" but the attribute
>> list contains _both_ "tag2" and "tag1". This is what I mean with the
>> comment in the code above about the "accumulated list" from both sides
>> of the typedef. It happens because we have a variant type node of
>> the named type foo created, in order to be a distinct node for
>> (foo + "tag2") for the type of 'x'.
>
> So I view DW_TAG_GNU_annotation as a way to represent attributes
> on types and decls going forward (eventually DWARF will gain sth
> standard here for [[..]] - who knows). As you say we treat attributed
> types as variants, I do not see why the attribute list of the
> variant type for 'x' has both tags?
AFAIU, the attribute list has both tags just because there is not a
distinct type node for the typedef, rather it adds a name to the
integer type node. So there isn't a place in the tree currently to
put the attribute to distinguish it being on one side of the typedef
or the other, they all get lumped into one attribute list.
It's also possible that I am doing something wrong in the FE
attribute handler, but I'm not sure how else the attributes could
be represented in the tree.
> And even if it has, I'd simply let
> it go and have DW_TAG_GNU_annotation with both tags on the
> type DIE and one with just 'tag1' on the typedef DIE for 'foo'.
OK, I can do that. For the current uses of type_tag in the kernel
BPF subsystem, I don't think this case is very relevant.
>
> What about
>
> typedef int __attribute__((btf_type_tag ("tag1"))) foo;
> foo __attribute__((btf_type_tag ("tag2"), btftype_tag ("tag1"))) x;
>
> in your scheme?
The resulting tree node for the type of x here is actually identical
to what we get in the above example when x only has "tag2". We can't
distinguish the two cases from the tree.
If we have both, e.g.
typedef int __attribute__((btf_type_tag ("tag1"))) foo;
foo __attribute__((btf_type_tag ("tag2"), btf_type_tag ("tag1"))) x;
foo __attribute__((btf_type_tag ("tag2"))) y;
Then both var_decl nodes for 'x' and 'y' share exactly the same
node as their type, an integer_type node named "foo" with exactly
one each of "tag1" and "tag2" in the attribute list.
I think that the current attribute handling will not add an attribute
with the same name and value more than once to a given type's
attribute list. That affects us here because there isn't a distinct
attribute list for the typedef. So any variant of "foo" just has
one attribute list that lumps together all the relevant attributes,
which also ends up in effect deduplicating them.
>
> I think dwarf2out needs to emit what is on the actual tree types and
> decls, not what we reverse engineer would be likely present on the
> source. If what the trees represent does not agree with what we like
> to be present in the DWARF we should fix the tree side.
That makes sense to me. I do think that this means putting information
in the tree that isn't necessarily relevant from program-semantics or
C type system view, but allows us to more accurately reflect source
constructs in DWARF.
Modifying the tree nodes like this will probably ultimately be the right
approach. No objection there, nor to working on the implementation.
I am worried though that it could take quite some time to design and fully
implement. The kernel BPF subsystem has real need for these tags in some
form, it's been a couple of years now and they are piling up hacks to work
around GCC not supporting type_tag and decl_tag yet.
So, if the overall design is OK (with a 2nd ack) it would be great if
we can give them something that works, and then work on improving and
extending it.
>
>> IOW, 'dtags' holds "tag1" since it comes from the typedef itself,
>> and 'tags' holds both "tag1" and "tag2". We need to do some extra work
>> looking at the tree to figure out which type_tags are actually on the
>> _use_ of the typedef vs. part of the typdef itself, in order to correctly
>> decide which chains of tags should be associated to which DIE.
>>
>> Note that the attribute chain in the tree is immediately restored
>> after generating the tag dies...
>>>
>>>> +
>>>> + /* Now generate the type_tag DIEs only for the new
>>>> + type_tags appearing in the use of the typedef, and
>>>> + attach them to the cloned typedef DIE. */
>>>> + gen_btf_type_tag_dies (type, tags, mod_die, context_die);
>>>> +
>>>> + /* Restore the list of type_tags. In principle, the same
>>>> + attribute list could be shared by unrelated types. */
>>>> + if (altered_chain)
>>>> + TREE_CHAIN (t) = dtags;
>>
>> ...here. So there are no lasting changes in the tree.
>>
>> It could be done w/o altering the tree at all by duplicating the relevant
>> attribute list and operating on that, but it seems a little wasteful.
>>
>>>> +
>>>> + return mod_die;
>>>> + }
>>>> + /* cv-unqualified version of named type. Just use
>>>> + the unnamed type to which it refers. */
>>>> + return modified_type_die (DECL_ORIGINAL_TYPE (name),
>>>> cv_quals,
>>>> + reverse, context_die);
>>>> + }
>>>> /* Else cv-qualified version of named type; fall through. */
>>>> }
>>>> }
>>>> @@ -13887,6 +14160,17 @@ modified_type_die (tree type, int cv_quals, bool
>>>> reverse,
>>>> first_quals |= dwarf_qual_info[i].q;
>>>> }
>>>> }
>>>> + else if ((tags = lookup_attribute ("btf_type_tag", TYPE_ATTRIBUTES
>>>> (type))))
>>>> + {
>>>> + /* Remove type tags before the recursive call to avoid processing
>>>> the
>>>> + same attribute multiple times. */
>>>> + TYPE_ATTRIBUTES (type) = remove_attribute ("btf_type_tag",
>>>> + TYPE_ATTRIBUTES (type));
>>> See above.
>>
>> The reason for this one is simply the issue of recursion.
>>
>> Doing remove_attribute here is analogous to the cv-qual handling using
>> get_nearest_type_subqualifiers and recursing with the lesser qualified
>> type. In this case the lesser qualified type is the same type without
>> the type_tag attribute(s)
>
> But don't we have a variant type without the attribute? That's the basic
> premise of how modified_type_die works. If we cannot fit attribute
> handling into this scheme then it's not the appropriate "target"? Maybe
> we need to force the attribute to be always applied to a decl then?
> Or make sure there's a type variant with all CV qualifiers but no attributes
> present. Like for
>
> const int __attribute__((btf_type_tag("tag1"))) a;
>
> make sure we have type trees for 'int' (main variant - always present),
> 'const int' (possibly not present variant tree right now?) and
> 'const int __attribute__((btf_type_tag("tag1")))' (the TREE_TYPE of 'a')?
Hmm... right. There should always be a non-attributed variant of the type
somewhere. Rather than remove the attribute and recurse to generate
DWARF for it, it would be better to directly locate that variant and
ensure DWARF is generated for it first. If the non-attributed type is
used, then we duplicate its DIE to use here and attach the annotations.
But, if the non-attributed variant is not used elsewhere in the TU, then
maybe by this point it doesn't exist to find. It sounds a little familiar,
maybe I ran into this originally which led to the remove_attribute...
>
> Otherwise the attribute handling would need to be integrated into the
> "CV" qualification handing of dwarf2out.
>
>> I think the alternative would be to split the type_tags into a separate
>> argument to modified_type_die, similar to cv_quals, which seemed like
>> a much more invasive set of changes to make.
>
> It might be the proper thing if the attribute-less type tree isn't in
> the variant
> set. That is, I'd pass the whole TYPE_ATTRIBUTES as separate qualification
> from cv_quals and to be handled/stripped first, forcing a separate type DIE
> to not interfere with cv_quals handling.
Maybe this will be the best option for now then.
Either way, I agree that the remove_attribute is not good and
should not be necessary. I will look into this more and find a way
to get rid of it.
>
> This is after all laying ground for TYPE_ATTRIBUTES encoding (currently
> nothing in dwarf2out looks at those, almost nothing at DECL_ATTRIBUTES).
>
> Richard.
Thank you for the reviews and feedback so far, it's very helpful :)
>
>>>
>>>> + dw_die_ref mod_die = modified_type_die (type, cv_quals, reverse,
>>>> + context_die);
>>>> + gen_btf_type_tag_dies (type, tags, mod_die, context_die);
>>
>> It should work fine to save TYPE_ATTRIBUTES above and restore it to the
>> original here after the recursive call so that ultimately the tree
>> is unchanged.
>>
>> Would the same approach with that modification be OK?
>>
>>>> + return mod_die;
>>>> + }
>>>> else if (code == POINTER_TYPE || code == REFERENCE_TYPE)
>>>> {
>>>> dwarf_tag tag = DW_TAG_pointer_type;
>>>> @@ -22770,6 +23054,7 @@ gen_array_type_die (tree type, dw_die_ref
>>>> context_die)
>>