On 5/26/22 00:29, Yonghong Song wrote:
> 
> 
> On 5/24/22 10:04 AM, David Faust wrote:
>>
>>
>> On 5/24/22 09:03, Yonghong Song wrote:
>>>
>>>
>>> On 5/24/22 8:53 AM, David Faust wrote:
>>>>
>>>>
>>>> On 5/24/22 04:07, Jose E. Marchesi wrote:
>>>>>
>>>>>> On 5/11/22 11:44 AM, David Faust wrote:
>>>>>>>
>>>>>>> On 5/10/22 22:05, Yonghong Song wrote:
>>>>>>>>
>>>>>>>>
>>>>>>>> On 5/10/22 8:43 PM, Yonghong Song wrote:
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> On 5/6/22 2:18 PM, David Faust wrote:
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> On 5/5/22 16:00, Yonghong Song wrote:
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> On 5/4/22 10:03 AM, David Faust wrote:
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> On 5/3/22 15:32, Joseph Myers wrote:
>>>>>>>>>>>>> On Mon, 2 May 2022, David Faust via Gcc-patches wrote:
>>>>>>>>>>>>>
>>>>>>>>>>>>>> Consider the following example:
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>         #define __typetag1 __attribute__((btf_type_tag("tag1")))
>>>>>>>>>>>>>>         #define __typetag2 __attribute__((btf_type_tag("tag2")))
>>>>>>>>>>>>>>         #define __typetag3 __attribute__((btf_type_tag("tag3")))
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>         int __typetag1 * __typetag2 __typetag3 * g;
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> The expected behavior is that 'g' is "a pointer with tags 'tag2' and
>>>>>>>>>>>>>> 'tag3', to a pointer with tag 'tag1' to an int". i.e.:
>>>>>>>>>>>>>
>>>>>>>>>>>>> That's not a correct expectation for either GNU __attribute__ or C2x [[]]
>>>>>>>>>>>>> attribute syntax.  In either syntax, __typetag2 __typetag3 should apply to
>>>>>>>>>>>>> the type to which g points, not to g or its type, just as if you had a
>>>>>>>>>>>>> type qualifier there.  You'd need to put the attributes (or qualifier)
>>>>>>>>>>>>> after the *, not before, to make them apply to the pointer type.  See
>>>>>>>>>>>>> "Attribute Syntax" in the GCC manual for how the syntax is defined for GNU
>>>>>>>>>>>>> attributes and deduce in turn, for each subsequence of the tokens matching
>>>>>>>>>>>>> the syntax for some kind of declarator, what the type for "T D1" would be
>>>>>>>>>>>>> as defined there and in the C standard, as deduced from the type for "T D"
>>>>>>>>>>>>> for a sub-declarator D.
>>>>>>>>>>>>>
>>>>>>>>>>>>>> But GCC's attribute parsing produces a variable 'g' which is "a
>>>>>>>>>>>>>> pointer with tag 'tag1' to a pointer with tags 'tag2' and 'tag3'
>>>>>>>>>>>>>> to an int", i.e.
>>>>>>>>>>>>>
>>>>>>>>>>>>> In GNU syntax, __typetag1 applies to the declaration, whereas in C2x
>>>>>>>>>>>>> syntax it applies to int.  Again, if you wanted it to apply to the
>>>>>>>>>>>>> pointer type it would need to go after the * not before.
>>>>>>>>>>>>>
>>>>>>>>>>>>> If you are concerned with the fine details of what construct an
>>>>>>>>>>>>> attribute
>>>>>>>>>>>>> appertains to, I recommend using C2x syntax not GNU syntax.
>>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> Joseph, thank you! This is very helpful. My understanding of
>>>>>>>>>>>> the syntax
>>>>>>>>>>>> was not correct.
>>>>>>>>>>>>
>>>>>>>>>>>> (Actually, I made a bad mistake in paraphrasing this example from 
>>>>>>>>>>>> the
>>>>>>>>>>>> discussion of it in the series cover letter. But, the reason
>>>>>>>>>>>> why it is
>>>>>>>>>>>> incorrect is the same.)
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> Yonghong, is the specific ordering an expectation in BPF programs 
>>>>>>>>>>>> or
>>>>>>>>>>>> other users of the tags?
>>>>>>>>>>>
>>>>>>>>>>> This is probably a wording issue. We have been saying that tags only
>>>>>>>>>>> apply to pointers. We should probably say they only apply to the pointee.
>>>>>>>>>>>
>>>>>>>>>>> $ cat t.c
>>>>>>>>>>> int const *ptr;
>>>>>>>>>>>
>>>>>>>>>>> the llvm ir debuginfo:
>>>>>>>>>>>
>>>>>>>>>>> !5 = !DIDerivedType(tag: DW_TAG_pointer_type, baseType: !6, size: 
>>>>>>>>>>> 64)
>>>>>>>>>>> !6 = !DIDerivedType(tag: DW_TAG_const_type, baseType: !7)
>>>>>>>>>>> !7 = !DIBasicType(name: "int", size: 32, encoding: DW_ATE_signed)
>>>>>>>>>>>
>>>>>>>>>>> We could replace 'const' with a tag like below:
>>>>>>>>>>>
>>>>>>>>>>> int __attribute__((btf_type_tag("tag"))) *ptr;
>>>>>>>>>>>
>>>>>>>>>>> !5 = !DIDerivedType(tag: DW_TAG_pointer_type, baseType: !6, size: 
>>>>>>>>>>> 64,
>>>>>>>>>>> annotations: !7)
>>>>>>>>>>> !6 = !DIBasicType(name: "int", size: 32, encoding: DW_ATE_signed)
>>>>>>>>>>> !7 = !{!8}
>>>>>>>>>>> !8 = !{!"btf_type_tag", !"tag"}
>>>>>>>>>>>
>>>>>>>>>>> In the above IR, we attach annotations to the pointer_type because
>>>>>>>>>>> we didn't invent a new DI type to encode btf_type_tag. But it is
>>>>>>>>>>> totally okay to have IR that looks like
>>>>>>>>>>>
>>>>>>>>>>> !5 = !DIDerivedType(tag: DW_TAG_pointer_type, baseType: !11, size: 
>>>>>>>>>>> 64)
>>>>>>>>>>> !11 = !DIBtfTypeTagType(..., baseType: !6, name: !"Tag")
>>>>>>>>>>> !6 = !DIBasicType(name: "int", size: 32, encoding: DW_ATE_signed)
>>>>>>>>>>>
>>>>>>>>>> OK, thanks.
>>>>>>>>>>
>>>>>>>>>> There is still the question of why the DWARF generated for this case
>>>>>>>>>> that I have been concerned about:
>>>>>>>>>>
>>>>>>>>>>       int __typetag1 * __typetag2 __typetag3 * g;
>>>>>>>>>>
>>>>>>>>>> differs between GCC (with this series) and clang. After studying it,
>>>>>>>>>> GCC is doing with the attributes exactly as is described in the
>>>>>>>>>> Attribute Syntax portion of the GCC manual where the GNU syntax is
>>>>>>>>>> described. I do not think there is any problem here.
>>>>>>>>>>
>>>>>>>>>> So the difference in DWARF suggests to me that clang is not handling
>>>>>>>>>> the GNU attribute syntax in this particular case correctly, since it
>>>>>>>>>> seems to be associating __typetag2 and __typetag3 to g's type rather
>>>>>>>>>> than the type to which it points.
>>>>>>>>>>
>>>>>>>>>> I am not sure whether for the use purposes of the tags this 
>>>>>>>>>> difference
>>>>>>>>>> is very important, but it is worth noting.
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> As Joseph suggested, it may be better to encourage users of these 
>>>>>>>>>> tags
>>>>>>>>>> to use the C2x attribute syntax if they are concerned with precisely
>>>>>>>>>> which construct the tag applies.
>>>>>>>>>>
>>>>>>>>>> This would also be a way around any issues in handling the attributes
>>>>>>>>>> due to the GNU syntax.
>>>>>>>>>>
>>>>>>>>>> I tried a few test cases using C2x syntax BTF type tags with a
>>>>>>>>>> clang-15 build, but ran into some issues (in particular, some of the
>>>>>>>>>> tag attributes being ignored altogether). I couldn't find 
>>>>>>>>>> confirmation
>>>>>>>>>> whether C2x attribute syntax is fully supported in clang yet, so 
>>>>>>>>>> maybe
>>>>>>>>>> this isn't expected to work. Do you know whether the C2x syntax is
>>>>>>>>>> fully supported in clang yet?
>>>>>>>>>
>>>>>>>>> Actually, I don't know either. But the btf decl_tag and type_tag
>>>>>>>>> attributes are also used when compiling the Linux kernel, and the
>>>>>>>>> minimum compiler versions for building the kernel are gcc 5.1 and
>>>>>>>>> clang 11. I am not sure whether gcc 5.1 supports c2x syntax; I guess
>>>>>>>>> probably not. So I think we most likely cannot use c2x syntax.
>>>>>>>>
>>>>>>>> Okay, I think we can guard the btf_tag attributes with newer compiler
>>>>>>>> versions. What kind of c2x syntax do you intend to use? I can help
>>>>>>>> compile the kernel with that syntax and llvm15 to see what the issue
>>>>>>>> is, and may help fix it in clang if possible.
>>>>>>>
>>>>>>> I am thinking to use the [[]] C2x standard attribute syntax. The
>>>>>>> syntax makes it quite clear to which entity each attribute applies,
>>>>>>> and in my opinion is a little more intuitive/less surprising too.
>>>>>>> It's documented here (PDF):
>>>>>>>      https://www.open-std.org/jtc1/sc22/wg14/www/docs/n2731.pdf
>>>>>>> See sections 6.7.11 for the syntax and 6.7.6 for
>>>>>>> declarations. Section 6.7.6.1 specifically describes using the
>>>>>>> attribute syntax with pointer declarators.
>>>>>>> The attribute syntax itself for BTF tags is:
>>>>>>>      [[clang::btf_type_tag("tag1")]]
>>>>>>> or
>>>>>>>      [[gnu::btf_type_tag("tag1")]]
>>>>>>>
>>>>>>> I am also looking into whether, with the C2x syntax, we really need two
>>>>>>> separate attributes (type_tag and decl_tag) at the language
>>>>>>> level. It might be possible with C2x syntax to use just one language
>>>>>>> attribute (e.g. just btf_tag).
>>>>>>>
>>>>>>> A simple declaration for a tagged pointer to an int:
>>>>>>>      int * [[gnu::btf_type_tag("tag1")]] x;
>>>>>>> And for the example from this thread:
>>>>>>>      #define __typetag1 [[gnu::btf_type_tag("type-tag-1")]]
>>>>>>>      #define __typetag2 [[gnu::btf_type_tag("type-tag-2")]]
>>>>>>>      #define __typetag3 [[gnu::btf_type_tag("type-tag-3")]]
>>>>>>>      int * __typetag1 * __typetag2 __typetag3 g;
>>>>>>> Here each tag applies to the preceding pointer, so the result is
>>>>>>> unsurprising.
>>>>>>> Actually, this is where I found something that looks like an issue
>>>>>>> with the C2x attribute syntax in clang. The tags 2 and 3 go missing,
>>>>>>> but with no warning nor other indication.
>>>>>>> Compiling this example with gcc:
>>>>>>> $ ~/toolchains/bpf/bin/bpf-unknown-none-gcc -c -gbtf -gdwarf c2x.c
>>>>>>> -o c2x.o --std=c2x
>>>>>>> $ ~/toolchains/llvm/bin/llvm-dwarfdump c2x.o
>>>>>>> 0x0000000c: DW_TAG_compile_unit
>>>>>>>                  DW_AT_producer    ("GNU C2X 12.0.1 20220401
>>>>>>> (experimental) -gbtf -gdwarf -std=c2x")
>>>>>>>                  DW_AT_language    (DW_LANG_C11)
>>>>>>>                  DW_AT_name    ("c2x.c")
>>>>>>>                  DW_AT_comp_dir    ("/home/dfaust/playpen/btf/tags")
>>>>>>>                  DW_AT_stmt_list    (0x00000000)
>>>>>>> 0x0000001e:   DW_TAG_variable
>>>>>>>                    DW_AT_name    ("g")
>>>>>>>                    DW_AT_decl_file    
>>>>>>> ("/home/dfaust/playpen/btf/tags/c2x.c")
>>>>>>>                    DW_AT_decl_line    (16)
>>>>>>>                    DW_AT_decl_column    (0x2a)
>>>>>>>                    DW_AT_type    (0x00000032 "int **")
>>>>>>>                    DW_AT_external    (true)
>>>>>>>                    DW_AT_location    (DW_OP_addr 0x0)
>>>>>>> 0x00000032:   DW_TAG_pointer_type
>>>>>>>                    DW_AT_byte_size    (8)
>>>>>>>                    DW_AT_type    (0x0000004e "int *")
>>>>>>>                    DW_AT_sibling    (0x0000004e)
>>>>>>> 0x0000003b:     DW_TAG_LLVM_annotation
>>>>>>>                      DW_AT_name    ("btf_type_tag")
>>>>>>>                      DW_AT_const_value    ("type-tag-3")
>>>>>>> 0x00000044:     DW_TAG_LLVM_annotation
>>>>>>>                      DW_AT_name    ("btf_type_tag")
>>>>>>>                      DW_AT_const_value    ("type-tag-2")
>>>>>>> 0x0000004d:     NULL
>>>>>>> 0x0000004e:   DW_TAG_pointer_type
>>>>>>>                    DW_AT_byte_size    (8)
>>>>>>>                    DW_AT_type    (0x00000061 "int")
>>>>>>>                    DW_AT_sibling    (0x00000061)
>>>>>>> 0x00000057:     DW_TAG_LLVM_annotation
>>>>>>>                      DW_AT_name    ("btf_type_tag")
>>>>>>>                      DW_AT_const_value    ("type-tag-1")
>>>>>>> 0x00000060:     NULL
>>>>>>> 0x00000061:   DW_TAG_base_type
>>>>>>>                    DW_AT_byte_size    (0x04)
>>>>>>>                    DW_AT_encoding    (DW_ATE_signed)
>>>>>>>                    DW_AT_name    ("int")
>>>>>>> 0x00000068:   NULL
>>>>>>>
>>>>>>> and with clang (changing the attribute prefix to clang:: appropriately):
>>>>>>> $ ~/toolchains/llvm/bin/clang -target bpf -g -c c2x.c -o c2x.o.ll
>>>>>>> --std=c2x
>>>>>>> $ ~/toolchains/llvm/bin/llvm-dwarfdump c2x.o.ll
>>>>>>> 0x0000000c: DW_TAG_compile_unit
>>>>>>>                  DW_AT_producer    ("clang version 15.0.0
>>>>>>> (https://github.com/llvm/llvm-project.git
>>>>>>> f80e369f61ebd33dd9377bb42fcab64d17072b18)")
>>>>>>>                  DW_AT_language    (DW_LANG_C99)
>>>>>>>                  DW_AT_name    ("c2x.c")
>>>>>>>                  DW_AT_str_offsets_base    (0x00000008)
>>>>>>>                  DW_AT_stmt_list    (0x00000000)
>>>>>>>                  DW_AT_comp_dir    ("/home/dfaust/playpen/btf/tags")
>>>>>>>                  DW_AT_addr_base    (0x00000008)
>>>>>>> 0x0000001e:   DW_TAG_variable
>>>>>>>                    DW_AT_name    ("g")
>>>>>>>                    DW_AT_type    (0x00000029 "int **")
>>>>>>>                    DW_AT_external    (true)
>>>>>>>                    DW_AT_decl_file    
>>>>>>> ("/home/dfaust/playpen/btf/tags/c2x.c")
>>>>>>>                    DW_AT_decl_line    (12)
>>>>>>>                    DW_AT_location    (DW_OP_addrx 0x0)
>>>>>>> 0x00000029:   DW_TAG_pointer_type
>>>>>>>                    DW_AT_type    (0x00000032 "int *")
>>>>>>> 0x0000002e:     DW_TAG_LLVM_annotation
>>>>>>>                      DW_AT_name    ("btf_type_tag")
>>>>>>>                      DW_AT_const_value    ("type-tag-1")
>>>>>>> 0x00000031:     NULL
>>>>>>> 0x00000032:   DW_TAG_pointer_type
>>>>>>>                    DW_AT_type    (0x00000037 "int")
>>>>>>> 0x00000037:   DW_TAG_base_type
>>>>>>>                    DW_AT_name    ("int")
>>>>>>>                    DW_AT_encoding    (DW_ATE_signed)
>>>>>>>                    DW_AT_byte_size    (0x04)
>>>>>>> 0x0000003b:   NULL
>>>>>>
>>>>>> Thanks. I checked with current clang. The generated code looks
>>>>>> like above. Basically, for code like below
>>>>>>
>>>>>>      #define __typetag1 [[clang::btf_type_tag("type-tag-1")]]
>>>>>>      #define __typetag2 [[clang::btf_type_tag("type-tag-2")]]
>>>>>>      #define __typetag3 [[clang::btf_type_tag("type-tag-3")]]
>>>>>>
>>>>>>      int * __typetag1 * __typetag2 __typetag3 g;
>>>>>>
>>>>>> The IR type looks like
>>>>>>      __typetag3 -> __typetag2 -> * (ptr1) -> __typetag1 -> * (ptr2) -> 
>>>>>> int
>>>>>>
>>>>>> The IR is similar to what we did if using
>>>>>> __attribute__((btf_type_tag(""))), but their
>>>>>> semantic interpretation is quite different.
>>>>>> For example, with c2x format,
>>>>>>      __typetag1 applies to ptr2
>>>>>> with __attribute__ format, it applies to the pointee of ptr1.
>>>>>>
>>>>>> But more importantly, c2x format is incompatible with
>>>>>> the usage of linux kernel. The following are a bunch of kernel
>>>>>> __user usages. Here, __user intends to be replaced with a btf_type_tag.
>>>>>>
>>>>>> vfio_pci_core.h:        ssize_t (*rw)(struct vfio_pci_core_device
>>>>>> *vdev, char __user *buf,
>>>>>> vfio_pci_core.h:                                  char __user *buf,
>>>>>> size_t count,
>>>>>> vfio_pci_core.h:extern ssize_t vfio_pci_bar_rw(struct
>>>>>> vfio_pci_core_device *vdev, char __user *buf,
>>>>>> vfio_pci_core.h:extern ssize_t vfio_pci_vga_rw(struct
>>>>>> vfio_pci_core_device *vdev, char __user *buf,
>>>>>> vfio_pci_core.h:                                      char __user
>>>>>> *buf, size_t count,
>>>>>> vfio_pci_core.h:                                void __user *arg,
>>>>>> size_t argsz);
>>>>>> vfio_pci_core.h:ssize_t vfio_pci_core_read(struct vfio_device
>>>>>> *core_vdev, char __user *buf,
>>>>>> vfio_pci_core.h:ssize_t vfio_pci_core_write(struct vfio_device
>>>>>> *core_vdev, const char __user *buf,
>>>>>> vringh.h:                    vring_desc_t __user *desc,
>>>>>> vringh.h:                    vring_avail_t __user *avail,
>>>>>> vringh.h:                    vring_used_t __user *used);
>>>>>> vt_kern.h:int con_set_cmap(unsigned char __user *cmap);
>>>>>> vt_kern.h:int con_get_cmap(unsigned char __user *cmap);
>>>>>> vt_kern.h:int con_set_trans_old(unsigned char __user * table);
>>>>>> vt_kern.h:int con_get_trans_old(unsigned char __user * table);
>>>>>> vt_kern.h:int con_set_trans_new(unsigned short __user * table);
>>>>>> vt_kern.h:int con_get_trans_new(unsigned short __user * table);
>>>>>>
>>>>>> As you can see, we will not be able to simply replace __user
>>>>>> with [[clang::btf_type_tag("user")]] because it won't work
>>>>>> according to c2x expectations.
>>>>
>>>> Hi,
>>>>
>>>> Thanks for checking this. I see that we probably cannot use the c2x
>>>> syntax in the kernel, since it will not work as a drop-in replacement
>>>> for the current uses.
>>>>
>>>>>
>>>>> Hi Yonghong.
>>>>>
>>>>> I am a bit confused regarding the GNU attributes problem: our patch
>>>>> supports it, but as David already noted:
>>>>>
>>>>>>>>> There is still the question of why the DWARF generated for this case
>>>>>>>>> that I have been concerned about:
>>>>>>>>>
>>>>>>>>>      int __typetag1 * __typetag2 __typetag3 * g;
>>>>>>>>>
>>>>>>>>> differs between GCC (with this series) and clang. After studying it,
>>>>>>>>> GCC is doing with the attributes exactly as is described in the
>>>>>>>>> Attribute Syntax portion of the GCC manual where the GNU syntax is
>>>>>>>>> described. I do not think there is any problem here.
>>>>>>>>>
>>>>>>>>> So the difference in DWARF suggests to me that clang is not handling
>>>>>>>>> the GNU attribute syntax in this particular case correctly, since it
>>>>>>>>> seems to be associating __typetag2 and __typetag3 to g's type rather
>>>>>>>>> than the type to which it points.
>>>>>
>>>>> Note the example he uses is:
>>>>>
>>>>>     (a) int __typetag1 * __typetag2 __typetag3 * g;
>>>>>
>>>>> Not
>>>>>
>>>>>     (b) int * __typetag1 * __typetag2 __typetag3 g;
>>>>>
>>>>> Apparently for (a) clang is generating DWARF that associates __typetag2
>>>>> and __typetag3 with g's type (the pointer to pointer) instead of the
>>>>> pointer to int, which contravenes the GNU syntax rules.
>>>>>
>>>>> AFAIK that is where the DWARF we generate differs, and what is blocking
>>>>> us.  David will correct me in the likely case I'm wrong :)
>>>>
>>>> Right. This is what I hoped maybe the C2x syntax could resolve.
>>>>
>>>> The issue I saw is that in the case (a) above, when using the GNU
>>>> attribute syntax, GCC and clang produce different results. I think that
>>>> the underlying cause is some subtle difference in how clang is handling
>>>> the GNU attribute syntax in the case compared to GCC.
>>>>
>>>>
>>>> To remind ourselves, here is the full example. Notice the significant
>>>> difference in which objects the tags are associated with in DWARF.
>>>>
>>>>
>>>> #define __typetag1 __attribute__((btf_type_tag("type-tag-1")))
>>>> #define __typetag2 __attribute__((btf_type_tag("type-tag-2")))
>>>> #define __typetag3 __attribute__((btf_type_tag("type-tag-3")))
>>>>
>>>> int __typetag1 * __typetag2 __typetag3 * g;
>>>>
>>>>
>>>> GCC: bpf-unknown-none-gcc -c -gdwarf -gbtf annotate.c
>>>>
>>>> 0x0000000c: DW_TAG_compile_unit
>>>>                 DW_AT_producer     ("GNU C17 12.0.1 20220401 
>>>> (experimental) -gdwarf -gbtf")
>>>>                 DW_AT_language     (DW_LANG_C11)
>>>>                 DW_AT_name ("annotate.c")
>>>>                 DW_AT_comp_dir     ("/home/dfaust/playpen/btf/tags")
>>>>                 DW_AT_stmt_list    (0x00000000)
>>>>
>>>> 0x0000001e:   DW_TAG_variable
>>>>                   DW_AT_name       ("g")
>>>>                   DW_AT_decl_file  
>>>> ("/home/dfaust/playpen/btf/tags/annotate.c")
>>>>                   DW_AT_decl_line  (11)
>>>>                   DW_AT_decl_column        (0x2a)
>>>>                   DW_AT_type       (0x00000032 "int **")
>>>>                   DW_AT_external   (true)
>>>>                   DW_AT_location   (DW_OP_addr 0x0)
>>>>
>>>> 0x00000032:   DW_TAG_pointer_type
>>>>                   DW_AT_byte_size  (8)
>>>>                   DW_AT_type       (0x00000045 "int *")
>>>>                   DW_AT_sibling    (0x00000045)
>>>>
>>>> 0x0000003b:     DW_TAG_LLVM_annotation
>>>>                     DW_AT_name     ("btf_type_tag")
>>>>                     DW_AT_const_value      ("type-tag-1")
>>>>
>>>> 0x00000044:     NULL
>>>>
>>>> 0x00000045:   DW_TAG_pointer_type
>>>>                   DW_AT_byte_size  (8)
>>>>                   DW_AT_type       (0x00000061 "int")
>>>>                   DW_AT_sibling    (0x00000061)
>>>>
>>>> 0x0000004e:     DW_TAG_LLVM_annotation
>>>>                     DW_AT_name     ("btf_type_tag")
>>>>                     DW_AT_const_value      ("type-tag-3")
>>>>
>>>> 0x00000057:     DW_TAG_LLVM_annotation
>>>>                     DW_AT_name     ("btf_type_tag")
>>>>                     DW_AT_const_value      ("type-tag-2")
>>>>
>>>> 0x00000060:     NULL
>>>>
>>>> 0x00000061:   DW_TAG_base_type
>>>>                   DW_AT_byte_size  (0x04)
>>>>                   DW_AT_encoding   (DW_ATE_signed)
>>>>                   DW_AT_name       ("int")
>>>>
>>>> 0x00000068:   NULL
>>>
>>> do you have documentation to show why gnu generates attribute this way?
>>> If dwarf generates
>>>       ptr -> tag3 -> tag2 -> ptr -> tag1 -> int
>>> does this help?
>>
>> Okay, I think I see the problem. The internal representations between clang
>> and GCC attach the attributes to different nodes, and as a result they
>> produce different DWARF:
>>
>> !5 = !DIDerivedType(tag: DW_TAG_pointer_type, baseType: !6, size: 64,
>> annotations: !10)
>> !6 = !DIDerivedType(tag: DW_TAG_pointer_type, baseType: !7, size: 64,
>> annotations: !8)
>> !7 = !DIBasicType(name: "int", size: 32, encoding: DW_ATE_signed)
>> !8 = !{!9}
>> !9 = !{!"btf_type_tag", !"tag1"}
>> !10 = !{!11, !12}
>> !11 = !{!"btf_type_tag", !"tag2"}
>> !12 = !{!"btf_type_tag", !"tag3"}
>>
>> If I am reading this IR right, then the tags "tag2" and "tag3" are being
>> applied to the int**, and "tag1" is applied to the int*
>>
>> But I don't think this lines up with how the attribute syntax is defined.
>> See
>>    https://gcc.gnu.org/onlinedocs/gcc/Attribute-Syntax.html
>> In particular the "All other attributes" section. (It's a bit dense).
>> Or, as Joseph summed it up nicely earlier in the thread:
>>> In either syntax, __typetag2 __typetag3 should apply to
>>> the type to which g points, not to g or its type, just as if you had a
>>> type qualifier there.  You'd need to put the attributes (or qualifier)
>>> after the *, not before, to make them apply to the pointer type.
>>
>>
>> Compare that to GCC's internal representation, from which DWARF is generated:
>>
>>   <var_decl 0x7ffff7535090 g
>>      type <pointer_type 0x7ffff74f8888
>>          type <pointer_type 0x7ffff74f8b28 type <integer_type 0x7ffff74385e8 
>> int>
>>              unsigned DI
>>              size <integer_cst 0x7ffff742b450 constant 64>
>>              unit-size <integer_cst 0x7ffff742b468 constant 8>
>>              align:64 warn_if_not_align:0 symtab:0 alias-set -1 
>> canonical-type 0x7ffff743f888
>>              attributes <tree_list 0x7ffff75165c8
>>                  purpose <identifier_node 0x7ffff75290f0 btf_type_tag>
>>                  value <tree_list 0x7ffff7516550
>>                      value <string_cst 0x7ffff75182e0 type <array_type 
>> 0x7ffff74f8738>
>>                          readonly constant static "type-tag-3\000">>
>>                  chain <tree_list 0x7ffff75165a0 purpose <identifier_node 
>> 0x7ffff75290f0 btf_type_tag>
>>                      value <tree_list 0x7ffff75164d8
>>                          value <string_cst 0x7ffff75182c0 type <array_type 
>> 0x7ffff74f8738>
>>                              readonly constant static "type-tag-2\000">>>>
>>              pointer_to_this <pointer_type 0x7ffff74f8bd0>>
>>          unsigned DI size <integer_cst 0x7ffff742b450 64> unit-size 
>> <integer_cst 0x7ffff742b468 8>
>>          align:64 warn_if_not_align:0 symtab:0 alias-set -1 canonical-type 
>> 0x7ffff74f87e0
>>          attributes <tree_list 0x7ffff75165f0 purpose <identifier_node 
>> 0x7ffff75290f0 btf_type_tag>
>>              value <tree_list 0x7ffff7516438
>>                  value <string_cst 0x7ffff75182a0 type <array_type 
>> 0x7ffff74f8738>
>>                      readonly constant static "type-tag-1\000">>>>
>>      public static unsigned DI defer-output 
>> /home/dfaust/playpen/btf/tags/annotate.c:10:42 size <integer_cst 
>> 0x7ffff742b450 64> unit-size <integer_cst 0x7ffff742b468 8>
>>      align:64 warn_if_not_align:0>
>>
>> See how tags "tag2" and "tag3" are associated with the pointer_type
>> 0x7ffff74f8b28, that is, "the type to which g points".
>>
>> From GCC's DWARF the BTF we get currently looks like:
>>    VAR(g) -> ptr -> tag1 -> ptr -> tag3 -> tag2 -> int
>> which is obviously quite different and why this case caught my attention.
>>
>> I think this difference is the root of our problems. It might not be
>> specifically related to the BTF tag attributes but they do reveal some
>> discrepancy between how clang and GCC handle the attribute syntax.
> 
> The btf_type_tag attribute is very similar to the address_space attribute.
> For example,
> $ cat t1.c
> int __attribute__((address_space(1))) * p;
> $ clang -g -S -emit-llvm t1.c
> 
> In IR, we will have
> @p = dso_local global ptr addrspace(1) null, align 8, !dbg !0
> ...
> !5 = !DIDerivedType(tag: DW_TAG_pointer_type, baseType: !6, size: 64)
> !6 = !DIBasicType(name: "int", size: 32, encoding: DW_ATE_signed)
> 
> Replacing address_space with btf_type_tag, we will get
> ptr->type_tag->int in debuginfo.
> 
> But it looks like gcc doesn't support address_space attribute
> 
> $ gcc -g -S t1.c
> t1.c:1:1: warning: ‘address_space’ attribute directive ignored 
> [-Wattributes]
>   int __attribute__((address_space(1))) * p;
>   ^~~
> 
> Is it possible for gcc to go with address_space attribute
> semantics for btf_type_tag attribute?

In simple cases like this one the behavior is already the same: GCC with
this series also produces ptr -> type_tag -> int.
$ cat foo.c
int __attribute__((btf_type_tag("tag1"))) * p;
$ gcc -c -gdwarf -gbtf foo.c

Internally:
 <var_decl 0x7ffff743abd0 p
    type <pointer_type 0x7ffff7590150
        type <integer_type 0x7ffff74475e8 int public SI
            size <integer_cst 0x7ffff742bf90 constant 32>
            unit-size <integer_cst 0x7ffff742bfa8 constant 4>
            align:32 warn_if_not_align:0 symtab:0 alias-set -1 canonical-type 
0x7ffff74475e8 precision:32 min <integer_cst 0x7ffff742bf48 -2147483648> max 
<integer_cst 0x7ffff742bf60 2147483647>
            pointer_to_this <pointer_type 0x7ffff744fa80>>
        unsigned DI
        size <integer_cst 0x7ffff742bd50 constant 64>
        unit-size <integer_cst 0x7ffff742bd68 constant 8>
        align:64 warn_if_not_align:0 symtab:0 alias-set -1 canonical-type 
0x7ffff744fa80
        attributes <tree_list 0x7ffff7564d70
            purpose <identifier_node 0x7ffff757f2d0 btf_type_tag>
            value <tree_list 0x7ffff7564cf8
                value <string_cst 0x7ffff757c220 type <array_type 
0x7ffff75900a8>
                    readonly constant static "tag1\000">>>>
    public static unsigned DI defer-output 
/home/dfaust/playpen/btf/tags/foo.c:1:45 size <integer_cst 0x7ffff742bd50 64> 
unit-size <integer_cst 0x7ffff742bd68 8>
    align:64 warn_if_not_align:0>

And the resulting BTF:

[1] INT 'int' size=4 bits_offset=0 nr_bits=32 encoding=SIGNED
[2] PTR '(anon)' type_id=3
[3] TYPE_TAG 'tag1' type_id=1
[4] VAR 'p' type_id=2, linkage=global
[5] DATASEC '.bss' size=0 vlen=1
        type_id=4 offset=0 size=8 (VAR 'p')

var(p) -> ptr -> type_tag -> int
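
To make the chain above easier to follow, here is a small sketch that
models records [1] through [4] of the dump and walks the type_id links.
The struct, enum, and function names are hypothetical and exist only for
this illustration; they are not the libbpf or kernel BTF definitions, and
the DATASEC record is left out because it does not take part in the chain.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical, simplified stand-in for the BTF records in the dump above.
   Illustration only: these are not the libbpf or kernel definitions. */
enum rec_kind { REC_INT, REC_PTR, REC_TYPE_TAG, REC_VAR };

struct rec {
    enum rec_kind kind;
    const char *name;   /* type, tag, or variable name; unused for REC_PTR */
    int type_id;        /* id of the referenced record, 0 for none */
};

/* Slot 0 is left unused so that array indices match the BTF ids [1]..[4]. */
const struct rec recs[] = {
    [1] = { REC_INT,      "int",  0 },
    [2] = { REC_PTR,      "",     3 },
    [3] = { REC_TYPE_TAG, "tag1", 1 },
    [4] = { REC_VAR,      "p",    2 },
};

/* Follow the type_id links starting at record `id` and write the chain
   into buf. The id is assumed to be a valid index into recs. */
const char *btf_chain(int id, char *buf, size_t len)
{
    assert(buf != NULL && len > 0);
    buf[0] = '\0';
    while (id != 0) {
        const struct rec *r = &recs[id];
        size_t used = strlen(buf);

        switch (r->kind) {
        case REC_VAR:
            snprintf(buf + used, len - used, "var(%s)", r->name);
            break;
        case REC_PTR:
            snprintf(buf + used, len - used, "ptr");
            break;
        case REC_TYPE_TAG:
            snprintf(buf + used, len - used, "type_tag(%s)", r->name);
            break;
        case REC_INT:
            snprintf(buf + used, len - used, "%s", r->name);
            break;
        }
        if (r->type_id != 0) {
            /* More links to follow: append the arrow separator. */
            used = strlen(buf);
            snprintf(buf + used, len - used, " -> ");
        }
        id = r->type_id;
    }
    return buf;
}
```

Calling btf_chain(4, buf, sizeof buf) with a caller-provided buffer gives
"var(p) -> ptr -> type_tag(tag1) -> int", i.e. the tag sits between the
pointer and the int it points to.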


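For the harder two-pointer declaration, the placement rule Joseph described
("just as if you had a type qualifier there") can be checked with an ordinary
qualifier. The sketch below is only an analogy: volatile stands in for
__typetag2 __typetag3, nothing stands in for __typetag1 (which, per Joseph,
behaves differently in GNU syntax), and the names g_a, g_b and both macros
are made up for this example. It needs a C11 compiler for _Generic and
_Static_assert.

```c
/* Shape (a): qualifier between the two '*', like
       int __typetag1 * __typetag2 __typetag3 * g;
   g_a itself is unqualified; the type it points to is "int *volatile". */
int * volatile * g_a;

/* Shape (b): qualifier after the last '*', like
       int * __typetag1 * __typetag2 __typetag3 g;
   g_b itself is volatile. */
int * * volatile g_b;

/* Taking the address keeps the object's own qualifiers visible to _Generic,
   which would otherwise drop them from the controlling expression. */
#define QUALIFIES_POINTEE(obj) \
    _Generic(&(obj), int *volatile **: 1, default: 0)
#define QUALIFIES_OBJECT(obj) \
    _Generic(&(obj), int **volatile *: 1, default: 0)

/* Compile-time checks: a successful compile means the rule holds. */
_Static_assert(QUALIFIES_POINTEE(g_a), "(a): qualifier lands on the pointee type");
_Static_assert(!QUALIFIES_OBJECT(g_a), "(a): g_a itself is not qualified");
_Static_assert(QUALIFIES_OBJECT(g_b), "(b): qualifier lands on g_b itself");
```

Compiling this with "-std=c11 -c" is enough; there is nothing to run. In
shape (a) the qualifier ends up on the type to which the variable points,
which is where GCC places tags 2 and 3 in the tree dump quoted above.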
> 
>>
>>>
>>>>
>>>>
>>>> clang: clang -target bpf -c -g annotate.c
>>>>
>>>> 0x0000000c: DW_TAG_compile_unit
>>>>                 DW_AT_producer     ("clang version 15.0.0 
>>>> (https://github.com/llvm/llvm-project.git 
>>>> f80e369f61ebd33dd9377bb42fcab64d17072b18)")
>>>>                 DW_AT_language     (DW_LANG_C99)
>>>>                 DW_AT_name ("annotate.c")
>>>>                 DW_AT_str_offsets_base     (0x00000008)
>>>>                 DW_AT_stmt_list    (0x00000000)
>>>>                 DW_AT_comp_dir     ("/home/dfaust/playpen/btf/tags")
>>>>                 DW_AT_addr_base    (0x00000008)
>>>>
>>>> 0x0000001e:   DW_TAG_variable
>>>>                   DW_AT_name       ("g")
>>>>                   DW_AT_type       (0x00000029 "int **")
>>>>                   DW_AT_external   (true)
>>>>                   DW_AT_decl_file  
>>>> ("/home/dfaust/playpen/btf/tags/annotate.c")
>>>>                   DW_AT_decl_line  (11)
>>>>                   DW_AT_location   (DW_OP_addrx 0x0)
>>>>
>>>> 0x00000029:   DW_TAG_pointer_type
>>>>                   DW_AT_type       (0x00000035 "int *")
>>>>
>>>> 0x0000002e:     DW_TAG_LLVM_annotation
>>>>                     DW_AT_name     ("btf_type_tag")
>>>>                     DW_AT_const_value      ("type-tag-2")
>>>>
>>>> 0x00000031:     DW_TAG_LLVM_annotation
>>>>                     DW_AT_name     ("btf_type_tag")
>>>>                     DW_AT_const_value      ("type-tag-3")
>>>>
>>>> 0x00000034:     NULL
>>>>
>>>> 0x00000035:   DW_TAG_pointer_type
>>>>                   DW_AT_type       (0x0000003e "int")
>>>>
>>>> 0x0000003a:     DW_TAG_LLVM_annotation
>>>>                     DW_AT_name     ("btf_type_tag")
>>>>                     DW_AT_const_value      ("type-tag-1")
>>>>
>>>> 0x0000003d:     NULL
>>>>
>>>> 0x0000003e:   DW_TAG_base_type
>>>>                   DW_AT_name       ("int")
>>>>                   DW_AT_encoding   (DW_ATE_signed)
>>>>                   DW_AT_byte_size  (0x04)
>>>>
>>>> 0x00000042:   NULL
>>>>
>>>>
