Re: [Dwarf-discuss] Enhancement: Expression Operation Vendor Extensibility Opcode
I've added this as DWARF Issue 230324.1. I'll report back after the
committee has reviewed it.

Thank you for your contribution!

-cary

On Fri, Mar 24, 2023 at 1:21 PM Linder, Scott via Dwarf-discuss wrote:
>
> [AMD Official Use Only - General]
>
> Background
> ==========
>
> The vendor extension encoding space for DWARF expression operations
> accommodates only 32 unique operations. In practice, the lack of a central
> registry and a desire for backwards compatibility mean vendor extensions are
> never retired, even when standard versions are accepted into DWARF proper.
> This has produced a situation where the effective encoding space available
> for new vendor extensions is minuscule today.
>
> To expand this encoding space we propose defining one DWARF operation in the
> official encoding space which acts as a "prefix" for vendor extensions. It is
> followed by a ULEB128 encoded vendor extension opcode, which is then followed
> by the operands of the corresponding vendor extension operation.
>
> This scheme opens up an infinite encoding space for arbitrary vendor
> extensions, and in practical terms is no less compact than if a fixed-size
> encoding were chosen, as was done for DW_LNS_extended_op. That is to say, when
> compared with an alternative scheme which encodes the opcode with a single
> unsigned byte: for the first 127 opcodes our approach is indistinguishable
> from the alternative scheme; for the next 128 opcodes it requires one more
> byte than that alternative scheme; and after 255 opcodes the alternative
> scheme is exhausted.
>
> Since vendor extension operations can have arbitrary semantics, the consumer
> must understand them to be able to continue evaluating the expression. The
> only use for a size operand would be for a consumer that only needs to print
> the expression. Omitting a size operand makes the operation encoding more
> compact, and this was deemed more important than the limited printing use
> case. Therefore no ULEB128 size operand is present to provide the number of
> bytes of following operands, unlike DW_LNS_extended_op.
>
> A centralized registry of vendor extension opcodes which are in use,
> maintained on the dwarfstd.org website or another suitable location, could
> also be implemented as a part of this proposal. This would remove the need
> for vendors to coordinate allocation themselves, and make it simpler to use
> more than one vendor extension at a time. As there is support for an infinite
> number of opcodes, the registration process could involve very limited
> review, and would therefore pose a minimal burden to the maintainer of such a
> registry.
>
> Proposal
> ========
>
> 1) In Section 2.5.1.7, p38, add a new code at the end of the list:
>
>     3. DW_OP_user
>     The DW_OP_user opcode encodes a vendor extension operation. It has at
>     least one operand: a ULEB128 constant identifying a vendor extension
>     operation. The remaining operands are defined by the vendor extension.
>     The vendor extension opcode 0 is reserved and cannot be used by any
>     vendor extension.
>
>     The DW_OP_user encoding space can be understood to supplement the
>     space defined by DW_OP_lo_user and DW_OP_hi_user that is allocated by
>     the standard for the same purpose.
>
> 2) In Section 7.7.1, p226, add a new row to table 7.9:
>
>     DW_OP_user | TBD | 1+ | ULEB128 vendor extension opcode, followed by
>                |     |    | vendor-extension-defined operands
> --
> Dwarf-discuss mailing list
> Dwarf-discuss@lists.dwarfstd.org
> https://lists.dwarfstd.org/mailman/listinfo/dwarf-discuss
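The size comparison in the Background section follows directly from how ULEB128 packs seven payload bits per byte. The following is a minimal sketch, not part of the proposal: the names `encode_uleb128`, `read_vendor_opcode` and `VendorOpcode` are hypothetical, and the reader is assumed to be handed the bytes that come after the DW_OP_user byte, whose value the proposal leaves as TBD.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Append the ULEB128 encoding of `value` to `out`: seven payload bits per
// byte, least significant group first, high bit set on all but the last byte.
void encode_uleb128(std::uint64_t value, std::vector<std::uint8_t>& out) {
    do {
        std::uint8_t byte = static_cast<std::uint8_t>(value & 0x7f);
        value >>= 7;
        if (value != 0) byte |= 0x80;  // more groups follow
        out.push_back(byte);
    } while (value != 0);
}

// Hypothetical result type: the decoded vendor extension opcode and the
// number of bytes its ULEB128 encoding occupied.
struct VendorOpcode {
    std::uint64_t opcode;
    std::size_t length;
};

// Decode the vendor extension opcode at the start of `bytes`, which are
// assumed to be the bytes following the DW_OP_user byte. Returns false for
// truncated input, for encodings that do not fit in 64 bits, and for the
// reserved opcode 0. Any operands after the opcode are left to the caller,
// since only the vendor extension knows their layout.
bool read_vendor_opcode(const std::vector<std::uint8_t>& bytes,
                        VendorOpcode& out) {
    std::uint64_t result = 0;
    unsigned shift = 0;
    for (std::size_t i = 0; i < bytes.size(); ++i) {
        if (shift >= 64) return false;
        result |= static_cast<std::uint64_t>(bytes[i] & 0x7f) << shift;
        shift += 7;
        if ((bytes[i] & 0x80) == 0) {
            if (result == 0) return false;  // opcode 0 is reserved
            out = VendorOpcode{result, i + 1};
            return true;
        }
    }
    return false;  // ran out of bytes before the final group
}
```

With opcode 0 reserved, opcodes 1 through 127 occupy one byte, and 128 through 16383 occupy two, which is where the "one more byte" for the next 128 opcodes comes from. Because there is no size operand, a consumer that does not recognize the decoded opcode cannot skip its operands and has to stop there.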
Re: [Dwarf-discuss] ISSUE: CPU vector types.
On 3/27/23 23:51, Cary Coutant wrote:
> > Vector registers
> >
> > It has been the long-standing existing practice to treat hardware
> > vector registers as arrays of a fundamental base type. To delineate
> > these hardware register arrays from arrays in the language source they
> > have been given the DW_AT_GNU_vector attribute. This proposal simply
> > standardizes the existing behavior.
> >
> > In Section 2.2 Attribute Types, DW_AT_vector and
> > DW_AT_variable_vector_width shall be added to Table 2.2
> >
> > DW_AT_vector                | A hardware vector register
> > DW_AT_variable_vector_width | Array bound for hardware
> >                             | implementation defined vector register
> >                             | width
>
> I don't understand what tags this DW_AT_vector attribute would apply
> to. Vector registers aren't *types*, they're *locations*, so it
> doesn't really make sense to me to put this attribute on a
> DW_TAG_array_type.

Maybe I should have said:

DW_AT_vector                | A hardware vector register type

because what I'm talking about are types, not locations.

Consider a simple program like:

    #include

    __m128 *f( __m128 a, float *b){
      __m128 *c=new __m128;
      *c=_mm_load_ps(b);
      *c+=a;
      return c;
    }

Which when compiled by GCC generates DWARF like this:

 [ 7c]    base_type            abbrev: 3
          byte_size            (data1) 4
          encoding             (data1) float (4)
          name                 (strp) "float"
 [ 7c5]   typedef              abbrev: 8
          name                 (strp) "__m128"
          decl_file            (data1) xmmintrin.h (2)
          decl_line            (data1) 69
          decl_column          (data1) 15
          type                 (ref4) [ 7d1]
 [ 7d1]   array_type           abbrev: 30
          GNU_vector           (flag_present) yes
          type                 (ref4) [ 7c]
          sibling              (ref4) [ 7dd]
 [ 7da]     subrange_type        abbrev: 31
            upper_bound          (data1) 3
 [ 7dd]   base_type            abbrev: 3
          byte_size            (data1) 2
          encoding             (data1) float (4)
          name                 (strp) "_Float16"
 [ 7e4]   subprogram           abbrev: 32
          external             (flag_present) yes
          name                 (string) "f"
          decl_file            (data1) vecreg.C (1)
          decl_line            (data1) 3
          decl_column          (data1) 9
          linkage_name         (strp) "_Z1fDv4_fPf"
          type                 (ref4) [ 881]
          low_pc               (addr) .text+00 <_Z1fDv4_fPf>
          high_pc              (data8) 38 (.bss+00)
          frame_base           (exprloc)
           [ 0] call_frame_cfa
          call_all_calls       (flag_present) yes
          sibling              (ref4) [ 881]
 [ 808]     formal_parameter     abbrev: 15
            name                 (string) "a"
            decl_file            (implicit_const) vecreg.C (1)
            decl_line            (implicit_const) 3
            decl_column          (data1) 19
            type                 (ref4) [ 7c5]
            location             (sec_offset) location list [ 10]
            GNU_locviews         (sec_offset) location list [ c]
 [ 818]     formal_parameter     abbrev: 15
            name                 (string) "b"
            decl_file            (implicit_const) vecreg.C (1)
            decl_line            (implicit_const) 3
            decl_column          (data1) 29
            type                 (ref4) [ 886]
            location             (sec_offset) location list [ 22]
            GNU_locviews         (sec_offset) location list [ 1c]
 [ 828]     variable             abbrev: 33
            name                 (string) "c"
            decl_file            (data1) vecreg.C (1)
            decl_line            (data1) 4
            decl_column          (data1) 11
            type                 (ref4) [ 881]
            location             (sec_offset) location list [ 37]
            GNU_locviews         (sec_offset) location list [ 35]
 [ 881]   pointer_type         abbrev: 5
          byte_size            (implicit_const) 8
          type                 (ref4) [ 7c5]
 [ 886]   pointer_type         abbrev: 5
          byte_size            (implicit_const) 8
          type                 (ref4) [ 7c]

 Offset: 10, Index: 4
    offset_pair 0, 15
      .text+00 <_Z1fDv4_fPf>..
      .text+0x0014 <_Z1fDv4_fPf+0x14>
        [ 0] reg17
    offset_pair 15, 26
      .text+0x000
Re: [Dwarf-discuss] ISSUE: CPU vector types.
On Mon, Mar 27, 2023 at 11:52 PM Cary Coutant via Dwarf-discuss <
dwarf-discuss@lists.dwarfstd.org> wrote:

> > Vector registers
> >
> > It has been the long-standing existing practice to treat hardware
> > vector registers as arrays of a fundamental base type. To delineate
> > these hardware register arrays from arrays in the language source they
> > have been given the DW_AT_GNU_vector attribute. This proposal simply
> > standardizes the existing behavior.
> >
> > In Section 2.2 Attribute Types, DW_AT_vector and
> > DW_AT_variable_vector_width shall be added to Table 2.2
> >
> > DW_AT_vector                | A hardware vector register
> > DW_AT_variable_vector_width | Array bound for hardware
> >                             | implementation defined vector register
> >                             | width
>
> I don't understand what tags this DW_AT_vector attribute would apply
> to. Vector registers aren't *types*, they're *locations*, so it
> doesn't really make sense to me to put this attribute on a
> DW_TAG_array_type. We don't have DW_TAGs that describe registers; the
> ABI defines the registers and DWARF producers and consumers should
> understand and agree on the sizes and shapes of the various registers.
>
> In Tony's proposal, the new attribute modifies a base type, thus
> introducing a vector type, which might get placed in a vector
> register. But there, I don't see how the vector base type is
> fundamentally different from an array type. It seems it's just a dodge
> to make it a base type so that we can put whole vectors on the stack.
>
> Maybe what we're looking for is a DW_TAG_vector_type, whose DW_AT_type
> attribute gives the base type for each element of the vector. This
> seems to be more DWARF-like, and if we decide there's a reason to
> allow stack entries with vector types, we can do that.
>
> -cary

(Caveat: this is all amd64-specific; I haven't looked at ARM.)

Last I looked into this I concluded that DW_AT_GNU_vector is a big hack to
make debuggers capable of handling function entries and exits that involve
the intrinsic types that hardware vendors have added for their SIMD features.

In the SYSV AMD64 ABI (download link
https://gitlab.com/x86-psABIs/x86-64-ABI/-/jobs/artifacts/master/raw/x86-64-ABI/abi.pdf?job=build)
in Section 3.2.3 there is a subsection "Classification". Immediately after
the base types, __m64/__m128/__m256/__m512 are special-cased to be part of
the "SSE" class. If you follow the algorithm described you'll note that
arrays that are not special-cased are treated as aggregates of the element
type and are not "promoted" to the "SSE" class. So __m256d and double[4],
despite having identical layouts in memory, will be classified as SSE/SSEUP
and MEMORY respectively by that algorithm (in particular note footnote 15
here). This is also easily verifiable by compiling a trivial function that
takes __m256d and double[4] parameters.

A debugger usually doesn't care about any of this because the DIEs for
local/global variables will contain location information specifying where to
get the bytes from to render the variable. DWARF does not, however, specify
the locations of function return values at exit or parameters at function
entry[0]. These locations are needed, in turn, to display the function return
value upon function exit (think `ret` in gdb)[1] and to invoke a function
with user-supplied parameters (think `call` in gdb). Debuggers generally
infer these locations from their knowledge of the platform ABI[2]. The
DW_AT_GNU_vector attribute informs the debugger that these types are among
the ones special-cased in the ABI and prevents the need to do something even
more unprincipled such as name matching.

DW_AT[_GNU]_vector is best understood not as "a hardware vector register" but
rather as a marker that "this type is eligible to be passed in hardware
vector registers at function boundaries according to the platform ABI".

- Kyle

[0] DWARF has DW_TAG_formal_parameter and those can contain location
information *but* DWARF does not specify that the location information must
be valid at function entry. For debug builds, compilers often emit locations
that are only valid after the function prologue has completed. Post-prologue
locations are useless for invoking the function in a debugger.
[1] Inferring the function return value location causes issues in other
contexts and I have a DWARF issue on that (221105.1)
[2] See e.g. amd64_classify in gdb/amd64-tdep.c
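To make the classification difference concrete, here is a heavily simplified sketch of the decision a debugger has to reproduce. It is a toy model, not the psABI algorithm and not gdb's amd64_classify: the names `ArgClass`, `classify_double_array` and `passed_in_memory` are hypothetical, only arrays of doubles are modelled, and the plain array is assumed to be passed by value (for example wrapped in a struct), since a bare array parameter in C or C++ decays to a pointer.

```cpp
#include <cstddef>
#include <vector>

// Simplified per-eightbyte classes. The real psABI has more classes
// (X87, X87UP, COMPLEX_X87, NO_CLASS) that this sketch leaves out.
enum class ArgClass { Integer, Sse, SseUp, Memory };

// Classify an aggregate of `count` doubles, one class per eightbyte.
// `is_vector` models the presence of DW_AT_GNU_vector on the array type:
// a vector type is SSE followed by SSEUP, while a plain array is an
// aggregate whose eightbytes are each classified as SSE on their own.
std::vector<ArgClass> classify_double_array(std::size_t count, bool is_vector) {
    std::vector<ArgClass> classes(count, ArgClass::Sse);
    if (is_vector) {
        for (std::size_t i = 1; i < count; ++i) classes[i] = ArgClass::SseUp;
    }
    return classes;
}

// Post-merger cleanup, simplified: an object larger than two eightbytes
// stays in registers only if its first eightbyte is SSE and every other
// eightbyte is SSEUP; otherwise the whole object is passed in memory.
bool passed_in_memory(const std::vector<ArgClass>& eightbytes) {
    for (ArgClass c : eightbytes) {
        if (c == ArgClass::Memory) return true;
    }
    if (eightbytes.size() <= 2) return false;
    if (eightbytes.front() != ArgClass::Sse) return true;
    for (std::size_t i = 1; i < eightbytes.size(); ++i) {
        if (eightbytes[i] != ArgClass::SseUp) return true;
    }
    return false;
}
```

Under these assumptions, `passed_in_memory(classify_double_array(4, true))` is false and `passed_in_memory(classify_double_array(4, false))` is true, even though both describe the same 32 bytes. The `is_vector` flag is the only input that distinguishes them, which is the role the message above assigns to DW_AT_GNU_vector.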
Re: [Dwarf-discuss] ISSUE: CPU vector types.
> DW_AT[_GNU]_vector is best understood not as "a hardware vector register" but
> rather as a marker that "this type is eligible to be passed in hardware
> vector registers at function boundaries according to the platform ABI".

My 2c would be not to describe these in terms of
hardware/implementations (that gets confusing/blurs the line between
variables/types and locations - as you say, these things can be stored
in memory, so they aren't uniquely in registers - you might have a
member of this type in a struct passed in memory and need to know the
ABI/struct layout for that, etc), but at the source level, since the
ABI is defined in those same terms. Overloading, for instance, still
applies if these are different types - so other debugger features need
to work based on this type information.

So it seems like a simpler question is:

How should DWARF producers/consumers expect to encode the source
example Ben provided (well, simplified a bit):

    #include

    void f( __m128 a){
    }

What DWARF should be used to describe the type of 'a'? And how does
this encoding scale to all the other similar intrinsic types?
Re: [Dwarf-discuss] Add a mechanism for specifying subprogram return value locations (221105.1)
Thank you Kyle,

On 3/28/23 12:49, Kyle Huey wrote:
> [1] Inferring the function return value location causes issues in other
> contexts and I have a DWARF issue on that (221105.1)

Let me publicly state my support for your proposal for how to specify the
return value of a function. I'm not sure that I would put it in 3.3.2 or
3.3.3 but I totally agree with the concept of specifying the location of the
return value.

-ben
Re: [Dwarf-discuss] ISSUE: CPU vector types.
On Tue, Mar 28, 2023 at 1:17 PM David Blaikie wrote:

> > DW_AT[_GNU]_vector is best understood not as "a hardware vector
> > register" but rather as a marker that "this type is eligible to be passed
> > in hardware vector registers at function boundaries according to the
> > platform ABI".
>
> My 2c would not be to describe these in terms of
> hardware/implementations (that gets confusing/blurs the line between
> variable/types and locations - as you say, these things can be stored
> in memory, so they aren't uniquely in registers - you might have a
> member of this type in a struct passed in memory and need to know the
> ABI/struct layout for that, etc), but at the source level - which the
> ABI is defined in those same terms. Overloading, for instance, still
> applies if these are different types - so other debugger features need
> to work based on this type information.
>
> So it seems like a simpler question is:
>
> How should DWARF producers/consumers expect to encode the source
> example Ben provided (well, simplified a bit):
>
> #include
>
> void f( __m128 a){
> }
>
> What DWARF should be used to describe the type of 'a'? And how does
> this encoding scale to all the other similar intrinsic types?

Stepping back even further, I'd ask: what information does the compiler need
to communicate to the debugger or other tools that's not communicated by the
DWARF for e.g. double[4]? The only information I'm personally aware of is the
ABI stuff I mentioned before, but there may very well be other things
lurking.

- Kyle