https://gcc.gnu.org/bugzilla/show_bug.cgi?id=118837
            Bug ID: 118837
           Summary: Interpretation of DW_FORM_data*
           Product: gcc
           Version: unknown
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: debug
          Assignee: unassigned at gcc dot gnu.org
          Reporter: tromey at gcc dot gnu.org
  Target Milestone: ---

I recently found out that LLVM will use DW_FORM_data*, expecting the debugger
to sign-extend the value when the desired type or context is "signed" and the
desired result is wider than the constant.

So, for example, I have a modified LLVM that will emit DW_AT_decimal_scale.
This is a signed value, so a scale of -4 is emitted as DW_FORM_data1 with a
value of 252.

I'll be filing a gdb bug about this momentarily.  Meanwhile, I'm filing this
one to find out whether gcc agrees with the LLVM interpretation of the DWARF
standard or, if not, whether there are any problems in GCC's DWARF emitter.
(For the second point, when discussing this on IRC it sounded like GCC uses
DW_FORM_{u,s}data precisely to avoid this issue...?)

While I tend to agree with the LLVM interpretation, I do think there is some
ambiguity, especially given the second paragraph of the quoted text below.
Perhaps a DWARF bug report is in order.

The text in question says:

    The data in DW_FORM_data1, DW_FORM_data2, DW_FORM_data4, DW_FORM_data8
    and DW_FORM_data16 can be anything.  Depending on context, it may be a
    signed integer, an unsigned integer, a floating-point constant, or
    anything else.  A consumer must use context to know how to interpret the
    bits, which if they are target machine data (such as an integer or
    floating-point constant) will be in target machine byte order.

    If one of the DW_FORM_data<n> forms is used to represent a signed or
    unsigned integer, it can be hard for a consumer to discover the context
    necessary to determine which interpretation is intended.  Producers are
    therefore strongly encouraged to use DW_FORM_sdata or DW_FORM_udata for
    signed and unsigned integers respectively, rather than DW_FORM_data<n>.
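For concreteness, here is a minimal sketch of the consumer-side behavior LLVM
expects; this is not actual gdb or GCC code, and the helper name
sign_extend_form_data is made up for illustration.  When the attribute's
context (here DW_AT_decimal_scale) says the value is signed, the raw
DW_FORM_data1 byte 252 is sign-extended to -4:

#include <inttypes.h>
#include <stdio.h>

/* Sign-extend an n-byte DW_FORM_data<n> constant (nbytes = 1, 2, 4, or 8)
   to a 64-bit signed value.  */
static int64_t
sign_extend_form_data (uint64_t raw, unsigned nbytes)
{
  unsigned bits = nbytes * 8;
  if (bits >= 64)
    return (int64_t) raw;
  uint64_t sign_bit = (uint64_t) 1 << (bits - 1);
  /* Flip the sign bit, then subtract it; the result wraps to the
     correct negative value when the sign bit was set.  */
  return (int64_t) ((raw ^ sign_bit) - sign_bit);
}

int
main (void)
{
  /* A DW_AT_decimal_scale of -4 emitted as DW_FORM_data1 = 252.  */
  uint64_t raw = 252;
  printf ("unsigned reading: %" PRIu64 ", signed reading: %" PRId64 "\n",
          raw, sign_extend_form_data (raw, 1));
  /* Prints: unsigned reading: 252, signed reading: -4  */
  return 0;
}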