[Dwarf-discuss] Proposal: `DW_LNS_indirect_line`

2024-07-21 Thread Jacob Young via Dwarf-discuss
> On 12/07/2024 19:04, David Blaikie wrote:
> > Thanks for all the context (I noticed you replied directly to me - are
> > you happy/OK having this discussion on the mailing list, rather than
> > in private? It'd help to keep all the history visible, linkable, etc)
> Yes, apologies, that was just a mistake on my part -- I meant to do
> this, then realised I accidentally replied to you directly, so went to
> forward it to the list, and only now realised that I accidentally
> forwarded it to a completely unrelated list :P
> > While I can appreciate the desire to make this update O(N) in the
> > number of source lines affected - would it be acceptable for this to
> > be O(N) in the number of machine/object level functions?
> >
> > Like if we had a feature for resetting the line table program part-way
> > through a line table - would that be adequate? Then your external data
> > could keep track of the line number setting operation at the start of
> > the new/distinct/independent sequence and update that?
> While not ideal for us, this would certainly be an improvement. Dealing
> with the relative line number offsets is definitely the biggest pain
> point wrt constructing the LNP today.
> > Such a feature would have some more generality/usefulness directly
> > without external/side data - for instance it would make chunks of the
> > line table discardable, which could make it easier for a producer to
> > use comdats to isolate the line table data associated with an inline
> > function and allow the linker to discard such a contribution if desired.

My idea for solving this problem without additional side data is to instead
add a line table opcode that references an existing DIE with DECL attributes
and sets the state machine's line register to the value of its DW_AT_decl_line
attribute.  Similarly, DW_AT_decl_file and DW_AT_decl_column could also
be copied to file and column for consistency.  By itself, this doesn't reduce
the number of updates required, but it could be combined with an additional
DIE tag for representing the source decl before inlining/instantiation and a
DIE attribute for referencing the source decl from the inlined/instantiated
DIE which would indicate that DECL attributes are copied from the source
decl.  That way, the DECL attributes would only ever need to be updated in
a single place, the source decl DIE.
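
To make the idea above a little more concrete, here is a minimal sketch
in Python of how a consumer might handle such an opcode. Everything in it
is an assumption for illustration only: the function name `indirect_line`,
the `Die` and `LineState` classes, and the use of a dictionary keyed by a
.debug_info offset are hypothetical, and no operand encoding is implied.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class Die:
    """Hypothetical stand-in for a DIE carrying DW_AT_decl_* attributes."""
    decl_file: Optional[int] = None
    decl_line: Optional[int] = None
    decl_column: Optional[int] = None


@dataclass
class LineState:
    """Only the line number state machine registers used in this sketch."""
    file: int = 1
    line: int = 1
    column: int = 0


def indirect_line(state: LineState, dies: Dict[int, Die], die_offset: int) -> None:
    """Hypothetical opcode handler: copy DECL attributes into the registers."""
    die = dies[die_offset]
    if die.decl_line is not None:
        state.line = die.decl_line
    # Copying file and column as well is the "for consistency" part.
    if die.decl_file is not None:
        state.file = die.decl_file
    if die.decl_column is not None:
        state.column = die.decl_column


# A toy .debug_info: one DIE at a made-up offset.
dies = {0x2A: Die(decl_file=1, decl_line=120, decl_column=5)}

state = LineState()
indirect_line(state, dies, 0x2A)

# If the declaration moves, only the DIE changes; the line program bytes
# that reference offset 0x2A stay the same.
dies[0x2A].decl_line = 123
state_after_edit = LineState()
indirect_line(state_after_edit, dies, 0x2A)
```

The point of the sketch is that the line program refers to the DIE rather
than encoding the line number itself, so re-running the same program after
the DIE is updated yields the new line.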

Instead of creating a new tag, it also seems pretty straightforward to just
reuse DW_TAG_subprogram for the source decl.  Since uninstantiated
functions would not correspond to any program addresses, a missing
DW_AT_low_pc or new flag attribute could indicate this. The intended
meaning of DW_AT_abstract_origin already links an inlined function to
the source decl.  For instantiations, it's possible to add meaning to either
DW_AT_abstract_origin or DW_AT_specification, or create a new attribute.
Both of these existing attributes already indicate that the referenced DIE
contains some of the attributes of the referencing DIE, but I am probably
stretching the intended meaning of the latter too far for this particular use
case.  (Actually, I wrote that reading DW_AT_abstract_origin as being only
documented as related to inlined functions, but I see now that it explicitly
"can be used with almost any debugging information entry" and the name
already seems to me to fit this instantiation use case perfectly.)
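
As a rough illustration of the lookup described above, the following
Python sketch resolves DECL attributes by following DW_AT_abstract_origin
from an instantiated DIE to its source decl. The `Die` class, the
`lookup` and `is_source_decl` helpers, and the attribute values are all
hypothetical; this is one possible reading under those assumptions, not
a statement of how any existing consumer behaves.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional


@dataclass
class Die:
    """Hypothetical, simplified DIE: a tag, attributes, an optional origin."""
    tag: str
    attrs: Dict[str, int] = field(default_factory=dict)
    abstract_origin: Optional["Die"] = None  # models DW_AT_abstract_origin


def is_source_decl(die: Die) -> bool:
    # Assumption taken from the text: a missing DW_AT_low_pc marks a DIE
    # that does not correspond to any program addresses.
    return "DW_AT_low_pc" not in die.attrs


def lookup(die: Optional[Die], name: str) -> Optional[int]:
    # Walk the abstract-origin chain until the attribute is found.
    seen = set()
    while die is not None and id(die) not in seen:
        seen.add(id(die))  # guard against a malformed cyclic chain
        if name in die.attrs:
            return die.attrs[name]
        die = die.abstract_origin
    return None


# The source decl holds the DECL attributes exactly once.
source = Die("DW_TAG_subprogram", {"DW_AT_decl_file": 1, "DW_AT_decl_line": 40})
# An instantiation has addresses but no DECL attributes of its own.
instance = Die("DW_TAG_subprogram", {"DW_AT_low_pc": 0x1000}, abstract_origin=source)

decl_line = lookup(instance, "DW_AT_decl_line")
```

With this shape, any number of instantiations can point at one source
decl, and updating its DW_AT_decl_line is the only edit required.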

I should note that I'm already in the process of considering whether new
tags/attributes will be needed, some to support incremental compilation,
and some to support Zig language concepts that do not exist in DWARF yet.  It seems
likely that at least some number of new definitions will be needed anyway,
and I would expect that being able to represent source decls in a separate
DIE with references from the generated decls is useful independently of
whether this proposed line table opcode is accepted.  (Although with my
newfound understanding of DW_AT_abstract_origin, I'm now back down
to only one new DIE tag and one new DIE attribute to support incremental
compilation.)

I'm sure this could easily be converted into a more concrete proposal,
but I am interested in getting some feedback first, since I don't know if
there are any pre-existing constraints preventing .debug_info from being
referenced from .debug_line.  As I hope I have demonstrated, this seems
much more extensible to new use cases than the previous proposal.

> This is a good use case, but I would point out that it would (if I am
> understanding you correctly) also be solved by my original proposal. To
> be clear, by "more generality/usefulness", do you mean compared to my
> proposal, or compared to status quo? With that being said, I can
> understand the aversion to, as you put it, storing external/side data.
> Just for the record, I don't think the original issue is unique to the
> Zig project; any sufficiently incremental compiler (which admittedly is
> a rarity today, but could perhaps see more adoption as a concept if Zig
> is able to prove its v

Re: [Dwarf-discuss] Proposal: `DW_LNS_indirect_line`

2024-07-23 Thread Jacob Young via Dwarf-discuss
On Mon, Jul 22, 2024 at 5:29 PM David Blaikie wrote:
>
> On Sun, Jul 21, 2024 at 3:54 PM Jacob Young wrote:
>>
>> > On 12/07/2024 19:04, David Blaikie wrote:
>> > > Thanks for all the context (I noticed you replied directly to me - are
>> > > you happy/OK having this discussion on the mailing list, rather than
>> > > in private? It'd help to keep all the history visible, linkable, etc)
>> > Yes, apologies, that was just a mistake on my part -- I meant to do
>> > this, then realised I accidentally replied to you directly, so went to
>> > forward it to the list, and only now realised that I accidentally
>> > forwarded it to a completely unrelated list :P
>> > > While I can appreciate the desire to make this update O(N) in the
>> > > number of source lines affected - would it be acceptable for this to
>> > > be O(N) in the number of machine/object level functions?
>> > >
>> > > Like if we had a feature for resetting the line table program part-way
>> > > through a line table - would that be adequate? Then your external data
>> > > could keep track of the line number setting operation at the start of
>> > > the new/distinct/independent sequence and update that?
>> > While not ideal for us, this would certainly be an improvement. Dealing
>> > with the relative line number offsets is definitely the biggest pain
>> > point wrt constructing the LNP today.
>> > > Such a feature would have some more generality/usefulness directly
>> > > without external/side data - for instance it would make chunks of the
>> > > line table discardable, which could make it easier for a producer to
>> > > use comdats to isolate the line table data associated with an inline
>> > > function and allow the linker to discard such a contribution if desired.
>>
>> My idea for solving this problem without additional side data is to instead
>> add a line table opcode that references an existing DIE with DECL attributes
>> and sets the state machine's line register to the value of its DW_AT_decl_line
>> attribute.  Similarly, DW_AT_decl_file and DW_AT_decl_column could also
>> be copied to file and column for consistency.  By itself, this doesn't reduce
>> the number of updates required, but it could be combined with an additional
>> DIE tag for representing the source decl before inlining/instantiation and a
>> DIE attribute for referencing the source decl from the inlined/instantiated
>> DIE which would indicate that DECL attributes are copied from the source
>> decl.  That way, the DECL attributes would only ever need to be updated in
>> a single place, the source decl DIE.
>>
>> Instead of creating a new tag, it also seems pretty straightforward to just
>> reuse DW_TAG_subprogram for the source decl.  Since uninstantiated
>> functions would not correspond to any program addresses, a missing
>> DW_AT_low_pc or new flag attribute could indicate this. The intended
>> meaning of DW_AT_abstract_origin already links an inlined function to
>> the source decl.  For instantiations, it's possible to add meaning to either
>> DW_AT_abstract_origin or DW_AT_specification, or create a new attribute.
>> Both of these existing attributes already indicate that the referenced DIE
>> contains some of the attributes of the referencing DIE, but I am probably
>> stretching the intended meaning of the latter too far for this particular use
>> case.  (Actually, I wrote that reading DW_AT_abstract_origin as being only
>> documented as related to inlined functions, but I see now that it explicitly
>> "can be used with almost any debugging information entry" and the name
>> already seems to me to fit this instantiation use case perfectly.)
>>
>> I should note that I'm already in the process of considering whether new
>> tags/attributes will be needed, some to support incremental, and some to
>> support Zig Language concepts that do not exist in DWARF yet.  It seems
>> likely that at least some number of new definitions will be needed anyway,
>> and I would expect that being able to represent source decls in a separate
>> DIE with references from the generated decls is useful independently of
>> whether this proposed line table opcode is accepted.  (Although with my
>> newfound understanding of DW_AT_abstract_origin, I'm now back down
>> to only one new DIE tag and one new DIE attribute to support incremental.)
>>
>> I'm sure this could easily be converted into a more concrete proposal,
>> but I am interested in getting some feedback first, since I don't know if
>> there are any pre-existing constraints preventing .debug_info from being
>> referenced from .debug_line.  As I hope I have demonstrated, this seems
>> much more extensible to new use cases than the previous proposal.
>
>
> Yeah, that'd be the tough part - currently the line table doesn't reference 
> anything in debug_info, and that's both beneficial in some ways (means you 
> can strip everything but the line table and still get symbolized stack traces 
> (& we added debug_line_str and some other