Re: [lldb-dev] Making a new symbol provider

Greg Clayton via lldb-dev Tue, 01 Mar 2016 11:37:19 -0800

> On Mar 1, 2016, at 11:30 AM, Zachary Turner <[email protected]> wrote:
> 
> We do know the last line of a function. In the review I posted, you can see 
> the condition where I set is_epilogue to true. That is the last line of a 
> function, corresponding to the } (although the function may contain additional 
> bytes, since that only refers to the first byte of the epilogue).


I don't believe any compilers set is_epilogue correctly yet.

> But I don't know if it's appropriate to set is_terminal_entry to true here 
> because that's a valid line with a valid address. Terminal entries are 
> ignored when doing address lookups so this line would never be found when 
> looking up that address.

An "is_terminal_entry" entry is there to provide the end of the address range 
for the last line in a sequence. Its line number doesn't mean anything, but it 
is typically the same as the previous one.
> 
> What I might be able to do is figure out the size of the epilogue and inject 
> a new entry with address=epilogue_addr+epilogue_size and make that the 
> termination entry. Does that work? If so, what should I set for its line 
> number?

The line number can just be the same as the previous one. We need to make sure 
we cover every byte of a function with a valid line entry. Anywhere the user 
can actually stop should have a valid line entry when possible.
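
A minimal sketch of that injection, assuming a simplified Row struct and an 
AppendTerminalRow helper. Both names are hypothetical and are not LLDB's 
actual line table types; epilogue_addr and epilogue_size are whatever the 
symbol provider can recover for the function:

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical, simplified row; not LLDB's real line table entry type.
struct Row {
    uint64_t addr;           // start address of the row
    uint32_t line;           // source line number
    bool is_terminal_entry;  // true: only closes the previous row's range
};

// Close a sequence by appending a terminal row one byte past the epilogue.
// The line number is copied from the previous row because it has no meaning.
void AppendTerminalRow(std::vector<Row> &seq, uint64_t epilogue_addr,
                       uint64_t epilogue_size) {
    assert(!seq.empty() && "a sequence needs at least one real row");
    const uint32_t last_line = seq.back().line;
    seq.push_back({epilogue_addr + epilogue_size, last_line, true});
}
```

With an epilogue at 0x1020 that is 0x10 bytes long, this appends a terminal 
row at 0x1030, so the row for the closing } covers [0x1020-0x1030).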
> 
> Just to make sure I understand, does "terminal entry" specifically mean the 
> end of a *function*? Reading the code I thought it meant the end of a 
> LineSequence?

No, it just is there to indicate that it terminates the previous line entry 
since line entries are stored with start address only. If a function is 
discontiguous, or if it has data in the middle, a function might have multiple 
sequences. So the terminal entry is just to provide an address range for the 
last line entry in a contiguous address range of line entries.
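
To illustrate why the terminal entry matters when rows store only a start 
address, here is a small sketch of an address lookup. The Row struct and 
LookupLine function are hypothetical simplifications for this example, not 
LLDB's real implementation:

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical, simplified row; not LLDB's real line table entry type.
struct Row {
    uint64_t addr;           // start address only
    uint32_t line;           // source line number
    bool is_terminal_entry;  // true: only marks where the previous row ends
};

// Look up the line for addr in a table sorted by address.
// A row covers [rows[i].addr, rows[i + 1].addr); terminal rows cover nothing.
bool LookupLine(const std::vector<Row> &rows, uint64_t addr, uint32_t &line) {
    for (std::size_t i = 0; i + 1 < rows.size(); ++i) {
        assert(rows[i].addr <= rows[i + 1].addr && "rows must be sorted");
        if (rows[i].is_terminal_entry)
            continue;  // gap between sequences: no line information here
        if (addr >= rows[i].addr && addr < rows[i + 1].addr) {
            line = rows[i].line;
            return true;
        }
    }
    return false;
}
```

For example, given a row at 0x1020 (line 6) followed by a terminal row at 
0x1030, address 0x1025 resolves to line 6, while 0x1030 and anything up to the 
start of the next sequence resolve to nothing.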

> On Tue, Mar 1, 2016 at 10:33 AM Greg Clayton <[email protected]> wrote:
> 
> > On Feb 29, 2016, at 5:51 PM, Zachary Turner <[email protected]> wrote:
> >
> >
> >
> > On Mon, Feb 29, 2016 at 5:49 PM Zachary Turner <[email protected]> wrote:
> > Those are addresses.  Here's the situation I was encountering this on:
> >
> > // foo.h
> > #include "bar.h"
> > inline int f(int n)
> > {
> >     return g(n) + 1;
> > }
> >
> > // bar.h
> > inline int g(int n)
> > {
> >     return n+1;
> > }
> >
> > // foo.cpp
> > #include "foo.h"
> > int main(int argc, char** argv)
> > {
> >     return f(argc);
> > }
> >
> > PDB gives me back line numbers and address range grouped by file.  So I get 
> > all of foo.h's lines, all of bar.h's lines, and all of foo.cpp's lines.  In 
> > sorted form, the lines for g will appear inside the sequence of lines for 
> > f.  So that's how the situation was arising.
> >
> > Just to clarify here.  When I was encountering this problem, I would create 
> > one LineSequence for foo.h's lines, one LineSequence for bar.h's lines, and 
> > one for foo.cpp's.  And each one is monotonically increasing, but the 
> > ranges can overlap as per the previous explanation, which was causing 
> > InsertLineSequence to fail.
> 
> 
> I understand now. Yes, you will need to parse all line entries into one big 
> buffer, sort them by address, and then figure out what sequences to submit 
> after this.
> 
> Is there a termination entry for the last line entry in a function? Let's say 
> there were 4096 byte gaps between "f" and "g" and "main"? Are there 
> termination entries for the last '}' in each function so that when you put 
> all of the line entries into one large collection and sort them by address, 
> that you know there is a gap between the line entries? This is very important 
> to get right. If there aren't termination entries, you will need to add them 
> manually by looking up each line entry address and find the address range of 
> the function (which you can cache at the time of making the line sequences 
> from the sorted PDB line entries) and add termination entries for the ends of 
> functions. So let's say f starts at 0x1000 and the "inline int f" is on line 
> 3, g starts at 0x2000 and main starts at 0x3000, you don't want your line 
> table looking like a single sequence:
> 
> 0x1000: foo.cpp line 4  // {
> 0x1010: foo.cpp line 5  //     return g(n) + 1;
> 0x1020: foo.cpp line 6  // }
> 0x2000: foo.cpp line 10 // {
> 0x2010: foo.cpp line 11 //     return n+1;
> 0x2020: foo.cpp line 12 // }
> 0x3000: foo.cpp line 17 // {
> 0x3010: foo.cpp line 18 //     return f(argc);
> 0x3020: foo.cpp line 19 //  }
> 
> If you don't have termination entries, we will think foo.cpp:6 goes from 
> [0x1020-0x2000) which is probably not what we want.
> 
> There should be termination entries between the functions so that the line 
> entries do not contain gaps between functions in their address ranges. So you 
> should actually have 3 sequences in the line table:
> 
> 0x1000: foo.cpp line 4  // {
> 0x1010: foo.cpp line 5  //     return g(n) + 1;
> 0x1020: foo.cpp line 6  // }
> 0x1030: END
> 
> 0x2000: foo.cpp line 10 // {
> 0x2010: foo.cpp line 11 //     return n+1;
> 0x2020: foo.cpp line 12 // }
> 0x2030: END
> 
> 0x3000: foo.cpp line 17 // {
> 0x3010: foo.cpp line 18 //     return f(argc);
> 0x3020: foo.cpp line 19 //  }
> 0x3030: END
> 
> 0x1030, 0x2030 and 0x3030 are the end addresses of the functions f, g and 
> main respectively. So if your line table only contains start addresses, you 
> will need to inject these correctly; otherwise source level single stepping 
> can do the wrong thing, since it uses line entry address ranges to implement 
> the steps.
> 
> Greg
> 

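The sort-then-split approach described in the quoted message could be sketched 
roughly as follows. This is only an illustration under stated assumptions: 
Row, FuncRange and BuildSequences are hypothetical names, the input rows are 
assumed to carry start addresses only (no terminal entries), and the function 
ranges are assumed to be known and non-overlapping. It is not LLDB's API.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical, simplified types; not LLDB's real line table classes.
struct Row {
    uint64_t addr;
    uint32_t line;
    bool is_terminal_entry;
};
struct FuncRange {
    uint64_t start;  // first byte of the function
    uint64_t end;    // one past the last byte of the function
};

// Sort all rows by address, then emit one sequence per function, each closed
// by a terminal row at the function's end address. Rows that fall outside
// every known function are dropped in this sketch.
std::vector<std::vector<Row>> BuildSequences(std::vector<Row> rows,
                                             std::vector<FuncRange> funcs) {
    std::sort(rows.begin(), rows.end(),
              [](const Row &a, const Row &b) { return a.addr < b.addr; });
    std::sort(funcs.begin(), funcs.end(),
              [](const FuncRange &a, const FuncRange &b) {
                  return a.start < b.start;
              });
    std::vector<std::vector<Row>> sequences;
    std::size_t i = 0;
    for (const FuncRange &f : funcs) {
        assert(f.start < f.end && "function range must be non-empty");
        while (i < rows.size() && rows[i].addr < f.start)
            ++i;  // skip rows before this function
        std::vector<Row> seq;
        while (i < rows.size() && rows[i].addr < f.end)
            seq.push_back(rows[i++]);
        if (seq.empty())
            continue;  // no line information for this function
        const uint32_t last_line = seq.back().line;
        seq.push_back({f.end, last_line, true});  // the "END" row
        sequences.push_back(seq);
    }
    return sequences;
}
```

Fed the nine rows from the example above in any order, plus the ranges 
[0x1000-0x1030), [0x2000-0x2030) and [0x3000-0x3030), this yields three 
sequences ending in terminal rows at 0x1030, 0x2030 and 0x3030.
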
_______________________________________________
lldb-dev mailing list
[email protected]
http://lists.llvm.org/cgi-bin/mailman/listinfo/lldb-dev
