Re: [lldb-dev] Break setting aliases...

2020-07-23 Thread Pavel Labath via lldb-dev
On 22/07/2020 19:50, Jim Ingham wrote:
>> On Jul 22, 2020, at 12:34 AM, Pavel Labath  wrote:
>>
>> The "--" is slightly unfortunate, but it's at least consistent with our
>> other commands taking raw input. We could avoid that by making the
>> command not take raw input. I think most of the "modes" of the "b"
>> command wouldn't need quoting in most circumstances -- source regex and
>> "lib`func" modes being exceptions.
> 
> If anybody wants to work on this, I think Jonas is right, the first step 
> would be to convert it to an actual command not a regex command.  The 
> _regexp-break command is already hard enough to comprehend.
> 
> You could still do the actual specifier parsing with a series of regex’s if 
> that seems best, though there might be easier ways to do it.  I also don’t 
> think this would need to be a raw command, rather it could be a parsed 
> command with one argument which was the breakpoint specifier and then all the 
> other breakpoint flags.  
> 
> All the specifications you can currently pass to b are single words w/o 
> spaces in them, or if they do have spaces they are the things you are used to 
> having to quote in lldb: like file names with spaces.  
The lib`func notation contains a backtick, which is used for expression
substitution in the command interpreter. Currently we seem to be just
dropping an unmatched backtick, which would break that.  We could change
it so that the unmatched backtick is kept, though I would actually
prefer to make that an error.
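
To make the "unmatched backtick is an error" option concrete, here is a rough Python sketch of a substitution pass that rejects an odd backtick instead of silently dropping it. It is a toy model, not lldb's command interpreter; `substitute_backticks` and the `evaluate` callback are made-up names for illustration.

```python
def substitute_backticks(command, evaluate):
    """Replace each `expr` span with evaluate(expr).

    Hypothetical helper: raises ValueError on an unmatched backtick
    rather than dropping it.
    """
    pieces = command.split("`")
    # An even number of backticks yields an odd number of pieces.
    if len(pieces) % 2 == 0:
        raise ValueError("unmatched backtick in: " + command)
    out = []
    for index, piece in enumerate(pieces):
        # Odd-indexed pieces sat between a pair of backticks.
        out.append(evaluate(piece) if index % 2 else piece)
    return "".join(out)
```

Under this rule a parsed "b libfoo.so`func" would be rejected unless the backtick is quoted, which is the trade-off being discussed.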


>> "br set" help starts with a long list command switches, which are
>> supposed to show which options can be used together. I think this sort
>> of listing is nice when the command has a couple of modes and a few
>> switches, but it really misses the mark when it ends up listing 11 modes
>> with approximately 20 switches in each one.
>>
>> This is then followed by descriptions of the 20 or so switches. This
>> list is alphabetical, which means the most commonly used options end up
>> buried between the switches I've never even used.
> 
> Yes.  I’ve said many times before that making “break set” the master command 
> for breakpoint setting was a mistake.  ...


Restructuring the commands is one thing. It might be a good idea but
there are less invasive things we could do to make this better. Just
restructuring the help output to make the most common use cases easier
to find would help a lot IMO. We could drop or simplify the "synopsis"
section, maybe replacing it with a couple of examples of the most useful
kinds of breakpoints.

Then we could group the options to keep the similar ones together and
make them easier to find/skip. Maybe with groups like:
- options specifying where to set a breakpoint: --file, --line, --name, etc.
- options restricting the reported breakpoint hits:
--thread-id,--thread-name,--condition, etc.
- various modifiers: --disable, --hardware, --one-shot, etc.
- others (?)

The division may not be perfect (like, is --ignore-count a "modifier" or
does it "restrict breakpoint hits"?), but even so I think this would
make that list a lot easier to navigate. But we digress...
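
As a rough illustration of the grouped listing, the sketch below renders options under group headings instead of alphabetically. The group titles and the placement of --ignore-count are assumptions made for the example; only flags named in this thread are listed, and none of this is lldb's actual help code.

```python
# Hypothetical grouping; --ignore-count is parked under "Others" only
# because its proper home is an open question.
OPTION_GROUPS = [
    ("Where to set the breakpoint", ["--file", "--line", "--name"]),
    ("Restricting reported hits",
     ["--thread-id", "--thread-name", "--condition"]),
    ("Modifiers", ["--disable", "--hardware", "--one-shot"]),
    ("Others", ["--ignore-count"]),
]


def format_grouped_help(groups):
    """Render each group as a heading followed by its indented options."""
    lines = []
    for title, options in groups:
        lines.append(title + ":")
        lines.extend("  " + option for option in options)
    return "\n".join(lines)


print(format_grouped_help(OPTION_GROUPS))
```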

On 22/07/2020 20:20, Greg Clayton wrote:
> BTW: to see what things expand to after each regex alias, just set
> this setting first:
>
> (lldb) settings set interpreter.expand-regex-aliases true

That is cool. I wonder if that should be the default...
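
For readers unfamiliar with regex aliases, the following Python sketch models the idea: an ordered table of (pattern, template) pairs, where the first matching pattern rewrites the short form into a full command, optionally echoing the expansion. The patterns are simplified stand-ins written for this example; the real _regexp-break table in lldb has more entries and different regexes.

```python
import re

# Simplified, hypothetical rules; order matters, first match wins.
ALIAS_RULES = [
    (re.compile(r"^(\S+):(\d+)$"), "breakpoint set --file '{0}' --line {1}"),
    (re.compile(r"^(\S+)`(\S+)$"), "breakpoint set --name '{1}' --shlib '{0}'"),
    (re.compile(r"^(\S+)$"), "breakpoint set --name '{0}'"),
]


def expand_alias(argument, show_expansion=False):
    """Return the first matching expansion, optionally echoing it."""
    for pattern, template in ALIAS_RULES:
        match = pattern.match(argument.strip())
        if match:
            expanded = template.format(*match.groups())
            if show_expansion:
                print(expanded)
            return expanded
    raise ValueError("no alias rule matches: " + argument)
```

Calling `expand_alias("main.c:12", show_expansion=True)` plays the role of having the expand-regex-aliases setting turned on in this toy model.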

pl
___
lldb-dev mailing list
lldb-dev@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/lldb-dev


Re: [lldb-dev] RFC: -flimit-debug-info + frame variable

2020-07-23 Thread Pavel Labath via lldb-dev
On 22/07/2020 01:31, Jim Ingham wrote:
> 
> 
>> On Jul 21, 2020, at 9:27 AM, Pavel Labath wrote:
>> I do see the attractiveness of constructing a full compiler type. The
>> reason I am hesitant to go that way is that it seems to me that this
>> would negate the two main benefits of the frame variable command over
>> the expression evaluator: a) it's fast; b) it's less likely to crash.
>>
>> And while I don't think it will be as slow or as crashy as the
>> expression evaluator, the usage of the ast importer will force a lot
>> more types to be parsed than are strictly needed for this functionality.
>> And the insertion of all potentially conflicting types from different
>> modules into a single ast context is also somewhat worrying.
> 
> Importation should be incremental as well, so this shouldn’t make things
> that much slower.  And you shouldn’t ever be looking things up by name
> in this AST so you wouldn’t be led astray that way.  You also are going
> to have to do pretty much the same job for “expr”, right?  So you
> wouldn’t be opening new dangerous pathways.

The import is not as incremental as we might want, and it actually sort
of depends on the state of the source ast. Let's say the source AST
has types A and B, and A depends on B in some way (say as a method
argument). Let's say that A is complete (parsed) and B isn't. While
importing A, the ast importer will import the method which has the B
argument, but it will not descend into B (and cause us to parse it).
If, however, B happens to be already parsed then it will import B and
all of its base classes (but not fields and methods).

On top of that we also have our own additions -- whenever we encounter a
method returning a pointer, we import the pointer target type (this has
to do with covariant return types). These things compound and so even a
simple import can end up importing quite a lot.
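
The dependence on the source AST's state can be mimicked with a small toy model. This is only a sketch of the behaviour described above, not clang's ASTImporter: `ToyType`, `import_type` and the `parsed` flag are invented for the example, and the model assumes that importing any type also brings in its base classes.

```python
from dataclasses import dataclass, field


@dataclass
class ToyType:
    """Hypothetical stand-in for a type in the source AST."""
    name: str
    parsed: bool = False
    bases: list = field(default_factory=list)
    method_arg_types: list = field(default_factory=list)


def import_type(decl, imported, with_methods=True):
    """Record which types an import of `decl` ends up touching."""
    if decl.name in imported:
        return imported
    imported.add(decl.name)
    for base in decl.bases:
        import_type(base, imported, with_methods=False)
    if with_methods:
        for arg in decl.method_arg_types:
            # Only descend into argument types the source already parsed.
            if arg.parsed:
                import_type(arg, imported, with_methods=False)
    return imported
```

With B unparsed, importing A touches only A; once B has been parsed for some unrelated reason, the same import also pulls in B and B's bases.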

I actually tried making the ast importer more lazy -- I have a proof of
concept, but it required adding more explicit lookups into clang's Sema,
so that's why I haven't pursued it yet.

I could also try to disable some of these things for these frame
variable imports (they don't need methods at all), but then I would be
opening new dangerous pathways...


> 
> OTOH, the AST’s are complex beasts, so I am not unmoved by your worries...

Yeah... :)

>> The dlclose issue is an interesting one. Presumably, we could ensure
>> that the module does not go away by storing a module shared (or weak?)
>> pointer somewhere inside the value object. BTW, how does this work with
>> ValueObject casts right now? If I cast a ValueObject to a CompilerType
>> belonging to a different module, does anything ensure this module does
>> not go away? Or when dereferencing a pointer to a type which is not
>> complete in the current module?
> 
> I don’t think at present we do anything smart about this.  It’s just
> always bugged me at the back of my brain that we could get into trouble
> with this, and so I don’t want to do something that would make it worse,
> especially in a systemic way.

Is there a reason we don't store a pointer to the module where the
TypeSystem came from? We could either do that for all ValueObjects,
or just when the type system changes (casts, dereferences of incomplete
types, and now -flimit-debug-info) ?
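
The shared-versus-weak choice can be sketched in Python with `weakref`. This is an analogy only: `ToyModule` and `ToyValueObject` are hypothetical classes, and the sketch relies on CPython's reference counting rather than on anything lldb does today.

```python
import weakref


class ToyModule:
    """Hypothetical stand-in for a loaded module that owns a type system."""

    def __init__(self, name):
        self.name = name


class ToyValueObject:
    """Hypothetical value object that records the module backing its type."""

    def __init__(self, module, keep_alive):
        # A strong reference pins the module; a weak one only observes it.
        self._strong = module if keep_alive else None
        self._weak = weakref.ref(module)

    def module(self):
        """Return the module, or None if it has already been destroyed."""
        return self._weak()
```

A strong reference guarantees the type system stays usable, at the cost of keeping an unloaded module around; a weak one lets the value object notice the module is gone and report an error instead.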

> 
>>
>> I'm hoping that this stuff won't be "hard work". I haven't prototyped
>> the code yet, but I am hoping to keep this lookup code in under 200 LOC.
>> And as Greg points out, there are ways to put this stuff into the type
>> system -- I'm just not sure whether that is needed given that the
>> ValueObject class is the only user of the GetIndexOfChildMemberWithName
>> interface. The whole function is pretty clearly designed with
>> ValueObject::GetChildMemberWithName in mind.
> 
> It seems fine to me to proceed along the lines you propose.  If it ends
> up being smooth sailing, I can’t see any reason not to do it this way.
>  When/If you end up having lots of corner cases to manage, would be the
> time to consider cutting back to using the real type system to back
> these computations.

Ok, sounds good. Let me create a prototype for this, and we'll see how
it goes from there. It may take a while because I'm now entangled in
some line table stuff.


On 21/07/2020 23:23, Greg Clayton wrote:
>> On Jul 21, 2020, at 9:27 AM, Pavel Labath  wrote:
>> The dlclose issue is an interesting one. Presumably, we could ensure
>> that the module does not go away by storing a module shared (or weak?)
>> pointer somewhere inside the value object. BTW, how does this work with
>> ValueObject casts right now? If I cast a ValueObject to a CompilerType
>> belonging to a different module, does anything ensure this module does
>> not go away? Or when dereferencing a pointer to a type which is not
>> complete in the current module?
>
> I am not sure dlclose is a problem, the module won't usually be
> cleaned up. And that shared lib

Re: [lldb-dev] Break setting aliases...

2020-07-23 Thread Jim Ingham via lldb-dev


> On Jul 23, 2020, at 1:51 AM, Pavel Labath  wrote:
> 
> On 22/07/2020 19:50, Jim Ingham wrote:
>>> On Jul 22, 2020, at 12:34 AM, Pavel Labath  wrote:
>>> 
>>> The "--" is slightly unfortunate, but it's at least consistent with our
>>> other commands taking raw input. We could avoid that by making the
>>> command not take raw input. I think most of the "modes" of the "b"
>>> command wouldn't need quoting in most circumstances -- source regex and
>>> "lib`func" modes being exceptions.
>> 
>> If anybody wants to work on this, I think Jonas is right, the first step 
>> would be to convert it to an actual command not a regex command.  The 
>> _regexp-break command is already hard enough to comprehend.
>> 
>> You could still do the actual specifier parsing with a series of regex’s if 
>> that seems best, though there might be easier ways to do it.  I also don’t 
>> think this would need to be a raw command, rather it could be a parsed 
>> command with one argument which was the breakpoint specifier and then all 
>> the other breakpoint flags.  
>> 
>> All the specifications you can currently pass to b are single words w/o 
>> spaces in them, or if they do have spaces they are the things you are used 
>> to having to quote in lldb: like file names with spaces.  
> The lib`func notation contains a backtick, which is used for expression
> substitution in the command interpreter. Currently we seem to be just
> dropping an unmatched backtick, which would break that.  We could change
> it so that the unmatched backtick is kept, though I would actually
> prefer to make that an error.
> 

IMO the backtick parsing in lldb is currently done incorrectly.  In the 
original design, it was supposed to be another form of quote, not a 
preprocessing stage.  I think, for instance, this is more surprising than 
useful:

(lldb) run
Process 44292 launched: '/tmp/a.out' (x86_64)
Process 44292 stopped
* thread #1, queue = 'com.apple.main-thread', stop reason = breakpoint 1.1
frame #0: 0x00013f66 a.out`main at variable.c:7
   4    main()
   5    {
   6      int a = 100;
-> 7      printf("%d\n", a);
          ^
   8      return 0;
   9    }
Target 0: (a.out) stopped.
(lldb) expr char *$my_str = "some `a` other"
(lldb) expr $my_str
(char *) $my_str = 0x0001001423e0 "some 100 other"

And this has tripped people up in the past.  

The intent was to substitute an option value or argument with the result of an 
expression if it was appropriately marked.  If backticks worked that way an 
inter-word backtick would not be significant.  
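
A minimal sketch of that intended behaviour, assuming the command line has already been split into arguments: only an argument wholly wrapped in backticks is replaced by the value of its expression. `substitute_marked_args` and `evaluate` are hypothetical names, not lldb functions.

```python
def substitute_marked_args(args, evaluate):
    """Substitute only arguments wholly wrapped in backticks.

    Hypothetical helper: an argument such as "libfoo.so`func", with a
    backtick in the middle of a word, is passed through untouched.
    """
    result = []
    for arg in args:
        if len(arg) >= 2 and arg.startswith("`") and arg.endswith("`"):
            result.append(evaluate(arg[1:-1]))
        else:
            result.append(arg)
    return result
```

In this model the string literal in the expr example above would reach the expression parser unchanged, because the backticks sit inside a larger argument.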

We could also change the character we print for separating shlib from file name 
both in how we print the spec and in how we encode it in ‘b’.  As long as these 
two are consistent I don’t think folks would much care what it was...


> 
>>> "br set" help starts with a long list command switches, which are
>>> supposed to show which options can be used together. I think this sort
>>> of listing is nice when the command has a couple of modes and a few
>>> switches, but it really misses the mark when it ends up listing 11 modes
>>> with approximately 20 switches in each one.
>>> 
>>> This is then followed by descriptions of the 20 or so switches. This
>>> list is alphabetical, which means the most commonly used options end up
>>> buried between the switches I've never even used.
>> 
>> Yes.  I’ve said many times before that making “break set” the master command 
>> for breakpoint setting was a mistake.  ...
> 
> 
> Restructuring the commands is one thing. It might be a good idea but
> there are less invasive things we could do to make this better. Just
> restructuring the help output to make the most common use cases easier
> to find would help a lot IMO. We could drop or simplify the "synopsis"
> section, maybe replacing it with a couple of examples of the most useful
> kinds of breakpoints.
> 
> Then we could group the options to keep the similar ones together and
> make them easier to find/skip. Maybe with groups like:
> - options specifying where to set a breakpoint: --file, --line, --name, etc.
> - options restricting the reported breakpoint hits:
> --thread-id,--thread-name,--condition, etc.
> - various modifiers: --disable, --hardware, --one-shot, etc.
> - others (?)
> 
> The division may not be perfect (like, is --ignore-count a "modifier" or
> does it "restrict breakpoint hits"?), but even so I think this would
> make that list a lot easier to navigate. But we digress...

This is already partly done.  The primaries for all these settings are always 
listed first, because they are required for a given form.  So you see:

Command Options Usage:
  breakpoint set [-DHd] -l  [-G ] [-C ] [-c ] [-i ] [-o ] [-q ] [-t ]
                 [-x ] [-T ] [-R ] [-N ] [-u ] [-f ] [-m ] [-s ] [-K ]
  breakpoint set [-DHd] -a  [-G ] [-C ] [-c ] [-i ] [-o ] [-q ] [-t ]
                 [-x ] [-T ] [-N ] [-s ]
  breakpoint set [-DHd] -n  [-G ] [-C ] [-c ] [-i ] [-o ] [-q ] [-t ]
                 [-x ] [-T ] [-R ] [-N ] [-f ] [-L ] 

Re: [lldb-dev] RFC: -flimit-debug-info + frame variable

2020-07-23 Thread Jim Ingham via lldb-dev


> On Jul 23, 2020, at 5:15 AM, Pavel Labath  wrote:
> 
> On 22/07/2020 01:31, Jim Ingham wrote:
>> 
>> 
>>> On Jul 21, 2020, at 9:27 AM, Pavel Labath wrote:
>>> I do see the attractiveness of constructing a full compiler type. The
>>> reason I am hesitant to go that way is that it seems to me that this
>>> would negate the two main benefits of the frame variable command over
>>> the expression evaluator: a) it's fast; b) it's less likely to crash.
>>> 
>>> And while I don't think it will be as slow or as crashy as the
>>> expression evaluator, the usage of the ast importer will force a lot
>>> more types to be parsed than are strictly needed for this functionality.
>>> And the insertion of all potentially conflicting types from different
>>> modules into a single ast context is also somewhat worrying.
>> 
>> Importation should be incremental as well, so this shouldn’t make things
>> that much slower.  And you shouldn’t ever be looking things up by name
>> in this AST so you wouldn’t be led astray that way.  You also are going
>> to have to do pretty much the same job for “expr”, right?  So you
>> wouldn’t be opening new dangerous pathways.
> 
> The import is not as incremental as we might want, and it actually sort
> of depends on the state of the source ast. Let's say the source AST
> has types A and B, and A depends on B in some way (say as a method
> argument). Let's say that A is complete (parsed) and B isn't. While
> importing A, the ast importer will import the method which has the B
> argument, but it will not descend into B (and cause us to parse it).
> If, however, B happens to be already parsed then it will import B and
> all of its base classes (but not fields and methods).
> 
> On top of that we also have our own additions -- whenever we encounter a
> method returning a pointer, we import the pointer target type (this has
> to do with covariant return types). These things compound and so even a
> simple import can end up importing quite a lot.
> 
> I actually tried making the ast importer more lazy -- I have a proof of
> concept, but it required adding more explicit lookups into clang's Sema,
> so that's why I haven't pursued it yet.

Anything we can do along these lines will help folks with large projects.  We 
have been getting slower in this area over the years.  But I understand the 
need to tread with caution here.

> 
> I could also try to disable some of these things for these frame
> variable imports (they don't need methods at all), but then I would be
> opening new dangerous pathways...
> 
> 
>> 
>> OTOH, the AST’s are complex beasts, so I am not unmoved by your worries...
> 
> Yeah... :)
> 
>>> The dlclose issue is an interesting one. Presumably, we could ensure
>>> that the module does not go away by storing a module shared (or weak?)
>>> pointer somewhere inside the value object. BTW, how does this work with
>>> ValueObject casts right now? If I cast a ValueObject to a CompilerType
>>> belonging to a different module, does anything ensure this module does
>>> not go away? Or when dereferencing a pointer to a type which is not
>>> complete in the current module?
>> 
>> I don’t think at present we do anything smart about this.  It’s just
>> always bugged me at the back of my brain that we could get into trouble
>> with this, and so I don’t want to do something that would make it worse,
>> especially in a systemic way.
> 
> Is there a reason we don't store a pointer to the module where the
> TypeSystem came from? We could either do that for all ValueObjects,
> or just when the type system changes (casts, dereferences of incomplete
> types, and now -flimit-debug-info) ?
> 

ValueObjects currently treat their types as a computed, not stored, entity.  
There’s not a “CompilerType m_type” ivar, only a pure virtual “CompilerType 
*GetCompilerType”.  But I don’t know whether we’re making use of that fact or 
not.  But we could broadcast a “ModulesChanged” to the ValueObjects as well as 
to the Breakpoints and have them react to that.
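
The broadcast idea amounts to an observer pattern, which can be sketched as follows. Everything here is hypothetical (`ToyTarget`, `ToyTrackedValue`, `on_modules_changed`); it only illustrates value objects reacting to a module-list change the way breakpoints already do.

```python
class ToyTarget:
    """Hypothetical broadcaster for module-list changes."""

    def __init__(self):
        self._listeners = []

    def add_listener(self, listener):
        self._listeners.append(listener)

    def modules_changed(self, unloaded):
        # Tell every listener which modules went away.
        for listener in self._listeners:
            listener.on_modules_changed(unloaded)


class ToyTrackedValue:
    """Hypothetical value object that invalidates itself on unload."""

    def __init__(self, name, module_name):
        self.name = name
        self.module_name = module_name
        self.valid = True

    def on_modules_changed(self, unloaded):
        if self.module_name in unloaded:
            self.valid = False
```

Because the type is computed rather than stored, an invalidated value object could recompute or report an error the next time its type is requested.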

>> 
>>> 
>>> I'm hoping that this stuff won't be "hard work". I haven't prototyped
>>> the code yet, but I am hoping to keep this lookup code in under 200 LOC.
>>> And as Greg points out, there are ways to put this stuff into the type
>>> system -- I'm just not sure whether that is needed given that the
>>> ValueObject class is the only user of the GetIndexOfChildMemberWithName
>>> interface. The whole function is pretty clearly designed with
>>> ValueObject::GetChildMemberWithName in mind.
>> 
>> It seems fine to me to proceed along the lines you propose.  If it ends
>> up being smooth sailing, I can’t see any reason not to do it this way.
>>  When/If you end up having lots of corner cases to manage, would be the
>> time to consider cutting back to using the real type system to back
>> these computations.
> 
> Ok, sounds good. Let me create a prototype for this, and we'll see how
> it goes from there. It may t

Re: [lldb-dev] Remote debug arm bare metal target with lldb - load executable to target

2020-07-23 Thread Greg Clayton via lldb-dev
Sounds like you got close. 

What does your target look like when you type:

(lldb) target list


I forgot to mention one thing that we do for ELF files:

We make sections named PT_LOAD[N] where N starts at zero. We do this because 
this is essentially how dynamic loaders actually load the binary in an OS, so 
it makes it easier for us. All sections from the section headers that are 
contained within a program header will be made children of the PT_LOAD section 
they belong to. It is also interesting to note that any sections that are not 
part of a program header, like say DWARF sections, will be left as top level 
sections. You can easily see how the sections are organized by doing:

(lldb) image dump sections test.elf

Or you can use the python interface to see this more clearly. In the example 
below I use the "lldb.target" global variable to get to the target and use 
special python properties to be able to iterate. Also, the modules list within 
a target has the list of executables where the first entry 
(lldb.target.modules[0]) is the main executable.

(lldb) script
>>> for section in lldb.target.modules[0].sections:
...     print(section)
... 
[0x-0xffd0) libfoo.so.PT_LOAD[0]
[0x0001-0x000101cc) libfoo.so.PT_LOAD[1]
[0x00011000-0x00011004) libfoo.so.PT_LOAD[2]
[0x-0x) libfoo.so..comment
[0x-0x) libfoo.so..ARM.attributes
[0x-0x) libfoo.so..debug_str
[0x-0x) libfoo.so..debug_loc
[0x-0x) libfoo.so..debug_abbrev
[0x-0x) libfoo.so..debug_info
[0x-0x) libfoo.so..debug_ranges
[0x-0x) libfoo.so..debug_macinfo
[0x-0x) libfoo.so..debug_frame
[0x-0x) libfoo.so..debug_line
[0x-0x) libfoo.so..symtab
[0x-0x) libfoo.so..shstrtab
[0x-0x) libfoo.so..strtab

And you can iterate over the subsections within PT_LOAD[0] with python as well:

>>> for section in lldb.target.modules[0].sections[0]:
...     print(section)
... 
[0x0154-0x01ec) libfoo.so.PT_LOAD[0]..note.android.ident
[0x01ec-0x0210) libfoo.so.PT_LOAD[0]..note.gnu.build-id
[0x0210-0x0750) libfoo.so.PT_LOAD[0]..dynsym
[0x0750-0x07f8) libfoo.so.PT_LOAD[0]..gnu.version
[0x07f8-0x0838) libfoo.so.PT_LOAD[0]..gnu.version_r
[0x0838-0x09c4) libfoo.so.PT_LOAD[0]..gnu.hash
[0x09c4-0x0c6c) libfoo.so.PT_LOAD[0]..hash
[0x0c6c-0x11d0) libfoo.so.PT_LOAD[0]..dynstr
[0x11d0-0x12a8) libfoo.so.PT_LOAD[0]..rel.dyn
[0x12a8-0x1980) libfoo.so.PT_LOAD[0]..ARM.exidx
[0x1980-0x1a70) libfoo.so.PT_LOAD[0]..rel.plt
[0x1a70-0x1ad0) libfoo.so.PT_LOAD[0]..ARM.extab
[0x1ad0-0x3cb4) libfoo.so.PT_LOAD[0]..rodata
[0x3cb8-0xfdc4) libfoo.so.PT_LOAD[0]..text
[0xfdd0-0xffd0) libfoo.so.PT_LOAD[0]..plt

We put a '.' character between parent sections and their child sections, so 
this makes things look a bit messy in the output ("libfoo.so" + "." + 
"PT_LOAD[0]" + "." + ".text").

So try doing this:

(lldb) target modules load --file test.elf --load --set-pc-to-entry PT_LOAD[0] <address> PT_LOAD[1] <address> ...

Specify the address for each PT_LOAD program header and let us know if this 
works?
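
If there are many program headers, the command line can be assembled with a few lines of Python. This is a sketch under assumptions: `build_load_command` is a made-up helper, and the addresses in the usage lines are placeholders, not values taken from your ELF file.

```python
def build_load_command(elf_path, load_addresses):
    """Build a 'target modules load' command line.

    `load_addresses` maps section names such as "PT_LOAD[0]" to the
    address each one should be loaded at.
    """
    parts = ["target modules load", "--file", elf_path,
             "--load", "--set-pc-to-entry"]
    for section, address in load_addresses.items():
        parts.extend([section, hex(address)])
    return " ".join(parts)


# Placeholder addresses for illustration only; use the ones from your ELF.
command = build_load_command(
    "test.elf", {"PT_LOAD[0]": 0x08000000, "PT_LOAD[1]": 0x20000000})
print(command)
```

The resulting string can be pasted at the (lldb) prompt, or passed to the debugger from a script.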


Greg

> On Jul 23, 2020, at 12:16 PM,  
>  wrote:
> 
> Hello,
> I tried to do it like you described:
> Create target with 
> ‘target create --arch armv7-none-none test.elf’
> Connect to OpenOCD with
> ‘gdb-remote localhost:‘
> (In the original message, I didn’t mention explicitly that I was connecting 
> to OpenOCD, sorry for that)
> Tried to load the executable to the STM32 with
> ‘target modules load --file test.elf --load --set-pc-to-entry’
> This failed with ‘error: one or more section name + load address pair must be 
> specified’
> I tried again, but this time specified a section and an address.
> It didn’t fail, but didn’t load the specified section to the STM32 either.
> 
> “Is this a baseboard situation where you have an ELF file that has program 
> headers with all of the correct load addresses?”
> Yes, I have an ELF file with multiple LOAD program headers.
> The entry point address and program header addresses are all set correctly.
>  
>  
> From: Greg Clayton  
> Sent: Thursday, July 23, 2020 01:05
> To: lulle2007...@gmail.com
> Cc: lldb-dev@lists.llvm.org
> Subject: Re: [lldb-dev] Remote debug arm bare metal target with lldb - load 
> executable to target
>  
> The --load option s