[lldb-dev] RFC: -flimit-debug-info + frame variable

2020-07-20 Thread Pavel Labath via lldb-dev
Hello all,

With the expression evaluation part of -flimit-debug-info nearly
completed, I started looking at doing the same for the "frame variable"
command.

I had thought this would be simpler than the expression evaluator,
since we control most of that code. However, I have hit one snag, hence
this RFC.

The problem centers around how to implement
ValueObject::GetChildMemberWithName, which is the engine of the
subobject resolution in the "frame variable" command. Currently, this
function delegates most of the work to
CompilerType::GetIndexOfChildMemberWithName, which returns a list of (!)
indexes needed to access the relevant subobject. The list aspect is
important, because the desired object can be in a base class or in a C11
anonymous struct member.
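
To illustrate why a list of indexes is needed, here is a minimal sketch
using a toy type model. ToyType, ToyChild and FindIndexPath are
hypothetical names made up for this example; they are not LLDB APIs, and
the real child numbering may differ.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Toy model, not LLDB's real classes: a type is a list of children, where a
// child is a base class, an anonymous struct member, or a named field.
struct ToyType;

struct ToyChild {
  std::string name;    // empty for an anonymous struct member
  bool is_base;        // true if this child is a base class subobject
  const ToyType *type; // nested type, or nullptr for a scalar field
};

struct ToyType {
  std::string name;
  std::vector<ToyChild> children;
};

// Returns the chain of child indexes leading to `member`, or an empty vector
// if the name is not found. Descends into base classes and anonymous structs.
// Assumes `member` is a non-empty name.
std::vector<std::size_t> FindIndexPath(const ToyType &type,
                                       const std::string &member) {
  for (std::size_t i = 0; i < type.children.size(); ++i) {
    const ToyChild &child = type.children[i];
    if (!child.is_base && child.name == member)
      return {i};
    const bool transparent = child.is_base || child.name.empty();
    if (transparent && child.type) {
      std::vector<std::size_t> rest = FindIndexPath(*child.type, member);
      if (!rest.empty()) {
        rest.insert(rest.begin(), i); // prepend the hop into this child
        return rest;
      }
    }
  }
  return {};
}
```

A member found directly yields a one-element path, while a member living
in a base class or anonymous struct yields one index per hop.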

The CompilerType instance in question belongs to the type system of the
module from which we retrieved the original variable. Therein lies the
problem -- this type system does not have complete information about the
contents of the base class subobjects.

Now, my question is what to do about it. At the moment, it seems to me
that the easiest solution to this problem would be to replace
CompilerType::GetIndexOfChildMemberWithName, with two new interfaces:
- Get(IndexOf)**Direct**ChildMemberWithName -- return any direct
children with the given name
- IsTransparent -- whether to descend into the type during name lookups
(i.e., is this an anonymous struct member)

The idea is that these two functions (in conjunction with existing
methods) can provide their answers even in a -flimit-debug-info setting,
and they also provide enough information for the caller to perform the
full name lookup itself. It would first check for direct members, and
if no matches are found, (recursively) proceed to look in all the
transparent members and base classes, switching type systems if the
current one does not contain the full type definition.
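
The lookup described above could be sketched roughly as follows. This is
a toy model under stated assumptions: ToyTypeSystem, FindCompleteType and
FindMember are hypothetical names, a "type system" is reduced to a map
from type name to definition, and searching transparent members before
base classes is an arbitrary choice made for the example.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <vector>

// Toy model; every name here is hypothetical and none of this is LLDB API.
struct ToyField {
  std::string name;      // empty for an anonymous (transparent) member
  std::string type_name; // name of the field's type
};

struct ToyType {
  bool is_complete;               // false: forward declaration only
  std::vector<std::string> bases; // names of direct base classes
  std::vector<ToyField> fields;   // direct members
};

// One toy "type system" per module: the definitions that module knows about.
using ToyTypeSystem = std::map<std::string, ToyType>;

struct Resolved {
  const ToyType *type;         // nullptr if no complete definition exists
  const ToyTypeSystem *system; // the type system the definition came from
};

// Prefer the current type system; otherwise switch to any other one that
// holds a complete definition of the type.
Resolved FindCompleteType(const std::string &name, const ToyTypeSystem &current,
                          const std::vector<const ToyTypeSystem *> &all) {
  auto it = current.find(name);
  if (it != current.end() && it->second.is_complete)
    return {&it->second, &current};
  for (const ToyTypeSystem *ts : all) {
    auto other = ts->find(name);
    if (other != ts->end() && other->second.is_complete)
      return {&other->second, ts};
  }
  return {nullptr, nullptr};
}

// On success, appends the hops leading to `member` to `path` and returns true.
bool FindMember(const std::string &type_name, const std::string &member,
                const ToyTypeSystem &current,
                const std::vector<const ToyTypeSystem *> &all,
                std::vector<std::string> &path) {
  Resolved r = FindCompleteType(type_name, current, all);
  if (!r.type)
    return false;
  // Step 1: direct members with the requested name win.
  for (const ToyField &f : r.type->fields)
    if (!f.name.empty() && f.name == member) {
      path.push_back(member);
      return true;
    }
  // Step 2: recurse into transparent (anonymous) members.
  for (const ToyField &f : r.type->fields)
    if (f.name.empty()) {
      path.push_back("<anonymous>");
      if (FindMember(f.type_name, member, *r.system, all, path))
        return true;
      path.pop_back();
    }
  // Step 3: recurse into base classes.
  for (const std::string &base : r.type->bases) {
    path.push_back(base);
    if (FindMember(base, member, *r.system, all, path))
      return true;
    path.pop_back();
  }
  return false;
}
```

The sketch stops at the first match, so it ignores ambiguity, shadowing
across siblings, and virtual inheritance.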

The downside of that is that this would hardcode a specific, C++-based
algorithm which may not be suitable for all languages. Swift has a
fairly simple inheritance model, so I don't think this should be a
problem there, but Python, for example, uses a slightly different method
to resolve ambiguities. The second downside is that a faithful
implementation of the C++ model, including the virtual inheritance
dominance rule, is going to be fairly complicated.

The first issue could be solved by moving this logic into the clang
plugin, but making it independent of any specific type system instance.
The second issue is unavoidable, except by creating a unified view of
the full type in some scratch AST context, as we do for expression
evaluation.

That said, it's not clear to me how faithful we need the "frame
variable" algorithm to be. The frame variable syntax does not precisely
follow the C++ semantics anyway. And a simple "recurse into subclasses"
algorithm is going to be understandable and "close enough" under
nearly all circumstances. Virtual inheritance is used very seldom, and
shadowing of members defined in a base class is even rarer.
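
For reference, the two C++ rules in question can be shown with a small
example (all type and function names below are made up for
illustration). Plain shadowing is handled by any "direct members first"
lookup. The dominance case is the one where a first-match, depth-first
search that visits bases in declaration order would reach Top::id through
Right before it ever looks at Left.

```cpp
#include <cassert>

// Plain shadowing: the derived member hides the base member.
struct Base {
  int value = 1;
};
struct Shadowing : Base {
  int value = 2; // hides Base::value
};

// Virtual inheritance dominance: Left::id hides Top::id, and because Top is
// a shared virtual base, Left::id also wins over the Top::id reachable
// through Right. The lookup of `id` in Bottom is therefore unambiguous.
struct Top {
  int id = 10;
};
struct Left : virtual Top {
  int id = 20;
};
struct Right : virtual Top {};
struct Bottom : Right, Left {};

int ShadowedValue() {
  Shadowing s;
  return s.value; // Shadowing::value
}

int BaseValue() {
  Shadowing s;
  return s.Base::value; // explicitly qualified access to the hidden member
}

int DominantId() {
  Bottom b;
  return b.id; // Left::id, regardless of the order of the bases
}
```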

While analysing this code I've found much more serious bugs (e.g.,
accessing a transparent member fetches a random other value if the class
it is in also has base classes; fetching a transparent member in a base
class does not work at all), which seem to have existed for quite some
time without being discovered.

For that reason I am tempted to just implement a basic "recurse into
subclasses" algorithm inside ValueObject::GetChildMemberWithName, and
not bother with the virtual inheritance details, nor with being able to
customize this algorithm to the needs of a specific language.

What do you think?

regards,
Pavel
___
lldb-dev mailing list
lldb-dev@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/lldb-dev


Re: [lldb-dev] RFC: -flimit-debug-info + frame variable

2020-07-20 Thread Greg Clayton via lldb-dev
Thanks for the write up!

I agree that the existing APIs are useful for exploring the types as they
appear in (and are completed within) the module they came from. Now we are
asking more complex questions of them. As with all software, things started
out simple and have gotten quite a bit more complex as we went along and added 
interfaces and new requirements.

My first thought was that as soon as you dive into a CompilerType with a 
question about the type itself, or anything contained within, where any parts 
of the type can be incomplete, you can't just use a CompilerType on its own. 
Each CompilerType might be able to tell you its module, or possibly only a
target (for expression ASTs, or target ASTs for expression results), but it
might not have access to a target if the type comes from a module.

So we need new API calls where you must supply a target so that you can find 
the type in other modules when you find something that we know is incomplete 
and need the real type. We can add new APIs to the TypeSystem class to take 
care of this and these APIs will need to take a target to allow finding types 
outside of the current module's types. If we add new APIs and switch any code 
that requires resolving of types on the fly over to using the new APIs, we 
might be able to leave the old APIs in place or remove them if they are no 
longer used after the refactor.

Or the other option is to try and leave the TypeSystem and CompilerType stuff 
alone and add a new "TargetType" class that has a target + CompilerType. And 
lookups on those types can be smart about where they grab types? They could 
still call through to new TypeSystem virtual functions that use the target for 
resolving types. 
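
A rough sketch of what such a "TargetType" wrapper might look like,
using toy stand-ins rather than the real classes. ToyCompilerType,
ToyModule, ToyTarget and ToyTargetType are hypothetical, and matching
types by name is an assumption made only to keep the example small.

```cpp
#include <algorithm>
#include <cassert>
#include <map>
#include <string>
#include <vector>

// Toy stand-ins; these are hypothetical and much simpler than LLDB's classes.
struct ToyCompilerType {
  std::string name;
  bool is_complete;                 // false for a forward declaration
  std::vector<std::string> members; // member names, known only when complete
};

struct ToyModule {
  std::string name;
  std::map<std::string, ToyCompilerType> types;
};

struct ToyTarget {
  std::vector<const ToyModule *> modules;
};

// Pairs a type with the target it is viewed from, so that lookups can
// consult other modules when the wrapped type is incomplete.
class ToyTargetType {
public:
  ToyTargetType(const ToyTarget &target, const ToyCompilerType &type)
      : m_target(&target), m_type(&type) {}

  // Returns the wrapped type if it is complete; otherwise the first complete
  // definition with the same name in the target's modules; otherwise the
  // wrapped (incomplete) type.
  const ToyCompilerType &GetCompleteType() const {
    if (m_type->is_complete)
      return *m_type;
    for (const ToyModule *mod : m_target->modules) {
      auto it = mod->types.find(m_type->name);
      if (it != mod->types.end() && it->second.is_complete)
        return it->second;
    }
    return *m_type;
  }

  bool HasMember(const std::string &member) const {
    const std::vector<std::string> &members = GetCompleteType().members;
    return std::find(members.begin(), members.end(), member) != members.end();
  }

private:
  const ToyTarget *m_target;
  const ToyCompilerType *m_type;
};
```

The point of the wrapper is that callers never hold a bare type when the
answer may depend on other modules; the target always travels with it.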

More comments inlined below between your paragraphs.



> On Jul 20, 2020, at 5:34 AM, Pavel Labath  wrote:
> 
> Hello all,
> 
> With the expression evaluation part of -flimit-debug-info nearly
> completed, I started looking at doing the same for the "frame variable"
> command.
> 
> I had thought this would be simpler than the expression evaluator,
> since we control most of that code. However, I have hit one snag, hence
> this RFC.
> 
> The problem centers around how to implement
> ValueObject::GetChildMemberWithName, which is the engine of the
> subobject resolution in the "frame variable" command. Currently, this
> function delegates most of the work to
> CompilerType::GetIndexOfChildMemberWithName, which returns a list of (!)
> indexes needed to access the relevant subobject. The list aspect is
> important, because the desired object can be in a base class or in a C11
> anonymous struct member.
> 
> The CompilerType instance in question belongs to the type system of the
> module from which we retrieved the original variable. Therein lies the
> problem -- this type system does not have complete information about the
> contents of the base class subobjects.

Yes, and this requires a target (or a module list from the target, to be more
precise) in order to answer the questions.

> 
> Now, my question is what to do about it. At the moment, it seems to me
> that the easiest solution to this problem would be to replace
> CompilerType::GetIndexOfChildMemberWithName, with two new interfaces:
> - Get(IndexOf)**Direct**ChildMemberWithName -- return any direct
> children with the given name
> - IsTransparent -- whether to descend into the type during name lookups
> (i.e., is this an anonymous struct member)
> 
> The idea is that these two functions (in conjunction with existing
> methods) can provide their answers even in a -flimit-debug-info setting,
> and they also provide enough information for the caller to perform the
> full name lookup itself. It would first check for direct members, and
> if no matches are found, (recursively) proceed to look in all the
> transparent members and base classes, switching type systems if the
> current one does not contain the full type definition.
> 
> The downside of that is that this would hardcode a specific, C++-based
> algorithm which may not be suitable for all languages. Swift has a
> fairly simple inheritance model, so I don't think this should be a
> problem there, but Python, for example, uses a slightly different method
> to resolve ambiguities. The second downside is that a faithful
> implementation of the C++ model, including the virtual inheritance
> dominance rule, is going to be fairly complicated.

Sounds like that can work easily for C/C++. I would prefer to leave things up 
to the type systems for name lookups so they can each do the lookup however 
they can by using the type itself and/or looking up completed types in the
target's modules. The fact that the current solution for name lookup relies on 
indexes was just a convenience and happened to work for C/C++ and the static 
typing we have had up until your new support. The index solution isn't required 
in any new solution.



Re: [lldb-dev] RFC: -flimit-debug-info + frame variable

2020-07-20 Thread Jim Ingham via lldb-dev
It seems like you are having to work hard in the ValueObject system because you
don’t want to use a single AST type for the ValueObject’s type.  It seems like it
would be much simpler if you could cons up a complete type in the
ScratchASTContext, and then use the underlying TypeSystem to do the layout
computation.

Preserving the full type in the scratch context also avoids other problems.  
For instance, suppose module A has a class that has an opaque reference to a 
type B.  There is a full definition of B in modules B and C.  If you make up a 
ValueObject for an object of type A resolving the full type to the one in 
Module B you can get into trouble.  Suppose the next user step is over the 
dlclose of module B.  When the local variable goes to see if it has changed, it 
will stumble across a type reference to a module that’s no longer present in 
the program.  And if somebody calls RemoveOrphanedModules it won’t even be in 
the shared module cache.

You can try to finesse this by saying you can choose the type from the defining
module so it can’t go away.  But (a) I don’t think you can know that for
non-virtual classes in C++, and (b) I don’t think you can guarantee you know how
to do that for any given language.

I wonder if it wouldn’t be a better approach to build up a full compiler-type 
by importing the types you find into the scratch AST context.  That way you 
know they can’t go away.   And since you still have a full CompilerType for the 
variable, you can let the languages tell you where to find children based on 
their knowledge of the types.
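
The lifetime argument can be sketched with a toy model. ToyModule and
ToyScratchContext are hypothetical, and Import here is just a copy keyed
by type name; the real AST import machinery is far more involved.

```cpp
#include <cassert>
#include <map>
#include <memory>
#include <string>
#include <vector>

// Toy type; a hypothetical stand-in for a full type definition.
struct ToyType {
  std::string name;
  std::vector<std::string> members;
};

// A module owns its types; destroying the module (think dlclose followed by
// the module being dropped) destroys them too.
struct ToyModule {
  std::map<std::string, std::shared_ptr<ToyType>> types;
};

// The scratch context keeps its own copies, independent of module lifetime.
class ToyScratchContext {
public:
  // Copies `type` into the scratch context, or returns the copy made earlier
  // for a type with the same name.
  std::shared_ptr<const ToyType> Import(const ToyType &type) {
    std::shared_ptr<ToyType> &slot = m_types[type.name];
    if (!slot)
      slot = std::make_shared<ToyType>(type);
    return slot;
  }

private:
  std::map<std::string, std::shared_ptr<ToyType>> m_types;
};
```

A reference into the module's own type goes stale once the module is
gone, while the imported copy stays usable for as long as the scratch
context (or the returned pointer) lives.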

Jim



Re: [lldb-dev] RFC: -flimit-debug-info + frame variable

2020-07-20 Thread Jim Ingham via lldb-dev
BTW, Adrian already did something along these lines for the dynamic types of 
swift “frame variable” values.  In that case, the dynamic type is quite likely 
from some entirely unrelated module.  Swift makes a lot of use of protocols, so 
code is going to pass through your module that shares no actual types in common 
with it…  So you really don’t want to pollute the module's TypeSystem with 
these unrelated dynamic types.  His solution was to put the dynamic type 
results in the scratch AST context.  I don’t remember the details of his 
implementation, but he probably does…

Jim



[lldb-dev] LLVM 10.0.1-final has been tagged

2020-07-20 Thread Tom Stellard via lldb-dev

Hi,

I've tagged 10.0.1-final.  Testers, please begin uploading your binaries.

Thanks,
Tom
