Re: [lldb-dev] Identifying instructions that definitely access memory
Hi Vangelis, Not sure this will help you, but you can try to compare llvm::MachineInstr::getOpcode() with TargetOpcode::G_LOAD and TargetOpcode::G_STORE if you can obtain a MachineInstr instance. It also may have sense to ask llvm-dev for a proper solution. From: lldb-dev On Behalf Of Vangelis Tsiatsianas via lldb-dev Sent: Tuesday, November 5, 2019 3:43 PM To: via lldb-dev Cc: Vangelis Tsiatsianas Subject: Re: [lldb-dev] Identifying instructions that definitely access memory Hello, I decided to try once more with a follow-up email, since my previous one got no responses (I hope it’s not considered rude to send more than one message in a row for a particular question). To sum up and clarify my previous question, what I need is a way to track memory stores and save both the old and the new value of the memory location being modified. My thinking so far: 1. Recognize the instructions that definitely access memory before they execute, based on their opcode. 2. Tell whether each operand is a register or a memory location. 3. If it’s a memory location, check whether it is a load or store destination. 4. In case it is a store destination, fetch and save current value from memory. 5. Execute instruction. 6. Fetch and save new value from memory. However, I was not able to find a cross-architecture API that covers all of the conditions above and more specifically Instruction::DoesStore() and Operand::IsStoreDestination(). Last but not least, I should notice that the target is executed in single-step mode, so I do have control right before and after the execution of every instruction. Thanks, again, in advance! 🙂 ― Vangelis On 21 Oct 2019, at 08:54, Vangelis Tsiatsianas mailto:vangeli...@icloud.com>> wrote: Hello, I am looking for a way to identify loads, stores and any other kind of instruction that definitely perform memory access and extract the address operand(s), however I was not able to find a cross-architecture API. The closest I stumbled upon are "MCInstrDesc::mayLoad()" and "MCInstrDesc::mayStore()", but I understand that their results are just a hint, so I would then need to examine the instruction name or opcode in order to find out whether it’s actually a load or store and which operand(s) is (are) memory address(es) and also do so for each architecture separately, which I would really like to avoid. Is there a way to identify such instructions either by examining them through the disassembler (e.g. "DoesLoad()" | "DoesStore()") before they execute or right after they perform any kind of memory access? Thank you very much, in advance! 🙂 ― Vangelis ___ lldb-dev mailing list lldb-dev@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/lldb-dev
Re: [lldb-dev] Slow expression evaluation (ASTImporter is too eager?)
Hi Jaroslav, Thanks for working on this. Still, there are things that are unclear for me. I see that `LayoutRecordType` is called superfluously during the second evaluation of `c2.x`, ok. But, could you explain why LLDB is calling that multiple times? Maybe it thinks a type is not completed while it is? As far as I know we keep track which RecordDecl needs completeion in LLDB. At least, we do not store that info in clang::ASTImporter. Minimal import gives a CXXRecordDecl whose `DefinitionData` is set, even though the members are not imported the definition data is there! So, there is no way for clang::ASTImporter to know which RecordDecl had been imported in a minimal way. I suspect there are multiple invocations of ASTImporter::ImportDefinition with C2, C1, C0 within ClangASTSource (could you please check that?). And `ImportDefinition` will blindly import the full definitions recursively even if the minimal import is set (see ASTNodeImporter::IDK_Everything). The patch would change this behavior which I'd like to avoid if possible. I think first we should discover why there are multiple invocations of ASTImporter::ImportDefinition from within LLDB. Gabor On Wed, Nov 6, 2019 at 11:21 PM Raphael “Teemperor” Isemann via lldb-dev < lldb-dev@lists.llvm.org> wrote: > Can you post that patch on Phabricator with an '[ASTImporter]’ as a name > prefix? This way the ASTImporter folks will see your patch (The ASTImporter > is more of a Clang thing, so they might not read lldb-dev). Also it makes > it easier to see/test your patch :) > > (And +Gabor just in case) > > > On Nov 6, 2019, at 10:25 PM, Jaroslav Sevcik via lldb-dev < > lldb-dev@lists.llvm.org> wrote: > > > > Hello, > > > > I noticed that the AST importer is very eager to import classes/structs > that were already completed, even if they are not needed. This is best > illustrated with an example. > > > > struct C0 { int x = 0; }; > > struct C1 { int x = 1; C0* c0 = 0; }; > > struct C2 { int x = 2; C1* c1 = 0; }; > > > > int main() { > > C0 c0; > > C1 c1; > > C2 c2; > > > > return 0; // break here > > } > > > > When we evaluate “c2.x” in LLDB, AST importer completes and imports only > class C2. This is working as intended. Similarly, evaluating “c1.x” imports > just C1 and “c0.x” imports C0. However, if we evaluate “c2.x” after > evaluating “c1.x” and “c0.x”, the importer suddenly imports both C1 and C0 > (in addition to C2). See a log from a lldb session at the end of this email > for illustration. > > > > I believe the culprit here is the following code at the end of the > ASTNodeImporter::VisitRecordDecl method: > > > > if (D->isCompleteDefinition()) > > if (Error Err = ImportDefinition(D, D2, IDK_Default)) > > return std::move(Err); > > > > This will import a definition of class from LLDB if LLDB already happens > to have a complete definition from before. For large programs, this can > lead to importing very large chunks of ASTs even if they are not needed. I > have tried to remove the code above from clang and test performance on > several expressions in an Unreal engine sample - preliminary results show > this could cut down evaluation time by roughly 50%. > > > > My prototype patch is below; note that couple of lldb tests are failing > with a wrong error message, this is a work in progress. What would the > experts here think? Is this a plausible direction? > > > > Cheers, Jaro > > > > diff --git a/clang/lib/AST/ASTImporter.cpp > b/clang/lib/AST/ASTImporter.cpp > > index 54acca7dc62..ebbce5c66c7 100644 > > --- a/clang/lib/AST/ASTImporter.cpp > > +++ b/clang/lib/AST/ASTImporter.cpp > > @@ -2799,7 +2799,7 @@ ExpectedDecl > ASTNodeImporter::VisitRecordDecl(RecordDecl *D) { > >if (D->isAnonymousStructOrUnion()) > > D2->setAnonymousStructOrUnion(true); > > > > - if (D->isCompleteDefinition()) > > + if (!Importer.isMinimalImport() && D->isCompleteDefinition()) > > if (Error Err = ImportDefinition(D, D2, IDK_Default)) > >return std::move(Err); > > > > @@ -3438,6 +3438,9 @@ ExpectedDecl > ASTNodeImporter::VisitFieldDecl(FieldDecl *D) { > >if (ToInitializer) > > ToField->setInClassInitializer(ToInitializer); > >ToField->setImplicit(D->isImplicit()); > > + if (CXXRecordDecl *FieldType = D->getType()->getAsCXXRecordDecl()) > > + if (Error Err = ImportDefinitionIfNeeded(FieldType)) > > + return std::move(Err); > >LexicalDC->addDeclInternal(ToField); > >return ToField; > > } > > @@ -5307,7 +5310,7 @@ ExpectedDecl > ASTNodeImporter::VisitClassTemplateSpecializationDecl( > > > >D2->setTemplateSpecializationKind(D->getTemplateSpecializationKind()); > > > > - if (D->isCompleteDefinition()) > > + if (!Importer.isMinimalImport() && D->isCompleteDefinition()) > > if (Error Err = ImportDefinition(D, D2)) > >return std::move(Err); > > > > > > > > —— lldb session illustrating the unnecessary imports —- > > This shows that evaluation of “c2.x” after evaluation “c1.x” a
Re: [lldb-dev] Setting breakpoint on file and function name doesn't work as expected.
Note first, "break set -f foo.c -l 10" and "break set -f foo.c -n bar" are two very different breakpoints. In the first case, we are looking for a specific file & line combination in the line table or inlined functions info. In the latter case we are searching for a function names bar, and then restricting the results to the matching functions whose definition is in foo.c. It seemed neat to not make "break set" be a three-level command, like: (lldb) break set file-and-line (lldb) break set function-name etc... and use options for everything, but it does make it a little harder to document the cases where one option means something slightly different when used in combination with one or the other option. Anyway, so setting a breakpoint by file name and function name was meant to support the case where you have: $ cat foo.c static int baz() {} ... $ cat bar.c static int baz() {} ... and you want to limit the breakpoint to only the version of foo that was in foo.c. You would do: (lldb) b s -n baz -f foo.c Breakpoint 1: where = a.out`baz + 4 at foo.c:5:10, address = 0x00010ee4 In that case searching in the CU makes sense, and is efficient. Adding the name to a function name breakpoint is really only useful if you have multiple different instances of the same name that you need to disambiguate. In your example, you gain nothing by providing the header file name in which the symbol is defined. I don't think I designed the behavior for when the colliding functions are actually defined in separate header files - I don't remember considering the possibility. So what's happening is just an accidental outcome. It would certainly be possible, though a little trickier to do the filtering based on defining header. You will have to look for the symbol and then chase down its declaration records to make sure they are in the right .h file, and also look for any inlined instances. You probably also want to take some care not to regress the performance of the case where the function is defined in a CU, since that's the more common usage. Jim > On Nov 4, 2019, at 7:16 AM, Konrad Kleine via lldb-dev > wrote: > > I read the LLDB troubleshooting page [1] and found interesting quotes: > > > When setting breakpoints in implementation source files (.c, cpp, cxx, .m, > > .mm, etc), LLDB by > > default will only search for compile units whose filename matches. > > [...] > > % echo "settings set target.inline-breakpoint-strategy always" >> > > ~/.lldbinit > > This tells LLDB to always look in all compile units and search for > > breakpoint locations > > by file and line even if the implementation file doesn’t match. Setting > > breakpoints in header > > files always searches all compile units because inline functions are > > commonly defined in > > header files and often cause multiple breakpoints to have source line > > information that matches > > many header file paths. > > In my email before I did this > > $ lldb -x -b -o "breakpoint set --file foo.h --name foo" ./a.out > > I now added the breakpoint strategy and ran the above command without the -x > in order to pick up > the LLDB init code. Still no luck. > > [1]: https://lldb.llvm.org/use/troubleshooting.html#troubleshooting > > Am Mo., 4. Nov. 2019 um 13:56 Uhr schrieb Konrad Kleine : > Hello, > > I noticed this behavior for LLDB under Linux when setting a breakpoint on a > file and a function name: > > When doing "breakpoint set --file --name ", the > is that of the compile unit (CU) and not necessarily where the > function is defined. This is not what an end-user expects. > > Take this simple example program: > > $ cat foo.h > int foo(){ return 42; } > > $ cat main.c > #include "foo.h" > int main(){return foo();} > > $ clang -g main.c > > As you can see, the function foo is defined in foo.h so it seems natural to > set a breakpoint on foo.h, doesn't it? > > $ lldb -x -b -o "breakpoint set --file foo.h --name foo" ./a.out > (lldb) target create "./a.out" > Current executable set to './a.out' (x86_64). > (lldb) breakpoint set --file foo.h --name foo > Breakpoint 1: no locations (pending). > WARNING: Unable to resolve breakpoint to any actual locations. > > Apparently, LLDB cannot find the symbol like this. Let's try the only other > file that we have in the project: > > $ lldb -x -b -o "breakpoint set --file main.c --name foo" ./a.out > (lldb) target create "./a.out" > Current executable set to './a.out' (x86_64). > (lldb) breakpoint set --file main.c --name foo > Breakpoint 1: where = a.out`foo + 4 at foo.h:1:12, address = > 0x00401114 > > Isn't that remarkable? LLDB uses main.c as the file to search in but then > finds it in foo.h. > > Let's recall what the parameters --file and --name mean: > >-n ( --name ) > Set the breakpoint by function name. Can be repeated multiple > times to make one breakpoint for multiple names > >-f ( --file ) >