Hi Paul, thanks for the clarification. Yes, that was my fault. The LTO object DWARF has multiple top-level compile unit tags and only the first one has a DW_AT_ranges attribute.
One more thought here: In DWARFDebugInfo::GetCompileUnitAranges() LLDB currently tries to read .debug_aranges and if it can’t, it falls back to reading DW_AT_ranges from the compile unit tags. Would you have objections from constructing that range from DW_AT_low_pc and DW_AT_high_pc here, if they exist? IIUC code for a single compile unit is not guaranteed to be compact, but in practice it usually is right? Thanks Stefan > On 18. Sep 2018, at 15:01, paul.robin...@sony.com wrote: > > Are the nested DW_TAG_compile_unit tags in my LTO object missing their > DW_AT_ranges or is that expected? > > Compile units should never be nested. That's invalid DWARF. A unit is the > root of the DIE tree described by each unit header. <> > Whatever is producing one unit contained inside another is doing something > wrong. > --paulr > > From: lldb-dev [mailto:lldb-dev-boun...@lists.llvm.org] On Behalf Of Stefan > Gränitz via lldb-dev > Sent: Friday, September 14, 2018 11:57 AM > To: LLDB Dev > Subject: Re: [lldb-dev] Extract compile unit infos from OSO entries for LTO > objects > > I made some progress on this and next to many small issues, I found that > DWARFDebugAranges::FindAddress() returns 0 as the CU offset for my file > addresses and wonder if this is expected? > > DWARFDebugAranges::FindAddress() is supposed to read from .debug_aranges > section or DW_AT_ranges attributes right? When dumping with llvm-dwarfdump, > it looks like regular objects have exactly one DW_TAG_compile_unit and it has > a DW_AT_ranges attribute. Same for the top-level DW_TAG_compile_unit in LTO > objects, but all further DW_TAG_compile_unit tags have no DW_AT_ranges > attribute here. The comment in DWARFUnit::BuildAddressRangeTable() states > that Clang “emits DW_AT_ranges for DW_TAG_compile_units”. > > Are the nested DW_TAG_compile_unit tags in my LTO object missing their > DW_AT_ranges or is that expected? > It looks like FindAddress() would return the offset correctly if it was there > and the fix may get simpler. > > Top-level: > 0x0000000b: DW_TAG_compile_unit > ... > DW_AT_low_pc. (0x0000000000000000) > DW_AT_ranges. (0x00000040 > [0x0000000000000000, 0x00000000000007de) > [0x00000000000007f0, 0x00000000000033cd)) > > Nested: > 0x000130b3: DW_TAG_compile_unit > ... > DW_AT_low_pc (0x0000000000003f50) > DW_AT_high_pc. (0x000000000000dd0f) > > > > As far as I can tell, we need the actual number of CUs only after we > discovered plugins. > > Split that off into HasCompileUnits() and GetNumCompileUnits(). It works well > so far. Also managed to extract the single CUs with correct offsets in the > LTO object file once GetNumCompileUnits() calls InitOSO(). I kept the logic > for associating the CUs with line ranges in the Symtab, so > first/last_symbol_id/index are the same for all those CUs. Maybe the code > needs a few more adjustments to support this, but so far I don’t see > showstoppers. > > Short recap for what works after the simple changes: > * We iterate over the actual CUs in > BreakpointResolverFileLine::SearchCallback(). > * For the matching CU, we get the actual list of support files and find the > correct file_idx and line_entry in CompileUnit::ResolveSymbolContext(). > * We correctly link the OSO line table for this CU (with some overhead due to > overlapping symbol ranges, but IIUC it’s fine as we only pick the ones for > the CU). > > > IF you do make a patch, please remove any functions in > SymbolFileDWARFDebugMap that are dead code. > SymbolFileDWARFDebugMap::GetModuleByOSOIndex() seems to be dead. If that is > dead thenvSymbolFileDWARFDebugMap::GetObjectFileByOSOIndex() seems to be dead > also. > > Yes both functions are dead and yes sure, I can include that in my patch. > Another one is SymbolFileDWARFDebugMap::PrivateFindGlobalVariables(), which > is one of the clients for symbol slices/ranges. The remaining ones are > SymbolFileDWARFDebugMap::SymbolContainsSymbolWithID/Index() and > SymbolFileDWARFDebugMap::CompileUnitInfo::GetFileRangeMap(), which are all > alive. > > > > On 11. Sep 2018, at 20:13, Stefan Gränitz via lldb-dev > <lldb-dev@lists.llvm.org <mailto:lldb-dev@lists.llvm.org>> wrote: > > Thanks for your replies! > > > Right now we assume that each .o file has only on CU so we don't need to open > all .o files in SymbolFileDWARFDebugMap::CalculateAbilities() which is > something that gets run when we are trying to figure out which SymbolFile > plug-in to load. [...] The problem used to be that even if we had a dSYM > file, the loop that selected the symbol file plug-in would give each and > every symbol file plugin a chance ... > > Ok makes total sense, kind of historical reason. > > But that being said, in the past I re-ordered how the SymbolFile plug-ins > were initialized to always put SymbolFileDWARF first and if it finds DWARF > debug info or a dSYM file and has all the abilities then we stop looking for > symbol file plug-ins that can load the current file. [...] So as soon as a > SymbolFile plug-in can do everything we now stop. > > Good to know! > > Note that counting the number of compile units can be done extremely cheaply > (of course, except the cost of opening the file). Each unit as a header that > contain its length (which gives you the next unit). I’m not sure we have a > primitive that does this without parsing the DWARF, but it should be easy to > add. > > Right, so we only need to parse the CU headers. That should be fast. > Opening each candidate .o file for the relatively rare case of having > multiple CUs still sounds expensive, assuming that “thousands of .o files” > may actually happen. > > CalculateAbilities() does indeed call GetNumCompileUnits(), but what it > really wants to know at this time is “do we have any CU in there”: > > uint32_t SymbolFileDWARFDebugMap::CalculateAbilities() { > ... > const uint32_t oso_index_count = GetNumCompileUnits(); > if (oso_index_count > 0) { > InitOSO(); > if (!m_compile_unit_infos.empty()) { > return SymbolFile::CompileUnits | SymbolFile::Functions | …; > } > } > > As far as I can tell, we need the actual number of CUs only after we > discovered plugins. In my case it’s during breakpoint resolution (i.e. > BreakpointResolverFileLine::SearchCallback()). If we separated these two > concerns conceptually (into HasCompileUnits() and GetNumCompileUnits()), > couldn’t we then also do InitOSO() in two steps? Especially since lazy init > is used everywhere already. This would avoid impact on CalculateAbilities() > entirely. > > That said, I don’t really know how big the change would get then. And it > probably adds complexity, while the implementation is quite complex already. > Anyway, for now what do you think about the idea? > > > Do we really need this CU <-> Symbol mapping? > > It’s used in SymbolFileDWARFDebugMap::SymbolContainsSymbolWithIndex(), which > looks like dead code. > It’s also used in > SymbolFileDWARFDebugMap::CompileUnitInfo::GetFileRangeMap(), which > initialises the OSO range map once. In order to do that it iterates over all > CUs, so changing this or adding a special case here seems possible. > > https://github.com/llvm-mirror/lldb/blob/59608853be9b52d3c01609196d152b3e3dbb4dac/source/Plugins/SymbolFile/DWARF/SymbolFileDWARFDebugMap.cpp#L172 > > <https://github.com/llvm-mirror/lldb/blob/59608853be9b52d3c01609196d152b3e3dbb4dac/source/Plugins/SymbolFile/DWARF/SymbolFileDWARFDebugMap.cpp#L172> > > What do you think? > > Best > Stefan > > > On 11. Sep 2018, at 18:03, Frédéric Riss <fr...@apple.com > <mailto:fr...@apple.com>> wrote: > > > > On Sep 11, 2018, at 8:48 AM, Greg Clayton <clayb...@gmail.com > <mailto:clayb...@gmail.com>> wrote: > > > On Sep 11, 2018, at 2:55 AM, Stefan Gränitz <sgraen...@apple.com > <mailto:sgraen...@apple.com>> wrote: > > Hello everyone > > I am investigating a bug that prevents correct breakpoint resolution in LTO > objects with embedded DWARF (no separate dSYM file) and tracked it down to > the initialization of SymbolFileDWARFDebugMap. This code seems to assume > there is only one compile unit per object file, but LTO objects have more > than that: > > void SymbolFileDWARFDebugMap::InitOSO() { > ... > const uint32_t oso_index_count = > symtab->AppendSymbolIndexesWithTypeAndFlagsValue( > eSymbolTypeObjectFile, k_oso_symbol_flags_value, oso_indexes); > ... > m_compile_unit_infos.resize(oso_index_count); // <—— one CU per OSO entry > in the Symtab > > for (uint32_t i = 0; i < oso_index_count; ++i) { > const uint32_t so_idx = oso_indexes[i] - 1; > const uint32_t oso_idx = oso_indexes[i]; > const Symbol *so_symbol = symtab->SymbolAtIndex(so_idx); > const Symbol *oso_symbol = symtab->SymbolAtIndex(oso_idx); > ... > const Symbol *last_symbol = symtab->SymbolAtIndex(sibling_idx - 1); > m_compile_unit_infos[i].first_symbol_index = so_idx; > m_compile_unit_infos[i].last_symbol_index = sibling_idx - 1; > m_compile_unit_infos[i].first_symbol_id = so_symbol->GetID(); > m_compile_unit_infos[i].last_symbol_id = last_symbol->GetID(); > > The symptom is that LLDB will only read debug_line for one CU and miss all > the rest. Thus, breakpoints in other CUs can’t be associated with line > information. > > I wonder if there is a good way to populate the correct number of compile > units from the OSO entry at this point? > > The reason it is like this is we don't want to have to open all .o files when > we parse the debug map in order to figure out a compile unit index. Right now > the compile unit UserID is just the index of the .o file in the debug map. > Opening thousands of .o files can impose a performance issue. > > > The situation may appear similar to an archive file with a number of objects, > but then we have separate OSO entries like > “path/to/lib/libLLVMMCParser.a(AsmLexer.cpp.o)”. Furthermore LTO objects have > one common symtab for all compile units and it was probably mixed up by > optimization, so we cannot simply say that CUs start/end at certain symbol > indexes as in the above code. The latter is used rarely and only in > SymbolFileDWARFDebugMap, so there may be a workaround, but first I have to > figure out the initial question: > > How to get more information about compile units in an LTO object? Any ideas > welcome! > > The only way is to open each .o file and see how many compile units they > contain. Right now we assume that each .o file has only on CU so we don't > need to open all .o files in SymbolFileDWARFDebugMap::CalculateAbilities() > which is something that gets run when we are trying to figure out which > SymbolFile plug-in to load. But that being said, in the past I re-ordered how > the SymbolFile plug-ins were initialized to always put SymbolFileDWARF first > and if it finds DWARF debug info or a dSYM file and has all the abilities > then we stop looking for symbol file plug-ins that can load the current file. > The problem used to be that even if we had a dSYM file, the loop that > selected the symbol file plug-in would give each and every symbol file plugin > a chance to tell us how much info they could extract via a call to > SymbolFile::CalculateAbilities() and that would cause us to open all .o files > just to say "I parse all debug info" just like a previous plug-in could. So > as soon as a SymbolFile plug-in can do everything we now stop. > > If that’s not possible, I may find another way to fix it further down the > road, but then the name m_compile_unit_infos seems not exactly correct here. > It’s rather something like m_referenced_object_infos, right? > > So now that that change has been in for a while, it might be ok to open each > .o file and see how many compile units they contain and then populate > m_compile_unit_infos as needed. > > Note that counting the number of compile units can be done extremely cheaply > (of course, except the cost of opening the file). Each unit as a header that > contain its length (which gives you the next unit). I’m not sure we have a > primitive that does this without parsing the DWARF, but it should be easy to > add. > > > You will need to watch for any usage of m_compile_unit_infos and make sure it > does the correct thing. > > That’s the part I was worried about. The structure of m_compile_unit_infos > makes the assumption that we can associate slices of symbols to a compile > unit. I don’t think this is a correct assumption to make in the LTO case, and > even if it were, we’d need to parse the DWARF and do some pretty heavy > processing to extract the information. Do we really need this CU <-> Symbol > mapping? > > Fred > > > > > Btw.: My first attempt was a workaround for the symptom (see > https://reviews.llvm.org/D51546 <https://reviews.llvm.org/D51546>). It simply > reads all debug_lines for a single CU, but I’d really appreciate a better > solution. > > The fix in D51546 seems wrong because the only way we get to a line table is > via the DW_AT_stmt_list from a compile unit. If we can fix the LTO case to > load all compile units from the LTO.o files with multiple CU's this fix won't > be needed. > > So the correct solution is to detect how many compile units are in each .o > file and then make sure to find all places that were assuming anything about > the OSO index being the compile unit UserID are fixed. Now that plug-in > loading stops after a SymbolFile says it can handle everything we can > probably do a bit more work. One issue is that .o files might have been > cleaned up or removed, so be sure to test any solution by removing the .o > files and seeing how we do. > > I will be happy to review any patch you have. I can't think of any other > reason the the OSO index needs to be the compile unit index. IF you do make a > patch, please remove any functions in SymbolFileDWARFDebugMap that are dead > code. SymbolFileDWARFDebugMap::GetModuleByOSOIndex() seems to be dead. If > that is dead thenvSymbolFileDWARFDebugMap::GetObjectFileByOSOIndex() seems to > be dead also. > > _______________________________________________ > lldb-dev mailing list > lldb-dev@lists.llvm.org <mailto:lldb-dev@lists.llvm.org> > http://lists.llvm.org/cgi-bin/mailman/listinfo/lldb-dev
_______________________________________________ lldb-dev mailing list lldb-dev@lists.llvm.org http://lists.llvm.org/cgi-bin/mailman/listinfo/lldb-dev