Hi Paul, thanks for the clarification. Yes, that was my fault. The LTO object 
DWARF has multiple top-level compile unit tags and only the first one has a 
DW_AT_ranges attribute.

One more thought here: In DWARFDebugInfo::GetCompileUnitAranges() LLDB 
currently tries to read .debug_aranges and if it can’t, it falls back to 
reading DW_AT_ranges from the compile unit tags. Would you have  objections 
from constructing that range from DW_AT_low_pc and DW_AT_high_pc here, if they 
exist? IIUC code for a single compile unit is not guaranteed to be compact, but 
in practice it usually is right?

Thanks
Stefan

> On 18. Sep 2018, at 15:01, paul.robin...@sony.com wrote:
> 
> Are the nested DW_TAG_compile_unit tags in my LTO object missing their 
> DW_AT_ranges or is that expected?
>  
> Compile units should never be nested. That's invalid DWARF.  A unit is the 
> root of the DIE tree described by each unit header. <>
> Whatever is producing one unit contained inside another is doing something 
> wrong.
> --paulr
>  
> From: lldb-dev [mailto:lldb-dev-boun...@lists.llvm.org] On Behalf Of Stefan 
> Gränitz via lldb-dev
> Sent: Friday, September 14, 2018 11:57 AM
> To: LLDB Dev
> Subject: Re: [lldb-dev] Extract compile unit infos from OSO entries for LTO 
> objects
>  
> I made some progress on this and next to many small issues, I found that 
> DWARFDebugAranges::FindAddress() returns 0 as the CU offset for my file 
> addresses and wonder if this is expected?
>  
> DWARFDebugAranges::FindAddress() is supposed to read from .debug_aranges 
> section or DW_AT_ranges attributes right? When dumping with llvm-dwarfdump, 
> it looks like regular objects have exactly one DW_TAG_compile_unit and it has 
> a DW_AT_ranges attribute. Same for the top-level DW_TAG_compile_unit in LTO 
> objects, but all further DW_TAG_compile_unit tags have no DW_AT_ranges 
> attribute here. The comment in DWARFUnit::BuildAddressRangeTable() states 
> that Clang “emits DW_AT_ranges for DW_TAG_compile_units”.
>  
> Are the nested DW_TAG_compile_unit tags in my LTO object missing their 
> DW_AT_ranges or is that expected?
> It looks like FindAddress() would return the offset correctly if it was there 
> and the fix may get simpler.
>  
> Top-level:
> 0x0000000b: DW_TAG_compile_unit
>     ...
>     DW_AT_low_pc.  (0x0000000000000000)
>     DW_AT_ranges.  (0x00000040
>        [0x0000000000000000, 0x00000000000007de)
>        [0x00000000000007f0, 0x00000000000033cd))
>  
> Nested:
> 0x000130b3: DW_TAG_compile_unit
>     ...
>     DW_AT_low_pc   (0x0000000000003f50)
>     DW_AT_high_pc. (0x000000000000dd0f)
>  
> 
> 
> As far as I can tell, we need the actual number of CUs only after we 
> discovered plugins.
>  
> Split that off into HasCompileUnits() and GetNumCompileUnits(). It works well 
> so far. Also managed to extract the single CUs with correct offsets in the 
> LTO object file once GetNumCompileUnits() calls InitOSO(). I kept the logic 
> for associating the CUs with line ranges in the Symtab, so 
> first/last_symbol_id/index are the same for all those CUs. Maybe the code 
> needs a few more adjustments to support this, but so far I don’t see 
> showstoppers.
>  
> Short recap for what works after the simple changes:
> * We iterate over the actual CUs in 
> BreakpointResolverFileLine::SearchCallback().
> * For the matching CU, we get the actual list of support files and find the 
> correct file_idx and line_entry in CompileUnit::ResolveSymbolContext().
> * We correctly link the OSO line table for this CU (with some overhead due to 
> overlapping symbol ranges, but IIUC it’s fine as we only pick the ones for 
> the CU).
>  
>  
> IF you do make a patch, please remove any functions in 
> SymbolFileDWARFDebugMap that are dead code. 
> SymbolFileDWARFDebugMap::GetModuleByOSOIndex() seems to be dead. If that is 
> dead thenvSymbolFileDWARFDebugMap::GetObjectFileByOSOIndex() seems to be dead 
> also.
>  
> Yes both functions are dead and yes sure, I can include that in my patch. 
> Another one is SymbolFileDWARFDebugMap::PrivateFindGlobalVariables(), which 
> is one of the clients for symbol slices/ranges. The remaining ones are 
> SymbolFileDWARFDebugMap::SymbolContainsSymbolWithID/Index() and 
> SymbolFileDWARFDebugMap::CompileUnitInfo::GetFileRangeMap(), which are all 
> alive.
>  
> 
> 
> On 11. Sep 2018, at 20:13, Stefan Gränitz via lldb-dev 
> <lldb-dev@lists.llvm.org <mailto:lldb-dev@lists.llvm.org>> wrote:
>  
> Thanks for your replies!
> 
> 
> Right now we assume that each .o file has only on CU so we don't need to open 
> all .o files in SymbolFileDWARFDebugMap::CalculateAbilities() which is 
> something that gets run when we are trying to figure out which SymbolFile 
> plug-in to load. [...] The problem used to be that even if we had a dSYM 
> file, the loop that selected the symbol file plug-in would give each and 
> every symbol file plugin a chance ...
>  
> Ok makes total sense, kind of historical reason.
>  
> But that being said, in the past I re-ordered how the SymbolFile plug-ins 
> were initialized to always put SymbolFileDWARF first and if it finds DWARF 
> debug info or a dSYM file and has all the abilities then we stop looking for 
> symbol file plug-ins that can load the current file. [...] So as soon as a 
> SymbolFile plug-in can do everything we now stop.
>  
> Good to know!
>  
> Note that counting the number of compile units can be done extremely cheaply 
> (of course, except the cost of opening the file). Each unit as a header that 
> contain its length (which gives you the next unit). I’m not sure we have a 
> primitive that does this without parsing the DWARF, but it should be easy to 
> add.
>  
> Right, so we only need to parse the CU headers. That should be fast.
> Opening each candidate .o file for the relatively rare case of having 
> multiple CUs still sounds expensive, assuming that “thousands of .o files” 
> may actually happen.
>  
> CalculateAbilities() does indeed call GetNumCompileUnits(), but what it 
> really wants to know at this time is “do we have any CU in there”:
> 
> uint32_t SymbolFileDWARFDebugMap::CalculateAbilities() {
>   ...
>   const uint32_t oso_index_count = GetNumCompileUnits();
>   if (oso_index_count > 0) {
>     InitOSO();
>     if (!m_compile_unit_infos.empty()) {
>       return SymbolFile::CompileUnits | SymbolFile::Functions | …;
>   }
> }
>  
> As far as I can tell, we need the actual number of CUs only after we 
> discovered plugins. In my case it’s during breakpoint resolution (i.e. 
> BreakpointResolverFileLine::SearchCallback()). If we separated these two 
> concerns conceptually (into HasCompileUnits() and GetNumCompileUnits()), 
> couldn’t we then also do InitOSO() in two steps? Especially since lazy init 
> is used everywhere already. This would avoid impact on CalculateAbilities() 
> entirely.
>  
> That said, I don’t really know how big the change would get then. And it 
> probably adds complexity, while the implementation is quite complex already.
> Anyway, for now what do you think about the idea?
>  
>  
> Do we really need this CU <-> Symbol mapping?
>  
> It’s used in SymbolFileDWARFDebugMap::SymbolContainsSymbolWithIndex(), which 
> looks like dead code.
> It’s also used in 
> SymbolFileDWARFDebugMap::CompileUnitInfo::GetFileRangeMap(), which 
> initialises the OSO range map once. In order to do that it iterates over all 
> CUs, so changing this or adding a special case here seems possible.
>  
> https://github.com/llvm-mirror/lldb/blob/59608853be9b52d3c01609196d152b3e3dbb4dac/source/Plugins/SymbolFile/DWARF/SymbolFileDWARFDebugMap.cpp#L172
>  
> <https://github.com/llvm-mirror/lldb/blob/59608853be9b52d3c01609196d152b3e3dbb4dac/source/Plugins/SymbolFile/DWARF/SymbolFileDWARFDebugMap.cpp#L172>
>  
> What do you think?
>  
> Best
> Stefan
> 
> 
> On 11. Sep 2018, at 18:03, Frédéric Riss <fr...@apple.com 
> <mailto:fr...@apple.com>> wrote:
> 
> 
> 
> On Sep 11, 2018, at 8:48 AM, Greg Clayton <clayb...@gmail.com 
> <mailto:clayb...@gmail.com>> wrote:
>  
> 
> On Sep 11, 2018, at 2:55 AM, Stefan Gränitz <sgraen...@apple.com 
> <mailto:sgraen...@apple.com>> wrote:
>  
> Hello everyone
>  
> I am investigating a bug that prevents correct breakpoint resolution in LTO 
> objects with embedded DWARF (no separate dSYM file) and tracked it down to 
> the initialization of SymbolFileDWARFDebugMap. This code seems to assume 
> there is only one compile unit per object file, but LTO objects have more 
> than that:
>  
> void SymbolFileDWARFDebugMap::InitOSO() {
> ...
>   const uint32_t oso_index_count =
>     symtab->AppendSymbolIndexesWithTypeAndFlagsValue(
>         eSymbolTypeObjectFile, k_oso_symbol_flags_value, oso_indexes);
> ...
>   m_compile_unit_infos.resize(oso_index_count); // <—— one CU per OSO entry 
> in the Symtab
>  
>   for (uint32_t i = 0; i < oso_index_count; ++i) {
>     const uint32_t so_idx = oso_indexes[i] - 1;
>     const uint32_t oso_idx = oso_indexes[i];
>     const Symbol *so_symbol = symtab->SymbolAtIndex(so_idx);
>     const Symbol *oso_symbol = symtab->SymbolAtIndex(oso_idx);
> ...
>       const Symbol *last_symbol = symtab->SymbolAtIndex(sibling_idx - 1);
>       m_compile_unit_infos[i].first_symbol_index = so_idx;
>       m_compile_unit_infos[i].last_symbol_index = sibling_idx - 1;
>       m_compile_unit_infos[i].first_symbol_id = so_symbol->GetID();
>       m_compile_unit_infos[i].last_symbol_id = last_symbol->GetID();
>  
> The symptom is that LLDB will only read debug_line for one CU and miss all 
> the rest. Thus, breakpoints in other CUs can’t be associated with line 
> information.
>  
> I wonder if there is a good way to populate the correct number of compile 
> units from the OSO entry at this point?
>  
> The reason it is like this is we don't want to have to open all .o files when 
> we parse the debug map in order to figure out a compile unit index. Right now 
> the compile unit UserID is just the index of the .o file in the debug map. 
> Opening thousands of .o files can impose a performance issue.
> 
>  
> The situation may appear similar to an archive file with a number of objects, 
> but then we have separate OSO entries like 
> “path/to/lib/libLLVMMCParser.a(AsmLexer.cpp.o)”. Furthermore LTO objects have 
> one common symtab for all compile units and it was probably mixed up by 
> optimization, so we cannot simply say that CUs start/end at certain symbol 
> indexes as in the above code. The latter is used rarely and only in 
> SymbolFileDWARFDebugMap, so there may be a workaround, but first I have to 
> figure out the initial question:
>  
> How to get more information about compile units in an LTO object? Any ideas 
> welcome!
>  
> The only way is to open each .o file and see how many compile units they 
> contain. Right now we assume that each .o file has only on CU so we don't 
> need to open all .o files in SymbolFileDWARFDebugMap::CalculateAbilities() 
> which is something that gets run when we are trying to figure out which 
> SymbolFile plug-in to load. But that being said, in the past I re-ordered how 
> the SymbolFile plug-ins were initialized to always put SymbolFileDWARF first 
> and if it finds DWARF debug info or a dSYM file and has all the abilities 
> then we stop looking for symbol file plug-ins that can load the current file. 
> The problem used to be that even if we had a dSYM file, the loop that 
> selected the symbol file plug-in would give each and every symbol file plugin 
> a chance to tell us how much info they could extract via a call to 
> SymbolFile::CalculateAbilities() and that would cause us to open all .o files 
> just to say "I parse all debug info" just like a previous plug-in could. So 
> as soon as a SymbolFile plug-in can do everything we now stop.
>  
> If that’s not possible, I may find another way to fix it further down the 
> road, but then the name m_compile_unit_infos seems not exactly correct here. 
> It’s rather something like m_referenced_object_infos, right?
>  
> So now that that change has been in for a while, it might be ok to open each 
> .o file and see how many compile units they contain and then populate 
> m_compile_unit_infos as needed.
>  
> Note that counting the number of compile units can be done extremely cheaply 
> (of course, except the cost of opening the file). Each unit as a header that 
> contain its length (which gives you the next unit). I’m not sure we have a 
> primitive that does this without parsing the DWARF, but it should be easy to 
> add. 
> 
> 
> You will need to watch for any usage of m_compile_unit_infos and make sure it 
> does the correct thing.
>  
> That’s the part I was worried about. The structure of m_compile_unit_infos 
> makes the assumption that we can associate slices of symbols to a compile 
> unit. I don’t think this is a correct assumption to make in the LTO case, and 
> even if it were, we’d need to parse the DWARF and do some pretty heavy 
> processing to extract the information. Do we really need this CU <-> Symbol 
> mapping?
>  
> Fred 
> 
> 
> 
> 
> Btw.: My first attempt was a workaround for the symptom (see 
> https://reviews.llvm.org/D51546 <https://reviews.llvm.org/D51546>). It simply 
> reads all debug_lines for a single CU, but I’d really appreciate a better 
> solution.
>  
> The fix in D51546 seems wrong because the only way we get to a line table is 
> via the DW_AT_stmt_list from a compile unit. If we can fix the LTO case to 
> load all compile units from the LTO.o files with multiple CU's this fix won't 
> be needed.
>  
> So the correct solution is to detect how many compile units are in each .o 
> file and then make sure to find all places that were assuming anything about 
> the OSO index being the compile unit UserID are fixed. Now that plug-in 
> loading stops after a SymbolFile says it can handle everything we can 
> probably do a bit more work. One issue is that .o files might have been 
> cleaned up or removed, so be sure to test any solution by removing the .o 
> files and seeing how we do.
>  
> I will be happy to review any patch you have. I can't think of any other 
> reason the the OSO index needs to be the compile unit index. IF you do make a 
> patch, please remove any functions in SymbolFileDWARFDebugMap that are dead 
> code. SymbolFileDWARFDebugMap::GetModuleByOSOIndex() seems to be dead. If 
> that is dead thenvSymbolFileDWARFDebugMap::GetObjectFileByOSOIndex() seems to 
> be dead also.
>  
> _______________________________________________
> lldb-dev mailing list
> lldb-dev@lists.llvm.org <mailto:lldb-dev@lists.llvm.org>
> http://lists.llvm.org/cgi-bin/mailman/listinfo/lldb-dev

_______________________________________________
lldb-dev mailing list
lldb-dev@lists.llvm.org
http://lists.llvm.org/cgi-bin/mailman/listinfo/lldb-dev

Reply via email to