jankratochvil added inline comments.
================ Comment at: source/Plugins/SymbolFile/DWARF/SymbolFileDWARF.cpp:595 + uint64_t debug_info_size = get_debug_info_data().GetByteSize(); + data_segment.m_data.OffsetData(debug_info_size); + } ---------------- jankratochvil wrote: > clayborg wrote: > > jankratochvil wrote: > > > I do not like this `DWARFDataExtractor::m_start` modification, it sort of > > > corrupts the `DataExtractor` and various operations stop working then - > > > such as `DWARFDataExtractor::GetByteSize()`. DWZ patch makes from current > > > `dw_offset_t` a virtual (remapped) offset and introduces new physical > > > file section offset which is looked up for data extraction. The file > > > offset is represented as `DWARFFileOffset` in D40474, instead of `bool > > > m_is_dwz;` there could be some `enum { DEBUG_INFO, DEBUG_TYPES, > > > DWZ_DEBUG_INFO } m_where;` instead. > > This means that this diff doesn't affect all of the other DWARF code. > > Nothing in .debug_types will refer to anything else (not DW_FORM_ref_addr, > > or any external references). So this trick allows us to just treat > > .debug_info as if .debug_types was appended to the end since nothing in > > .debug_types refers to any DIE outside of its type unit. This also mirrors > > what will actually happen with DWARF5 when all of the data is contained > > inside of the .debug_info section. This allows each DIE to have a unique > > "ID". Any other change requires a lot of changes to the DWARF parser and > > logic. So I actually like this feature. We can fix the GetByteSize() if > > needed. Basically every object in DWARf right now must be able to make a > > unique 64 bit unsigned integer ID in some way that we can get back to that > > info and partially parse more. These are handed out as lldb::user_id_t > > values for types, functions and more. Each flavor of DWARF will encode what > > they want into here. The normal DWARF it is just the absolute offset within > > the .debug_info. With .debug_types we just add the size of the .debug_info > > to the ID. For DWARF in .o files on Darwin, we encode the compile unit > > index into the top 32 bits and the DIE offset into the lower, DWO does > > something just as DWZ will need to. DWARFFileOffset doesn't mean much if > > there are multiple files. We have many competing type uniquing/debug info > > size reduction strategies being employed here. I can't believe we have DWO, > > DWZ, and debug types... But we have to make them all work. We can't just > > use the absolute file offset because DWO used external files where the file > > offsets could be the same in the external .o files... Not sure how this > > works with DWZ or what the best option is. I will read up on DWZ so I can > > propose some viable options. But each new flavor of the day that gets added > > the DWARF parser is adding a bunch of logic and edge cases. If two > > technologies (DWZ + DWO, DWZ + debug_types, etc) are used together, we need > > to ensure they can. > > Any other change requires a lot of changes to the DWARF parser and logic. > > So I actually like this feature. > > I agree it is a fine quick&dirty hack. Just if my DWZ support gets accepted > later anyway then this `.debug_types` feature could be implemented by its > framework in a clean way (as a regular DIEs remapping which is required for > DWZ anyway). > > > If two technologies (DWZ + DWO, DWZ + debug_types, etc) are used together, > > we need to ensure they can. > > `DWZ + DWO` do not make sense to me. I haven't tried to use DWZ for DWO but > DWZ finds common DWARF subtrees typically across CUs so it would not find > much. > > `DWZ + debug_types` is explicitly supported by the DWZ tool although that is > IMO for compatibility only, DWZ can make the common type references slightly > smaller than debug_types. I will sure need to implement debug_types support > into my DWZ-for-LLDB patchset later. > > For DWARF in .o files on Darwin, we encode the compile unit index into the > top 32 bits and the DIE offset into the lower BTW that prevents implementing DWARF64 so I wanted to prevent using the 32/32 split. Currently Fedora/RHEL already contains files with `.debug_info` size near the 32-bit limit: ```http://ftp.muni.cz/pub/linux/fedora/linux/development/rawhide/Everything/x86_64/debug/tree/Packages/q/qt5-qtwebkit-debuginfo-5.212.0-0.20.alpha2.fc29.x86_64.rpm /usr/lib/debug/usr/lib64/libQt5WebKit.so.5.212.0-5.212.0-0.20.alpha2.fc29.x86_64.debug .debug_info size = 0x9bff1b08 = 2.4GiB ``` Although that is because neither DWZ (as it would run out of memory) nor `.debug_types` (as it is expected DWZ will handle the duplicities) are applied and so all CUs are separate and even if it was >4GB there would still be no need for 64-bit `DW_FORM_ref_addr`. Still we are already close to the 32-bit limit, also shifting the split for example to 40/24 would make some limit on number of CUs etc. https://reviews.llvm.org/D32167 _______________________________________________ lldb-commits mailing list lldb-commits@lists.llvm.org http://lists.llvm.org/cgi-bin/mailman/listinfo/lldb-commits