clayborg wrote:

> Most of the patch is very clean, but I'm bothered by the 
> `getForeignTUSkeletonCUOffset` function, how it opens up with a mostly 
> redundant (the callers checks this already) call to 
> `getForeignTUTypeSignature`, and then proceeds with a reimplementation of 
> `getCUOffset` (sans the type unit check). I think we could design a better 
> set of APIs for this, but I'm not sure what those is, because I'm unclear of 
> the meaning of various combinations of DW_IDX unit entries.

If we have an entry with a `DW_IDX_type_unit`, it can be a normal type unit, or 
a foreign type unit, that depends on the indexes value. If it is less than the 
number of local type units, then it is a local type unit and there should be no 
`DW_IDX_comp_unit` in the entry, else it is a foreign type unit from a .dwo or 
.dwp file. If we have a .dwp file, we need the `DW_IDX_comp_unit` to be able to 
tell if the foreign type unit made it into the .dwp file, or if this is an 
entry that represents the a different type unit and we should ignore it.  If we 
have no .dwp file, then all entries for foreign type units are valid and we 
don't need to check the compile unit it came from. 

So if you just ask an entry for its `DW_IDX_comp_unit`, you will get an answer 
for an entry from a normal compile unit, or for a foreign type unit with a 
compile unit that will be used to discriminate for the .dwp file case. 

So this function is named `getForeignTUSkeletonCUOffset()` to help say "this 
might have  compile unit entry, but we only want it if this is a foreign type 
unit. I took me a while to understand the spec when it came to foreign type 
units...

> What the new function (I think) essentially does is "give me the CU 
> associated with this entry even if the entry does not describe a DIE in this 
> CU" (because `DW_IDX_die_offset` is relative to a type unit)". I think this 
> API would make sense, even without needing to talk about type units. However, 
> is it actually implementable? For single-CU indexes, that's fine, because we 
> can kind of assume all entries are related to that CU. But what about 
> multi-CU indexes? The only way to determine the CU there is to have an 
> explicit attribute. Are we saying that each entry in a multi-CU index must 
> have a DW_IDX_compile_unit (in addition to a DW_IDX_type_unit)?

Yes, but only if the `DW_IDX_type_unit` represents a foreign type unit, and we 
will only need the `DW_IDX_comp_unit` if we have a .dwp file. But since we 
don't know if the user will use a .dwp file, we always need to put the data for 
the originating compile uint in as a `DW_IDX_comp_unit`.

I would rather have had a `DW_IDX_skeleton_unit` to differentiate between a 
`DW_IDX_comp_unit` and requiring the user to know that they can't just find the 
compile unit, but they must find the non-skeleton compile unit,  before adding 
the `DW_IDX_die_offset`. But we don't have that in the spec...

> 
> If the answer is yes, then this function can be implemented, but then I think 
> the current implementation of `getCUOffset` (which I interpret as "give me 
> the CU offset **IF** this entry is relative to that CU") doesn't make sense 
> -- because it treats entries with an explicit `DW_IDX_compile_unit` different 
> from an implicit/missing `DW_IDX_compile_unit`. 

It doesn't treat them differently. if you ask an entry for its 
`DW_IDX_compile_unit` and it doesn't have one, we can fall back to grabbing a 
CU from the CU table, but only if the CU table has only 1 entry in it.

> And in this world, we assume that `DW_IDX_type_unit` takes precedence over 
> `DW_IDX_compile_unit` -- if both are present then `DW_IDX_die_offset` is 
> relative to the former. And I'm not sure if we would actually want to take up 
> space by putting both values into the index. Looking at existing producers 
> would be interesting, but I'm not sure if there are any (lld does not merge 
> type unit indexes right now, possibly for this very reason). Maybe one could 
> produce something with LTO?

LTO and bolt and soon lld will produce these single tables with multiple 
compile units. 

And we MUST have the `DW_IDX_comp_unit` for foreign type units because type 
units can generate differing content and one copy of the type unit will make it 
into the .dwp and we need to know from which .dwo file it came and only use the 
entries from the matching .dwo file because the offsets can be meaningless for 
type units with the same signature but that came from different .dwo files.

> 
> OTOH, if the answer is no, then the function isn't implementable in the 
> generic case. That doesn't mean you can't implement this feature -- which is 
> useful for guarding against ODR violations (exacerbated by llvm's 
> nonconforming type signature computation algorithm...). However, I think it 
> could be implemented in a much simpler way. For example, lldb could just 
> check if it's looking at a single-CU index, and get the CU offset that way. 
> No extra llvm APIs would be needed.

All of these APIs get the single CU if there is only one for both the case 
where you only have a `DW_IDX_comp_unit` in an entry and for the case where you 
have a `DW_IDX_type_unit` and/or not a `DW_IDX_comp_unit` in the same entry.

So not sure what you are asking for here, but this is the best way I can 
represent the messy state of the .debug_names tables. Monolithic tables with 
multiple CUs are here now from BOLT, and will soon be from lld, so we need to 
be able to handle these.



https://github.com/llvm/llvm-project/pull/87740
_______________________________________________
lldb-commits mailing list
lldb-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/lldb-commits

Reply via email to