labath wrote:

Thanks for the quick response, Jason.

I think it's quite possible that you haven't run into this situation before, 
because the final representation depends on what tool was used to split the 
functions. Judging by the name (`cold.1`), I think you're using the llvm 
"hot-cold-split" pass -- which does not generate this kind of output. Instead 
it creates a separate function, with its own DWARF description 
(DW_TAG_subprogram) and everything. This is good for unwinding, in that the 
"functions" stay continuous, but maybe not so good for other things (FWICS, the 
synthetic DW_TAG_subprogram does not contain any variable information).

The thing that produces this output is the `-fbasic-block-sections` flag (a.k.a. 
"propeller"). Here we have a single DW_TAG_subprogram, with a DW_AT_ranges 
attribute which lists all of the (discontinuous) parts of the function. However, 
this flag is pretty new, and does not work on darwin yet (probably because no one 
implemented it there). In principle, I don't think the propeller is doing 
anything wrong (DW_AT_ranges exists so that we could describe situations like 
this), but it does break some assumptions in lldb.

The problem begins in `lldb_private::Function`, which assumes that a single 
address range (`m_address_range`) is enough to describe it. The variable 
contains a comment ("The function address range that covers the widest range 
needed to contain all blocks") which could be interpreted to mean that one 
should expect it to also contain some unrelated code, but I don't know if that 
was the intention, and it's definitely not how the unwinder uses this 
information.
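
To make the problem concrete, here is a minimal sketch of the difference between a list of discontinuous ranges and a single covering range. The `Range` struct, `ContainsAddress`, `EnclosingRange` and the addresses in the usage note are hypothetical stand-ins for illustration, not lldb types or APIs.

```cpp
#include <cstdint>
#include <vector>

// Hypothetical half-open [start, end) address range; not an lldb type.
struct Range {
  uint64_t start;
  uint64_t end;
};

// True if addr lies inside any of the function's pieces, i.e. the view a
// DW_AT_ranges-style list gives us.
bool ContainsAddress(const std::vector<Range> &ranges, uint64_t addr) {
  for (const Range &r : ranges)
    if (addr >= r.start && addr < r.end)
      return true;
  return false;
}

// Smallest single range covering every piece, i.e. the "widest range" view.
// Assumes ranges is non-empty.
Range EnclosingRange(const std::vector<Range> &ranges) {
  Range result = ranges.front();
  for (const Range &r : ranges) {
    if (r.start < result.start)
      result.start = r.start;
    if (r.end > result.end)
      result.end = r.end;
  }
  return result;
}
```

For a function split into `{0x1000, 0x1040}` and `{0x3000, 0x3010}`, the address `0x2000` is not in either piece, but it does fall inside the enclosing range `[0x1000, 0x3010)`. Any consumer that only sees the enclosing range would attribute whatever code lives at `0x2000` to this function.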

I've considered (and that's something I'd like to do independently of this 
patch) changing the `lldb_private::Function` interface to vend discontinuous 
ranges, but that still wouldn't directly help the unwinder, as we'd need to 
handle the discontinuity there as well. What it would let us do is handle 
this situation with more finesse. We could e.g. check whether the function 
contains more than one address range, and then choose which range (and from 
which source) to use for caching. I think this is a viable path forward if you 
think this change is too broad.
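
A rough sketch of what such a check could look like, assuming C++17. Everything here is hypothetical: `Range`, `ChooseCacheRange` and the preference order (unwind info first, then the debug-info piece containing the pc) are one possible policy chosen for illustration, not what lldb or this patch does.

```cpp
#include <cstdint>
#include <optional>
#include <vector>

// Hypothetical half-open [start, end) address range; not an lldb type.
struct Range {
  uint64_t start;
  uint64_t end;
};

static bool Contains(const Range &r, uint64_t addr) {
  return addr >= r.start && addr < r.end;
}

// Pick the range under which an unwind plan for `pc` could be cached.
// function_ranges: the function's pieces as described by debug info.
// unwind_range: the range the unwind info (e.g. eh_frame) reports, if any.
std::optional<Range> ChooseCacheRange(const std::vector<Range> &function_ranges,
                                      std::optional<Range> unwind_range,
                                      uint64_t pc) {
  // Contiguous function: the single debug-info range is used as-is.
  if (function_ranges.size() == 1)
    return function_ranges.front();

  // Discontinuous function (assumed policy): prefer the unwind info's range
  // when it covers pc.
  if (unwind_range && Contains(*unwind_range, pc))
    return unwind_range;

  // Otherwise fall back to the individual piece that contains pc.
  for (const Range &r : function_ranges)
    if (Contains(r, pc))
      return r;

  return std::nullopt; // No range is known to cover pc.
}
```

The point of the sketch is only that the multi-range case gets its own branch, so the contiguous case keeps its current behaviour.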

> From a symbol table point of view, if the symbol names haven't been stripped, 
> the function symbol will have the range of the main function body only; the 
> .cold.1 function would be a separate Symbol. In a stripped binary (no symbol 
> name), I know ObjectFileMachO will use the eh_frame start addresses to create 
> fake symbol table names & entries, I don't know if ObjectFileELF does that, 
> but in that case a Symbol would be equivalent to eh_frame as a source of 
> information.

ObjectFileELF does that too. The problems begin only when debug info is present 
because lldb_private::Function will claim the maximal range, as I've described 
above. However, this creation of synthetic symbols from unwind info is perhaps 
a good reason why changing the order of range sources is safe(ish). 
(Although it could be a reason for not querying the eh_frame for range 
information at all.)

https://github.com/llvm/llvm-project/pull/111409
_______________________________________________
lldb-commits mailing list
lldb-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/lldb-commits
