On 11/27/24 3:12 PM, Richard Biener wrote:
> I wonder why you could not always do this for a subset of symbols,
> namely those exported from the current TU, building a symbol
> based on the symbol's assembler name?
>
> That is, I dislike relying on a new flag_lto_debuginfo_assume_unique_filepaths
> flag.

Most of these symbols are exported from the TU; this is the strictest
subset I found. They might not be used later, but in most cases we will
find that out too late (in WPA).

Assembler names are not unique for static or weak symbols.
The only alternative might be hashing the DIE subtree. At least for
DW_TAG_namespace, that can lead to divergence similar to hashing the
entire file, if the entire file is wrapped in the namespace.
It is also unclear how to hash references that point outside the subtree
without essentially hashing the entire file.
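
To illustrate the namespace concern, here is a minimal sketch of subtree hashing on a toy tree. The `Die` class, `subtree_hash`, and `make_unit` are hypothetical helpers invented for this example and do not correspond to GCC's DWARF data structures; SHA-256 is chosen only for convenience.

```python
import hashlib
from dataclasses import dataclass, field


@dataclass
class Die:
    """Toy stand-in for a DWARF DIE: a tag, a name, and child DIEs."""
    tag: str
    name: str
    children: list = field(default_factory=list)


def subtree_hash(die):
    """Hash a DIE together with everything nested below it."""
    h = hashlib.sha256()
    h.update(f"{die.tag}:{die.name}".encode())
    for child in die.children:
        h.update(subtree_hash(child).encode())
    return h.hexdigest()[:16]


def make_unit(extra_function=None):
    """Build a namespace DIE that wraps every declaration of a toy file."""
    members = [Die("DW_TAG_subprogram", "foo"), Die("DW_TAG_variable", "bar")]
    if extra_function is not None:
        members.append(Die("DW_TAG_subprogram", extra_function))
    return Die("DW_TAG_namespace", "ns", members)


before = make_unit()
after = make_unit(extra_function="unrelated")

# The subtree of foo itself is unchanged by the unrelated addition...
foo_stable = subtree_hash(before.children[0]) == subtree_hash(after.children[0])
# ...but the namespace hash moves with any edit inside the wrapped file.
namespace_stable = subtree_hash(before) == subtree_hash(after)
print(foo_stable, namespace_stable)
```

The sketch ignores references that leave the subtree, which is the second open problem: following them would pull ever more of the file into the hash.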

> I'd also really like to see a way to get rid of those symbols at link time :/
> Or at least make them smaller?  For example by hashing the assembler
> name?

For comparison cc1 size:
346708592 without the flag
373766424 with the flag
365183720 with 16 hex digits hash instead of assembler name
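
A minimal sketch of what replacing the assembler name with a 16-hex-digit hash could look like. The hash function is not specified here, so SHA-256 truncated to 64 bits is an assumption; `short_symbol_hash` is a hypothetical helper, and the mangled name is only an illustrative input.

```python
import hashlib


def short_symbol_hash(assembler_name):
    """Return a fixed 16-hex-digit (64-bit) digest of an assembler name."""
    digest = hashlib.sha256(assembler_name.encode("utf-8")).hexdigest()
    return digest[:16]


# A long mangled C++ name shrinks to a fixed-width string.
name = "_ZN9__gnu_cxx13new_allocatorIcE8allocateEmPKv"
hashed = short_symbol_hash(name)
print(len(name), len(hashed))
```

The hash bounds the length of each name but does nothing about the number of symbols.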

The main problem is the number of symbols:
269739 added symbols
 71536 other symbols

So we might be able to halve the size of added symbols, but I would
prefer to focus on removing them entirely.
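
As a rough check of the figures above, the per-symbol overhead can be worked out directly. This assumes the cc1 sizes are in bytes and that the whole size difference is attributable to the added symbols; the variable names are made up for the example.

```python
# Figures copied from the comparison above; sizes are assumed to be bytes.
SIZE_WITHOUT_FLAG = 346_708_592
SIZE_WITH_FLAG = 373_766_424
SIZE_WITH_HASH = 365_183_720
ADDED_SYMBOLS = 269_739
OTHER_SYMBOLS = 71_536

# Extra size caused by the added symbols in each variant.
overhead_name = SIZE_WITH_FLAG - SIZE_WITHOUT_FLAG
overhead_hash = SIZE_WITH_HASH - SIZE_WITHOUT_FLAG

# Average cost per added symbol, and the relative saving from hashing.
per_symbol_name = overhead_name / ADDED_SYMBOLS
per_symbol_hash = overhead_hash / ADDED_SYMBOLS
saving = 1 - overhead_hash / overhead_name

# Share of all symbols that are newly added ones.
added_share = ADDED_SYMBOLS / (ADDED_SYMBOLS + OTHER_SYMBOLS)

print(f"{per_symbol_name:.1f} B/symbol with assembler names")
print(f"{per_symbol_hash:.1f} B/symbol with 16-digit hashes")
print(f"{saving:.1%} saved by hashing, {added_share:.1%} of symbols are added")
```

On these numbers, the 16-digit hash removes roughly a third of the added size, so halving would need further reductions, while the added symbols still make up close to four fifths of the symbol table.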

> The BFD linker has .gnu.lto_* special-casing for sections to discard,
> maybe we can add a special .note section, .note.gnu.discard_syms with
> a list of symbols to discard after link editing?

The alternative could be a special-cased .gnu.lto_* prefix for these
symbols, which could be stripped out more easily after linking with
other linkers.

Though I like the .note idea more, if we can special-case it.
I will try to implement it in the BFD linker.
