Thanks for the quick reply. I'll ask the people responsible for the
Linux parts to try compiling and linking the codebase with
-fno-use-linker-plugin to see what happens. It's a bit disheartening
to hear that LTO support on Windows is behind Linux, though. I'd help
get that up to speed if I could, but I don't even know where to start
or where to look :(
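
For anyone following along, here is a rough sketch of the kind of
experiment suggested above. It is only an illustration under stated
assumptions: the file name (flatten_demo.cpp) and the functions
(trim_queue, scan_one, mix) are made up and do not come from the
actual codebase, and the build commands in the comments assume a
separate, hypothetical main.cpp that calls trim_queue.

```cpp
// flatten_demo.cpp: hypothetical stand-in, not code from the real project.
//
// Possible comparison on Linux (main.cpp is an assumed driver file):
//   g++ -O3 -flto=auto flatten_demo.cpp main.cpp -o with_plugin
//   g++ -O3 -flto=auto -fno-use-linker-plugin flatten_demo.cpp main.cpp -o without_plugin
//   nm -S -C --size-sort with_plugin    | grep trim_queue
//   nm -S -C --size-sort without_plugin | grep trim_queue
#include <cassert>
#include <cstdint>

// Small leaf helper; an ordinary inlining candidate at -O3.
static inline std::uint32_t mix(std::uint32_t x) {
    return x * 2654435761u + 1u;
}

// Mid-level callee. When the caller is marked flatten, GCC tries to
// inline this call and the calls that inlining exposes, where possible.
std::uint32_t scan_one(std::uint32_t item) {
    return mix(item) ^ mix(item >> 3);
}

// flatten asks GCC to inline every call inside this function body,
// if possible, instead of applying the usual inlining limits.
__attribute__((flatten))
std::uint32_t trim_queue(const std::uint32_t* items, unsigned n) {
    assert(items != nullptr || n == 0);  // a null array is only valid when empty
    std::uint32_t acc = 0;
    for (unsigned i = 0; i < n; ++i)
        acc ^= scan_one(items[i]);
    return acc;
}
```

The second column of the nm -S output is the symbol size, so comparing
that column between the two binaries shows whether dropping the linker
plugin changes how much gets inlined. In a program this small the
function may well be inlined away or show no difference at all; the
point is only the shape of the comparison.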

best regards,
Julian

On Mon, Mar 31, 2025 at 8:09 PM Richard Biener
<richard.guent...@gmail.com> wrote:
>
> On Mon, Mar 31, 2025 at 1:20 PM Julian Waters via Gcc <gcc@gcc.gnu.org> wrote:
> >
> > Hi all,
> >
> > I've been trying to chase down an issue that's been driving me insane
> > for a while now. It has to do with the flatten attribute being
> > combined with LTO. I've heard that flatten and LTO are a match made in
> > hell (someone else's words, not mine). From what I observe, several
> > methods marked as flatten on Linux compile to an acceptable size with
> > a reasonable amount of inlining. On Windows, however, the exact same
> > methods have their callees inlined so aggressively that they reach
> > sizes of 5MB per method! Something seems to be different between how
> > inlining works on the 2 platforms. What are the differences (if any)
> > between Linux and Windows when it comes to inlining, particularly
> > involving the flatten attribute? Is there a list of differences that
> > is easily accessible somewhere, or alternatively, is there somewhere
> > in the gcc source where the heuristics are defined that I can
> > decipher?
> >
> > Here's one such example of the differences between Linux and Windows
> > (Both were compiled with the same optimization settings, -O3 and
> > -flto=auto):
> >
> > Linux:
> > 00000000010b12d0 0000000000006289 t G1ParScanThreadState::trim_queue_to_threshold(unsigned int)
> >
> > Windows:
> > 0000000296f9b0c0 0000000000642d40 T G1ParScanThreadState::trim_queue_to_threshold(unsigned int) [clone .constprop.0]
> > 0000000295125480 0000000000630080 T G1ParScanThreadState::trim_queue_to_threshold(unsigned int)
> >
> >
> > Thanks in advance for the help, and for humouring my question
>
> The main difference is that LTO on Linux can use the linker plugin
> to derive information about how TUs are combined while on Windows
> we're using the "collect2 path" which is quite unmaintained and which
> gives imprecise information.  This can already result in quite different
> inlining.  You can "simulate" that on Linux with -fno-use-linker-plugin
> (only for experimenting, don't use this unless necessary).
>
> Richard.
>
> >
> > best regards,
> > Julian
