Hi all,

I've been trying to chase down an issue that has been driving me insane for a while now. It has to do with the flatten attribute being combined with LTO. I've heard that flatten and LTO are a match made in hell (someone else's words, not mine). From what I observe, several methods marked as flatten compile to an acceptable size on Linux, with a reasonable amount of inlining. On Windows, however, the exact same methods have their callees inlined so aggressively that they reach sizes of 5MB per method!

Something seems to differ in how inlining works on the two platforms. What are the differences (if any) between Linux and Windows when it comes to inlining, particularly involving the flatten attribute? Is there an easily accessible list of these differences somewhere, or alternatively, is there a place in the gcc source where the heuristics are defined that I could decipher?
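
For anyone less familiar with the attribute, here is a minimal sketch of the pattern I mean. It is not the actual code: drain_queue, process_one and mix are hypothetical names made up purely for illustration. As I understand the GCC documentation, flatten asks for every call inside the marked function to be inlined where possible, including the calls those inlinings introduce, so a whole call tree can end up in a single function body.

```cpp
// Illustration only: all names here are hypothetical, not taken from the real code.
#include <cassert>
#include <cstdint>

// Leaf helper at the bottom of the call tree.
static inline std::uint32_t mix(std::uint32_t x) {
    x ^= x >> 16;
    x *= 0x45d9f3bu;
    x ^= x >> 16;
    return x;
}

// Mid-level helper; calls the leaf twice.
static std::uint32_t process_one(std::uint32_t task) {
    return mix(task) + mix(task + 1u);
}

// flatten: GCC tries to inline every call in this body, and every call
// that those inlined bodies contain in turn, where that is possible.
__attribute__((flatten))
std::uint32_t drain_queue(const std::uint32_t* tasks, unsigned n, unsigned threshold) {
    assert(tasks != nullptr || n == 0);
    std::uint32_t acc = 0;
    // Process entries from the back until only `threshold` entries remain.
    for (unsigned i = n; i > threshold; --i) {
        acc += process_one(tasks[i - 1]);
    }
    return acc;
}
```

In this toy case the flattened body stays tiny on any platform. The call tree under the real method is presumably much deeper, which is where I would expect the size to blow up.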

Here's one example of the difference between Linux and Windows (both were compiled with the same optimization settings, -O3 and -flto=auto):

Linux:
00000000010b12d0 0000000000006289 t G1ParScanThreadState::trim_queue_to_threshold(unsigned int)

Windows:
0000000296f9b0c0 0000000000642d40 T G1ParScanThreadState::trim_queue_to_threshold(unsigned int) [clone .constprop.0]
0000000295125480 0000000000630080 T G1ParScanThreadState::trim_queue_to_threshold(unsigned int)

Thanks in advance for the help, and for humouring my question.

Best regards,
Julian