On Fri, Jun 25, 2010 at 06:10:56AM -0700, Jan Hubicka wrote:
> When you compile with -Os, the inlining happens only when code size reduces.
> Thus we pretty much care about the code size metrics only.  I suspect the
> problem here might be that normal C++ code needs some inlining to make
> abstraction penalty go away.  GCC -Os implementation is generally tuned for
> CSiBE and it is somewhat C centric (that makes sense for embedded world).  As a
> result we might get quite noticeable slowdowns on C++ apps compiled with -Os
> (and code size growth too since abstraction is never eliminated).  It can be
> seen also at tramp3d (Pooma testcase) where -Os produces a lot bigger and a
> lot slower code.

One would think that in most of the abstraction-penalty cases, the inlined code (often the direct reading or setting of a class data member) should be both smaller and faster than the call, so -Os should inline. Perhaps there are cases where the inlined version is, say, one or two instructions larger than the version with a call, and this causes the degradation? If so, maybe some heuristic could be produced that would inline anyway for a small function?
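
To make the case concrete, here is a minimal sketch of the kind of accessor I have in mind. The class Point and the function bump are hypothetical names made up for illustration; they are not taken from tramp3d or any other real code base.

```cpp
// Hypothetical example: a class whose only abstraction is a pair of
// trivial accessors around one data member.
class Point {
public:
    explicit Point(int x) : x_(x) {}
    int x() const { return x_; }    // trivial getter: a single load
    void set_x(int v) { x_ = v; }   // trivial setter: a single store
private:
    int x_;
};

// Uses both accessors. If they are inlined, the body can reduce to a
// load, an add and a store; if they are not, each use needs argument
// setup plus a call, on top of the out-of-line accessor bodies.
int bump(Point &p) {
    p.set_x(p.x() + 1);
    return p.x();
}
```

One way to check whether this intuition holds would be to compare the assembly from g++ -Os -S with and without -fno-inline; I have not measured it here, so treat the comments about instruction counts as an expectation rather than a result.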