Reading point 3, "Removal of duplicate effort", at http://gcc.gnu.org/wiki/Speedup%20areas, I got the following idea.
First of all, I don't think gcc actually does this today. If I'm wrong, or if the idea isn't worth the effort, just forget about it.

+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=

Many functions in real-world code look like:

    int foo(<some parameters>)
    {
        if (<basic test to check the validity of the parameters>)
            return 0;

        <rest of the function>;
        return 1;
    }

    bar()
    {
        ...
        foo(...);
        ...
    }

That is to say:
* bar has to call foo just to get the parameters checked
* if the check fails, the whole call (pushing and popping arguments, stack adjustment, ...) could have been avoided
* bar may already do some checking of its own before calling foo (sometimes the parameters sent to foo are also parameters of bar, in particular with NULL pointer checks and so on)
* if the test were performed in bar, the other passes of gcc (scheduler, copy propagation, cse, ...) might have more opportunities to do a better job

So I think that gcc could:
* partially inline foo (if its beginning is *simple enough* and can return before any real computation)
* insert the tests at the location where foo is called
* duplicate the code of foo to get 2 versions: the original one, and one without the already inlined code
* call the 2nd version of the function wherever the partially inlined code has been inserted

So in the example, we would get:

    /* still needed if the function is not static ? */
    int foo(<some parameters>)
    {
        if (<basic test to check the validity of the parameters>)
            return 0;

        <rest of the function>;
        return 1;
    }

    int foo_partially_inlined(<some parameters>)
    {
        <rest of the function>;
        return 1;
    }

    bar()
    {
        ...
        if (! <basic test to check the validity of the parameters>)
            foo_partially_inlined(...);
        ...
    }

I hope the basic idea is clear enough. As I said, I don't know at all whether this kind of transformation is worth it, or whether it is possible to implement. Moreover, if code has to be duplicated, code size could increase.
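
To make the idea concrete, here is a minimal hand-written sketch of what the result of the transformation could look like in compilable C. The parameters (a buffer, a length and an output pointer) and the summing body are purely hypothetical stand-ins for <some parameters> and <rest of the function>; this only illustrates the source-level effect and says nothing about how gcc would or does implement it.

```c
#include <stddef.h>

/* Original function, kept because it is not static and other
 * translation units may still call it (hypothetical example body). */
int foo(const int *buf, size_t n, long *out)
{
    if (buf == NULL || out == NULL)   /* basic validity test */
        return 0;

    long total = 0;                   /* rest of the function */
    for (size_t i = 0; i < n; i++)
        total += buf[i];
    *out = total;
    return 1;
}

/* Clone of foo without the leading test. It must only be called
 * from places where the test has already been performed. */
static int foo_partially_inlined(const int *buf, size_t n, long *out)
{
    long total = 0;
    for (size_t i = 0; i < n; i++)
        total += buf[i];
    *out = total;
    return 1;
}

/* Caller after the transformation: the cheap test now sits at the
 * call site, so the failing case costs no function call at all. */
int bar(const int *buf, size_t n, long *out)
{
    if (buf == NULL || out == NULL)
        return 0;
    return foo_partially_inlined(buf, n, out);
}
```

Here bar returns the same value foo would have returned, so the behaviour seen by bar's callers is unchanged; the cost is the duplicated loop in foo_partially_inlined, which is exactly the code size concern mentioned above.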