------- Comment #2 from gb-0001 at xsim dot com 2009-11-29 17:34 -------
>[For the call in the loop GCC assumes it is more beneficial]
And in this case it is: the inner loop code is even simpler than the prologue/epilogue code.

>[If you are sure it is always beneficial...]
It is not always beneficial. It is close enough to always if the "op" parameter is a compile-time constant, and "op" usually is a compile-time constant. Taking advantage of that would require annotating the call site with conditional inlining information. Is that possible in GCC?

>[It is unlikely fixed in 4.4]
This is not important (for me) to fix in 4.4 -- the code is not yet public and, even when it is, it is not clear anybody else will use it. My principal concerns are that it would be nice if my code were faster, and that this may represent a class of lost optimizations for others. I filed this ticket at reduced severity to reflect that; feel free to adjust priority/severity (or tell me what to change).

>[As 4.5 works...]
My reading is that 4.5 inlines it if told to always_inline, but inlining is a loss when "op" is a runtime variable -- it would inline the code up to about 20 times without being able to optimize any inlined copy. Is there a way to annotate "inline if op is a compile-time constant"?

--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=42209
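
To make the "inline if op is a compile-time constant" request concrete, here is a minimal sketch of how it might be approximated by hand with GCC's `__builtin_constant_p`. Everything named here (`reduce`, `reduce_body`, `reduce_outline`, `OP_ADD`, `OP_MUL`) is a hypothetical stand-in, since the real function is not shown in this report, and whether a given GCC release folds the constant case as hoped is not verified here.

```c
#include <stddef.h>

/* Hypothetical operation codes; the real "op" values are not in this report. */
enum { OP_ADD = 0, OP_MUL = 1 };

/* Shared body, forced inline so that a constant "op" lets the compiler
 * fold the per-element test away in each inlined copy. */
static inline __attribute__((always_inline))
long reduce_body(int op, const long *v, size_t n)
{
    size_t i;
    long acc = (op == OP_MUL) ? 1 : 0;

    for (i = 0; i < n; i++)
        acc = (op == OP_MUL) ? acc * v[i] : acc + v[i];
    return acc;
}

/* Single out-of-line copy, used when "op" is only known at run time,
 * so the body is not duplicated at every such call site. */
static __attribute__((noinline))
long reduce_outline(int op, const long *v, size_t n)
{
    return reduce_body(op, v, n);
}

/* Call-site dispatch: take the inlined path only when the compiler can
 * prove "op" is a constant; otherwise call the shared copy. */
#define reduce(op, v, n)                                   \
    (__builtin_constant_p(op)                              \
         ? reduce_body((op), (v), (n))                     \
         : reduce_outline((op), (v), (n)))
```

With this arrangement a call such as `reduce(OP_ADD, v, n)` should take the inlined path, while `reduce(runtime_op, v, n)` goes through the one out-of-line copy. It is a workaround in user code, not the call-site annotation asked about above.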