------- Comment #4 from jakub at gcc dot gnu dot org 2009-05-07 16:52 -------

Look for DEEP_BRANCH_PREDICTION in config/i386/*. On i386/i486/i586, doing
"call 1f; 1:" is just fine, but on several newer CPUs it confuses the return
prediction logic (more calls than rets). So when optimizing for those CPUs, gcc
instead calls a separate pad, which just reads the return address from the
stack and immediately returns. This pad can be shared among all functions
within the same binary or shared library.

If hidden linkonce doesn't work on some Solaris version, you should just make
sure USE_HIDDEN_LINKONCE is 0 for that target.
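
To illustrate the "more calls than rets" point, here is a toy model, not a description of any real CPU. It assumes the return predictor is a simple stack that pushes on every call and pops on every ret. The class name, addresses and call chain are all made up for the example.

```python
from collections import deque


class ReturnStackPredictor:
    """Hypothetical return predictor: push on call, pop on ret."""

    def __init__(self, depth=16):
        self.stack = deque(maxlen=depth)
        self.mispredicts = 0

    def call(self, return_addr):
        self.stack.append(return_addr)

    def ret(self, actual_addr):
        predicted = self.stack.pop() if self.stack else None
        if predicted != actual_addr:
            self.mispredicts += 1


def count_mispredicts(use_pad, chain=(0x105, 0x205, 0x305)):
    """Run a nested call chain whose innermost function fetches its own PC.

    use_pad=False models "call 1f; 1: pop %reg" (a call with no matching ret).
    use_pad=True models calling a shared pad that reads the return address
    from the stack and then executes a real ret.
    """
    p = ReturnStackPredictor()
    for return_addr in chain:      # outer calls leading to the function
        p.call(return_addr)

    pc_addr = 0x405                # made-up address of the "1:" label
    p.call(pc_addr)                # the call used to obtain the PC
    if use_pad:
        p.ret(pc_addr)             # the pad's ret balances the call
    # Without the pad, the pop is an ordinary stack read, so in this
    # model the predictor keeps a stale entry.

    for return_addr in reversed(chain):   # unwind the outer calls
        p.ret(return_addr)
    return p.mispredicts


print("inline call/pop:", count_mispredicts(use_pad=False))
print("shared pad:     ", count_mispredicts(use_pad=True))
```

In this model the unmatched call leaves one stale entry on the predictor stack, so every later return up the chain is predicted wrongly. The pad's ret keeps calls and rets paired.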
-- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=40027