------- Comment #16 from paolo dot carlini at oracle dot com 2009-09-15 14:16 -------
I'm also carrying out some experiments with builtin types, like int, and the patched implementation indeed appears to perform well, usually beating the current implementation by a good margin, easily 2x-3x for large k. Its Achilles' heel seems to be k == 1, where the current algorithm is faster, by about 1.5x-2.0x. These numbers are for an i7 920.
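
For anyone wanting to repeat this kind of comparison, a minimal timing sketch follows. It assumes the algorithm under test is std::rotate on a std::vector<int> and that k is the rotation distance; neither is stated in this comment. The helper names rotated_iota and time_rotate_ns, the container size, and the repetition count are all hypothetical, and this is not the harness used for the numbers above.

```cpp
#include <algorithm>
#include <cassert>
#include <chrono>
#include <cstddef>
#include <numeric>
#include <vector>

// Hypothetical helper: returns 0..n-1 rotated left by k positions,
// useful for checking that an implementation still produces the right result.
std::vector<int> rotated_iota(std::size_t n, std::size_t k)
{
    assert(k <= n);
    std::vector<int> v(n);
    std::iota(v.begin(), v.end(), 0);
    std::rotate(v.begin(), v.begin() + static_cast<std::ptrdiff_t>(k), v.end());
    return v;
}

// Hypothetical helper: rotates a vector<int> of size n left by k, `reps` times,
// and returns the mean wall-clock time per call in nanoseconds.
double time_rotate_ns(std::size_t n, std::size_t k, int reps)
{
    assert(k <= n && reps > 0);
    std::vector<int> v(n);
    std::iota(v.begin(), v.end(), 0);

    const auto start = std::chrono::steady_clock::now();
    for (int i = 0; i < reps; ++i)
        std::rotate(v.begin(), v.begin() + static_cast<std::ptrdiff_t>(k), v.end());
    const auto stop = std::chrono::steady_clock::now();

    // Read one element through a volatile so the rotations are not optimized away.
    volatile int sink = v.empty() ? 0 : v.front();
    (void)sink;

    const std::chrono::duration<double, std::nano> elapsed = stop - start;
    return elapsed.count() / reps;
}
```

Calling time_rotate_ns(n, 1, reps) and time_rotate_ns(n, n / 2, reps) for the same n, once with the current headers and once with the patched ones, would give the k == 1 versus large-k comparison; results will depend on the compiler flags and the machine.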
-- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=41351