------- Comment #4 from jb at gcc dot gnu dot org 2010-04-30 18:02 -------
Some more experimentation, on different hardware, reveals that the relative performance of "rep stos" vs. a loop depends heavily on the size of the object to set, the optimization options (loop unrolling etc.), and presumably on the hardware as well. The nice thing about "rep stos" is that it is at least short, and in principle hw manufacturers could in the future tune the microcode to provide an optimal implementation.
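
For anyone who wants to repeat this kind of comparison, here is a rough sketch of the two variants and a crude timer. It is not the test that was actually run; the names fill_loop, fill_rep_stos and time_fill are made up for illustration, and the inline asm assumes GCC or Clang on x86 (other targets fall back to memset()).

```c
#include <stddef.h>
#include <string.h>
#include <time.h>

/* Zero n bytes with a plain loop. Depending on the optimization
   options, the compiler may unroll or vectorize this loop, or
   replace it with a call to memset(). */
static void fill_loop(unsigned char *dst, size_t n)
{
    for (size_t i = 0; i < n; i++)
        dst[i] = 0;
}

/* Zero n bytes with "rep stosb" (assumption: GCC/Clang extended asm
   on x86 or x86-64, direction flag clear as the ABI requires). */
static void fill_rep_stos(unsigned char *dst, size_t n)
{
#if defined(__GNUC__) && (defined(__x86_64__) || defined(__i386__))
    __asm__ volatile ("rep stosb"
                      : "+D" (dst), "+c" (n)   /* destination and count are updated */
                      : "a" (0)                /* AL holds the fill byte */
                      : "memory");
#else
    memset(dst, 0, n);  /* fallback so the sketch still builds elsewhere */
#endif
}

/* Crude timing: CPU seconds spent on reps calls of fill(buf, n).
   A serious benchmark would also need warm-up runs, several buffer
   alignments, and a guard against the stores being optimized away. */
static double time_fill(void (*fill)(unsigned char *, size_t),
                        unsigned char *buf, size_t n, long reps)
{
    clock_t start = clock();
    for (long r = 0; r < reps; r++)
        fill(buf, n);
    return (double)(clock() - start) / CLOCKS_PER_SEC;
}
```

Calling time_fill() with each variant over a range of sizes (say 16 bytes up to a few hundred KB) and at several optimization levels should show how much the ranking moves around.
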
As I have no time to set up the comprehensive benchmark that would be required before making changes to the current implementation (presumably, given the importance of memset(), others have already done so), closing this as wontfix.

--

jb at gcc dot gnu dot org changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|UNCONFIRMED                 |RESOLVED
         Resolution|                            |WONTFIX

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=31750