------- Comment #24 from tkho at ucla dot edu 2005-11-11 01:26 -------
For comparison, here's the code from gcc 2.95.3. It generates the same 18 instructions for both march=i386 and march=pentiumpro.

`gcc -c test3.c -save-temps -O2 -momit-leaf-frame-pointer -march=pentiumpro`:

        pushl %ebx
        movl 8(%esp),%ecx
        movl 12(%esp),%ebx
        movl 16(%esp),%eax
        movl 20(%esp),%edx
        shldl $8,%ecx,%ebx
        sall $8,%ecx
        movl %edx,%eax
        xorl %edx,%edx
        shrl $16,%eax
        andl $255,%eax
        andl $0,%edx
        orl %eax,%ecx
        orl %edx,%ebx
        movl %ecx,%eax
        movl %ebx,%edx
        popl %ebx
        ret
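
For anyone reading the listing without the source at hand, here is a rough C rendering of what those 18 instructions appear to compute. It is read off the assembly, not copied from the test case in comment #23, and it assumes the two stack arguments are 64-bit unsigned values passed as low/high 32-bit pairs. The names `combine`, `a`, and `b` are made up for illustration.

```c
#include <stdint.h>

/* Hypothetical reconstruction from the gcc 2.95.3 assembly; not the original test3.c. */
uint64_t combine(uint64_t a, uint64_t b)
{
    /* shldl $8,%ecx,%ebx + sall $8,%ecx:
       shift the first argument left by 8 across both 32-bit halves. */
    uint64_t shifted = a << 8;

    /* movl %edx,%eax + xorl %edx,%edx + shrl $16,%eax + andl $255,%eax:
       take the high word of the second argument, shift it right by 16
       (48 bits in total) and keep only the low byte. */
    uint64_t low_byte = (b >> 48) & 0xff;

    /* orl %eax,%ecx + orl %edx,%ebx: merge the two results. */
    return shifted | low_byte;
}
```

Seen this way, `andl $0,%edx` and the following `orl %edx,%ebx` do no useful work, since `%edx` was already cleared by the `xorl`.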

Also, in comment #23, I erroneously used g++. Luckily, the same code was generated with gcc.

On another note, Mark, I tried your patch in comment #10. I grabbed gcc-head from 2005-09-28 and compared a clean build with a build that had your patches applied. There was no difference in the assembly for the test case in comment #23, and there was no performance gain in our benchmark application.

--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=17886