https://gcc.gnu.org/bugzilla/show_bug.cgi?id=33717
--- Comment #6 from Andrew Pinski ---
GCC has generated better code for this since GCC 10:
.L2:
        movl    (%ebx,%ecx,4), %eax
        xorl    %edx, %edx
        addl    $-1, %eax
        adcl    $0, %edx
        addl    %eax, %esi
        adcl    %edx, %edi
--- Comment #5 from Andrew Pinski ---
#include
#include
#define rdtscl(low) \
__asm__ __volatile__ ("rdtsc" : "=a" (low) : : "edx")
int main() {
  unsigned int x[100];
  unsigned int y[100];
  unsigned int z[100];
  long a, b, c;
  size_t
--- Comment #4 from ubizjak at gmail dot com 2009-01-01 17:35 ---
(In reply to comment #3)
> Most likely addsi3_carry should accept 0 as one of the operands.
It does:
(define_insn "addsi3_carry"
[(set (match_operand:SI 0 "nonimmediate_operand" "=rm,r")
(plus:SI (plus:SI (m
--- Comment #3 from pinskia at gcc dot gnu dot org 2008-12-31 18:39 ---
GCC does not produce "adcl $0", which is where the extra xors come from.
Most likely addsi3_carry should accept 0 as one of the operands.
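
To illustrate the point about "adcl $0", here is a minimal C sketch of adding a 32-bit value to a 64-bit accumulator held as two 32-bit halves. The struct and function names are hypothetical and do not come from the test case in this report; the sketch only models the carry arithmetic. It shows that zeroing a register and adding it with carry gives the same result as adding an immediate 0 with carry.

```c
#include <stdint.h>

/* Hypothetical model: a 64-bit accumulator split into two 32-bit halves,
   as it would live in a register pair on 32-bit x86. */
struct acc64 {
    uint32_t lo;
    uint32_t hi;
};

/* Models: xorl %edx,%edx ; addl v,lo ; adcl %edx,hi
   The high half of the addend is materialized as a zeroed register. */
static struct acc64 add_u32_via_zero_reg(struct acc64 a, uint32_t v)
{
    uint32_t zero_hi = 0;          /* the extra xor */
    uint32_t lo = a.lo + v;        /* add, wraps modulo 2^32 */
    uint32_t carry = lo < v;       /* carry flag out of the low add */
    a.hi = a.hi + zero_hi + carry; /* adc with a register operand */
    a.lo = lo;
    return a;
}

/* Models: addl v,lo ; adcl $0,hi
   No register is needed for the zero high half. */
static struct acc64 add_u32_via_adc_imm0(struct acc64 a, uint32_t v)
{
    uint32_t lo = a.lo + v;
    uint32_t carry = lo < v;
    a.hi += carry;                 /* adc with immediate 0 */
    a.lo = lo;
    return a;
}

/* Reassemble the two halves for comparison against native 64-bit math. */
static uint64_t acc64_value(struct acc64 a)
{
    return ((uint64_t)a.hi << 32) | a.lo;
}
```

Both functions return the same value for every input, so the only difference is the instruction that clears the extra register.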
--- Comment #2 from pinskia at gcc dot gnu dot org 2008-12-31 18:37 ---
4.4 with the new register allocator (which is turned on by default):
C: 522 cycles
asm: 342 cycles
4.4 with the old one:
C: 749 cycles
asm: 344 cycles
So 4.4 is much better but still has extra instructions but tha