------- Comment #5 from rguenther at suse dot de 2007-11-09 12:37 -------
Subject: Re: [4.3 regression] -Os code size nearly doubled
On Fri, 9 Nov 2007, jakub at gcc dot gnu dot org wrote:

> ------- Comment #4 from jakub at gcc dot gnu dot org 2007-11-09 12:30 -------
> So then shouldn't this bug be about:
> unsigned long long
> foo (unsigned long long ns)
> {
>   return ns % 1000000000L;
> }
>
> unsigned long long
> bar (unsigned long long ns)
> {
>   return ns - (ns / 1000000000L) * 1000000000L;
> }
>
> not compiling the same code at -Os? On x86_64 with -O2 it actually produces
> identical code with the subtraction, supposedly that's faster. Guess even
> (ns / 1000000000L) * 1000000000L should be folded into
> ns - (ns % 1000000000L).

With -O2 we express the division by the constant by multiplication / add
sequences. But for both we get the extra multiplication:

bar:
.LFB3:
        movl    $1000000000, %esi
        movq    %rdi, %rax
        xorl    %edx, %edx
        divq    %rsi
        movq    %rdi, %rcx
        imulq   $1000000000, %rax, %rdx
        subq    %rdx, %rcx
        movq    %rcx, %rax
        ret

bar:
.LFB3:
        movq    %rdi, %rdx
        movabsq $19342813113834067, %rax
        shrq    $9, %rdx
        mulq    %rdx
        shrq    $11, %rdx
        imulq   $1000000000, %rdx, %rdx
        subq    %rdx, %rdi
        movq    %rdi, %rax
        ret

because we miss this folding opportunity.

Richard.

-- 
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=34027