--- Comment #12 from pinskia at gcc dot gnu dot org 2006-07-17 11:46 ---
The code is in the function expand_divmod in expmed.c.
--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=28395
--- Comment #11 from vda dot linux at googlemail dot com 2006-07-17 10:32 ---
Andrew, I must be extremely dumb today. I looked in the source and for the life
of me, I can't see where that optimization is done. I even generated a
2.6.1->2.6.1 diff where it was added and still don't see i
--- Comment #10 from vda dot linux at googlemail dot com 2006-07-16 18:54 ---
gcc-4.1.1 differs only by inserting one more useless insn:
movl $-858993459, %eax
mull 8(%esp)
movl %edx, %eax
+ xorl %edx, %edx
shrl $3, %eax
mov
--- Comment #9 from vda dot linux at googlemail dot com 2006-07-16 18:47 ---
The test program below shows that in this case doing division with div insn
takes more instructions than with mul+shift.
Also mul+shift path has absolutely useless "movl %edx, %eax" insn, shaving that
will mak
--- Comment #8 from steven at gcc dot gnu dot org 2006-07-16 16:51 ---
No. At -Os, we care about smaller code. Unless that sequence of insns with
muls and shifts is smaller than a div, we should produce the div at -Os. And
as far as I can see, the div will always be smaller.
Not a bug
--- Comment #7 from vda dot linux at googlemail dot com 2006-07-16 16:22 ---
Oh my.
It looks like the use of -Os played a joke on me. gcc 3.4.3 -Os uses a division
instruction, even though it results in slower and _also bigger_ code.
Maybe it makes sense to enable this optimization for -O
--- Comment #6 from pinskia at gcc dot gnu dot org 2006-07-16 15:54 ---
This has been done in GCC since at least 1994 (revision 7598 in SVN).
--- Comment #5 from pinskia at gcc dot gnu dot org 2006-07-16 15:52 ---
In fact we do it also for signed integers (PPC asm this time):
_f:
lis r0,0x
srawi r2,r3,31
ori r0,r0,26215
mulhw r3,r3,r0
srawi r3,r3,2
subf r3,r2,r3
blr
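For readers who do not read PPC asm, the signed sequence above can be sketched in C roughly as follows. This is only an illustration, not what GCC itself contains: the function name sdiv10_mulhi is made up, the divisor is assumed to be 10 (as in the /10 example), and the multiplier is assumed to be 0x66666667. Only the low half (26215 = 0x6667) is visible in the ori; the high half is cut off in the lis line, so 0x6666 is an assumption. The sketch also assumes that >> on a negative signed value is an arithmetic shift, which is what GCC does.

```c
#include <stdint.h>

/* Hypothetical C rendering of the PPC sequence: signed n / 10
   (truncating toward zero) without a divide instruction. */
static int32_t sdiv10_mulhi(int32_t n)
{
    /* lis + ori: build the multiplier (assumed to be 0x66666667). */
    int64_t product = (int64_t)n * 0x66666667;

    /* mulhw: keep the high 32 bits of the 64-bit signed product. */
    int32_t high = (int32_t)(product >> 32);

    /* srawi r3,r3,2: arithmetic shift right by 2. */
    int32_t quotient = high >> 2;

    /* srawi r2,r3,31: -1 if n is negative, 0 otherwise. */
    int32_t sign = n >> 31;

    /* subf: subtracting -1 adds 1 for negative n, so the result
       rounds toward zero like C's / operator. */
    return quotient - sign;
}
```

With these assumptions, sdiv10_mulhi(n) should agree with n / 10 for every 32-bit n.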
--- Comment #4 from pinskia at gcc dot gnu dot org 2006-07-16 15:50 ---
GCC already does something like this.
For /10, GCC produces:
f:
movl $-858993459, %eax
mull 4(%esp)
shrl $3, %edx
movl %edx, %eax
ret
Maybe I don't understand what
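
For reference, the /10 sequence quoted in this comment can be written out in C as a small sketch. The function name div10_mulshift is made up for illustration; the constant is just -858993459 reinterpreted as an unsigned 32-bit value (0xCCCCCCCD), and the high half of the product plays the role of %edx after the mull.

```c
#include <stdint.h>

/* Hypothetical C rendering of the x86 sequence: unsigned n / 10
   using a multiply and a shift instead of a div instruction. */
static uint32_t div10_mulshift(uint32_t n)
{
    /* movl $-858993459, %eax ; mull 4(%esp):
       full 64-bit product of n and 0xCCCCCCCD. */
    uint64_t product = (uint64_t)n * 0xCCCCCCCDu;

    /* The high 32 bits of the product are what mull leaves in %edx. */
    uint32_t high = (uint32_t)(product >> 32);

    /* shrl $3, %edx: final logical shift right by 3. */
    return high >> 3;
}
```

Under these assumptions, div10_mulshift(n) should give the same result as n / 10 for every 32-bit unsigned n.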