Hi Richard, Kyrill,

>> I disagree. If they still trigger and generate better code than without
>> we should keep them.
>
>> What kind of code is *common* varies greatly from user to user.

Not really - doing a multiply and checking whether the result is zero is
exceedingly rare. I found only 3 cases out of 7300 mul/mla in all of
SPEC2006... Overall codesize effect with -Os: 28 bytes or 0.00045%.

So we really should not even consider wasting any more time on
maintaining such useless patterns.

> Also, the main reason for restricting their use was that in the 'olden
> days', when we had multi-cycle implementations of the multiply
> instructions with short-circuit fast termination when the result was
> completed, the flag setting variants would never short-circuit.

That only applied to conditional multiplies IIRC, some implementations
would not early-terminate if the condition failed. Today there are serious
penalties for conditional multiplies - but that's something to address in
a different patch.

> These days we have fixed cycle counts for multiply instructions, so this
> is no-longer a penalty.

No, there is a large overhead on modern cores when you set the flags,
and there are other penalties due to the extra micro-ops.

> In the thumb2 case in particular we can often
> reduce mul-cmp (6 bytes) to muls (2 bytes), that's a 66% saving on this
> sequence and definitely worth exploiting when we can, even if it's not
> all that common.

Using muls+cbz is equally small. With my patch we generate this with -Os:

void g(void);
int f(int x)
{
  if (x * x != 0)
    g();
}

f:
        muls    r0, r0, r0
        push    {r3, lr}
        cbz     r0, .L9
        bl      g
.L9:
        pop     {r3, pc}

Wilco