Richard Biener wrote:
> Hurugalawadi, Naveen wrote:
> > The code (m1 > m2) * d code should be optimized as m1> m2 ? d : 0.
> What's the reason of this transform?  I expect that the HW multiplier
> is quite fast given one operand is either zero or one and a multiplication
> is a gimple operation that's better handled in optimizations than
> COND_EXPRs which eventually expand to conditional code which
> would be much slower.

Even really fast multipliers have several cycles latency, and this is
generally fixed irrespective of the inputs. Maybe you were thinking about
division?

Additionally, integer multiply typically has much lower throughput than
other ALU operations like conditional move - a modern CPU may have 4 ALUs
but only 1 multiplier, so removing redundant integer multiplies is always
good.

Note (m1 > m2) is also a conditional expression which will result in
branches for floating point expressions and on some targets even for
integers. Moving the multiply into the conditional expression generates
the best code:

Integer version:

f1:
	cmp	w0, 100
	csel	w0, w1, wzr, gt
	ret

f2:
	cmp	w0, 100
	cset	w0, gt
	mul	w0, w0, w1
	ret

Float version:

f3:
	movi	v1.2s, #0
	cmp	w0, 100
	fcsel	s0, s0, s1, gt
	ret

f4:
	cmp	w0, 100
	bgt	.L8
	movi	v1.2s, #0
	fmul	s0, s0, s1	// eh???
.L8:
	ret

Wilco
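
For reference, the C source behind the four listings was not included in the message. The sketch below is only a guess reconstructed from the assembly: the signatures, the parameter names x and y, and the constant 100 are assumptions, with f1/f3 written in the select form and f2/f4 in the multiply form.

```c
/* Hypothetical source, inferred from the assembly above;
   it does not come from the original message. */

/* Select form: matches the cmp + csel sequence shown for f1. */
int f1(int x, int y)
{
    return x > 100 ? y : 0;
}

/* Multiply form: matches the cmp + cset + mul sequence shown for f2. */
int f2(int x, int y)
{
    return (x > 100) * y;
}

/* Float select form: matches the cmp + fcsel sequence shown for f3. */
float f3(int x, float y)
{
    return x > 100 ? y : 0.0f;
}

/* Float multiply form: the comparison result (0 or 1) is converted
   to float and multiplied, as in the branch + fmul sequence for f4. */
float f4(int x, float y)
{
    return (x > 100) * y;
}
```

The integer pair returns the same value for every input. The float pair agrees for ordinary finite, non-negative y, but 0 * y is NaN when y is NaN or infinite and -0.0 when y is negative, so the two float forms are not strictly identical under IEEE semantics.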