On 08/03/2015 09:50 AM, Ilya Enkovich wrote:


> The original code looks better: its tree height is just 2, so it
> can be executed in 2 cycles. The new code has more dependencies, and
> its tree height becomes 5. It is always hard to say for all x86
> targets, but as generic code the original version is better.

Agreed.  Reducing tree height is definitely a good thing as a general rule.
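
As a minimal sketch of what tree height means here, consider a hypothetical four-operand sum. This is not the code from the patch under discussion; the function names and operands are made up for illustration. Both functions compute the same value, but the longest dependency chain differs.

```c
/* Hypothetical illustration only; not the code from the patch under review. */

/* Serial chain: every add waits for the previous one, so tree height is 3. */
unsigned sum_chain(unsigned a, unsigned b, unsigned c, unsigned d)
{
    unsigned t1 = a + b;   /* level 1 */
    unsigned t2 = t1 + c;  /* level 2, depends on t1 */
    return t2 + d;         /* level 3, depends on t2 */
}

/* Balanced tree: t1 and t2 are independent, so tree height is 2. */
unsigned sum_balanced(unsigned a, unsigned b, unsigned c, unsigned d)
{
    unsigned t1 = a + b;   /* level 1 */
    unsigned t2 = c + d;   /* level 1, independent of t1 */
    return t1 + t2;        /* level 2, depends on t1 and t2 */
}
```

Assuming single-cycle adds and at least two free ALUs, the balanced form can finish in 2 cycles while the chain needs 3. Unsigned operands are used so that the two orderings are guaranteed to give the same result; an optimizing compiler may well reassociate such expressions on its own.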

jeff
