Guys, I assume this is not the right way to fix such a simple performance anomaly, since it requires redundant work: combine merges the load into the conditional instruction, and then peephole2 has to split it back. Does that look reasonable? Why should we produce an inefficient instruction that must be split later?
Best regards.
Yuri.

2012/12/12 Richard Biener <richard.guent...@gmail.com>:
> On Wed, Dec 12, 2012 at 1:55 PM, Uros Bizjak <ubiz...@gmail.com> wrote:
>> On Wed, Dec 12, 2012 at 12:44 PM, Richard Biener
>> <richard.guent...@gmail.com> wrote:
>>
>>>> This fix is aimed to remove performance degradation introduced by new
>>>> LRA phase that in fact is combining problem. Gcc combiner does
>>>> propagation of memory load to if-then-else gimple that was splitted
>>>> back by old reload phase. LRA does not perform such splitting. To
>>>> avoid performance slowdown on important benchmark (this is true for
>>>> all x86 targets) we decided to enhance 'ix86_legitimate_combined_insn'
>>>> with a check on such propagation and consider such conditional
>>>> instruction with memory operand as illegal one from performance point
>>>> of view.
>>>>
>>>> The fix was bootstrapped and regtested for x86-64.
>>>> Is it OK for 4.8 and mainline?
>>>
>>> Isn't it a win for -Os though? Thus, optimize_insn_for_size ()? It can
>>> also increase register pressure, no? So eventually this splitting should
>>> be done post-reload only. Not sure what appropriate machinery there is,
>>> besides from mdreorg (or split itself).
>>
>> So, you are proposing to use peephole2 with (match_scratch)
>> conditional temporary?
>
> Yes, if that works. (sounds backward to me having a peephole split one
> insn into two ... ;))
>
> Richard.
>
>> Uros.