2014-06-20 20:14 GMT+02:00 Jeff Law <l...@redhat.com>:
> On 06/20/14 12:07, Kai Tietz wrote:
>>
>> 2014-06-20 19:55 GMT+02:00 Richard Henderson <r...@redhat.com>:
>>>
>>> On 06/20/2014 10:52 AM, Kai Tietz wrote:
>>>>
>>>> 2014-06-20 Kai Tietz <kti...@redhat.com>
>>>>
>>>> PR target/39284
>>>> * passes.def (peephole2): Add second peephole2 pass before
>>>> split before sched2 pass.
>>>> * config/i386/i386.md (peehole2): To combine
>>>> indirect jump with memory.
>>>> (split2): Likewise.
>>>
>>>
>>> Why are we adding a second pass instead of just moving the one?
>>>
>>>
>>> r~
>>
>>
>> I told that in a prior mail in that thread to Jeff. IIRC there are
>> some conversion of impossible pushes then done too late, additional
>> some patterns about split & dieing register too. Means we produce
>> weaker code.
>
> So let's dig into this deeper. Examples & explanations would help. I know
> it feels like a bit of a runaround, but avoiding adding the pass would be
> good.
>
> jeff

I dug into it a bit and couldn't find any significant difference on the
x64 target for the existing testcases. I am still a bit concerned that we
might cause regressions on other targets by moving the peephole2 pass too
close to the sched2 pass, although I can't reproduce any on x86/x86_64.

Therefore I searched for the place closest to the pass's previous position
that still allows the indirect-jump-on-memory optimization. Testing on
x86/x64 shows that the pass needs to run directly after the "reorder
blocks" pass.

So I suggest the following change to passes.def:

Index: passes.def
===================================================================
--- passes.def	(revision 211850)
+++ passes.def	(working copy)
@@ -384,7 +384,6 @@ along with GCC; see the file COPYING3.  If not see
       NEXT_PASS (pass_rtl_dse2);
       NEXT_PASS (pass_stack_adjustments);
       NEXT_PASS (pass_jump2);
-      NEXT_PASS (pass_peephole2);
       NEXT_PASS (pass_if_after_reload);
       NEXT_PASS (pass_regrename);
       NEXT_PASS (pass_cprop_hardreg);
@@ -391,6 +390,7 @@ along with GCC; see the file COPYING3.  If not see
       NEXT_PASS (pass_fast_rtl_dce);
       NEXT_PASS (pass_duplicate_computed_gotos);
       NEXT_PASS (pass_reorder_blocks);
+      NEXT_PASS (pass_peephole2);
       NEXT_PASS (pass_branch_target_load_optimize2);
       NEXT_PASS (pass_leaf_regs);
       NEXT_PASS (pass_split_before_sched2);

Kai