On Thu, 9 Nov 2023, Tamar Christina wrote: > > -----Original Message----- > > From: Richard Biener <rguent...@suse.de> > > Sent: Thursday, November 9, 2023 11:54 AM > > To: Tamar Christina <tamar.christ...@arm.com> > > Cc: gcc-patches@gcc.gnu.org > > Subject: RE: [PATCH] tree-optimization/111950 - vectorizer loop copying > > > > On Thu, 9 Nov 2023, Tamar Christina wrote: > > > > > > -----Original Message----- > > > > From: Richard Biener <rguent...@suse.de> > > > > Sent: Thursday, November 9, 2023 9:24 AM > > > > To: Tamar Christina <tamar.christ...@arm.com> > > > > Cc: gcc-patches@gcc.gnu.org > > > > Subject: RE: [PATCH] tree-optimization/111950 - vectorizer loop > > > > copying > > > > > > > > On Thu, 9 Nov 2023, Tamar Christina wrote: > > > > > > > > > > guard_bb = LOOP_VINFO_IV_EXIT (loop_vinfo)->dest; > > > > > > edge epilog_e = LOOP_VINFO_EPILOGUE_IV_EXIT (loop_vinfo); > > > > > > - guard_to = split_edge (epilog_e); > > > > > > + guard_to = epilog_e->dest; > > > > > > guard_e = slpeel_add_loop_guard (guard_bb, guard_cond, > > > > > > guard_to, > > > > > > skip_vector ? anchor : > > > > > > guard_bb, > > > > > > prob_epilog.invert (), > > > > > > @@ -3443,8 +3229,30 @@ vect_do_peeling (loop_vec_info > > > > > > loop_vinfo, tree niters, tree nitersm1, > > > > > > if (vect_epilogues) > > > > > > epilogue_vinfo->skip_this_loop_edge = guard_e; > > > > > > edge main_iv = LOOP_VINFO_IV_EXIT (loop_vinfo); > > > > > > - slpeel_update_phi_nodes_for_guard2 (loop, epilog, main_iv, > > > > > > guard_e, > > > > > > - epilog_e); > > > > > > + gphi_iterator gsi2 = gsi_start_phis (main_iv->dest); > > > > > > + for (gphi_iterator gsi = gsi_start_phis (guard_to); > > > > > > + !gsi_end_p (gsi); gsi_next (&gsi)) > > > > > > + { > > > > > > + /* We are expecting all of the PHIs we have on epilog_e > > > > > > + to be also on the main loop exit. But sometimes > > > > > > + a stray virtual definition can appear at epilog_e > > > > > > + which we can then take as the same on all exits, > > > > > > + we've removed the LC SSA PHI on the main exit before > > > > > > + so we wouldn't need to create a loop PHI for it. */ > > > > > > + if (virtual_operand_p (gimple_phi_result (*gsi)) > > > > > > + && (gsi_end_p (gsi2) > > > > > > + || !virtual_operand_p (gimple_phi_result > > > > > > (*gsi2)))) > > > > > > + add_phi_arg (*gsi, > > > > > > + gimple_phi_arg_def_from_edge (*gsi, > > epilog_e), > > > > > > + guard_e, UNKNOWN_LOCATION); > > > > > > + else > > > > > > + { > > > > > > + add_phi_arg (*gsi, gimple_phi_result (*gsi2), > > guard_e, > > > > > > + UNKNOWN_LOCATION); > > > > > > + gsi_next (&gsi2); > > > > > > + } > > > > > > + } > > > > > > + > > > > > > > > > > I've been having some trouble incorporating this change into the > > > > > early break > > > > work. > > > > > My understanding is that here you've removed the lookup that > > > > > find_guard did and are assuming that the order between the PHI > > > > > nodes between loop->exit and epilog->exit are the same - sporadic > > > > > virtual > > > > operands. > > > > > > > > > > But the loop->exit for early break has to materialize all PHI > > > > > nodes from the main loop into the epilog loop since we need them > > > > > to restart the > > > > scalar loop iteration. > > > > > > > > > > This means that the number of PHI nodes between the first loop and > > > > > the second Loop are not the same, so we end up mis-linking phi nodes. > > > > > i.e. consider this loop > > > > > > > > > > > > https://gist.github.com/Mistuke/65d476b18f991772fdec159a09b81869 > > > > > > > > I don't see any multi-exits here? I think you need exactly the same > > > > PHIs you need for the branch to the epilogue, no? > > > > > > > > > > Ah it's a failing testcase but not one with an early break, > > > > > > > If you can point me to a testcase that fails on your branch I can > > > > try to have a look. > > > > > > I've updated the branch refs/users/tnfchris/heads/gcc-14-early-break > > > > > > Quite a few tests fail, a simple one is vect-early-break_5.c and > > > vect-early-break_20.c > > > > > > But what you just said above makes me wonder.. at the moment before we > > > have differening amount because we require to have the loop counters > > > and IVs as PHI nodes such that vect_update_ivs_after_vectorizer can > > > thread them through correctly as it searches for PHI nodes. However > > > for the epilog exit, those that are not live are not needed. This is why > > > we get > > different counts. > > > > > > Maybe.. the solution is that I need to do the same thing as > > > vectorizable_live_operations In that when > > > vect_update_ivs_after_vectorizer is done I should either remove the PHI > > nodes or turn them into simple assignments. Since they're always single > > value. > > > > > > Looking at a few examples that seems like it would fix the issue.. Does > > > that > > sound right to you? > > > > Without diving deep into it I can't say for sure. I'd say any complex > > mangling > > of things should happen after we've set up the basic loop structure with > > prologue, epilogue end exit edges (I've hesitated with that virtual PHIs, > > thinking we should eventually delay removing it until we're ready with all > > the > > copying for example). > > > > Ok, thank you, the hint you gave: > > > I think you need exactly the same > > > > PHIs you need for the branch to the epilogue, no? > > Made me realize what the issue is. I realize now that your change Is > essentially > assuming that the main loop exit and the epilog exit contain the same number > of PHI nodes. And indeed for the main exit there's no reason why they can't > since all the phi nodes are singular. We don't need them. I was creating > them > because vect_update_ivs_after_vectorizer normally searches for PHI nodes, > > but it searches for PHI nodes on the merge block. In the case of a single > exit > that means that there's nothing to update for the missing PHI nodes since > peeling has already done the right thing. > > There's no need to update the IVs because in the guard block they'll be update > based on niters instead. So the PHIs are not needed. > > For multiple exits the connection was already done by peeling when it > maintained > LCSSA. So the correct fix is just to for the main exit not create the > unneeded > singular PHI nodes to begin with. The order is then maintained by > redirect_branch > and flush. Then it all works and no change is needed elsewhere. > > This also allows me to simplify the code for peeling a bit. You'll get the > updated patch > series with all the comments made so far addressed on Tuesday. > > Thanks for the hint!
You're welcome ;) And yes, the motivation of my change was to simplify this ugly piece of code ... Thanks, Richard.