Richard Biener <rguent...@suse.de> writes: > On Sat, 16 Nov 2019, Richard Sandiford wrote: >> If stmt analysis failed for an SLP node, r278246 tried building the >> node from scalars instead. By that point we've already processed child >> nodes (usually successfully) and might have made some of them the lead >> nodes for their stmt list. This is why we couldn't free the child nodes >> when deciding to build from scalars: >> >> /* Don't remove and free the child nodes here, since they could be >> referenced by other structures. The analysis and scheduling phases >> (need to) ignore child nodes of anything that isn't vect_internal_def. >> */ >> >> The problem in this PR is that we (correctly) don't process the unused >> child nodes again during the scheduling phase, which means that those >> nodes won't become the leader again. So some other node with same stmts >> becomes the leader instead. However, any internal-def child nodes of >> this new leader won't have been processed during the analysis phase, >> because we also (correctly) skip child nodes of non-leader nodes. >> >> We talked on IRC about fixing this by sharing nodes between SLP >> instances, so that it becomes a "proper" graph. But that seems >> like it could throw up some problems too, so I wanted to fix the >> PR in a less invasive way first. >> >> This patch therefore records the leader nodes during the analysis >> phase and reuses that choice during scheduling. When scheduling >> a non-leader node, we first try to schedule the leader node and >> then reuse its vector stmts. At the moment, doing this means we >> need to know the leader node's SLP instance, so the patch records >> that in the SLP node too. >> >> While there, I noticed that the two-operation handling returned >> early without restoring the original def types. That's really >> an independent bug fix. > > Can you split that out please?
Sure, done as below. Tested on aarch64-linux-gnu and x86_64-linux-gnu. OK to install? > For the rest I prefer the following which I am now fully testing > on x86_64-unknown-linux-gnu (vect.exp testing went fine). As > discussed on IRC this makes the SLP graph nodes shared between > SLP instances (which it has in fact been, just "virtual" via > all the leader computations). So SLP instances are now just > the collection of entries into the directed SLP graph. Looks great. Obviously I chickened out too early :-) Thanks, Richard 2019-11-19 Richard Sandiford <richard.sandif...@arm.com> gcc/ * tree-vect-slp.c (vect_schedule_slp_instance): Restore stmt def types for two-operation SLP. Index: gcc/tree-vect-slp.c =================================================================== --- gcc/tree-vect-slp.c 2019-11-20 09:48:57.145101295 +0000 +++ gcc/tree-vect-slp.c 2019-11-20 12:11:56.818216093 +0000 @@ -4165,6 +4165,7 @@ vect_schedule_slp_instance (slp_tree nod /* Handle two-operation SLP nodes by vectorizing the group with both operations and then performing a merge. */ + bool done_p = false; if (SLP_TREE_TWO_OPERATORS (node)) { gassign *stmt = as_a <gassign *> (stmt_info->stmt); @@ -4235,10 +4236,11 @@ vect_schedule_slp_instance (slp_tree nod } v0.release (); v1.release (); - return; + done_p = true; } } - vect_transform_stmt (stmt_info, &si, node, instance); + if (!done_p) + vect_transform_stmt (stmt_info, &si, node, instance); /* Restore stmt def-types. */ FOR_EACH_VEC_ELT (SLP_TREE_CHILDREN (node), i, child)