https://gcc.gnu.org/bugzilla/show_bug.cgi?id=79622
--- Comment #9 from rguenther at suse dot de <rguenther at suse dot de> ---
On Mon, 18 Sep 2017, spop at gcc dot gnu.org wrote:

> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=79622
>
> --- Comment #8 from Sebastian Pop <spop at gcc dot gnu.org> ---
> > I would have expected at least each memory op to be in a separate
> > "black box"
>
> We could have a pass before graphite that splits BBs with more than one
> write into blocks that contain one data write with all the operations
> and data reads needed to compute the stored value. This would allow
> more freedom to schedule BBs around.

Yeah, somewhat iffy but good enough for experiments I guess. But
I think we should have different BBs for reads as well? Though
that means we'll handle all the data flow from reads to writes
with scalar references ...

In the end this splitting of BBs should be an internal detail
in the GRAPHITE data structures and not applied to GIMPLE. Basically
have the black-boxes be writes with all reads/operations to compute
the value implicitly included. Code-gen would have to deal
with this by eventually duplicating stmts if they end up un-CSEd
by the scheduler.

So a black-box would be a set of stmts rather than a whole GIMPLE BB.
First step would be abstracting the iteration over GIMPLE stmts in
a black-box I guess.

> > if you follow the original go-out-of-SSA approach you'd have their
> > effects on the CFG edges. So a more complete fix would similarly
> > handle uses. In other words: how do we handle reductions?
>
> As you remember, the original way was to expose reductions by rewriting
> out-of-SSA scalar dependences crossing basic blocks (loop-phi nodes,
> loop-close-phi nodes), tagging the properties of the reduction
> (commutative, associative) on the array, and adding that info to the
> data dependence graph. By adding those properties to the dependence
> graph, we give the scheduler more freedom to select transforms.
>
> We moved away from rewriting scalar dependences out-of-SSA because we
> do not want to transform the code if the scheduler has no better
> transform to be done: we do not want to leave around inefficient memory
> reads/writes. Instead, we handle SSA names and create scalar references
> added to the dependence graph. We still need to tag scalar reductions
> with their associative properties to allow the scheduler to reorder the
> computations.

You mean this tagging of associativeness is not yet done?
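
To make the black-box idea above concrete, here is a minimal Python sketch,
not GRAPHITE code. `Stmt` and `split_into_black_boxes` are hypothetical
names, and the statements are toy records rather than GIMPLE. Each black box
is one write plus the in-block stmts its stored value transitively depends
on; stmts shared between writes get duplicated, which is the un-CSE case
code-gen would have to cope with. Memory dependences between boxes are
ignored here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Stmt:
    """Toy stand-in for a GIMPLE statement (hypothetical, not a GCC type)."""
    text: str
    defs: tuple = ()        # SSA names this stmt defines
    uses: tuple = ()        # SSA names this stmt uses
    is_write: bool = False  # True if the stmt stores to memory


def split_into_black_boxes(bb):
    """Return one list of stmts per memory write in the block `bb`.

    Each list holds the write plus every stmt of `bb` that the write
    transitively uses, in original order. A stmt needed by several
    writes appears in each of their boxes.
    """
    # Map each SSA name to the index of the stmt defining it in this block.
    def_site = {name: i for i, s in enumerate(bb) for name in s.defs}
    boxes = []
    for i, s in enumerate(bb):
        if not s.is_write:
            continue
        needed = {i}
        worklist = list(s.uses)
        while worklist:
            j = def_site.get(worklist.pop())
            # Skip names defined outside the block and stmts already taken.
            if j is None or j in needed:
                continue
            needed.add(j)
            worklist.extend(bb[j].uses)
        boxes.append([bb[k] for k in sorted(needed)])
    return boxes


# One BB with two writes that share the read of a[i].
bb = [
    Stmt("_1 = a[i]", defs=("_1",), uses=("i",)),
    Stmt("_2 = _1 + 1", defs=("_2",), uses=("_1",)),
    Stmt("b[i] = _2", uses=("_2", "i"), is_write=True),
    Stmt("_3 = _1 * 2", defs=("_3",), uses=("_1",)),
    Stmt("c[i] = _3", uses=("_3", "i"), is_write=True),
]
boxes = split_into_black_boxes(bb)
for box in boxes:
    print([s.text for s in box])
```

The read `_1 = a[i]` ends up in both boxes, so a scheduler could move the
two writes independently at the cost of reading `a[i]` twice.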