Hi, sorry for taking so long to reply...
On Wed, Dec 18 2019, Richard Biener wrote: > On December 17, 2019 1:43:15 PM GMT+01:00, Martin Jambor <mjam...@suse.cz> > wrote: >>Hi, >> >>the previous patch unfortunately does not fix the first testcase in PR >>92706 and since I am afraid it might be the important one, I also >>focused on that. The issue here is again total scalarization accesses >>clashing with those representing accesses in the IL - on another >>aggregate but here the sides are switched. Whereas in the previous >>case the total scalarization accesses prevented propagation along >>assignments, here we have the user accesses on the LHS, so even though >>we do not create anything there, we totally scalarize the RHS and >>again end up with assignments with different scalarizations leading to >>bad code. >> >>So we clearly need to propagate information about accesses from RHSs >>to LHSs too, which the patch below does. Because the intent is only >>preventing bad total scalarization - which the last patch now performs >>late enough - and do not care about grp_write flag and so forth, the >>propagation is a bit simpler and so I did not try to unify all of the >>code for both directions. > > But can we really propagate the directions independently? Lacc to racc > propagation would induce accesses to different racc to Lacc branches of the > access tree of the parent, no? So in full generality the access links Form an > undirected graph where you perform propagation in both directions of edges > (and you'd have to consider cycles). 'linked parts' of the graph then need to > have the same (or at least a compatible) scalarization, and three would be > the possibility to compute the optimal 'conflict border' where to fix the > conflict we'd keep one node in the graph unscalarized. > > The way you did it might be sufficient in practice of course and we should > probably go with that for now? I think it should be - at least I think I would not be able to come up with an implementation quickly enough for GCC 10 - I assumed that was the target. And also because there is that one important difference between the propagation, the RHS->LHS also propagates "actually-contains-data" whereas the other direction is just to prevent total scalarization to create conflicts - and it is sufficient to do that and I suppose in any search for optimal scalarization we'd give total scalarization the smallest priority anyway. Thanks, Martin > > Richard. > >>I still think that even with this patch the total scalarization has to >>follow the declared type of the aggregate and cannot be done using >>integers of the biggest suitable power, at least in early SRA, because >>these propagations of course do not work interprocedurally but >>inlining can and does eventually bring accesses from two functions >>together which could (and IMHO would) lead to same problems. >> >>Bootstrapped and LTO-bootstrapped and tested on an x86_64-linux and >>bootstrapped and tested it on aarch64 and i686 (except that on i686 >>the testcase will need to be skipped because __int128_t is not >>available there). I expect that review will lead to requests to >>change things but as far as I am concerned, this is ready for trunk >>too. >> >>Thanks, >> >>Martin >> >>2019-12-11 Martin Jambor <mjam...@suse.cz> >> >> PR tree-optimization/92706 >> * tree-sra.c (struct access): Fields first_link, last_link, >> next_queued and grp_queued renamed to first_rhs_link, last_rhs_link, >> next_rhs_queued and grp_rhs_queued respectively, new fields >> first_lhs_link, last_lhs_link, next_lhs_queued and grp_lhs_queued. >> (struct assign_link): Field next renamed to next_rhs, new field >> next_lhs. Updated comment. >> (work_queue_head): Renamed to rhs_work_queue_head. >> (lhs_work_queue_head): New variable. >> (add_link_to_lhs): New function. >> (relink_to_new_repr): Also relink LHS lists. >> (add_access_to_work_queue): Renamed to add_access_to_rhs_work_queue. >> (add_access_to_lhs_work_queue): New function. >> (pop_access_from_work_queue): Renamed to >> pop_access_from_rhs_work_queue. >> (pop_access_from_lhs_work_queue): New function. >> (build_accesses_from_assign): Also add links to LHS lists and to LHS >> work_queue. >> (child_would_conflict_in_lacc): Renamed to >> child_would_conflict_in_acc. Adjusted parameter names. >> (create_artificial_child_access): New parameter set_grp_read, use it. >> (subtree_mark_written_and_enqueue): Renamed to >> subtree_mark_written_and_rhs_enqueue. >> (propagate_subaccesses_across_link): Renamed to >> propagate_subaccesses_from_rhs. >> (propagate_subaccesses_from_lhs): New function. >> (propagate_all_subaccesses): Also propagate subaccesses from LHSs to >> RHSs. >> >> testsuite/ >> * gcc.dg/tree-ssa/pr92706-1.c: New test.