This letter contains second separate patch for ddg.c It prevents removing some edges in data dependency graph when renaming is allowed.
2011/10/12 Ayal Zaks <ayal.z...@gmail.com>: >>> The other bug happens only with -fmodulo-sched-allow-regmoves. Here we >>> eliminate some anti-dependence edges in data dependency graph in order to >>> resolve them later by adding some register moves (renaming instructions). >>> But >>> in situation as in example above compiler gives an ICE because it can't >>> create >>> a register move, when regR is hardware flag register. So we have to know >>> which >>> register(s) cause anti-dependency in order to understand whether we can >>> ignore >>> it. I can't find any easy way to gather this information, so I create my >>> own >>> structures to store this info and had implemented my own hooks for >>> sched_analyze function. This leads to more complex interconnection between >>> ddg.c and modulo-sched.c. > > Well, having ddg.c's/create_ddg_dep_from_intra_loop_link() consult > "Register sets from modulo scheduler structures" to determine if an > anti-dep can be eliminated, seems awkward. One should be able to build > a ddg independent of any modulo scheduler structures. I think now I found a proper way to gather information about the register(s) causing anti-dependency (see the patch attached). > One simple solution is to keep all anti-dep edges of registers that > are clobbered anywhere in the loop. This might be overly conservative, > but perhaps not so if "On x86_64 it is clobbered by most of arithmetic > instructions". > If an anti-dep between a use and a dep of a clobber register is to be > removed, a dependence should be created between the use and a > clobbering instruction following the def. Hope this is clear. > > This too should be solved by a dedicated (separate) patch, for easier > digestion. > > Presumably, the ddg already has all intra-iteration edges preventing > motion of clobbering instructions within an iteration, and we are only > missing inter-iteration deps or deps surfaced by eliminating > anti-deps, right? > > You might consider introducing a new type of dependence for such > edges, "Clobber", if it would help. > Ayal, I suppose here we have some misunderstanding. This is not directly related to the problem of clobbers discussed previously. Here I talk about the need to find out the register causing the anti-dependence (for instance, we need to check whether it can be renamed). Currently, SMS simply attempts to take the RHS of the second instruction (via single_set()), and if it's a register, SMS assumes it's a register causing the dependency. This breaks in a following scenario: insn1: ... (read flags) insn2: regA = regB - regC; update flags Here, SMS would wrongly take regA as the dependency register, while it was in fact the flags register. So, I rewrote this part in a different way and make a separate patch. -- Roman Zhuykov zhr...@ispras.ru
2011-12-07 Roman Zhuykov <zhr...@ispras.ru> * ddg.c (create_ddg_dep_from_intra_loop_link): Correclty determine which register causes anti-dependency. Prevent removing anti-dep edge from ddg when hard_register_p. --- diff --git a/gcc/ddg.c b/gcc/ddg.c index 2b1cfe8..0a3726f 100644 --- a/gcc/ddg.c +++ b/gcc/ddg.c @@ -204,23 +204,34 @@ create_ddg_dep_from_intra_loop_link (ddg_ptr g, ddg_node_ptr src_node, && (t == ANTI_DEP && dt == REG_DEP) && !autoinc_var_is_used_p (dest_node->insn, src_node->insn)) { - rtx set; + df_ref *def_rec; + bool can_delete_dep = true; - set = single_set (dest_node->insn); - /* TODO: Handle registers that REG_P is not true for them, i.e. + /* TODO: Handle !REG_P register correctly, i.e. subregs and special registers. */ - if (set && REG_P (SET_DEST (set))) - { - int regno = REGNO (SET_DEST (set)); - df_ref first_def; - struct df_rd_bb_info *bb_info = DF_RD_BB_INFO (g->bb); + for (def_rec = DF_INSN_DEFS (dest_node->insn); *def_rec; def_rec++) + { + df_ref first_def, def = *def_rec, *use_rec; + unsigned regno = DF_REF_REGNO (def); + struct df_rd_bb_info *bb_info = DF_RD_BB_INFO (g->bb); + bool src_uses = false; - first_def = df_bb_regno_first_def_find (g->bb, regno); - gcc_assert (first_def); + for (use_rec = DF_INSN_USES (src_node->insn); *use_rec; use_rec++) + if (regno == DF_REF_REGNO (*use_rec)) + src_uses = true; - if (bitmap_bit_p (&bb_info->gen, DF_REF_ID (first_def))) - return; - } + if (!src_uses) + continue; + + first_def = df_bb_regno_first_def_find (g->bb, regno); + gcc_assert (first_def); + + if (HARD_REGISTER_NUM_P (regno) + || !bitmap_bit_p (&bb_info->gen, DF_REF_ID (first_def))) + can_delete_dep = false; + } + if (can_delete_dep) + return; } latency = dep_cost (link);