http://gcc.gnu.org/bugzilla/show_bug.cgi?id=45199

Sebastian Pop <spop at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|NEW                         |ASSIGNED

--- Comment #4 from Sebastian Pop <spop at gcc dot gnu.org> 2010-11-30 23:08:20 UTC ---
The fix for this one is to disable a heuristic that aggregates writes to the
same array into the same partition:

diff --git a/gcc/tree-loop-distribution.c b/gcc/tree-loop-distribution.c
index 007c4f3..2c2af2c 100644
--- a/gcc/tree-loop-distribution.c
+++ b/gcc/tree-loop-distribution.c
@@ -781,8 +781,9 @@ build_rdg_partition_for_component (struct graph *rdg, rdgc c,
      and determine those vertices that have some memory affinity with
      the current nodes in the component: these are stores to the same
      arrays, i.e. we're taking care of cache locality.  */
-  rdg_flag_similar_memory_accesses (rdg, partition, loops, processed,
-                    other_stores);
+  if (!flag_tree_loop_distribute_patterns)
+    rdg_flag_similar_memory_accesses (rdg, partition, loops, processed,
+                      other_stores);

   rdg_flag_loop_exits (rdg, loops, partition, processed, part_has_writes);


With this patch applied, I get the following code generated for the testcase
of this PR:

    # .MEM_54 = VDEF <.MEM_62(D)>
    __builtin_memset (&i_otyp, 0, 4000);
    # .MEM_2 = VDEF <.MEM_54>
    __builtin_memset (&i_styp, 0, 4000);
    # .MEM_78 = VDEF <.MEM_2>
    __builtin_memset (&l_numob, 0, 4000);
    # .MEM_82 = VDEF <.MEM_78>
    __builtin_memset (&i_otyp[1000], 0, 4000);
    # .MEM_83 = VDEF <.MEM_82>
    __builtin_memset (&i_styp[1000], 0, 4000);
    # .MEM_89 = VDEF <.MEM_83>
    __builtin_memset (&l_numob[1000], 0, 4000);
    # .MEM_95 = VDEF <.MEM_89>
    __builtin_memset (&i_otyp[2000], 0, 4000);
    # .MEM_103 = VDEF <.MEM_95>
    __builtin_memset (&i_styp[2000], 0, 4000);
    # .MEM_104 = VDEF <.MEM_103>
    __builtin_memset (&l_numob[2000], 0, 4000);

Note that each array, i_otyp for example, is written several times.  With the
heuristic enabled, all of these writes end up in the same loop partition,
which prevents even the memset (0) pattern from being recognized.
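
To illustrate the shape of the problem, here is a rough C sketch.  It is not
the testcase of this PR: the loop body, the element type (int32_t, chosen so
that 1000 elements are 4000 bytes as in the dump) and the helper names are
assumptions made for illustration only.  clear_fused has one loop that stores
to each array three times; clear_distributed is a hand-written equivalent of
what the dump above shows, with every store in its own partition and each
partition replaced by a memset.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

enum { CHUNK = 1000, TOTAL = 3 * CHUNK };

/* Hypothetical stand-ins for the arrays named in the dump.  */
static int32_t i_otyp[TOTAL], i_styp[TOTAL], l_numob[TOTAL];

/* One loop with three stores to each array.  With the heuristic, the
   stores to a given array are grouped into a single partition.  */
void clear_fused (void)
{
  for (int i = 0; i < CHUNK; i++)
    {
      i_otyp[i] = 0;
      i_styp[i] = 0;
      l_numob[i] = 0;
      i_otyp[i + CHUNK] = 0;
      i_styp[i + CHUNK] = 0;
      l_numob[i + CHUNK] = 0;
      i_otyp[i + 2 * CHUNK] = 0;
      i_styp[i + 2 * CHUNK] = 0;
      l_numob[i + 2 * CHUNK] = 0;
    }
}

/* Hand-distributed form: nine single-store partitions, each written
   as a memset of CHUNK elements, in the same order as the dump.  */
void clear_distributed (void)
{
  for (int k = 0; k < 3; k++)
    {
      memset (&i_otyp[k * CHUNK], 0, CHUNK * sizeof (int32_t));
      memset (&i_styp[k * CHUNK], 0, CHUNK * sizeof (int32_t));
      memset (&l_numob[k * CHUNK], 0, CHUNK * sizeof (int32_t));
    }
}

/* Test helper: set every element of the three arrays to V.  */
void fill (int32_t v)
{
  for (int i = 0; i < TOTAL; i++)
    i_otyp[i] = i_styp[i] = l_numob[i] = v;
}

/* Test helper: return 1 if all three arrays are entirely zero.  */
int all_zero (void)
{
  for (int i = 0; i < TOTAL; i++)
    if (i_otyp[i] || i_styp[i] || l_numob[i])
      return 0;
  return 1;
}

/* Check that both forms leave the arrays in the same (zeroed) state.  */
void self_check (void)
{
  fill (7);
  clear_fused ();
  assert (all_zero ());
  fill (7);
  clear_distributed ();
  assert (all_zero ());
}
```

Both functions leave the arrays in the same state; self_check only verifies
that equivalence.  Whether GCC actually turns clear_fused into memset calls
depends on the compiler version and on options such as
-ftree-loop-distribute-patterns.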
