http://gcc.gnu.org/bugzilla/show_bug.cgi?id=45199

Sebastian Pop <spop at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|NEW                         |ASSIGNED

--- Comment #4 from Sebastian Pop <spop at gcc dot gnu.org> 2010-11-30 23:08:20 UTC ---
The fix for this one is to disable a heuristic that aggregates writes to
the same array into the same partition:

diff --git a/gcc/tree-loop-distribution.c b/gcc/tree-loop-distribution.c
index 007c4f3..2c2af2c 100644
--- a/gcc/tree-loop-distribution.c
+++ b/gcc/tree-loop-distribution.c
@@ -781,8 +781,9 @@ build_rdg_partition_for_component (struct graph *rdg, rdgc c,
      and determine those vertices that have some memory affinity with
      the current nodes in the component: these are stores to the same
      arrays, i.e. we're taking care of cache locality.  */
-  rdg_flag_similar_memory_accesses (rdg, partition, loops, processed,
-                                    other_stores);
+  if (!flag_tree_loop_distribute_patterns)
+    rdg_flag_similar_memory_accesses (rdg, partition, loops, processed,
+                                      other_stores);
 
   rdg_flag_loop_exits (rdg, loops, partition, processed, part_has_writes);
 

With this patch, on the testcase of this PR, I get the following code
generated:

  # .MEM_54 = VDEF <.MEM_62(D)>
  __builtin_memset (&i_otyp, 0, 4000);
  # .MEM_2 = VDEF <.MEM_54>
  __builtin_memset (&i_styp, 0, 4000);
  # .MEM_78 = VDEF <.MEM_2>
  __builtin_memset (&l_numob, 0, 4000);
  # .MEM_82 = VDEF <.MEM_78>
  __builtin_memset (&i_otyp[1000], 0, 4000);
  # .MEM_83 = VDEF <.MEM_82>
  __builtin_memset (&i_styp[1000], 0, 4000);
  # .MEM_89 = VDEF <.MEM_83>
  __builtin_memset (&l_numob[1000], 0, 4000);
  # .MEM_95 = VDEF <.MEM_89>
  __builtin_memset (&i_otyp[2000], 0, 4000);
  # .MEM_103 = VDEF <.MEM_95>
  __builtin_memset (&i_styp[2000], 0, 4000);
  # .MEM_104 = VDEF <.MEM_103>
  __builtin_memset (&l_numob[2000], 0, 4000);

Note that i_otyp, for example, is written several times.  With the
heuristic enabled, all these writes end up in the same loop partition,
which prevents even the memset (0) pattern from being recognized.
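
To make the effect of the heuristic concrete, here is a rough C sketch of
the kind of loop being discussed.  It is not the PR's testcase: the array
names are borrowed from the dump above, while the int element type, the
1000-iteration trip count, and all function names are assumptions chosen
so that each store covers 4000 bytes on a target with 4-byte int.
clear_fused() keeps every store in one loop, which is roughly what a
single partition looks like; clear_distributed() writes by hand what one
partition per store plus memset recognition would amount to.

```c
#include <assert.h>
#include <string.h>

#define CHUNK 1000              /* elements covered by each store (assumed) */
#define TOTAL (3 * CHUNK)

/* Names taken from the dump; the int element type is an assumption.  */
static int i_otyp[TOTAL], i_styp[TOTAL], l_numob[TOTAL];

/* Put non-zero data in every element so that clearing is observable.  */
static void fill_nonzero(void)
{
    for (int i = 0; i < TOTAL; i++)
        i_otyp[i] = i_styp[i] = l_numob[i] = i + 1;
}

/* One loop, nine stores; each array is written at three offsets.  */
static void clear_fused(void)
{
    for (int i = 0; i < CHUNK; i++) {
        i_otyp[i] = 0;
        i_styp[i] = 0;
        l_numob[i] = 0;
        i_otyp[i + CHUNK] = 0;
        i_styp[i + CHUNK] = 0;
        l_numob[i + CHUNK] = 0;
        i_otyp[i + 2 * CHUNK] = 0;
        i_styp[i + 2 * CHUNK] = 0;
        l_numob[i + 2 * CHUNK] = 0;
    }
}

/* Hand-written equivalent: one memset per store, in the dump's order.  */
static void clear_distributed(void)
{
    const size_t n = CHUNK * sizeof i_otyp[0];

    for (int k = 0; k < 3; k++) {
        memset(&i_otyp[k * CHUNK], 0, n);
        memset(&i_styp[k * CHUNK], 0, n);
        memset(&l_numob[k * CHUNK], 0, n);
    }
}

/* Return 1 if all three arrays are entirely zero, 0 otherwise.  */
static int all_zero(void)
{
    for (int i = 0; i < TOTAL; i++)
        if (i_otyp[i] || i_styp[i] || l_numob[i])
            return 0;
    return 1;
}

/* Run both variants on the same input and compare the outcome.  */
int distributed_matches_fused(void)
{
    fill_nonzero();
    clear_fused();
    int fused_ok = all_zero();

    fill_nonzero();
    clear_distributed();
    int distributed_ok = all_zero();

    assert(fused_ok == distributed_ok);
    return fused_ok && distributed_ok;
}
```

Calling distributed_matches_fused() from a small main() only checks that
the two forms clear the same memory.  Whether GCC itself turns
clear_fused() into nine memset calls depends on the compiler version and
on the loop distribution options in use.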