On Tue, Dec 20, 2011 at 2:23 PM, Rohit Arul Raj <rohitarul...@gmail.com> wrote:
> Hello All,
>
> With the code given below, I expected the ppc compiler (e500mc v4.6.2)
> to generate a 'memset' zero call for the loop initialization (at '-O3'),
> but it generates a loop.
>
> Case:1
>
> int a[18], b[18];
> foo () {
>   int i;
>
>   for (i=0; i < 18; i++)
>     a[i] = 0;
> }
>
> Also based on the '-ftree-loop-distribute-patterns' flag, if the test
> case (taken from gcc doc) is as shown below, the compiler does
> generate 'memset' zero.
>
> Case:2
>
> int a[18], b[18];
> foo () {
>   int i;
>
>   for (i=0; i < 18; i++) {
>     a[i] = 0;               -------------(A)
>     b[i] = a[i] + i;        -------------(B)
>   }
> }
>
> Here statements (A) and (B) are split into two loops and for the 1st
> loop the compiler generates a 'memset' zero call. Isn't the same
> optimization supposed to happen with case (1)?
>
> Also with case (2) statement (A), for loop iterations < 18, the
> compiler unrolls the loop and for iterations >= 18, 'memset' zero is
> generated.
>
> Looking at the 'gcc/tree-loop-distribution.c' file,
>
> static int
> ldist_gen (struct loop *loop, struct graph *rdg,
>            VEC (int, heap) *starting_vertices)
> {
>   ...
>   BITMAP_FREE (processed);
>   nbp = VEC_length (bitmap, partitions);
>
>   if (nbp <= 1
>       || partition_contains_all_rw (rdg, partitions))
>     goto ldist_done;
> ------------------------(Z)
>
>   if (dump_file && (dump_flags & TDF_DETAILS))
>     dump_rdg_partitions (dump_file, partitions);
>
>   FOR_EACH_VEC_ELT (bitmap, partitions, i, partition)
>     if (!generate_code_for_partition (loop, partition, i < nbp - 1))
> -------------------(Y) // code for generating built-in
> 'memset' is called from here.
>       goto ldist_done;
>
>   rewrite_into_loop_closed_ssa (NULL, TODO_update_ssa);
>   update_ssa (TODO_update_ssa_only_virtuals | TODO_update_ssa);
>
>  ldist_done:
>
>   BITMAP_FREE (remaining_stmts);
>
>   .........
>   return nbp;
> }
>
> From statement (Z), if the number of distributed loops is <= 1, then the
> code generating the built-in function (Y) is not executed.
>
> Is it a good solution to update this conditional check for a single loop
> (which is not split) also? Or is there any other place/pass where we
> can implement this?

Well, at least we do not want to create any code if the builtin code
generation would fail.

Richard.

> Regards,
> Rohit
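
For reference, the two test cases from the quoted message can be written out next to the form the poster expected. This is only a hand-written sketch of the source-level equivalence, not output from GCC and not code from tree-loop-distribution.c. The function names, the N macro, and the poison/same_result/self_check helpers are made up for illustration; only the loop bodies and the array declarations come from the thread.

```c
#include <assert.h>
#include <string.h>

#define N 18            /* array size used in both test cases */

int a[N], b[N];

/* Case 1 as posted: a plain zeroing loop. */
void case1_loop(void)
{
    int i;

    for (i = 0; i < N; i++)
        a[i] = 0;
}

/* Case 1 rewritten by hand as the memset call the poster expected. */
void case1_memset(void)
{
    memset(a, 0, sizeof a);
}

/* Case 2 as posted: statements (A) and (B) share one loop. */
void case2_fused(void)
{
    int i;

    for (i = 0; i < N; i++) {
        a[i] = 0;           /* (A) */
        b[i] = a[i] + i;    /* (B) */
    }
}

/* Case 2 distributed by hand: (A) becomes memset, (B) keeps its loop. */
void case2_distributed(void)
{
    int i;

    memset(a, 0, sizeof a);     /* partition for (A) */
    for (i = 0; i < N; i++)     /* partition for (B) */
        b[i] = a[i] + i;
}

/* Fill both arrays with a nonzero pattern so stale zeros cannot hide a bug. */
static void poison(void)
{
    memset(a, 0x5a, sizeof a);
    memset(b, 0x5a, sizeof b);
}

/* Hypothetical helper: returns 1 if f and g leave a[] and b[] identical. */
int same_result(void (*f)(void), void (*g)(void))
{
    int a_ref[N], b_ref[N];

    poison();
    f();
    memcpy(a_ref, a, sizeof a);
    memcpy(b_ref, b, sizeof b);

    poison();
    g();
    return memcmp(a, a_ref, sizeof a) == 0
        && memcmp(b, b_ref, sizeof b) == 0;
}

/* Usage: call from main() to check both hand rewrites. */
void self_check(void)
{
    assert(same_result(case1_loop, case1_memset));
    assert(same_result(case2_fused, case2_distributed));
}
```

Case 1 consists of a single partition, so it falls under the nbp <= 1 test at (Z) that the quoted message points to, while Case 2 yields two partitions and reaches (Y).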