I brought this subject up earlier, and was told to suggest it again for gcc 9, so I have attached the preliminary changes.
My studies have shown that with generic x86-64 optimization, enabling SLP vectorization at -O2 reduces binary size by around 0.5%, and when optimizing for x64 targets with SSE4 or better, it reduces binary size by 2-3% on average. The performance changes are negligible, however*, and I haven't been able to detect changes in compile time big enough to penetrate the general noise on my platform, but perhaps someone has a better setup for that?

* I believe that is because it currently works best on non-optimized code: it is better at big basic blocks doing all kinds of things than at tightly written inner loops.

Anything else I should test or report?

Best regards
'Allan

diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi
index beba295bef5..05851229354 100644
--- a/gcc/doc/invoke.texi
+++ b/gcc/doc/invoke.texi
@@ -7612,6 +7612,7 @@ also turns on the following optimization flags:
 -fstore-merging @gol
 -fstrict-aliasing @gol
 -ftree-builtin-call-dce @gol
+-ftree-slp-vectorize @gol
 -ftree-switch-conversion -ftree-tail-merge @gol
 -fcode-hoisting @gol
 -ftree-pre @gol
@@ -7635,7 +7636,6 @@ by @option{-O2} and also turns on the following optimization flags:
 -floop-interchange @gol
 -floop-unroll-and-jam @gol
 -fsplit-paths @gol
--ftree-slp-vectorize @gol
 -fvect-cost-model @gol
 -ftree-partial-pre @gol
 -fpeel-loops @gol
@@ -8932,7 +8932,7 @@ Perform loop vectorization on trees. This flag is enabled by default at
 @item -ftree-slp-vectorize
 @opindex ftree-slp-vectorize
 Perform basic block vectorization on trees. This flag is enabled by default at
-@option{-O3} and when @option{-ftree-vectorize} is enabled.
+@option{-O2} or higher, and when @option{-ftree-vectorize} is enabled.
 
 @item -fvect-cost-model=@var{model}
 @opindex fvect-cost-model
diff --git a/gcc/opts.c b/gcc/opts.c
index 33efcc0d6e7..11027b847e8 100644
--- a/gcc/opts.c
+++ b/gcc/opts.c
@@ -523,6 +523,7 @@ static const struct default_options default_options_table[] =
     { OPT_LEVELS_2_PLUS, OPT_fipa_ra, NULL, 1 },
     { OPT_LEVELS_2_PLUS, OPT_flra_remat, NULL, 1 },
     { OPT_LEVELS_2_PLUS, OPT_fstore_merging, NULL, 1 },
+    { OPT_LEVELS_2_PLUS, OPT_ftree_slp_vectorize, NULL, 1 },
 
     /* -O3 optimizations. */
     { OPT_LEVELS_3_PLUS, OPT_ftree_loop_distribute_patterns, NULL, 1 },
@@ -539,7 +540,6 @@ static const struct default_options default_options_table[] =
     { OPT_LEVELS_3_PLUS, OPT_floop_unroll_and_jam, NULL, 1 },
     { OPT_LEVELS_3_PLUS, OPT_fgcse_after_reload, NULL, 1 },
     { OPT_LEVELS_3_PLUS, OPT_ftree_loop_vectorize, NULL, 1 },
-    { OPT_LEVELS_3_PLUS, OPT_ftree_slp_vectorize, NULL, 1 },
     { OPT_LEVELS_3_PLUS, OPT_fvect_cost_model_, NULL, VECT_COST_MODEL_DYNAMIC },
     { OPT_LEVELS_3_PLUS, OPT_fipa_cp_clone, NULL, 1 },
     { OPT_LEVELS_3_PLUS, OPT_ftree_partial_pre, NULL, 1 },