Hi,

This patch sets param_vect_partial_vector_usage to 1 by default on
Power10. We had not enabled vectorization with partial vectors before
because the vector-with-length instructions performed unexpectedly on
Power9. Recent testing shows that they perform as expected on Power10.

A performance evaluation of the whole SPEC2017 suite with the latest
trunk and the option set power10/Ofast/unroll shows speedups of 10.80%
on 525.x264_r and 1.94% on 554.roms_r. The one notable degradation is
523.xalancbmk_r at -1.79%, but investigation indicates it is not
directly related to this enablement.

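For readers unfamiliar with partial vectors, here is a rough sketch in plain C of the code shape this setting is meant to allow: full-width vector iterations followed by a single length-controlled load/store for the leftover elements, instead of a scalar tail loop. The names `vec16`, `load_with_len`, `store_with_len` and `add_one_partial_epilogue` are hypothetical and only model the idea of lxvl/stxvl in simplified form; this is not what the vectorizer literally emits.

```c
#include <stddef.h>
#include <string.h>

#define VEC_BYTES 16 /* width of one 128-bit vector register */

/* Hypothetical stand-in for a 128-bit vector register.  */
typedef struct { unsigned char b[VEC_BYTES]; } vec16;

/* Simplified model of a load with length: read LEN bytes (capped at
   16) and zero the remaining lanes.  */
static vec16
load_with_len (const unsigned char *src, size_t len)
{
  vec16 v;
  if (len > VEC_BYTES)
    len = VEC_BYTES;
  memset (v.b, 0, VEC_BYTES);
  memcpy (v.b, src, len);
  return v;
}

/* Simplified model of a store with length: write only the first LEN
   bytes (capped at 16) and leave the rest of memory untouched.  */
static void
store_with_len (unsigned char *dst, vec16 v, size_t len)
{
  if (len > VEC_BYTES)
    len = VEC_BYTES;
  memcpy (dst, v.b, len);
}

/* dst[i] = src[i] + 1 for i in [0, n).  */
void
add_one_partial_epilogue (unsigned char *dst, const unsigned char *src,
                          size_t n)
{
  size_t i = 0;

  /* Main loop: full vectors only.  */
  for (; i + VEC_BYTES <= n; i += VEC_BYTES)
    {
      vec16 v = load_with_len (src + i, VEC_BYTES);
      for (int k = 0; k < VEC_BYTES; k++)
        v.b[k] = (unsigned char) (v.b[k] + 1);
      store_with_len (dst + i, v, VEC_BYTES);
    }

  /* Epilogue: one partial vector covers the n % 16 leftover bytes, so
     no scalar tail loop is needed.  */
  if (i < n)
    {
      size_t len = n - i;
      vec16 v = load_with_len (src + i, len);
      for (int k = 0; k < VEC_BYTES; k++)
        v.b[k] = (unsigned char) (v.b[k] + 1);
      store_with_len (dst + i, v, len);
    }
}
```

With n = 21, for example, the main loop handles bytes 0..15 and the epilogue handles bytes 16..20 in one length-5 operation without touching dst[21].
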
Bootstrapped/regtested on powerpc64le-linux-gnu P10.

Is it ok for trunk?

BR,
Kewen
------
gcc/ChangeLog:

	* config/rs6000/rs6000.c (rs6000_option_override_internal): Set
	param_vect_partial_vector_usage as 1 for Power10 and up by default.

diff --git a/gcc/config/rs6000/rs6000.c b/gcc/config/rs6000/rs6000.c
index d8ac2f0cd2f..c956d5a605b 100644
--- a/gcc/config/rs6000/rs6000.c
+++ b/gcc/config/rs6000/rs6000.c
@@ -4781,10 +4781,14 @@ rs6000_option_override_internal (bool global_init_p)
       SET_OPTION_IF_UNSET (&global_options, &global_options_set,
                            param_max_completely_peeled_insns, 400);
 
-      /* Temporarily disable it for now since lxvl/stxvl on the default
-         supported hardware Power9 has unexpected performance behaviors.  */
-      SET_OPTION_IF_UNSET (&global_options, &global_options_set,
-                           param_vect_partial_vector_usage, 0);
+      if (TARGET_POWER10)
+        SET_OPTION_IF_UNSET (&global_options, &global_options_set,
+                             param_vect_partial_vector_usage, 1);
+      else
+        /* Disable it on the default supported hardware Power9 since
+           lxvl/stxvl have unexpected performance behaviors.  */
+        SET_OPTION_IF_UNSET (&global_options, &global_options_set,
+                             param_vect_partial_vector_usage, 0);
 
       /* Use the 'model' -fsched-pressure algorithm by default.  */
       SET_OPTION_IF_UNSET (&global_options, &global_options_set,
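
As a note on the hunk above: because the new defaults go through SET_OPTION_IF_UNSET, an explicit --param vect-partial-vector-usage=N from the user should still take precedence on both Power9 and Power10. The following is a minimal standalone model of that "default only if unset" pattern. The structs `opts` and `opts_set`, the macro `SET_IF_UNSET` and the function `override_defaults` are hypothetical reductions for illustration, not GCC's real definitions.

```c
#include <stdbool.h>

/* Hypothetical, much-reduced stand-ins for GCC's option state: one
   struct holds the values, the other records which options the user
   set explicitly.  */
struct opts { int vect_partial_vector_usage; };
struct opts_set { bool vect_partial_vector_usage; };

/* Assign VALUE only when the user did not set FIELD explicitly.  */
#define SET_IF_UNSET(OPTS, OPTS_SET, FIELD, VALUE) \
  do                                               \
    {                                              \
      if (!(OPTS_SET)->FIELD)                      \
        (OPTS)->FIELD = (VALUE);                   \
    }                                              \
  while (0)

/* Model of the patched logic: default to 1 on Power10, 0 otherwise.  */
void
override_defaults (struct opts *o, const struct opts_set *set,
                   bool target_power10)
{
  if (target_power10)
    SET_IF_UNSET (o, set, vect_partial_vector_usage, 1);
  else
    SET_IF_UNSET (o, set, vect_partial_vector_usage, 0);
}
```

In this model, a value chosen by the user is left alone regardless of the target, while an unset value becomes 1 on Power10 and 0 elsewhere.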