On Tue, Mar 23, 2021 at 12:59 AM H.J. Lu via Gcc-patches
<gcc-patches@gcc.gnu.org> wrote:
>
> On Mon, Mar 22, 2021 at 7:10 AM Jan Hubicka <hubi...@ucw.cz> wrote:
> >
> > >
> > > gcc/
> > >
> > >       * config/i386/i386-expand.c (expand_set_or_cpymem_via_rep):
> > >       For TARGET_PREFER_KNOWN_REP_MOVSB_STOSB, don't convert QImode
> > >       to SImode.
> > >       (decide_alg): For TARGET_PREFER_KNOWN_REP_MOVSB_STOSB, use
> > >       "rep movsb/stosb" only for known sizes.
> > >       * config/i386/i386-options.c (processor_cost_table): Use Ice
> > >       Lake cost for Cannon Lake, Ice Lake, Tiger Lake, Sapphire
> > >       Rapids and Alder Lake.
> > >       * config/i386/i386.h (TARGET_PREFER_KNOWN_REP_MOVSB_STOSB): New.
> > >       * config/i386/x86-tune-costs.h (icelake_memcpy): New.
> > >       (icelake_memset): Likewise.
> > >       (icelake_cost): Likewise.
> > >       * config/i386/x86-tune.def (X86_TUNE_PREFER_KNOWN_REP_MOVSB_STOSB):
> > >       New.
> >
> > It looks like X86_TUNE_PREFER_KNOWN_REP_MOVSB_STOSB is quite obviously
> > beneficial and independent of the rest of the changes.  I think we will
> > need to discuss a bit more the move ratio and the code size/uop cache
> > pollution issues - one option would be to use increased limits for -O3
> > only.
>
> My change only increases CLEAR_RATIO, not MOVE_RATIO.  We are
> checking code size impacts on SPEC CPU 2017 and eembc.
>
> > Can you break this out to an independent patch?  I also wonder if it would
>
> X86_TUNE_PREFER_KNOWN_REP_MOVSB_STOSB improves performance
> only when memcpy/memset costs and MOVE_RATIO are updated at the same time,
> like:
>
> https://gcc.gnu.org/pipermail/gcc-patches/2021-March/567096.html
>
> Making it standalone means moving it from the Ice Lake patch to the
> Skylake patch.
>
> > not be more readable to special case this just at the beginning of
> > decide_alg.
> > > @@ -6890,6 +6891,7 @@ decide_alg (HOST_WIDE_INT count, HOST_WIDE_INT
> > > expected_size,
> > >    const struct processor_costs *cost;
> > >    int i;
> > >    bool any_alg_usable_p = false;
> > > +  bool known_size_p = expected_size != -1;
> >
> > expected_size is not -1 if we have profile feedback and we detected from
> > histogram average size of a block.  It seems to me that from description
> > that you want the const to be actual compile time constant that would be
> > min_size == max_size I guess.
> >
>
> You are right.  Here is the v2 patch with min_size != max_size check for
> unknown size.

OK.

Thanks,
Richard.

> Thanks.
>
> --
> H.J.
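
For readers following the known_size_p discussion above, here is a minimal
sketch of the two checks side by side.  It is not the code from either
version of the patch: the helper names known_size_v1 and known_size_v2 and
the host_wide_int typedef are hypothetical stand-ins, and only the parameter
names expected_size, min_size and max_size come from the thread.

```c
#include <stdbool.h>

/* Stand-in for GCC's HOST_WIDE_INT; an assumption made for this sketch.  */
typedef long long host_wide_int;

/* Check as written in the quoted v1 hunk: the size counts as "known"
   whenever an expected size is available.  As noted in the review, a
   profile histogram can also supply an expected size, so this is true
   for blocks whose size is only estimated.  */
static bool
known_size_v1 (host_wide_int expected_size)
{
  return expected_size != -1;
}

/* Check along the lines suggested in the review: the size counts as
   "known" only when the lower and upper bounds coincide, i.e. the
   size is a compile-time constant.  */
static bool
known_size_v2 (host_wide_int min_size, host_wide_int max_size)
{
  return min_size == max_size;
}
```

With an expected size of 64 estimated from profile data and bounds of 16 and
256, known_size_v1 (64) is true while known_size_v2 (16, 256) is false, which
is the case the review comment points out.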