> 
> gcc/
> 
>       * config/i386/i386-expand.c (expand_set_or_cpymem_via_rep):
>       For TARGET_PREFER_KNOWN_REP_MOVSB_STOSB, don't convert QImode
>       to SImode.
>       (decide_alg): For TARGET_PREFER_KNOWN_REP_MOVSB_STOSB, use
>       "rep movsb/stosb" only for known sizes.
>       * config/i386/i386-options.c (processor_cost_table): Use Ice
>       Lake cost for Cannon Lake, Ice Lake, Tiger Lake, Sapphire
>       Rapids and Alder Lake.
>       * config/i386/i386.h (TARGET_PREFER_KNOWN_REP_MOVSB_STOSB): New.
>       * config/i386/x86-tune-costs.h (icelake_memcpy): New.
>       (icelake_memset): Likewise.
>       (icelake_cost): Likewise.
>       * config/i386/x86-tune.def (X86_TUNE_PREFER_KNOWN_REP_MOVSB_STOSB):
>       New.

It looks like X86_TUNE_PREFER_KNOWN_REP_MOVSB_STOSB is quite obviously
beneficial and independent of the rest of the changes.  I think we will
need to discuss the move ratio and the code size/uop cache pollution
issues a bit more - one option would be to use increased limits for -O3
only.

Can you break this out into an independent patch?  I also wonder if it
would not be more readable to special-case this right at the beginning
of decide_alg.
> @@ -6890,6 +6891,7 @@ decide_alg (HOST_WIDE_INT count, HOST_WIDE_INT expected_size,
>    const struct processor_costs *cost;
>    int i;
>    bool any_alg_usable_p = false;
> +  bool known_size_p = expected_size != -1;

expected_size is also not -1 when we have profile feedback and detected
the average size of a block from the histogram.  From the description it
seems to me that you want the size to be an actual compile-time
constant, which would be min_size == max_size, I guess.
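
To illustrate the difference, here is a rough standalone sketch, not the
actual decide_alg code.  The helper names and the hwi typedef are made up
for this example; only the two conditions come from the discussion above.

```c
#include <stdbool.h>

/* Hypothetical stand-in for GCC's HOST_WIDE_INT, so that this sketch
   compiles on its own.  */
typedef long long hwi;

/* The check used in the patch: true whenever some size estimate exists.
   With profile feedback the estimate can be a histogram average, so
   this does not imply a compile-time constant.  */
bool
size_hinted_p (hwi expected_size)
{
  return expected_size != -1;
}

/* The suggested check: true only when the lower and upper bounds of the
   block size coincide, i.e. the size is an actual constant.  */
bool
size_known_p (hwi min_size, hwi max_size)
{
  return min_size == max_size;
}
```

For a block whose size is only estimated from the profile (say bounds of
0 and 4096 with an expected size of 64), the first check is true while
the second is false.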

Honza
