> which is not the case with core_cost (and similar with skylake_cost):
>
>   2, 2, 4,                /* cost of moving XMM,YMM,ZMM register */
>   {6, 6, 6, 6, 12},            /* cost of loading SSE registers
>                        in 32,64,128,256 and 512-bit */
>   {6, 6, 6, 6, 12},            /* cost of storing SSE registers
>                        in 32,64,128,256 and 512-bit */
>   2, 2,                    /* SSE->integer and integer->SSE moves */
>
> We have the same cost of moving between integer registers (by default
> set to 2), between SSE registers and between integer and SSE register
> sets. I think that at least the cost of moves between regsets should
> be substantially higher, rs6000 uses 3x cost of intra-regset moves;
> that would translate to the value of 6. The value should be low enough
> to keep the cost below the value that forces move through the memory.
> Changing core register allocation cost of SSE <-> integer to:
>
> --cut here--
> Index: config/i386/x86-tune-costs.h
> ===================================================================
> --- config/i386/x86-tune-costs.h        (revision 275281)
> +++ config/i386/x86-tune-costs.h        (working copy)
> @@ -2555,7 +2555,7 @@ struct processor_costs core_cost = {
>                                            in 32,64,128,256 and 512-bit */
>    {6, 6, 6, 6, 12},                    /* cost of storing SSE registers
>                                            in 32,64,128,256 and 512-bit */
> -  2, 2,                                        /* SSE->integer and integer->SSE moves */
> +  6, 6,                                        /* SSE->integer and integer->SSE moves */
>    /* End of register allocator costs.  */
>    },
>
> --cut here--
>
> still produces direct move in gcc.target/i386/minmax-6.c
>
> I think that in addition to attached patch, values between 2 and 6
> should be considered in benchmarking. Unfortunately, without access to
> regressed SPEC tests, I can't analyse these changes by myself.
>
> Uros.

After applying a similar change to skylake_cost, we got the following
performance on a Skylake workstation:
------------------------------------------------------------------
version                                     | 548_exchange_r score
gcc10_20180822                              | 10
apply remove_max8                           | 8.9
also apply increase integer_tofrom_sse cost | 9.69
------------------------------------------------------------------
There is still a 3% regression against the baseline, which is related to
_gfortran_mminloc0_4_i4 in libgfortran.so.5.0.0.
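
For reference, the percentages follow directly from the scores in the
table. A minimal sketch of the arithmetic in C (the function names are
made up for illustration and are not part of any benchmark tooling):

```c
#include <assert.h>
#include <stdio.h>

/* Relative change of SCORE against BASELINE, in percent.  A negative
   result means a regression, since a higher SPEC score is better.  */
double
score_change_percent (double baseline, double score)
{
  assert (baseline > 0.0);
  return (score - baseline) / baseline * 100.0;
}

/* Print the change for the two patched configurations from the table,
   both measured against the gcc10_20180822 baseline score of 10.  */
void
print_score_changes (void)
{
  printf ("remove_max8:                  %.1f%%\n",
          score_change_percent (10.0, 8.9));
  printf ("+ integer_tofrom_sse cost:    %.1f%%\n",
          score_change_percent (10.0, 9.69));
}
```

That is roughly -11% with remove_max8 alone and roughly -3.1% once the
integer <-> SSE cost is raised as well.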

I found the suspicious code below; could it be affecting this?
------------------
modified   gcc/config/i386/i386-features.c
@@ -590,7 +590,7 @@ general_scalar_chain::compute_convert_gain ()
   if (dump_file)
     fprintf (dump_file, "  Instruction conversion gain: %d\n", gain);

-  /* ???  What about integer to SSE?  */
+  /* ???  What about integer to SSE?  */???
   EXECUTE_IF_SET_IN_BITMAP (defs_conv, 0, insn_uid, bi)
     cost += DF_REG_DEF_COUNT (insn_uid) * ix86_cost->sse_to_integer;
------------------
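
To make the concern concrete, here is a simplified stand-alone model of
that loop. This is only a sketch and not the GCC code: toy_costs and
toy_conversion_cost are hypothetical names, and the array of definition
counts stands in for what DF_REG_DEF_COUNT returns for each entry in
defs_conv.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the single ix86_cost field used in the
   loop quoted above.  */
struct toy_costs
{
  int sse_to_integer;   /* cost of one SSE->integer move */
};

/* Sum def_count * sse_to_integer over all registers that need a
   conversion, mirroring the shape of the loop over defs_conv.  */
int
toy_conversion_cost (const int *def_counts, size_t n,
                     const struct toy_costs *costs)
{
  int cost = 0;

  assert (def_counts != NULL || n == 0);
  for (size_t i = 0; i < n; i++)
    cost += def_counts[i] * costs->sse_to_integer;
  return cost;
}
```

In this model, definition counts of {1, 2} give a total of 6 with
sse_to_integer = 2 and 18 with sse_to_integer = 6. Only the
SSE->integer cost enters the sum, which is the direction the ???
comment is asking about.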
-- 
BR,
Hongtao
