> > gcc/ChangeLog:
> >
> >         * config/i386/i386-features.cc
> >         (scalar_chain::mark_dual_mode_def): Weight
> >         n_integer_to_sse/n_sse_to_integer with bb frequency.
> >         (general_scalar_chain::compute_convert_gain): Ditto, and
> >         adjust function prototype to return true/false when cost model
> >         is profitable or not.
> >         (timode_scalar_chain::compute_convert_gain): Ditto.
> >         (convert_scalars_to_vector): Adjust after the above two
> >         function prototypes are changed.
> >         * config/i386/i386-features.h (class scalar_chain): Change
> >         n_integer_to_sse/n_sse_to_integer to cost_sse_integer, and add
> >         weighted_cost_sse_integer.
> >         (class general_scalar_chain): Adjust prototype to return bool
> >         instead of int.
> >         (class timode_scalar_chain): Ditto.
> >
> > gcc/testsuite/ChangeLog:
> >
> >         * gcc.target/i386/minmax-6.c: Adjust testcase.

Sorry, I think I inadvertently caused some conflicts with my other patch
to stv.

> > --- a/gcc/config/i386/i386-features.cc
> > +++ b/gcc/config/i386/i386-features.cc
> > @@ -296,8 +296,8 @@ scalar_chain::scalar_chain (enum machine_mode smode_, enum machine_mode vmode_)
> >    insns_conv = BITMAP_ALLOC (NULL);
> >    queue = NULL;
> >
> > -  n_sse_to_integer = 0;
> > -  n_integer_to_sse = 0;
> > +  cost_sse_integer = 0;
> > +  weighted_cost_sse_integer = 0 ;

Extra space here.

> >
> >    max_visits = x86_stv_max_visits;
> >  }
> > @@ -337,20 +337,40 @@ scalar_chain::mark_dual_mode_def (df_ref def)
> >    /* Record the def/insn pair so we can later efficiently iterate over
> >       the defs to convert on insns not in the chain.  */
> >    bool reg_new = bitmap_set_bit (defs_conv, DF_REF_REGNO (def));
> > +  basic_block bb = BLOCK_FOR_INSN (DF_REF_INSN (def));
> > +  profile_count entry_count = ENTRY_BLOCK_PTR_FOR_FN (cfun)->count;
> > +  bool speed_p = optimize_bb_for_speed_p (bb);
> > +  sreal bb_freq = bb->count.to_sreal_scale (entry_count);
> > +  int cost = 0;
> > +
> >    if (!bitmap_bit_p (insns, DF_REF_INSN_UID (def)))
> >      {
> >        if (!bitmap_set_bit (insns_conv, DF_REF_INSN_UID (def))
> >            && !reg_new)
> >          return;
> > -      n_integer_to_sse++;
> > +
> > +      /* ??? integer_to_sse but we only have that in the RA cost table.
> > +         Assume sse_to_integer/integer_to_sse are the same which they are.  */
> > +      cost = speed_p ? ix86_cost->sse_to_integer
> > +                     : ix86_size_cost.sse_to_integer;
> >      }
> >    else
> >      {
> >        if (!reg_new)
> >          return;
> > -      n_sse_to_integer++;
> > +      cost = speed_p ? ix86_cost->sse_to_integer
> > +                     : ix86_size_cost.sse_to_integer;

Please be sure not to revert the changes from my patch adding
COSTS_N_INSNS (...) / 2 here and in some other places.  I think with your
change the size metrics will get less accurate.  The old code had
hard-wired sizes which depend on the ISA we use:

  else if (TARGET_64BIT || smode == SImode)
    {
      cost += n_sse_to_integer * COSTS_N_BYTES (4);
      cost += n_integer_to_sse * COSTS_N_BYTES (4);
    }
  else if (TARGET_SSE4_1)
    {
      /* vmovd (4 bytes) + vpextrd (6 bytes).  */
      cost += n_sse_to_integer * COSTS_N_BYTES (10);
      /* vmovd (4 bytes) + vpinsrd (6 bytes).  */
      cost += n_integer_to_sse * COSTS_N_BYTES (10);
    }
  else
    {
      /* movd (4 bytes) + psrlq (5 bytes) + movd (4 bytes).  */
      cost += n_sse_to_integer * COSTS_N_BYTES (13);
      /* movd (4 bytes) + movd (4 bytes) + unpckldq (4 bytes).  */
      cost += n_integer_to_sse * COSTS_N_BYTES (12);
    }

while you now use the common size_costs->sse_to_integer, which will not be
able to distinguish these cases.  Perhaps we want to have a function
returning the correct sizes depending on the ISA and use it here (see the
sketch at the end of this mail)?  (In the future it may be useful to get
correct size costs for the vectorizer too.)

> >      }
> >
> > +  if (speed_p)
> > +    weighted_cost_sse_integer += bb_freq * cost;

You use bb_freq just once, so it may be easier to just call the function
here.

> > -  if (optimize_insn_for_size_p ())
> > +  if (!speed_p)

If the size tables are right, we should be able to eliminate some of these
tests, since the same computation as for speed_p should work for size.
However, sizes are somewhat approximate, so this needs to be done
carefully and is probably better done incrementally.

Patch is OK with these changes.  My apologies for the merge conflict.

Honza
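
P.S. Roughly what I had in mind with such a helper, as a sketch only (the
name ix86_stv_sse_int_move_size_cost is made up and the byte counts simply
mirror the old hard-wired table above, so they would need double-checking):

/* Rough size cost of a single move between SSE and integer registers
   for the current ISA, mirroring the old hard-wired byte counts.  */
static int
ix86_stv_sse_int_move_size_cost (machine_mode smode, bool integer_to_sse)
{
  if (TARGET_64BIT || smode == SImode)
    /* movd (4 bytes).  */
    return COSTS_N_BYTES (4);
  else if (TARGET_SSE4_1)
    /* vmovd (4 bytes) + vpextrd/vpinsrd (6 bytes).  */
    return COSTS_N_BYTES (10);
  else if (integer_to_sse)
    /* movd (4 bytes) + movd (4 bytes) + unpckldq (4 bytes).  */
    return COSTS_N_BYTES (12);
  else
    /* movd (4 bytes) + psrlq (5 bytes) + movd (4 bytes).  */
    return COSTS_N_BYTES (13);
}

mark_dual_mode_def could then do something like

  cost = speed_p ? ix86_cost->sse_to_integer
                 : ix86_stv_sse_int_move_size_cost (smode, true);

in the integer->sse branch (and with false in the other branch) instead of
reading ix86_size_cost.sse_to_integer directly.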