On Mon, 8 Jul 2024, Filip Kastl wrote: > Hi, > > I'm replying to Richard and keeping Andrew in cc since your suggestions > overlap. > > > On Tue 2024-06-11 14:48:06, Richard Biener wrote: > > On Thu, 30 May 2024, Filip Kastl wrote: > > > +/* { dg-do compile } */ > > > +/* { dg-options "-O2 -fdump-tree-switchconv -march=znver3" } */ > > > > I think it's better to enable -mpopcnt and -mbmi (or what remains > > as minimal requirement). > > Will do. Currently the testcases are in the i386 directory. After I exchange > the -march for -mpopcnt -mbmi can I put these testcases into gcc.dg/tree-ssa? > Will the -mpopcnt -mbmi options work with all target architectures?
No, those are i386 specific flags. At least for popcount there's dejagnu effective targets popcount, popcountl and popcountll so you could do /* { dg-additional-options "-mpopcnt" { target { x86_64-*-* i?86-*-* } } } */ and guard the tree dump scan with { target popcount } to cover other archs that have popcount (without adding extra flags). > > > +/* Check that the "exponential index transform" can be applied to this > > > switch. > > > + > > > + See comment of the exp_index_transform function for details about this > > > + transformation. > > > + > > > + We want: > > > + - This form of the switch is more efficient > > > + - Cases are powers of 2 > > > + > > > + Expects that SWTCH has at least one case. */ > > > + > > > +bool > > > +switch_conversion::is_exp_index_transform_viable (gswitch *swtch) > > > +{ > > > + tree index = gimple_switch_index (swtch); > > > + tree index_type = TREE_TYPE (index); > > > + basic_block swtch_bb = gimple_bb (swtch); > > > + unsigned num_labels = gimple_switch_num_labels (swtch); > > > + > > > + /* Check that we can efficiently compute logarithm of 2^k (using FFS) > > > and > > > + test that a given number is 2^k for some k (using POPCOUNT). */ > > > + optimization_type opt_type = bb_optimization_type (swtch_bb); > > > + if (!direct_internal_fn_supported_p (IFN_FFS, index_type, opt_type) > > > + || !direct_internal_fn_supported_p (IFN_POPCOUNT, index_type, > > > opt_type)) > > > + return false; > > > + > > > > See above, I think this can be improved. Would be nice to split out > > a can_pow2p (type) and can_log2 (type) and a corresponding > > gen_pow2p (op) and gen_log2 (op) function so this could be re-used > > and alternate variants added when required. > > > > Just to check that I understand this correctly: You'd like me to create > functions can_pow2p, can_log2. Those functions will check that there are > optabs for the target machine which allow us to efficiently test > power-of-2-ness of a number and which allow us to efficiently compute the > base-2 log of a power-of-2 number. You'd also like me to create functions > gen_pow2p and gen_log2 which generate this code. For now these functions will > just use POPCOUNT and FFS but they can be later extended to also consider > different instructions. Is that right? Right. > Into which file should I put these functions? Just in this file for now. > Is can_pow2p and gen_pow2p necessary? As you noted one can always use > (x & -x) == x so testing pow2p can always be done efficiently. If you add this fallback then can_pow2p / gen_pow2p wouldn't be necessary indeed. > > > + /* Insert a statement that takes the logarithm of the index variable. > > > */ > > > + tree tmp2 = make_ssa_name (index_type); > > > + gsi = gsi_start_bb (swtch_bb); > > > > Please use gsi_after_labels (swtch_bb) even though you know there's no > > labels there. > > > > > + gcall *stmt_ffs = gimple_build_call_internal (IFN_FFS, 1, index); > > > + gimple_call_set_lhs (stmt_ffs, tmp2); > > > + gsi_insert_before (&gsi, stmt_ffs, GSI_SAME_STMT); > > > + > > > + tree tmp3 = make_ssa_name (index_type); > > > + gassign *stmt_minus_one = gimple_build_assign (tmp3, MINUS_EXPR, tmp2, > > > one); > > > + gsi_insert_before (&gsi, stmt_minus_one, GSI_SAME_STMT); > > > > You could also use > > > > tree tmp2 = gimple_build (gsi, true, GSI_SAME_STMT, > > UNKNOWN_LOCATION, IFN_FFS, index); > > tree tmp3 = gimple_build (gsi, true, GSI_SAME_STMT, > > UNKNOWN_LOCATION, MINUS_EXPR, tmp2, one); > > > > which does the stmt building, temporary SSA name creation and insertion > > plus eventually folding things with a definition. There isn't a > > gimple_build_cond with this API, but it would still work for the > > popcount build above. > > I've tried using the gimple_build API and it is indeed nicer. However, I > wasn't able to get it working with the FFS internal function. IFN_FFS is of a > different type than what gimple_build accepts. I've tried this > > tree tmp2 = gimple_build (&gsi, true, GSI_SAME_STMT, UNKNOWN_LOCATION, > as_combined_fn (IFN_FFS), index); > > but that only caused an ICE. I tried looking for an example of using > gimple_build to build an internal function call in the sources but I haven't > found any. > > If you'd be willing to help me with this, I will happily use the gimple_build > API. If I don't figure this out I will just stick with the less clean > original > version. I think you need to use tree tmp2 = gimple_build (&gsi, true, GSI_SAME_STMT, UNKNOWN_LOCATION, IFN_FFS, integer_type_node, index); that is, you always need to specify the type of the value produced (here the return type of ffs). > > > + /* Use the result of the logarithm as the new index variable. */ > > > + gimple_switch_set_index (swtch, tmp3); > > > + update_stmt (swtch); > > > + > > > + /* Replace each case number with its logarithm. */ > > > + unsigned i; > > > + for (i = 1; i < num_labels; i++) > > > + { > > > + tree label = gimple_switch_label (swtch, i); > > > + unsigned HOST_WIDE_INT label_hwi = tree_to_uhwi (CASE_LOW (label)); > > > + CASE_LOW (label) = build_int_cst (index_type, floor_log2 > > (label_hwi)); > > > > You should be able to directly use > > > > CASE_LOW (label) = tree_log2 (CASE_LOW (label)); > > > > Do you have test coverage for a signed index and a 0x80000000 case? > > It would satisfy popcount and ffs should work but the constant > > doesnt fit an uhwi and likely tree_log2 would ICE on it. > > A situation with signed index variable and a 0x80000000 case shouldn't be a > problem since I check all cases for tree_fits_uhwi_p. To make sure I made > this > testcase > > int foo() > { > int a, b; > switch (a) > { > case 0x10000000: > b = 0; > break; > case 0x20000000: > b = 1; > break; > case 0x40000000: > b = 2; > break; > case 0x80000000: > b = 3; > break; > } > return b; > } > > and indeed, is_exp_transform_viable returned false due to the last case. No > ICE happened. I added this testcase to my personal set of testcases that I'll > run on subsequent versions of this patch. > > > > > As Andrew said working with wide_int might be best though signedness > > might interfere here as well. > > > > Ok, I'm looking into using wide_ints instead of HOST_WIDE_INTs. However, I > need some help: How should I check that case numbers are nonnegative? If I > understand it correctly wide_ints don't carry signedness information so I'll > have to somehow check nonnegativity of trees, right? Should I continue using > tree_fits_uhwi_p for this? You can use wi::ge_p (wide-int, 0, TYPE_SIGN (type-of-constant)) that will then dispatch to ge{s,u}_p based on whether the number is signed or unsigned. > > > + } > > > + > > > + /* Update information about the switch statement. */ > > > + tree first_label = gimple_switch_label (swtch, 1); > > > + tree last_label = gimple_switch_label (swtch, num_labels - 1); > > > + > > > + m_range_min = CASE_LOW (first_label); > > > + m_range_max = CASE_LOW (last_label); > > > + m_index_expr = gimple_switch_index (swtch); > > > + m_switch_bb = swtch_bb; > > > + > > > + unsigned HOST_WIDE_INT range_size_hwi = tree_to_uhwi (m_range_max) > > > + - tree_to_uhwi (m_range_min); > > > + m_range_size = build_int_cst (index_type, range_size_hwi); > > > > Use int_const_binop (MINUS_EXPR, m_range_max, m_range_min) as is done > > elsewhere (m_range_size should be a wide_int, but that's out of scope > > here). > > Just to check that I understand the bit in parentheses correctly: You think > that the switch conversion pass should represent m_range_size as a wide_int > instead of a tree. However, that isn't something that should be changed in my > patch (it could be changed in some future patch instead), right? Right. Thanks, Richard. > > Thanks for the suggestions. > > Cheers, > Filip Kastl > -- Richard Biener <rguent...@suse.de> SUSE Software Solutions Germany GmbH, Frankenstrasse 146, 90461 Nuernberg, Germany; GF: Ivo Totev, Andrew McDonald, Werner Knoblich; (HRB 36809, AG Nuernberg)