On Mon, 8 Jul 2024, Filip Kastl wrote:

> Hi,
> 
> I'm replying to Richard and keeping Andrew in cc since your suggestions
> overlap.
> 
> 
> On Tue 2024-06-11 14:48:06, Richard Biener wrote:
> > On Thu, 30 May 2024, Filip Kastl wrote:
> > > +/* { dg-do compile } */
> > > +/* { dg-options "-O2 -fdump-tree-switchconv -march=znver3" } */
> > 
> > I think it's better to enable -mpopcnt and -mbmi (or what remains
> > as minimal requirement).
> 
> Will do.  Currently the testcases are in the i386 directory.  After I exchange
> the -march for -mpopcnt -mbmi can I put these testcases into gcc.dg/tree-ssa?
> Will the -mpopcnt -mbmi options work with all target architectures?

No, those are i386 specific flags.  At least for popcount there's
dejagnu effective targets popcount, popcountl and popcountll so you
could do

/* { dg-additional-options "-mpopcnt" { target { x86_64-*-* i?86-*-* } } } 
*/

and guard the tree dump scan with { target popcount } to cover other
archs that have popcount (without adding extra flags).

> > > +/* Check that the "exponential index transform" can be applied to this 
> > > switch.
> > > +
> > > +   See comment of the exp_index_transform function for details about this
> > > +   transformation.
> > > +
> > > +   We want:
> > > +   - This form of the switch is more efficient
> > > +   - Cases are powers of 2
> > > +
> > > +   Expects that SWTCH has at least one case.  */
> > > +
> > > +bool
> > > +switch_conversion::is_exp_index_transform_viable (gswitch *swtch)
> > > +{
> > > +  tree index = gimple_switch_index (swtch);
> > > +  tree index_type = TREE_TYPE (index);
> > > +  basic_block swtch_bb = gimple_bb (swtch);
> > > +  unsigned num_labels = gimple_switch_num_labels (swtch);
> > > +
> > > +  /* Check that we can efficiently compute logarithm of 2^k (using FFS) 
> > > and
> > > +     test that a given number is 2^k for some k (using POPCOUNT).  */
> > > +  optimization_type opt_type = bb_optimization_type (swtch_bb);
> > > +  if (!direct_internal_fn_supported_p (IFN_FFS, index_type, opt_type)
> > > +    || !direct_internal_fn_supported_p (IFN_POPCOUNT, index_type, 
> > > opt_type))
> > > +    return false;
> > > +
> > 
> > See above, I think this can be improved.  Would be nice to split out
> > a can_pow2p (type) and can_log2 (type) and a corresponding
> > gen_pow2p (op) and gen_log2 (op) function so this could be re-used
> > and alternate variants added when required.
> > 
> 
> Just to check that I understand this correctly:  You'd like me to create
> functions can_pow2p, can_log2.  Those functions will check that there are
> optabs for the target machine which allow us to efficiently test
> power-of-2-ness of a number and which allow us to efficiently compute the
> base-2 log of a power-of-2 number.  You'd also like me to create functions
> gen_pow2p and gen_log2 which generate this code.  For now these functions will
> just use POPCOUNT and FFS but they can be later extended to also consider
> different instructions.  Is that right?

Right.

> Into which file should I put these functions?

Just in this file for now.
 
> Is can_pow2p and gen_pow2p necessary?  As you noted one can always use
> (x & -x) == x so testing pow2p can always be done efficiently.

If you add this fallback then can_pow2p / gen_pow2p wouldn't be
necessary indeed.

> > > +  /* Insert a statement that takes the logarithm of the index variable.  
> > > */
> > > +  tree tmp2 = make_ssa_name (index_type);
> > > +  gsi = gsi_start_bb (swtch_bb);
> > 
> > Please use gsi_after_labels (swtch_bb) even though you know there's no
> > labels there.
> > 
> > > +  gcall *stmt_ffs = gimple_build_call_internal (IFN_FFS, 1, index);
> > > +  gimple_call_set_lhs (stmt_ffs, tmp2);
> > > +  gsi_insert_before (&gsi, stmt_ffs, GSI_SAME_STMT);
> > > +
> > > +  tree tmp3 = make_ssa_name (index_type);
> > > +  gassign *stmt_minus_one = gimple_build_assign (tmp3, MINUS_EXPR, tmp2, 
> > > one);
> > > +  gsi_insert_before (&gsi, stmt_minus_one, GSI_SAME_STMT);
> > 
> > You could also use
> > 
> >      tree tmp2 = gimple_build (gsi, true, GSI_SAME_STMT,
> >                            UNKNOWN_LOCATION, IFN_FFS, index);
> >      tree tmp3 = gimple_build (gsi, true, GSI_SAME_STMT,
> >                                UNKNOWN_LOCATION, MINUS_EXPR, tmp2, one);
> > 
> > which does the stmt building, temporary SSA name creation and insertion
> > plus eventually folding things with a definition.  There isn't a
> > gimple_build_cond with this API, but it would still work for the
> > popcount build above.
> 
> I've tried using the gimple_build API and it is indeed nicer.  However, I
> wasn't able to get it working with the FFS internal function.  IFN_FFS is of a
> different type than what gimple_build accepts.  I've tried this
> 
>   tree tmp2 = gimple_build (&gsi, true, GSI_SAME_STMT, UNKNOWN_LOCATION,
>                             as_combined_fn (IFN_FFS), index);
>
> but that only caused an ICE.  I tried looking for an example of using
> gimple_build to build an internal function call in the sources but I haven't
> found any.
> 
> If you'd be willing to help me with this, I will happily use the gimple_build
> API.  If I don't figure this out I will just stick with the less clean 
> original
> version.

I think you need to use

  tree tmp2 = gimple_build (&gsi, true, GSI_SAME_STMT, UNKNOWN_LOCATION,
                            IFN_FFS, integer_type_node, index);

that is, you always need to specify the type of the value produced
(here the return type of ffs).
 
> > > +  /* Use the result of the logarithm as the new index variable.  */
> > > +  gimple_switch_set_index (swtch, tmp3);
> > > +  update_stmt (swtch);
> > > +
> > > +  /* Replace each case number with its logarithm.  */
> > > +  unsigned i;
> > > +  for (i = 1; i < num_labels; i++)
> > > +    {
> > > +      tree label = gimple_switch_label (swtch, i);
> > > +      unsigned HOST_WIDE_INT label_hwi = tree_to_uhwi (CASE_LOW (label));
> > > +      CASE_LOW (label) = build_int_cst (index_type, floor_log2 
> > (label_hwi));
> > 
> > You should be able to directly use
> > 
> >          CASE_LOW (label) = tree_log2 (CASE_LOW (label));
> > 
> > Do you have test coverage for a signed index and a 0x80000000 case?
> > It would satisfy popcount and ffs should work but the constant
> > doesnt fit an uhwi and likely tree_log2 would ICE on it.
> 
> A situation with signed index variable and a 0x80000000 case shouldn't be a
> problem since I check all cases for tree_fits_uhwi_p.  To make sure I made 
> this
> testcase
> 
> int foo()    
> {    
>     int a, b;    
>     switch (a)    
>     {    
>         case 0x10000000:    
>             b = 0;    
>             break;    
>         case 0x20000000:    
>             b = 1;    
>             break;    
>         case 0x40000000:    
>             b = 2;    
>             break;    
>         case 0x80000000:    
>             b = 3;    
>             break;    
>     }    
>     return b;    
> }
> 
> and indeed, is_exp_transform_viable returned false due to the last case.  No
> ICE happened.  I added this testcase to my personal set of testcases that I'll
> run on subsequent versions of this patch.
> 
> > 
> > As Andrew said working with wide_int might be best though signedness
> > might interfere here as well.
> > 
> 
> Ok, I'm looking into using wide_ints instead of HOST_WIDE_INTs.  However, I
> need some help:  How should I check that case numbers are nonnegative?  If I
> understand it correctly wide_ints don't carry signedness information so I'll
> have to somehow check nonnegativity of trees, right?  Should I continue using
> tree_fits_uhwi_p for this?

You can use wi::ge_p (wide-int, 0, TYPE_SIGN (type-of-constant)) that
will then dispatch to ge{s,u}_p based on whether the number is signed
or unsigned.

> > > +    }
> > > +
> > > +  /* Update information about the switch statement.  */
> > > +  tree first_label = gimple_switch_label (swtch, 1);
> > > +  tree last_label = gimple_switch_label (swtch, num_labels - 1);
> > > +
> > > +  m_range_min = CASE_LOW (first_label);
> > > +  m_range_max = CASE_LOW (last_label);
> > > +  m_index_expr = gimple_switch_index (swtch);
> > > +  m_switch_bb = swtch_bb;
> > > +
> > > +  unsigned HOST_WIDE_INT range_size_hwi = tree_to_uhwi (m_range_max)
> > > +    - tree_to_uhwi (m_range_min);
> > > +  m_range_size = build_int_cst (index_type, range_size_hwi);
> > 
> > Use int_const_binop (MINUS_EXPR, m_range_max, m_range_min) as is done
> > elsewhere (m_range_size should be a wide_int, but that's out of scope 
> > here).
> 
> Just to check that I understand the bit in parentheses correctly:  You think
> that the switch conversion pass should represent m_range_size as a wide_int
> instead of a tree.  However, that isn't something that should be changed in my
> patch (it could be changed in some future patch instead), right?

Right.

Thanks,
Richard.

> 
> Thanks for the suggestions.
> 
> Cheers,
> Filip Kastl
> 

-- 
Richard Biener <rguent...@suse.de>
SUSE Software Solutions Germany GmbH,
Frankenstrasse 146, 90461 Nuernberg, Germany;
GF: Ivo Totev, Andrew McDonald, Werner Knoblich; (HRB 36809, AG Nuernberg)

Reply via email to