On Thu, 30 May 2024, Filip Kastl wrote:

> Hi,
> 
> This patch adds a transformation into the switch conversion pass --
> the "exponential index transform".  This transformation can help switch
> conversion convert switches it otherwise could not.  The transformation is
> intended for switches whose cases are all powers of 2.  Here is a more 
> detailed
> description of what and how this patch tries to address:
> 
> 
> The problem
> -----------
> 
> Consider this piece of C code
> 
> switch (i)
>   {
>     case (1 << 0): return 0;
>     case (1 << 1): return 1;
>     case (1 << 2): return 2;
>     ...
>     case (1 << 30): return 30;
>     default: return 31;
>   }
> 
> If i is a power of 2 (2^k), the switch just returns the exponent (k).  This 
> can
> be seen as taking the base 2 logarithm of i or as returning the position of 
> the
> singular 1 bit in i.
> 
> Currently, GCC fails to see this and doesn't optimize the switch in any way.
> 
> Switch conversion is able to optimize similar switches to an array lookup.
> This is not possible here because the range of cases is too wide.  Another
> optimization that switch conversion is able to do is the "linear
> transformation" -- switch conversion is able to notice a linear relationship
> between the index variable (variable i in the case) and the result value and
> rewrite switch to just an assignment (or multiple assignments in case of
> multiple result values). Sadly, linear transformation isn't applicable here
> because the linear relationship is based on the exponent of i, not on i 
> itself.
> 
> 
> The solution
> ------------
> 
> The exponential index transformation does the following.  When it recognises
> that a switch only has case numbers that are powers of 2 it replaces them with
> their exponents.  It also replaces the index variable by its exponent.  This 
> is
> done by inserting a statement that takes the logarithm of i and using the
> result as the new index variable.  Actually we use the FFS operation for this
> -- since we expect a power of two, we may just ask for the position of the
> first 1 bit.
> 
> We also need to insert a conditional that checks at runtime that the index
> variable is a power of two.  If it isn't, the resulting value should just be
> the default case value (31 in the example above).
> 
> With exponential index transform, switch conversion is able to simplify the
> above example into something like this
> 
> if (i is power of 2)
>   return log2(i); // actually implemented as ffs(i) - 1
> else
>   return 31;
> 
> Switch conversion bails if the range of case numbers is too big.  Exponential
> index transform shrinks this range (exponentially).  So even if there is no
> linear relationship in the switch, exponential index transform can still help
> convert the switch at least to an array lookup.
> 
> 
> Limitations
> -----------
> 
> Currently we only run the exponential index transform if the target has the
> POPCOUNT (for checking a number is a power of 2) and FFS (for taking the
> logarithm) instructions -- we check direct_internal_fn_supported_p () for
> POPCOUNT and FFS internal functions.  Otherwise maybe computing FFS could be
> less efficient than just using a jump table.  We try to avoid transforming a
> switch into a less efficient form.  Maybe this is too conservative and could 
> be
> tweaked in the future.

I think you can use (x & -x) == x to check if x is power-of-two or zero
and avoid requiring .POPCOUNT.  If you know x is power-of-two the 
logarithm can also be computed with CTZ.

> 
> Bootstrapped and regtested on x86_64 linux.  I have additionally run bootstrap
> and regtest on a version where I removed the check that the target has the
> POPCOUNT and FFS instructions so that the transformation would be triggered
> more often.  That testing also went well.
> 
> Are there any things I should tweak?  Or is the patch ready to be applied?
> 
> Cheers,
> Filip Kastl
> 
> 
> -- 8< --
> 
> 
> Sometimes a switch has case numbers that are powers of 2.  Switch
> conversion usually isn't able to optimize switches.  This patch adds
> "exponential index transformation" to switch conversion.  After switch
> conversion applies this transformation on the switch the index variable
> of the switch becomes the exponent instead of the whole value.  For
> example:
> 
> switch (i)
>   {
>     case (1 << 0): return 0;
>     case (1 << 1): return 1;
>     case (1 << 2): return 2;
>     ...
>     case (1 << 30): return 30;
>     default: return 31;
>   }
> 
> gets transformed roughly into
> 
> switch (log2(i))
>   {
>     case 0: return 0;
>     case 1: return 1;
>     case 2: return 2;
>     ...
>     case 30: return 30;
>     default: return 31;
>   }
> 
> This enables switch conversion to further optimize the switch.
> 
> This patch only enables this transformation if there are optabs for
> POPCOUNT and FFS so that the base 2 logarithm can be computed
> efficiently at runtime.
> 
> gcc/ChangeLog:
> 
>       * tree-switch-conversion.cc (switch_conversion::switch_conversion):
>       Track if the transformation happened.
>       (switch_conversion::is_exp_index_transform_viable): New function
>       to decide whether the transformation should be applied.
>       (switch_conversion::exp_index_transform): New function to
>       execute the transformation.
>       (switch_conversion::gen_inbound_check): Don't remove the default
>       BB if the transformation happened.
>       (switch_conversion::expand): Execute the transform if it is
>       viable.  Skip test for sufficiently small case range if the
>       transformation is going to be executed.
>       * tree-switch-conversion.h: Add is_exp_index_transform viable
>       and exp_index_transform.
> 
> gcc/testsuite/ChangeLog:
> 
>       * gcc.target/i386/switch-exp-transform-1.c: New test.
>       * gcc.target/i386/switch-exp-transform-2.c: New test.
>       * gcc.target/i386/switch-exp-transform-3.c: New test.
> 
> Signed-off-by: Filip Kastl <fka...@suse.cz>
> ---
>  .../gcc.target/i386/switch-exp-transform-1.c  |  34 +++
>  .../gcc.target/i386/switch-exp-transform-2.c  |  36 +++
>  .../gcc.target/i386/switch-exp-transform-3.c  | 151 +++++++++
>  gcc/tree-switch-conversion.cc                 | 289 +++++++++++++++++-
>  gcc/tree-switch-conversion.h                  |  18 ++
>  5 files changed, 523 insertions(+), 5 deletions(-)
>  create mode 100644 gcc/testsuite/gcc.target/i386/switch-exp-transform-1.c
>  create mode 100644 gcc/testsuite/gcc.target/i386/switch-exp-transform-2.c
>  create mode 100644 gcc/testsuite/gcc.target/i386/switch-exp-transform-3.c
> 
> diff --git a/gcc/testsuite/gcc.target/i386/switch-exp-transform-1.c 
> b/gcc/testsuite/gcc.target/i386/switch-exp-transform-1.c
> new file mode 100644
> index 00000000000..d51dd110623
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/i386/switch-exp-transform-1.c
> @@ -0,0 +1,34 @@
> +/* { dg-do compile } */
> +/* { dg-options "-O2 -fdump-tree-switchconv -march=znver3" } */

I think it's better to enable -mpopcnt and -mbmi (or what remains
as minimal requirement).

> +
> +/* Checks that exponential index transform enables switch conversion to 
> convert
> +   this switch into an array lookup.  Also checks that the "index variable 
> is a
> +   power of two" check has been generated.  -march=znver3 because that
> +   microarchitecture supports POPCOUNT and FFS instructions and exponential
> +   index transform requires that.  */
> +
> +int foo(unsigned bar)
> +{
> +    switch (bar)
> +    {
> +        case (1 << 0):
> +            return 1;
> +        case (1 << 1):
> +            return 2;
> +        case (1 << 2):
> +            return 3;
> +        case (1 << 3):
> +            return 4;
> +        case (1 << 4):
> +            return 8;
> +        case (1 << 5):
> +            return 13;
> +        case (1 << 6):
> +            return 21;
> +        default:
> +            return 0;
> +    }
> +}
> +
> +/* { dg-final { scan-tree-dump "CSWTCH" "switchconv" } } */
> +/* { dg-final { scan-tree-dump "POPCOUNT" "switchconv" } } */
> diff --git a/gcc/testsuite/gcc.target/i386/switch-exp-transform-2.c 
> b/gcc/testsuite/gcc.target/i386/switch-exp-transform-2.c
> new file mode 100644
> index 00000000000..9f2b536aa74
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/i386/switch-exp-transform-2.c
> @@ -0,0 +1,36 @@
> +/* { dg-do compile } */
> +/* { dg-options "-O2 -fdump-tree-switchconv -march=znver3" } */
> +
> +/* Checks that when exponential index transform is viable but switch 
> conversion
> +   decides that the switch cannot be converted, the exponential index 
> transform
> +   is not done.  -march=znver3 because that microarchitecture supports 
> POPCOUNT
> +   and FFS instructions and exponential index transform requires that.  */
> +
> +int a;
> +
> +int foo(unsigned bar)
> +{
> +    switch (bar)
> +    {
> +        case (1 << 0):
> +            return 0;
> +        case (1 << 1):
> +            return 1;
> +        case (1 << 2):
> +            return 2;
> +        case (1 << 3):
> +            return 3;
> +        case (1 << 4):
> +            return 4;
> +        case (1 << 5):
> +            a = 3;
> +            return 5;
> +        case (1 << 6):
> +            return 6;
> +        default:
> +            return 0;
> +    }
> +}
> +
> +/* { dg-final { scan-tree-dump "Exponential index transform viable" 
> "switchconv" } } */
> +/* { dg-final { scan-tree-dump-not "Applying exponential index transform" 
> "switchconv" } } */
> diff --git a/gcc/testsuite/gcc.target/i386/switch-exp-transform-3.c 
> b/gcc/testsuite/gcc.target/i386/switch-exp-transform-3.c
> new file mode 100644
> index 00000000000..e9665d4a38d
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/i386/switch-exp-transform-3.c
> @@ -0,0 +1,151 @@
> +/* { dg-do compile } */
> +/* { dg-options "-O2 -fdump-tree-switchconv -march=znver3" } */
> +
> +/* Checks that the exponential index transformation is done for all these 
> types
> +   of the index variable:
> +   - (unsigned) int
> +   - (unsigned) long
> +   - (unsigned) long long
> +
> +   -march=znver3 because that microarchitecture supports POPCOUNT
> +   and FFS instructions and exponential index transform requires that.  */
> +
> +int unopt_int(int bit_position)
> +{
> +    switch (bit_position)
> +    {
> +        case (1 << 0):
> +            return 0;
> +        case (1 << 1):
> +            return 1;
> +        case (1 << 2):
> +            return 2;
> +        case (1 << 3):
> +            return 3;
> +        case (1 << 4):
> +            return 4;
> +        case (1 << 5):
> +            return 5;
> +        case (1 << 6):
> +            return 6;
> +        default:
> +            return 0;
> +    }
> +}
> +
> +int unopt_unsigned_int(unsigned int bit_position)
> +{
> +    switch (bit_position)
> +    {
> +        case (1 << 0):
> +            return 0;
> +        case (1 << 1):
> +            return 1;
> +        case (1 << 2):
> +            return 2;
> +        case (1 << 3):
> +            return 3;
> +        case (1 << 4):
> +            return 4;
> +        case (1 << 5):
> +            return 5;
> +        case (1 << 6):
> +            return 6;
> +        default:
> +            return 0;
> +    }
> +}
> +
> +int unopt_long(long bit_position)
> +{
> +    switch (bit_position)
> +    {
> +        case (1 << 0):
> +            return 0;
> +        case (1 << 1):
> +            return 1;
> +        case (1 << 2):
> +            return 2;
> +        case (1 << 3):
> +            return 3;
> +        case (1 << 4):
> +            return 4;
> +        case (1 << 5):
> +            return 5;
> +        case (1 << 6):
> +            return 6;
> +        default:
> +            return 0;
> +    }
> +}
> +
> +int unopt_unsigned_long(unsigned long bit_position)
> +{
> +    switch (bit_position)
> +    {
> +        case (1 << 0):
> +            return 0;
> +        case (1 << 1):
> +            return 1;
> +        case (1 << 2):
> +            return 2;
> +        case (1 << 3):
> +            return 3;
> +        case (1 << 4):
> +            return 4;
> +        case (1 << 5):
> +            return 5;
> +        case (1 << 6):
> +            return 6;
> +        default:
> +            return 0;
> +    }
> +}
> +
> +int unopt_long_long(long long bit_position)
> +{
> +    switch (bit_position)
> +    {
> +        case (1 << 0):
> +            return 0;
> +        case (1 << 1):
> +            return 1;
> +        case (1 << 2):
> +            return 2;
> +        case (1 << 3):
> +            return 3;
> +        case (1 << 4):
> +            return 4;
> +        case (1 << 5):
> +            return 5;
> +        case (1 << 6):
> +            return 6;
> +        default:
> +            return 0;
> +    }
> +}
> +
> +int unopt_unsigned_long_long(unsigned long long bit_position)
> +{
> +    switch (bit_position)
> +    {
> +        case (1 << 0):
> +            return 0;
> +        case (1 << 1):
> +            return 1;
> +        case (1 << 2):
> +            return 2;
> +        case (1 << 3):
> +            return 3;
> +        case (1 << 4):
> +            return 4;
> +        case (1 << 5):
> +            return 5;
> +        case (1 << 6):
> +            return 6;
> +        default:
> +            return 0;
> +    }
> +}
> +
> +/* { dg-final { scan-tree-dump-times "Applying exponential index transform" 
> 6 "switchconv" } } */
> diff --git a/gcc/tree-switch-conversion.cc b/gcc/tree-switch-conversion.cc
> index 3a5b84c09e2..975888c0969 100644
> --- a/gcc/tree-switch-conversion.cc
> +++ b/gcc/tree-switch-conversion.cc
> @@ -52,6 +52,8 @@ Software Foundation, 51 Franklin Street, Fifth Floor, 
> Boston, MA
>  #include "omp-general.h"
>  #include "gimple-range.h"
>  #include "tree-cfgcleanup.h"
> +#include "hwint.h"
> +#include "internal-fn.h"
>  
>  /* ??? For lang_hooks.types.type_for_mode, but is there a word_mode
>     type in the GIMPLE type system that is language-independent?  */
> @@ -66,7 +68,8 @@ using namespace tree_switch_conversion;
>  switch_conversion::switch_conversion (): m_final_bb (NULL),
>    m_constructors (NULL), m_default_values (NULL),
>    m_arr_ref_first (NULL), m_arr_ref_last (NULL),
> -  m_reason (NULL), m_default_case_nonstandard (false), m_cfg_altered (false)
> +  m_reason (NULL), m_default_case_nonstandard (false), m_cfg_altered (false),
> +  m_exp_index_transform_applied (false)
>  {
>  }
>  
> @@ -202,6 +205,267 @@ switch_conversion::collect (gswitch *swtch)
>      }
>  }
>  
> +/* Check that the "exponential index transform" can be applied to this 
> switch.
> +
> +   See comment of the exp_index_transform function for details about this
> +   transformation.
> +
> +   We want:
> +   - This form of the switch is more efficient
> +   - Cases are powers of 2
> +
> +   Expects that SWTCH has at least one case.  */
> +
> +bool
> +switch_conversion::is_exp_index_transform_viable (gswitch *swtch)
> +{
> +  tree index = gimple_switch_index (swtch);
> +  tree index_type = TREE_TYPE (index);
> +  basic_block swtch_bb = gimple_bb (swtch);
> +  unsigned num_labels = gimple_switch_num_labels (swtch);
> +
> +  /* Check that we can efficiently compute logarithm of 2^k (using FFS) and
> +     test that a given number is 2^k for some k (using POPCOUNT).  */
> +  optimization_type opt_type = bb_optimization_type (swtch_bb);
> +  if (!direct_internal_fn_supported_p (IFN_FFS, index_type, opt_type)
> +    || !direct_internal_fn_supported_p (IFN_POPCOUNT, index_type, opt_type))
> +    return false;
> +

See above, I think this can be improved.  Would be nice to split out
a can_pow2p (type) and can_log2 (type) and a corresponding
gen_pow2p (op) and gen_log2 (op) function so this could be re-used
and alternate variants added when required.

I think we already demote the switch index type as much as possible
but for FFS/POPCOUNT availability it might be useful to check whether
a wider type has target support for this.  I'd expect this to be
an issue when the index type is short or char (not sure if that
happens).  This is what Andrew means I think.

> +  /* Check that each case label corresponds only to one value
> +     (no case 1..3).  */
> +  unsigned i;
> +  for (i = 1; i < num_labels; i++)
> +    {
> +      tree label = gimple_switch_label (swtch, i);
> +      if (CASE_HIGH (label))
> +     return false;
> +    }
> +
> +  /* Check that each label is nonnegative.  */
> +  for (i = 1; i < num_labels; i++)
> +    {
> +      tree label = gimple_switch_label (swtch, i);
> +      if (!tree_fits_uhwi_p (CASE_LOW (label)))
> +     return false;
> +    }
> +
> +  /* Check that each case is a power of 2.  */
> +  for (i = 1; i < num_labels; i++)
> +    {
> +      tree label = gimple_switch_label (swtch, i);
> +      unsigned HOST_WIDE_INT label_hwi = tree_to_uhwi (CASE_LOW (label));
> +      if (!pow2p_hwi (label_hwi))
> +     return false;
> +    }

Can you merge the loop bodies please?  There's no good reason to
iterate over labels multiple times.

As future enhancement it would be possible to handle a more
general [-]pow2(x) +- constant by switching on log2([-x] -+ constant).
I guess we already handle linear scaling/shifting thus A * (x +- C1) +- 
C2.

> +  if (dump_file)
> +    fprintf (dump_file, "Exponential index transform viable\n");
> +
> +  return true;
> +}
> +
> +/* Perform the "exponential index transform".
> +
> +   Assume that cases of SWTCH are powers of 2.  The transformation replaces 
> the
> +   cases by their exponents (2^k -> k).  It also inserts a statement that
> +   computes the exponent of the original index variable (basically taking the
> +   logarithm) and then sets the result as the new index variable.
> +
> +   The transformation also inserts a conditional statement checking that the
> +   incoming original index variable is a power of 2 with the false edge 
> leading
> +   to the default case.
> +
> +   The exponential index transform shrinks the range of case numbers which
> +   helps switch conversion convert switches it otherwise could not.
> +
> +   Consider for example:
> +
> +   switch (i)
> +     {
> +       case (1 << 0): return 0;
> +       case (1 << 1): return 1;
> +       case (1 << 2): return 2;
> +       ...
> +       case (1 << 30): return 30;
> +       default: return 31;
> +     }
> +
> +   First, exponential index transform gets applied.  Since each case becomes
> +   case x: return x;, the rest of switch conversion is then able to get rid 
> of
> +   the switch statement.
> +
> +   if (i is power of 2)
> +     return log2 (i);
> +   else
> +     return 31;
> +
> +   */
> +
> +void
> +switch_conversion::exp_index_transform (gswitch *swtch)
> +{
> +  if (dump_file)
> +    fprintf (dump_file, "Applying exponential index transform\n");
> +
> +  tree index = gimple_switch_index (swtch);
> +  tree index_type = TREE_TYPE (index);
> +  basic_block swtch_bb = gimple_bb (swtch);
> +  unsigned num_labels = gimple_switch_num_labels (swtch);
> +  tree one = build_one_cst (index_type);
> +
> +  /* Insert a cond stmt that checks if the index variable is a power of 2.  
> */
> +  gimple_stmt_iterator gsi = gsi_for_stmt (swtch);
> +  gsi_prev (&gsi);
> +  gimple *foo = gsi_stmt (gsi);
> +  edge new_edge1 = split_block (swtch_bb, foo);
> +
> +  swtch_bb = new_edge1->dest;
> +  basic_block cond_bb = new_edge1->src;
> +  new_edge1->flags |= EDGE_TRUE_VALUE;
> +  new_edge1->flags &= ~EDGE_FALLTHRU;
> +  new_edge1->probability = profile_probability::even ();
> +
> +  basic_block default_bb = gimple_switch_default_bb (cfun, swtch);
> +  edge new_edge2 = make_edge (cond_bb, default_bb, EDGE_FALSE_VALUE);
> +  new_edge2->probability = profile_probability::even ();
> +
> +  gsi = gsi_last_bb (cond_bb);
> +
> +  tree tmp1 = make_ssa_name (index_type);
> +  gcall *stmt_popcount = gimple_build_call_internal (IFN_POPCOUNT, 2, index,
> +                                                  one);
> +  gimple_call_set_lhs (stmt_popcount, tmp1);
> +  gsi_insert_after (&gsi, stmt_popcount, GSI_NEW_STMT);
> +
> +  gcond *stmt_cond = gimple_build_cond (EQ_EXPR, tmp1, one, NULL, NULL);
> +  gsi_insert_after (&gsi, stmt_cond, GSI_NEW_STMT);
> +
> +  /* We just added an edge going to default bb so fix PHI nodes in that bb:
> +     For each PHI add new PHI arg.  It will be the same arg as when comming 
> to
> +     the default bb from the switch bb.  */
> +  edge default_edge = find_edge (swtch_bb, default_bb);
> +  for (gphi_iterator gsi = gsi_start_phis (default_bb);
> +       !gsi_end_p (gsi); gsi_next (&gsi))
> +    {
> +      gphi *phi = gsi.phi ();
> +      tree arg = PHI_ARG_DEF_FROM_EDGE (phi, default_edge);
> +      location_t loc = gimple_phi_arg_location_from_edge (phi, default_edge);
> +      add_phi_arg (phi, arg, new_edge2, loc);
> +    }
> +
> +  /* Insert a statement that takes the logarithm of the index variable.  */
> +  tree tmp2 = make_ssa_name (index_type);
> +  gsi = gsi_start_bb (swtch_bb);

Please use gsi_after_labels (swtch_bb) even though you know there's no
labels there.

> +  gcall *stmt_ffs = gimple_build_call_internal (IFN_FFS, 1, index);
> +  gimple_call_set_lhs (stmt_ffs, tmp2);
> +  gsi_insert_before (&gsi, stmt_ffs, GSI_SAME_STMT);
> +
> +  tree tmp3 = make_ssa_name (index_type);
> +  gassign *stmt_minus_one = gimple_build_assign (tmp3, MINUS_EXPR, tmp2, 
> one);
> +  gsi_insert_before (&gsi, stmt_minus_one, GSI_SAME_STMT);

You could also use

     tree tmp2 = gimple_build (gsi, true, GSI_SAME_STMT,
                               UNKNOWN_LOCATION, IFN_FFS, index);
     tree tmp3 = gimple_build (gsi, true, GSI_SAME_STMT,
                               UNKNOWN_LOCATION, MINUS_EXPR, tmp2, one);

which does the stmt building, temporary SSA name creation and insertion
plus eventually folding things with a definition.  There isn't a
gimple_build_cond with this API, but it would still work for the
popcount build above.

Of course as said splitting this out would make extending it to handle
alternate code generation when popcount or ffs are not available
easier.

> +  /* Use the result of the logarithm as the new index variable.  */
> +  gimple_switch_set_index (swtch, tmp3);
> +  update_stmt (swtch);
> +
> +  /* Replace each case number with its logarithm.  */
> +  unsigned i;
> +  for (i = 1; i < num_labels; i++)
> +    {
> +      tree label = gimple_switch_label (swtch, i);
> +      unsigned HOST_WIDE_INT label_hwi = tree_to_uhwi (CASE_LOW (label));
> +      CASE_LOW (label) = build_int_cst (index_type, floor_log2 
(label_hwi));

You should be able to directly use

         CASE_LOW (label) = tree_log2 (CASE_LOW (label));

Do you have test coverage for a signed index and a 0x80000000 case?
It would satisfy popcount and ffs should work but the constant
doesnt fit an uhwi and likely tree_log2 would ICE on it.

As Andrew said working with wide_int might be best though signedness
might interfere here as well.

> +    }
> +
> +  /* Fix the dominator tree, if it is available.  */
> +  if (dom_info_available_p (CDI_DOMINATORS))
> +    {
> +      /* Analysis of how dominators should look after we add the edge E going
> +      from the cond block to the default block.
> +
> +      1 For the blocks between the switch block and the final block
> +      (excluding the final block itself):  They had the switch block as
> +      their immediate dominator.  That shouldn't change.
> +
> +      2 The final block may now have the switch block or the cond block as
> +      its immediate dominator.  There's no easy way of knowing (consider
> +      two cases where in both m_default_case_nonstandard = true, in one a
> +      path through default intersects the final block and in one all paths
> +      through default avoid the final block but intersect a successor of the
> +      final block).
> +
> +      3 Other blocks that had the switch block as their immediate dominator
> +      should now have the cond block as their immediate dominator.
> +
> +      4 Immediate dominators of the rest of the blocks shouldn't change.
> +
> +      Reasoning for 3 and 4:
> +
> +      We'll only consider blocks that do not fall into 1 or 2.
> +
> +      Consider a block X whose original imm dom was the switch block.  All
> +      paths to X must also intersect the cond block since it's the only
> +      pred of the switch block.  The final block doesn't dominate X so at
> +      least one path P must lead through the default block.  Let P' be P but
> +      instead of going through the switch block, take E.  The switch block
> +      doesn't dominate X so its imm dom must now be the cond block.
> +
> +      Consider a block X whose original imm dom was Y != the switch block.
> +      We only added an edge so all original paths to X are still present.
> +      So X gained no new dominators.  Observe that Y still dominates X.
> +      There would have to be a path that avoids Y otherwise.  But any block
> +      we can avoid now except for the switch block we were able to avoid
> +      before adding E.  */
> +
> +      redirect_immediate_dominators (CDI_DOMINATORS, swtch_bb, cond_bb);
> +
> +      /* FIXME: Since exponential transform is run only if we know that the
> +      switch will be converted, we know these blocks will be removed and we
> +      maybe don't have to bother updating their dominators.  */

It's good to not leave the IL in an intermediate broken state.

> +      edge e;
> +      edge_iterator ei;
> +      FOR_EACH_EDGE (e, ei, swtch_bb->succs)
> +     {
> +       basic_block bb = e->dest;
> +       if (bb == m_final_bb || bb == default_bb)
> +         continue;
> +       set_immediate_dominator (CDI_DOMINATORS, bb, swtch_bb);

If there's an alternate edge into the cases, thus the original
dominator wasn't the swtch_bb the dominator shouldn't change.
I wonder why it would change at all - you are only creating
a new edge into the default case so only that block dominance
relationship changes?

> +     }
> +
> +      vec<basic_block> v;
> +      v.create (1);
> +      v.quick_push (m_final_bb);
> +      iterate_fix_dominators (CDI_DOMINATORS, v, true);

The final BB should have the condition block as immediate dominator
if it's old immediate dominator was the switch block, otherwise
it's immediate dominator shouldn't change.

> +    }
> +
> +  /* Update information about the switch statement.  */
> +  tree first_label = gimple_switch_label (swtch, 1);
> +  tree last_label = gimple_switch_label (swtch, num_labels - 1);
> +
> +  m_range_min = CASE_LOW (first_label);
> +  m_range_max = CASE_LOW (last_label);
> +  m_index_expr = gimple_switch_index (swtch);
> +  m_switch_bb = swtch_bb;
> +
> +  unsigned HOST_WIDE_INT range_size_hwi = tree_to_uhwi (m_range_max)
> +    - tree_to_uhwi (m_range_min);
> +  m_range_size = build_int_cst (index_type, range_size_hwi);

Use int_const_binop (MINUS_EXPR, m_range_max, m_range_min) as is done
elsewhere (m_range_size should be a wide_int, but that's out of scope 
here).

Otherwise the patch looks good to me.  You can leave the requirement
for popcount/ffs but please split out the check and code-gen.

Thanks,
Richard.

> +
> +  m_cfg_altered = true;
> +
> +  m_contiguous_range = true;
> +  unsigned HOST_WIDE_INT last_hwi = tree_to_uhwi (CASE_LOW (first_label));
> +  for (i = 2; i < num_labels; i++)
> +    {
> +      tree label = gimple_switch_label (swtch, i);
> +      unsigned HOST_WIDE_INT label_hwi = tree_to_uhwi (CASE_LOW (label));
> +      m_contiguous_range &= last_hwi + 1 == label_hwi;
> +      last_hwi = label_hwi;
> +    }
> +
> +  m_exp_index_transform_applied = true;
> +}
> +
>  /* Checks whether the range given by individual case statements of the switch
>     switch statement isn't too big and whether the number of branches actually
>     satisfies the size of the new array.  */
> @@ -973,8 +1237,9 @@ switch_conversion::gen_inbound_check ()
>      bbf->count = e1f->count () + e2f->count ();
>  
>    /* Tidy blocks that have become unreachable.  */
> -  prune_bbs (bbd, m_final_bb,
> -          m_default_case_nonstandard ? m_default_bb : NULL);
> +  bool prune_default_bb = !m_default_case_nonstandard
> +    && !m_exp_index_transform_applied;
> +  prune_bbs (bbd, m_final_bb, prune_default_bb ? NULL : m_default_bb);
>  
>    /* Fixup the PHI nodes in bbF.  */
>    fix_phi_nodes (e1f, e2f, bbf);
> @@ -1053,8 +1318,19 @@ switch_conversion::expand (gswitch *swtch)
>        return;
>      }
>  
> -  /* Check the case label values are within reasonable range:  */
> -  if (!check_range ())
> +  /* Sometimes it is possible to use the "exponential index transform" to 
> help
> +     switch conversion convert switches which it otherwise could not convert.
> +     However, we want to do this transform only when we know that switch
> +     conversion will then really be able to convert the switch.  So we first
> +     check if the transformation is applicable and then maybe later do the
> +     transformation.  */
> +  bool exp_transform_viable = is_exp_index_transform_viable (swtch);
> +
> +  /* Check the case label values are within reasonable range.
> +
> +     If we will be doing exponential index transform, the range will be 
> always
> +     reasonable.  */
> +  if (!exp_transform_viable && !check_range ())
>      {
>        gcc_assert (m_reason);
>        return;
> @@ -1076,6 +1352,9 @@ switch_conversion::expand (gswitch *swtch)
>    /* At this point all checks have passed and we can proceed with the
>       transformation.  */
>  
> +  if (exp_transform_viable)
> +    exp_index_transform (swtch);
> +
>    create_temp_arrays ();
>    gather_default_values (m_default_case_nonstandard
>                        ? gimple_switch_label (swtch, 1)
> diff --git a/gcc/tree-switch-conversion.h b/gcc/tree-switch-conversion.h
> index 6939eec6018..1a865f85f3a 100644
> --- a/gcc/tree-switch-conversion.h
> +++ b/gcc/tree-switch-conversion.h
> @@ -743,6 +743,19 @@ public:
>    /* Collection information about SWTCH statement.  */
>    void collect (gswitch *swtch);
>  
> +  /* Check that the 'exponential index transform' can be applied.
> +
> +     See the comment at the function definition for more details.  */
> +  bool is_exp_index_transform_viable (gswitch *swtch);
> +
> +  /* Perform the 'exponential index transform'.
> +
> +     The exponential index transform shrinks the range of case numbers which
> +     helps switch conversion convert switches it otherwise could not.
> +
> +     See the comment at the function definition for more details.  */
> +  void exp_index_transform (gswitch *swtch);
> +
>    /* Checks whether the range given by individual case statements of the 
> switch
>       switch statement isn't too big and whether the number of branches 
> actually
>       satisfies the size of the new array.  */
> @@ -900,6 +913,11 @@ public:
>  
>    /* True if CFG has been changed.  */
>    bool m_cfg_altered;
> +
> +  /* True if exponential index transform has been applied.  See the comment 
> at
> +     the definition of exp_index_transform for details about the
> +     transformation.  */
> +  bool m_exp_index_transform_applied;
>  };
>  
>  void
> 

-- 
Richard Biener <rguent...@suse.de>
SUSE Software Solutions Germany GmbH,
Frankenstrasse 146, 90461 Nuernberg, Germany;
GF: Ivo Totev, Andrew McDonald, Werner Knoblich; (HRB 36809, AG Nuernberg)

Reply via email to