https://gcc.gnu.org/bugzilla/show_bug.cgi?id=117091

--- Comment #8 from Filip Kastl <pheeck at gcc dot gnu.org> ---
I've looked into analyze_switch_statement(), find_bit_tests() and
find_jump_tables() and did some perf-ing.  Some observations:

1) I don't think that the code in SNIPPET 1 is responsible for the slowness. 
The comment here is correct: the body of the for loop is indeed constant time. 
I think the cause of the slowness is the code in SNIPPET 2 instead (it's from
find_bit_tests()).  I believe this code is O(n^3) (Maybe careful analysis would
reveal it's actually O(n^2) but that is besides the point).

2) I think that find_jump_tables() is also O(n^3) or O(n^2).  Andi is calling
for dynamic programming, but I think this already uses dynamic programming (as
does find_bit_tests()).

3) I don't think there is a limit on clusters.length().  At least I didn't find
it.  Since we're dealing with at least O(n^2) I agree that there should be a
limit.

I think that (at least asymptotically) there isn't much to improve.  I don't
think we can find the best possible clustering in time under O(n^2).  What I
would do now is add the limit for number of clusters (although I'm not sure
what to do if we hit it -- just lower the switch as a decision tree?) and
disable bit tests and jump tables for -O0.

Andi, could you point me to the recursive function you mentioned?  I'm not sure
which function you mean.

P.S. I guess I should be technically using Theta instead of O but I think it is
clear what I'm trying to say.


--- SNIPPET 1 ---
  auto_vec<int, m_max_case_bit_tests> dest_bbs;

  for (unsigned i = start; i <= end; i++)
    {
      simple_cluster *sc = static_cast<simple_cluster *> (clusters[i]);
      /* m_max_case_bit_tests is very small integer, thus the operation
         is constant. */
      if (!dest_bbs.contains (sc->m_case_bb->index))
--- ---

--- SNIPPET 2 ---
  for (unsigned i = 1; i <= l; i++)
    {
      /* Set minimal # of clusters with i-th item to infinite.  */
      min.quick_push (min_cluster_item (INT_MAX, INT_MAX, INT_MAX));

      for (unsigned j = 0; j < i; j++)
        {
          if (min[j].m_count + 1 < min[i].m_count
              && can_be_handled (clusters, j, i - 1))
            min[i] = min_cluster_item (min[j].m_count + 1, j, INT_MAX);
        }

      gcc_checking_assert (min[i].m_count != INT_MAX);
    }
--- ---

Reply via email to