A typical code sequence produced by the `casesi_internal_mips16_<mode>' 
insn is like this:

        sltu    $3, 11   # 16   casesi_internal_mips16_si       [length = 32]
        bteqz   $L2
        sll     $5, $3, 1
        la      $3, $L4
        addu    $5, $3, $5
        lh      $5, 0($5)
        addu    $3, $3, $5
        j       $3
        .align  1
        .align  2
        .type   __jump_foo_4, @object
__jump_foo_4:
$L4:

which in turn assembles to this binary code:

   a:   5b0b            sltiu   v1,11
   c:   601d            bteqz   48 <__pool_foo_12>
   e:   3564            sll     a1,v1,1
  10:   0b03            la      v1,1c <__jump_foo_4>
  12:   e3b5            addu    a1,v1,a1
  14:   8da0            lh      a1,0(a1)
  16:   e3ad            addu    v1,a1
  18:   eb80            jrc     v1
  1a:   6500            nop

0000001c <__jump_foo_4>:

As you can see, the code length estimate is 32 bytes, which in turn comes 
from the instruction count being set to 16 for the insn, telling the 
compiler that the pattern will produce the equivalent of 16 regular 
(16-bit, i.e. unextended) MIPS16 instructions, as per the attribute's 
definition.

This estimate is too pessimistic as this pattern will never actually 
reach so many instructions.  Taking the instructions produced one by one 
we have:

1.      sltu    $3, 11     => 1 or 2 depending on the immediate  => 2

2.      bteqz   $L2        => 1 or 2 depending on label distance => 2

3.      sll     $5, $3, 1  => (HImode) fixed 1
        sll     $5, $3, 2  => (SImode) fixed 1                   => 1

4.      la      $3, $L4    => (Pmode == SImode) fixed 1 as $L4
                              is close and word-aligned
        dla     $3, $L4    => (Pmode == DImode) fixed 1 as $L4
                              is close and word-aligned          => 1

5.      addu    $5, $3, $5 => (Pmode == SImode) fixed 1
        daddu   $5, $3, $5 => (Pmode == DImode) fixed 1          => 1

6.      lh      $5, 0($5)  => (HImode) fixed 1
        lw      $5, 0($5)  => (SImode) fixed 1                   => 1

7.      addu    $3, $3, $5 => (Pmode == SImode) fixed 1
        daddu   $3, $3, $5 => (Pmode == DImode) fixed 1          => 1

8.      j       $3         => 1 if JRC is used or 2 if JR/NOP is => 2
                                                                 ----
                                                                   11
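
For illustration only, the tally above can be reproduced with a short 
script.  This is a sketch rather than anything GCC does: the names are 
made up, and the 2-byte unit size is an assumption inferred from the 
count of 16 corresponding to the length of 32 in the listing.

```python
# Worst-case number of 16-bit MIPS16 units per step, copied from the
# breakdown above.  All names here are hypothetical.
WORST_CASE_UNITS = [
    ("sltu", 2),         # extended form if the immediate is large
    ("bteqz", 2),        # extended form if the label is far away
    ("sll", 1),
    ("la/dla", 1),       # $L4 is close and word-aligned
    ("addu/daddu", 1),
    ("lh/lw", 1),
    ("addu/daddu", 1),
    ("j", 2),            # JR followed by NOP when JRC is not used
]

UNIT_BYTES = 2  # assumption: one unextended MIPS16 instruction


def insn_count(steps):
    """Sum the worst-case unit counts of all steps."""
    return sum(units for _, units in steps)


def length_bytes(count):
    """Convert an instruction count into a length estimate in bytes."""
    return count * UNIT_BYTES


count = insn_count(WORST_CASE_UNITS)
print(count, length_bytes(count))  # prints "11 22"
```

With the old count of 16 the same conversion gives the 32 seen in the 
listing.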

Word alignment of the jump table start is explicitly arranged by 
ASM_OUTPUT_BEFORE_CASE_LABEL and is beneficial as we can use the short 
encoding of LA at no loss in code size, because any 2-byte padding
produced by the `.align 2' pseudo-op would otherwise be consumed by the 
extended form of LA required to encode a PC-relative offset which is not 
a multiple of 4, possibly at some performance cost from the extra 
instruction halfword fetch.
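
As a rough illustration of why the `la' in step 4 counts as 1 only when 
$L4 is close and word-aligned, the check below models the unextended 
PC-relative form.  The encoding limits (an 8-bit unsigned immediate 
scaled by 4, applied to the PC with its low two bits cleared) are my 
reading of the MIPS16 ISA and are not stated in this patch; the function 
name is hypothetical.

```python
# Assumed limits of the unextended MIPS16 PC-relative `la'; taken from
# the ISA encoding as I understand it, not from the patch itself.
IMM_BITS = 8  # width of the immediate field
SCALE = 4     # the immediate is scaled by 4


def la_short_imm(insn_addr, target_addr):
    """Return the immediate if the short form reaches the target,
    or None if the extended (two-halfword) form would be needed."""
    base = insn_addr & ~3  # low two bits of the PC are cleared
    offset = target_addr - base
    if offset < 0 or offset % SCALE != 0:
        return None
    imm = offset // SCALE
    return imm if imm < (1 << IMM_BITS) else None


print(la_short_imm(0x10, 0x1C))  # word-aligned table: prints 3
print(la_short_imm(0x10, 0x1A))  # 2-byte aligned only: prints None
```

The first call uses the addresses from the disassembly above, where the 
`la' at 0x10 is encoded as 0b03, i.e. with an immediate of 3.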

Set the instruction count to 11 then.

        gcc/
        * config/mips/mips.md (casesi_internal_mips16_<mode>): Set 
        `insn_count' to 11 rather than 16.
---
 OK to apply?

  Maciej

gcc-mips16-casesi-insn-count.diff
Index: gcc/gcc/config/mips/mips.md
===================================================================
--- gcc.orig/gcc/config/mips/mips.md    2016-11-12 10:57:12.544746018 +0000
+++ gcc/gcc/config/mips/mips.md 2016-11-12 10:57:13.972699749 +0000
@@ -6444,7 +6444,7 @@
 
   return "j\t%4";
 }
-  [(set_attr "insn_count" "16")])
+  [(set_attr "insn_count" "11")])
 
 ;; For TARGET_USE_GOT, we save the gp in the jmp_buf as well.
 ;; While it is possible to either pull it off the stack (in the
