A typical code sequence produced by the `casesi_internal_mips16_<mode>'
insn is like this:

        sltu    $3, 11  # 16    casesi_internal_mips16_si       [length = 32]
        bteqz   $L2
        sll     $5, $3, 1
        la      $3, $L4
        addu    $5, $3, $5
        lh      $5, 0($5)
        addu    $3, $3, $5
        j       $3
        .align  1
        .align  2
        .type   __jump_foo_4, @object
__jump_foo_4:
$L4:

which in turn assembles to this binary code:

   a:   5b0b            sltiu   v1,11
   c:   601d            bteqz   48 <__pool_foo_12>
   e:   3564            sll     a1,v1,1
  10:   0b03            la      v1,1c <__jump_foo_4>
  12:   e3b5            addu    a1,v1,a1
  14:   8da0            lh      a1,0(a1)
  16:   e3ad            addu    v1,a1
  18:   eb80            jrc     v1
  1a:   6500            nop

0000001c <__jump_foo_4>:

As you can see, the code length estimate is 32 bytes, which in turn
comes from the instruction count being set to 16 for the insn.  This
tells the compiler that the pattern will produce the equivalent of 16
regular (16-bit or unextended) MIPS16 instructions, as per the
attribute's definition.

This estimate is too pessimistic, as this pattern will never actually
reach that many instructions.  Taking the instructions produced one by
one and counting the worst case for each, we have:

 1. sltu  $3, 11      => 1 or 2 depending on the immediate     => 2
 2. bteqz $L2         => 1 or 2 depending on label distance    => 2
 3. sll   $5, $3, 1   => (HImode) fixed 1
    sll   $5, $3, 2   => (SImode) fixed 1                      => 1
 4. la    $3, $L4     => (Pmode == SImode) fixed 1 as $L4
                         is close and word-aligned
    dla   $3, $L4     => (Pmode == DImode) fixed 1 as $L4
                         is close and word-aligned             => 1
 5. addu  $5, $3, $5  => (Pmode == SImode) fixed 1
    daddu $5, $3, $5  => (Pmode == DImode) fixed 1             => 1
 6. lh    $5, 0($5)   => (HImode) fixed 1
    lw    $5, 0($5)   => (SImode) fixed 1                      => 1
 7. addu  $3, $3, $5  => (Pmode == SImode) fixed 1
    daddu $3, $3, $5  => (Pmode == DImode) fixed 1             => 1
 8. j     $3          => 1 if JRC is used or 2 if JR/NOP is    => 2
                                                                 ----
                                                                   11

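
For reference, the tally above can be reproduced with a small
stand-alone sketch.  This is illustrative only and not part of the
patch: the array and function names are made up here, and the factor of
2 bytes per count unit is an assumption inferred from the count of 16
yielding a length of 32.

```c
#include <assert.h>
#include <stddef.h>

/* Worst-case size of each step, in 16-bit MIPS16 instruction units,
   taken from the right-hand column of the list above.  */
static const int worst_case_units[] = {
  2,  /* 1. sltu: extended form if the immediate does not fit.  */
  2,  /* 2. bteqz: extended form if the label is too far away.  */
  1,  /* 3. sll.  */
  1,  /* 4. la/dla.  */
  1,  /* 5. addu/daddu.  */
  1,  /* 6. lh/lw.  */
  1,  /* 7. addu/daddu.  */
  2,  /* 8. j: JR followed by NOP when JRC is not used.  */
};

/* Assumption: one count unit is 2 bytes in MIPS16 code, as implied by
   an insn_count of 16 giving [length = 32].  */
enum { BYTES_PER_UNIT = 2 };

/* Sum the per-step worst cases to get the proposed insn_count.  */
int
worst_case_insn_count (void)
{
  int total = 0;
  size_t n = sizeof worst_case_units / sizeof worst_case_units[0];

  for (size_t i = 0; i < n; i++)
    total += worst_case_units[i];
  return total;
}

/* Convert an instruction count into a length estimate in bytes.  */
int
length_estimate (int insn_count)
{
  assert (insn_count >= 0);
  return insn_count * BYTES_PER_UNIT;
}
```

Under these assumptions the worst case comes to 11 units, i.e. a length
estimate of 22 bytes instead of 32.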

Word alignment of the jump table start is explicitly arranged by
ASM_OUTPUT_BEFORE_CASE_LABEL and is beneficial as we can use the short
encoding of LA at no loss in code size, because any 2-byte padding
produced by the `.align 2' pseudo-op would otherwise be consumed by the
extended form of LA required to encode a PC-relative offset which is not
a multiple of 4, possibly at some performance loss incurred by the
extra instruction halfword fetch.

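
To illustrate why word alignment matters for the PC-relative address
load (`la' in the listing above), here is a rough sketch of the check
for whether the short encoding can be used.  It assumes the unextended
form carries an unsigned 8-bit immediate scaled by 4 and added to the
instruction address with its low two bits cleared; the function names
are hypothetical and do not correspond to anything in GCC or binutils.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumed base for the PC-relative calculation: the address of the
   instruction with its low two bits cleared.  */
static uint32_t
la_base (uint32_t insn_addr)
{
  return insn_addr & ~(uint32_t) 3;
}

/* Return true if TARGET can be reached by the short (unextended)
   encoding: a forward offset that is a multiple of 4 and whose scaled
   value fits in 8 bits.  */
bool
la_fits_short (uint32_t insn_addr, uint32_t target)
{
  uint32_t base = la_base (insn_addr);

  if (target < base)
    return false;

  uint32_t offset = target - base;
  return (offset & 3) == 0 && (offset >> 2) <= 0xff;
}

/* Return the immediate field of the short encoding.  Only meaningful
   when la_fits_short () holds.  */
uint8_t
la_short_imm (uint32_t insn_addr, uint32_t target)
{
  assert (la_fits_short (insn_addr, target));
  return (uint8_t) ((target - la_base (insn_addr)) >> 2);
}
```

With the addresses from the disassembly above, the instruction at 0x10
targeting the table at 0x1c gives an offset of 12 and an immediate of 3,
which agrees with the low byte of the 0b03 encoding shown.  A table at
0x1e, i.e. only halfword-aligned, would not pass the check.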
Set the instruction count to 11 then.
gcc/
* config/mips/mips.md (casesi_internal_mips16_<mode>): Set
`insn_count' to 11 rather than 16.
---
OK to apply?
Maciej
gcc-mips16-casesi-insn-count.diff
Index: gcc/gcc/config/mips/mips.md
===================================================================
--- gcc.orig/gcc/config/mips/mips.md 2016-11-12 10:57:12.544746018 +0000
+++ gcc/gcc/config/mips/mips.md 2016-11-12 10:57:13.972699749 +0000
@@ -6444,7 +6444,7 @@
return "j\t%4";
}
- [(set_attr "insn_count" "16")])
+ [(set_attr "insn_count" "11")])
;; For TARGET_USE_GOT, we save the gp in the jmp_buf as well.
;; While it is possible to either pull it off the stack (in the