Hi all,

I've had this one sitting in my tree for some time.
The arm1020e automaton has no business being as large as it is (3185 states).
Most of the bloat comes from overly long reservation durations for calls and
FP division/square root.

This patch reduces those durations to something more sensible; the latencies
of the reservations are left unchanged.
This brings the automaton down from 3185 states to 320.
There are bigger fish to fry on that front, but every little bit helps as we're
already approaching a gigabyte of memory required for genautomata processing.
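
As a rough illustration of why long reservations inflate the automaton, here
is a toy model in Python. It is not genautomata and not part of the patch: it
assumes each state is just a tuple of "cycles still busy" counters, uses only
the three units touched by this patch, and does no minimisation. The names
count_states, UNITS, OLD and NEW are made up for the sketch, and the counts
it prints will not match the real 3185/320 figures.

```python
from collections import deque

# Units touched by the patch; the real arm1020e description has more.
UNITS = ("1020a_e", "v10_fmac", "v10_ds")

# Each reservation maps a unit to the number of consecutive cycles it is
# held from issue, following "a+b*N" = a for one cycle, b for N cycles.
OLD = {
    "1020call_op": {"1020a_e": 32},
    "v10_fdivs": {"1020a_e": 1, "v10_ds": 14},
    "v10_fdivd": {"1020a_e": 1, "v10_fmac": 1, "v10_ds": 28},
}
NEW = {
    "1020call_op": {"1020a_e": 4},
    "v10_fdivs": {"1020a_e": 1, "v10_ds": 4},
    "v10_fdivd": {"1020a_e": 1, "v10_fmac": 1, "v10_ds": 4},
}


def count_states(insns, units):
    """Count the busy-counter states reachable from the all-idle state."""
    start = tuple(0 for _ in units)
    seen = {start}
    todo = deque([start])
    while todo:
        state = todo.popleft()
        # Transition 1: advance one cycle, every busy counter ticks down.
        succs = [tuple(max(0, c - 1) for c in state)]
        # Transition 2: issue an insn whose units are all free right now.
        for res in insns.values():
            if all(state[units.index(u)] == 0 for u in res):
                succs.append(tuple(res.get(u, c) for u, c in zip(units, state)))
        for s in succs:
            if s not in seen:
                seen.add(s)
                todo.append(s)
    return len(seen)


print("old durations:", count_states(OLD, UNITS))
print("new durations:", count_states(NEW, UNITS))
```

Even in this simplified model, every extra cycle of reservation adds states
that differ only in how long a unit remains blocked, which is the effect the
patch is trying to cut down.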

Bootstrapped and tested on arm-none-linux-gnueabihf.

Ok for trunk or GCC 7?

Thanks,
Kyrill

2016-02-29  Kyrylo Tkachov  <kyrylo.tkac...@arm.com>

    * config/arm/arm1020e.md (1020call_op): Reduce reservation
    duration.
    (v10_fdivs): Likewise.
    (v10_fdivd): Likewise.
diff --git a/gcc/config/arm/arm1020e.md b/gcc/config/arm/arm1020e.md
index 7cdab57ddb34346fa21f2935d2bc29c4f0b827d8..84a300d804541d63e82c08f517f4af136df2d642 100644
--- a/gcc/config/arm/arm1020e.md
+++ b/gcc/config/arm/arm1020e.md
@@ -246,13 +246,14 @@ (define_insn_reservation "1020branch_op" 0
       (eq_attr "type" "branch"))
  "1020a_e")
 
-;; The latency for a call is not predictable.  Therefore, we use 32 as
-;; roughly equivalent to positive infinity.
+;; The latency for a call is not predictable.  Therefore, we model it as
+;; blocking execution for a number of cycles; we can't do anything more
+;; accurate than that.
 
 (define_insn_reservation "1020call_op" 32
  (and (eq_attr "tune" "arm1020e,arm1022e")
       (eq_attr "type" "call"))
- "1020a_e*32")
+ "1020a_e*4")
 
 ;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
 ;; VFP
@@ -300,12 +301,12 @@ (define_insn_reservation "v10_fmul" 6
 (define_insn_reservation "v10_fdivs" 18
  (and (eq_attr "vfp10" "yes")
       (eq_attr "type" "fdivs, fsqrts"))
- "1020a_e+v10_ds*14")
+ "1020a_e+v10_ds*4")
 
 (define_insn_reservation "v10_fdivd" 32
  (and (eq_attr "vfp10" "yes")
       (eq_attr "type" "fdivd, fsqrtd"))
- "1020a_e+v10_fmac+v10_ds*28")
+ "1020a_e+v10_fmac+v10_ds*4")
 
 (define_insn_reservation "v10_floads" 4
  (and (eq_attr "vfp10" "yes")
