On Fri, Aug 26, 2016 at 11:14 AM, Kyrill Tkachov
<kyrylo.tkac...@foss.arm.com> wrote:
> Hi all,
>
> The scheduling automata sizes are getting a bit out of control (as the PR
> complains about) and the Cortex-A8 one is one of the largest offenders.
> An easy, low-hanging fruit in dealing with this are some of the FP/NEON
> operations that have very large reservation durations specified for them.
> They bloat the state space by quite a lot and it's not likely that there
> is enough parallelism present in the program to fill the (for example)
> 64 cycles that are modelled for the double-precision division.
> In the past we've dealt with this by decreasing the modelled reservation
> duration to keep the state space down.
>
> This patch does that for the cortex_a8_neon automaton and caps the
> reservation duration for a particular reservation to 15 cycles.
> This should be plenty to demonstrate that these are high latency
> instructions.
> With this patch the number of NDFA states is massively reduced by more
> than 70% (26796 -> 6020).
>
> As I don't have access to reasonable Cortex-A8 hardware I benchmarked it
> on SPEC2000 on a Cortex-A15.
> The idea (from Ramana) is that since Cortex-A8 tuning is the default
> tuning for armv7-a the patch shouldn't hurt the more widely accessible
> Cortex-A15 targets. There were no regressions in performance there.
>
> Bootstrapped and tested on arm-none-linux-gnueabihf.
> Ok for trunk?
>
OK,

regards
Ramana

> Thanks,
> Kyrill
>
> 2016-08-26  Kyrylo Tkachov  <kyrylo.tkac...@arm.com>
>
>     PR target/70473
>     * config/arm/cortex-a8-neon.md (cortex_a8_vfp_muld): Reduce
>     reservation duration to 15 cycles.
>     (cortex_a8_vfp_macs): Likewise.
>     (cortex_a8_vfp_macd): Likewise.
>     (cortex_a8_vfp_divs): Likewise.
>     (cortex_a8_vfp_divd): Likewise.
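
For readers following the numbers in the quoted patch description, here is a minimal Python sketch of the capping idea and of the quoted state-count reduction. It is only an illustration, not the patch itself: the real change edits reservation strings in config/arm/cortex-a8-neon.md. The helper names `cap_reservation` and `reduction_percent` are hypothetical; the only figures taken from the message are the 64-cycle double-precision division, the 15-cycle cap, and the NDFA state counts 26796 and 6020.

```python
# Illustrative sketch only; helper names are hypothetical and not part of GCC.

CAP_CYCLES = 15  # cap on modelled reservation duration, as stated in the patch


def cap_reservation(cycles, cap=CAP_CYCLES):
    """Return the modelled reservation duration after applying the cap."""
    return min(cycles, cap)


def reduction_percent(before, after):
    """Percentage reduction when going from `before` states to `after`."""
    return 100.0 * (before - after) / before


# The double-precision division was modelled with a 64-cycle reservation.
print(cap_reservation(64))  # 15

# NDFA state counts quoted in the message: 26796 -> 6020.
print(round(reduction_percent(26796, 6020), 1))  # 77.5
```

The second figure is consistent with the "more than 70%" claim in the message.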