Ping. Could anybody have a look?

Thanks,
bin

On Tue, Nov 18, 2014 at 4:34 PM, Bin Cheng <bin.ch...@arm.com> wrote:
> Hi,
> This is the patch implementing ldp/stp optimization for aarch64. It
> consists of two parts. The first one is the peephole part, which further
> includes ldp/stp patterns (both peephole patterns and the insn match
> patterns) and auxiliary functions (both checking the validity and merging).
> The second part implements the aarch64 backend hook for the sched-fusion
> pass, which calculates appropriate priorities for different kinds of
> load/store instructions. With these priorities, the sched-fusion pass can
> schedule as many load/store instructions together as possible, so that the
> following peephole2 pass can merge them.
>
> I collected data for miscellaneous benchmarks. Some cases are improved;
> most of the rest do not regress; only a couple of them regress slightly,
> by 2-3%. After looking into the regressions I can confirm that the
> code transformation is generally good, with many loads/stores paired. These
> regressions are most probably false alarms caused by other issues.
>
> The conclusion is that this patch can pair lots of consecutive load/store
> instructions into ldp/stp. This is supported by the code size
> improvement of the benchmarks. E.g., in general it reduces the text size of
> spec2k6 binaries (O3 level, not statically linked in my build) by 1.68%.
>
> Bootstrapped and tested on aarch64. Is it OK?
>
> 2014-11-18  Bin Cheng  <bin.ch...@arm.com>
>
>     * config/aarch64/aarch64.md (load_pair<mode>): Split to
>     load_pairsi, load_pairdi, load_pairsf and load_pairdf.
>     (load_pairsi, load_pairdi, load_pairsf, load_pairdf): Split
>     from load_pair<mode>.  New alternative to support int/fp
>     registers in fp/int mode patterns.
>     (store_pair<mode>): Split to store_pairsi, store_pairdi,
>     store_pairsf and store_pairdf.
>     (store_pairsi, store_pairdi, store_pairsf, store_pairdf): Split
>     from store_pair<mode>.  New alternative to support int/fp
>     registers in fp/int mode patterns.
>     (*load_pair_extendsidi2_aarch64): New pattern.
>     (*load_pair_zero_extendsidi2_aarch64): New pattern.
>     (aarch64-ldpstp.md): Include.
>     * config/aarch64/aarch64-ldpstp.md: New file.
>     * config/aarch64/aarch64-protos.h (aarch64_gen_adjusted_ldpstp):
>     New.
>     (extract_base_offset_in_addr): New.
>     (aarch64_operands_ok_for_ldpstp): New.
>     (aarch64_operands_adjust_ok_for_ldpstp): New.
>     * config/aarch64/aarch64.c (enum sched_fusion_type): New enum.
>     (TARGET_SCHED_FUSION_PRIORITY): New hook.
>     (fusion_load_store): New function.
>     (extract_base_offset_in_addr): New function.
>     (aarch64_gen_adjusted_ldpstp): New function.
>     (aarch64_sched_fusion_priority): New function.
>     (aarch64_operands_ok_for_ldpstp): New function.
>     (aarch64_operands_adjust_ok_for_ldpstp): New function.
>
> 2014-11-18  Bin Cheng  <bin.ch...@arm.com>
>
>     * gcc.target/aarch64/ldp-stp-1.c: New test.
>     * gcc.target/aarch64/ldp-stp-2.c: New test.
>     * gcc.target/aarch64/ldp-stp-3.c: New test.
>     * gcc.target/aarch64/ldp-stp-4.c: New test.
>     * gcc.target/aarch64/ldp-stp-5.c: New test.
>     * gcc.target/aarch64/lr_free_1.c: Disable scheduling fusion
>     and peephole2 pass.
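
For anyone skimming the thread, here is a rough standalone model of the kind of validity check the peephole part needs before merging two accesses into one ldp/stp. It is only a sketch, not code from the patch: `struct mem_access`, `offset_fits_ldpstp` and `can_pair` are hypothetical names, and the real aarch64_operands_ok_for_ldpstp presumably checks more (register classes, volatile accesses and so on). The one architectural fact it relies on is that ldp/stp encode a signed 7-bit immediate scaled by the access size.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical model of one load or store: base register number,
   byte offset from that base, and access size in bytes.  */
struct mem_access {
    int base_reg;
    int64_t offset;
    int size;
};

/* ldp/stp take a signed 7-bit immediate scaled by the access size,
   so the lower offset must be a multiple of SIZE within
   [-64 * SIZE, 63 * SIZE].  */
static bool offset_fits_ldpstp(int64_t offset, int size)
{
    return offset % size == 0
           && offset >= -64 * (int64_t)size
           && offset <= 63 * (int64_t)size;
}

/* Return true if accesses A and B could be merged into one pair:
   same base, same size (4 or 8 bytes in this sketch), adjacent in
   memory in either order, and the lower offset is encodable.  */
static bool can_pair(struct mem_access a, struct mem_access b)
{
    if (a.base_reg != b.base_reg || a.size != b.size)
        return false;
    if (a.size != 4 && a.size != 8)
        return false;

    int64_t lo = a.offset < b.offset ? a.offset : b.offset;
    int64_t hi = a.offset < b.offset ? b.offset : a.offset;

    return hi - lo == a.size && offset_fits_ldpstp(lo, a.size);
}
```

With this model, two 4-byte accesses at [x1] and [x1, 4] pair up in either order, while 8-byte accesses at [x1, 512] and [x1, 520] do not, because 512 is past the largest encodable offset for that size (504).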