On Sun, 2014-08-17 at 14:52 -0400, David Edelsohn wrote:
> On Wed, Aug 13, 2014 at 7:14 PM, Bill Schmidt
> <wschm...@linux.vnet.ibm.com> wrote:
> > Hi,
> >
> > This patch adds a PowerPC-specific pass just prior to the first cse RTL
> > pass. The pass runs only when generating little-endian code for Power8
> > with VSX enabled, and for -O1 and up. For this particular subtarget,
> > the use of the big-endian-biased vector load and store instructions
> > requires permutations to order vector elements for little endian. To
> > reduce the overhead of these permutations, this pass looks for
> > computations for which the exact lanes in which computations are
> > performed do not matter, so long as the results are returned to
> > storage in the proper order. For such computations we can remove the
> > xxpermdi's associated with the vector loads and stores.
> >
> > This patch relies on another patch posted today that converts a struct
> > used by the web pass into a base class that this patch can subclass. If
> > it's determined that the other patch isn't appropriate, then this patch
> > will need modifications to duplicate the union-find logic.
> >
> > A complete description of the new pass appears in rs6000.c (search for
> > "Analyze vector computations"). That description also identifies some
> > remaining opportunities we can follow up with later.
> >
> > A number of new tests are added to verify that the pass works as
> > expected for some vectorized code samples.
> >
> > Bootstrapped and tested on powerpc64le-unknown-linux-gnu with no
> > regressions. Is this ok for trunk?
> >
> > Thanks,
> > Bill
> >
> >
> > [gcc]
> >
> > 2014-08-13 Bill Schmidt <wschm...@linux.vnet.ibm.com>
> >
> >         * config/rs6000/rs6000.c (context.h): New include.
> >         (tree-pass.h): Likewise.
> >         (make_pass_analyze_swaps): New decl.
> >         (rs6000_option_override): Register pass_analyze_swaps.
> >         (swap_web_entry): New subclass of web_entry_base (df.h).
> >         (special_handling_values): New enum.
> >         (union_defs): New function.
> >         (union_uses): Likewise.
> >         (insn_is_load_p): Likewise.
> >         (insn_is_store_p): Likewise.
> >         (insn_is_swap_p): Likewise.
> >         (rtx_is_swappable_p): Likewise.
> >         (insn_is_swappable_p): Likewise.
> >         (chain_purpose): New enum.
> >         (chain_contains_only_swaps): New function.
> >         (mark_swaps_for_removal): Likewise.
> >         (swap_const_vector_halves): Likewise.
> >         (adjust_subreg_index): Likewise.
> >         (permute_load): Likewise.
> >         (permute_store): Likewise.
> >         (handle_special_swappables): Likewise.
> >         (replace_swap_with_copy): Likewise.
> >         (dump_swap_insn_table): Likewise.
> >         (rs6000_analyze_swaps): Likewise.
> >         (pass_data_analyze_swaps): New pass_data.
> >         (pass_analyze_swaps): New rtl_opt_pass.
> >         (make_pass_analyze_swaps): New function.
> >         * config/rs6000/rs6000.opt (moptimize-swaps): New option.
> >
> > [gcc/testsuite]
> >
> > 2014-08-13 Bill Schmidt <wschm...@linux.vnet.ibm.com>
> >
> >         * gcc.target/powerpc/swaps-p8-1.c: New test.
> >         * gcc.target/powerpc/swaps-p8-2.c: New test.
> >         * gcc.target/powerpc/swaps-p8-3.c: New test.
> >         * gcc.target/powerpc/swaps-p8-4.c: New test.
> >         * gcc.target/powerpc/swaps-p8-5.c: New test.
> >         * gcc.target/powerpc/swaps-p8-6.c: New test.
> >         * gcc.target/powerpc/swaps-p8-7.c: New test.
> >         * gcc.target/powerpc/swaps-p8-8.c: New test.
> >         * gcc.target/powerpc/swaps-p8-9.c: New test.
> >         * gcc.target/powerpc/swaps-p8-10.c: New test.
> >         * gcc.target/powerpc/swaps-p8-11.c: New test.
> >         * gcc.target/powerpc/swaps-p8-12.c: New test.
>
> This looks okay, although I was hoping that other developers with more
> DF and web experience would double check.

Thanks for the review!

> Why are you specifically gating the pass on POWER8?

The problem is introduced in POWER8 (since LE isn't supported earlier),
and I hope that this will no longer be necessary for -mcpu=power9. If
for some reason this doesn't turn out to be the case, we'll need to make
a change at that time, I suppose...

Thanks,
Bill

>
> Thanks, David
>
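
For anyone following the thread who wants the intuition behind "the exact lanes do not matter", here is a minimal sketch in plain C. It is not code from the patch: the `v4si` struct, `swap_halves`, `add_lanes`, and the other names are hypothetical stand-ins that model a 128-bit vector as four 32-bit lanes and the doubleword swap as exchanging the two 64-bit halves. It only shows that a lane-wise operation gives the same stored result with or without the swaps around it.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical model of a 128-bit vector register: four 32-bit lanes.
   Lanes 0-1 form one doubleword, lanes 2-3 the other.  */
typedef struct { uint32_t lane[4]; } v4si;

/* Model of the doubleword swap: exchange the two 64-bit halves.  */
static v4si swap_halves(v4si v)
{
    v4si r = { { v.lane[2], v.lane[3], v.lane[0], v.lane[1] } };
    return r;
}

/* Lane-wise add: result lane i depends only on lane i of the inputs.  */
static v4si add_lanes(v4si a, v4si b)
{
    v4si r;
    for (int i = 0; i < 4; i++)
        r.lane[i] = a.lane[i] + b.lane[i];
    return r;
}

/* Shape of the code with swaps: one after each load, one before the store.  */
static v4si add_with_swaps(v4si a, v4si b)
{
    return swap_halves(add_lanes(swap_halves(a), swap_halves(b)));
}

/* Same computation with all three swaps removed.  */
static v4si add_without_swaps(v4si a, v4si b)
{
    return add_lanes(a, b);
}

/* Returns nonzero when both forms store identical bytes.  */
static int swaps_are_removable(v4si a, v4si b)
{
    v4si x = add_with_swaps(a, b);
    v4si y = add_without_swaps(a, b);
    return memcmp(&x, &y, sizeof x) == 0;
}

/* Small self-check of the claim on one sample input.  */
static void check_example(void)
{
    v4si a = { { 1, 2, 3, 4 } };
    v4si b = { { 10, 20, 30, 40 } };
    assert(swaps_are_removable(a, b));
}
```

The equality holds because the swap is its own inverse and a lane-wise operation commutes with any fixed permutation of lanes. An operation that reads a specific lane, such as an element extract, does not have this property in general, and would need different treatment.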