Pushed to master with the following fixups:
  - new timevar added
  - nits addressed
  - whitespace fixes

Philipp.


On Mon, 25 Nov 2024 at 03:30, Jeff Law <jeffreya...@gmail.com> wrote:
>
>
>
> On 11/9/24 2:48 AM, Konstantinos Eleftheriou wrote:
> > From: kelefth <konstantinos.elefther...@vrull.eu>
> >
> > This pass detects cases of expensive store forwarding and tries to avoid 
> > them
> > by reordering the stores and using suitable bit insertion sequences.
> > For example it can transform this:
> >
> >       strb    w2, [x1, 1]
> >       ldr     x0, [x1]      # Expensive store forwarding to larger load.
> >
> > To:
> >
> >       ldr     x0, [x1]
> >       strb    w2, [x1]
> >       bfi     x0, x2, 0, 8
> >
> > Assembly like this can appear with bitfields or type punning / unions.
> > On stress-ng when running the cpu-union microbenchmark the following 
> > speedups
> > have been observed.
> >
> >    Neoverse-N1:      +29.4%
> >    Intel Coffeelake: +13.1%
> >    AMD 5950X:        +17.5%
> >
> > The transformation is rejected on cases that would cause store_bit_field
> > to generate subreg expressions on different register classes.
> > Files avoid-store-forwarding-4.c and avoid-store-forwarding-5.c contain
> > such cases and have been marked as XFAIL.
> >
> > There is a special handling for machines with BITS_BIG_ENDIAN !=
> > BYTES_BIG_ENDIAN. The need for this came up from an issue in H8
> > architecture, which uses big-endian ordering, but BITS_BIG_ENDIAN
> > is false. In that case, the START parameter of store_bit_field
> > needs to be calculated from the end of the destination register.
> >
> > gcc/ChangeLog:
> >
> >       * Makefile.in (OBJS): Add avoid-store-forwarding.o.
> >       * common.opt (favoid-store-forwarding): New option.
> >       * common.opt.urls: Regenerate.
> >       * doc/invoke.texi: New param store-forwarding-max-distance.
> >       * doc/passes.texi: Document new pass.
> >       * doc/tm.texi: Regenerate.
> >       * doc/tm.texi.in: Document new pass.
> >       * params.opt (store-forwarding-max-distance): New param.
> >       * passes.def: Add pass_rtl_avoid_store_forwarding before
> >       pass_early_remat.
> >       * target.def (avoid_store_forwarding_p): New DEFHOOK.
> >       * target.h (struct store_fwd_info): Declare.
> >       * targhooks.cc (default_avoid_store_forwarding_p): New function.
> >       * targhooks.h (default_avoid_store_forwarding_p): Declare.
> >       * tree-pass.h (make_pass_rtl_avoid_store_forwarding): Declare.
> >       * avoid-store-forwarding.cc: New file.
> >       * avoid-store-forwarding.h: New file.
> >
> > gcc/testsuite/ChangeLog:
> >
> >       * gcc.target/aarch64/avoid-store-forwarding-1.c: New test.
> >       * gcc.target/aarch64/avoid-store-forwarding-2.c: New test.
> >       * gcc.target/aarch64/avoid-store-forwarding-3.c: New test.
> >       * gcc.target/aarch64/avoid-store-forwarding-4.c: New test.
> >       * gcc.target/aarch64/avoid-store-forwarding-5.c: New test.
> >       * gcc.target/x86_64/abi/callabi/avoid-store-forwarding-1.c: New test.
> >          * gcc.target/x86_64/abi/callabi/avoid-store-forwarding-2.c: New 
> > test.
> >
> > Signed-off-by: Philipp Tomsich <philipp.toms...@vrull.eu>
> > Signed-off-by: Konstantinos Eleftheriou <konstantinos.elefther...@vrull.eu>
> >
> > Series-version: 8
> >
> > Series-changes: 8
> >       - Fix store_bit_field call for big-endian targets, where
> >         BITS_BIG_ENDIAN is false.
> >       - Handle store_forwarding_max_distance = 0 as a special case that
> >         disables cost checks for avoid-store-forwarding.
> >       - Update testcases for AArch64 and add testcases for x86-64.
> >
> > Series-changes: 7
> >       - Fix bug when copying back the load register, in the case that the
> >         load is eliminated.
> >
> > Series-changes: 6
> >       - Reject the transformation on cases that would cause store_bit_field
> >         to generate subreg expressions on different register classes.
> >         Files avoid-store-forwarding-4.c and avoid-store-forwarding-5.c
> >            contain such cases and have been marked as XFAIL.
> >       - Use optimize_bb_for_speed_p instead of optimize_insn_for_speed_p.
> >       - Inline and remove get_load_mem.
> >       - New implementation for is_store_forwarding.
> >       - Refactor the main loop in avoid_store_forwarding.
> >       - Avoid using the word 'forwardings'.
> >       - Use lowpart_subreg instead of validate_subreg + gen_rtx_subreg.
> >       - Don't use df_insn_rescan where not needed.
> >       - Change order of emitting stores and bit insert instructions.
> >       - Check and reject loads for which the dest register overlaps with 
> > src.
> >       - Remove unused variable.
> >       - Change some gen_mov_insn function calls to gen_rtx_SET.
> >       - Subtract the cost of eliminated load, instead of 1, for the total 
> > cost.
> >       - Use delete_insn instead of set_insn_deleted.
> >       - Regenerate common.opt.urls.
> >       - Add some more comments.
> >
> > Series-changes: 5
> >       - Fix bug with BIG_ENDIAN targets.
> >       - Fix bug with unrecognized instructions.
> >       - Fix / simplify pass init/fini.
> >
> > Series-changes: 4
> >       - Change pass scheduling to run after sched1.
> >       - Add target hook to decide whether a store forwarding instance
> >         should be avoided or not.
> >       - Fix bugs.
> >
> > Series-changes: 3
> >       - Only emit SUBREG after calling validate_subreg.
> >       - Fix memory corruption due to vec self-reference.
> >       - Fix bitmap_bit_in_range_p ICE due to BLKMode.
> >       - Reject MEM to MEM sets.
> >       - Add get_load_mem comment.
> >       - Add new testcase.
> >
> > Series-changes: 2
> >       - Allow modes that are not scalar_int_mode.
> >       - Introduce simple costing to avoid unprofitable transformations.
> >       - Reject bit insert sequences that spill to memory.
> >       - Document new pass.
> >       - Fix and add testcases.
> > ---
>
> > +namespace {
> > +
> > +const pass_data pass_data_avoid_store_forwarding =
> > +{
> > +  RTL_PASS, /* type.  */
> > +  "avoid_store_forwarding", /* name.  */
> > +  OPTGROUP_NONE, /* optinfo_flags.  */
> > +  TV_NONE, /* tv_id.  */
> > +  0, /* properties_required.  */
> > +  0, /* properties_provided.  */
> > +  0, /* properties_destroyed.  */
> > +  0, /* todo_flags_start.  */
> > +  TODO_df_finish /* todo_flags_finish.  */
> > +};
> Probably want a TV entry for store forwarding.  While it shouldn't be a
> big time sink, we've seen other passes that should be efficient go nuts
> in some cases (extension elimination being the most recent example).
>
>
>
>
> > +
> > +/* Try to modify BB so that expensive store forwarding cases are avoided.  
> > */
> > +
> > +void store_forwarding_analyzer::avoid_store_forwarding (basic_block bb)
>
> Formatting nit.  Put the return type on its own line so that the
> function name always starts on column 0 of its own line.
>
>
>
> > +
> > +/* Update pass statistics.  */
> > +
> > +void store_forwarding_analyzer::update_stats (function *fn)
> Similarly.
>
>
> OK with the new timevar and two formatting nits.  Thanks for your
> patience on this.
>
> Jeff
>
>

Reply via email to