Hello Richard: On 17/05/24 11:07 pm, Richard Sandiford wrote: > Ajit Agarwal <aagar...@linux.ibm.com> writes: >> Hello Alex/Richard: >> >> All review comments are addressed. >> >> Common infrastructure of load store pair fusion is divided into target >> independent and target dependent changed code. >> >> Target independent code is the Generic code with pure virtual function >> to interface between target independent and dependent code. >> >> Target dependent code is the implementation of pure virtual function for >> aarch64 target and the call to target independent code. >> >> Bootstrapped and regtested on aarch64-linux-gnu. >> >> Thanks & Regards >> Ajit > > Thanks for the patch and thanks to Alex for the reviews. The patch > looks good to me apart from the minor nits below and the comments that > Alex had. Please post the updated patch for a final ok though. > >> aarch64: Preparatory patch to place target independent and >> dependent changed code in one file >> >> Common infrastructure of load store pair fusion is divided into target >> independent and target dependent changed code. >> >> Target independent code is the Generic code with pure virtual function >> to interface betwwen target independent and dependent code. >> >> Target dependent code is the implementation of pure virtual function for >> aarch64 target and the call to target independent code. >> >> 2024-05-15 Ajit Kumar Agarwal <aagar...@linux.ibm.com> >> >> gcc/ChangeLog: >> >> * config/aarch64/aarch64-ldp-fusion.cc: Place target >> independent and dependent changed code. > > Not sure this is a complete sentence. Maybe: > > * config/aarch64/aarch64-ldp-fusion.cc: Factor out a > target-independent interface and move it to the head of the file. > > That technically isn't detailed enough for a changelog entry, > but IMO we should use it anyway. It's pointless to write the usual > amount of detail when the code is going to move soon. >
Addressed in v7 of the patch. >> --- >> gcc/config/aarch64/aarch64-ldp-fusion.cc | 533 +++++++++++++++-------- >> 1 file changed, 357 insertions(+), 176 deletions(-) >> >> diff --git a/gcc/config/aarch64/aarch64-ldp-fusion.cc >> b/gcc/config/aarch64/aarch64-ldp-fusion.cc >> index 1d9caeab05d..429e532ea3b 100644 >> --- a/gcc/config/aarch64/aarch64-ldp-fusion.cc >> +++ b/gcc/config/aarch64/aarch64-ldp-fusion.cc >> @@ -138,6 +138,225 @@ struct alt_base >> poly_int64 offset; >> }; >> >> +// Virtual base class for load/store walkers used in alias analysis. >> +struct alias_walker >> +{ >> + virtual bool conflict_p (int &budget) const = 0; >> + virtual insn_info *insn () const = 0; >> + virtual bool valid () const = 0; >> + virtual void advance () = 0; >> +}; >> + >> +// When querying handle_writeback_opportunities, this enum is used to >> +// qualify which opportunities we are asking about. >> +enum class writeback { >> + // Only those writeback opportunities that arise from existing >> + // auto-increment accesses. >> + EXISTING, >> + // All writeback opportunities including those that involve folding > > There should be a comma after "opportunities" > >> + // base register updates into a non-writeback pair. >> + ALL >> +}; >> + >> +struct pair_fusion { >> + pair_fusion () >> + { >> + calculate_dominance_info (CDI_DOMINATORS); >> + df_analyze (); >> + crtl->ssa = new rtl_ssa::function_info (cfun); >> + }; > > Unnecessary trailing ";". I think it'd be better to define this and > the destructor out-of-line though. For one thing, it'll reduce the number > of header file dependencies, once the code is moved to its own header file. > Addressed in v7 of the patch. >> + >> + // Given: >> + // - an rtx REG_OP, the non-memory operand in a load/store insn, >> + // - a machine_mode MEM_MODE, the mode of the MEM in that insn, and >> + // - a boolean LOAD_P (true iff the insn is a load), then: >> + // return true if the access should be considered an FP/SIMD access. >> + // Such accesses are segregated from GPR accesses, since we only want >> + // to form pairs for accesses that use the same register file. >> + virtual bool fpsimd_op_p (rtx, machine_mode, bool) >> + { >> + return false; >> + } >> + >> + // Return true if we should consider forming pairs from memory >> + // accesses with operand mode MODE at this stage in compilation. >> + virtual bool pair_operand_mode_ok_p (machine_mode mode) = 0; >> + >> + // Return true iff REG_OP is a suitable register operand for a paired >> + // memory access, where LOAD_P is true if we're asking about loads and >> + // false for stores. MODE gives the mode of the operand. >> + virtual bool pair_reg_operand_ok_p (bool load_p, rtx reg_op, >> + machine_mode mode) = 0; >> + >> + // Return alias check limit. >> + // This is needed to avoid unbounded quadratic behaviour when >> + // performing alias analysis. >> + virtual int pair_mem_alias_check_limit () = 0; > > I think the end result should be to make this a target-independent > --param, but this is ok/good as an intermediate step. > >> + >> + // Returns true if we should try to handle writeback opportunities. > > s/Returns/Return/ Addressed in v7 of the patch. > >> + // WHICH determines the kinds of writeback opportunities the caller >> + // is asking about. >> + virtual bool handle_writeback_opportunities (enum writeback which) = 0 ; > > Excess space before the final ";". > Addressed in v7 of the patch. > How about calling the function "should_handle_writeback" instead? I think > the current name implies that the function does the handling, whereas it > instead checks whether handling should be done. > > The comment above enum class writeback would need to change too. > >> + >> + // Given BASE_MEM, the mem from the lower candidate access for a pair, >> + // and LOAD_P (true if the access is a load), check if we should proceed >> + // to form the pair given the target's code generation policy on >> + // paired accesses. >> + virtual bool pair_mem_ok_with_policy (rtx base_mem, bool load_p) = 0; >> + >> + // Generate the pattern for a paired access. PATS gives the patterns >> + // for the individual memory accesses (which by this point must share a >> + // common base register). If WRITEBACK is non-NULL, then this rtx >> + // describes the update to the base register that should be performed by >> + // the resulting insn. LOAD_P is true iff the accesses are loads. >> + virtual rtx gen_pair (rtx *pats, rtx writeback, bool load_p) = 0; >> + >> + // Return true if INSN is a paired memory access. If so, set LOAD_P to >> + // true iff INSN is a load pair. >> + virtual bool pair_mem_insn_p (rtx_insn *insn, bool &load_p) = 0; >> + >> + // Return true if we should track loads. >> + virtual bool track_loads_p () >> + { >> + return true; >> + } >> + >> + // Return true if we should track stores. >> + virtual bool track_stores_p () >> + { >> + return true; >> + } >> + >> + // Return true if OFFSET is in range for a paired memory access. >> + virtual bool pair_mem_in_range_p (HOST_WIDE_INT offset) = 0; >> + >> + // Given a load/store pair insn in PATTERN, unpack the insn, storing >> + // the register operands in REGS, and returning the mem. LOAD_P is >> + // true for loads and false for stores. >> + virtual rtx destructure_pair (rtx regs[2], rtx pattern, bool load_p) = 0; >> + >> + // Given a pair mem in MEM, register operands in REGS, and an rtx >> + // representing the effect of writeback on the base register in WB_EFFECT, >> + // return an insn representing a writeback variant of this pair. >> + // LOAD_P is true iff the pair is a load. >> + // This is used when promoting existing non-writeback pairs to writeback >> + // variants. >> + virtual rtx gen_promote_writeback_pair (rtx wb_effect, rtx mem, >> + rtx regs[2], bool load_p) = 0; >> + >> + void ldp_fusion_bb (bb_info *bb); > > This seems to be the only place in the interface that preserves the > old "ldp" name. Given that this is a member function, and that the class > that it's in provides context, it might be better to rename the function > to something bland like: > > void process_block (bb_info *bb); > Addressed in v7 of the patch. >> [...] >> @@ -159,9 +378,9 @@ struct ldp_bb_info >> hash_map<def_hash, alt_base> canon_base_map; >> >> static const size_t obstack_alignment = sizeof (void *); >> - bb_info *m_bb; >> >> - ldp_bb_info (bb_info *bb) : m_bb (bb), m_emitted_tombstone (false) >> + ldp_bb_info (bb_info *bb, pair_fusion *d) >> + : m_bb (bb), m_pass (d), m_emitted_tombstone (false) > > Formatting nit: the usual style is to indent ": ..." by two spaces > relative to the function name (rather than line it up with the "("): > > ldp_bb_info (bb_info *bb, pair_fusion *d) > : m_bb (bb), m_pass (d), m_emitted_tombstone (false) > Addressed in v7 of the patch. >> [...] >> @@ -510,11 +760,11 @@ ldp_bb_info::track_access (insn_info *insn, bool >> load_p, rtx mem) >> // constraint; this early return means we never get to the code >> // that calls targetm.legitimize_address_displacement. >> // >> - // So for now, it's better to punt when we can't be sure that the >> - // offset is in range for LDP/STP. Out-of-range cases can then be >> - // handled after RA by the out-of-range LDP/STP peepholes. Eventually, it >> - // would be nice to handle known out-of-range opportunities in the >> - // pass itself (for stack accesses, this would be in the post-RA pass). >> + // It's better to punt when we can't be sure that the > > I'd keep the "So for now,", unless Alex thinks we shouldn't. > >> + // offset is in range for paired access. On aarch64, Out-of-range cases > > s/Out-of-range/out-of-range/ Addressed in v7 of the patch. > >> + // can then be handled after RA by the out-of-range LDP/STP peepholes. >> + // Eventually, it would be nice to handle known out-of-range opportunities >> + // in the pass itself (for stack accesses, this would be in the post-RA >> pass). >> if (!reload_completed >> && (REGNO (base) == FRAME_POINTER_REGNUM >> || REGNO (base) == ARG_POINTER_REGNUM)) >> [...] >> @@ -1199,8 +1449,8 @@ extract_writebacks (bool load_p, rtx pats[2], int >> changed) >> // base register. If there is one, we choose the first such update after >> // PAIR_DST that is still in the same BB as our pair. We return the new >> def in >> // *ADD_DEF and the resulting writeback effect in *WRITEBACK_EFFECT. >> -static insn_info * >> -find_trailing_add (insn_info *insns[2], >> +insn_info * >> +pair_fusion::find_trailing_add (insn_info *insns[2], >> const insn_range_info &pair_range, >> int initial_writeback, >> rtx *writeback_effect, > > The parameters should be reindented to match the longer name. > Addressed in v7 of the patch. > Thanks again for doing this. > > Richard Thanks & Regards Ajit