Here's the patch for sms-and-memory-dependencies. The idea is to bypass
the sched-deps.c {output,read,anti,true}_dependence tests altogether --
which is easy to do thanks to the note_mem_dep hook -- and instead handle
them in ddg.c. The ddg.c tests then use RTL loop iv analysis to try
to get longer distances on the memory dependencies. (Note that other
memory-related dependencies, such as those between volatile MEMs and
other volatile instructions, are still handled by sched-deps.c.)
Dependencies are now always created in pairs, so there's no need for
get_sched_window to set an upper bound when processing incoming MEM_DEPs,
or a lower bound when processing outgoing MEM_DEPs; we can rely on the
partnering edge to do that instead.
Richard
gcc/
* Makefile.in (ddg.o): Depend on $(TREE_PASS_H).
* ddg.h (REG_OR_MEM_DEP, REG_AND_MEM_DEP): Delete.
(ddg_mem_ref): New structure.
(ddg): Add loads and stores array.
(create_ddg): Add a loop argument.
(add_edges_to_ddg): Declare.
(MAX_DDG_DISTANCE): New macro.
* ddg.c: Include tree-pass.h.
(mem_ref_p, mark_mem_use, mark_mem_use_1, mem_read_insn_p)
(mark_mem_store, mem_write_insn_p, rtx_mem_access_p)
(mem_access_insn_p): Delete.
(create_mem_ref): New function.
(graph_and_node): New structure.
(record_loads, record_stores): New functions.
(create_ddg_dep_from_intra_loop_link): Treat all dependencies
as register dependencies.
(walk_mems_2, walk_mems_1, insns_may_alias_p, add_intra_loop_mem_dep)
(add_inter_loop_mem_dep): Delete.
(build_intra_loop_deps): Ignore memory dependencies created by
sched-deps.c. Don't handle memory dependencies here.
(measure_mem_distance, add_memory_dep): New functions.
(FOR_EACH_LATER_MEM_REF): New macro.
(build_memory_deps): New function.
(create_ddg): Take the loop as argument. Don't count loads and
stores here. Call iv_analysis_loop_init. Pass all loads to
record_loads and all stores to record_stores. Move edge
creation to...
(add_edges_to_ddg): ...this new function. Also call
build_memory_deps.
* modulo-sched.c (sat_mulpp, sat_addsp, sat_subsp): New functions.
(schedule_reg_moves): Only handle register dependencies.
(sms_schedule): Update call to create_ddg. Call iv_analysis_done
after creating all ddgs. Only set issue_rate if there are ddgs.
Only call setup_sched_infos and haifa_sched_init if there are ddgs.
Call add_edges_to_ddg before processing each ddg.
(get_sched_window): Use saturating arithmetic. Do not add an
implicit upper bound for incoming MEM_DEPs, or an implicit lower
bound for outgoing MEM_DEPs. Rework calculation of final window.
(calculate_must_precede_follow, compute_split_row): Use saturating
arithmetic.
Index: gcc/Makefile.in
===
--- gcc/Makefile.in 2011-12-30 13:13:45.077544981 +
+++ gcc/Makefile.in 2011-12-30 13:24:57.330195801 +
@@ -3316,7 +3316,7 @@ ddg.o : ddg.c $(DDG_H) $(CONFIG_H) $(SYS
$(DIAGNOSTIC_CORE_H) $(RTL_H) $(TM_P_H) $(REGS_H) $(FUNCTION_H) \
$(FLAGS_H) insn-config.h $(INSN_ATTR_H) $(EXCEPT_H) $(RECOG_H) \
$(SCHED_INT_H) $(CFGLAYOUT_H) $(CFGLOOP_H) $(EXPR_H) $(BITMAP_H) \
- hard-reg-set.h sbitmap.h $(TM_H)
+ hard-reg-set.h sbitmap.h $(TM_H) $(TREE_PASS_H)
modulo-sched.o : modulo-sched.c $(DDG_H) $(CONFIG_H) $(CONFIG_H) $(SYSTEM_H) \
coretypes.h $(TARGET_H) $(DIAGNOSTIC_CORE_H) $(RTL_H) $(TM_P_H) $(REGS_H)
$(FUNCTION_H) \
$(FLAGS_H) insn-config.h $(INSN_ATTR_H) $(EXCEPT_H) $(RECOG_H) \
Index: gcc/ddg.h
===
--- gcc/ddg.h 2011-12-30 13:13:45.077544981 +
+++ gcc/ddg.h 2011-12-30 13:24:57.324195831 +
@@ -35,8 +35,7 @@ typedef struct ddg_scc *ddg_scc_ptr;
typedef struct ddg_all_sccs *ddg_all_sccs_ptr;
typedef enum {TRUE_DEP, OUTPUT_DEP, ANTI_DEP} dep_type;
-typedef enum {REG_OR_MEM_DEP, REG_DEP, MEM_DEP, REG_AND_MEM_DEP}
-dep_data_type;
+typedef enum {REG_DEP, MEM_DEP} dep_data_type;
/* The following two macros enables direct access to the successors and
predecessors bitmaps held in each ddg_node. Do not make changes to
@@ -44,6 +43,28 @@ typedef enum {REG_OR_MEM_DEP, REG_DEP, M
#define NODE_SUCCESSORS(x) ((x)->successors)
#define NODE_PREDECESSORS(x) ((x)->predecessors)
+/* A structure that represents a memory read or write in the DDG;
+ context decides which. */
+struct ddg_mem_ref {
+ /* The previous reference of the same type (read or write) in the DDG. */
+ struct ddg_mem_ref *prev;
+
+ /* The DDG node that contains the memory reference. */
+ ddg_node_ptr node;
+
+ /* The memory reference itself. */
+ rtx mem;
+
+ /* If the address is a known induction var