On 6/3/06, Steven Bosscher <[EMAIL PROTECTED]> wrote:
> For floating point data, the latter is the only interesting case
> because float loads only access the L2.  Thus using "lfetch" for
> floating point arrays will unnecessarily wipe out the contents of
> L1.  (gcc 3.2.3 only seems to generate "lfetch", which is why I
> ask...)

You could experiment with this for ia64 by hacking issue_prefetch_ref
in tree-ssa-loop-prefetch.c to issue a prefetch to L2 for floating
point types.

E.g. something like the patch below, which is (needless to say)
untested, but which you could play with.

Gr.
Steven
Index: tree-ssa-loop-prefetch.c
===================================================================
--- tree-ssa-loop-prefetch.c	(revision 114315)
+++ tree-ssa-loop-prefetch.c	(working copy)
@@ -816,7 +816,7 @@ static void
 issue_prefetch_ref (struct mem_ref *ref, unsigned unroll_factor, unsigned ahead)
 {
   HOST_WIDE_INT delta;
-  tree addr, addr_base, prefetch, params, write_p;
+  tree addr, addr_base, prefetch, params, write_p, locality;
   block_stmt_iterator bsi;
   unsigned n_prefetches, ap;
 
@@ -838,11 +838,21 @@ issue_prefetch_ref (struct mem_ref *ref,
 			  addr_base, build_int_cst (ptr_type_node, delta));
       addr = force_gimple_operand_bsi (&bsi, unshare_expr (addr), true, NULL);
 
-      /* Create the prefetch instruction.  */
+      /* Create the prefetch instruction.  Do this by building a call to
+         `void __builtin_prefetch (const void *ADDR, int RW, int LOCALITY)'.
+
+	 ??? The `locality' parameter is a shameless, untested hack to
+	 force lfetch.nt1 -- hopefully.  */
       write_p = ref->write_p ? integer_one_node : integer_zero_node;
-      params = tree_cons (NULL_TREE, addr,
-			  tree_cons (NULL_TREE, write_p, NULL_TREE));
-				 
+      locality = FLOAT_TYPE_P (TREE_TYPE (ref->mem))
+		 ? integer_one_node : build_int_cst (integer_type_node, 3);
+      params = tree_cons (NULL_TREE,
+			  addr,
+			  tree_cons (NULL_TREE,
+				     write_p,
+				     tree_cons (NULL_TREE,
+						locality,
+						NULL_TREE)));
       prefetch = build_function_call_expr (built_in_decls[BUILT_IN_PREFETCH],
 					   params);
       bsi_insert_before (&bsi, prefetch, BSI_SAME_STMT);
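
To see what the third argument does before rebuilding the compiler, the
same three-argument __builtin_prefetch call can be written by hand in a
test loop. This is only a sketch: the function name sum_with_prefetch
and the distance PREFETCH_AHEAD are made up for illustration, and
whether locality 1 really comes out as lfetch.nt1 on ia64 is the same
unverified hope as in the patch comment, so check the generated
assembly.

```c
#include <stddef.h>

/* Hypothetical prefetch distance in elements; tune it per machine.  */
enum { PREFETCH_AHEAD = 16 };

/* Sum an array of doubles, hinting each future element with a low
   temporal-locality prefetch (third argument 1, second argument 0
   meaning the access is a read).  */
double
sum_with_prefetch (const double *a, size_t n)
{
  double sum = 0.0;
  size_t i;

  for (i = 0; i < n; i++)
    {
      /* Only form addresses that lie inside the array.  */
      if (i + PREFETCH_AHEAD < n)
        __builtin_prefetch (&a[i + PREFETCH_AHEAD], 0, 1);
      sum += a[i];
    }
  return sum;
}
```

Compiling this with -S and comparing the output for locality values 0
through 3 shows which lfetch variant each one maps to on your target.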
