Hi,

We currently do not set any interesting default values for jump and function
alignment in AArch64. This patch derives these values from the issue rate
of the processor as follows:

  jumps: 4 * processor issue-rate (rounded down to nearest power of two)
  functions: 4 * processor issue-rate (rounded up to nearest power of two)

This is sensible for the ARMv8-A implementations I tested on. An
alternative patch would make these values new fields in the tuning
tables.
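
To make the arithmetic concrete, here is a small standalone sketch of the
formula. It is not part of the patch: the function names are hypothetical,
and the two log2 helpers are simplified stand-ins for GCC's floor_log2 and
ceil_log2, assumed to be called with a positive argument only.

```c
#include <assert.h>

/* Stand-in for GCC's floor_log2: largest k such that 2^k <= x.
   Assumes x > 0.  */
static int
sketch_floor_log2 (unsigned int x)
{
  int k = -1;

  assert (x > 0);
  while (x != 0)
    {
      x >>= 1;
      k++;
    }
  return k;
}

/* Stand-in for GCC's ceil_log2: smallest k such that 2^k >= x.
   Assumes x > 0.  */
static int
sketch_ceil_log2 (unsigned int x)
{
  int k = sketch_floor_log2 (x);

  return ((1u << k) == x) ? k : k + 1;
}

/* Jumps: 4 * issue rate, rounded down to a power of two.  */
static unsigned int
sketch_default_align_jumps (unsigned int issue_rate)
{
  return 1u << sketch_floor_log2 (issue_rate * 4);
}

/* Functions: 4 * issue rate, rounded up to a power of two.  */
static unsigned int
sketch_default_align_functions (unsigned int issue_rate)
{
  return 1u << sketch_ceil_log2 (issue_rate * 4);
}
```

The two roundings only differ when 4 * issue-rate is not already a power
of two. For an issue rate of 3, for example, jumps are aligned to 8 bytes
and functions to 16 bytes; for issue rates of 1, 2 and 4 both values are
the same (4, 8 and 16 bytes respectively).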

This happens to work well for some benchmarks and doesn't harm others.
The benefit swings depending on the existing alignment and the knock-on
effects.

Bootstrapped on aarch64-none-linux-gnu with no issues.

Does anyone have any thoughts or preferences as to how we set these
values in future? If not, OK for trunk?

Thanks,
James

---
2014-11-14  James Greenhalgh  <james.greenha...@arm.com>

        * config/aarch64/aarch64.c (aarch64_override_options): Set default
        alignments for functions and jumps.
diff --git a/gcc/config/aarch64/aarch64.c b/gcc/config/aarch64/aarch64.c
index d4a8a2f..6b51885 100644
--- a/gcc/config/aarch64/aarch64.c
+++ b/gcc/config/aarch64/aarch64.c
@@ -6493,6 +6493,32 @@ aarch64_override_options (void)
 #endif
     }
 
+  /* If we haven't been asked for any particular alignment for jumps and
+     functions, choose defaults for the user.  We pick an alignment
+     which is word-size * issue-rate, rounded up to the nearest power of
+     two for functions and down to the nearest power of two for
+     jumps.  */
+  if (!optimize_size)
+    {
+      if (align_jumps <= 0)
+	{
+	  /* Ideally, we want to be aligned to a block at least as large
+	     as the issue width of the processor, but too much padding
+	     risks wasting cache space.  Settle for the nearest power
+	     of two below what we wanted.  */
+	  align_jumps = aarch64_tune_params->issue_rate * 4;
+	  align_jumps = 1 << floor_log2 (align_jumps);
+	}
+      if (align_functions <= 0)
+	{
+	  /* We want to be aligned to a block at least as large as the issue
+	     width of the processor.  */
+	  align_functions = aarch64_tune_params->issue_rate * 4;
+	  /* Round up to the nearest power of two.  */
+	  align_functions = 1 << ceil_log2 (align_functions);
+	}
+    }
+
   aarch64_override_options_after_change ();
 }
 
