On 20/06/17 16:41, James Greenhalgh wrote:
>
> Hi,
>
> This patch adds support for the ARM Cortex-A75 and
> Cortex-A55 processors through the -mcpu/-mtune values cortex-a55 and
> cortex-a75, and an ARM DynamIQ big.LITTLE configuration of these two
> processors through the -mcpu/-mtune value cortex-a75.cortex-a55.
>
> The ARM Cortex-A75 is ARM's latest and highest performance applications
> processor. For the initial tuning provided in this patch, I have chosen to
> share the tuning structure with its predecessor, the Cortex-A73.
>
> The ARM Cortex-A55 delivers the best combination of power efficiency
> and performance in its class. For the initial tuning provided in this patch,
> I have chosen to share the tuning structure with its predecessor, the
> Cortex-A53.
>
> Both Cortex-A55 and Cortex-A75 support ARMv8-A with the ARMv8.1-A and
> ARMv8.2-A extensions, along with the cryptography extension, and
> the RCPC extensions from ARMv8.3-A. This is reflected in the patch:
> -mcpu=cortex-a75 is treated as equivalent to passing -mtune=cortex-a75
> -march=armv8.2-a+rcpc .
>
> Tested on aarch64-none-elf with no issues.
>
> OK for trunk?
>
> Thanks,
> James
>
> ---
> 2017-06-20 James Greenhalgh <james.greenha...@arm.com>
>
> * config/aarch64/aarch64-cores.def (cortex-a55): New.
> (cortex-a75): Likewise.
> (cortex-a75.cortex-a55): Likewise.
> * config/aarch64/aarch64-tune.md: Regenerate.
> * doc/invoke.texi (-mtune): Document new values for -mtune.
>

Mostly ok, but...

> 0001-Patch-AArch64-Add-initial-tuning-support-for-Cortex-.patch
>
>
> diff --git a/gcc/config/aarch64/aarch64-cores.def
> b/gcc/config/aarch64/aarch64-cores.def
> index e333d5f..0baa20c 100644
> --- a/gcc/config/aarch64/aarch64-cores.def
> +++ b/gcc/config/aarch64/aarch64-cores.def
> @@ -80,6 +80,12 @@ AARCH64_CORE("vulcan", vulcan, thunderx2t99, 8_1A,
> AARCH64_FL_FOR_ARCH8_1 | AA
> /* Cavium ('C') cores. */
> AARCH64_CORE("thunderx2t99", thunderx2t99, thunderx2t99, 8_1A,
> AARCH64_FL_FOR_ARCH8_1 | AARCH64_FL_CRYPTO, thunderx2t99, 0x43, 0x0af, -1)
>
> +/* ARMv8.2-A Architecture Processors. */
> +
> +/* ARM ('A') cores. */
> +AARCH64_CORE("cortex-a55", cortexa55, cortexa53, 8_2A,
> AARCH64_FL_FOR_ARCH8_2 | AARCH64_FL_RCPC, cortexa53, 0x41, 0xd05, -1)
> +AARCH64_CORE("cortex-a75", cortexa75, cortexa57, 8_2A,
> AARCH64_FL_FOR_ARCH8_2 | AARCH64_FL_RCPC, cortexa73, 0x41, 0xd0a, -1)
> +
> /* ARMv8-A big.LITTLE implementations. */
>
> AARCH64_CORE("cortex-a57.cortex-a53", cortexa57cortexa53, cortexa53, 8A,
> AARCH64_FL_FOR_ARCH8 | AARCH64_FL_CRC, cortexa57, 0x41, AARCH64_BIG_LITTLE
> (0xd07, 0xd03), -1)
> @@ -87,4 +93,8 @@ AARCH64_CORE("cortex-a72.cortex-a53", cortexa72cortexa53,
> cortexa53, 8A, AARCH
> AARCH64_CORE("cortex-a73.cortex-a35", cortexa73cortexa35, cortexa53, 8A,
> AARCH64_FL_FOR_ARCH8 | AARCH64_FL_CRC, cortexa73, 0x41, AARCH64_BIG_LITTLE
> (0xd09, 0xd04), -1)
> AARCH64_CORE("cortex-a73.cortex-a53", cortexa73cortexa53, cortexa53, 8A,
> AARCH64_FL_FOR_ARCH8 | AARCH64_FL_CRC, cortexa73, 0x41, AARCH64_BIG_LITTLE
> (0xd09, 0xd03), -1)
>
> +/* ARM DynamIQ big.LITTLE configurations. */
> +
> +AARCH64_CORE("cortex-a75.cortex-a55", cortexa75cortexa55, cortexa53, 8_2A,
> AARCH64_FL_FOR_ARCH8_2 | AARCH64_FL_RCPC, cortexa73, 0x41, AARCH64_BIG_LITTLE
> (0xd0a, 0xd05), -1)
> +
> #undef AARCH64_CORE
> diff --git a/gcc/config/aarch64/aarch64-tune.md
> b/gcc/config/aarch64/aarch64-tune.md
> index 4209f67..7fcd6cb 100644
> --- a/gcc/config/aarch64/aarch64-tune.md
> +++ b/gcc/config/aarch64/aarch64-tune.md
> @@ -1,5 +1,5 @@
> ;; -*- buffer-read-only: t -*-
> ;; Generated automatically by gentune.sh from aarch64-cores.def
> (define_attr "tune"
> -
> "cortexa35,cortexa53,cortexa57,cortexa72,cortexa73,thunderx,thunderxt88p1,thunderxt88,thunderxt81,thunderxt83,xgene1,falkor,qdf24xx,exynosm1,thunderx2t99p1,vulcan,thunderx2t99,cortexa57cortexa53,cortexa72cortexa53,cortexa73cortexa35,cortexa73cortexa53"
> +
> "cortexa35,cortexa53,cortexa57,cortexa72,cortexa73,thunderx,thunderxt88p1,thunderxt88,thunderxt81,thunderxt83,xgene1,falkor,qdf24xx,exynosm1,thunderx2t99p1,vulcan,thunderx2t99,cortexa55,cortexa75,cortexa57cortexa53,cortexa72cortexa53,cortexa73cortexa35,cortexa73cortexa53,cortexa75cortexa55"
> (const (symbol_ref "((enum attr_tune) aarch64_tune)")))
> diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi
> index 86c8d62..2746c3e 100644
> --- a/gcc/doc/invoke.texi
> +++ b/gcc/doc/invoke.texi
> @@ -14077,17 +14077,19 @@ processors implementing the target architecture.
> @opindex mtune
> Specify the name of the target processor for which GCC should tune the
> performance of the code. Permissible values for this option are:
> -@samp{generic}, @samp{cortex-a35}, @samp{cortex-a53}, @samp{cortex-a57},
> -@samp{cortex-a72}, @samp{cortex-a73}, @samp{exynos-m1},
> -@samp{xgene1}, @samp{vulcan}, @samp{thunderx},
> +@samp{generic}, @samp{cortex-a35}, @samp{cortex-a53}, @samp{cortex-a55},
> +@samp{cortex-a57}, @samp{cortex-a72}, @samp{cortex-a73}, @samp{cortex-a75},
> +@samp{exynos-m1}, @samp{xgene1}, @samp{vulcan}, @samp{thunderx},
> @samp{thunderxt88}, @samp{thunderxt88p1}, @samp{thunderxt81},
> @samp{thunderxt83}, @samp{thunderx2t99}, @samp{cortex-a57.cortex-a53},
> @samp{cortex-a72.cortex-a53}, @samp{cortex-a73.cortex-a35},
> -@samp{cortex-a73.cortex-a53}, @samp{native}.
> +@samp{cortex-a73.cortex-a53}, @samp{cortex-a75.cortex-a55},
> +@samp{native}.
>
> The values @samp{cortex-a57.cortex-a53}, @samp{cortex-a72.cortex-a53},
> -@samp{cortex-a73.cortex-a35}, @samp{cortex-a73.cortex-a53}
> -specify that GCC should tune for a big.LITTLE system.
> +@samp{cortex-a73.cortex-a35}, @samp{cortex-a73.cortex-a53},
> +@samp{cortex-a75.cortex-a55} specify that GCC should tune for a
> +big.LITTLE system.
>
> Additionally on native AArch64 GNU/Linux systems the value
> @samp{native} tunes performance to the host system. This option has no
> effect
> @@ -25607,12 +25609,13 @@ This option instructs GCC to use 128-bit AVX
> instructions instead of
>
> @item -mcx16
> @opindex mcx16
> -This option enables GCC to generate @code{CMPXCHG16B} instructions in 64-bit
> -code to implement compare-and-exchange operations on 16-byte aligned 128-bit
> -objects. This is useful for atomic updates of data structures exceeding one
> -machine word in size. The compiler uses this instruction to implement
> -@ref{__sync Builtins}. However, for @ref{__atomic Builtins} operating on
> -128-bit integers, a library call is always used.
> +This option enables GCC to generate @code{CMPXCHG16B} instructions.
> +@code{CMPXCHG16B} allows for atomic operations on 128-bit double quadword
> +(or oword) data types.
> +This is useful for high-resolution counters that can be updated
> +by multiple processors (or cores). This instruction is generated as part of
> +atomic built-in functions: see @ref{__sync Builtins} or
> +@ref{__atomic Builtins} for details.
>
> @item -msahf
> @opindex msahf
>

I don't think this last hunk should be part of this patch.

OK without that bit...

R.