Re: [PATCH] [Aarch64] Variable shift count truncation issues

2017-06-23 Thread James Greenhalgh
On Fri, Jun 23, 2017 at 10:27:55AM +0100, Michael Collison wrote: > Fixed the "nitpick" issues pointed out by James. Okay for trunk? > > I have a few comments below, which are closer to nitpicking than structural > > issues with the patch. > > > > With those fixed, this is OK to commit. This is

[Patch AArch64 docs] Document the RcPc extension

2017-06-23 Thread James Greenhalgh
. OK? Thanks, James --- 2017-06-21 James Greenhalgh * doc/invoke.texi (rcpc architecture extension): Document it. diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi index 7e7a16a5..db00e51 100644 --- a/gcc/doc/invoke.texi +++ b/gcc/doc/invoke.texi @@ -14172,6 +14172,10 @@ Enable

Re: [PATCH][AArch64] Improve Cortex-A53 shift bypass

2017-06-28 Thread James Greenhalgh
On Wed, Jun 28, 2017 at 01:49:26PM +0100, Wilco Dijkstra wrote: > Ramana Radhakrishnan wrote: > >  > > I'm about to run home for the day but this came in from > > https://gcc.gnu.org/ml/gcc-patches/2013-09/msg02109.html and James > > said in that email that this was put in to ensure no segfaults

Re: [Patch AArch64 docs] Document the RcPc extension

2017-07-03 Thread James Greenhalgh
On Fri, Jun 23, 2017 at 11:21:43AM +0100, James Greenhalgh wrote: > > Hi, > > Andrew pointed out that I did not document the new architecture extension > flag I added for the RcPc extension. This was intentional, as enabling the rcpc > extension does not change GCC code generati

Re: [Patch AArch64 2/2] Fix memory sizes to load/store patterns

2017-07-03 Thread James Greenhalgh
On Wed, Jun 21, 2017 at 11:50:08AM +0100, James Greenhalgh wrote: > *ping* *ping*x2 Thanks, James > On Mon, Jun 12, 2017 at 02:54:00PM +0100, James Greenhalgh wrote: > > > > Hi, > > > > There seems to be a partial misconception in the AArch64 backend that >

Re: [Patch AArch64 2/2] Fix memory sizes to load/store patterns

2017-07-03 Thread James Greenhalgh
On Wed, Jun 21, 2017 at 11:50:08AM +0100, James Greenhalgh wrote: > *ping* Ping*2 Thanks, James > On Mon, Jun 12, 2017 at 02:54:00PM +0100, James Greenhalgh wrote: > > > > Hi, > > > > There seems to be a partial misconception in the AArch64 backend that > >

Re: [Patch AArch64] Stop generating BSL for simple integer code

2017-07-03 Thread James Greenhalgh
On Wed, Jun 21, 2017 at 11:49:07AM +0100, James Greenhalgh wrote: > *ping* *ping*x2 Thanks, James > On Mon, Jun 12, 2017 at 02:44:40PM +0100, James Greenhalgh wrote: > > [Sorry for the re-send. I spotted that the attributes were not right for the > > new pattern I was

[Patch ARM] Add initial tuning for Cortex-A55 and Cortex-A75

2017-07-04 Thread James Greenhalgh
Both Cortex-A55 and Cortex-A75 support ARMv8-A with the ARMv8.1-A and ARMv8.2-A extensions. This is reflected in the patch: -mcpu=cortex-a75 is treated as equivalent to passing -mtune=cortex-a75 -march=armv8.2-a+fp16 OK? Thanks, James --- 2017-07-04 James Greenhalgh * config/arm/arm

Re: [PATCH][AArch64] Improve scheduling model for X-Gene

2017-11-10 Thread James Greenhalgh
nary and faster build > time. Survives bootstrap. I'm trusting your judgment on whether these numbers make sense, as I have no access to specifications for xgene. OK. Reviewed-By: James Greenhalgh > > Best, > Dominik > > gcc/ChangeLog: > 2017-10-09 Dominik Infuehr

Re: [4/4] SVE unwinding

2017-11-10 Thread James Greenhalgh
nd-dw2.c part. Thanks, James Reviewed-by: James Greenhalgh > 2017-11-03 Richard Sandiford > > libgcc/ > * config/aarch64/value-unwind.h (aarch64_vg): New function. > (DWARF_LAZY_REGISTER_VALUE): Define. > * unwind-dw2.c (_Unwind_GetGR): Use DWARF_LAZY_REGIST

Re: [05/nn] [AArch64] Rewrite aarch64_simd_valid_immediate

2017-11-10 Thread James Greenhalgh
o derive them from the old CHECK macros. Thanks for the patch, this is OK for trunk. Reviewed-by: James Greenhalgh James > > > 2017-10-26 Richard Sandiford > Alan Hayward > David Sherwood > > gcc/ > * config/aarch64/aarch64-protos

Re: [03/nn] [AArch64] Rework interface to add constant/offset routines

2017-11-10 Thread James Greenhalgh
n may be used to adjust the stack pointer, we must > - ensure that it cannot cause transient stack deallocation (for example > - by first incrementing SP and then decrementing when adjusting by a > - large immediate). */ This one in particular seems like we'd want it kept nearby the code. OK with some sort of change to make the restrictions on what this code should do clear on both functions. Reviewed-by: James Greenhalgh Thanks, James

Re: [AArch64] Tweak aarch64_classify_address interface

2017-11-10 Thread James Greenhalgh
On Mon, Oct 23, 2017 at 06:58:29PM +0100, Richard Sandiford wrote: > Ping. Makes sense. OK. Reviewed-By: James Greenhalgh James > Richard Sandiford writes: > > Richard Sandiford writes: > >> James Greenhalgh writes: > >>> On Tue, Aug 22, 2017 at 10:23:47

Re: [Patch AArch64] Stop generating BSL for simple integer code

2017-11-14 Thread James Greenhalgh
On Wed, Oct 04, 2017 at 05:44:07PM +0100, James Greenhalgh wrote: > > On Thu, Jul 27, 2017 at 06:49:01PM +0100, James Greenhalgh wrote: > > > > On Mon, Jun 12, 2017 at 02:44:40PM +0100, James Greenhalgh wrote: > > > [Sorry for the re-send. I spotted that the at

Re: [PATCH][GCC][AARCH64]Bad code-gen for structure/block/unaligned memory access

2017-11-14 Thread James Greenhalgh
On Tue, Nov 14, 2017 at 04:05:12PM +, Tamar Christina wrote: > Hi James, > > I have split the aarch64 bit off from the generic parts and processed > your feedback. > > Attached is the reworked patch. > > Ok for trunk? Thanks for the respin, I'm a bit confused by this comment. > diff --

Re: [AARCH64] implements neon vld1_*_x2 intrinsics

2017-11-15 Thread James Greenhalgh
On Wed, Nov 15, 2017 at 09:58:28AM +, Kyrill Tkachov wrote: > Hi Kugan, > > On 07/11/17 04:10, Kugan Vivekanandarajah wrote: > > Hi, > > > > Attached patch implements the vld1_*_x2 intrinsics as defined by the > > neon document. > > > > Bootstrap for the latest patch is ongoing on aarch64-lin

Re: [PATCH, AArch64] Adjust tuning parameters for Falkor

2017-11-17 Thread James Greenhalgh
On Wed, Nov 15, 2017 at 03:00:53AM +, Luis Machado wrote: > > I think the best thing is to leave this tuning structure in place and > > just change default_opt_level to -1 to disable it at -O3. > > > > Thanks, > > Andrew > > > > Indeed that seems to be more appropriate if re-enabling prefe

Re: [PATCH][AArch64] Set SLOW_BYTE_ACCESS

2017-11-17 Thread James Greenhalgh
On Fri, Nov 17, 2017 at 03:21:31PM +, Wilco Dijkstra wrote: > Contrary to all documentation, SLOW_BYTE_ACCESS simply means accessing > bitfields by their declared type, which results in better codegeneration on > practically > any target. > > I'm thinking we should completely remove all trace

Re: [PATCH][aarch64] Put vector fnma instruction into canonical form for better code generation.

2017-11-17 Thread James Greenhalgh
operator on  > the first operand and instead has it on the second.  This  > results in sub-optimal code generation (an extra dup instruction). > > I have moved the 'neg', rebuilt GCC and retested with this patch > There were no regressions.  OK to checkin? OK. T

Re: [PATCH][aarch64] Fix pr81356 - copy empty string with wrz, not a ldrb/strb

2017-11-17 Thread James Greenhalgh
nly note I have on it points at an incorrect PR number. So, I think this is probably a safe and sensible choice. OK. Reviewed-by: James Greenhalgh Thanks, James > > Bootstrapped and tested without regressions, OK to checkin? > > Steve Ellcey > sell...@cavium.com > >

Re: [COMMITTED][AArch64] Fix frame tests

2017-11-17 Thread James Greenhalgh
On Thu, Nov 16, 2017 at 11:34:46AM +, Wilco Dijkstra wrote: > Improve the AArch64 frame tests - add -f(no-)omit-frame-pointer, > update checks and add missing tests. As a result all tests now > pass. > > Committed as obvious. Some of these are far from obvious... Even if they were obvious to

Re: [GCC][PATCH][AArch64] Add negative tests for dotprod and set minimum version to v8.2 in the target bit.

2017-11-17 Thread James Greenhalgh
On Tue, Nov 14, 2017 at 03:54:56PM +, Tamar Christina wrote: > Hi All, > > Dot Product is intended to only be available for Armv8.2-a and newer. > While this restriction is reflected in the intrinsics, the patterns > themselves were missing the Armv8.2-a bit. > > This means that using -march=

Re: [PATCH][AArch64] Restrict POST_INC operand in aarch64_simd_mem_operand_p to register

2017-11-17 Thread James Greenhalgh
doesn’t seem to check > POST_INC’s operand. Here is a patch that fixes this for me, although I am not > sure if this is the right way to address this. GCC bootstraps and it causes > no test regressions. OK! Reviewed-by: James Greenhalgh Thanks, James > > Dominik > > Chan

Re: [Patch][aarch64] Use IFUNCs to enable LSE instructions in libatomic on aarch64

2017-11-20 Thread James Greenhalgh
On Mon, Nov 20, 2017 at 05:39:25PM +, Steve Ellcey wrote: > Re-ping with a CC to the Aarch64 maintainers. If I'm completely honest with myself, I don't know enough about this area to review the patch. Szabolcs' OK holds a lot of weight with me, but I'd like to understand more of the top-level

Re: [RFA][PATCH] Stack clash protection 07/08 -- V4 (aarch64 bits)

2017-11-21 Thread James Greenhalgh
I've finally built up enough courage to start getting my head around this... I see one outstanding issue sitting on this patch version: On Sat, Oct 28, 2017 at 05:08:54AM +0100, Jeff Law wrote: > On 10/13/2017 02:26 PM, Wilco Dijkstra wrote: > > --param=stack-clash-protection-probe-interval=13 >

Re: [Patch][aarch64] Use IFUNCs to enable LSE instructions in libatomic on aarch64

2017-11-21 Thread James Greenhalgh
On Mon, Nov 20, 2017 at 07:22:15PM +, Steve Ellcey wrote: > On Mon, 2017-11-20 at 18:27 +0000, James Greenhalgh wrote: > > > > If you have the time, would you mind giving me a quick run-down of what > > design decisions went in to this patch, and why they are the right t

Re: [PATCH][GCC][DOCS][AArch64][ARM] Documentation updates adding -A extensions.

2017-11-27 Thread James Greenhalgh
On Wed, Nov 15, 2017 at 11:51:15AM +, Tamar Christina wrote: > Hi All, > > This patch updates the documentation for AArch64 and ARM correcting the use > of the > architecture namings by adding the -A suffix in appropriate places. > > Build done on aarch64-none-elf and arm-none-eabi and no is

Re: [RFA][PATCH] Stack clash protection 07/08 -- V4 (aarch64 bits)

2017-11-27 Thread James Greenhalgh
On Wed, Nov 22, 2017 at 06:28:24PM +, Jeff Law wrote: > On 11/21/2017 04:57 AM, James Greenhalgh wrote: > > I've finally built up enough courage to start getting my head around this... > Can't blame you for avoiding :-) This stuff isn't my idea of fun either. Ri

Re: [PATCH][AArch64] Fix ICE due to store_pair_lanes

2017-11-28 Thread James Greenhalgh
On Mon, Nov 27, 2017 at 03:20:29PM +, Wilco Dijkstra wrote: > The recently added store_pair_lanes causes ICEs in output_operand. > This is due to aarch64_classify_address treating it like a 128-bit STR > rather than a STP. The valid immediate offsets don't fully overlap, > causing it to return

Re: [PATCH][AArch64] Fix address printing on ILP32

2017-12-01 Thread James Greenhalgh
On Thu, Nov 30, 2017 at 05:27:47PM +, Wilco Dijkstra wrote: > Fix address printing for ILP32. The md file uses 'a' in assembler > templates for symbolic addresses in adrp/add, which end up calling > aarch64_print_operand_address. However in ILP32 these are not valid > memory addresses (being

Re: [AArch64] Fix some define_insn_and_split conditions

2017-12-05 Thread James Greenhalgh
On Tue, Dec 05, 2017 at 02:28:56PM +, Richard Sandiford wrote: > The split conditions for aarch64_simd_bsldi_internal and > aarch64_simd_bsldi_alt were: > > "&& GP_REGNUM_P (REGNO (operands[0]))" > > But since they (deliberately) can be split before reload, the operand > matched by register

Re: [Patch][aarch64] Add missing thunderx2-t99 instruction scheduling pipeline descriptions.

2017-12-05 Thread James Greenhalgh
On Mon, Dec 04, 2017 at 05:33:37PM +, Steve Ellcey wrote: > On Mon, 2017-12-04 at 17:18 +, Kyrill Tkachov wrote: > > > > +(define_insn_reservation "thunderx2t99_multiple" 1 > > > +  (and (eq_attr "tune" "thunderx2t99") > > > +   (eq_attr "type" "multiple")) > > > +  "thunderx2t99_i0+th

Re: [AArch64] Fix ICEs in aarch64_print_operand

2017-12-07 Thread James Greenhalgh
On Tue, Dec 05, 2017 at 05:57:37PM +, Richard Sandiford wrote: > Three related regression fixes: > > - We can't use asserts like: > > gcc_assert (GET_MODE_SIZE (mode) == 16); > > in aarch64_print_operand because it could trigger for invalid user input. > > - The output_operand_lossage

Re: [Patch][aarch64] Use IFUNCs to enable LSE instructions in libatomic on aarch64

2017-12-07 Thread James Greenhalgh
On Fri, Sep 29, 2017 at 09:29:37PM +0100, Steve Ellcey wrote: > On Thu, 2017-09-28 at 12:31 +0100, Szabolcs Nagy wrote: > >  > > i think this should be improved, see below. > > diff --git a/libatomic/Makefile.am b/libatomic/Makefile.am > index d731406..92d19c6 100644 > --- a/libatomic/Makefile.am

Re: [patch, fortran] Implement maxval for characters

2017-12-11 Thread James Greenhalgh
On Wed, Dec 06, 2017 at 11:38:21AM +, Christophe Lyon wrote: > Hi, > > > On 28 November 2017 at 19:40, Thomas Koenig wrote: > > Hello world, > > > > the attached patch implements maxval for characters, an F2003 feature > > that we were missing up to now. > > > > Regression-tested on x86_64-p

[Patch combine] Don't create vector mode ZERO_EXTEND from subregs

2017-12-11 Thread James Greenhalgh
eeing on a branch in which I'm trying to tackle some performance regressions, so I have no live testcase for this, but it is wrong by observation. Tested on aarch64-none-elf and bootstrapped on aarch64-none-linux-gnu with no issues. OK? Thanks, James --- 2017-12-11 James Greenhalgh

[patch AArch64] Do not perform a vector splat for vector initialisation if it is not useful

2017-12-11 Thread James Greenhalgh
tends. Are the non-AArch64 parts OK? Thanks, James --- 2017-12-11 James Greenhalgh * config/aarch64/aarch64.c (aarch64_expand_vector_init): Modify code generation for cases where splatting a value is not useful. * simplify-rtx.c (simplify_ternary_operation): Sim

Re: [PATCH][AArch64] Specify fp16 support for Cortex-A55 and Cortex-A75

2017-12-12 Thread James Greenhalgh
On Mon, Dec 11, 2017 at 01:44:23PM +, Kyrill Tkachov wrote: > Hi all, > > The Cortex-A55 and Cortex-A75 processors support the fp16 extension. > We already specify them as such in the arm port. > This patch makes aarch64 consistent on this front. > > Bootstrapped and tested on aarch64-none-li

Re: [PATCH PR81228][AARCH64] Fix ICE by adding LTGT in vec_cmp

2017-12-13 Thread James Greenhalgh
On Wed, Dec 13, 2017 at 04:45:33PM +, Sudi Das wrote: > On 13/12/17 16:42, Sudakshina Das wrote: > > Hi > > > > This patch is a follow up to the existing discussions on > > https://gcc.gnu.org/ml/gcc-patches/2017-07/msg01904.html > > Bin had earlier submitted a patch to fix the ICE that occur

Re: [Patch][Aarch64] Fix aarch64 libatomic build with older binutils

2017-12-14 Thread James Greenhalgh
On Thu, Dec 07, 2017 at 11:56:55PM +, Steve Ellcey wrote: > James, > > Here is a patch that will turn off the use of IFUNC and the LSE > instructions in libatomic if the compiler/assembler toolchain do not > understand the '-march=armv8-a+lse' option (changed from > -march=armv8.1-a).  Rather

Re: [Patch combine] Don't create vector mode ZERO_EXTEND from subregs

2017-12-21 Thread James Greenhalgh
On Sun, Dec 17, 2017 at 03:14:08AM +, Segher Boessenkool wrote: > Hi! > > On Mon, Dec 11, 2017 at 02:18:53PM +0000, James Greenhalgh wrote: > > > > In simplify_set we try transforming the paradoxical subreg expression: > > > > (set FOO (subreg:M (mem:N BAR)

Re: [Patch][Aarch64] Fix multi-arch support in ILP32 mode

2017-12-21 Thread James Greenhalgh
On Thu, Dec 21, 2017 at 06:56:22PM +, Steve Ellcey wrote: > This one line patch for multi-arch support on Aarch64 and ILP32 was > submitted over a year ago and pinged a number of times since then, > since no one has objected and since it is only one line I am going > to check it in as an obvious

Re: PING: [11/nn] [AArch64] Set NUM_POLY_INT_COEFFS to 2

2018-01-06 Thread James Greenhalgh
On Fri, Jan 05, 2018 at 11:26:59AM +, Richard Sandiford wrote: > Ping. Here's the patch updated to apply on top of the v8.4 and > __builtin_load_no_speculate support. > > Richard Sandiford writes: > > This patch switches the AArch64 port to use 2 poly_int coefficients > > and updates code as

Re: [2/4] [AArch64] Testsuite markup for SVE

2018-01-06 Thread James Greenhalgh
On Fri, Nov 03, 2017 at 05:49:56PM +, Richard Sandiford wrote: > This patch adds new target selectors for SVE and updates existing > selectors accordingly. It also XFAILs some tests that don't yet > work for some SVE modes; most of these go away with follow-on > vectorisation enhancements. OK

Re: [3/4] [AArch64] SVE tests

2018-01-06 Thread James Greenhalgh
On Fri, Nov 03, 2017 at 05:50:54PM +, Richard Sandiford wrote: > This patch adds gcc.target/aarch64 tests for SVE, and forces some > existing Advanced SIMD tests to use -march=armv8-a. I'm going to assume that these new testcases are broadly sensible, and not spend any significant time looking

Re: [0/4] [AArch64] Add SVE support

2018-01-06 Thread James Greenhalgh
On Fri, Nov 24, 2017 at 03:59:58PM +, Richard Sandiford wrote: > Richard Sandiford writes: > > This series adds support for ARM's Scalable Vector Extension. > > More details on SVE can be found here: > > > > > > https://developer.arm.com/products/architecture/a-profile/docs/arm-architecture

Re: SLP reductions with variable-length vectors

2018-01-07 Thread James Greenhalgh
On Thu, Dec 14, 2017 at 12:43:11AM +, Jeff Law wrote: > On 11/22/2017 11:10 AM, Richard Sandiford wrote: > > Richard Sandiford writes: > >> Two things stopped us using SLP reductions with variable-length vectors: > >> > >> (1) We didn't have a way of constructing the initial vector. > >> T

Re: Add support for bitwise reductions

2018-01-07 Thread James Greenhalgh
On Thu, Dec 14, 2017 at 12:36:58AM +, Jeff Law wrote: > On 11/22/2017 11:12 AM, Richard Sandiford wrote: > > Richard Sandiford writes: > >> This patch adds support for the SVE bitwise reduction instructions > >> (ANDV, ORV and EORV). It's a fairly mechanical extension of existing > >> REDUC_*
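
For illustration only, here is a minimal C sketch of the kind of scalar loops that bitwise reductions describe. The function names are hypothetical and are not taken from the patch; whether a given compiler turns these loops into ANDV, ORV or EORV depends on the target and options.

```c
#include <stddef.h>
#include <stdint.h>

/* AND reduction: start from the identity for '&' (all bits set). */
uint32_t reduce_and(const uint32_t *a, size_t n)
{
    uint32_t acc = UINT32_MAX;
    for (size_t i = 0; i < n; i++)
        acc &= a[i];
    return acc;
}

/* OR reduction: start from the identity for '|' (zero). */
uint32_t reduce_or(const uint32_t *a, size_t n)
{
    uint32_t acc = 0;
    for (size_t i = 0; i < n; i++)
        acc |= a[i];
    return acc;
}

/* XOR (exclusive OR) reduction: the identity is also zero. */
uint32_t reduce_xor(const uint32_t *a, size_t n)
{
    uint32_t acc = 0;
    for (size_t i = 0; i < n; i++)
        acc ^= a[i];
    return acc;
}
```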

Re: Add support for fully-predicated loops

2018-01-07 Thread James Greenhalgh
On Mon, Dec 18, 2017 at 07:40:00PM +, Jeff Law wrote: > On 11/17/2017 07:56 AM, Richard Sandiford wrote: > > This patch adds support for using a single fully-predicated loop instead > > of a vector loop and a scalar tail. An SVE WHILELO instruction generates > > the predicate for each iteratio
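
As a rough scalar model of the idea, and not the vectoriser's actual output: a WHILELO-style predicate marks lane k active when i + k is below the trip count, so the final partial iteration needs no scalar tail. The lane count VL and both function names are assumptions made for this sketch; real SVE vector lengths are only known at run time.

```c
#include <stdbool.h>
#include <stddef.h>

#define VL 4 /* assumed lane count for this model only */

/* Model of a WHILELO-style predicate: lane k is active iff i + k < n.
   Written as k < n - i to avoid wrap-around in the addition. */
static void whilelo(bool pred[VL], size_t i, size_t n)
{
    for (size_t k = 0; k < VL; k++)
        pred[k] = (i < n) && (k < n - i);
}

/* One "vector" loop handles every element; inactive lanes are skipped,
   so no separate scalar tail loop is needed. */
void add_one_predicated(int *a, size_t n)
{
    for (size_t i = 0; i < n; i += VL) {
        bool pred[VL];
        whilelo(pred, i, n);
        for (size_t k = 0; k < VL; k++)
            if (pred[k])
                a[i + k] += 1;
    }
}
```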

Re: [3/4] [AArch64] SVE tests

2018-01-07 Thread James Greenhalgh
On Sat, Jan 06, 2018 at 07:13:22PM +, Richard Sandiford wrote: > James Greenhalgh writes: > > On Fri, Nov 03, 2017 at 05:50:54PM +, Richard Sandiford wrote: > >> This patch adds gcc.target/aarch64 tests for SVE, and forces some > >> existing Advanced SIMD

Re: Add support for reductions in fully-masked loops

2018-01-07 Thread James Greenhalgh
On Wed, Dec 13, 2017 at 04:34:34PM +, Jeff Law wrote: > On 11/17/2017 07:59 AM, Richard Sandiford wrote: > > This patch removes the restriction that fully-masked loops cannot > > have reductions. The key thing here is to make sure that the > > reduction accumulator doesn't include any values a

Re: Add an empty_mask_is_expensive hook

2018-01-07 Thread James Greenhalgh
On Fri, Nov 17, 2017 at 06:12:49PM +, Jeff Law wrote: > On 11/17/2017 08:15 AM, Richard Sandiford wrote: > > This patch adds a hook to control whether we avoid executing masked > > (predicated) stores when the mask is all false. We don't want to do > > that by default for SVE. > > > > Tested

Re: Add support for vectorising live-out values using SVE LASTB

2018-01-07 Thread James Greenhalgh
On Wed, Dec 13, 2017 at 04:36:47PM +, Jeff Law wrote: > On 11/17/2017 08:24 AM, Richard Sandiford wrote: > > This patch uses the SVE LASTB instruction to optimise cases in which > > a value produced by the final scalar iteration of a vectorised loop is > > live outside the loop. Previously thi
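
A minimal C sketch of a loop with a live-out value, for illustration; the function name is hypothetical and not from the patch. The variable last is written in every iteration and read after the loop, so only the value from the final scalar iteration matters to the caller.

```c
#include <stddef.h>

/* 'last' is live outside the loop: the caller sees the value computed
   by the final iteration (or 0 when n is 0). */
int last_doubled(const int *a, int *out, size_t n)
{
    int last = 0;
    for (size_t i = 0; i < n; i++) {
        last = a[i] * 2;
        out[i] = last;
    }
    return last;
}
```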

Re: Add support for conditional reductions using SVE CLASTB

2018-01-07 Thread James Greenhalgh
On Wed, Dec 13, 2017 at 04:59:00PM +, Jeff Law wrote: > On 11/17/2017 08:29 AM, Richard Sandiford wrote: > > This patch uses SVE CLASTB to optimise conditional reductions. It means > > that we no longer need to maintain a separate index vector to record > > the most recent valid value, and no
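
For illustration, a conditional reduction in plain C looks roughly like the loop below: the result is the last element that satisfied the condition, or the initial value if none did. The function name and the -1 default are assumptions for this sketch.

```c
#include <stddef.h>

/* Returns the last element of a[] that is below 'limit',
   or -1 if no element qualifies. */
int last_below(const int *a, size_t n, int limit)
{
    int last = -1;
    for (size_t i = 0; i < n; i++)
        if (a[i] < limit)
            last = a[i];
    return last;
}
```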

Re: Rework the legitimize_address_displacement hook

2018-01-07 Thread James Greenhalgh
On Fri, Nov 17, 2017 at 07:45:28PM +, Jeff Law wrote: > On 11/17/2017 09:03 AM, Richard Sandiford wrote: > > This patch: > > > > - tweaks the handling of legitimize_address_displacement > > so that it gets called before rather than after the address has > > been expanded. This means that

Re: Add support for SVE gather loads

2018-01-07 Thread James Greenhalgh
On Thu, Dec 14, 2017 at 01:16:26AM +, Jeff Law wrote: > On 11/17/2017 02:58 PM, Richard Sandiford wrote: > > This patch adds support for SVE gather loads. It uses basically > > the same analysis code as the AVX gather support, but after that > > there are two major differences: > > > > -
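
As a sketch of the access pattern a gather load covers, written in scalar C with a hypothetical function name: each destination element is read from a source position chosen by an index array, so consecutive loads are not contiguous.

```c
#include <stddef.h>

/* Gather: dst[i] is loaded from an arbitrary position src[idx[i]]. */
void gather(int *dst, const int *src, const size_t *idx, size_t n)
{
    for (size_t i = 0; i < n; i++)
        dst[i] = src[idx[i]];
}
```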

Re: Add support for SVE scatter stores

2018-01-07 Thread James Greenhalgh
On Thu, Dec 14, 2017 at 12:34:54AM +, Jeff Law wrote: > On 11/17/2017 03:10 PM, Richard Sandiford wrote: > > This is mostly a mechanical extension of the previous gather load > > support to scatter stores. The internal functions in this case are: > > > > IFN_SCATTER_STORE (base, offsets, sc
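
The mirror-image pattern for a scatter store, again as a scalar C sketch with a hypothetical function name: consecutive source elements are written to destination positions chosen by an index array. The sketch assumes the indices are distinct and in bounds.

```c
#include <stddef.h>

/* Scatter: src[i] is stored to an arbitrary position dst[idx[i]]. */
void scatter(int *dst, const int *src, const size_t *idx, size_t n)
{
    for (size_t i = 0; i < n; i++)
        dst[idx[i]] = src[i];
}
```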

Re: Handle more SLP constant and extern definitions for variable VF

2018-01-07 Thread James Greenhalgh
On Mon, Dec 11, 2017 at 11:04:28PM +, Jeff Law wrote: > On 11/09/2017 07:20 AM, Richard Sandiford wrote: > > This patch adds support for vectorising SLP definitions that are > > constant or external (i.e. from outside the loop) when the vectorisation > > factor isn't known at compile time. It

Re: Add support for masked load/store_lanes

2018-01-07 Thread James Greenhalgh
On Tue, Dec 12, 2017 at 12:59:33AM +, Jeff Law wrote: > On 11/17/2017 02:36 AM, Richard Sandiford wrote: > > Richard Sandiford writes: > >> This patch adds support for vectorising groups of IFN_MASK_LOADs > >> and IFN_MASK_STOREs using conditional load/store-lanes instructions. > >> This requi

Re: Allow the number of iterations to be smaller than VF

2018-01-07 Thread James Greenhalgh
On Mon, Nov 20, 2017 at 12:12:38AM +, Jeff Law wrote: > On 11/17/2017 08:11 AM, Richard Sandiford wrote: > > Fully-masked loops can be profitable even if the iteration > > count is smaller than the vectorisation factor. In this case > > we're effectively doing a complete unroll followed by SLP

Re: Handle peeling for alignment with masking

2018-01-07 Thread James Greenhalgh
On Thu, Dec 14, 2017 at 12:12:01AM +, Jeff Law wrote: > On 11/17/2017 08:13 AM, Richard Sandiford wrote: > > This patch adds support for aligning vectors by using a partial > > first iteration. E.g. if the start pointer is 3 elements beyond > > an aligned address, the first iteration will have

Re: Allow single-element interleaving for non-power-of-2 strides

2018-01-07 Thread James Greenhalgh
On Fri, Nov 17, 2017 at 06:40:13PM +, Jeff Law wrote: > On 11/17/2017 08:33 AM, Richard Sandiford wrote: > > This allows LD3 to be used for isolated a[i * 3] accesses, in a similar > > way to the current a[i * 2] and a[i * 4] for LD2 and LD4 respectively. > > Given the problems with the cost mo
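
A minimal example of an isolated a[i * 3] access, for illustration; the function name is hypothetical. Only one of every three elements is used, which is the single-element interleaving case with a non-power-of-2 stride.

```c
#include <stddef.h>

/* Sums a[0], a[3], a[6], ...; 'n' counts the elements summed,
   so a[] must hold at least 3 * (n - 1) + 1 entries. */
int sum_every_third(const int *a, size_t n)
{
    int sum = 0;
    for (size_t i = 0; i < n; i++)
        sum += a[i * 3];
    return sum;
}
```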

Re: Use single-iteration epilogues when peeling for gaps

2018-01-07 Thread James Greenhalgh
On Wed, Dec 13, 2017 at 04:42:02PM +, Jeff Law wrote: > On 11/17/2017 08:38 AM, Richard Sandiford wrote: > > This patch adds support for fully-masking loops that require peeling > > for gaps. It peels exactly one scalar iteration and uses the masked > > loop to handle the rest. Previously we

Re: Use gather loads for strided accesses

2018-01-07 Thread James Greenhalgh
On Wed, Dec 13, 2017 at 04:47:40PM +, Jeff Law wrote: > On 11/17/2017 03:02 PM, Richard Sandiford wrote: > > This patch tries to use gather loads for strided accesses, > > rather than falling back to VMAT_ELEMENTWISE. > > > > Tested on aarch64-linux-gnu (with and without SVE), x86_64-linux-gnu

Re: Allow gather loads to be used for grouped accesses

2018-01-07 Thread James Greenhalgh
On Wed, Dec 13, 2017 at 04:53:27PM +, Jeff Law wrote: > On 11/17/2017 03:04 PM, Richard Sandiford wrote: > > Following on from the previous patch for strided accesses, this patch > > allows gather loads to be used with grouped accesses, if we otherwise > > would need to fall back to VMAT_ELEMEN

Re: Support for aliasing with variable strides

2018-01-07 Thread James Greenhalgh
On Thu, Dec 14, 2017 at 11:00:36AM +, Richard Biener wrote: > On Fri, Nov 17, 2017 at 11:17 PM, Richard Sandiford > wrote: > > This patch adds runtime alias checks for loops with variable strides, > > so that we can vectorise them even without a restrict qualifier. > > There are several parts
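
To make the situation concrete, here is a hedged C sketch of a loop with a variable stride and no restrict qualifier; the function name is hypothetical. Because the stride is only known at run time and the pointers may overlap, a compiler can only vectorise such a loop if it checks for aliasing at run time.

```c
#include <stddef.h>

/* Neither pointer is 'restrict' and 'stride' is a run-time value, so
   overlap between the dst and src accesses cannot be ruled out
   at compile time. */
void scale_strided(int *dst, const int *src, size_t n, size_t stride)
{
    for (size_t i = 0; i < n; i++)
        dst[i * stride] = src[i * stride] * 2;
}
```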

Re: [0/4] [AArch64] Add SVE support

2018-01-07 Thread James Greenhalgh
(Resending; this bounced) On Sat, Jan 06, 2018 at 07:39:46PM +, Richard Sandiford wrote: > James Greenhalgh writes: > > On Fri, Nov 24, 2017 at 03:59:58PM +, Richard Sandiford wrote: > >> Richard Sandiford writes: > >> > This series adds support for A

Re: [PATCH][AArch64] Support for LDP/STP of Q-registers

2018-06-05 Thread James Greenhalgh
On Tue, Jun 05, 2018 at 11:32:06AM -0500, Kyrill Tkachov wrote: > > On 04/06/18 18:40, Kyrill Tkachov wrote: > > Hi all, > > > > This patch adds support for generating LDPs and STPs of Q-registers. > > This allows for more compact code generation and makes better use of the > > ISA. > > > > It's

Re: [Patch][Aarch64][PR target/79924] Cannot translate diagnostics

2018-06-05 Thread James Greenhalgh
On Tue, Jun 05, 2018 at 12:40:01PM -0500, Steve Ellcey wrote: > On Tue, 2018-06-05 at 13:23 +0100, Richard Sandiford wrote: > >  > > This regresses a couple of things: > > > > - before the patch, the option would be properly quoted, whereas now > >   it's unquoted.  Either keeping %qs or using %<.

Re: [PATCH][Aarch64] v2: Arithmetic overflow common functions [Patch 1/4]

2018-06-07 Thread James Greenhalgh
On Wed, Jun 06, 2018 at 12:14:03PM -0500, Michael Collison wrote: > This is a respin of an AArch64 patch that adds support for builtin arithmetic > overflow operations. This update separates the patch into multiple pieces and > addresses comments made by Richard Earnshaw here: > > https://gcc.gnu
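
For readers unfamiliar with the builtins in question, a short C sketch of how the GCC arithmetic overflow builtins are used from source code. The wrapper names checked_add and checked_sub are hypothetical; __builtin_add_overflow and __builtin_sub_overflow are the documented GCC builtins, which return true when the operation overflows.

```c
#include <stdbool.h>
#include <stdint.h>

/* Returns true when a + b fits in int32_t, with the sum in *result.
   On overflow it returns false and *result holds the wrapped value. */
bool checked_add(int32_t a, int32_t b, int32_t *result)
{
    return !__builtin_add_overflow(a, b, result);
}

/* Same contract for a - b. */
bool checked_sub(int32_t a, int32_t b, int32_t *result)
{
    return !__builtin_sub_overflow(a, b, result);
}
```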

Re: [PATCH][Aarch64] v2: Arithmetic overflow addv patterns [Patch 2/4]

2018-06-07 Thread James Greenhalgh
On Wed, Jun 06, 2018 at 12:16:22PM -0500, Michael Collison wrote: > This is a respin of an AArch64 patch that adds support for builtin arithmetic > overflow operations. This update separates the patch into multiple pieces and > addresses comments made by Richard Earnshaw here: > > https://gcc.gnu

Re: [PATCH][Aarch64] v2: Arithmetic overflow subv patterns [Patch 3/4]

2018-06-07 Thread James Greenhalgh
On Wed, Jun 06, 2018 at 12:19:52PM -0500, Michael Collison wrote: > This is a respin of an AArch64 patch that adds support for builtin arithmetic > overflow operations. This update separates the patch into multiple pieces and > addresses comments made by Richard Earnshaw here: > > https://gcc.gnu

Re: [PATCH][Aarch64] v2: Arithmetic overflow tests [Patch 4/4]

2018-06-07 Thread James Greenhalgh
On Wed, Jun 06, 2018 at 12:20:59PM -0500, Michael Collison wrote: > This is a respin of an AArch64 patch that adds support for builtin arithmetic > overflow operations. This update separates the patch into multiple pieces and > addresses comments made by Richard Earnshaw here: > > https://gcc.gnu

Re: [PATCH] [aarch64] Remove obsolete comment about X30

2018-06-19 Thread James Greenhalgh
On Mon, Jun 18, 2018 at 08:43:04AM -0500, Siddhesh Poyarekar wrote: > r217431 changed X30 as caller-saved in CALL_USE_REGISTERS because of > which this comment about X30 not being marked as call-clobbered is no > longer accurate. Is the second paragraph still relevant to how we define EPILOGUE_

Re: [PATCH][AArch64] Support for LDP/STP of Q-registers

2018-06-19 Thread James Greenhalgh
On Thu, Jun 07, 2018 at 05:58:01AM -0500, Kyrill Tkachov wrote: > > On 05/06/18 18:28, James Greenhalgh wrote: > > On Tue, Jun 05, 2018 at 11:32:06AM -0500, Kyrill Tkachov wrote: > >> On 04/06/18 18:40, Kyrill Tkachov wrote: > >>> Hi all, > >>> >

Re: [PATCH][GCC][AArch64] Simplify movmem code by always doing overlapping copies when larger than 8 bytes.

2018-06-19 Thread James Greenhalgh
On Tue, Jun 19, 2018 at 09:09:27AM -0500, Tamar Christina wrote: > Hi All, > > This changes the movmem code in AArch64 that does copy for data between 4 and > 7 > bytes to use the smallest possible mode capable of copying the remaining > bytes. > > This means that if we're copying 5 bytes we wo

Re: [AArch64][PATCH 1/2] Make AES unspecs commutative

2018-06-19 Thread James Greenhalgh
On Mon, Jun 18, 2018 at 04:38:27AM -0500, Andre Simoes Dias Vieira wrote: > Hi, > > This patch teaches the AArch64 backend that the AESE and AESD unspecs are > commutative (which correspond to the vaeseq_u8 and vaesdq_u8 intrinsics). > This improves register allocation around their corresponding i

Re: [AArch64][PATCH 2/2] Combine AES instructions with xor and zero operands

2018-06-19 Thread James Greenhalgh
On Mon, Jun 18, 2018 at 04:38:44AM -0500, Andre Simoes Dias Vieira wrote: > Hi, > > This patch teaches the AArch64 backend that AES instructions with a XOR and > zero operands can be simplified by replacing the operands of the AES with > XOR's thus eliminating the XOR. This is OK because the AES i

Re: [PATCH][AARCH64] PR target/84521 Fix frame pointer corruption with -fomit-frame-pointer with __builtin_setjmp

2018-06-26 Thread James Greenhalgh
On Mon, Jun 25, 2018 at 04:24:14AM -0500, Sudakshina Das wrote: > PING! > > On 14/06/18 12:10, Sudakshina Das wrote: > > Hi Eric > > > > On 07/06/18 16:33, Eric Botcazou wrote: > >>> Sorry this fell off my radar. I have reg-tested it on x86 and tried it > >>> on the sparc machine from the gcc far

Re: [PATCH v2] Add HXT Phecda core support

2018-06-26 Thread James Greenhalgh
On Fri, Jun 22, 2018 at 02:52:33AM -0500, Hongbo Zhang wrote: > HXT semiconductor's CPU core Phecda, as a variant of Qualcomm qdf24xx, > reuses the same tuning structure and pipeline with it. OK. Thanks, James > 2018-06-19 Hongbo Zhang > > * config/aarch64/aarch64-cores.def (AARCH64_CO

Re: [PATCH] [aarch64] Remove obsolete comment about X30

2018-06-26 Thread James Greenhalgh
On Wed, Jun 20, 2018 at 04:41:18AM -0500, Siddhesh Poyarekar wrote: > On 06/19/2018 09:11 PM, James Greenhalgh wrote: > > On Mon, Jun 18, 2018 at 08:43:04AM -0500, Siddhesh Poyarekar wrote: > >> r217431 changed X30 as caller-saved in CALL_USE_REGISTERS because of > >> w

Re: [PATCH][GCC][AArch64] Add SIMD to REG pattern for movhf without armv8.2-a support (PR85769)

2018-06-26 Thread James Greenhalgh
On Wed, Jun 20, 2018 at 05:15:37AM -0500, Tamar Christina wrote: > Hi Kyrill, > > Many thanks for the review! > > The 06/19/2018 16:47, Kyrill Tkachov wrote: > > Hi Tamar, > > > > On 19/06/18 15:07, Tamar Christina wrote: > > > Hi All, > > > > > > This fixes a regression where we don't have an i

Re: [PATCH][AArch64] Add support for Arm Cortex-A76

2018-06-27 Thread James Greenhalgh
On Wed, Jun 27, 2018 at 04:50:33AM -0500, Kyrill Tkachov wrote: > Hi all, > > The Cortex-A76 is an Armv8.2-A processor with dotproduct and FP16 support. > It can be paired with the Cortex-A55 and hence the option > -mcpu/-mtune=cortex-a76.cortex-a55 is also introduced. > > Bootstrapped and tested

Re: [13/13] [AArch64] Use vec_perm_indices helper routines

2018-01-09 Thread James Greenhalgh
On Thu, Jan 04, 2018 at 11:27:56AM +0000, Richard Sandiford wrote: > Ping**2 This is OK. It took me a while to get the hang of the interface - a worked example in the comment in vec-perm-indices.c would probably have been helpful. It took until your code for REV for this to really make sense to m

Re: [AArch64] Reject (high (const (plus anchor offset)))

2018-01-09 Thread James Greenhalgh
On Thu, Jan 04, 2018 at 06:15:58PM +0000, Richard Sandiford wrote: > The aarch64_legitimate_constant_p tests for HIGH and CONST seem > to be the wrong way round: (high (const ...)) is valid rtl that > could be passed in, but (const (high ...)) isn't. As it stands, > we disallow anchor+offset but a

Re: [PATCH 1/5][AArch64] Crypto command line split

2018-01-09 Thread James Greenhalgh
On Wed, Jan 03, 2018 at 05:21:27PM +0000, Michael Collison wrote: > Hi all, > > This patch adds two new command line options for the legacy cryptographic > extensions AES (+aes) and SHA-1/SHA-2 (+sha2). Backward compatibility is > retained by modifying the +crypto feature modifier to enable +aes a

Re: [PATCH 2/5][AArch64] Add v8.4 architecture

2018-01-09 Thread James Greenhalgh
On Wed, Jan 03, 2018 at 05:25:17PM +0000, Michael Collison wrote: > Hi all, > > This patch adds support for the Arm architecture v8.4. A new command line > option, -march=armv8.4-a, is added as well as documentation. > > Bootstrapped on aarch64-none-elf. Tested with new binutils and verified all

Re: [PATCH 3/5][AArch64] Crypto SM4 Support

2018-01-09 Thread James Greenhalgh
On Wed, Jan 03, 2018 at 05:25:57PM +0000, Michael Collison wrote: > Hi All, > > This patch adds support for the SM3/SM4 cryptographic instructions added in > Armv8.4-a. Support for the new instructions is in the form of new ACLE > intrinsics. A new command line feature modifier, +sm4, is added to

Re: [PATCH 4/5][AArch64] Crypto sha512 and sha3

2018-01-09 Thread James Greenhalgh
On Wed, Jan 03, 2018 at 05:30:33PM +0000, Michael Collison wrote: > Hi All, > > This patch adds support for the SHA-512 and SHA-3 instructions added in > Armv8.4-a. Support for the new instructions is in the form of new ACLE > intrinsics. A new command line feature modifier, +sha3, is added to ena

Re: [PATCH][GCC][AArch64] Enable dotproduct by default for Armv8.4-a

2018-01-09 Thread James Greenhalgh
On Tue, Jan 09, 2018 at 10:36:23AM +0000, Tamar Christina wrote: > Hi All, > > This patch makes the Dot Product extension mandatory on Armv8.4-A. > > Regtested on aarch64-none-elf and no regressions. OK. Thanks, James > gcc/ > 2018-01-09 Tamar Christina > > * config/aarch64/aarch64.h

Re: [1/4] [AArch64] SVE backend support

2018-01-10 Thread James Greenhalgh
On Fri, Jan 05, 2018 at 11:41:25AM +0000, Richard Sandiford wrote: > Here's the patch updated to apply on top of the v8.4 and > __builtin_load_no_speculate support. It also handles the new > vec_perm_indices and CONST_VECTOR encoding and uses VNx... names > for the SVE modes. > > Richard Sandifor

Re: [PATCH 5/5][AArch64] fp16fml support

2018-01-10 Thread James Greenhalgh
On Tue, Jan 09, 2018 at 06:28:09PM +0000, Michael Collison wrote: > Patch updated per Richard's comments. Ok for trunk? This patch adds a lot of code, much of which looks like it ought to be possible to common up using the iterators. I'm going to OK it as is, as I'd like to see this make GCC 8, an

Re: [PATCH][WWWDOCS][AArch64][ARM] Update GCC 8 release notes

2018-01-16 Thread James Greenhalgh
On Tue, Jan 16, 2018 at 02:21:30PM +0000, Tamar Christina wrote: > Hi Kyrill, > > > > > xgene1 was added a few releases ago, better to use one of the new additions > > from the above list. > > For example -mtune=cortex-r52. > > Thanks, I have updated the patch. I'll wait for an ok from an AArch

Re: [PATCH] PR82964: Fix 128-bit immediate ICEs

2018-01-16 Thread James Greenhalgh
On Mon, Jan 15, 2018 at 11:34:19AM +0000, Wilco Dijkstra wrote: > This fixes PR82964 which reports ICEs for some CONST_WIDE_INT immediates. > It turns out decimal floating point CONST_DOUBLE get changed into > CONST_WIDE_INT without checking the constraint on the operand, which > results in failur

Re: [PATCH][AArch64] Fix gcc.target/aarch64/subs_compare_[12].c

2018-01-26 Thread James Greenhalgh
On Tue, Jan 23, 2018 at 02:49:03PM +0000, Kyrill Tkachov wrote: > Hi all, > > This patch fixes the testsuite failures gcc.target/aarch64/subs_compare_1.c > and subs_compare_2.c The tests check that we combine a sequence like: > sub w2, w0, w1 > cmp w0, w1 > > into >

Re: [AArch64] Prefer LD1RQ for big-endian SVE

2018-01-29 Thread James Greenhalgh
On Fri, Jan 26, 2018 at 01:54:42PM +0000, Richard Sandiford wrote: > This patch deals with cases in which a CONST_VECTOR contains a > repeating bit pattern that is wider than one element but narrower > than 128 bits. The current code: > > * treats the repeating pattern as a single element > * use

Re: [AArch64] Handle SVE subregs that are effectively REVs

2018-01-29 Thread James Greenhalgh
On Fri, Jan 26, 2018 at 01:59:40PM +0000, Richard Sandiford wrote: > Subreg reads should be equivalent to storing the inner register to > memory and loading the appropriate memory bytes back, with subreg > writes doing the reverse. For the reasons explained in the comments, > this isn't what happe

Re: [AArch64] Use all SVE LD1RQ variants

2018-01-29 Thread James Greenhalgh
On Fri, Jan 26, 2018 at 01:50:40PM +0000, Richard Sandiford wrote: > The fallback way of handling a repeated 128-bit constant vector for SVE > is to force the 128 bits to the constant pool and use LD1RQ to load it. > Previously the code always used the byte variant of LD1RQ (LD1RQB), > with a prece

Re: [AArch64] Fix sve/extract_[12].c for big-endian SVE

2018-01-29 Thread James Greenhalgh
On Fri, Jan 26, 2018 at 03:15:58PM +0000, Richard Sandiford wrote: > Kyrill Tkachov writes: > > On 26/01/18 13:31, Richard Sandiford wrote: > >> sve/extract_[12].c were relying on the target-independent optimisation > >> that removes a redundant vec_select, so that we don't end up with > >> thing

Re: [PATCH][AArch64] PR tree-optimization/64946: XFAIL gcc.target/aarch64/vect-abs-compile.c

2018-01-31 Thread James Greenhalgh
On Wed, Jan 31, 2018 at 09:46:32AM +0000, Kyrill Tkachov wrote: > Hi all, > > This test has been failing since forever, it has never passed AFAIK. > The PR details the vectoriser deficiency. > I propose we xfail this with a reference to the PR. > > Ok for trunk? Yes please. We're long overdue on
