Re: [PATCH 0/2][AArch64] Implement AAPCS64 updates for alignment attribute

2016-01-21 Thread Alan Lawrence
On 18/01/16 17:10, Eric Botcazou wrote: Similarly to ARM, I note that Ada is affected. Indeed, with a gcc 4.9 host compiler, I saw a bootstrap miscompare iff including Ada; however, I was able to bootstrap Ada successfully, if I first built a GCC including this patch with --disable-bootstrap, and

[PATCH][Testsuite] Fix PR66877

2016-01-22 Thread Alan Lawrence
This is a scan-tree-dump failure in vect-over-widen-3-big-array.c, that occurs only on ARM - the only platform to have vect_widen_shift. Tested on arm-none-eabi (armv8-crypto-neon-fp, plus a non-neon variant), also aarch64 (token platform without vect_widen_shift). gcc/testsuite/ChangeLog:

[PATCH][Testsuite] Fix scan-tree-dump failures with vect_multiple_sizes

2016-01-22 Thread Alan Lawrence
Since r230292, these tests in gcc.dg/vect have been failing on ARM, AArch64, and x86_64 with -march=haswell (among others - when prefer_avx128 is true): vect-outer-1-big-array.c scan-tree-dump-times vect "grouped access in outer loop" 2 vect-outer-1.c scan-tree-dump-times vect "grouped access in

Re: [PATCH] ARM PR68620 (ICE with FP16 on armeb)

2016-01-22 Thread Alan Lawrence
On 20/01/16 21:10, Christophe Lyon wrote: On 19 January 2016 at 15:51, Alan Lawrence wrote: On 19/01/16 11:15, Christophe Lyon wrote: For neon_vdupn, I chose to implement neon_vdup_nv4hf and neon_vdup_nv8hf instead of updating the VX iterator because I thought it was not desirable to impact

Re: [PATCH 1/2][AArch64] Implement AAPCS64 updates for alignment attribute

2016-01-22 Thread Alan Lawrence
On 21/01/16 17:23, Alan Lawrence wrote: > On 18/01/16 17:10, Eric Botcazou wrote: >> >> Could you post the list of files that differ? How do they differ exactly? > > Hmmm. Well, I definitely had this failing to bootstrap once. I repeated that, > to > try to identify e

Re: [PATCH 4/4] Un-XFAIL ssa-dom-cse-2.c for most platforms

2016-02-03 Thread Alan Lawrence
On 26/01/16 12:23, Dominik Vogt wrote: On Mon, Dec 21, 2015 at 01:13:28PM +, Alan Lawrence wrote: ...the test passes with --param sra-max-scalarization-size-Ospeed. Verified on aarch64 and with stage1 compiler for hppa, powerpc, sparc, s390. How did you test this on s390? For me, the

Re: [PATCH 4/4] Un-XFAIL ssa-dom-cse-2.c for most platforms

2016-02-04 Thread Alan Lawrence
On 04/02/16 09:53, Dominik Vogt wrote: On Wed, Feb 03, 2016 at 11:41:02AM +, Alan Lawrence wrote: On 26/01/16 12:23, Dominik Vogt wrote: On Mon, Dec 21, 2015 at 01:13:28PM +, Alan Lawrence wrote: ...the test passes with --param sra-max-scalarization-size-Ospeed. Verified on aarch64

Re: [PATCH, PR middle-end/68134] Reject scalar modes in default get_mask_mode hook

2016-02-19 Thread Alan Lawrence
On 17/11/15 11:49, Ilya Enkovich wrote: Hi, Default hook for get_mask_mode is supposed to return integer vector modes. This means it should reject scalar modes returned by mode_for_vector. Bootstrapped and regtested on x86_64-unknown-linux-gnu, regtested on aarch64-unknown-linux-gnu. OK for

[PATCH] Add -funknown-commons to work around PR/69368 (and others) in SPEC2006

2016-02-19 Thread Alan Lawrence
This relates to FORTRAN code where different modules give different sizes to the same array in a COMMON block (contrary to the Fortran language specification). SPEC have refused to patch the source code (https://www.spec.org/cpu2006/Docs/faq.html#Run.05). Hence, this patch provides a Fortran-speci

Re: [PATCH] Add -funknown-commons to work around PR/69368 (and others) in SPEC2006

2016-02-22 Thread Alan Lawrence
On 19/02/16 17:52, Jakub Jelinek wrote: On Fri, Feb 19, 2016 at 05:42:34PM +, Alan Lawrence wrote: This relates to FORTRAN code where different modules give different sizes to the same array in a COMMON block (contrary to the Fortran language specification). SPEC have refused to patch the

Re: [PATCH 1/2][AArch64] Implement AAPCS64 updates for alignment attribute

2016-02-22 Thread Alan Lawrence
On 22/01/16 17:16, Alan Lawrence wrote: On 21/01/16 17:23, Alan Lawrence wrote: On 18/01/16 17:10, Eric Botcazou wrote: Could you post the list of files that differ? How do they differ exactly? Hmmm. Well, I definitely had this failing to bootstrap once. I repeated that, to try to

Re: [PATCH] Add -funknown-commons to work around PR/69368 (and others) in SPEC2006

2016-02-23 Thread Alan Lawrence
On 22/02/16 12:03, Jakub Jelinek wrote: (f) A global command-line option, which we check alongside DECL_COMMON and further tests (basically, we want only DECL_COMMON decls that either have ARRAY_TYPE, or some other aggregate type with flexible array member or some other trailing array in the str

Re: [PATCH, PR middle-end/68134] Reject scalar modes in default get_mask_mode hook

2016-02-23 Thread Alan Lawrence
On 20/02/16 09:29, Ilya Enkovich wrote: 2016-02-19 20:36 GMT+03:00 Alan Lawrence : Mostly this is fairly straightforward, relatively little midend code is required, and the backend cleans up quite a bit. However, I get stuck on the case of singleton vectors (64x1). No surprises there, then

Re: [PATCH] Add -funknown-commons to work around PR/69368 (and others) in SPEC2006

2016-02-25 Thread Alan Lawrence
that this be combined with some flag fiddling and warnings in the Fortran front-end; this patch doesn't do that, as I'm not very familiar with the frontends, but that can follow in a separate patch. (Thomas?) OK for trunk? Cheers, Alan gcc/ChangeLog: DATE Alan Lawrence

Re: [PATCH] Add -funknown-commons to work around PR/69368 (and others) in SPEC2006

2016-03-03 Thread Alan Lawrence
On 25/02/16 18:00, Alan Lawrence wrote: On 22/02/16 12:03, Jakub Jelinek wrote: (f) A global command-line option, which we check alongside DECL_COMMON and further tests (basically, we want only DECL_COMMON decls that either have ARRAY_TYPE, or some other aggregate type with flexible array

Re: [PATCH 1/2][AArch64] Implement AAPCS64 updates for alignment attribute

2016-03-04 Thread Alan Lawrence
On 26/02/16 14:52, James Greenhalgh wrote: gcc/ChangeLog: * gcc/config/aarch64/aarch64.c (aarch64_function_arg_alignment): Rewrite, looking one level down for records and arrays. --- gcc/config/aarch64/aarch64.c | 31 --- 1 file changed, 16 inserti

Re: revised and updated new-if-converter patch… [PATCH] fix PR46029: reimplement if conversion of loads and stores

2015-07-20 Thread Alan Lawrence
Abe wrote: diff --git a/gcc/testsuite/gcc.dg/vect/if-cvt-stores-vect-ifcvt-18.c b/gcc/testsuite/gcc.dg/vect/if-cvt-stores-vect-ifcvt-18.c index 71f2db3..2b159d7 100644 --- a/gcc/testsuite/gcc.dg/vect/if-cvt-stores-vect-ifcvt-18.c +++ b/gcc/testsuite/gcc.dg/vect/if-cvt-stores-vect-ifcvt-18.c @@

Re: [PATCH 3/16][ARM] Add float16x4_t intrinsics

2015-07-27 Thread Alan Lawrence
Ramana Radhakrishnan wrote: I haven't seen the patch yet but here are my thoughts on where this should be going. Thus in summary - 1. -mfpu=neon implies the presence of the float16x(4/8) types and all the intrinsics that treat these values as bags of bits. 2. -mfpu=neon-fp16 implies the pre

Re: [PATCH 2/16][ARM] PR/63870 Add __builtin_arm_lane_check.

2015-07-27 Thread Alan Lawrence
Kyrill Tkachov wrote: Hi Alan, Can you please add a comment on top of this saying that this builtin only exists to perform the lane check, just to make it explicit for the future. Done, and pushed as r226252. Charles, thanks for your patience, and I hope this lets you move forwards. I reali

[PATCH 0/15][ARM/AArch64] Add support for float16_t vectors (v3)

2015-07-28 Thread Alan Lawrence
All AArch64 patches are unchanged from previous version. However, in response to discussion, the ARM patches are changed (much as I suggested https://gcc.gnu.org/ml/gcc-patches/2015-07/msg02249.html); this version: * Hides the existing vcvt_f16_f32 and vcvt_f32_f16 intrinsics, and float16x4

[PATCH 1/15][ARM] Hide existing float16 intrinsics unless we have a scalar __fp16 type

2015-07-28 Thread Alan Lawrence
This makes the existing float16 vector intrinsics available only when we have an __fp16 type (i.e. when one of the ARM_FP16_FORMAT_... macros is defined). Thus, we also rearrange the float16x[48]_t types to use the same type as __fp16 for the element type (ACLE says that __fp16 should be an ali
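The guard described here can be sketched in C as follows. This is only an illustration, not the patch itself: it assumes that the "ARM_FP16_FORMAT_..." macros are the ACLE macros `__ARM_FP16_FORMAT_IEEE` and `__ARM_FP16_FORMAT_ALTERNATIVE`, and the names `HAVE_SCALAR_FP16`, `half_elem_t`, `have_scalar_fp16` and `half_elem_size` are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Assumption: these two ACLE macros are what "ARM_FP16_FORMAT_..." refers to.
   All other names in this sketch are hypothetical.  */
#if defined(__ARM_FP16_FORMAT_IEEE) || defined(__ARM_FP16_FORMAT_ALTERNATIVE)
#define HAVE_SCALAR_FP16 1
typedef __fp16 half_elem_t;   /* element type shared with the scalar type */
#else
#define HAVE_SCALAR_FP16 0
typedef uint16_t half_elem_t; /* 16-bit stand-in: values are only bags of bits */
#endif

/* Reports whether a scalar __fp16 type is available in this build.  */
int have_scalar_fp16(void)
{
    return HAVE_SCALAR_FP16;
}

/* Size in bytes of one half-precision vector element.  */
unsigned half_elem_size(void)
{
    assert(sizeof(half_elem_t) == 2);
    return (unsigned) sizeof(half_elem_t);
}
```

Code that needs real half-precision arithmetic would sit under the same `#if`, so that it only compiles when the scalar type exists.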

[PATCH 2/15][ARM] float16x4_t intrinsics in arm_neon.h

2015-07-28 Thread Alan Lawrence
This is a respin of https://gcc.gnu.org/ml/gcc-patches/2015-07/msg00476.html. The change is to provide all the new float16 intrinsics only if we actually have a scalar __fp16 type. (This covers the intrinsics whose implementation is entirely within arm_neon.h; those requiring .md changes follow

[PATCH 4/15][ARM] float16x8_t intrinsics in arm_neon.h

2015-07-28 Thread Alan Lawrence
This is a respin of https://gcc.gnu.org/ml/gcc-patches/2015-07/msg00478.html , again making the intrinsics available only if we have a scalar __fp16 type. (This covers the intrinsics whose implementation is entirely within arm_neon.h; those requiring .md changes follow in the next patch). gcc/

[PATCH 5/15][ARM] Remaining intrinsics

2015-07-28 Thread Alan Lawrence
This is a respin of https://gcc.gnu.org/ml/gcc-patches/2015-07/msg00479.html, again to make the intrinsics available only if we have a scalar __fp16 type. This does not fix existing indentation issues in neon.md but rather keeps the affected lines consistent with those around them. gcc/Change

[PATCH 3/15][ARM] Add V8HFmode and float16x8_t type

2015-07-28 Thread Alan Lawrence
This is a respin of https://gcc.gnu.org/ml/gcc-patches/2015-07/msg00477.html. The only change is to publish float16x8_t only if we actually have a scalar __fp16 type. gcc/ChangeLog: * config/arm/arm.h (VALID_NEON_QREG_MODE): Add V8HFmode. * config/arm/arm.c (arm_vector_mode_supported_

[PATCH 6/15][AArch64] Add basic FP16 support

2015-07-28 Thread Alan Lawrence
fhf2): New. * config/aarch64/iterators.md (GPF_F16): New. gcc/testsuite/ChangeLog: * gcc.target/aarch64/f16_movs_1.c: New test. commit 989af1492bbf268be1ecfae06f3303b90ae514c8 Author: Alan Lawrence Date: Tue Dec 2 12:57:39 2014 + AArch64 1/6: Basic HFmode support (

[PATCH 7/15][ARM/AArch64 Testsuite] Add basic fp16 tests

2015-07-28 Thread Alan Lawrence
gcc/testsuite/ChangeLog: * gcc.target/aarch64/fp16/fp16.exp: New. * gcc.target/aarch64/fp16/f16_convs_1.c: New. * gcc.target/aarch64/fp16/f16_convs_2.c: New. commit bc5045c0d3dd34b8cb94910281384f9ab9880325 Author: Alan Lawrence Date: Thu May 7 10:08:12 2015 +0100

[PATCH 8/15][AArch64] Add support for float16x{4,8}_t vectors/builtins

2015-07-28 Thread Alan Lawrence
float32x4_t. * gcc.target/aarch64/vld1_lane.c: Remove unused constants; add cases for float16x4_t and float16x8_t. commit 49cb53a94a44fcda845c3f6ef11e88f9be458aad Author: Alan Lawrence Date: Tue Dec 2 13:08:15 2014 + AArch64 2/N: Vector/__builtin basics: define+support

[PATCH 9/15][AArch64] vld{2,3,4}{,_lane,_dup}, vcombine, vcreate

2015-07-28 Thread Alan Lawrence
/aarch64/vldN_dup_1.c: Likewise. * gcc.target/aarch64/vldN_lane_1.c: Likewise. commit ef719e5d3d6eccc5cf621851283b7c0ba1a9ee6c Author: Alan Lawrence Date: Tue Aug 5 17:52:28 2014 +0100 AArch64 3/N: v(create|combine|v(ld|st|ld...dup/lane|st...lane)[234](q?))_f16; tests vldN{,_lane,_dup

[PATCH 10/15][AArch64] Implement vcvt_{,high_}f16_f32

2015-07-28 Thread Alan Lawrence
gcc/ChangeLog: * config/aarch64/aarch64-simd.md (aarch64_float_truncate_lo_v2sf): Reparameterize to... (aarch64_float_truncate_lo_): ...this, for both V2SF and V4HF. (aarch64_float_truncate_hi_v4sf): Reparameterize to... (aarch64_float_truncate_hi_): ...thi

[PATCH 11/15][AArch64] vreinterpret(q?), vget_(low|high), vld1(q?)_dup

2015-07-28 Thread Alan Lawrence
mit beb21a6bce76d4fbedb13fcf25796563b27f6bae Author: Alan Lawrence Date: Mon Jun 29 18:46:49 2015 +0100 [AArch64 5/N v2] vreinterpret, vget_(low|high), vld1(q?)_dup. update tests for vget_low/high diff --git a/gcc/config/aarch64/arm_neon.h b/gcc/config/aarch64/arm_neon.h index b915754..ff1a45c 100644 --

[PATCH 14/15][ARM/AArch64 Testsuite]Add test of vcvt{,_high}_{f16_f32,f32_f16}

2015-07-28 Thread Alan Lawrence
gcc/testsuite/ChangeLog: * gcc.target/aarch64/advsimd-intrinsics/advsimd-intrinsics.exp: set additional flags for neon-fp16 support. * gcc.target/aarch64/advsimd-intrinsics/vcvt_f16.c: New. commit e6cc7467ddf5702d3a122b8ac4163621d0164b37 Author: Alan Lawrence Date: Wed

[PATCH 13/15][ARM/AArch64 Testsuite] Add float16 tests to advsimd-intrinsics testsuite

2015-07-28 Thread Alan Lawrence
This is a respin of https://gcc.gnu.org/ml/gcc-patches/2015-07/msg00488.html, fixing up the testsuite for float16 vectors. Relative to the previous version, most of the additions to the tests are now within #if..#endif such that they are only compiled if we have a scalar __fp16 type (the excepti

[PATCH 12/15][AArch64] Add vcvt(_high)?_f32_f16 intrinsics, with BE RTL fix

2015-07-28 Thread Alan Lawrence
(float_extend_lo): Add v4sf. * config/aarch64/arm_neon.h (vcvt_f32_f16, vcvt_high_f32_f16): New. * config/aarch64/iterators.md (VQ_HSF): New iterator. (VWIDE, Vwtype, Vhalftype): Add V8HF, V4SF. (Vwide): New mode_attr. commit 214fcc00475a543a79ed444f9a64061215

[PATCH 15/15][ARM] Update sourcebuild.texi with testsuite/effective-target hooks

2015-07-28 Thread Alan Lawrence
This documents the change to arm_neon_fp16_ok in the first patch; the addition of arm_neon_fp16_hw_ok in the last patch; and corrects a cross-reference. (I tried using an @ref instead of "Implies previous." but the page ref looked very out-of-place in PDF when I am referring to the previous ite

Re: [PATCH 10/15][AArch64] Implement vcvt_{,high_}f16_f32

2015-07-29 Thread Alan Lawrence
James Greenhalgh wrote: On Tue, Jul 28, 2015 at 12:26:09PM +0100, Alan Lawrence wrote: gcc/ChangeLog: * config/aarch64/aarch64-simd.md (aarch64_float_truncate_lo_v2sf): Reparameterize to... (aarch64_float_truncate_lo_): ...this, for both V2SF and V4HF

[AArch64] Remove unused VRL2/3/4 iterator values (was: Re: [PATCH 8/15][AArch64] Add support for float16x{4,8}_t vectors/builtins)

2015-07-30 Thread Alan Lawrence
James Greenhalgh wrote: On Tue, Jul 28, 2015 at 12:25:40PM +0100, Alan Lawrence wrote: I'd have preferred the unrelated changes here as separate patches. If you pull them out, they are OK to commit independent of this patch. Done (r226352 and r226353). Ah ok, I see what is going on

Re: [AArch64] Remove unused VRL2/3/4 iterator values

2015-07-30 Thread Alan Lawrence
James Greenhalgh wrote: (define_mode_attr VRL2 [(V8QI "V32QI") (V4HI "V16HI") (V2SI "V8SI") (V2SF "V8SF") - (DI "V4DI") (DF "V4DF") - (V16QI "V32QI") (V8HI "V16HI") - (V4SI "V8SI") (V4SF "V8SF") -

Re: [PATCH 8/15][AArch64] Add support for float16x{4,8}_t vectors/builtins

2015-08-04 Thread Alan Lawrence
.target/aarch64/vset_lane_1.c: Likewise. * gcc.target/aarch64/vld1-vst1_1.c: Likewise. * gcc.target/aarch64/vld1_lane.c: Likewise. commit 49cb53a94a44fcda845c3f6ef11e88f9be458aad Author: Alan Lawrence Date: Tue Dec 2 13:08:15 2014 + AArch64 2/N: Vector/__builtin basics: defin

Re: [PATCH 9/15][AArch64] vld{2,3,4}{,_lane,_dup}, vcombine, vcreate

2015-08-04 Thread Alan Lawrence
James Greenhalgh wrote: On Tue, Jul 28, 2015 at 12:25:55PM +0100, Alan Lawrence wrote: gcc/ChangeLog: * config/aarch64/aarch64.c (aarch64_split_simd_combine): Add V4HFmode. * config/aarch64/aarch64-builtins.c (VAR13, VAR14): New. (aarch64_scalar_builtin_types

[PATCH][ARM/AArch64 Testsuite] Add float16 lane_indices tests (was: Re: [PATCH 9/15][AArch64] vld{2,3,4}{,_lane,_dup}, vcombine, vcreate)

2015-08-04 Thread Alan Lawrence
James Greenhalgh wrote: Hi Alan, The arm_neon.h portion of this patch does not apply after Charles' recent changes. Could you please rebase and resubmit the patch for review? Thanks, James These are straightforward copies of the corresponding uint16 tests, with appropriate substitutions uint

Re: [PATCH 8/15][AArch64] Add support for float16x{4,8}_t vectors/builtins

2015-08-04 Thread Alan Lawrence
Sorry, attached the wrong file. Here! --Alan Alan Lawrence wrote: James Greenhalgh wrote: -;; All modes. +;; All vector modes on which we support any arithmetic operations. (define_mode_iterator VALL [V8QI V16QI V4HI V8HI V2SI V4SI V2DI V2SF V4SF V2DF]) -;; All vector modes and DI

Re: [PATCH 9/15][AArch64] vld{2,3,4}{,_lane,_dup}, vcombine, vcreate

2015-08-04 Thread Alan Lawrence
Attachment has gone awol here too. Sorry for the bother, please ignore previous... Alan Lawrence wrote: James Greenhalgh wrote: On Tue, Jul 28, 2015 at 12:25:55PM +0100, Alan Lawrence wrote: gcc/ChangeLog: * config/aarch64/aarch64.c (aarch64_split_simd_combine): Add V4HFmode

Re: [PATCH] Optimize certain end of loop conditions into min/max operation

2015-08-05 Thread Alan Lawrence
Richard Biener wrote: Furthermore it doesn't work for three such ops which would require an additional pattern like (simplify (bit_and:c (op @0 (min @1 @2)) (op @0 @3)) (op @0 (min (min @1 @2) @3 if that's profitable? Shouldn't that be just a case of binding @1 in the original patte

Re: [PATCH 9/15][AArch64] vld{2,3,4}{,_lane,_dup}, vcombine, vcreate

2015-08-06 Thread Alan Lawrence
Alan Lawrence wrote: > James Greenhalgh wrote: >> Hi Alan, >> >> The arm_neon.h portion of this patch does not apply after Charles' recent >> changes. Could you please rebase and resubmit the patch for review? >> >> Thanks, >> James > > Ah,

Re: [PATCH, rs6000] Add expansions for min/max vector reductions

2015-09-16 Thread Alan Lawrence
On 16/09/15 15:28, Bill Schmidt wrote: 2015-09-16 Bill Schmidt * config/rs6000/altivec.md (UNSPEC_REDUC_SMAX, UNSPEC_REDUC_SMIN, UNSPEC_REDUC_UMAX, UNSPEC_REDUC_UMIN, UNSPEC_REDUC_SMAX_SCAL, UNSPEC_REDUC_SMIN_SCAL, UNSPEC_REDUC_UMAX_SCAL, UNSPEC_REDUC_UMIN_

Re: [PATCH, rs6000] Add expansions for min/max vector reductions

2015-09-16 Thread Alan Lawrence
On 16/09/15 17:10, Bill Schmidt wrote: On Wed, 2015-09-16 at 16:29 +0100, Alan Lawrence wrote: On 16/09/15 15:28, Bill Schmidt wrote: 2015-09-16 Bill Schmidt * config/rs6000/altivec.md (UNSPEC_REDUC_SMAX, UNSPEC_REDUC_SMIN, UNSPEC_REDUC_UMAX, UNSPEC_REDUC_UMIN

Re: [PATCH, rs6000] Add expansions for min/max vector reductions

2015-09-16 Thread Alan Lawrence
On 16/09/15 17:19, Bill Schmidt wrote: On Wed, 2015-09-16 at 16:29 +0100, Alan Lawrence wrote: I proposed a patch to migrate PPC off the old patterns, but have forgotten to ping it recently - last at https://gcc.gnu.org/ml/gcc-patches/2014-12/msg01024.html ... (ping?!) Hi Alan, Thanks for

Re: [PATCH 2/5] completely_scalarize arrays as well as records.

2015-09-17 Thread Alan Lawrence
On 15/09/15 08:43, Richard Biener wrote: > > Sorry for chiming in so late... Not at all, TYVM for your help! > TREE_CONSTANT isn't the correct thing to test. You should use > TREE_CODE () == INTEGER_CST instead. Done (in some cases, via tree_fits_shwi_p). > Also you need to handle > NULL_TREE

Re: [PATCH] vectorizing conditional expressions (PR tree-optimization/65947)

2015-09-18 Thread Alan Lawrence
On 18/09/15 13:17, Richard Biener wrote: Ok, I see. That this case is already vectorized is because it implements MAX_EXPR, modifying it slightly to int foo (int *a) { int val = 0; for (int i = 0; i < 1024; ++i) if (a[i] > val) val = a[i] + 1; return val; } makes it no lo

[PATCH][RS6000] Migrate from reduc_xxx to reduc_xxx_scal optabs

2015-09-18 Thread Alan Lawrence
This is a respin of https://gcc.gnu.org/ml/gcc-patches/2014-12/msg01024.html after discovering that patch was broken on power64le - thanks to Bill Schmidt for pointing out that gcc112 is the opposite endianness to gcc110... This time I decided to avoid any funny business with making RTL match othe

Re: [PATCH, rs6000] Add expansions for min/max vector reductions

2015-09-18 Thread Alan Lawrence
On 18/09/15 09:35, Richard Biener wrote: Btw, we ditched the original reduce-to-vector variant due to its endianness issues (it only had _one_ element of the vector contain the reduction result). Re-introducing reduce-to-vector but with the reduction result in all elements wouldn't have any issu

Re: [PR64164] drop copyrename, integrate into expand

2015-09-18 Thread Alan Lawrence
On 02/09/15 23:12, Alexandre Oliva wrote: On Sep 2, 2015, Alan Lawrence wrote: One more failure to report, I'm afraid. On AArch64 Bigendian, aapcs64/func-ret-4.c ICEs in simplify_subreg (line refs here are from r227348): Thanks. The failure mode was different in the current, revampe

Re: [AArch64] Fix vcvt_high_f64_f32 and vcvt_high_f32_f64 intrinsics.

2015-09-21 Thread Alan Lawrence
[Resending in plain text] This makes sense to me now, although I find your comment slightly confusing: [] in that +;; the meaning of HI and LO is always taken with a little-endian view of +;; the vector You mean vec_unpacks_{hi,lo} (which seems to go against the *architectural* bit after this

Re: [AArch64] Fix vcvt_high_f64_f32 and vcvt_high_f32_f64 intrinsics.

2015-09-21 Thread Alan Lawrence
On 21/09/15 15:38, James Greenhalgh wrote: On Mon, Sep 21, 2015 at 10:44:32AM +0100, Alan Lawrence wrote: [Resending in plain text] This makes sense to me now, although I find your comment slightly confusing: [] in that +;; the meaning of HI and LO is always taken with a little-endian

Re: [PATCH, MIPS, PR/61114] Migrate to reduc_..._scal optabs.

2015-10-06 Thread Alan Lawrence
Thanks for working on this, Simon! On 01/10/15 15:43, Simon Dardis wrote: -(define_expand "reduc_smax_" - [(match_operand:VWHB 0 "register_operand" "") - (match_operand:VWHB 1 "register_operand" "")] +(define_expand "reduc_smax_scal_" + [(match_operand:HI 0 "register_operand" "") + (match_

Re: [PATCH, MIPS, PR/61114] Migrate to reduc_..._scal optabs.

2015-10-07 Thread Alan Lawrence
On 07/10/15 11:50, Simon Dardis wrote: On the change from smin/smax it was a deliberate change as I managed to confuse myself of the mode patterns, correct version follows. Reverted back to VWHB for smax/smin. Stylistic point addressed. No new regression, ok for commit? Well, I'm not a MIPS

Re: [Boolean Vector, patch 5/5] Support boolean vectors in vector lowering

2015-10-12 Thread Alan Lawrence
On 09/10/15 22:01, Jeff Law wrote: So my question for the series as a whole is whether or not we need to do something for the other languages, particularly Fortran. I was a bit surprised to see this stuff bleed into the C/C++ front-ends and obviously wonder if it's bled into Fortran, Ada, Java,

Re: [PATCH 2/3] [ARM] PR63870 Mark lane indices of vldN/vstN with appropriate qualifier

2015-10-12 Thread Alan Lawrence
On 07/10/15 00:59, charles.bay...@linaro.org wrote: diff --git a/gcc/config/arm/neon.md b/gcc/config/arm/neon.md index 2667866..251afdc 100644 --- a/gcc/config/arm/neon.md +++ b/gcc/config/arm/neon.md @@ -4261,8 +4261,9 @@ if (BYTES_BIG_ENDIAN) UNSPEC_VLD1_LANE))] "TARG

Re: [PATCH 1/3] [ARM] PR63870 Add qualifiers for NEON builtins

2015-10-12 Thread Alan Lawrence
On 07/10/15 00:59, charles.bay...@linaro.org wrote: diff --git a/gcc/config/arm/arm-builtins.c b/gcc/config/arm/arm-builtins.c ... case NEON_ARG_MEMORY: /* Check if expand failed. */ if (op[argc] == const0_rtx) { - va_end (a

[PATCH][Testsuite] Turn on 64-bit-vector tests for AArch64

2015-10-16 Thread Alan Lawrence
This enables tests bb-slp-11.c and bb-slp-26.c for AArch64. Both of these are currently passing on little- and big-endian. (Tested on aarch64-none-linux-gnu and aarch64_be-none-elf). OK for trunk? gcc/testsuite/ChangeLog: * lib/target-supports.exp (check_effective_target_vect64): Add AA

[PATCH] tree-scalar-evolution.c: Handle LSHIFT by constant

2015-10-16 Thread Alan Lawrence
This lets the vectorizer handle some simple strides expressed using left-shift rather than mul, e.g. a[i << 1] (whereas previously only a[i * 2] would have been handled). This patch does *not* handle the general case of shifts - neither a[i << j] nor a[1 << i] will be handled; that would be a sign
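The two spellings of a stride-2 access mentioned above can be compared with a small C sketch. The function names are hypothetical, and the index is unsigned so that the shift and the multiply are trivially the same value; the caller must supply at least 2*n - 1 elements.

```c
#include <assert.h>
#include <stddef.h>

/* Sums every other element, with the stride written as a left shift.  */
long sum_stride2_shift(const int *a, size_t n)
{
    assert(a != NULL);
    long sum = 0;
    for (size_t i = 0; i < n; i++)
        sum += a[i << 1];
    return sum;
}

/* The same loop with the stride written as a multiplication.  */
long sum_stride2_mul(const int *a, size_t n)
{
    assert(a != NULL);
    long sum = 0;
    for (size_t i = 0; i < n; i++)
        sum += a[i * 2];
    return sum;
}
```

Both loops read a[0], a[2], a[4], ... and so should be analysable as the same evolution of the index.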

Re: [PATCH 1/3] [ARM] PR63870 Add qualifiers for NEON builtins

2015-10-19 Thread Alan Lawrence
On 14/10/15 23:02, Charles Baylis wrote: On 12 October 2015 at 11:58, Alan Lawrence wrote: > Given we are making changes here to how this all works on bigendian, have you tested armeb at all? I tested on big endian, and it passes, except Well, I asked because it seemed good to m

[PATCH][AArch64 Testsuite][Trivial?] Remove divisions-to-produce-NaN from vdiv_f.c

2015-10-20 Thread Alan Lawrence
The test vdiv_f.c #define's NAN to (0.0 / 0.0). This produces extra scalar fdiv's, which complicate the scan-assembler testing. We can remove these by using __builtin_nan instead. Tested on AArch64 Linux. gcc/testsuite/ChangeLog: * gcc.target/aarch64/vdiv_f.c: Use __builtin_nan. --- g
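As a rough illustration of the two ways of spelling a NaN constant discussed here (the macro and function names below are hypothetical and are not the ones used in vdiv_f.c):

```c
#include <assert.h>

/* Division form: may be emitted as a real scalar division at run time.  */
#define NAN_BY_DIVISION (0.0 / 0.0)

/* Builtin form: a GCC/Clang builtin that yields a quiet NaN constant.  */
#define NAN_BY_BUILTIN (__builtin_nan(""))

double nan_by_division(void)
{
    return NAN_BY_DIVISION;
}

double nan_by_builtin(void)
{
    return NAN_BY_BUILTIN;
}

/* A NaN is the only value that compares unequal to itself
   (assuming the code is not built with -ffast-math).  */
int is_nan(double x)
{
    return x != x;
}
```

Either form gives a NaN; the point of the patch is only which instructions end up in the assembly being scanned.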

[PATCH][Testsuite] Add --param sra-max-scalarization-size-Ospeed to sra-12.c

2015-10-21 Thread Alan Lawrence
gcc.dg/tree-ssa/sra-12.c is skipped on a bunch of targets, including AArch64, because the default max-scalarization-size depends on MOVE_RATIO, and on those targets thus ends up being too small for SRA to optimize the testcase. Recently I noticed that the test has been failing for some time on ARM

Re: [PATCH] vectorizing conditional expressions (PR tree-optimization/65947)

2015-10-22 Thread Alan Lawrence
Just one very small point... On 19/10/15 09:17, Alan Hayward wrote: > - if (check_reduction > - && (!commutative_tree_code (code) || !associative_tree_code (code))) > + if (check_reduction) > { > - if (dump_enabled_p ()) > -report_vect_op (MSG_MISSED_OPTIMIZATION, def_st

Re: [PATCH, MIPS, PR/61114] Migrate to reduc_..._scal optabs.

2015-10-22 Thread Alan Lawrence
On closer inspection I think you can also remove this guy (from loongson.md): (define_insn "reduc_uplus_v8qi" [(set (match_operand:V8QI 0 "register_operand" "=f") (unspec:V8QI [(match_operand:V8QI 1 "register_operand" "f")] UNSPEC_LOONGSON_BIADD))] "TARGET_HARD_FL

Re: [PATCH] tree-scalar-evolution.c: Handle LSHIFT by constant

2015-10-23 Thread Alan Lawrence
On 19/10/15 12:49, Richard Biener wrote: > Err, you should always do the shift in the type of rhs1. You should also > avoid the chrec_convert of rhs2 above for shifts. Err, yes, indeed. Needed to keep the chrec_convert before the chrec_fold_multiply, and the rest followed. How's this? Bootstr

[PATCH] PR/67682, break SLP groups up if only some elements match

2015-10-23 Thread Alan Lawrence
vect_analyze_slp_instance currently only creates an slp_instance if _all_ stores in a group fitted the same pattern. This patch splits non-matching groups up on vector boundaries, allowing only part of the group to be SLP'd, or multiple subgroups to be SLP'd differently. The algorithm could be mad

Re: [PATCH] PR/67682, break SLP groups up if only some elements match

2015-10-25 Thread Alan Lawrence
On 23 October 2015 at 16:20, Alan Lawrence wrote: > diff --git a/gcc/testsuite/gcc.dg/vect/bb-slp-7.c > b/gcc/testsuite/gcc.dg/vect/bb-slp-7.c > index ab54a48..b012d78 100644 > --- a/gcc/testsuite/gcc.dg/vect/bb-slp-7.c > +++ b/gcc/testsuite/gcc.dg/vect/bb-slp-7.c > @@ -16,

[PATCH][AArch64] Fix ICE on (const_double:HF 0.0)

2015-10-26 Thread Alan Lawrence
The included testcase demonstrates the ICE: aarch64_valid_floating_const (via aarch64_float_const_representable_p) disables HFmode immediates, but allows 0.0. However, *movhf_aarch64 does not allow this insn: (insn 7 6 10 2 (set (mem:HF (reg/f:DI 73) [0 *f_2(D)+0 S2 A16]) (const_double:HF

Re: [PATCH] tree-scalar-evolution.c: Handle LSHIFT by constant

2015-10-27 Thread Alan Lawrence
On 26/10/15 08:58, Richard Biener wrote: > > On Fri, Oct 23, 2015 at 5:15 PM, Alan Lawrence wrote: >> + chrec2 = fold_build2 (LSHIFT_EXPR, TREE_TYPE (rhs1), >> + build_int_cst (TREE_TYPE (rhs1), 1), > > 'type' inst

Re: [PATCH] PR/67682, break SLP groups up if only some elements match

2015-10-27 Thread Alan Lawrence
On 26/10/15 15:04, Richard Biener wrote: apart from the fact that you'll post a new version you need to adjust GROUP_GAP. You also seem to somewhat "confuse" "first I stmts" and "a group of size I", those are not the same when the group has gaps. I'd say "a group of size i" makes the most sense

[PATCH 1/6]tree-ssa-dom.c: Normalize exprs, starting with ARRAY_REF to MEM_REF

2015-10-29 Thread Alan Lawrence
This patch just teaches DOM that ARRAY_REFs can be equivalent to MEM_REFs (with pointer type to the array element type). gcc/ChangeLog: * tree-ssa-dom.c (dom_normalize_single_rhs): New. (dom_normalize_gimple_stmt): New. (lookup_avail_expr): Call dom_normalize_gimple_stmt.

[PATCH 2/6] tree-ssa-dom.c: Normalize data types in MEM_REFs.

2015-10-29 Thread Alan Lawrence
This makes dom2 identify e.g. MEM[(int[8] *)...] with MEM[(int *)...]. These are not generally equivalent as they have different aliasing behaviour but they have the same value as far as dom is concerned and so this helps find more equivalences. There is some question over the best policy here, bu

[PATCH 5/6]tree-sra.c: Fix completely_scalarize for negative array indices

2015-10-29 Thread Alan Lawrence
The code I added to completely_scalarize for arrays isn't right in some cases of negative array indices (e.g. arrays with indices from -1 to 1 in the Ada testsuite). On ARM, this prevents a failure bootstrapping Ada with the next patch, as well as a few ACATS tests (e.g. c64106a). Some discussion

[PATCH 6/6] Make SRA replace constant-pool loads

2015-10-29 Thread Alan Lawrence
This has changed quite a bit since the previous revision (https://gcc.gnu.org/ml/gcc-patches/2015-08/msg01484.html), mostly due to Ada and specifically Ada on ARM. I didn't find a good alternative to scanning for constant-pool accesses "as we go" through the function, and although I didn't find an

[PATCH 4/6][Trivial] tree-sra.c: A few comment fixes/additions.

2015-10-29 Thread Alan Lawrence
gcc/ChangeLog: * tree-sra.c (scalarizable_type_p): Comment variable-length arrays. (completely_scalarize): Comment zero-length arrays. (get_access_replacement): Correct comment re. precondition. --- gcc/tree-sra.c | 4 +++- 1 file changed, 3 insertions(+), 1 deletion(-) d

[PATCH 3/6] Share code from fold_array_ctor_reference with fold.

2015-10-29 Thread Alan Lawrence
This is in response to https://gcc.gnu.org/ml/gcc/2015-10/msg00097.html, where Richi points out that CONSTRUCTOR elements are not necessarily ordered. I wasn't sure of a good naming convention for the new get_ctor_element_at_index, other suggestions welcome. gcc/ChangeLog: * gimple-fold.

[PATCH 0/6 v2] PR/63679 Make SRA scalarize constant-pool loads

2015-10-29 Thread Alan Lawrence
This is a revision of previous series at https://gcc.gnu.org/ml/gcc-patches/2015-08/msg01485.html , and follows on from the first two patches of that series, which have been pushed already. A few things have happened since. The previous patch 3, making SRA generate ARRAY_REFS, is removed. As Marti

Re: [PATCH 3/4] bb-reorder: Add -freorder-blocks-algorithm= and wire it up

2015-11-02 Thread Alan Lawrence
On 02/11/15 14:38, Alan Lawrence wrote: > I'm a bit puzzled as to why nobody else has been seeing this, as it's been happening to me as part of building gcc on x86_64, but since this patch I've been seeing an ICE in vec::operator[] in reorder_basic_blocks_simple, building

Re: [PATCH][AArch64] Fix ICE on (const_double:HF 0.0)

2015-11-02 Thread Alan Lawrence
On 26/10/15 16:26, Alan Lawrence wrote: The included testcase demonstrates the ICE: aarch64_valid_floating_const (via aarch64_float_const_representable_p) disables HFmode immediates, but allows 0.0. However, *movhf_aarch64 does not allow this insn: (insn 7 6 10 2 (set (mem:HF (reg/f:DI 73) [0

Re: [PATCH] tree-scalar-evolution.c: Handle LSHIFT by constant

2015-11-03 Thread Alan Lawrence
On 27/10/15 22:27, H.J. Lu wrote: > > It caused: > > https://gcc.gnu.org/bugzilla/show_bug.cgi?id=68112 Bah :(. So yes, in general case, we can't rewrite (a << 1) to (a * 2) as for signed types (0x7f...f) << 1 == -2 whereas (0x7f...f * 2) is undefined behaviour. Oh well :(... I don't have a real
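The difference between the shift and the multiply for signed types can be sketched in C. The helper names are hypothetical. The shift is done in unsigned arithmetic here so that the sketch itself stays well defined in C; converting the result back to int is implementation-defined, and GCC and Clang wrap modulo 2^N.

```c
#include <assert.h>
#include <limits.h>

/* Shift by one with wrapping semantics, done in the unsigned type.  */
int shl1_wrapping(int a)
{
    return (int) ((unsigned) a << 1);
}

/* Returns 1 if computing a * 2 in int would overflow (undefined behaviour).  */
int mul2_would_overflow(int a)
{
    return a > INT_MAX / 2 || a < INT_MIN / 2;
}

/* Multiplies by two only when that is well defined; the caller checks first.  */
int mul2_checked(int a)
{
    assert(!mul2_would_overflow(a));
    return a * 2;
}
```

For INT_MAX the wrapping shift gives -2, while the multiplication has no defined result, which is why the two forms cannot simply be swapped for signed operands.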

Re: [PATCH 1/6]tree-ssa-dom.c: Normalize exprs, starting with ARRAY_REF to MEM_REF

2015-11-03 Thread Alan Lawrence
On 30/10/15 05:35, Jeff Law wrote: > On 10/29/2015 01:18 PM, Alan Lawrence wrote: >> This patch just teaches DOM that ARRAY_REFs can be equivalent to MEM_REFs >> (with >> pointer type to the array element type). >> >> gcc/ChangeLog: >> >> * t

Re: [PATCH 1/6]tree-ssa-dom.c: Normalize exprs, starting with ARRAY_REF to MEM_REF

2015-11-03 Thread Alan Lawrence
On 3 November 2015 at 10:27, Alan Lawrence wrote: > That is, ssa-dom-cse-7.c passes (and the patch series solves PR/63679) if > instead of my patch 2 (normalization of MEM_REFs) we have this: > > diff --git a/gcc/tree-sra.c b/gcc/tree-sra.c > index 4327990..2889a96 100644 > --

[PATCH][i386]Migrate reduction optabs to reduc_<op>_scal

2015-11-03 Thread Alan Lawrence
This migrates the various reduction optabs in sse.md to use the reduce-to-scalar form. I took the straightforward approach (equivalent to the migration code in expr.c/optabs.c) of generating a vector temporary, using the existing code to reduce to that, and extracting lane 0, in each pattern. Boot
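
The approach described above (reduce into a vector temporary, then extract lane 0) can be modelled in scalar C. This is a toy sketch under stated assumptions, not the sse.md patterns themselves: `LANES`, `reduc_plus_vec`, and `reduc_plus_scal` are hypothetical names, and a plain array stands in for a vector register.

```c
enum { LANES = 4 };

/* Stand-in for an old-style vector reduction: the sum of all lanes is
   written to lane 0 of OUT; the remaining lanes are left as zero here
   (an assumption of this sketch only).  */
static void reduc_plus_vec(const int in[LANES], int out[LANES])
{
    int sum = 0;
    for (int i = 0; i < LANES; i++) {
        sum += in[i];
        out[i] = 0;
    }
    out[0] = sum;
}

/* Reduce-to-scalar form: reuse the vector reduction on a temporary,
   then extract lane 0 as the scalar result.  */
static int reduc_plus_scal(const int in[LANES])
{
    int tmp[LANES];
    reduc_plus_vec(in, tmp);
    return tmp[0];
}
```

The wrapper adds nothing but the lane extraction, which is why the migration can be done per pattern without touching the existing reduction code.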

[PATCH/RFTesting][MIPS] Migrate reduction optabs in mips-ps-3d.md

2015-11-03 Thread Alan Lawrence
There are still a few uses of the old reduc_[us](plus|min|max)_ optabs remaining. This migrates the instances in mips-ps-3d.md. This seemed straightforward, as mips-ps-3d.md also provides a vec_extractv2sf. I tried to be conservative and handle all the possible cases for endianness; this may be ov

Re: [PATCH 3/6] Share code from fold_array_ctor_reference with fold.

2015-11-04 Thread Alan Lawrence
> s/explicitely/explicitly/ And remove the '*' from the 2nd and 3rd lines > of the comment. > > It looks like get_ctor_element_at_index has numerous formatting > problems. In particular you didn't indent the braces across the board > properly. Also check for tabs vs spaces issues please. Yes, y

Re: [PATCH 5/6]tree-sra.c: Fix completely_scalarize for negative array indices

2015-11-05 Thread Alan Lawrence
On 30/10/15 10:54, Eric Botcazou wrote: > On 30/10/15 10:44, Richard Biener wrote: >> >> I think you want to use wide-ints here and >> >> wide_int idx = wi::from (minidx, TYPE_PRECISION (TYPE_DOMAIN >> (...)), TYPE_SIGN (TYPE_DOMAIN (..))); >> wide_int maxidx = ... >> >> you can then simply

Re: [PATCH] tree-scalar-evolution.c: Handle LSHIFT by constant

2015-11-05 Thread Alan Lawrence
On 3 November 2015 at 11:35, Richard Biener wrote: > > I think this should simply re-write A << B to (type) (unsigned-type) A > * (1U << B). > > Does that then still vectorize the signed case? I didn't realize our representation of chrec's could express that. Yes, it does - thanks! (And the avx51
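
Richard's suggested rewrite, A << B as (type) (unsigned-type) A * (1U << B), can be sketched in C as follows. The function name is hypothetical, `int`/`unsigned` stand in for the generic type pair, and the final conversion back to `int` relies on GCC's documented modulo behaviour for out-of-range values (it is implementation-defined in ISO C).

```c
#include <limits.h>

/* Hypothetical helper: computes a << b by multiplying in the unsigned
   type, where overflow wraps instead of being undefined, and then
   converting back to the signed type.
   Precondition (assumed by this sketch): b is less than the bit width
   of unsigned, otherwise 1U << b is itself undefined.  */
static int shl_as_unsigned_mul(int a, unsigned b)
{
    unsigned ua = (unsigned)a;     /* (unsigned-type) A */
    unsigned scale = 1U << b;      /* 1U << B           */
    return (int)(ua * scale);      /* (type) (...)      */
}
```

Because the multiplication happens entirely in the unsigned type, the expression stays well defined even for inputs such as `INT_MAX`, where a signed multiply would overflow.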

Re: [PATCH] PR/67682, break SLP groups up if only some elements match

2015-11-05 Thread Alan Lawrence
On 03/11/15 13:39, Richard Biener wrote: > On Tue, Oct 27, 2015 at 6:38 PM, Alan Lawrence wrote: >> >> Say I...P are consecutive, the input would have gaps 0 1 1 1 1 1 1 1. If we >> split the load group, we would want subgroups with gaps 0 1 1 1 and 0 1 1 1? > > As sai

Re: [PATCH 6/6] Make SRA replace constant-pool loads

2015-11-05 Thread Alan Lawrence
On 3 November 2015 at 14:01, Richard Biener wrote: > > Hum. I still wonder why we need all this complication ... Well, certainly I'd love to make it simpler, and if the complication is because I've gone about trying to deal with especially Ada in the wrong way... > I would > expect that if > w

Re: [PATCH] Fix PR68067

2015-11-06 Thread Alan Lawrence
On 28/10/15 13:38, Richard Biener wrote: Applied as follows. Bootstrapped / tested on x86_64-unknown-linux-gnu. Richard. 2015-10-28 Richard Biener * fold-const.c (negate_expr_p): Adjust the division case to properly avoid introducing undefined overflow. (fold_negat

Re: [PATCH] Fix PR68067

2015-11-06 Thread Alan Lawrence
On 06/11/15 10:39, Richard Biener wrote: ../spec2000/benchspec/CINT2000/254.gap/src/polynom.c:358:11: error: location references block not in block tree l1_279 = PHI <1(28), l1_299(33)> ^^^ this is the error to look at! It means that the GC heap will be corrupted quite easily. Thanks, I'll

Re: [PATCH 1/2][ARM] PR/65956 AAPCS update for alignment attribute

2015-11-06 Thread Alan Lawrence
On 04/11/15 13:13, Jakub Jelinek wrote: On Mon, Jul 06, 2015 at 05:38:35PM +0100, Alan Lawrence wrote: Trying to push these now (svn!), patch 2 is going first. I realize my second iteration of patch 1/2 dropped the testcases from the first version. Okay to include those as per https

Re: [PATCH] PR/67682, break SLP groups up if only some elements match

2015-11-09 Thread Alan Lawrence
On 06/11/15 12:55, Richard Biener wrote: > >> + /* GROUP_GAP of the first group now has to skip over the second group >> too. */ >> + GROUP_GAP (first_vinfo) += group2_size; > > Please add a MSG_NOTE debug printf stating that we split the group and > at which element. Done. > I think you want

Re: [PR64164] drop copyrename, integrate into expand

2015-11-10 Thread Alan Lawrence
On 05/11/15 05:08, Alexandre Oliva wrote: [PR67753] fix copy of PARALLEL entry_parm to CONCAT target_reg for gcc/ChangeLog PR rtl-optimization/67753 PR rtl-optimization/64164 * function.c (assign_parm_setup_block): Avoid allocating a stack slot if we don't have a

Re: [Patch AArch64] Switch constant pools to separate rodata sections.

2015-11-10 Thread Alan Lawrence
On 04/11/15 14:26, Ramana Radhakrishnan wrote: True and I've just been reading more of the backend - We could now start using blocks for constant pools as well. So let's do that. How does something like this look ? Tested on aarch64-none-elf - no regressions. 2015-11-04 Ramana Radhakrishnan

Re: [Patch AArch64] Switch constant pools to separate rodata sections.

2015-11-10 Thread Alan Lawrence
On 10/11/15 16:39, Alan Lawrence wrote: Since r229878, I've been seeing FAIL: gcc.dg/attr-weakref-1.c (test for excess errors) UNRESOLVED: gcc.dg/attr-weakref-1.c compilation failed to produce executable (both previously passing) on aarch64-none-elf, aarch64_be-none-elf, and aarch64-none-

Re: [PATCH 6/6] Make SRA replace constant-pool loads

2015-11-12 Thread Alan Lawrence
On 06/11/15 16:29, Richard Biener wrote: >>> 2) You should be able to use fold_ctor_reference directly (in place >> of >>> all your code >>> in case offset and size are readily available - don't remember >> exactly how >>> complete scalarization "walks" elements). Alternatively use >>> fold_const_
