Re: [PATCH 4.9][AArch64][testsuite] Backport r211502: PR/59843 Fix ICE on singleton vector of float on AArch64

2014-07-04 Thread Alan Lawrence
Patch here. Alan Lawrence wrote: No regressions on aarch64-none-elf; new tests passing on aarch64-none-elf, arm-none-eabi, x86_64-unknown-linux-gnu: NA->PASS gcc.dg/vect/vect-singleton_1.c (test for warnings, line 20) NA->PASS gcc.dg/vect/vect-singleton_1.c (test for excess errors)

[PATCH][Testsuite] Disable tests with dg-require-fork for simulated targets

2015-05-18 Thread Alan Lawrence
Simulators such as qemu report the presence of fork (it's in glibc) but generally do not support synchronization primitives between threads, so any tests using fork are unreliable. This patch disables the subset of such tests that identify themselves using dg-require-fork. At present, such tes

[PATCH/RFC] Make loop-header-copying more aggressive, rerun before tree-if-conversion

2015-05-22 Thread Alan Lawrence
This example which I wrote to test ifconversion, currently fails to if-convert or vectorize: int foo () { for (int i = 0; i < 32 ; i++) { int m = (a[i] & i) ? 5 : 4; b[i] = a[i] * m; } } ...because jump-threading in dom1 rearranged the loop into a form that neither if-con
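For readers unfamiliar with if-conversion, here is a minimal hand-written sketch of the transformation the loop above is meant to exercise. It is not compiler output. The declarations of `a` and `b`, the constant `N`, and the helper names `foo_branchy`, `foo_ifconverted` and `check_forms_agree` are assumptions made for illustration; the mail does not show them.

```c
#include <assert.h>

#define N 32

/* Assumed declarations: the mail uses a[] and b[] without showing them. */
int a[N];
int b[N];

/* The loop from the mail, with the select written as an explicit branch. */
void foo_branchy(void)
{
    for (int i = 0; i < N; i++) {
        int m;
        if (a[i] & i)
            m = 5;
        else
            m = 4;
        b[i] = a[i] * m;
    }
}

/* Hand if-converted equivalent: the condition becomes a 0/1 value, so the
   loop body is straight-line code with no control flow inside it. */
void foo_ifconverted(int *out)
{
    for (int i = 0; i < N; i++) {
        int cond = (a[i] & i) != 0;   /* 1 where the branch would be taken */
        int m = 4 + cond;             /* 5 when cond is 1, otherwise 4 */
        out[i] = a[i] * m;
    }
}

/* Fills a[] with a simple pattern and checks that both forms agree. */
void check_forms_agree(void)
{
    int out[N];
    for (int i = 0; i < N; i++)
        a[i] = 3 * i + 1;
    foo_branchy();
    foo_ifconverted(out);
    for (int i = 0; i < N; i++)
        assert(b[i] == out[i]);
}
```

Calling `check_forms_agree()` from a test driver confirms the two loops compute the same `b[]`; whether GCC actually if-converts or vectorizes the first form depends on the pass ordering discussed in the mail.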

Re: [PATCH 13/14][ARM/AArch64 testsuite] Use gcc-dg-runtest in advsimd-intrinsics.exp

2015-05-26 Thread Alan Lawrence
Christophe Lyon wrote: On 22 April 2015 at 19:36, Alan Lawrence wrote: In the first revision of Christophe Lyon's advsimd-intrinsics tests, https://gcc.gnu.org/ml/gcc-patches/2014-06/msg00532.html , both gcc-dg-runtest (to assemble only) and c-torture-execute were used. In review the g

Re: [PATCH 13/14][ARM/AArch64 testsuite] Use gcc-dg-runtest in advsimd-intrinsics.exp

2015-05-28 Thread Alan Lawrence
Christophe Lyon wrote: On 26 May 2015 at 18:25, Alan Lawrence wrote: I don't see this symptom - I am able to execute such subsets with either my, or Sandra's, advsimd-intrinsics.exp. I didn't try to run with your patch, I thought it was an oversight of yours. Sorry, indeed I'

Re: [PATCH 13/14][ARM/AArch64 testsuite] Use gcc-dg-runtest in advsimd-intrinsics.exp

2015-05-28 Thread Alan Lawrence
Christophe Lyon wrote: So in fact, except for the comment about '-w' it seems your initial patch was mostly OK, right? Well, my removing a bunch of that c-torture-init stuff, was what was causing the "-Og -g" variant to go missing, but apart from that, yes. --Alan

[PATCH][ARM/AArch64 Testsuite] Cleanup advsimd-intrinsics.exp, removing unnecessary loop

2015-05-28 Thread Alan Lawrence
I've tested this on aarch64, aarch64_be, and arm, and in all cases, the same tests are executed (whether running the whole advsimd-intrinsics.exp, or manually specifying a single file). AFAICT the loop, explicit runtest_file_p, and gcc_set_parallelization_enable, all stem from a point where we w

Re: [PATCH][ARM/AArch64 Testsuite] Cleanup advsimd-intrinsics.exp, removing unnecessary loop

2015-05-29 Thread Alan Lawrence
Christophe Lyon wrote: This looks OK, but why can't you also drop the other torture-related lines as you did in your previous patch? I mean: load_lib c-torture.exp load_lib torture-options.exp etc... We need c-torture.exp in order to set-torture-options; we need to set-torture-options to get

Re: [PATCH][Testsuite] Disable tests with dg-require-fork for simulated targets

2015-06-02 Thread Alan Lawrence
Christophe Lyon wrote: On 18 May 2015 at 20:25, Mike Stump wrote: On May 18, 2015, at 8:01 AM, Alan Lawrence wrote: Simulators such as qemu report the presence of fork (it's in glibc) but generally do not support synchronization primitives between threads, so any tests using for

Re: [PATCH] Fix eipa_sra AAPCS issue (PR target/65956)

2015-06-02 Thread Alan Lawrence
Richard Earnshaw wrote: On 01/06/15 13:07, Jakub Jelinek wrote: On Thu, May 07, 2015 at 12:16:32PM +0100, Alan Lawrence wrote: So for my two cents, or perhaps three: Any progress on this PR? A P1 bug that affects several packages stalled for a month isn't a very good thing... (not to me

Re: [PATCH] [AArch64] PR63870 Improve error messages for NEON single lane memory access intrinsics

2015-06-08 Thread Alan Lawrence
Thanks for working on this! I'd been fiddling around with a patch with some similar elements to this, but many trials with union types, subregs, etc., all worsened the register allocation and led to more unnecessary shuffling / moves. The only real thing I tried which you don't do here, was to

Re: [PATCH] [AArch64] PR63870 Improve error messages for NEON single lane memory access intrinsics

2015-06-08 Thread Alan Lawrence
Oh, have you tested bigendian? --Alan Charles Baylis wrote: This is another attempt at fixing this PR63870 for AArch64 (ARM is still to come). As before, the Q register variants are handled by moving the check for the lane bounds into builtin expansion. The handling of lane numbers is made con

[Obvious][AArch64 Testsuite] Fix comments in vldN_lane_1.c

2015-04-16 Thread Alan Lawrence
The comments in vldN_lane_1.c say it is testing vld{1,2,3}{,q}_dup. This is wrong, it is testing vld{1,2,3}{,q}_lane, as per test filename; I've pushed the attached as r222148. gcc/testsuite/ChangeLog: gcc.target/aarch64/vldN_lane_1.c: Correct dup->lane in comments. diff --git a/gcc/t

[PATCH][AArch64] Fix PR/65770 vstN_lane on bigendian

2015-04-16 Thread Alan Lawrence
As per bugzilla entry, indices in the generated assembly for bigendian are flipped when they should not be (and, flipped always relative to a Q-register!). This flips the lane indices back again at assembly time, fixing PR. The "indices" contained in the RTL are still wrong for D registers, but

[Obvious][AArch64] arm_neon.h: Remove unnecessary forward declaration of vdup_n_f32

2015-04-17 Thread Alan Lawrence
Committed r222177 after testing on aarch64-none-linux-gnu and aarch64-none-elf. gcc/ChangeLog: config/aarch64/arm_neon.h (vdup_n_f32): Remove forward declaration diff --git a/gcc/config/aarch64/arm_neon.h b/gcc/config/aarch64/arm_neon.h index 71ef027..e9cc825 100644 --- a/gcc/config/aar

[PATCH 0/3][AArch64] DImode vector compares

2015-04-17 Thread Alan Lawrence
Hi, Comparing 64x1 vector types (defined by hand or from arm_neon.h) using GCC vector extensions currently generates very poor assembly code, for example "uint64x1_t foo (uint64x1_t a, uint64x1_t b) { return a >= b; }" generates (at -O3): fmov x0, d0 // 22 movdi_aarch64/12 [length = 4] fmov x
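As a rough sketch of the construct under discussion, the following uses GCC vector extensions with hand-defined one-element 64-bit vector types. The type names `u64x1` and `s64x1` and the helpers `ge_u64x1` and `ge_lane0` are hypothetical stand-ins, not the `arm_neon.h` definitions. The example only shows the semantics of the comparison (all-ones or zero per lane); it says nothing about the code quality on AArch64.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical hand-defined 64x1 vector types (GCC vector extensions). */
typedef uint64_t u64x1 __attribute__((vector_size(8)));
typedef int64_t  s64x1 __attribute__((vector_size(8)));

/* Lane-wise unsigned >=. Each result lane is -1 (all ones) when the
   comparison holds and 0 otherwise. The cast is a same-size vector cast,
   which reinterprets the bits. */
s64x1 ge_u64x1(u64x1 a, u64x1 b)
{
    return (s64x1)(a >= b);
}

/* Convenience wrapper: build one-lane vectors and read back lane 0. */
int64_t ge_lane0(uint64_t a, uint64_t b)
{
    u64x1 va = { a };
    u64x1 vb = { b };
    s64x1 r = ge_u64x1(va, vb);
    return r[0];
}

/* A few sanity checks, including a value that is only "large" if the
   comparison is unsigned. */
void check_ge(void)
{
    assert(ge_lane0(5, 3) == -1);
    assert(ge_lane0(3, 5) == 0);
    assert(ge_lane0(UINT64_MAX, 1) == -1);
}
```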

[PATCH 1/3] optabs.c: Make vector_compare_rtx cope with VOIDmode constants (e.g. const0_rtx)

2015-04-17 Thread Alan Lawrence
As per introduction, this allows vector_compare_rtx to work on DImode vectors. Bootstrapped + check-gcc on x86-unknown-linux-gnu. gcc/ChangeLog: * optabs.c (vector_compare_rtx): Handle RTL operands having VOIDmode. diff --git a/gcc/optabs.c b/gcc/optabs.c index f8d584eeeb11a2c19d8c8d88

[PATCH 2/3][AArch64] Add vcond(u?)didi pattern

2015-04-17 Thread Alan Lawrence
This just adds the necessary patterns used for comparisons of DImode vectors. Used as part of arm_neon.h, in next/final patch. Tested on aarch64-none-elf. gcc/ChangeLog: * config/aarch64/aarch64-simd.md (aarch64_vcond_internal, vcond, vcondu,): Add DImode variant. diff --git a/

[PATCH 3/3][AArch64] Idiomatic 64x1 comparisons in arm_neon.h

2015-04-17 Thread Alan Lawrence
This also makes the existing intrinsics tests apply to the new patterns. Tested on aarch64-none-elf. gcc/ChangeLog: * config/aarch64/arm_neon.h (vceq_s64, vceq_u64, vceqz_s64, vceqz_u64, vcge_s64, vcge_u64, vcgez_s64, vcgt_s64, vcgt_u64, vcgtz_s64, vcle_s64, vcle_u64, vc

[PATCH][AArch64] PR/64134: Make aarch64_expand_vector_init use 'ins' more often

2015-04-17 Thread Alan Lawrence
From https://gcc.gnu.org/bugzilla/show_bug.cgi?id=64134, testcase #define vector __attribute__((vector_size(16))) float a; float b; vector float fb(void) { return (vector float){ 0,0,b,a};} currently produces (correct, but suboptimal): fb: fmov s0, wzr adrp x1, b

[Obvious][AArch64] Delete unused aarch64_simd_emit_pair_result_insn.

2015-04-20 Thread Alan Lawrence
Bootstrapped on aarch64-none-linux-gnu. Pushed as r34. gcc/ChangeLog: * config/aarch64/aarch64.c (aarch64_simd_emit_pair_result_insn): Delete. * config/aarch64/aarch64-protos.h (aarch64_simd_emit_pair_result_insn): Delete.

Re: [Obvious][AArch64] Delete unused aarch64_simd_emit_pair_result_insn.

2015-04-20 Thread Alan Lawrence
Oops, missed off the patch actually pushed. Attached now. Cheers, Alan Alan Lawrence wrote: Bootstrapped on aarch64-none-linux-gnu. Pushed as r34. gcc/ChangeLog: * config/aarch64/aarch64.c (aarch64_simd_emit_pair_result_insn): Delete. * config/aarch64/aarch64-protos.h

[PATCH 0/14][ARM/AArch64] __FP16 support, vectors, intrinsics, testsuite

2015-04-22 Thread Alan Lawrence
This patch series adds support for ARM Neon float16x4_t and float16x8_t vector types and intrinsics, and the __fp16 type, on both ARM and AArch64, and extends the tests in Christophe Lyon's advsimd-intrinsics testsuite to cover these. (I chose to extend the existing tests rather than add new one

Re: [PATCH 1/2][ARM] PR/63870: Add qualifier to check lane bounds in expand

2015-04-22 Thread Alan Lawrence
Ping (https://gcc.gnu.org/ml/gcc-patches/2015-01/msg01422.html). These are required for float16 patches posted at https://gcc.gnu.org/ml/gcc-patches/2015-04/msg01332.html . Bootstrapped + check-gcc on arm-none-linux-gnueabihf. Alan Lawrence wrote: This is based loosely upon svn r217440

Re: [PATCH 2/2][ARM] PR/63870: Add a __builtin_lane_check

2015-04-22 Thread Alan Lawrence
Ping (https://gcc.gnu.org/ml/gcc-patches/2015-01/msg01436.html). These are required for float16 patches posted at https://gcc.gnu.org/ml/gcc-patches/2015-04/msg01332.html Bootstrapped + check-gcc on arm-none-linux-gnueabihf. Alan Lawrence wrote: This parallels the present form of

[PATCH 2/14][ARM]Add float16x8_t type

2015-04-22 Thread Alan Lawrence
Identical to https://gcc.gnu.org/ml/gcc-patches/2015-01/msg01438.html . Bootstrapped on arm-none-linux-gnueabihf. commit bc582bd6a0ed7c7c91fc834603fc573ed745b1a7 Author: Alan Lawrence Date: Mon Dec 8 18:40:24 2014 + Add float16x8_t + V8HFmode support (regardless of -mfp16-format

[PATCH 1/14][ARM] Add float16x4_t intrinsics

2015-04-22 Thread Alan Lawrence
This is a respin of https://gcc.gnu.org/ml/gcc-patches/2015-01/msg01437.html , but fixes a wrong 'lane index out of bounds' error on vget_lane_f16 and vset_lane_f16, and drops vdup_n_f16 and vdup_lane_f16, as these are not in the ACLE spec. As previously, these use GCC vector extensions to maxim

[PATCH 3/14][ARM] Add float16x8_t intrinsics

2015-04-22 Thread Alan Lawrence
This is a respin of https://gcc.gnu.org/ml/gcc-patches/2015-01/msg01439.html , again fixing a wrong 'lane index out of bounds' error for vgetq_lane_f16 and vsetq_lane_f16 at -O0, and dropping vdupq_n_f16 and vdupq_lane_f16 as these are not in the ACLE spec. The vld1, vldN, vldN_lane and corres

[PATCH 4/14][ARM] Remaining float16 intrinsics: vld..., vst..., vget_low|high, vcombine

2015-04-22 Thread Alan Lawrence
This is a respin of https://gcc.gnu.org/ml/gcc-patches/2015-01/msg01440.html ; changes are to add in several missing vst... intrinsics, and fix a missing iterator V_uf_sclr used in vec_extract. These intrinsics are all made from patterns in neon.md, and are all tied together by iterators - I'v

[PATCH 5/14][AArch64] Add basic fp16 support

2015-04-22 Thread Alan Lawrence
This adds basic support for moving __fp16 values around, passing and returning, and operating on them by promoting to 32-bit floats. Also a few scalar testcases. Note I've not got an fmov (immediate) variant, because there is no 'fmov h, ...' - the only way to load a 16-bit immediate is to rein

[PATCH 5/14][AArch64] Add basic fp16 support

2015-04-22 Thread Alan Lawrence
[Resending with correct in-reply-to header] This adds basic support for moving __fp16 values around, passing and returning, and operating on them by promoting to 32-bit floats. Also a few scalar testcases. Note I've not got an fmov (immediate) variant, because there is no 'fmov h, ...' - the

[PATCH 6/14][AArch64] Add support for float16x{4,8}_t vectors/builtins

2015-04-22 Thread Alan Lawrence
This adds some basic intrinsics - vget_lane, vset_lane, vld1_lane, vld1, vst1 - for float16 types, and the necessary support in the builtin generator, basic patterns for moving values around, etc. Other intrinsics will follow in later patches. I've extended the existing testcases in aarch64/,

[PATCH 7/14][AArch64] vld{2,3,4}{,_lane,_dup},vcombine,vcreate

2015-04-22 Thread Alan Lawrence
gcc/ChangeLog: * config/aarch64/aarch64.c (aarch64_split_simd_combine): Add V4HFmode. * config/aarch64/aarch64-builtins.c (VAR13, VAR14): New. (aarch64_scalar_builtin_types, aarch64_init_simd_builtin_scalar_types): Add __builtin_aarch64_simd_hf. * config/aa

[PATCH 8/14][AArch64]Add vreinterpret, float_truncate_lo/hi, vget_low/high

2015-04-22 Thread Alan Lawrence
gcc/ChangeLog: * config/aarch64/aarch64-simd.md (aarch64_float_truncate_lo_v2sf): Reparameterize to... (aarch64_float_truncate_lo_): ...this, for both V2SF and V4HF. (aarch64_float_truncate_hi_v4sf): Reparameterize to... (aarch64_float_truncate_hi_): ...thi

[PATCH 9/14][AArch64] vld1(q?)_dup, missing vreinterpretq intrinsics

2015-04-22 Thread Alan Lawrence
gcc/ChangeLog: * config/aarch64/arm_neon.h (vreinterpretq_p8_f16, vreinterpretq_p16_f16, vreinterpretq_f32_f16, vreinterpretq_f64_f16, vreinterpretq_s64_f16, vreinterpretq_s8_f16, vreinterpretq_s16_f16, vreinterpretq_s32_f16, vreinterpretq_u8_f16, vreinterpretq_u16

[PATCH 10/14][AArch64] Add vcvt(_high)?_f32_f16 intrinsics

2015-04-22 Thread Alan Lawrence
This adds the two remaining widening intrinsics, first adding patterns in aarch64-simd.md, then entries in aarch64-simd-builtins.def, and finally intrinsics in arm_neon.h . Note this changes the vector indices present in the RTL on bigendian for float vec_unpacks, to be the same as for integer

[PATCH 11/14][fold-const.c] Fix bigendian HFmode in native_interpret_real

2015-04-22 Thread Alan Lawrence
. commit f8ad02fecdb7b6f91bab77cc154a246bd719ac20 Author: Alan Lawrence Date: Thu Apr 9 10:54:40 2015 +0100 Fix native_interpret_real for HFmode floats on Bigendian with UNITS_PER_WORD>=4 (with missing space) diff --git a/gcc/fold-const.c b/gcc/fold-const.c index 6d085b1..52bc8e9 100644 --- a/gcc/fold-const.

[PATCH 12/14][ARM/AArch64 Testsuite] Update advsimd-intrinsics tests to add float16 vectors

2015-04-22 Thread Alan Lawrence
This is a fairly straightforward addition of a new type: I've added it in on equal status to the other types, because the various vector-load/store/element-manipulating intrinsics, are *not* conditional on HW support. (They just involve moving 16-bit chunks around, just like s16/u16/p16). Thus

[PATCH 13/14][ARM/AArch64 testsuite] Use gcc-dg-runtest in advsimd-intrinsics.exp

2015-04-22 Thread Alan Lawrence
In the first revision of Christophe Lyon's advsimd-intrinsics tests, https://gcc.gnu.org/ml/gcc-patches/2014-06/msg00532.html , both gcc-dg-runtest (to assemble only) and c-torture-execute were used. In review the gcc-dg-runtest part was then dropped, and execution tests continued using c-tortur

[PATCH 14/14][ARM/AArch64 testsuite] Test float16_t vcvt_* intrinsics

2015-04-22 Thread Alan Lawrence
This adds a test of vcvt_f32_f16 and vcvt_f16_f32, also vcvt_high_f32_f16 and vcvt_high_f16_f32. On ARM, we pass additional option -mfpu=neon-fp16 to the compiler (possible following patch 2/3). The compiler is already receiving an option such as -mfpu=neon or -mfpu=crypto-neon-fp-armv8, but p

Re: [PATCH 7/14][AArch64] vld{2,3,4}{,_lane,_dup},vcombine,vcreate

2015-04-22 Thread Alan Lawrence
Alan Lawrence wrote: gcc/ChangeLog: * config/aarch64/aarch64.c (aarch64_split_simd_combine): Add V4HFmode. * config/aarch64/aarch64-builtins.c (VAR13, VAR14): New. (aarch64_scalar_builtin_types, aarch64_init_simd_builtin_scalar_types): Add

[PATCH] Remove some restrictions on loop shape in tree-if-conv.c

2015-04-28 Thread Alan Lawrence
Tree if-conversion currently bails out for loops that (a) contain nested loops; (b) have more than one exit; (c) where the exit block (source of the exit edge) does not dominate the loop latch; (d) where the exit block is the loop header, or there are statements after the exit. This patch remo

Re: [PATCH] Remove some restrictions on loop shape in tree-if-conv.c

2015-04-28 Thread Alan Lawrence
Alan Lawrence wrote: Tree if-conversion currently bails out for loops that (a) contain nested loops; (b) have more than one exit; (c) where the exit block (source of the exit edge) does not dominate the loop latch; (d) where the exit block is the loop header, or there are statements after the

[PATCH,PING][ARM]Remove vec_shr and vec_shl optabs

2015-04-28 Thread Alan Lawrence
No new code here ;). There is a slight change of execution path, i.e. some VEC_PERM_EXPRs (e.g. those for reductions via shifts) will be expanded using arm_expand_vec_perm_const rather than the vec_shr pattern. This generates EXT instructions equivalent to the original, but using the mode of the s

Re: [PATCH] Remove some restrictions on loop shape in tree-if-conv.c

2015-04-29 Thread Alan Lawrence
Sorry, I realize I forgot to attach the patch to the original email, this followed a couple of minutes later in message <553f91b9.7050...@arm.com> at https://gcc.gnu.org/ml/gcc-patches/2015-04/msg01745.html . Cheers, Alan Jeff Law wrote: On 04/28/2015 07:55 AM, Alan Lawrence wrote: T

Re: [PATCH][AArch64] Fix PR/65770 vstN_lane on bigendian

2015-04-29 Thread Alan Lawrence
Alan Lawrence wrote: As per bugzilla entry, indices in the generated assembly for bigendian are flipped when they should not be (and, flipped always relative to a Q-register!). This flips the lane indices back again at assembly time, fixing PR. The "indices" contained in the RTL

Re: [PATCH] Remove some restrictions on loop shape in tree-if-conv.c

2015-04-30 Thread Alan Lawrence
Richard Biener wrote: On Tue, Apr 28, 2015 at 3:55 PM, Alan Lawrence wrote: Tree if-conversion currently bails out for loops that (a) contain nested loops; (b) have more than one exit; (c) where the exit block (source of the exit edge) does not dominate the loop latch; (d) where the exit block

Re: [PATCH 1/3] optabs.c: Make vector_compare_rtx cope with VOIDmode constants (e.g. const0_rtx)

2015-05-01 Thread Alan Lawrence
Alan Lawrence wrote: As per introduction, this allows vector_compare_rtx to work on DImode vectors. Bootstrapped + check-gcc on x86-unknown-linux-gnu. gcc/ChangeLog: * optabs.c (vector_compare_rtx): Handle RTL operands having VOIDmode. Ping. (DImode vectors are explicitly allowed

Re: [PATCH 0/3][AArch64] DImode vector compares

2015-05-05 Thread Alan Lawrence
Alan Lawrence wrote: Hi, Comparing 64x1 vector types (defined by hand or from arm_neon.h) using GCC vector extensions currently generates very poor assembly code, for example "uint64x1_t foo (uint64x1_t a, uint64x1_t b) { return a >= b; }" generates (at -O3): fmov x0, d0 // 22

Re: [PATCH] Fix eipa_sra AAPCS issue (PR target/65956)

2015-05-07 Thread Alan Lawrence
Richard Biener wrote: On May 5, 2015 4:33:58 PM GMT+02:00, Richard Earnshaw wrote: On 05/05/15 15:33, Richard Earnshaw wrote: On 05/05/15 15:29, Jakub Jelinek wrote: On Tue, May 05, 2015 at 02:20:43PM +0100, Richard Earnshaw wrote: On 05/05/15 14:06, Jakub Jelinek wrote: For the middle-end

Re: Vectorize stores with unknown stride

2015-05-07 Thread Alan Lawrence
(Below are all minor/style points only, no reason for patch not to go in.) Michael Matz wrote: diff --git a/gcc/tree-vect-data-refs.c b/gcc/tree-vect-data-refs.c index 96afc7a..6d8f17e 100644 --- a/gcc/tree-vect-data-refs.c +++ b/gcc/tree-vect-data-refs.c @@ -665,7 +665,7 @@ vect_compute_data_re

Re: Vectorize stores with unknown stride

2015-05-07 Thread Alan Lawrence
NP, and sorry for the spurious comments, hadn't spotted you were using nunits. I like the testcase, thanks :). A. Michael Matz wrote: On Thu, 7 May 2015, Alan Lawrence wrote: Also update comment? (5 identical cases) Also update comment? Obviously a good idea, thanks :) (s/loads/acc

Re: [PATCH 5/14][AArch64] Add basic fp16 support

2015-05-08 Thread Alan Lawrence
Joseph Myers wrote: > I'd think it would be desirable to share tests between ARM and AArch64 as far as possible (where applicable to both - so not the tests for the alternative format, and some of the gcc.target/arm/fp16-* tests using scan-assembler might need adapting to work for AArch64).

Re: [PATCH 0/14][ARM/AArch64] __FP16 support, vectors, intrinsics, testsuite

2015-05-08 Thread Alan Lawrence
Alan Lawrence wrote: This patch series adds support for ARM Neon float16x4_t and float16x8_t vector types and intrinsics, and the __fp16 type, on both ARM and AArch64, and extends the tests in Christophe Lyon's advsimd-intrinsics testsuite to cover these. (I chose to extend the existing

Re: [PATCH 1/2][ARM] PR/63870: Add qualifier to check lane bounds in expand

2015-05-08 Thread Alan Lawrence
Alan Lawrence wrote: Ping (https://gcc.gnu.org/ml/gcc-patches/2015-01/msg01422.html). These are required for float16 patches posted at https://gcc.gnu.org/ml/gcc-patches/2015-04/msg01332.html . Bootstrapped + check-gcc on arm-none-linux-gnueabihf. Alan Lawrence wrote: This is based loosely

[PATCH][AArch64] Remove/merge redundant iterators

2014-11-13 Thread Alan Lawrence
Hi, gcc/config/aarch64/iterators.md contains numerous duplicates - not always obvious as they are not always sorted the same. Sometimes, one copy is used in aarch64-simd-builtins.def and another in aarch64-simd.md; other times there is no obvious pattern ;). This patch just removes all the du

[PATCH 0/3][AArch64]More intrinsics/builtins improvements

2014-11-14 Thread Alan Lawrence
These three are logically independent, but all on a common theme, and I've tested them all together by: bootstrapped + check-gcc on aarch64-none-elf; cross-tested check-gcc on aarch64_be-none-elf. Ok for trunk?

[PATCH 1/3][AArch64]Replace __builtin_aarch64_createv1df with a cast, cleanup

2014-11-14 Thread Alan Lawrence
Now that float64x1_t is a vector, casting to it from a uint64_t causes the bit pattern to be reinterpreted, just as vcreate_f64 should. (Previously when float64x1_t was still a scalar, casting caused a conversion.) Hence, replace the __builtin with a cast. None of the other variants of the aarch
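The distinction between value conversion and bit reinterpretation can be sketched in portable-ish C with GCC vector extensions. The type `f64x1` and the function names below are hypothetical stand-ins for `float64x1_t` and the intrinsic; the example assumes IEEE-754 binary64 doubles and a compiler that accepts same-size scalar-to-vector casts (GCC does).

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for float64x1_t: a one-element vector of double. */
typedef double f64x1 __attribute__((vector_size(8)));

/* Scalar cast: converts the VALUE, so 1 becomes 1.0. */
double convert_u64(uint64_t x)
{
    return (double)x;
}

/* Bit reinterpretation done explicitly with memcpy. */
double reinterpret_u64(uint64_t x)
{
    double d;
    memcpy(&d, &x, sizeof d);
    return d;
}

/* Cast from a 64-bit integer to a same-size vector: reinterprets the
   BITS, which is the behaviour the mail relies on. */
f64x1 cast_to_vector(uint64_t x)
{
    return (f64x1)x;
}

/* 0x3FF0000000000000 is the IEEE-754 binary64 encoding of 1.0. */
void check_casts(void)
{
    const uint64_t one_bits = UINT64_C(0x3FF0000000000000);
    f64x1 v = cast_to_vector(one_bits);
    assert(convert_u64(1) == 1.0);
    assert(reinterpret_u64(one_bits) == 1.0);
    assert(v[0] == 1.0);
    assert(convert_u64(one_bits) != 1.0);   /* value conversion differs */
}
```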

[PATCH 2/3][AArch64] Extend aarch64_simd_vec_set pattern, replace asm for vld1_lane

2014-11-14 Thread Alan Lawrence
The vld1_lane intrinsic is currently implemented using inline asm. This patch replaces that with a load and a straightforward use of vset_lane (this gives us correct bigendian lane-flipping in a simple manner). Naively this would produce assembler along the lines of (for vld1_lane_u8):

[PATCH 3/3][AArch64]Replace temporary assembler for vld1_dup

2014-11-14 Thread Alan Lawrence
This patch replaces the inline asm for vld1_dup intrinsics with a vdup_n_ and a load from the pointer. The existing *aarch64_simd_ld1r insn, combiner, etc., are quite capable of generating the expected single ld1r instruction from this. (I've verified by inspecting assembler output.) gcc/Chang

Re: [PATCH 0/4][Vectorizer] Reductions: replace VEC_RSHIFT_EXPR with VEC_PERM_EXPR

2014-11-14 Thread Alan Lawrence
Ah, I didn't realize Loongson was little-endian only. In that case (with mid-end reductions-via-shifts changes pushed) I don't think I have actually broken anything, or at least, no MIPS platform that exists :). However, yes, that would seem a safe bet (and simpler than my linked patch that pr

[PATCH][AArch64]Add vec_shr pattern for 64-bit vectors using ush{l,r}; enable tests.

2014-11-14 Thread Alan Lawrence
Following recent vectorizer changes to reductions via shifts, AArch64 will now reduce loops such as this unsigned char in[8] = {1, 3, 5, 7, 9, 11, 13, 15}; int main (unsigned char argc, char **argv) { unsigned char prod = 1; /* Prevent constant propagation of the entire loop below. */ a

Re: [PATCH][AArch64]Add vec_shr pattern for 64-bit vectors using ush{l,r}; enable tests.

2014-11-14 Thread Alan Lawrence
...Patch attached... Alan Lawrence wrote: Following recent vectorizer changes to reductions via shifts, AArch64 will now reduce loops such as this unsigned char in[8] = {1, 3, 5, 7, 9, 11, 13, 15}; int main (unsigned char argc, char **argv) { unsigned char prod = 1; /* Prevent

PUSHED: [PATCH 14/14][Vectorizer] Tidy up vect_create_epilog / use_scalar_result

2014-11-14 Thread Alan Lawrence
After recent updates, tree-vect-loop.c is in the same state as when this cleanup patch was first written and approved, so I've just pushed it as r/217580. Cheers, Alan Richard Biener wrote: On Thu, Sep 18, 2014 at 2:48 PM, Alan Lawrence wrote: Following earlier pa

Re: [PATCH 0/3][AArch64]More intrinsics/builtins improvements

2014-11-17 Thread Alan Lawrence
Ah, sorry for the duplication of effort. And thanks for the heads-up about upcoming work! I don't think I have any plans for any of those others at the moment. In the case of vld1_dup, however, I'm going to argue that my approach (https://gcc.gnu.org/ml/gcc-patches/2014-11/msg01718.html) is bet

Re: [PATCH][AArch64]Add vec_shr pattern for 64-bit vectors using ush{l,r}; enable tests.

2014-11-17 Thread Alan Lawrence
I confirm no regressions on aarch64_be-none-elf. --Alan Alan Lawrence wrote: ...Patch attached... Alan Lawrence wrote: Following recent vectorizer changes to reductions via shifts, AArch64 will now reduce loops such as this unsigned char in[8] = {1, 3, 5, 7, 9, 11, 13, 15}; int main

[PATCH][AArch64] Fix __builtin_aarch64_absdi, must not fold to ABS_EXPR

2014-11-17 Thread Alan Lawrence
...as the former is defined as returning MIN_VALUE for argument MIN_VALUE, whereas the latter is 'undefined', and gcc can optimize "abs(x)>=0" to "true", which is wrong for __builtin_aarch64_abs. There has been much debate here, although not recently - I think the last was https://gcc.gnu.org/
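To make the difference concrete, here is a small sketch of an absolute value with the wrapping behaviour described above, where the minimum value maps to itself. The name `wrapping_abs64` is hypothetical and this is only a scalar model, not the builtin's implementation. The final conversion back to `int64_t` is implementation-defined in ISO C for out-of-range values; GCC defines it as reduction modulo 2^64, which is what the sketch assumes.

```c
#include <assert.h>
#include <stdint.h>

/* Wrapping absolute value: INT64_MIN maps to INT64_MIN instead of being
   undefined. The negation is done in unsigned arithmetic, which is
   well-defined modulo 2^64. */
int64_t wrapping_abs64(int64_t x)
{
    uint64_t ux = (uint64_t)x;
    if (x < 0)
        ux = 0 - ux;
    /* Assumed: out-of-range unsigned-to-signed conversion wraps (GCC). */
    return (int64_t)ux;
}

/* With this definition "abs(x) >= 0" is NOT always true, so folding the
   comparison to "true" would be wrong. */
void check_wrapping_abs(void)
{
    assert(wrapping_abs64(-5) == 5);
    assert(wrapping_abs64(5) == 5);
    assert(wrapping_abs64(INT64_MIN) == INT64_MIN);
    assert(!(wrapping_abs64(INT64_MIN) >= 0));
}
```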

[PATCH][ARM] Remove vec_shr and vec_shl optabs

2014-11-17 Thread Alan Lawrence
No new code here ;). There is a slight change of execution path, i.e. some VEC_PERM_EXPRs (e.g. those for reductions via shifts) will be expanded using arm_expand_vec_perm_const rather than the vec_shr pattern. This generates EXT instructions equivalent to the original, but using the mode of the s

[PATCH][AArch64]Tidy up aarch64_simd_expand_args

2014-11-17 Thread Alan Lawrence
This is a pure tidyup, no new functionality. Changes are (1) Use op[0] to store the result operand, rather than a separate variable, thus combining the two large switch statements into one; (2) The 'arg' and 'mode' arrays were (almost-)only ever used to store data *within* each iteration, so tur

Re: [PATCH][AArch64] Add bounds checking to vqdm*_lane intrinsics via a qualifier that also flips endianness

2014-11-19 Thread Alan Lawrence
On 12 November 2014 15:35, Alan Lawrence wrote: Nice! One nit - can the extra "tree" argument be a "const_tree" ? - I'll defer to the maintainers on the use of C++ default arguments in the AArch64 backend. But LGTM. Thanks, good catch. The default parameter will go away once a

Re: [PING][PATCH] Change contrib/test_installed for testing cross compilers

2014-11-24 Thread Alan Lawrence
Having just been experimenting with testing of installed compilers - yes something like this could be useful, however: to do cross-testing I found I also (a) had to set my target_list; so either an extra flag for that, or maybe just a generic 'extra_site_flags' parameter? (b) I had to set up som

[PATCH][AArch64]Fix ICE at -O0 on vld1_lane intrinsics

2014-11-25 Thread Alan Lawrence
vld1_lane intrinsics ICE at -O0 because they contain a call to the vset_lane intrinsics, through which the lane index is not constant-propagated. (They are fine at -O1 and higher!). This fixes the ICE by replacing said call by a macro. Rather than defining many individual macros __aarch64_vset

Ping with testcase: [PATCH][AArch64] Fix __builtin_aarch64_absdi, must not fold to ABS_EXPR

2014-11-26 Thread Alan Lawrence
So in case there's any confusion about the behaviour expected of *the vabs intrinsic*, here's a testcase (failing without patch, passing with it)... --Alan Alan Lawrence wrote: ...as the former is defined as returning MIN_VALUE for argument MIN_VALUE, whereas the latter is '

Re: [PING][PATCH] Change contrib/test_installed for testing cross compilers

2014-11-27 Thread Alan Lawrence
rticular to my setup!! --Alan Jeff Law wrote: On 11/24/14 09:51, Alan Lawrence wrote: Having just been experimenting with testing of installed compilers - yes something like this could be useful, however: to do cross-testing I found I also (a) had to set my target_list; so either an extra flag for t

Re: Ping with testcase: [PATCH][AArch64] Fix __builtin_aarch64_absdi, must not fold to ABS_EXPR

2014-12-03 Thread Alan Lawrence
On Wed, Nov 26, 2014 at 04:35:50PM +, James Greenhalgh wrote: > Why do we want to turn off folding for the V4SF/V2SF/V2DF modes of these > intrinsics? There should be no difference between the mid-end definition > and the intrinsic definition of their behaviour. Good point. Done. > I also no

Pushed: [PATCH][AArch64] Remove/merge redundant iterators

2014-12-03 Thread Alan Lawrence
no uses, remove; * VDQM and VQ_S duplicate VDQ_BHSI, use the latter; * VDIC and VDW duplicate VD_BHSI, use the latter; ...committed as r218310. Marcus Shawcroft wrote: On 13 November 2014 10:38, Alan Lawrence wrote: Hi, gcc/config/aarch64/iterators.md contains numerous duplicates - not

Re: [PATCH][AArch64]Fix ICE at -O0 on vld1_lane intrinsics

2014-12-03 Thread Alan Lawrence
Ping. Alan Lawrence wrote: vld1_lane intrinsics ICE at -O0 because they contain a call to the vset_lane intrinsics, through which the lane index is not constant-propagated. (They are fine at -O1 and higher!). This fixes the ICE by replacing said call by a macro. Rather than defining many

Re: [PATCH][RFC] Fix PR63155

2015-03-18 Thread Alan Lawrence
Following this patch (r221318), we're seeing what appears to be a miscompile of glibc on AArch64. This causes quite a bunch of tests to fail, segfaults etc., if LD_LIBRARY_PATH leads to a libc.so.6 built with that patch vs without (same glibc sources). We are still working on a reduced testcase,

[PATCH][AArch64][Testsuite] Fix gcc.target/aarch64/c-output-template-3.c

2015-03-24 Thread Alan Lawrence
Following Richard Biener's patch at https://gcc.gnu.org/ml/gcc-patches/2015-03/msg01064.html (r221532), gcc.target/aarch64/c-output-template-3.c fails with: c-output-template-3.c: In function 'test': c-output-template-3.c:7:5: error: impossible constraint in 'asm' __asm__ ("@ %c0" : : "S"

Re: [PATCH][AArch64][Testsuite] Fix gcc.target/aarch64/c-output-template-3.c

2015-03-24 Thread Alan Lawrence
") [flags 0x3] ) (const_int 4 [0x4]))) but following Richard's patch the constraint is evaluated only on: (reg/f:DI 73 [ D.2670 ]) --Alan Alan Lawrence wrote: Following Richard Biener's patch at https://gcc.gnu.org/ml/gcc-patches/2015-03/msg01064.html (r221532), gcc.target/

[Obvious] Fix libstdc++/33394 testcase when cross-testing linux

2015-03-25 Thread Alan Lawrence
When cross-testing, the -DITERATIONS=1000 flag replaced the -pthread required for linux targets, so the test failed to build. I've pushed the following test fix as r221666: Index: libstdc++-v3/testsuite/21_strings/basic_string/pthread33394.cc

Re: [PATCH][AArch64][Testsuite] Fix gcc.target/aarch64/c-output-template-3.c

2015-03-26 Thread Alan Lawrence
egister_constraint which accepts registers only; and define_memory_constraint which accepts memory only). However, I think this is too late in the development cycle for gcc5, and hence, I think the original testcase fix (dg-options "-O") is the best we can do for now (possibly unless we would prefe

[PATCH 4.8][AArch64] Backport r207785 from trunk: Fix PCH on AArch64, (PR pch/60010)

2015-03-27 Thread Alan Lawrence
ommit 39f9a388f15e12f43e3f59c314325cc087eab377 Author: Alan Lawrence Date: Tue Mar 10 12:20:12 2015 + Kyle McMartin patch diff --git a/gcc/config/host-linux.c b/gcc/config/host-linux.c index 1f10823..0774ecf 100644 --- a/gcc/config/host-linux.c +++ b/gcc/config/host-linux.c @@ -86,6 +86,8 @@ # d

New regression on ARM Linux (was: Re: [PATCH] Fix regression caused by PR65310 fix)

2015-03-30 Thread Alan Lawrence
We've been seeing a bunch of new failures in the *libffi* testsuite on ARM Linux (arm-none-linux-gnueabi, arm-none-linux-gnueabihf), following this one-liner fix. I've reduced the testcase down to the attached (including removing any dependency on libffi); with gcc r221347, this prints the expec

Re: New regression on ARM Linux

2015-03-30 Thread Alan Lawrence
...actually attach the testcase... Alan Lawrence wrote: We've been seeing a bunch of new failures in the *libffi* testsuite on ARM Linux (arm-none-linux-gnueabi, arm-none-linux-gnueabihf), following this one-liner fix. I've reduced the testcase down to the attached (including re

Re: New regression on ARM Linux

2015-03-30 Thread Alan Lawrence
n_printf("%d\n", x); } but in that case, the arm_function_arg is still fed a type with alignment 32 (bits), i.e. distinct from the type of the field 'x' in memory, which has alignment 128. --Alan Richard Biener wrote: On Mon, 30 Mar 2015, Richard Biener wrote: On Mon, 30

Re: New regression on ARM Linux

2015-03-31 Thread Alan Lawrence
Richard Biener wrote: On Mon, Mar 30, 2015 at 10:13 PM, Richard Biener wrote: It doesn't make sense to use the alignment of passed values. That looks like bs. This means that int i __aligned__(8); is passed differently than int. arm_function_arg needs to be fixed. That is, typedef int
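
For readers following this thread, here is a minimal sketch of the kind of code under discussion. It is not the attached testcase: the names aligned_int, sum_ints and pass_aligned are hypothetical, and the aligned attribute is a GCC/Clang extension. It shows a typedef of int that carries a larger alignment than plain int, and a value of that type being read back as a plain int after passing through a variadic call.

```c
#include <assert.h>
#include <stdarg.h>
#include <stddef.h>

/* Hypothetical names, for illustration only; not taken from the thread's
   attached testcase.  The aligned attribute is a GCC/Clang extension. */
typedef int aligned_int __attribute__((aligned(8)));

/* Alignment recorded on plain int and on the over-aligned typedef. */
static size_t plain_align(void)   { return __alignof__(int); }
static size_t variant_align(void) { return __alignof__(aligned_int); }

/* Sum `count` variadic arguments; the callee always reads them as int. */
static int sum_ints(int count, ...)
{
    va_list ap;
    int total = 0;

    va_start(ap, count);
    for (int i = 0; i < count; i++)
        total += va_arg(ap, int);
    va_end(ap);
    return total;
}

/* Pass *((aligned_int *)p) as a variadic argument, as in the discussion.
   The callee sees an int, so the extra alignment on the typedef should not
   change how the value is passed. */
static int pass_aligned(void)
{
    aligned_int x = 40;
    aligned_int *p = &x;

    return sum_ints(2, *p, 2);
}
```

On a correctly behaving compiler, pass_aligned() should return the sum of both arguments whatever alignment the typedef carries; the thread concerns a case where argument layout was driven by the alignment of the passed value's type rather than by plain int.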

Re: New regression on ARM Linux

2015-03-31 Thread Alan Lawrence
Richard Biener wrote: But I find it odd that on ARM passing *((aligned_int *)p) as vararg (only as varargs?) changes calling conventions independent of the function's type signature. Does it? Do you have a testcase, and compilation flags, that'll make this show up in an RTL dump? I've tried nu

Re: New regression on ARM Linux

2015-03-31 Thread Alan Lawrence
/03/15 08:50, Richard Biener wrote: On Mon, Mar 30, 2015 at 10:13 PM, Richard Biener wrote: On March 30, 2015 6:45:34 PM GMT+02:00, Alan Lawrence wrote: -O2 was what I first used; it also occurs at -O1. -fno-tree-sra fixes it. The problem appears to be in laying out arguments, specifically

Re: New regression on ARM Linux

2015-03-31 Thread Alan Lawrence
Jakub Jelinek wrote: On Tue, Mar 31, 2015 at 11:47:37AM +0100, Alan Lawrence wrote: Richard Biener wrote: But I find it odd that on ARM passing *((aligned_int *)p) as vararg (only as varargs?) changes calling conventions independent of the function's type signature. Does it? Do you have a

Re: [PATCH, AArch64] Fix PR 65624 (ICE in aarch64-linux-gnueabi crosscompiler on i686 host).

2015-04-01 Thread Alan Lawrence
Looks good to me. Indeed, I'd support this being an "obvious" fix --Alan Maxim Ostapenko wrote: Hi, expanding AArch64 AdvSIMD builtins, aarch64_simd_expand_builtin puts return type and arguments types in args[SIMD_MAX_BUILTIN_ARGS] array and indicates the last argument with SIMD_ARG_STO

Re: New regression on ARM Linux

2015-04-02 Thread Alan Lawrence
Richard Biener wrote: On Tue, 31 Mar 2015, Alan Lawrence wrote: (1) If we wish to keep the AAPCS principle that varargs are passed just as named args, we should use TYPE_MAIN_VARIANT inside arm_needs_doubleword_alignmen

Re: [PATCH][AArch64][Testsuite] Fix gcc.target/aarch64/c-output-template-3.c

2015-04-07 Thread Alan Lawrence
Done - committed as r221905, and PR target/65689 filed on bugzilla. Cheers, Alan James Greenhalgh wrote: On Wed, Mar 25, 2015 at 06:27:49PM +, James Greenhalgh wrote: I think your original patch to add -O is just fine, but Marcus or Richard will need to approve it. I haven't seen any how

Re: [PATCH][AArch64 Intrinsics] Replace temporary assembler for vst1_lane

2015-04-14 Thread Alan Lawrence
Marcus Shawcroft wrote: On 30 January 2015 at 12:09, Alan Lawrence wrote: This was posted towards the end of stage 3, a few days before stage 4 started. Is it now too late to "ping" ? --Alan gcc/ChangeLog: * config/aarch64/arm_neon.h (vst1_lane_f32, vst

Re: [PATCH 1/4] vldN_lane error message enhancements (Q registers)

2015-04-14 Thread Alan Lawrence
April 2015 at 14:45, Alan Lawrence wrote: Assuming/hoping that this patch is proposed for new stage 1 ;), IIRC the approach of using __builtin_aarch64_im_lane_boundsi doesn't work (results in double error messages), and so the patch needs to be rewritten to avoid it. However, thanks for you

Re: [PING^2] [PATCH][5a/5] Postpone expanding va_arg until pass_stdarg

2015-06-09 Thread Alan Lawrence
Hmmm. One side effect of this is that the line number information available in the target hook gimplify_va_arg_expr, is now just the name of the containing function, rather than the specific use of va_arg. Is there some way to get this more precise location (e.g. gimple_location(stmt) in expand_

Re: [PING^2] [PATCH][5a/5] Postpone expanding va_arg until pass_stdarg

2015-06-09 Thread Alan Lawrence
Tom de Vries wrote: On 09/06/15 13:03, Richard Biener wrote: On Tue, 9 Jun 2015, Alan Lawrence wrote: Hmmm. One side effect of this is that the line number information available in the target hook gimplify_va_arg_expr, is now just the name of the containing function, rather than the specific

Re: [PATCH] [AArch64] PR63870 Improve error messages for NEON single lane memory access intrinsics

2015-06-10 Thread Alan Lawrence
Charles Baylis wrote: On 8 June 2015 at 10:33, Alan Lawrence wrote: Thanks for working on this! I'd been fiddling around with a patch with some similar elements to this, but many trials with union types, subregs, etc., all worsened the register allocation and led to more unnecessary shuf

[PATCH *2][AArch64] Fix ICEs with +nofp/-mgeneral-regs-only and improve error messages; clarify docs.

2015-06-11 Thread Alan Lawrence
* gcc.target/aarch64/nofp_1.c: New file. gcc/ChangeLog: * doc/invoke.texi: Clarify AArch64 feature modifiers (no)fp, (no)simd and (no)crypto. commit efbf0f4699ac963472834c912b46b1a3a076fa64 Author: Alan Lawrence Date: Mon Jan 12 15:04:06 2015 + Approved r/3008, rebas

Re: [PATCH] [AArch64] PR63870 Improve error messages for NEON single lane memory access intrinsics

2015-06-17 Thread Alan Lawrence
Looks good to me, but I can't approve. Thanks, Alan Charles Baylis wrote: Ping? On 11 June 2015 at 00:42, Charles Baylis wrote: [resending, as previous version was rejected from the list for html] On 11 June 2015 at 00:38, Charles Baylis wrote: On 8 June 2015 at 10:44, Alan Law
