Patch here.
Alan Lawrence wrote:
No regressions on aarch64-none-elf; new tests passing on aarch64-none-elf,
arm-none-eabi, x86_64-unknown-linux-gnu:
NA->PASS gcc.dg/vect/vect-singleton_1.c (test for warnings, line 20)
NA->PASS gcc.dg/vect/vect-singleton_1.c (test for excess errors)
Simulators such as qemu report the presence of fork (it's in glibc) but
generally do not support synchronization primitives between threads, so any
tests using fork are unreliable. This patch disables the subset of such tests
that identify themselves using dg-require-fork.
At present, such tes
This example which I wrote to test ifconversion, currently fails to if-convert
or vectorize:
int foo ()
{
  for (int i = 0; i < 32 ; i++)
    {
      int m = (a[i] & i) ? 5 : 4;
      b[i] = a[i] * m;
    }
}
...because jump-threading in dom1 rearranged the loop into a form that neither
if-con
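For readers unfamiliar with the transformation, here is a minimal sketch of what if-conversion aims for on a loop of this shape. The array length N, the function names, and passing the arrays as parameters are assumptions made for illustration; the second function is a hand-written equivalent, not the output of GCC's if-conversion pass.

```c
#define N 32

/* Loop in the shape of the example above: the multiplier depends on a
   condition, which the compiler may initially lower to a branch.  */
void
scale_branchy (const int *a, int *b)
{
  for (int i = 0; i < N; i++)
    {
      int m;
      if (a[i] & i)
        m = 5;
      else
        m = 4;
      b[i] = a[i] * m;
    }
}

/* Hand if-converted equivalent: the condition becomes a 0/1 value that is
   added to the base multiplier, leaving a single straight-line loop body.  */
void
scale_select (const int *a, int *b)
{
  for (int i = 0; i < N; i++)
    {
      int m = 4 + ((a[i] & i) != 0);
      b[i] = a[i] * m;
    }
}
```

Both functions compute the same results; the second form has no control flow inside the loop body, which is the property a vectorizer generally needs.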
Christophe Lyon wrote:
On 22 April 2015 at 19:36, Alan Lawrence wrote:
In the first revision of Christophe Lyon's advsimd-intrinsics tests,
https://gcc.gnu.org/ml/gcc-patches/2014-06/msg00532.html , both
gcc-dg-runtest (to assemble only) and c-torture-execute were used. In review
the g
Christophe Lyon wrote:
On 26 May 2015 at 18:25, Alan Lawrence wrote:
I don't see this symptom - I am able to execute such subsets with either my,
or Sandra's, advsimd-intrinsics.exp.
I didn't try to run with your patch, I thought it was an oversight of yours.
Sorry, indeed I
Christophe Lyon wrote:
So in fact, except for the comment about '-w' it seems your initial
patch was mostly OK, right?
Well, my removing a bunch of that c-torture-init stuff, was what was causing the
"-Og -g" variant to go missing, but apart from that, yes.
--Alan
I've tested this on aarch64, aarch64_be, and arm, and in all cases, the same
tests are executed (whether running the whole advsimd-intrinsics.exp, or
manually specifying a single file). AFAICT the loop, explicit runtest_file_p,
and gcc_set_parallelization_enable, all stem from a point where we w
Christophe Lyon wrote:
This looks OK, but why can't you also drop the other torture-related
lines as you did in your previous patch?
I mean:
load_lib c-torture.exp
load_lib torture-options.exp
etc...
We need c-torture.exp in order to set-torture-options; we need to
set-torture-options to get
Christophe Lyon wrote:
On 18 May 2015 at 20:25, Mike Stump wrote:
On May 18, 2015, at 8:01 AM, Alan Lawrence wrote:
Simulators such as qemu report the presence of fork (it's in glibc) but
generally do not support synchronization primitives between threads, so any
tests using for
Richard Earnshaw wrote:
On 01/06/15 13:07, Jakub Jelinek wrote:
On Thu, May 07, 2015 at 12:16:32PM +0100, Alan Lawrence wrote:
So for my two cents, or perhaps three:
Any progress on this PR?
A P1 bug that affects several packages stalled for a month isn't a very good
thing... (not to me
Thanks for working on this!
I'd been fiddling around with a patch with some similar elements to this, but
many trials with union types, subregs, etc., all worsened the register
allocation and led to more unnecessary shuffling / moves. The only real thing I
tried which you don't do here, was to
Oh, have you tested bigendian?
--Alan
Charles Baylis wrote:
This is another attempt at fixing this PR63870 for AArch64 (ARM is
still to come).
As before, the Q register variants are handled by moving the check for
the lane bounds into builtin expansion. The handling of lane numbers
is made con
The comments in vldN_lane_1.c say it is testing vld{1,2,3}{,q}_dup. This is
wrong, it is testing vld{1,2,3}{,q}_lane, as per test filename; I've pushed the
attached as r222148.
gcc/testsuite/ChangeLog:
gcc.target/aarch64/vldN_lane_1.c: Correct dup->lane in comments.
diff --git a/gcc/t
As per bugzilla entry, indices in the generated assembly for bigendian are
flipped when they should not be (and, flipped always relative to a Q-register!).
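As a rough illustration of why the register width matters when flipping, the sketch below assumes the usual big-endian convention that lane n of a vector with nunits lanes maps to index nunits - 1 - n. The helper name flip_lane is hypothetical and is not taken from the patch.

```c
/* Map lane number LANE to its flipped index in a vector of NUNITS lanes,
   assuming the convention lane -> NUNITS - 1 - lane.  */
static inline unsigned
flip_lane (unsigned nunits, unsigned lane)
{
  return nunits - 1 - lane;
}
```

With 16-bit elements, flip_lane (4, 1) gives 2 for a four-lane D register, whereas flipping the same lane relative to an eight-lane Q register gives flip_lane (8, 1), i.e. 6. Flipping a D-register lane against the Q-register width therefore yields an index outside the D register.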
This flips the lane indices back again at assembly time, fixing PR. The
"indices" contained in the RTL are still wrong for D registers, but
Committed r222177 after testing on aarch64-none-linux-gnu and aarch64-none-elf.
gcc/ChangeLog:
config/aarch64/arm_neon.h (vdup_n_f32): Remove forward declaration
diff --git a/gcc/config/aarch64/arm_neon.h b/gcc/config/aarch64/arm_neon.h
index 71ef027..e9cc825 100644
--- a/gcc/config/aar
Hi,
Comparing 64x1 vector types (defined by hand or from arm_neon.h) using GCC
vector extensions currently generates very poor assembly code, for example
"uint64x1_t foo (uint64x1_t a, uint64x1_t b) { return a >= b; }" generates (at -O3):
fmov x0, d0 // 22 movdi_aarch64/12 [length = 4]
fmov x
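A minimal, target-independent sketch of the kind of comparison under discussion, written with GCC vector extensions only. The type names u64x1 and s64x1 and the function name cmp_ge are assumptions for illustration (they stand in for the arm_neon.h types); the sketch shows the semantics of the comparison, not the code generated for it.

```c
#include <stdint.h>

/* Hand-defined one-element 64-bit vector types.  */
typedef uint64_t u64x1 __attribute__ ((vector_size (8)));
typedef int64_t s64x1 __attribute__ ((vector_size (8)));

/* With GCC vector extensions, a comparison yields a signed integer vector
   of the same shape: each element is -1 (all bits set) where the
   comparison holds and 0 where it does not.  */
s64x1
cmp_ge (u64x1 a, u64x1 b)
{
  return a >= b;
}
```

Because the operands are unsigned, the comparison is unsigned too, so a lane holding UINT64_MAX compares greater than or equal to a lane holding 1.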
As per introduction, this allows vector_compare_rtx to work on DImode vectors.
Bootstrapped + check-gcc on x86-unknown-linux-gnu.
gcc/ChangeLog:
* optabs.c (vector_compare_rtx): Handle RTL operands having VOIDmode.
diff --git a/gcc/optabs.c b/gcc/optabs.c
index f8d584eeeb11a2c19d8c8d88
This just adds the necessary patterns used for comparisons of DImode vectors.
Used as part of arm_neon.h, in next/final patch.
Tested on aarch64-none-elf.
gcc/ChangeLog:
* config/aarch64/aarch64-simd.md (aarch64_vcond_internal,
vcond, vcondu,): Add DImode variant.
diff --git a/
This also makes the existing intrinsics tests apply to the new patterns.
Tested on aarch64-none-elf.
gcc/ChangeLog:
* config/aarch64/arm_neon.h (vceq_s64, vceq_u64, vceqz_s64, vceqz_u64,
vcge_s64, vcge_u64, vcgez_s64, vcgt_s64, vcgt_u64, vcgtz_s64, vcle_s64,
vcle_u64, vc
From https://gcc.gnu.org/bugzilla/show_bug.cgi?id=64134, testcase
#define vector __attribute__((vector_size(16)))
float a; float b;
vector float fb(void) { return (vector float){ 0,0,b,a};}
currently produces (correct, but suboptimal):
fb:
fmov s0, wzr
adrp x1, b
Bootstrapped on aarch64-none-linux-gnu.
Pushed as r34.
gcc/ChangeLog:
* config/aarch64/aarch64.c (aarch64_simd_emit_pair_result_insn): Delete.
* config/aarch64/aarch64-protos.h (aarch64_simd_emit_pair_result_insn):
Delete.
Oops, missed off the patch actually pushed. Attached now.
Cheers, Alan
Alan Lawrence wrote:
Bootstrapped on aarch64-none-linux-gnu.
Pushed as r34.
gcc/ChangeLog:
* config/aarch64/aarch64.c (aarch64_simd_emit_pair_result_insn):
Delete.
* config/aarch64/aarch64-protos.h
This patch series adds support for ARM Neon float16x4_t and float16x8_t vector
types and intrinsics, and the __fp16 type, on both ARM and AArch64, and extends
the tests in Christophe Lyon's advsimd-intrinsics testsuite to cover these. (I
chose to extend the existing tests rather than add new one
Ping (https://gcc.gnu.org/ml/gcc-patches/2015-01/msg01422.html).
These are required for float16 patches posted at
https://gcc.gnu.org/ml/gcc-patches/2015-04/msg01332.html .
Bootstrapped + check-gcc on arm-none-linux-gnueabihf.
Alan Lawrence wrote:
This is based loosely upon svn r217440
Ping (https://gcc.gnu.org/ml/gcc-patches/2015-01/msg01436.html).
These are required for float16 patches posted at
https://gcc.gnu.org/ml/gcc-patches/2015-04/msg01332.html
Bootstrapped + check-gcc on arm-none-linux-gnueabihf.
Alan Lawrence wrote:
This parallels the present form of
Identical to https://gcc.gnu.org/ml/gcc-patches/2015-01/msg01438.html .
Bootstrapped on arm-none-linux-gnueabihf.
commit bc582bd6a0ed7c7c91fc834603fc573ed745b1a7
Author: Alan Lawrence
Date: Mon Dec 8 18:40:24 2014 +
Add float16x8_t + V8HFmode support (regardless of -mfp16-format
This is a respin of https://gcc.gnu.org/ml/gcc-patches/2015-01/msg01437.html ,
but fixes a wrong 'lane index out of bounds' error on vget_lane_f16 and
vset_lane_f16, and drops vdup_n_f16 and vdup_lane_f16, as these are not in the
ACLE spec. As previously, these use GCC vector extensions to maxim
This is a respin of https://gcc.gnu.org/ml/gcc-patches/2015-01/msg01439.html ,
again fixing a wrong 'lane index out of bounds' error for vgetq_lane_f16 and
vsetq_lane_f16 at -O0, and dropping vdupq_n_f16 and vdupq_lane_f16 as these are
not in the ACLE spec.
The vld1, vldN, vldN_lane and corres
This is a respin of https://gcc.gnu.org/ml/gcc-patches/2015-01/msg01440.html ;
changes are to add in several missing vst... intrinsics, and fix a missing
iterator V_uf_sclr used in vec_extract.
These intrinsics are all made from patterns in neon.md, and are all tied
together by iterators - I'v
This adds basic support for moving __fp16 values around, passing and returning,
and operating on them by promoting to 32-bit floats. Also a few scalar testcases.
Note I've not got an fmov (immediate) variant, because there is no 'fmov h,
...' - the only way to load a 16-bit immediate is to rein
[Resending with correct in-reply-to header]
This adds basic support for moving __fp16 values around, passing and returning,
and operating on them by promoting to 32-bit floats. Also a few scalar testcases.
Note I've not got an fmov (immediate) variant, because there is no 'fmov h,
...' - the
This adds some basic intrinsics - vget_lane, vset_lane, vld1_lane, vld1, vst1 -
for float16 types, and the necessary support in the builtin generator, basic
patterns for moving values around, etc. Other intrinsics will follow in later
patches.
I've extended the existing testcases in aarch64/,
gcc/ChangeLog:
* config/aarch64/aarch64.c (aarch64_split_simd_combine): Add V4HFmode.
* config/aarch64/aarch64-builtins.c (VAR13, VAR14): New.
(aarch64_scalar_builtin_types, aarch64_init_simd_builtin_scalar_types):
Add __builtin_aarch64_simd_hf.
* config/aa
gcc/ChangeLog:
* config/aarch64/aarch64-simd.md (aarch64_float_truncate_lo_v2sf):
Reparameterize to...
(aarch64_float_truncate_lo_): ...this, for both V2SF and V4HF.
(aarch64_float_truncate_hi_v4sf): Reparameterize to...
(aarch64_float_truncate_hi_): ...thi
gcc/ChangeLog:
* config/aarch64/arm_neon.h (vreinterpretq_p8_f16,
vreinterpretq_p16_f16, vreinterpretq_f32_f16, vreinterpretq_f64_f16,
vreinterpretq_s64_f16, vreinterpretq_s8_f16, vreinterpretq_s16_f16,
vreinterpretq_s32_f16, vreinterpretq_u8_f16, vreinterpretq_u16
This adds the two remaining widening intrinsics, first adding patterns in
aarch64-simd.md, then entries in aarch64-simd-builtins.def, and finally
intrinsics in arm_neon.h .
Note this changes the vector indices present in the RTL on bigendian for float
vec_unpacks, to be the same as for integer
.
commit f8ad02fecdb7b6f91bab77cc154a246bd719ac20
Author: Alan Lawrence
Date: Thu Apr 9 10:54:40 2015 +0100
Fix native_interpret_real for HFmode floats on Bigendian with UNITS_PER_WORD>=4
(with missing space)
diff --git a/gcc/fold-const.c b/gcc/fold-const.c
index 6d085b1..52bc8e9 100644
--- a/gcc/fold-const.
This is a fairly straightforward addition of a new type: I've added it in on
equal status to the other types, because the various
vector-load/store/element-manipulating intrinsics, are *not* conditional on HW
support. (They just involve moving 16-bit chunks around, just like s16/u16/p16).
Thus
In the first revision of Christophe Lyon's advsimd-intrinsics tests,
https://gcc.gnu.org/ml/gcc-patches/2014-06/msg00532.html , both gcc-dg-runtest
(to assemble only) and c-torture-execute were used. In review the gcc-dg-runtest
part was then dropped, and execution tests continued using c-tortur
This adds a test of vcvt_f32_f16 and vcvt_f16_f32, also vcvt_high_f32_f16 and
vcvt_high_f16_f32.
On ARM, we pass additional option -mfpu=neon-fp16 to the compiler (possible
following patch 2/3). The compiler is already receiving an option such as
-mfpu=neon or -mfpu=crypto-neon-fp-armv8, but p
Alan Lawrence wrote:
gcc/ChangeLog:
* config/aarch64/aarch64.c (aarch64_split_simd_combine): Add V4HFmode.
* config/aarch64/aarch64-builtins.c (VAR13, VAR14): New.
(aarch64_scalar_builtin_types, aarch64_init_simd_builtin_scalar_types):
Add
Tree if-conversion currently bails out for loops that (a) contain nested loops;
(b) have more than one exit; (c) where the exit block (source of the exit edge)
does not dominate the loop latch; (d) where the exit block is the loop header,
or there are statements after the exit.
This patch remo
Alan Lawrence wrote:
Tree if-conversion currently bails out for loops that (a) contain nested loops;
(b) have more than one exit; (c) where the exit block (source of the exit edge)
does not dominate the loop latch; (d) where the exit block is the loop header,
or there are statements after the
No new code here ;). There is a slight change of execution path, i.e. some
VEC_PERM_EXPRs (e.g. those for reductions via shifts) will be expanded using
arm_expand_vec_perm_const rather than the vec_shr pattern. This generates EXT
instructions equivalent to the original, but using the mode of the s
Sorry, I realize I forgot to attach the patch to the original email, this
followed a couple of minutes later in message <553f91b9.7050...@arm.com> at
https://gcc.gnu.org/ml/gcc-patches/2015-04/msg01745.html .
Cheers, Alan
Jeff Law wrote:
On 04/28/2015 07:55 AM, Alan Lawrence wrote:
T
Alan Lawrence wrote:
As per bugzilla entry, indices in the generated assembly for bigendian are
flipped when they should not be (and, flipped always relative to a Q-register!).
This flips the lane indices back again at assembly time, fixing PR. The
"indices" contained in the RTL
Richard Biener wrote:
On Tue, Apr 28, 2015 at 3:55 PM, Alan Lawrence wrote:
Tree if-conversion currently bails out for loops that (a) contain nested
loops; (b) have more than one exit; (c) where the exit block (source of the
exit edge) does not dominate the loop latch; (d) where the exit block
Alan Lawrence wrote:
As per introduction, this allows vector_compare_rtx to work on DImode vectors.
Bootstrapped + check-gcc on x86-unknown-linux-gnu.
gcc/ChangeLog:
* optabs.c (vector_compare_rtx): Handle RTL operands having VOIDmode.
Ping. (DImode vectors are explicitly allowed
Alan Lawrence wrote:
Hi,
Comparing 64x1 vector types (defined by hand or from arm_neon.h) using GCC
vector extensions currently generates very poor assembly code, for example
"uint64x1_t foo (uint64x1_t a, uint64x1_t b) { return a >= b; }" generates (at -O3):
fmov x0, d0 // 22
Richard Biener wrote:
On May 5, 2015 4:33:58 PM GMT+02:00, Richard Earnshaw
wrote:
On 05/05/15 15:33, Richard Earnshaw wrote:
On 05/05/15 15:29, Jakub Jelinek wrote:
On Tue, May 05, 2015 at 02:20:43PM +0100, Richard Earnshaw wrote:
On 05/05/15 14:06, Jakub Jelinek wrote:
For the middle-end
(Below are all minor/style points only, no reason for patch not to go in.)
Michael Matz wrote:
diff --git a/gcc/tree-vect-data-refs.c b/gcc/tree-vect-data-refs.c
index 96afc7a..6d8f17e 100644
--- a/gcc/tree-vect-data-refs.c
+++ b/gcc/tree-vect-data-refs.c
@@ -665,7 +665,7 @@ vect_compute_data_re
NP, and sorry for the spurious comments, hadn't spotted you were using nunits. I
like the testcase, thanks :).
A.
Michael Matz wrote:
On Thu, 7 May 2015, Alan Lawrence wrote:
Also update comment? (5 identical cases)
Also update comment?
Obviously a good idea, thanks :) (s/loads/acc
Joseph Myers wrote:
>
I'd think it would be desirable to share tests between ARM and AArch64 as
far as possible (where applicable to both - so not the tests for the
alternative format, and some of the gcc.target/arm/fp16-* tests using
scan-assembler might need adapting to work for AArch64).
Alan Lawrence wrote:
This patch series adds support for ARM Neon float16x4_t and float16x8_t vector
types and intrinsics, and the __fp16 type, on both ARM and AArch64, and extends
the tests in Christophe Lyon's advsimd-intrinsics testsuite to cover these. (I
chose to extend the existing
Alan Lawrence wrote:
Ping (https://gcc.gnu.org/ml/gcc-patches/2015-01/msg01422.html).
These are required for float16 patches posted at
https://gcc.gnu.org/ml/gcc-patches/2015-04/msg01332.html .
Bootstrapped + check-gcc on arm-none-linux-gnueabihf.
Alan Lawrence wrote:
This is based loosely
Hi,
gcc/config/aarch64/iterators.md contains numerous duplicates - not always
obvious as they are not always sorted the same. Sometimes, one copy is used in
aarch64-simd-builtins.def and another in aarch64-simd.md; other times there is no
obvious pattern ;).
This patch just removes all the du
These three are logically independent, but all on a common theme, and I've
tested them all together by
bootstrapped + check-gcc on aarch64-none-elf
cross-tested check-gcc on aarch64_be-none-elf
Ok for trunk?
Now that float64x1_t is a vector, casting to it from a uint64_t causes the bit
pattern to be reinterpreted, just as vcreate_f64 should. (Previously when
float64x1_t was still a scalar, casting caused a conversion.) Hence, replace the
__builtin with a cast. None of the other variants of the aarch
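The difference between reinterpreting and converting can be sketched with a hand-defined one-element vector type. The type name f64x1 and both function names are assumptions for illustration; the sketch relies on the GCC extension that allows casts between a vector and a scalar of the same size.

```c
#include <stdint.h>

/* Hand-defined stand-in for a one-element double vector.  */
typedef double f64x1 __attribute__ ((vector_size (8)));

/* Casting a 64-bit integer to a same-sized vector type reinterprets the
   bit pattern; no numeric conversion takes place.  */
double
bits_as_double (uint64_t bits)
{
  f64x1 v = (f64x1) bits;
  return v[0];
}

/* Casting to a scalar double converts the numeric value instead.  */
double
value_as_double (uint64_t n)
{
  return (double) n;
}
```

For example, 0x3FF0000000000000 is the IEEE 754 double encoding of 1.0, so bits_as_double returns 1.0 for it, while value_as_double (2) returns 2.0.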
The vld1_lane intrinsic is currently implemented using inline asm. This patch
replaces that with a load and a straightforward use of vset_lane (this gives us
correct bigendian lane-flipping in a simple manner).
Naively this would produce assembler along the lines of (for vld1_lane_u8):
This patch replaces the inline asm for vld1_dup intrinsics with a vdup_n_ and a
load from the pointer. The existing *aarch64_simd_ld1r insn, combiner,
etc., are quite capable of generating the expected single ld1r instruction from
this. (I've verified by inspecting assembler output.)
gcc/Chang
Ah, I didn't realize Loongson was little-endian only. In that case (with mid-end
reductions-via-shifts changes pushed) I don't think I have actually broken
anything, or at least, no MIPS platform that exists :).
However, yes, that would seem a safe bet (and simpler than my linked patch that
pr
Following recent vectorizer changes to reductions via shifts, AArch64 will now
reduce loops such as this
unsigned char in[8] = {1, 3, 5, 7, 9, 11, 13, 15};
int
main (unsigned char argc, char **argv)
{
unsigned char prod = 1;
/* Prevent constant propagation of the entire loop below. */
a
...Patch attached...
Alan Lawrence wrote:
Following recent vectorizer changes to reductions via shifts, AArch64 will now
reduce loops such as this
unsigned char in[8] = {1, 3, 5, 7, 9, 11, 13, 15};
int
main (unsigned char argc, char **argv)
{
unsigned char prod = 1;
/* Prevent
After recent updates, tree-vect-loop.c is in the same state as when this cleanup
patch was first written and approved, so I've just pushed it as r/217580.
Cheers,
Alan
Richard Biener wrote:
On Thu, Sep 18, 2014 at 2:48 PM, Alan Lawrence wrote:
Following earlier pa
Ah, sorry for the duplication of effort. And thanks for the heads-up about
upcoming work! I don't think I have any plans for any of those others at the moment.
In the case of vld1_dup, however, I'm going to argue that my approach
(https://gcc.gnu.org/ml/gcc-patches/2014-11/msg01718.html) is bet
I confirm no regressions on aarch64_be-none-elf.
--Alan
Alan Lawrence wrote:
...Patch attached...
Alan Lawrence wrote:
Following recent vectorizer changes to reductions via shifts, AArch64 will now
reduce loops such as this
unsigned char in[8] = {1, 3, 5, 7, 9, 11, 13, 15};
int
main
...as the former is defined as returning MIN_VALUE for argument MIN_VALUE,
whereas the latter is 'undefined', and gcc can optimize "abs(x)>=0" to "true",
which is wrong for __builtin_aarch64_abs.
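The distinction can be illustrated in plain C. The sketch below is a hypothetical scalar model of an absolute value that wraps at the minimum value, as described above for the builtin; the function name wrapping_abs is an assumption, and the final conversion back to int32_t is implementation-defined in ISO C (GCC defines it as wrapping modulo 2^32).

```c
#include <stdint.h>

/* Absolute value that returns INT32_MIN for INT32_MIN instead of invoking
   undefined behaviour: the negation is done in unsigned arithmetic.  */
int32_t
wrapping_abs (int32_t x)
{
  uint32_t ux = (uint32_t) x;
  uint32_t mag = x < 0 ? 0u - ux : ux;
  /* Implementation-defined conversion; wraps on GCC.  */
  return (int32_t) mag;
}
```

With this definition "wrapping_abs (x) >= 0" is false for INT32_MIN, so a compiler must not fold that test to "true" the way it may for the standard abs.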
There has been much debate here, although not recently - I think the last was
https://gcc.gnu.org/
No new code here ;). There is a slight change of execution path, i.e. some
VEC_PERM_EXPRs (e.g. those for reductions via shifts) will be expanded using
arm_expand_vec_perm_const rather than the vec_shr pattern. This generates EXT
instructions equivalent to the original, but using the mode of the s
This is a pure tidyup, no new functionality. Changes are
(1) Use op[0] to store the result operand, rather than a separate variable, thus
combining the two large switch statements into one;
(2) The 'arg' and 'mode' arrays were (almost-)only ever used to store data
*within* each iteration, so tur
On 12 November 2014 15:35, Alan Lawrence wrote:
Nice! One nit - can the extra "tree" argument be a "const_tree" ? - I'll
defer to the maintainers on the use of C++ default arguments in the AArch64
backend. But LGTM.
Thanks, good catch.
The default parameter will go away once a
Having just been experimenting with testing of installed compilers - yes
something like this could be useful, however: to do cross-testing I found I also
(a) had to set my target_list; so either an extra flag for that, or maybe just a
generic 'extra_site_flags' parameter?
(b) I had to set up som
vld1_lane intrinsics ICE at -O0 because they contain a call to the vset_lane
intrinsics, through which the lane index is not constant-propagated. (They are
fine at -O1 and higher!). This fixes the ICE by replacing said call by a macro.
Rather than defining many individual macros
__aarch64_vset
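To show why a macro helps here, the following sketch uses GCC vector extensions rather than the real arm_neon.h definitions. The names u8x8, SET_LANE_U8 and load_lane3 are hypothetical. Because the macro is expanded at the call site, the lane index remains a literal constant even without optimization, so it can be checked at compile time.

```c
#include <stdint.h>

typedef uint8_t u8x8 __attribute__ ((vector_size (8)));

/* Set lane LANE of VEC to ELEM.  LANE must be an integer constant
   expression; the _Static_assert rejects out-of-range indices at compile
   time, independently of the optimization level.  */
#define SET_LANE_U8(elem, vec, lane)                          \
  __extension__                                               \
  ({                                                          \
    _Static_assert ((lane) >= 0 && (lane) < 8,                \
                    "lane index out of range");               \
    u8x8 tmp_ = (vec);                                        \
    tmp_[(lane)] = (elem);                                    \
    tmp_;                                                     \
  })

/* A load-to-lane built from a scalar load plus the macro above.  */
u8x8
load_lane3 (const uint8_t *p, u8x8 v)
{
  return SET_LANE_U8 (*p, v, 3);
}
```

Had SET_LANE_U8 been an inline function taking the lane as a parameter, the index would only become a known constant after inlining and constant propagation, which does not happen at -O0.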
So in case there's any confusion about the behaviour expected of *the vabs
intrinsic*, here's a testcase (failing without patch, passing with it)...
--Alan
Alan Lawrence wrote:
...as the former is defined as returning MIN_VALUE for argument MIN_VALUE,
whereas the latter is '
particular to my setup!!
--Alan
Jeff Law wrote:
On 11/24/14 09:51, Alan Lawrence wrote:
Having just been experimenting with testing of installed compilers - yes
something like this could be useful, however: to do cross-testing I
found I also (a) had to set my target_list; so either an extra flag for
t
On Wed, Nov 26, 2014 at 04:35:50PM +, James Greenhalgh wrote:
> Why do we want to turn off folding for the V4SF/V2SF/V2DF modes of these
> intrinsics? There should be no difference between the mid-end definition
> and the intrinsic definition of their behaviour.
Good point. Done.
> I also no
no uses, remove;
* VDQM and VQ_S duplicate VDQ_BHSI, use the latter;
* VDIC and VDW duplicate VD_BHSI, use the latter;
...committed as r218310.
Marcus Shawcroft wrote:
On 13 November 2014 10:38, Alan Lawrence wrote:
Hi,
gcc/config/aarch64/iterators.md contains numerous duplicates - not
Ping.
Alan Lawrence wrote:
vld1_lane intrinsics ICE at -O0 because they contain a call to the vset_lane
intrinsics, through which the lane index is not constant-propagated. (They are
fine at -O1 and higher!). This fixes the ICE by replacing said call by a macro.
Rather than defining many
Following this patch (r221318), we're seeing what appears to be a miscompile of
glibc on AArch64. This causes quite a bunch of tests to fail, segfaults etc., if
LD_LIBRARY_PATH leads to a libc.so.6 built with that patch vs without (same
glibc sources). We are still working on a reduced testcase,
Following Richard Biener's patch at
https://gcc.gnu.org/ml/gcc-patches/2015-03/msg01064.html (r221532),
gcc.target/aarch64/c-output-template-3.c fails with:
c-output-template-3.c: In function 'test':
c-output-template-3.c:7:5: error: impossible constraint in 'asm'
__asm__ ("@ %c0" : : "S"
") [flags 0x3] )
(const_int 4 [0x4])))
but following Richard's patch the constraint is evaluated only on:
(reg/f:DI 73 [ D.2670 ])
--Alan
Alan Lawrence wrote:
Following Richard Biener's patch at
https://gcc.gnu.org/ml/gcc-patches/2015-03/msg01064.html (r221532),
gcc.target/
When cross-testing, the -DITERATIONS=1000 flag replaced the -pthread required
for linux targets, so the test failed to build. I've pushed the following test
fix as r221666:
Index: libstdc++-v3/testsuite/21_strings/basic_string/pthread33394.cc
egister_constraint which accepts
registers only; and define_memory_constraint which accepts memory only).
However, I think this is too late in the development cycle for gcc5, and hence,
I think the original testcase fix (dg-options "-O") is the best we can do for
now (possibly unless we would prefe
commit 39f9a388f15e12f43e3f59c314325cc087eab377
Author: Alan Lawrence
Date: Tue Mar 10 12:20:12 2015 +
Kyle McMartin patch
diff --git a/gcc/config/host-linux.c b/gcc/config/host-linux.c
index 1f10823..0774ecf 100644
--- a/gcc/config/host-linux.c
+++ b/gcc/config/host-linux.c
@@ -86,6 +86,8 @@
# d
We've been seeing a bunch of new failures in the *libffi* testsuite on ARM Linux
(arm-none-linux-gnueabi, arm-none-linux-gnueabihf), following this one-liner
fix. I've reduced the testcase down to the attached (including removing any
dependency on libffi); with gcc r221347, this prints the expec
...actually attach the testcase...
Alan Lawrence wrote:
We've been seeing a bunch of new failures in the *libffi* testsuite on ARM Linux
(arm-none-linux-gnueabi, arm-none-linux-gnueabihf), following this one-liner
fix. I've reduced the testcase down to the attached (including re
n_printf("%d\n", x);
}
but in that case, the arm_function_arg is still fed a type with alignment 32
(bits), i.e. distinct from the type of the field 'x' in memory, which has
alignment 128.
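The two alignments being contrasted can be reproduced with a small sketch. The original testcase is not shown here, so the typedef name aligned_int, the struct name holder, and the helper functions are assumptions; only the 128-bit (16-byte) alignment of the field 'x' comes from the discussion.

```c
#include <stddef.h>

/* An int typedef with 16-byte (128-bit) alignment.  */
typedef int aligned_int __attribute__ ((aligned (16)));

struct holder
{
  aligned_int x;
};

/* Alignment of the underlying type, in bytes.  */
size_t
plain_int_alignment (void)
{
  return _Alignof (int);
}

/* Alignment of the over-aligned typedef, in bytes.  */
size_t
aligned_typedef_alignment (void)
{
  return _Alignof (aligned_int);
}

/* The struct inherits the alignment of its most-aligned member.  */
size_t
holder_alignment (void)
{
  return _Alignof (struct holder);
}
```

The value of the field is an ordinary int, yet the type of the field in memory carries the larger alignment. A calling convention keyed on the type's alignment can therefore behave differently depending on which of the two types it is handed.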
--Alan
Richard Biener wrote:
On Mon, 30 Mar 2015, Richard Biener wrote:
On Mon, 30
Richard Biener wrote:
On Mon, Mar 30, 2015 at 10:13 PM, Richard Biener wrote:
It doesn't make sense to use the alignment of passed values. That looks like
bs.
This means that
int i __aligned__(8);
is passed differently than int.
arm_function_arg needs to be fixed.
That is,
typedef int
Richard Biener wrote:
But I find it odd that on ARM passing *((aligned_int *)p) as
vararg (only as varargs?) changes calling conventions independent
of the function's type signature.
Does it? Do you have a testcase, and compilation flags, that'll make this show
up in an RTL dump? I've tried nu
/03/15 08:50, Richard Biener wrote:
On Mon, Mar 30, 2015 at 10:13 PM, Richard Biener wrote:
On March 30, 2015 6:45:34 PM GMT+02:00, Alan Lawrence
wrote:
-O2 was what I first used; it also occurs at -O1. -fno-tree-sra fixes
it.
The problem appears to be in laying out arguments, specifically
Jakub Jelinek wrote:
On Tue, Mar 31, 2015 at 11:47:37AM +0100, Alan Lawrence wrote:
Richard Biener wrote:
But I find it odd that on ARM passing *((aligned_int *)p) as
vararg (only as varargs?) changes calling conventions independent
of the function's type signature.
Does it? Do you have a
Looks good to me. Indeed, I'd support this being an "obvious" fix
--Alan
Maxim Ostapenko wrote:
Hi,
expanding AArch64 AdvSIMD builtins, aarch64_simd_expand_builtin puts
return type and arguments types in args[SIMD_MAX_BUILTIN_ARGS] array and
indicates the last argument with SIMD_ARG_STO
Richard Biener wrote:
> > On Tue, 31 Mar 2015, Alan Lawrence wrote:
> >
>> >> (1) If we wish to keep the AAPCS principle that varargs are passed just as
>> >> named args, we should use TYPE_MAIN_VARIANT inside
>> >> arm_needs_doubleword_alignmen
Done - committed as r221905, and PR target/65689 filed on bugzilla.
Cheers, Alan
James Greenhalgh wrote:
On Wed, Mar 25, 2015 at 06:27:49PM +, James Greenhalgh wrote:
I think your original patch to add -O is just fine, but Marcus or
Richard will need to approve it.
I haven't seen any how
Marcus Shawcroft wrote:
On 30 January 2015 at 12:09, Alan Lawrence wrote:
This was posted towards the end of stage 3, a few days before stage 4
started. Is it now too late to "ping" ?
--Alan
gcc/ChangeLog:
* config/aarch64/arm_neon.h (vst1_lane_f32, vst
April 2015 at 14:45, Alan Lawrence wrote:
Assuming/hoping that this patch is proposed for new stage 1 ;),
IIRC the approach of using __builtin_aarch64_im_lane_boundsi doesn't
work (results in double error messages), and so the patch needs to be
rewritten to avoid it. However, thanks for you
Hmmm. One side effect of this is that the line number information available in
the target hook gimplify_va_arg_expr, is now just the name of the containing
function, rather than the specific use of va_arg. Is there some way to get this
more precise location (e.g. gimple_location(stmt) in expand_
Tom de Vries wrote:
On 09/06/15 13:03, Richard Biener wrote:
On Tue, 9 Jun 2015, Alan Lawrence wrote:
Hmmm. One side effect of this is that the line number information available in
the target hook gimplify_va_arg_expr, is now just the name of the containing
function, rather than the specific
Charles Baylis wrote:
On 8 June 2015 at 10:33, Alan Lawrence wrote:
Thanks for working on this!
I'd been fiddling around with a patch with some similar elements to this,
but many trials with union types, subregs, etc., all worsened the register
allocation and led to more unnecessary shuf
* gcc.target/aarch64/nofp_1.c: New file.
gcc/ChangeLog:
* doc/invoke.texi: Clarify AArch64 feature modifiers (no)fp, (no)simd
and (no)crypto.
commit efbf0f4699ac963472834c912b46b1a3a076fa64
Author: Alan Lawrence
Date: Mon Jan 12 15:04:06 2015 +
Approved r/3008, rebas
Looks good to me, but I can't approve.
Thanks,
Alan
Charles Baylis wrote:
Ping?
On 11 June 2015 at 00:42, Charles Baylis wrote:
[resending, as previous version was rejected from the list for html]
On 11 June 2015 at 00:38, Charles Baylis wrote:
On 8 June 2015 at 10:44, Alan Law