This is a respin of https://gcc.gnu.org/ml/gcc-patches/2015-05/msg02139.html .
Changes are:
* Separate the two passes by descending from a common base class, allowing
different predicates;
* Test flag_tree_vectorize, and loop->force_vectorize/dont_vectorize - this
fixes the test failing
Richard Biener wrote:
Apart from Jeff's comment - the usual fix for the undesired
vectorization is to put
a __asm__ volatile (""); in the loop.
In vect-strided-a-u16-i4.c, narrowing the scope of the declaration seemed to
preserve the original intent. I've been able to drop the other testsuite c
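The `__asm__ volatile ("")` idiom mentioned above can be sketched as follows. This is only an illustration, not code from vect-strided-a-u16-i4.c: the function name `fill_reference` and its parameters are hypothetical. The idea is that an empty volatile asm in the loop body is something the vectorizer cannot handle, so the loop is left scalar.

```c
#include <stddef.h>

/* Hypothetical helper (not from the testsuite): fill OUT with i * 2 using
   a loop that the vectorizer is expected to leave alone.  */
void
fill_reference (unsigned short *out, size_t n)
{
  for (size_t i = 0; i < n; i++)
    {
      out[i] = (unsigned short) (i * 2);
      /* Empty volatile asm: emits no instructions, but the compiler must
         treat it as an opaque side effect on every iteration.  */
      __asm__ volatile ("");
    }
}
```

Testsuite files typically use this kind of loop for initialisation or result checking, so that only the loop under test is a vectorization candidate.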
Jeff Law wrote:
On 05/22/2015 09:42 AM, Alan Lawrence wrote:
This patch does so (and makes slightly less conservative, to tackle the
example above). I found I had to make this a separate pass, so that the
phi nodes were cleaned up at the end of the pass before running
tree_if_conversion.
What
Abe Skolnik wrote:
Hi everybody!
In the current implementation of if conversion, loads and stores are
if-converted in a thread-unsafe way:
* loads were always executed, even when they should have not been.
Some source code could be rendered invalid due to null pointers
that were OK in
James Greenhalgh wrote:
-Generate code which uses only the general registers.
+Generate code which uses only the general registers. Equivalent to feature
The ARMARM uses "general-purpose registers" to refer to these registers,
we should match that style.
s/Equivalent to feature/This is equi
James Greenhalgh wrote:
Submissions on this list should be one patch per mail, it makes
tracking review easier.
OK here's a respin of the first, I've added a third patch after I found another
route to get to an ICE.
+void
+aarch64_err_no_fpadvsimd (machine_mode mode, const char *msg)
+{
+
This fixes another ICE, obtained with the attached testcase - yes, there was a
way to get hold of a float, without passing an argument or going through
movsf/movdf!
Bootstrapped + check-gcc on aarch64-none-linux-gnu.
gcc/ChangeLog:
* config/aarch64/aarch64.md (2):
Condition on
Sebastian Pop wrote:
On Thu, Jun 25, 2015 at 4:43 AM, Richard Biener
wrote:
when the new scheme triggers vectorization cannot succeed on the
result as we get
if (cond)
*p = val;
if-converted to
tem = cond ? p : &scratch;
*tem = val;
That's correct.
and
if (cond)
val =
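The store transformation quoted above can be written out as plain C to show that both forms leave `*p` in the same state. This is a sketch at the source level only; the function names are hypothetical, and the real pass works on GIMPLE rather than on C source.

```c
/* Branchy form: the store happens only when COND is true.  */
void
store_if (int cond, int *p, int val)
{
  if (cond)
    *p = val;
}

/* If-converted form: the store is unconditional, but it is redirected to
   a dummy location when COND is false, so *P is never touched in that
   case.  */
void
store_if_converted (int cond, int *p, int val)
{
  int scratch;                      /* hypothetical scratchpad location */
  int *tem = cond ? p : &scratch;
  *tem = val;
}
```

Because the address is chosen at run time, the second form contains no branch, but the store target is no longer a simple known address, which is the vectorization difficulty raised in the quoted text.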
Abe Skolnik wrote:
In tree-if-conv.c:[…]
> if it doesn't trap, but has_non_addressable_refs, can't we use
> ifcvt_can_use_mask_load_store there too?
> if an access could trap, but is addressable, can't we use the scratchpad
> technique to get round the trapping problem?
That's how we deal with loa
Jeff Law wrote:
Thanks. Does running the phi-only propagator after the loop header
copying help? At first glance it would seem that it ought to propagate
the values of those degenerate PHIs then eliminate those PHIs.
It was written to cleanup after jump threading which has a tendency to
cre
Thanks, Abe. A couple comments below...
@@ -883,7 +733,7 @@ if_convertible_gimple_assign_stmt_p (gimple stmt,
if (flag_tree_loop_if_convert_stores)
{
- if (ifcvt_could_trap_p (stmt, refs))
+ if (ifcvt_could_trap_p (stmt))
{
if (ifcvt_can_use_mask_load_store
Jeff Law wrote:
On 06/24/2015 01:59 AM, Richard Biener wrote:
And then there is the possibility of making passes generate less
needs to perform cleanups after them - like in the present case
with the redundant IVs make them more apparently redundant by
CSEing the initial value and step during vec
With those comment fixes, this is OK for the trunk.
jeff
Thank you for review - I've pushed r225311 with what I hope are appropriate
comment fixes.
Cheers, Alan
Charles Baylis wrote:
These patches are a port of the changes that do the same thing for AArch64 (see
https://gcc.gnu.org/ml/gcc-patches/2015-06/msg01984.html)
The first patch ports over some infrastructure, and the second converts the
vldN_lane and vstN_lane intrinsics. The changes required for vget
This patch series implements the changes/additions to the ARM ABI proposed at
https://gcc.gnu.org/ml/gcc/2015-07/msg00040.html .
The first patch is the ABI update. This is an ABI-breaking change for any code
using __attribute__((aligned(...))) on a public interface (a case not previously
defin
These include tests of structs, scalars, and vectors - only general-purpose
registers are affected by the ABI rules for alignment, but we can restrict the
vector test to use the base AAPCS.
Prior to this patch, align2.c, align3.c and align_rec1.c were failing (the
latter showing an internal in
The previous patch caused a regression in gcc.c-torture/execute/20040709-1.c at
-O0 (only), and the new align_rec2.c test fails, both outputting an illegal
assembler instruction (ldrd on an odd-numbered reg) from output_move_double in
arm.c. Most routes have checks against such an illegal instru
Richard Biener wrote:
I also believe this loop is equivalent to checking TYPE_ALIGN of the aggregate
type?
Jakub is correct: the intention is to discard any top-level alignment attribute
on a struct declaration.
I'll double check your wording in the abi document, but it seems to be unclea
I note some parts of this duplicate my
https://gcc.gnu.org/ml/gcc-patches/2015-01/msg01422.html , which has been pinged
a couple of times. Both Charles' patch, and my two, contain parts the other does
not...
Cheers, Alan
Charles Baylis wrote:
gcc/ChangeLog:
Charles Baylis
* con
have a
working Ada compiler with which to bootstrap gcc's Ada frontend. Working on this
now.
--Alan
gcc/ChangeLog:
* config/arm/arm.c (arm_needs_doubleword_align) : Drop any outer
alignment attribute, exploring one level down for records and arrays.
commit f8bd310d65f2b8fd8
Abe wrote:
On 7/2/15 4:49 AM, Alan Lawrence wrote:
As before, I'm still confused here. This still returns false, i.e. bails out of
if-conversion, if the statement could trap. Doesn't the scratchpad let us handle
that? Or do we just not care because it won't be vectorizable a
12:00, Alan Lawrence wrote:
Eric Botcazou wrote:
Technically this is incorrect since AGGREGATE_TYPE_P includes ARRAY_TYPE
and ARRAY_TYPE doesn't have TYPE_FIELDS. I doubt we could reach that
case though (unless there's a language that allows passing arrays by value).
Ada passes small a
Ramana Radhakrishnan wrote:
On 06/07/15 17:38, Alan Lawrence wrote:
Trying to push these now (svn!), patch 2 is going first.
I realize my second iteration of patch 1/2, dropped the testcases from the
first version. Okay to include those as per
https://gcc.gnu.org/ml/gcc-patches/2015-07
Richard Earnshaw wrote:
On 03/07/15 16:27, Alan Lawrence wrote:
The previous patch caused a regression in
gcc.c-torture/execute/20040709-1.c at -O0 (only), and the new
align_rec2.c test fails, both outputting an illegal assembler
instruction (ldrd on an odd-numbered reg) from output_move_double
Ramana Radhakrishnan wrote:
This is OK, the ada testing can go in parallel and we should take this in to
not delay rc1 any further.
I can confirm, no regressions in check-ada (gcc/testsuite/gnats and
gcc/testsuite/acats) following an ada bootstrap on cortex-a15/neon/hard-float.
That's the
Alan Lawrence wrote:
I note some parts of this duplicate my
https://gcc.gnu.org/ml/gcc-patches/2015-01/msg01422.html , which has been pinged
a couple of times. Both Charles' patch, and my two, contain parts the other does
not...
Cheers, Alan
Charles Baylis wrote:
gcc/ChangeLog:
Ch
This is a respin of the series at
https://gcc.gnu.org/ml/gcc-patches/2015-04/msg01332.html, plus the two ARM
patches on which these depend
(https://gcc.gnu.org/ml/gcc-patches/2015-04/msg01333.html). These two somewhat
duplicate Charles Baylis' lane-bounds-checking patch at
https://gcc.gnu.org/
As per https://gcc.gnu.org/ml/gcc-patches/2015-04/msg01333.html
(While this falls under PR/63870, and I will link to that in the ChangeLog, it
is only a small step towards fixing that PR.)
commit 9812db88cff20a505365f68f4065d2fbab998c9c
Author: Alan Lawrence
Date: Mon Dec 8 11:04:49 2014
Unchanged since https://gcc.gnu.org/ml/gcc-patches/2015-04/msg01336.html
commit b9ccac6243415b304024443b74bdc97b3a5954f2
Author: Alan Lawrence
Date: Mon Dec 8 18:40:24 2014 +
Add float16x8_t + V8HFmode support (regardless of -mfp16-format)
diff --git a/gcc/config/arm/arm-builtins.c b
As per https://gcc.gnu.org/ml/gcc-patches/2015-04/msg01335.html
commit 54a89a084fbd00e4de036f549ca893b74b8f58fb
Author: Alan Lawrence
Date: Mon Dec 8 18:40:03 2014 +
ARM: float16x4_t intrinsics (v2 - fix v[sg]et_lane_f16 at -O0, no vdup_n/vmov_n)
diff --git a/gcc/config/arm
As per https://gcc.gnu.org/ml/gcc-patches/2015-04/msg01334.html
commit 1bb1b208a2c8c8b1ee1186c6128a498583fd64fe
Author: Alan Lawrence
Date: Mon Dec 8 18:36:30 2014 +
Add __builtin_arm_lane_check
diff --git a/gcc/config/arm/arm-builtins.c b/gcc/config/arm/arm-builtins.c
index 7f5bf87
As per https://gcc.gnu.org/ml/gcc-patches/2015-04/msg01341.html
commit ae6264b144d25fadcbf219e68ddf3d8c5f40be34
Author: Alan Lawrence
Date: Thu Dec 11 11:53:59 2014 +
ARM 4/4 v2: v(ld|st)[234](q?|_lane|_dup), vcombine, vget_(low|high) (v2 w/ V_uf_sclr)
All are tied together
As per https://gcc.gnu.org/ml/gcc-patches/2015-04/msg01337.html
commit 336eb16d3061131fe8d28fad4a473d00768bfe5c
Author: Alan Lawrence
Date: Tue Dec 9 15:06:38 2014 +
ARM float16x8_t intrinsics (v2 - fix v[sg]etq_lane_f16, add
vreinterpretq_p16_f16, no vdup_n/lane/vmov_n)
diff --git
: New test.
commit 989af1492bbf268be1ecfae06f3303b90ae514c8
Author: Alan Lawrence
Date: Tue Dec 2 12:57:39 2014 +
AArch64 1/6: Basic HFmode support (less tests), aarch64_fp16_type_node, patterns, mangling, predefines.
No --fp16-format option.
Disable constants as NYI.
di
/fp16/fp16.exp: New.
* gcc.target/aarch64/fp16/f16_convs_1.c: New.
* gcc.target/aarch64/fp16/f16_convs_2.c: New.
commit bc5045c0d3dd34b8cb94910281384f9ab9880325
Author: Alan Lawrence
Date: Thu May 7 10:08:12 2015 +0100
(ARM+AArch64) Add gcc.target/aarch64/fp16, f16_conv_[12].c
As https://gcc.gnu.org/ml/gcc-patches/2015-04/msg01341.html
commit 49cb53a94a44fcda845c3f6ef11e88f9be458aad
Author: Alan Lawrence
Date: Tue Dec 2 13:08:15 2014 +
AArch64 2/N: Vector/__builtin basics: define+support types, movs, test ABI.
Patterns, builtins, intrinsics for
): Use BUILTIN_VDF iterator.
* config/aarch64/arm_neon.h (vcvt_f16_f32, vcvt_high_f16_f32): New.
* config/aarch64/iterators.md (VDF, Vdtype): New.
(VWIDE, Vmwtype): Add cases for V4HF and V2SF.
commit 5007fafedc8469ab645edfe65fbf41f75fc74750
Author: Alan Lawrence
Date: Tue
As per https://gcc.gnu.org/ml/gcc-patches/2015-04/msg01342.html
commit ef719e5d3d6eccc5cf621851283b7c0ba1a9ee6c
Author: Alan Lawrence
Date: Tue Aug 5 17:52:28 2014 +0100
AArch64 3/N: v(create|combine|v(ld|st|ld...dup/lane|st...lane)[234](q?))_f16; tests vldN{,_lane,_dup} inc bigendian
.
commit beb21a6bce76d4fbedb13fcf25796563b27f6bae
Author: Alan Lawrence
Date: Mon Jun 29 18:46:49 2015 +0100
[AArch64 5/N v2] vreinterpret, vget_(low|high), vld1(q?)_dup. update tests for vget_low/high
diff --git a/gcc/config/aarch64/arm_neon.h b/gcc/config/aarch64/arm_neon.h
index b915754..ff1a45c 100644
--- a/gc
Unchanged since https://gcc.gnu.org/ml/gcc-patches/2015-04/msg01345.html
commit 214fcc00475a543a79ed444f9a64061215397cc8
Author: Alan Lawrence
Date: Wed Jan 28 13:01:31 2015 +
AArch64 6/N: vcvt{,_high}_f32_f16 (using vect_par_cnst_hi_half, fixing bigendian indices)
diff --git a/gcc
As per https://gcc.gnu.org/ml/gcc-patches/2015-04/msg01346.html. Fixes FAIL of
advsimd-intrinsics vcreate.c on aarch64_be-none-elf from previous patch.
commit e2e7ca148960a82fc88128820f17e7cbd14173cb
Author: Alan Lawrence
Date: Thu Apr 9 10:54:40 2015 +0100
Fix native_interpret_real for
This is a respin of https://gcc.gnu.org/ml/gcc-patches/2015-04/msg01347.html,
removing many default values of 0x333. To complete that, I introduced new macros
CHECK_RESULTS{,_NAMED}_NO_FP16, as writing the same list of vector types in four
places seemed too many.
gcc/testsuite/ChangeLog:
/testsuite/ChangeLog:
* gcc.target/aarch64/advsimd-intrinsics/advsimd-intrinsics.exp:
set additional flags for neon-fp16 support.
* gcc.target/aarch64/advsimd-intrinsics/vcvt_f16.c: New.
commit e6cc7467ddf5702d3a122b8ac4163621d0164b37
Author: Alan Lawrence
Date: Wed Jan 28 13
Kyrill Tkachov wrote:
On 07/07/15 14:09, Kyrill Tkachov wrote:
Hi Alan,
On 07/07/15 13:34, Alan Lawrence wrote:
As per https://gcc.gnu.org/ml/gcc-patches/2015-04/msg01335.html
For some context, the reference for these is at:
http://infocenter.arm.com/help/topic/com.arm.doc.ihi0073a
Kyrill Tkachov wrote:
On 07/07/15 17:34, Alan Lawrence wrote:
Kyrill Tkachov wrote:
On 07/07/15 14:09, Kyrill Tkachov wrote:
Hi Alan,
On 07/07/15 13:34, Alan Lawrence wrote:
As per https://gcc.gnu.org/ml/gcc-patches/2015-04/msg01335.html
For some context, the reference for these is at
Abe wrote:
I'm uncertain to what that is intended to refer, but I believe Sebastian would
agree that the new if converter is safer than the old one in terms of
correctness at the time of running the code being compiled.
>
even if they take us a step backwards from a performance standpoint.
Richard Biener wrote:
On Wed, Jul 8, 2015 at 12:07 AM, Jeff Law wrote:
On 07/07/2015 06:37 AM, Alan Lawrence wrote:
[snip]
Fix native_interpret_real for HFmode floats on Bigendian with
UNITS_PER_WORD>=4
(with missing space)
OK with ChangeLog in proper form.
Err - but now off
Abe wrote:
[Alan wrote:]
Where can I find info on what the different flag values mean?
(I had thought they were booleans [...]
[Abe wrote:]
Sorry; I don't know if that is documented anywhere yet.
In this case, (-1) simply means "defaulted": on if the vectorizer is on, and
off if it is
Jeff Law wrote:
On 07/08/2015 03:43 AM, Richard Biener wrote:
On Wed, Jul 8, 2015 at 12:07 AM, Jeff Law wrote:
On 07/07/2015 06:37 AM, Alan Lawrence wrote:
As per https://gcc.gnu.org/ml/gcc-patches/2015-04/msg01346.html. Fixes
FAIL of advsimd-intrinsics vcreate.c on aarch64_be-none-elf from
Richard Biener wrote:
I wonder why wi::from_buffer doesn't have the same issue though
for HImode ints. It's structured differently, without magic '4's as well.
I don't claim to understand the rest of wi::from_buffer and why it is different.
However, wrt. HImode, I think the key line is:
o
This is based loosely upon svn r217440, "[AArch64] Add bounds checking to
vqdm_lane intrinsics...", but applies to more intrinsics (including e.g.
vget_lane), and does not do the endianness-flipping present on AArch64: the
objective is to exactly preserve behaviour on all valid code. (Yes, the n
These add all the V[48]HFmode insns and corresponding intrinsics for ARM.
Depends on the two patches at
https://gcc.gnu.org/ml/gcc-patches/2015-01/msg01422.html .
Unfortunately I don't at present have a testsuite. I've done some testing both
manually and on a large internal testsuite for Neon/
This parallels the present form of __builtin_aarch64_im_lane_boundsi, and allows
to check lane indices for intrinsics that can otherwise be written in terms of
GCC vector extensions.
The new builtin is not used in this patch but is used in my series of float16_t
intrinsics (https://gcc.gnu.org
This adds a bunch of new intrinsics, implemented with GCC vector extensions to
maximise mid-end optimization (the same approach as AArch64). Note that unlike
AArch64, no attempt is made to support bigendian.
gcc/ChangeLog:
* config/arm/arm_neon.h (vcreate_f16, vdup_lane_f16, vld1_lane_f16,
This defines arm_neon.h's float16x8_t type, although no intrinsics yet (see next
patch). Adding V8HFmode does mean programmers can define a GCC vector of same
size themselves.
gcc/ChangeLog:
* config/arm/arm.h (VALID_NEON_QREG_MODE): Add V8HFmode.
* config/arm/arm.c (arm_vector_mode_s
Much like the first patch, this adds the equivalent ...q... intrinsics for
float16x8_t, using GCC vector extensions.
gcc/ChangeLog:
* config/arm/arm_neon.h (vdupq_lane_f16, vld1q_lane_f16, vld1q_dup_f16,
vreinterpretq_p8_f16, vreinterpretq_f16_p8, vreinterpretq_f16_p16,
vreinterpret
These intrinsics are all made from patterns in neon.md, and are all tied
together by iterators - I've tried to reduce coupling a bit but there is
possibly more that could be done here.
gcc/ChangeLog:
* config/arm/arm-builtins.c (VAR11, VAR12): New.
* config/arm/arm_neon_builtins.def (v
s, Alan
Christophe Lyon wrote:
On 16 January 2015 at 18:22, Alan Lawrence wrote:
These add all the V[48]HFmode insns and corresponding intrinsics for ARM.
Depends on the two patches at
https://gcc.gnu.org/ml/gcc-patches/2015-01/msg01422.html .
Unfortunately I don't at present have a t
There are still bugs in these patches, they should not go in. Hope to have
something ready, with tests, in the next stage 1.
Cheers, Alan
Alan Lawrence wrote:
These add all the V[48]HFmode insns and corresponding intrinsics for ARM.
Depends on the two patches at
https://gcc.gnu.org/ml/gcc
Hi,
The split rule introduced in r218961 uses as its split condition
'reload_completed && (which_alternative == 1)', but which_alternative does not
seem to be set reliably during split phases, even after reload. This can lead
to the split rule not being used even for insns using FP/SIMD regist
This was posted towards the end of stage 3, a few days before stage 4
started. Is it now too late to "ping" ?
--Alan
Alan Lawrence wrote:
Nowadays, just storing the (bigendian-corrected) vector element to the address,
generates exactly the same assembler for all cases except
{floa
Rainer Orth wrote:
I'm still not really comfortable with those target lists; they tend to
artificially exclude tests on targets where they are perfectly capable
of running. At least with the comments added, it's better than before
with no explanation whatsoever. Perhaps Mike can weigh in here?
Andrew Pinski wrote:
While trying to build the GCC 5 with GCC 5, I ran into an ICE when
building libcpp at -O0. The problem is the C++ front-end was not
folding sizeof(a)/sizeof(a[0]) when passed to a function at -O0. The
C++ front-end keeps around sizeof until the gimplifier and there is no
wa
This was giving an UNRESOLVED after my first attempt to apply the patch ran into
trouble with line wrapping, and in diagnosing the problem I'd introduced an
extra 'target' vs. the original
(https://gcc.gnu.org/ml/gcc-patches/2015-01/msg00215.html). Sorry! Pushed as
r220542.
--Alan
gcc/testsu
James Greenhalgh wrote:
On Wed, Jan 28, 2015 at 12:32:45PM +, Alan Lawrence wrote:
Ok for stage 4?
This is a regression from 4.9, so once we iron out some nits, it should
be.
gcc/ChangeLog:
* config/aarch64/aarch64.md (*xor_one_cmpl3): Use FP_REGNUM_P
as split condition.
And a
When you say a patch by Alan Hayward that's "coming soon", I take it you mean
this one? https://gcc.gnu.org/ml/gcc-patches/2014-10/msg00952.html
Just so that we know it has now arrived :).
--Alan
David Sherwood wrote:
Hi,
I forgot to mention that this patch was tested in combination wi
So we've been seeing
FAIL: gcc.target/aarch64/vldN_dup_1.c
on aarch64_be-none-elf, since this patch went in. Felix, did you test for
bigendian?
However, this failure is fixed if I apply David Sherwood's patch set:
https://gcc.gnu.org/ml/gcc-patches/2014-10/msg00942.html
https://gcc.gnu.org/ml
This generates out-of-range errors at compile- (rather than assemble-)time for
the vqdm*_lane intrinsics, and also provides a single place to do bigendian
lane-swapping for all those intrinsics (and others to follow in later patches).
This allows us to remove many define_expands that just do a r
Hmmm. I am a little surprised by your mention of "saturation points" as I would
not expect any variety of reduc_plus to be a saturating operation???
A.
Bill Schmidt wrote:
On Fri, 2014-10-24 at 19:49 -0400, David Edelsohn wrote:
On Fri, Oct 24, 2014 at 8:06 AM, Alan Lawrence wr
Ah I see now! Thank you for explaining that bit, I was a bit puzzled when I saw
it, but it makes sense now!
Cheers, Alan
Bill Schmidt wrote:
On Thu, 2014-11-06 at 16:44 +, Alan Lawrence wrote:
Hmmm. I am a little surprised by your mention of "saturation points" as I would
not
n
settled, but there's still ARM, indeed.
If you have any way/ideas to get better error messages (i.e. line numbers),
that'd be particularly good, tho :)
Cheers, Alan
Charles Baylis wrote:
On 6 November 2014 10:19, Alan Lawrence wrote:
Thi
So I'm no expert on RS6000 here, but following on from Segher's observation
about the change in pattern...so the difference in 'expand' is exactly that, a
vsx_reduc_splus_v2df followed by a vec_extract to DF, becomes a
vsx_reduc_splus_v2df_scalar - as I expected the combiner to produce by combin
Nice! One nit - can the extra "tree" argument be a "const_tree" ? - I'll defer
to the maintainers on the use of C++ default arguments in the AArch64 backend.
But LGTM.
--Alan
Charles Baylis wrote:
On 11 November 2014 15:25, Alan Lawrence wrote:
[Resending in gcc-pa
In response to https://gcc.gnu.org/ml/gcc-patches/2014-09/msg01803.html, this
series removes the VEC_RSHIFT_EXPR, instead using a VEC_PERM_EXPR (with a second
argument full of constant zeroes) to represent the shift.
I've kept the use of vec_shr optab for platforms that define it, as even on
p
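The idea of expressing a whole-vector shift as a two-input permute whose second input is all zeroes can be sketched in portable C over small arrays. This is a model of the element-level semantics only: `NELTS`, `vec_perm` and `vec_shift_via_perm` are hypothetical names, and nothing here addresses which direction counts as "right" on a given endianness.

```c
#include <stddef.h>

#define NELTS 4   /* hypothetical vector length for the illustration */

/* Two-input permute: element i of OUT is taken from the concatenation of
   A and B at index SEL[i], where SEL[i] is in the range 0..2*NELTS-1.  */
void
vec_perm (const int *a, const int *b, const unsigned char *sel, int *out)
{
  for (size_t i = 0; i < NELTS; i++)
    out[i] = sel[i] < NELTS ? a[sel[i]] : b[sel[i] - NELTS];
}

/* Whole-vector shift by AMT elements (0 <= AMT <= NELTS), written as a
   permute of V with an all-zero vector: OUT[i] = V[i + AMT], or 0 once
   the index runs off the end of V.  */
void
vec_shift_via_perm (const int *v, unsigned amt, int *out)
{
  static const int zeros[NELTS] = { 0 };
  unsigned char sel[NELTS];

  for (size_t i = 0; i < NELTS; i++)
    sel[i] = (unsigned char) (i + amt);  /* indices >= NELTS pick zeros */
  vec_perm (v, zeros, sel, out);
}
```

With selector indices `amt, amt+1, ...`, elements shifted in from beyond the first input come from the zero vector, which is what lets a permute stand in for a dedicated shift operation.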
This is a preliminary to patch 2, which wants functionality equivalent to
vect_gen_perm_mask (converting a char* to an RTL const_vector) but without the
check of can_vec_perm_p.
All existing calls to vect_gen_perm_mask barring that in perm_mask_for_reverse,
assert the return value is non-null.
This makes the vectorizer use VEC_PERM_EXPRs when doing reductions via shifts,
rather than VEC_RSHIFT_EXPR.
VEC_RSHIFT_EXPR presently has an endianness-dependent meaning (paralleling
vec_shr_optab). While the overall destination of this patch series is to make
these endianness-neutral, this pa
Tested (with patches 1+2):
Bootstrap + check-gcc on x64-none-linux-gnu
cross-tested check-gcc on aarch64-none-elf and aarch64_be-none-elf as these
platforms stand (i.e. without vec_shr_optab).
also cross-tested check-gcc on aarch64-none-elf and aarch64_be-none-elf after
applying https://gcc.
This redefines vec_shr optab to be the same (in terms of gcc vectors) regardless
of target endianness. The vectorizer uses this to do reductions via shifts, so
also change the vectorizer to shift things always the same way (from the
midend's POV of vectors).
cross-tested check-gcc on (1) aarch
Have run check-gcc on gcc110.fsffrance.org (powerpc64-unknown-linux-gnu) using
this snippet on top of original patch; no regressions.
Alan Lawrence wrote:
So I'm no expert on RS6000 here, but following on from Segher's observation
about the change in pattern...so the difference in
Pushed as r217440, also with Charles' whitespace fixes ('' -> tab) -
good spot!
Cheers, Alan
Marcus Shawcroft wrote:
On 6 November 2014 10:19, Alan Lawrence wrote:
This generates out-of-range errors at compile- (rather than assemble-)time
for the vqdm*_lane in
he same wording in invoke.texi, unless you think there
is more to add.
On 04/03/16 13:33, Jakub Jelinek wrote:
> Also, isn't the *.opt description line supposed to end with a full stop?
Ah, yes, thanks.
Is this version OK for trunk?
gcc/ChangeLog:
DATE Alan Lawrence
Jaku
On 07/03/16 11:02, Alan Lawrence wrote:
On 04/03/16 13:27, Richard Biener wrote:
I think to make it work with LTO you need to mark it 'Optimization'.
Also it's about
arrays so maybe
'Assume common declarations may be overridden with ones with a larger
trailing array'
In this PR, a packed structure containing bitfields, loses part of its
constant-pool initialization in SRA.
A fuller explanation is on the PR:
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=70013#c11. In short we need to
treat constant-pool entries, like function parameters, as both come
'pre-initi
On 04/03/16 17:24, Alan Lawrence wrote:
On 26/02/16 14:52, James Greenhalgh wrote:
gcc/ChangeLog:
* gcc/config/aarch64/aarch64.c (aarch64_function_arg_alignment):
Rewrite, looking one level down for records and arrays.
---
gcc/config/aarch64/aarch64.c | 31
On 10/03/16 16:18, Dominique d'Humières wrote:
> The test gfortran.dg/unconstrained_commons.f fails in the 32 bit mode. It
> needs some regexp
Indeed, confirmed on ARM, sorry for not spotting this earlier.
I believe the variable, if there is one, should always be called 'j', as it is
in the sour
Yvan Roux wrote:
Hi,
this patch is a fix for pr27127. It avoids splitting the DI registers
into SI ones if it is not allowed, which breaks the introduced loop.
I haven't added a testcase as the bug is already exhibited by several
regressions (like g++.dg/ext/attribute-test-2.C or g++.dg/eh/simd
Thanks, pushed with comment and ChangeLog fix as r227033.
--Alan
Kyrill Tkachov wrote:
Hi Alan,
On 28/07/15 12:23, Alan Lawrence wrote:
This makes the existing float16 vector intrinsics available only when we have an
__fp16 type (i.e. when one of the ARM_FP16_FORMAT_... macros is defined
James Greenhalgh wrote:
Did you check that these actually emit the expected instruction?
Applying your patch set I see some fairly unpleasant code generation,
but I might have made an error, or perhaps you have another patch in
waiting?
Thanks,
James
Yes, you are right, some of the code gen
James Greenhalgh wrote:
>>
>> - VAR1 (UNOP, vec_unpacks_hi_, 10, v4sf)
>> + VAR2 (UNOP, vec_unpacks_hi_, 10, v4sf, v8hf)
>
> Should this not use the appropriate "BUILTIN_..." iterator?
Indeed; BUILTIN_VQ_HSF it is.
>>VAR1 (BINOP, float_truncate_hi_, 0, v4sf)
>>VAR1 (BINOP, float_truncat
ssa-dom-cse-2.c fails on a number of platforms because the input array is pushed
out to the constant pool, preventing later stages from folding away the entire
computation. This patch series fixes the failure by extending SRA to pull the
constants back in.
This is my first patch(set) to SRA and as
This makes SRA replace loads of records/arrays from constant pool entries,
with elementwise assignments of the constant values, hence, overcoming the
fundamental problem in PR/63679.
As a first pass, the approach I took was to look for constant-pool loads as
we scanned through other accesses, and
I used this as a means of better-testing the previous changes, as it exercises
the constant replacement code a whole lot more. Indeed, quite a few tests are
now optimized away to nothing on AArch64...
Always pulling in constants, is almost certainly not what we want, but we may
nonetheless want so
This changes the completely_scalarize_record path to also work on arrays (thus
allowing records containing arrays, etc.). This just required extending the
existing type_consists_of_records_p and completely_scalarize_record methods
to handle things of ARRAY_TYPE as well as RECORD_TYPE. Hence, I rena
This is a small refactoring/renaming patch, it just moves the call to
"completely_scalarize_record" out from completely_scalarize_var, and renames
the latter to create_total_scalarization_access.
This is because the next patch needs to drop the "_record" suffix and I felt
it would be confusing to
When SRA completely scalarizes an array, this patch changes the generated
accesses from e.g.
MEM[(int[8] *)&a + 4B] = 1;
to
a[1] = 1;
This overcomes a limitation in dom2, that accesses to equivalent chunks of e.g.
MEM[(int[8] *)&a] are not hashable_expr_equal_p with accesses to e.g.
ME
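At the C level, the two access forms described above name the same element; only the spelling differs. The sketch below assumes `sizeof (int)` is 4 so that a byte offset of `sizeof (int)` corresponds to the `+ 4B` in the dump; the function names are hypothetical.

```c
#include <string.h>

/* Byte-offset form, analogous to MEM[(int[8] *)&a + 4B] = 1 when
   sizeof (int) == 4.  */
void
store_by_offset (int (*a)[8])
{
  *(int *) ((char *) a + sizeof (int)) = 1;
}

/* Array-reference form, analogous to a[1] = 1.  */
void
store_by_index (int (*a)[8])
{
  (*a)[1] = 1;
}
```

Both functions write the second element of the array, which is why emitting the array-reference form lets a later pass recognise the two accesses as equal more easily.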
Alan Lawrence wrote:
All AArch64 patches are unchanged from previous version. However, in response to
discussion, the ARM patches are changed (much as I suggested
https://gcc.gnu.org/ml/gcc-patches/2015-07/msg02249.html); this version:
* Hides the existing vcvt_f16_f32 and vcvt_f32_f16
Christophe Lyon wrote:
On 28 July 2015 at 13:26, Alan Lawrence wrote:
This is a respin of
https://gcc.gnu.org/ml/gcc-patches/2015-07/msg00488.html, fixing up the
testsuite for float16 vectors. Relative to the previous version, most of the
additions to the tests are now within #if..#endif such
Sorry - wrong version posted. The hunk for add_options_for_arm_neon_fp16 has
moved to the previous patch! This version also fixes some whitespace issues.
gcc/testsuite/ChangeLog:
* gcc.target/aarch64/advsimd-intrinsics/vcvt_f16.c: New.
* lib/target-supports.exp
(check_effe
Christophe Lyon wrote:
On 28 July 2015 at 13:27, Alan Lawrence wrote:
gcc/testsuite/ChangeLog:
* gcc.target/aarch64/advsimd-intrinsics/advsimd-intrinsics.exp:
set additional flags for neon-fp16 support.
* gcc.target/aarch64/advsimd-intrinsics/vcvt_f16.c: New.
Is that
The end goal of this series of patches is to enable 64bit vector modes for
TARGET_ARRAY_MODE_SUPPORTED_P, achieved in the last patch. At present, doing so
causes ICEs with illegal subregs (e.g. returning the middle bits from a large
int mode covering 3 vectors); the patchset avoids these by first r