on AArch64. OK for commit?
ChangeLog:
2019-11-15 Wilco Dijkstra
PR tree-optimization/90838
* tree-ssa-forwprop.c (optimize_count_trailing_zeroes):
Add new function.
(simplify_count_trailing_zeroes): Add new function.
(pass_forwprop::execute): Try ctz simpl
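For context, here is a sketch of the kind of source idiom a count-trailing-zeroes simplification could target. It assumes the PR concerns table-based ctz implementations, which the snippet above does not state explicitly. The names `debruijn_ctz` and `ctz32_table` are hypothetical; the multiplier and table are the widely used 32-bit de Bruijn sequence.

```c
#include <stdint.h>

/* Lookup table matching the 32-bit de Bruijn multiplier 0x077CB531.  */
static const unsigned char debruijn_ctz[32] = {
  0, 1, 28, 2, 29, 14, 24, 3, 30, 22, 20, 15, 25, 17, 4, 8,
  31, 27, 13, 23, 21, 19, 16, 7, 26, 12, 18, 6, 11, 5, 10, 9
};

/* Hypothetical helper: count the trailing zero bits of a nonzero x.
   x & -x isolates the lowest set bit; multiplying by the de Bruijn
   constant shifts a unique 5-bit pattern into the top bits, which
   then indexes the table.  */
unsigned ctz32_table (uint32_t x)
{
  return debruijn_ctz[((x & -x) * 0x077CB531u) >> 27];
}
```

Note that this idiom returns 0 for x == 0, so any replacement with a single instruction would have to preserve or explicitly handle that case.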
Hi Richard,
> So what do we actually do unpatched with -funroll-loops here?
Yes so it does the insane "fully unrolled trailing loop before the unrolled
loop" thing. One always does the trailing loop last (and typically as an
actual loop of course) and then the code ends up much faster, close to
t
codesize reduces by 0.2%.
OK for commit?
ChangeLog:
2019-11-15 Wilco Dijkstra
* config/arm/arm-cpus.in (armv7): Set tune to Cortex-A53.
(armv7-a): Likewise.
(armv7ve): Likewise.
---
diff --git a/gcc/config/arm/arm-cpus.in b/gcc/config/arm/arm-cpus.in
index
at by
MAX_INSN_PER_IT_BLOCK. Also use the CPU tuning setting when a CPU/tune
is selected if -mrestrict-it is not explicitly set.
On Cortex-A57 this gives 1.1% performance gain on SPECINT2006 as well
as a 0.4% codesize reduction.
Bootstrapped on armhf. OK for commit?
ChangeLog:
2019-08-19 Wilco Dij
uling floating point
code is generally beneficial (more registers and higher latencies), only enable
the pressure scheduler with -Ofast.
On Cortex-A57 this gives a 0.7% performance gain on SPECINT2006 as well
as a 0.2% codesize reduction.
Bootstrapped on armhf. OK for commit?
ChangeLog:
2019-11-06
testcase - libquantum and SPECv6
performance improves.
OK for commit?
ChangeLog:
2018-01-22 Wilco Dijkstra
PR target/79262
* config/aarch64/aarch64.c (generic_vector_cost): Adjust
vec_to_scalar_cost.
--
diff --git a/gcc/config/aarch64/aarch64.c b/gcc/config/aarch64/aarch64.c
Hi Richard,
> I acked this here:
> https://gcc.gnu.org/ml/gcc-patches/2019-10/msg01229.html
Thanks - I missed your email, but it's committed now. Yes we will
need to look at the vector costs again and retune them based on
recent vectorizer improvements and latest microarchitectures.
Cheers,
Wil
The vrbit_1 test was missing a flag to disable code sharing.
Committed as obvious.
ChangeLog:
2019-11-20 Wilco Dijkstra
testsuite/
* gcc.target/aarch64/simd/vrbit_1.c: Add -fno-ipa-icf.
--
diff --git a/gcc/testsuite/gcc.target/aarch64/simd/vrbit_1.c
b/gcc/testsuite/gcc.target
Hi Rainer,
>> ld: warning: symbol 'err' has differing types:
>> (file /var/tmp//ccWQCyMc.o type=OBJT; file /lib/libc.so type=FUNC);
>> /var/tmp//ccWQCyMc.o definition taken
So are glob and err somehow exported as globals by your GLIBC? I don't think
those are standard functions
Add a missing extern to ensure the test passes with -fno-common.
Committed as obvious.
ChangeLog:
2019-11-21 Wilco Dijkstra
testsuite/
* gfortran.dg/global_vars_f90_init_driver.c: Add missing extern.
--
diff --git a/gcc/testsuite/gfortran.dg/global_vars_f90_init_driver.c
b/gcc
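A minimal sketch of the "missing extern" pattern, assuming a variable shared between files; it is not the code from the testcase. The names `shared_counter` and `bump_counter` are hypothetical.

```c
/* Declaration only: this is what a header, or any other file that uses
   the variable, should contain.  If a second file wrote 'int
   shared_counter;' instead, that would be another tentative definition,
   which links with -fcommon but is a multiple-definition error with
   -fno-common.  */
extern int shared_counter;

/* The single definition, placed in exactly one source file.  */
int shared_counter = 0;

/* Hypothetical user of the shared variable.  */
int bump_counter (void)
{
  return ++shared_counter;
}
```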
Hi Andrew,
> Hi if we have an aarch64 compiler that has a big-endian
> multi-lib, it fails to compile libstdc++ because
> simd_fast_mersenne_twister_engine is only defined for little-endian
> in ext/random but ext/opt_random.h thinks it is defined always.
>
> OK? Built an aarch64-elf toolchain whi
Hi Andrew,
Could you repost your patch please to make review easier/quicker? It's no
longer linked...
Cheers,
Wilco
g for -O3 and higher.
OK for commit?
ChangeLog:
2019-11-26 Wilco Dijkstra
PR tree-optimization/80155
* common/config/arm/arm-common.c (arm_option_optimization_table):
Disable -fcode-hoisting with -O3.
--
diff --git a/gcc/common/config/arm/arm-common.c
b/gcc/common/c
Hi Christophe,
> Some time ago, you proposed to enable code hoisting for -Os instead,
> and this is the approach that was chosen
> in arm-9-branch. Why are you proposing a different setting for trunk?
Like I said in my message, I've now done more detailed benchmarking which
shows it affects -O3 p
Hi Richard,
>> Yes so it does the insane "fully unrolled trailing loop before the unrolled
>> loop" thing. One always does the trailing loop last (and typically as an
>> actual loop of course) and then the code ends up much faster, close to
>> the ideal version shown in the PR.
>
> Well, you can't
ped on AArch64. OK for commit?
ChangeLog:
2019-11-15 Wilco Dijkstra
PR tree-optimization/90838
* tree-ssa-forwprop.c (optimize_count_trailing_zeroes):
Add new function.
(simplify_count_trailing_zeroes): Add new function.
(pass_forwprop::execute): Try c
Hi Martin,
> I've noticed quite significant package failures caused by the revision.
How significant? Is it mostly the common mistake of forgetting extern?
> Would you please consider documenting this change in porting_to.html
> (and in changes.html) for GCC 10 release?
Sure, I already had a pa
Hi,
Add support for fused compare with branch. Rename the existing
AARCH64_FUSE_CMP_BRANCH to ALU_BRANCH, and AARCH64_FUSE_ALU_BRANCH
to ALU_CBZ to make it clear what is being fused.
AArch64 bootstrap OK, OK to commit?
ChangeLog:
2019-11-29 Wilco Dijkstra
* config/aarch64/aarch64
Hi,
I've backported r268189 to GCC8:
aarch64: fix use-after-free in -march=native (PR driver/89014)
Running:
$ valgrind ./xgcc -B. -c test.c -march=native
on aarch64 shows a use-after-free in host_detect_local_cpu due
to the std::string result of aarch64_get_extension_string_for_isa_flags
only
Add support for Cortex-A76, Ares and Neoverse N1 cpu names in GCC8 branch.
2019-11-29 Wilco Dijkstra
* config/aarch64/aarch64-cores.def (ares): Define.
(cortex-a76): Likewise.
(neoverse-n1): Likewise.
* config/aarch64/aarch64-tune.md: Regenerate.
* doc
s have max_cond_insns set to 5 due to historical reasons.
Benchmarking shows that max_cond_insns=2 is fastest on modern Cortex-A
cores, so change it to 2. Set it to 4 on older in-order cores as that is
the MAX_INSN_PER_IT_BLOCK limit for Thumb-2.
Bootstrapped on armhf. OK for commit?
ChangeLo
SPECINT2006 as well as a
0.4% codesize reduction.
Bootstrapped on armhf. OK for commit?
ChangeLog:
2019-12-03 Wilco Dijkstra
* config/arm/arm.c (arm_option_override_internal):
Use max_cond_insns from CPU tuning unless -mrestrict-it is used.
--
diff --git a/gcc/config/arm/arm.c b
for fused compare with branch. Rename the existing
AARCH64_FUSE_CMP_BRANCH to ALU_BRANCH, and AARCH64_FUSE_ALU_BRANCH
to ALU_CBZ to make it clear what is being fused.
AArch64 bootstrap OK, OK to commit?
ChangeLog:
2019-12-03 Wilco Dijkstra
* config/aarch64/aarch64.c
Hi,
A quick benchmark shows it's faster up to about 10 bytes, but after that it
becomes extremely slow. At 16 bytes it's already 2.5 times slower and for
larger sizes it's over 13 times slower than the GLIBC implementation...
> The implementation falls back to the library call if the
> string is
Hi,
> But we still have an issue with performance, when we are using default
> unwinder, which uses unwind tables. It could be up to 10 times faster to
> use frame based stack unwinder instead "default unwinder".
Switching on the frame pointer typically costs 1-2% performance, so it's a bad
idea
ping
From: Wilco Dijkstra
Sent: 18 June 2018 15:01
To: GCC Patches
Cc: nd; Joseph Myers
Subject: [PATCH v3] Change default to -fno-math-errno
GCC currently defaults to -fmath-errno. This generates code assuming math
functions set errno and the application checks errno. Few applications
Hi Steve,
The latest version compiles the examples I used correctly, so it looks fine
from that perspective (but see comments below). However the key point of
the ABI is to enable better code generation when calling a vector function,
and that will likely require further changes that may conflict
.
This results in larger, slower code. Benchmarking FP reassociation width=1
showed a ~0.5% gain on SPECFP2006 and similar gains on other benchmarks,
so change it to 1.
Passes regress & bootstrap, OK for commit?
ChangeLog:
2017-06-12 Wilco Dijkstra
* gcc/config/aarch64/aarch
SPECFP2006 is 1.1% faster.
Passes AArch64 and ARM bootstrap and regress.
ChangeLog:
2017-05-30 Wilco Dijkstra
* config/arm/cortex-a53.md (cortex_a53_fpalu): Adjust latency.
(cortex_a53_fconst): Likewise.
(cortex_a53_fpmul): Likewise.
(cortex_a53_f_load_64): Likewise
Richard Earnshaw (lists) wrote:
>
> Why 1 and not 2? Many processors have 2 fp pipes and forcing this down
> to a sequential stream is not obviously the right thing.
1 was faster than 2. Like I said, the reassociation is too aggressive and even
splits multiply-add rather than keeping them. Until
ping
Richard Earnshaw (lists) wrote:
> On 05/05/17 13:42, Wilco Dijkstra wrote:
>> Richard Earnshaw (lists) wrote:
>>> On 04/05/17 18:38, Wilco Dijkstra wrote:
>>> > Richard Earnshaw wrote:
>>> >
>>>>> - 5,
ping
Richard Earnshaw (lists) wrote:
> --- a/gcc/config/arm/aarch-common.c
> +++ b/gcc/config/arm/aarch-common.c
> @@ -254,12 +254,7 @@ arm_no_early_alu_shift_dep (rtx producer, rtx consumer)
> return 0;
>
> if ((early_op = arm_find_shift_sub_rtx (op)))
> - {
> - if (REG_P (
ping
Richard Earnshaw (lists) wrote:
> (define_insn "*movdi_vfp"
> - [(set (match_operand:DI 0 "nonimmediate_di_operand"
> "=r,r,r,r,q,q,m,w,r,w,w, Uv")
> + [(set (match_operand:DI 0 "nonimmediate_di_operand"
> "=r,r,r,r,q,q,m,w,!r,w,w, Uv")
> Why have you introduced a no-reloads block
ping
From: Wilco Dijkstra
Sent: 31 October 2016 18:29
To: GCC Patches
Cc: nd
Subject: [RFC][PATCH][AArch64] Cleanup frame pointer usage
This patch cleans up all code related to the frame pointer. On AArch64 we
emit a frame chain even in cases where the frame pointer is not required.
So
ping
From: Wilco Dijkstra
Sent: 03 November 2016 12:20
To: GCC Patches
Cc: nd
Subject: [PATCH][ARM] Fix ldrd offsets
Fix ldrd offsets of Thumb-2 - for TARGET_LDRD the range is +-1020,
without -255..4091. This reduces the number of addressing instructions
when using DI mode operations
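As an illustration of the +-1020 range mentioned above, a small sketch of an offset check. The function name `thumb2_ldrd_offset_ok` is hypothetical and this is not the code from the patch; the multiple-of-4 requirement is an assumption taken from the Thumb-2 LDRD encoding, which scales an 8-bit immediate by 4.

```c
/* Hypothetical predicate: can OFFSET be used directly as the immediate
   of a Thumb-2 LDRD/STRD?  The immediate is 8 bits scaled by 4 with a
   separate sign, so valid offsets are multiples of 4 in [-1020, 1020].  */
int thumb2_ldrd_offset_ok (long offset)
{
  return offset >= -1020 && offset <= 1020 && offset % 4 == 0;
}
```

Offsets outside this range need an extra address-computation instruction, which is the overhead the patch description refers to.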
ping
From: Wilco Dijkstra
Sent: 10 November 2016 17:19
To: GCC Patches
Cc: nd
Subject: [PATCH][ARM] Improve max_insns_skipped logic
Improve the logic when setting max_insns_skipped. Limit the maximum size of IT
to MAX_INSN_PER_IT_BLOCK as otherwise multiple IT instructions are needed
ping
From: Wilco Dijkstra
Sent: 17 January 2017 18:00
To: GCC Patches
Cc: nd; Kyrylo Tkachov; Richard Earnshaw
Subject: [PATCH][ARM] Remove Thumb-2 iordi_not patterns
After Bernd's DImode patch [1] almost all DImode operations are expanded
early (except for -mfpu=neon). This mean
ping
From: Wilco Dijkstra
Sent: 17 January 2017 19:23
To: GCC Patches
Cc: nd; Kyrill Tkachov; Richard Earnshaw
Subject: [PATCH][ARM] Remove DImode expansions for 1-bit shifts
A left shift of 1 can always be done using an add, so slightly adjust rtx
cost for DImode left shift by 1 so that
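The equivalence behind this can be sketched in C; the function names are hypothetical. On a 32-bit target the add form of a 64-bit value typically maps to an add-with-carry pair, which is why it can be preferable to a generic 64-bit shift sequence.

```c
#include <stdint.h>

/* 64-bit left shift by one.  */
uint64_t shl1 (uint64_t x)
{
  return x << 1;
}

/* The same result computed as an addition; unsigned arithmetic wraps,
   so the top bit is discarded exactly as in the shift.  */
uint64_t add_self (uint64_t x)
{
  return x + x;
}
```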
ping
From: Wilco Dijkstra
Sent: 17 January 2017 15:14
To: Richard Earnshaw; GCC Patches; James Greenhalgh
Cc: nd
Subject: Re: [PATCH v3][AArch64] Fix symbol offset limit
Here is v3 of the patch - tree_fits_uhwi_p was necessary to ensure the size of a
declaration is an integer. So the
James Greenhalgh wrote:
> I note this is still marked as an RFC, are you now proposing it as a
> patch to be merged to trunk?
Absolutely. It was marked as an RFC to get some comments - I thought it
may be controversial to separate the frame pointer and frame chain concept.
And this fixes the lon
Hi,
Let's get back to the patch and the bug it fixes. The only outstanding question
is what constant offsets we should allow when generating a relocation:
> So the question is whether we should allow
> largish offsets outside of the bounds of symbols (v1), no offsets (this
> version), or
> smal
Wilco Dijkstra wrote:
> James Greenhalgh wrote:
>
> > I note this is still marked as an RFC, are you now proposing it as a
> > patch to be merged to trunk?
>
> Absolutely. It was marked as an RFC to get some comments - I thought it
> may be controversial to separate
Richard Earnshaw wrote:
> Yes, I still believe that this is a bug in the way we've documented the
> -mcmodel=tiny and -mcmodel=small options.
In what way could this possibly be a documentation bug? It's not at all related
to the size of a binary. There is no limit to the offset you can apply to a
Jiong Wang wrote:
test.c
===
struct K {
int a;
int b;
int c;
int d;
char e;
short f;
long g;
float h;
double i;
};
void foo (int, struct K *);
void test (int i)
{
struct K k = {
.a = 5,
.b = 0,
.c = i,
};
foo (5, &k);
}
There are 2 separate latent bugs here, bo
Richard Earnshaw wrote:
>
> You can write it, but it's meaningless by the C standard. You can't
> take the address beyond one after the size of the object, so anything
> more than &a+1 has no meaning.
No it's perfectly valid and such out-of-range cases occur thousands of
times when building any n
Richard Earnshaw wrote:
C11: Summary of undefined behaviours.
— Addition or subtraction of a pointer into, or just beyond, an array
object and an integer type produces a result that does not point into, or
just beyond, the same array object (6.5.6).
That's totally irrelevant given the addition i
Richard Earnshaw wrote:
> No it's not. The optimizer doesn't create totally random bases. If the
> code + data is less than 1M in size, then any offsets it does create
> will fit within the size of the relocations selected by the compiler.
No that's completely false. There is no way you can guar
add w0, w0, w2
cmp w0, 100
ble .L5
ldr w2, [x3, 8]
add w1, w1, w2
.L5:
ldr w2, [x3, 4]
add w0, w0, w2
add w0, w0, w1
ret
Passes regress and bootstrap, OK for commit?
ChangeLog:
2017-06-19 Wilco
ret
Passes regress & bootstrap, OK for commit?
ChangeLog:
2017-06-20 Wilco Dijkstra
* config/aarch64/aarch64-simd.md (aarch64_simd_dup):
Swap alternatives, make integer dup more expensive.
--
diff --git a/gcc/config/aarch64/aarch64-simd.md
b/gcc/config/aarch64/aarc
SIMD moves are currently emitted as ORR. Change this to use the MOV
pseudo instruction just like integer moves (the ARM-ARM states MOV is the
preferred disassembly), improving readability of -S output.
Passes bootstrap, OK for commit?
ChangeLog:
2017-06-20 Wilco Dijkstra
* config
James Greenhalgh wrote:
>
> Does this introduce a dependency on a particular binutils version, or have
> we always supported this alias?
>
> The patch looks OK, but I don't want to introduce a new dependency so please
> check how far back this is supported.
Well gas/testsuite/gas/aarch64/alias.s
p OK, OK for commit?
ChangeLog:
2017-06-20 Wilco Dijkstra
* config/aarch64/aarch64.c (aarch64_legitimate_constant_p):
Return true for non-tls symbols.
--
diff --git a/gcc/config/aarch64/aarch64.c b/gcc/config/aarch64/aarch64.c
index
5ec6bbfcf484baa4005b8
James Greenhalgh wrote:
>
> Have you tested this in cases where an integer dup is definitely the right
> thing to do?
Yes, this still generates:
#include <arm_neon.h>
void f(unsigned a, unsigned b, uint32x4_t *c)
{
c[0] = vdupq_n_u32(a);
c[1] = vdupq_n_u32(b);
}
dup v1.4s, w0
Jeff Law wrote:
> But the stack pointer might have already been advanced into the guard
> page by the caller. For the sake of argument assume the guard page is
> 0xf1000 and assume that our stack pointer at entry is 0xf1010 and that
> the caller hasn't touched the 0xf1000 page.
>
> If FrameSize >
Richard Earnshaw wrote:
> A mere 256 bytes for the caller would permit 32 x 8byte arguments on the
> stack which, with at least 8 parameters passed in registers, would allow
> for calls with 40 parameters. There can't be many in that space. Any
> function making calls with more than that might ne
Jeff Law wrote:
> I'm a little confused. I'm not defining or changing the ABI. I'm
> working within my understanding of the existing aarch64 ABI used on
> linux systems. My understanding after reading that ABI and the prologue
> code for aarch64 is there's nothing that can currently be relied u
Jeff Law wrote:
> You can be in one of 3 states when you start the callee's prologue.
>
> 1. You're somewhere in the normal stack.
>
> 2. You've passed the guard and are already in the heap or elsewhere
>
> 3. You're somewhere in the guard
>
> State #3 is what we're trying to address. The attacker h
Andreas Schwab wrote:
>
> This breaks gcc.target/aarch64/reload-valid-spoff.c with -mabi=ilp32:
Indeed, there is an odd ILP32 bug that causes high/lo_sum to be generated
in SI mode in expand:
(insn 15 14 16 4 (set (reg:SI 125)
(high:SI (symbol_ref/u:DI ("*.LC1") [flags 0x2])))
(nil))
/aarch64/reload-valid-spoff.c triggered
by https://gcc.gnu.org/ml/gcc-patches/2017-06/msg01367.html.
OK for commit?
ChangeLog:
2017-06-26 Wilco Dijkstra
* config/aarch64/aarch64.md (load_pairsi): Avoid Pmode.
(store_pairsi): Likewise.
(load_pairdi): Likewise
Hi Yvan,
> Here is the backport of Wilco's patch (r237607) along with Kyrill's
> one (r244643, which removed the remaining occurrences of
> aarch64_nopcrelative_literal_loads). To fix the issue the original
> patch has to be modified, to keep aarch64_pcrelative_literal_loads
> test for large model
Pmode as the base address, but
aarch64_expand_mov_immediate wasn't emitting a conversion in one case.
Besides fixing this add an assert that flags any MEM operands that are
not Pmode.
Passes regress (with/without ilp32). OK for commit?
ChangeLog:
2017-06-27 Wilco Dijkstra
* c
Hi,
This patch has been superseded by:
https://gcc.gnu.org/ml/gcc-patches/2017-06/msg02027.html
Wilco
?
ChangeLog:
2017-06-27 Wilco Dijkstra
PR target/79041
* config/aarch64/aarch64.c (aarch64_classify_symbol):
Avoid SYMBOL_SMALL_ABSOLUTE.
* testsuite/gcc.target/aarch64/pr79041-2.c: New test.
--
diff --git a/gcc/config/aarch64/aarch64.c b/gcc/config/aarch64/aarch64.c
ping
From: Wilco Dijkstra
Sent: 17 January 2017 15:14
To: Richard Earnshaw; GCC Patches; James Greenhalgh
Cc: nd
Subject: Re: [PATCH v3][AArch64] Fix symbol offset limit
Here is v3 of the patch - tree_fits_uhwi_p was necessary to ensure the size of a
declaration is an integer. So the
ping
Wilco Dijkstra wrote:
> James Greenhalgh wrote:
>
> > I note this is still marked as an RFC, are you now proposing it as a
> > patch to be merged to trunk?
>
> Absolutely. It was marked as an RFC to get some comments - I thought it
> may be controversial to
ping
From: Wilco Dijkstra
Sent: 17 January 2017 19:23
To: GCC Patches
Cc: nd; Kyrill Tkachov; Richard Earnshaw
Subject: [PATCH][ARM] Remove DImode expansions for 1-bit shifts
A left shift of 1 can always be done using an add, so slightly adjust rtx
cost for DImode left shift by 1 so
ping
From: Wilco Dijkstra
Sent: 10 November 2016 17:19
To: GCC Patches
Cc: nd
Subject: [PATCH][ARM] Improve max_insns_skipped logic
Improve the logic when setting max_insns_skipped. Limit the maximum size of IT
to MAX_INSN_PER_IT_BLOCK as otherwise multiple IT instructions are needed
ping
From: Wilco Dijkstra
Sent: 17 January 2017 18:00
To: GCC Patches
Cc: nd; Kyrylo Tkachov; Richard Earnshaw
Subject: [PATCH][ARM] Remove Thumb-2 iordi_not patterns
After Bernd's DImode patch [1] almost all DImode operations are expanded
early (except for -mfpu=neon). This mean
ping
From: Wilco Dijkstra
Sent: 03 November 2016 12:20
To: GCC Patches
Cc: nd
Subject: [PATCH][ARM] Fix ldrd offsets
Fix ldrd offsets of Thumb-2 - for TARGET_LDRD the range is +-1020,
without -255..4091. This reduces the number of addressing instructions
when using DI mode operations
ping
Richard Earnshaw (lists) wrote:
> (define_insn "*movdi_vfp"
> - [(set (match_operand:DI 0 "nonimmediate_di_operand"
> "=r,r,r,r,q,q,m,w,r,w,w, Uv")
> + [(set (match_operand:DI 0 "nonimmediate_di_operand"
> "=r,r,r,r,q,q,m,w,!r,w,w, Uv")
> Why have you introduced a no-reloads
ping
On Fri, May 05, 2017 at 05:02:46PM +0100, Wilco Dijkstra wrote:
> Richard Earnshaw (lists) wrote:
>
> > --- a/gcc/config/arm/aarch-common.c
> > +++ b/gcc/config/arm/aarch-common.c
> > @@ -254,12 +254,7 @@ arm_no_early_alu_shift_dep (rtx producer, rtx con
ping
Richard Earnshaw (lists) wrote:
> On 05/05/17 13:42, Wilco Dijkstra wrote:
>> Richard Earnshaw (lists) wrote:
>>> On 04/05/17 18:38, Wilco Dijkstra wrote:
>>> > Richard Earnshaw wrote:
>>> >
>>>>> - 5,
Ramana Radhakrishnan wrote:
>
> I'm about to run home for the day but this came in from
> https://gcc.gnu.org/ml/gcc-patches/2013-09/msg02109.html and James
> said in that email that this was put in to ensure no segfaults on
> cortex-a15 / cortex-a7 tuning.
The code is historical - an older ve
Georg-Johann Lay wrote:
@@ -5300,6 +5300,9 @@ seq_cost (const rtx_insn *seq, bool spee
set = single_set (seq);
if (set)
cost += set_rtx_cost (set, speed);
+ else if (INSN_P (seq)
+ && PARALLEL == GET_CODE (PATTERN (seq)))
+ cost += insn_rtx_cost (PATT
Richard Biener wrote:
> Hurugalawadi, Naveen wrote:
> > The code (m1 > m2) * d should be optimized as m1 > m2 ? d : 0.
> What's the reason of this transform? I expect that the HW multiplier
> is quite fast given one operand is either zero or one and a multiplication
> is a gimple operation th
Richard Biener wrote:
> int f (int m, int c)
> {
> return (m & 1) * c;
> }
This case (integer[0,1] rather than boolean input) should be transformed into c
& -(m & 1).
Wilco
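A short sketch of why the two forms agree, with hypothetical function names: m & 1 is 0 or 1, so its negation is either 0 or all-ones, and the AND then selects 0 or c without a multiply.

```c
/* Original form: multiply by a value known to be 0 or 1.  */
int mul_form (int m, int c)
{
  return (m & 1) * c;
}

/* Rewritten form: -(m & 1) is 0 or -1 (all bits set), so the AND
   yields 0 or c respectively.  */
int and_form (int m, int c)
{
  return c & -(m & 1);
}
```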
Andreas Schwab wrote:
> @@ -5207,6 +5209,7 @@ aarch64_print_operand (FILE *f, rtx x, int code)
>
> case MEM:
> output_address (GET_MODE (x), XEXP (x, 0));
> + gcc_assert (GET_MODE (XEXP (x, 0)) == Pmode);
> break;
>
> case CONST:
> That breaks a lot of gna
Michael Matz wrote:
>
> You'll probably also have to set GNATBIND and GNATMAKE to the
> appropriately suffixed variants. Just saying, because that's what I'm
> usually forgetting and end up with strange errors :)
Configure seems to be able to find gnatbind/gnatmake as they are in /usr/bin.
Com
Joseph Myers wrote:
> On Fri, 3 Nov 2017, Wilco Dijkstra wrote:
>
> > Almost all targets add an explicit -fomit-frame-pointer in the target
> > specific
> > options. Rather than doing this in a target-specific way, do this in the
>
> Which targets do not? You shou
Jeff Law wrote:
> I'd actually prefer to deprecate the H8 and M68k. But assuming that's
> not going to happen in the immediate future I think dropping frame
> pointers on those targets is appropriate as long as we're generating
> dwarf frame info.
Is there a way to check a target does not genera
Richard Biener wrote:
> On Tue, Oct 17, 2017 at 6:32 PM, Wilco Dijkstra
> wrote:
>> (if (flag_reciprocal_math)
>> - /* Convert (A/B)/C to A/(B*C) */
>> + /* Convert (A/B)/C to A/(B*C). */
>> (simplify
>> (rdiv (rdiv:s @0 @1) @2)
>> - (rdiv @0
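To illustrate the (A/B)/C to A/(B*C) rewrite quoted above, a sketch with hypothetical function names. The two forms are algebraically equal but may differ in the last bits of the result, which is presumably why the pattern sits behind flag_reciprocal_math.

```c
/* Two divisions, as written in the source.  */
double div_twice (double a, double b, double c)
{
  return (a / b) / c;
}

/* One multiply and one division, the rewritten form.  */
double div_once (double a, double b, double c)
{
  return a / (b * c);
}

/* Hypothetical helper: compare with a small relative tolerance, since
   the two forms are not guaranteed to round identically.  */
int nearly_equal (double x, double y)
{
  double diff = x > y ? x - y : y - x;
  double mag = x < 0 ? -x : x;
  return diff <= mag * 1e-12;
}
```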
Richard Biener wrote:
> On Tue, Oct 17, 2017 at 6:28 PM, Wilco Dijkstra
> wrote:
>> +(if (flag_unsafe_math_optimizations)
>> + /* Simplify (C / x op 0.0) to x op 0.0 for C > 0. */
>> + (for op (lt le gt ge)
>> + neg_op (gt ge lt le)
>> + (sim
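A sketch of the comparison simplification quoted above, using C = 3.0 and op = '>' as an example; the function names are hypothetical. For ordinary nonzero x both forms agree, but at x == +0.0 the division yields +inf, so the results differ under IEEE arithmetic.

```c
/* Original form: (C / x) > 0.0 with C = 3.0.  */
int cmp_div (double x)
{
  return 3.0 / x > 0.0;
}

/* Simplified form: x > 0.0.  */
int cmp_simple (double x)
{
  return x > 0.0;
}
```

With x == +0.0, cmp_div returns 1 (3.0 / 0.0 is +inf) while cmp_simple returns 0, which is consistent with the pattern being guarded by flag_unsafe_math_optimizations.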
Sandra Loosemore wrote:
> I'd prefer that you remove the reference to configure options entirely
> here. Nowadays most GCC users install a package provided by their OS
> distribution, Linaro, etc, rather than trying to build GCC from scratch.
OK, I've removed that reference. Similarly the FRAM
Improve the AArch64 frame tests - add -f(no-)omit-frame-pointer,
update checks and add missing tests. As a result all tests now
pass.
Committed as obvious.
ChangeLog:
2017-11-16 Wilco Dijkstra
* gcc.target/aarch64/lr_free_2.c: Fix test.
* gcc.target/aarch64/spill_1.c
ping
From: Jackson Woodruff
Sent: 06 September 2017 10:55
To: Richard Biener
Cc: Wilco Dijkstra; kyrylo.tkac...@foss.arm.com; Joseph S. Myers; GCC Patches
Subject: Re: [PATCH] Factor out division by squares and remove division around
comparisons (2/2)
Hi all,
A minor improvement came to
ase that should cause a FP exception:
void f(void)
{
0.0 / 0.0;
}
Compiles to:
f:
ret
OK for commit?
2017-11-16 Wilco Dijkstra
* common.opt (ftrapping-math): Change default to 0.
* doc/invoke.texi (-ftrapping-math): Update documentation.
--
diff --git a/gcc/c
Richard Biener wrote:
> We are generally not preserving traps but we guard any transform that
> might introduce traps with -ftrapping-math. That's similar to how we treat
> -ftrapv and pointer dereferences.
Right. It appears it's mostly concerned about division - if it is about division
by zero
Passes regress & bootstrap, OK for commit?
ChangeLog:
2017-11-17 Wilco Dijkstra
* config/aarch64/aarch64.md (mov): Remove '*' in alternatives.
(movsi_aarch64): Likewise.
(load_pairsi): Likewise.
(load_pairdi): Likewise.
(store_p
for commit until we get rid of it?
ChangeLog:
2017-11-17 Wilco Dijkstra
gcc/
* config/aarch64/aarch64.h (SLOW_BYTE_ACCESS): Set to 1.
--
diff --git a/gcc/config/aarch64/aarch64.h b/gcc/config/aarch64/aarch64.h
index
056110afb228fb919e837c04aa5e55
at way you could pass the size/alignment/volatile and
decide per bitfield access.
What do people think?
ChangeLog:
2017-11-17 Wilco Dijkstra
* config/aarch64/aarch64.h: Remove SLOW_BYTE_ACCESS.
* config/alpha/alpha.h: Likewise.
* config/arc/arc.h: Likewise.
*
any packages fail to get an
idea how feasible it is. We could keep defaulting to -fcommon with -std=c89
if necessary.
2017-11-17 Wilco Dijkstra
* common.opt (fcommon): Change init to 1.
* doc/invoke.texi (-fcommon): Update documentation.
--
diff --git a/gcc/common.opt b/
Richard Biener wrote:
> A target specific default might be a good idea if we decide to revert.
>
> Note I proposed this change a few times already, but the fear was always
> we'll break too much legacy code.
It will definitely break some code, but new warnings with -Werror might too...
> Note y
Michael Matz wrote:
> bss _sections_ != bss-like segments in the executable. Targets might not
> have a bss section that could be named in the asm file, or no way to
> switch to it without disrupting surrounding code, but they might have
> common symbols, which ultimately might or might not be
that applies to store_pair_lanes, uses PARALLEL when calling
aarch64_classify_address so that it knows it is an STP.
Also add the 'z' specifier for future use by load/store pair instructions.
Passes regress, OK for commit?
ChangeLog:
2017-11-27 Wilco Dijkstra
*
Szabolcs Nagy wrote:
>On 28/10/17 05:08, Jeff Law wrote:
>
>> My hope would be that we simply don't ever use the params. They were
>> done as much for *you* to experiment with as anything. I'd happy just
>> delete them as there's essentially no guard rails to ensure their values
>> are sane.
>
>
the ICE in https://gcc.gnu.org/ml/gcc-patches/2017-11/msg02509.html.
ChangeLog:
2017-11-30 Wilco Dijkstra
gcc/
* config/aarch64/aarch64.md (call_insn): Use %c rather than %a.
(call_value_insn): Likewise.
(sibcall_insn): Likewise.
(sibcall_value_
Hi Qing,
Just looking at a very high level, I have a few comments:
1. Constant folding str(n)cmp - folding is done separately in fold-const-call.c
and gimple-fold.c. There is already code for folding strcmp and strncmp,
so we shouldn't need to add new foldings. Or do you have an example t
comments more readable.
Bootstrap OK, OK for trunk?
ChangeLog:
2017-12-20 Wilco Dijkstra
gcc/
PR tree-optimization/83491
* tree-ssa-math-opts.c (execute_cse_reciprocals_1): Check for SSA_NAME
before walking uses. Improve coding style and comments.
gcc/testsuite
ping (note also Jeff's reply
https://gcc.gnu.org/ml/gcc-patches/2017-11/msg01916.html)
From: Wilco Dijkstra
Sent: 15 November 2017 15:36
To: Richard Biener
Cc: GCC Patches; nd
Subject: Re: [PATCH] Simplify floating point comparisons
Richard Biener wrote:
> On Tue, Oct 17, 2017 at
?
ChangeLog:
2018-01-04 Wilco Dijkstra
gcc/
* config/aarch64/aarch64.md (fma4): Change into expand pattern.
(fnma4): Likewise.
(fms4): Likewise.
(fnms4): Likewise.
(aarch64_fma4): Rename insn, reorder accumulator operand.
(aarch64_fnma4): Likewise