Re: [PATCH] PR90838: Support ctz idioms

2019-11-15 Thread Wilco Dijkstra
on AArch64. OK for commit? ChangeLog: 2019-11-15 Wilco Dijkstra PR tree-optimization/90838 * tree-ssa-forwprop.c (optimize_count_trailing_zeroes): Add new function. (simplify_count_trailing_zeroes): Add new function. (pass_forwprop::execute): Try ctz simpl
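
For readers unfamiliar with the term, a "ctz idiom" is a hand-written count-trailing-zeros sequence. A minimal sketch of one well-known form, the de Bruijn multiply-and-lookup, follows. Whether this is exactly the shape the patch recognises is an assumption, and the names `debruijn_ctz_table` and `ctz32_debruijn` are hypothetical, not taken from the patch.

```c
#include <stdint.h>

/* Bit positions indexed by the top 5 bits of (lowest_set_bit * 0x077CB531). */
static const unsigned char debruijn_ctz_table[32] = {
     0,  1, 28,  2, 29, 14, 24,  3, 30, 22, 20, 15, 25, 17,  4,  8,
    31, 27, 13, 23, 21, 19, 16,  7, 26, 12, 18,  6, 11,  5, 10,  9
};

/* Hypothetical helper: count trailing zeros of a nonzero 32-bit value.
   For x == 0 it returns table[0] == 0, which is not a meaningful count. */
unsigned ctz32_debruijn(uint32_t x)
{
    uint32_t lowest = x & (0u - x);   /* isolate the lowest set bit */
    return debruijn_ctz_table[(uint32_t)(lowest * 0x077CB531u) >> 27];
}
```

A compiler that recognises this pattern can replace the multiply and table load with a single count-trailing-zeros operation where the target has one.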

Re: [PATCH, GCC, AArch64] Fix PR88398 for AArch64

2019-11-15 Thread Wilco Dijkstra
Hi Richard, > So what do we actually do unpatched with -funroll-loops here? Yes so it does the insane "fully unrolled trailing loop before the unrolled loop" thing. One always does the trailing loop last (and typically as an actual loop of course) and then the code ends up much faster, close to t

[PATCH][Arm] Set Armv7-A tune to Cortex-A53

2019-11-18 Thread Wilco Dijkstra
codesize reduces by 0.2%. OK for commit? ChangeLog: 2019-11-15 Wilco Dijkstra * config/arm/arm-cpus.in (armv7): Set tune to Cortex-A53. (armv7-a): Likewise. (armv7ve): Likewise. --- diff --git a/gcc/config/arm/arm-cpus.in b/gcc/config/arm/arm-cpus.in index

Re: [PATCH][ARM] Improve max_cond_insns setting for Cortex cores

2019-11-19 Thread Wilco Dijkstra
at by MAX_INSN_PER_IT_BLOCK. Also use the CPU tuning setting when a CPU/tune is selected if -mrestrict-it is not explicitly set. On Cortex-A57 this gives 1.1% performance gain on SPECINT2006 as well as a 0.4% codesize reduction. Bootstrapped on armhf. OK for commit? ChangeLog: 2019-08-19 Wilco Dij

Re: [PATCH][Arm] Only enable fsched-pressure with Ofast

2019-11-19 Thread Wilco Dijkstra
uling floating point code is generally beneficial (more registers and higher latencies), only enable the pressure scheduler with -Ofast. On Cortex-A57 this gives a 0.7% performance gain on SPECINT2006 as well as a 0.2% codesize reduction. Bootstrapped on armhf. OK for commit? ChangeLog: 2019-11-06

Re: [PATCH][AArch64] PR79262: Adjust vector cost

2019-11-19 Thread Wilco Dijkstra
testcase - libquantum and SPECv6 performance improves. OK for commit? ChangeLog: 2018-01-22 Wilco Dijkstra PR target/79262 * config/aarch64/aarch64.c (generic_vector_cost): Adjust vec_to_scalar_cost. -- diff --git a/gcc/config/aarch64/aarch64.c b/gcc/config/aarch64/aarch64.c

Re: [PATCH][AArch64] PR79262: Adjust vector cost

2019-11-19 Thread Wilco Dijkstra
Hi Richard, > I acked this here: > https://gcc.gnu.org/ml/gcc-patches/2019-10/msg01229.html Thanks - I missed your email, but it's committed now. Yes we will need to look at the vector costs again and retune them based on recent vectorizer improvements and latest microarchitectures. Cheers, Wil

[COMMITTED][AArch64] Fix vrbit_1.c test failure

2019-11-20 Thread Wilco Dijkstra
The vrbit_1 test was missing a flag to disable code sharing. Committed as obvious. ChangeLog: 2019-11-20 Wilco Dijkstra testsuite/ * gcc.target/aarch64/simd/vrbit_1.c: Add -fno-ipa-icf. -- diff --git a/gcc/testsuite/gcc.target/aarch64/simd/vrbit_1.c b/gcc/testsuite/gcc.target

Re: [PATCH] Fix libgo build (was Re: [PATCH v3] PR85678: Change default to -fno-common)

2019-11-21 Thread Wilco Dijkstra
Hi Rainer, >> ld: warning: symbol 'err' has differing types: >> (file /var/tmp//ccWQCyMc.o type=OBJT; file /lib/libc.so type=FUNC); >> /var/tmp//ccWQCyMc.o definition taken So are glob and err somehow exported as globals by your GLIBC? I don't think those are standard functions

[COMMITTED] Fix global_vars_f90_init test failure

2019-11-21 Thread Wilco Dijkstra
Add a missing extern to ensure the test passes with -fno-common. Committed as obvious. ChangeLog: 2019-11-21 Wilco Dijkstra testsuite/ * gfortran.dg/global_vars_f90_init_driver.c: Add missing extern. -- diff --git a/gcc/testsuite/gfortran.dg/global_vars_f90_init_driver.c b/gcc
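
As a hedged illustration of the "missing extern" pattern: with -fno-common, a variable defined without `extern` in more than one translation unit is a multiple-definition link error, so a shared variable needs one `extern` declaration and exactly one definition. The sketch below squeezes both into a single block for brevity; the file names in the comments and the identifiers `shared_counter` and `bump_shared_counter` are hypothetical and unrelated to the actual test.

```c
/* shared.h (hypothetical): declaration only, safe to include everywhere. */
extern int shared_counter;

/* shared.c (hypothetical): the one and only definition. */
int shared_counter = 0;

/* Any translation unit that sees the declaration can use the variable. */
int bump_shared_counter(void)
{
    return ++shared_counter;
}
```

Dropping `extern` from the header line would turn it into a tentative definition in every includer, which only links when common symbols are enabled.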

Re: [PATCH] Fix libstdc++ compiling for an aarch64 multilib with big-endian.

2019-11-26 Thread Wilco Dijkstra
Hi Andrew, > Hi if we have an aarch64 compiler that has a big-endian > multi-lib, it fails to compile libstdc++ because > simd_fast_mersenne_twister_engine is only defined for little-endian > in ext/random but ext/opt_random.h thinks it is defined always. > > OK? Built an aarch64-elf toolchain whi

Re: [PATCH/AARCH64] Generate FRINTZ for (double)(long) under -ffast-math on aarch64

2019-11-26 Thread Wilco Dijkstra
Hi Andrew, Could you repost your patch please to make review easier/quicker? It's no longer linked... Cheers, Wilco

[PATCH v2][ARM] Disable code hoisting with -O3 (PR80155)

2019-11-26 Thread Wilco Dijkstra
g for -O3 and higher. OK for commit? ChangeLog: 2019-11-26 Wilco Dijkstra PR tree-optimization/80155 * common/config/arm/arm-common.c (arm_option_optimization_table): Disable -fcode-hoisting with -O3. -- diff --git a/gcc/common/config/arm/arm-common.c b/gcc/common/c

Re: [PATCH v2][ARM] Disable code hoisting with -O3 (PR80155)

2019-11-26 Thread Wilco Dijkstra
Hi Christophe, > Some time ago, you proposed to enable code hoisting for -Os instead, > and this is the approach that was chosen > in arm-9-branch. Why are you proposing a different setting for trunk? Like I said in my message, I've now done more detailed benchmarking which shows it affects -O3 p

Re: [PATCH, GCC, AArch64] Fix PR88398 for AArch64

2019-11-27 Thread Wilco Dijkstra
Hi Richard, >> Yes so it does the insane "fully unrolled trailing loop before the unrolled >> loop" thing. One always does the trailing loop last (and typically as an >> actual loop of course) and then the code ends up much faster, close to >> the ideal version shown in the PR. > > Well, you can't

Re: [PATCH] PR90838: Support ctz idioms

2019-11-28 Thread Wilco Dijkstra
ped on AArch64. OK for commit? ChangeLog: 2019-11-15 Wilco Dijkstra PR tree-optimization/90838 * tree-ssa-forwprop.c (optimize_count_trailing_zeroes): Add new function. (simplify_count_trailing_zeroes): Add new function. (pass_forwprop::execute): Try c

Re: [PATCH] PR85678: Change default to -fno-common

2019-11-29 Thread Wilco Dijkstra
Hi Martin, > I've noticed quite significant package failures caused by the revision. How significant? Is it mostly the common mistake of forgetting extern? > Would you please consider documenting this change in porting_to.html > (and in changes.html) for GCC 10 release? Sure, I already had a pa

[PATCH][AArch64] Add support for fused compare and branch

2019-11-29 Thread Wilco Dijkstra
Hi, Add support for fused compare with branch. Rename the existing AARCH64_FUSE_CMP_BRANCH to ALU_BRANCH, and AARCH64_FUSE_ALU_BRANCH to ALU_CBZ to make it clear what is being fused. AArch64 bootstrap OK, OK to commit? ChangeLog: 2019-11-29 Wilco Dijkstra * config/aarch64/aarch64

[COMMITTED][GCC8] Backport driver/89014 Use-after-free in aarch64 -march=native

2019-11-29 Thread Wilco Dijkstra
Hi, I've backported r268189 to GCC8: aarch64: fix use-after-free in -march=native (PR driver/89014) Running: $ valgrind ./xgcc -B. -c test.c -march=native on aarch64 shows a use-after-free in host_detect_local_cpu due to the std::string result of aarch64_get_extension_string_for_isa_flags only

[PATCH][GCC8][AArch64] Backport Cortex-A76, Ares and Neoverse N1 cpu names

2019-12-02 Thread Wilco Dijkstra
Add support for Cortex-A76, Ares and Neoverse N1 cpu names in GCC8 branch. 2019-11-29 Wilco Dijkstra * config/aarch64/aarch64-cores.def (ares): Define. (cortex-a76): Likewise. (neoverse-n1): Likewise. * config/aarch64/aarch64-tune.md: Regenerate. * doc

Re: [PATCH][ARM] Improve max_cond_insns setting for Cortex cores

2019-12-03 Thread Wilco Dijkstra
s have max_cond_insns set to 5 due to historical reasons. Benchmarking shows that max_cond_insns=2 is fastest on modern Cortex-A cores, so change it to 2. Set it to 4 on older in-order cores as that is the MAX_INSN_PER_IT_BLOCK limit for Thumb-2. Bootstrapped on armhf. OK for commit? ChangeLo

[PATCH v2 2/2][ARM] Improve max_cond_insns setting for Cortex cores

2019-12-03 Thread Wilco Dijkstra
SPECINT2006 as well as a 0.4% codesize reduction. Bootstrapped on armhf. OK for commit? ChangeLog: 2019-12-03 Wilco Dijkstra * config/arm/arm.c (arm_option_override_internal): Use max_cond_insns from CPU tuning unless -mrestrict-it is used. -- diff --git a/gcc/config/arm/arm.c b

Re: [PATCH][AArch64] Add support for fused compare and branch

2019-12-03 Thread Wilco Dijkstra
for fused compare with branch. Rename the existing AARCH64_FUSE_CMP_BRANCH to ALU_BRANCH, and AARCH64_FUSE_ALU_BRANCH to ALU_CBZ to make it clear what is being fused. AArch64 bootstrap OK, OK to commit? ChangeLog: 2019-12-03 Wilco Dijkstra * config/aarch64/aarch64.c

Re: [PATCH][AARCH64] inline strlen for 8-bytes aligned strings

2018-08-10 Thread Wilco Dijkstra
Hi, A quick benchmark shows it's faster up to about 10 bytes, but after that it becomes extremely slow. At 16 bytes it's already 2.5 times slower and for larger sizes it's over 13 times slower than the GLIBC implementation... > The implementation falls back to the library call if the > string is

Re: [PATCH] Frame pointer for arm with THUMB2 mode

2018-08-27 Thread Wilco Dijkstra
Hi, > But we still have an issue with performance, when we are using the default > unwinder, which uses unwind tables. It could be up to 10 times faster to > use a frame-based stack unwinder instead of the "default unwinder". Switching on the frame pointer typically costs 1-2% performance, so it's a bad idea

Re: [PATCH v3] Change default to -fno-math-errno

2018-09-04 Thread Wilco Dijkstra
ping From: Wilco Dijkstra Sent: 18 June 2018 15:01 To: GCC Patches Cc: nd; Joseph Myers Subject: [PATCH v3] Change default to -fno-math-errno   GCC currently defaults to -fmath-errno.  This generates code assuming math functions set errno and the application checks errno.  Few applications

Re: [Patch][Aarch64] Implement Aarch64 SIMD ABI and aarch64_vector_pcs attribute

2018-09-04 Thread Wilco Dijkstra
Hi Steve, The latest version compiles the examples I used correctly, so it looks fine from that perspective (but see comments below). However the key point of the ABI is to enable better code generation when calling a vector function, and that will likely require further changes that may conflict

[PATCH][AArch64] Change FP reassociation width

2017-06-12 Thread Wilco Dijkstra
. This results in larger, slower code. Benchmarking FP reassociation width=1 showed a ~0.5% gain on SPECFP2006 and similar gains on other benchmarks, so change it to 1. Passes regress & bootstrap, OK for commit? ChangeLog: 2017-06-12 Wilco Dijkstra * gcc/config/aarch64/aarch

[PATCH][AArch64] Improve Cortex-A53 FP scheduler

2017-06-12 Thread Wilco Dijkstra
SPECFP2006 is 1.1% faster. Passes AArch64 and ARM bootstrap and regress. ChangeLog: 2017-05-30 Wilco Dijkstra * config/arm/cortex-a53.md (cortex_a53_fpalu) Adjust latency. (cortex_a53_fconst): Likewise. (cortex_a53_fpmul): Likewise. (cortex_a53_f_load_64): Likewise

Re: [PATCH][AArch64] Change FP reassociation width

2017-06-13 Thread Wilco Dijkstra
Richard Earnshaw (lists) wrote: > > Why 1 and not 2?  Many processors have 2 fp pipes and forcing this down > to a sequential stream is not obviously the right thing. 1 was faster than 2. Like I said, the reassociation is too aggressive and even splits multiply-add rather than keeping them. Until

Re: [PATCH][ARM] Update max_cond_insns settings

2017-06-13 Thread Wilco Dijkstra
ping   Richard Earnshaw (lists) wrote: > On 05/05/17 13:42, Wilco Dijkstra wrote: >> Richard Earnshaw (lists) wrote: >>> On 04/05/17 18:38, Wilco Dijkstra wrote: >>> > Richard Earnshaw wrote: >>> > >>>>> -  5, 

Re: [PATCH][AArch64] Improve Cortex-A53 shift bypass

2017-06-13 Thread Wilco Dijkstra
ping   Richard Earnshaw (lists) wrote: > --- a/gcc/config/arm/aarch-common.c > +++ b/gcc/config/arm/aarch-common.c > @@ -254,12 +254,7 @@ arm_no_early_alu_shift_dep (rtx producer, rtx consumer) >  return 0; >  >    if ((early_op = arm_find_shift_sub_rtx (op))) > -    { > -  if (REG_P (

Re: [PATCH][ARM] Remove movdi_vfp_cortexa8

2017-06-13 Thread Wilco Dijkstra
ping   Richard Earnshaw (lists) wrote: >  (define_insn "*movdi_vfp" > -  [(set (match_operand:DI 0 "nonimmediate_di_operand" > "=r,r,r,r,q,q,m,w,r,w,w, Uv") > +  [(set (match_operand:DI 0 "nonimmediate_di_operand" > "=r,r,r,r,q,q,m,w,!r,w,w, Uv") > Why have you introduced a no-reloads block

Re: [RFC][PATCH][AArch64] Cleanup frame pointer usage

2017-06-13 Thread Wilco Dijkstra
ping From: Wilco Dijkstra Sent: 31 October 2016 18:29 To: GCC Patches Cc: nd Subject: [RFC][PATCH][AArch64] Cleanup frame pointer usage     This patch cleans up all code related to the frame pointer.  On AArch64 we emit a frame chain even in cases where the frame pointer is not required. So

Re: [PATCH][ARM] Fix ldrd offsets

2017-06-13 Thread Wilco Dijkstra
  ping From: Wilco Dijkstra Sent: 03 November 2016 12:20 To: GCC Patches Cc: nd Subject: [PATCH][ARM] Fix ldrd offsets     Fix ldrd offsets of Thumb-2 - for TARGET_LDRD the range is +-1020, without -255..4091.  This reduces the number of addressing instructions when using DI mode operations
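
The offset ranges quoted above can be written down as a small predicate. This is only a sketch: the function name `thumb2_di_offset_ok` is hypothetical, the numeric bounds are copied from the message, and the multiple-of-4 check reflects the LDRD/STRD immediate being encoded as an 8-bit value scaled by 4. How the real back end expresses this is not shown in the snippet.

```c
#include <stdbool.h>

/* Hypothetical check for a Thumb-2 DImode access at base + offset. */
bool thumb2_di_offset_ok(long offset, bool have_ldrd)
{
    if (have_ldrd)
        /* LDRD/STRD: +-1020, immediate is a multiple of 4. */
        return offset >= -1020 && offset <= 1020 && (offset % 4) == 0;

    /* Without LDRD: the -255..4091 range quoted in the message. */
    return offset >= -255 && offset <= 4091;
}
```

A wider accepted range means fewer separate add instructions to form the address before a DImode load or store.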

Re: [PATCH][ARM] Improve max_insns_skipped logic

2017-06-13 Thread Wilco Dijkstra
ping From: Wilco Dijkstra Sent: 10 November 2016 17:19 To: GCC Patches Cc: nd Subject: [PATCH][ARM] Improve max_insns_skipped logic     Improve the logic when setting max_insns_skipped.  Limit the maximum size of IT to MAX_INSN_PER_IT_BLOCK as otherwise multiple IT instructions are needed

Re: [PATCH][ARM] Remove Thumb-2 iordi_not patterns

2017-06-13 Thread Wilco Dijkstra
ping From: Wilco Dijkstra Sent: 17 January 2017 18:00 To: GCC Patches Cc: nd; Kyrylo Tkachov; Richard Earnshaw Subject: [PATCH][ARM] Remove Thumb-2 iordi_not patterns     After Bernd's DImode patch [1] almost all DImode operations are expanded early (except for -mfpu=neon). This mean

Re: [PATCH][ARM] Remove DImode expansions for 1-bit shifts

2017-06-13 Thread Wilco Dijkstra
ping From: Wilco Dijkstra Sent: 17 January 2017 19:23 To: GCC Patches Cc: nd; Kyrill Tkachov; Richard Earnshaw Subject: [PATCH][ARM] Remove DImode expansions for 1-bit shifts     A left shift of 1 can always be done using an add, so slightly adjust rtx cost for DImode left shift by 1 so that
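
To make the "left shift of 1 can always be done using an add" remark concrete, here is a minimal sketch of a 64-bit shift by one built from two 32-bit additions with a carry, which is roughly what an add/add-with-carry pair does on a 32-bit target. The function name `shl1_via_add` is hypothetical and the code illustrates the arithmetic only, not the instruction sequence the patch produces.

```c
#include <stdint.h>

/* Hypothetical helper: x << 1 computed on 32-bit halves using adds. */
uint64_t shl1_via_add(uint64_t x)
{
    uint32_t lo = (uint32_t)x;
    uint32_t hi = (uint32_t)(x >> 32);

    uint32_t lo2 = lo + lo;           /* low half: add to itself */
    uint32_t carry = lo2 < lo;        /* 1 if the low add wrapped */
    uint32_t hi2 = hi + hi + carry;   /* high half: add with carry-in */

    return ((uint64_t)hi2 << 32) | lo2;
}
```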

Re: [PATCH v3][AArch64] Fix symbol offset limit

2017-06-13 Thread Wilco Dijkstra
ping From: Wilco Dijkstra Sent: 17 January 2017 15:14 To: Richard Earnshaw; GCC Patches; James Greenhalgh Cc: nd Subject: Re: [PATCH v3][AArch64] Fix symbol offset limit     Here is v3 of the patch - tree_fits_uhwi_p was necessary to ensure the size of a declaration is an integer. So the

Re: [RFC][PATCH][AArch64] Cleanup frame pointer usage

2017-06-14 Thread Wilco Dijkstra
James Greenhalgh wrote: > I note this is still marked as an RFC, are you now proposing it as a > patch to be merged to trunk? Absolutely. It was marked as an RFC to get some comments - I thought it may be controversial to separate the frame pointer and frame chain concept. And this fixes the lon

Re: [PATCH v3][AArch64] Fix symbol offset limit

2017-06-14 Thread Wilco Dijkstra
Hi, Let's get back to the patch and the bug it fixes. The only outstanding question is what constant offsets we should allow when generating a relocation: > So the question is whether we should allow > largish offsets outside of the bounds of symbols (v1), no offsets (this > version), or > smal

Re: [RFC][PATCH][AArch64] Cleanup frame pointer usage

2017-06-15 Thread Wilco Dijkstra
Wilco Dijkstra wrote: > James Greenhalgh wrote: > > > I note this is still marked as an RFC, are you now proposing it as a > > patch to be merged to trunk? > > Absolutely. It was marked as an RFC to get some comments - I thought it > may be controversial to separate

Re: [PATCH v3][AArch64] Fix symbol offset limit

2017-06-15 Thread Wilco Dijkstra
Richard Earnshaw wrote: > Yes, I still believe that this is a bug in the way we've documented the > -mcmodel=tiny and -mcmodel=small options. In what way could this possibly be a documentation bug? It's not at all related to the size of a binary. There is no limit to the offset you can apply to a

Re: [RFC][PATCH][AArch64] Cleanup frame pointer usage

2017-06-15 Thread Wilco Dijkstra
Jiong Wang wrote: test.c === struct K {   int a;   int b;   int c;   int d;   char e;   short f;   long g;   float h;   double i; }; void foo (int, struct K *); void test (int i) {   struct K k = {    .a = 5,    .b = 0,    .c = i,   };   foo (5, &k); } There are 2 separate latent bugs here, bo

Re: [PATCH v3][AArch64] Fix symbol offset limit

2017-06-15 Thread Wilco Dijkstra
Richard Earnshaw wrote: > > You can write it, but it's meaningless by the C standard.  You can't > take the address beyond one after the size of the object, so anything > more than &a+1 has no meaning. No it's perfectly valid and such out-of-range cases occur thousands of times when building any n

Re: [PATCH v3][AArch64] Fix symbol offset limit

2017-06-15 Thread Wilco Dijkstra
Richard Earnshaw wrote: C11: Summary of undefined behaviours. — Addition or subtraction of a pointer into, or just beyond, an array object and an integer type produces a result that does not point into, or just beyond, the same array object (6.5.6). That's totally irrelevant given the addition i

Re: [PATCH v3][AArch64] Fix symbol offset limit

2017-06-15 Thread Wilco Dijkstra
Richard Earnshaw wrote: > No it's not.  The optimizer doesn't create totally random bases.  If the > code + data is less than 1M in size, then any offsets it does create > will fit within the size of the relocations selected by the compiler. No that's completely false. There is no way you can guar

[PATCH][AArch64] Mark symbols as constant

2017-06-19 Thread Wilco Dijkstra
add w0, w0, w2 cmp w0, 100 ble .L5 ldr w2, [x3, 8] add w1, w1, w2 .L5: ldr w2, [x3, 4] add w0, w0, w2 add w0, w0, w1 ret Passes regress and bootstrap, OK for commit? ChangeLog: 2017-06-19 Wilco

[PATCH][AArch64] Improve dup pattern

2017-06-20 Thread Wilco Dijkstra
ret Passes regress & bootstrap, OK for commit? ChangeLog: 2017-06-20 Wilco Dijkstra * config/aarch64/aarch64-simd.md (aarch64_simd_dup): Swap alternatives, make integer dup more expensive. -- diff --git a/gcc/config/aarch64/aarch64-simd.md b/gcc/config/aarch64/aarc

[PATCH][AArch64] Emit SIMD moves as mov

2017-06-20 Thread Wilco Dijkstra
SIMD moves are currently emitted as ORR. Change this to use the MOV pseudo instruction just like integer moves (the ARM-ARM states MOV is the preferred disassembly), improving readability of -S output. Passes bootstrap, OK for commit? ChangeLog: 2017-06-20 Wilco Dijkstra * config

Re: [PATCH][AArch64] Emit SIMD moves as mov

2017-06-20 Thread Wilco Dijkstra
James Greenhalgh wrote: > > Does this introduce a dependency on a particular binutils version, or have > we always supported this alias? > > The patch looks OK, but I don't want to introduce a new dependency so please > check how far back this is supported. Well gas/testsuite/gas/aarch64/alias.s

Re: [PATCH][AArch64] Mark symbols as constant

2017-06-20 Thread Wilco Dijkstra
p OK, OK for commit? ChangeLog: 2017-06-20 Wilco Dijkstra * config/aarch64/aarch64.c (aarch64_legitimate_constant_p): Return true for non-tls symbols. -- diff --git a/gcc/config/aarch64/aarch64.c b/gcc/config/aarch64/aarch64.c index 5ec6bbfcf484baa4005b8

Re: [PATCH][AArch64] Improve dup pattern

2017-06-20 Thread Wilco Dijkstra
James Greenhalgh wrote: > > Have you tested this in cases where an integer dup is definitely the right > thing to do? Yes, this still generates:   #include <arm_neon.h>   void f(unsigned a, unsigned b, uint32x4_t *c)   {     c[0] = vdupq_n_u32(a);     c[1] = vdupq_n_u32(b);   } dup v1.4s, w0

Re: RFC: stack/heap collision vulnerability and mitigation with GCC

2017-06-20 Thread Wilco Dijkstra
Jeff Law wrote: > But the stack pointer might have already been advanced into the guard > page by the caller.   For the sake of argument assume the guard page is > 0xf1000 and assume that our stack pointer at entry is 0xf1010 and that > the caller hasn't touched the 0xf1000 page. > > If FrameSize >

Re: RFC: stack/heap collision vulnerability and mitigation with GCC

2017-06-21 Thread Wilco Dijkstra
Richard Earnshaw wrote: > A mere 256 bytes for the caller would permit 32 x 8byte arguments on the > stack which, with at least 8 parameters passed in registers, would allow > for calls with 40 parameters.  There can't be many in that space.  Any > function making calls with more than that might ne

Re: RFC: stack/heap collision vulnerability and mitigation with GCC

2017-06-21 Thread Wilco Dijkstra
Jeff Law wrote: > I'm a little confused.  I'm not defining or changing the ABI.  I'm > working within my understanding of the existing aarch64 ABI used on > linux systems.  My understanding after reading that ABI and the prologue > code for aarch64 is there's nothing that can currently be relied u

Re: RFC: stack/heap collision vulnerability and mitigation with GCC

2017-06-22 Thread Wilco Dijkstra
Jeff Law wrote: > You can be in one of 3 states when you start the callee's prologue. > > 1. You're somewhere in the normal stack. > > 2. You've passed the guard and are already in the heap or elsewhere > > 3. You're somewhere in the guard > > State #3 is what we're trying to address.  The attacker h

Re: [PATCH][AArch64] Mark symbols as constant

2017-06-23 Thread Wilco Dijkstra
Andreas Schwab wrote: > > This breaks gcc.target/aarch64/reload-valid-spoff.c with -mabi=ilp32: Indeed, there is an odd ILP32 bug that causes high/lo_sum to be generated in SI mode in expand: (insn 15 14 16 4 (set (reg:SI 125) (high:SI (symbol_ref/u:DI ("*.LC1") [flags 0x2]))) (nil))

[PATCH][AArch64] Fix ldp/stp patterns for ILP32

2017-06-26 Thread Wilco Dijkstra
/aarch64/reload-valid-spoff.c triggered by https://gcc.gnu.org/ml/gcc-patches/2017-06/msg01367.html. OK for commit? ChangeLog: 2017-06-26 Wilco Dijkstra * config/aarch64/aarch64.md (load_pairsi): Avoid Pmode. (store_pairsi): Likewise. (load_pairdi): Likewise

Re: [PATCH][AArch64][GCC 6] PR target/79041: Correct -mpc-relative-literal-loads logic in aarch64_classify_symbol

2017-06-27 Thread Wilco Dijkstra
Hi Yvan, > Here is the backport of Wilco's patch (r237607) along with Kyrill's > one (r244643, which removed the remaining occurrences of > aarch64_nopcrelative_literal_loads).  To fix the issue the original > patch has to be modified, to keep aarch64_pcrelative_literal_loads > test for large model

[PATCH][AArch64] Fix ILP32 memory access

2017-06-27 Thread Wilco Dijkstra
Pmode as the base address, but aarch64_expand_mov_immediate wasn't emitting a conversion in one case. Besides fixing this add an assert that flags any MEM operands that are not Pmode. Passes regress (with/without ilp32). OK for commit? ChangeLog: 2017-06-27 Wilco Dijkstra * c

Re: [PATCH][AArch64] Fix ldp/stp patterns for ILP32

2017-06-27 Thread Wilco Dijkstra
Hi, This patch has been superseded by: https://gcc.gnu.org/ml/gcc-patches/2017-06/msg02027.html Wilco

[PATCH][AArch64] Fix PR79041

2017-06-27 Thread Wilco Dijkstra
? ChangeLog: 2017-06-27 Wilco Dijkstra PR target/79041 * config/aarch64/aarch64.c (aarch64_classify_symbol): Avoid SYMBOL_SMALL_ABSOLUTE. * testsuite/gcc.target/aarch64/pr79041-2.c: New test. -- diff --git a/gcc/config/aarch64/aarch64.c b/gcc/config/aarch64/aarch64.c

Re: [PATCH v3][AArch64] Fix symbol offset limit

2017-06-27 Thread Wilco Dijkstra
ping From: Wilco Dijkstra Sent: 17 January 2017 15:14 To: Richard Earnshaw; GCC Patches; James Greenhalgh Cc: nd Subject: Re: [PATCH v3][AArch64] Fix symbol offset limit     Here is v3 of the patch - tree_fits_uhwi_p was necessary to ensure the size of a declaration is an integer. So the

Re: [RFC][PATCH][AArch64] Cleanup frame pointer usage

2017-06-27 Thread Wilco Dijkstra
ping   Wilco Dijkstra wrote: > James Greenhalgh wrote: > > > I note this is still marked as an RFC, are you now proposing it as a > > patch to be merged to trunk? > > Absolutely. It was marked as an RFC to get some comments - I thought it > may be controversial to

Re: [PATCH][ARM] Remove DImode expansions for 1-bit shifts

2017-06-27 Thread Wilco Dijkstra
  ping From: Wilco Dijkstra Sent: 17 January 2017 19:23 To: GCC Patches Cc: nd; Kyrill Tkachov; Richard Earnshaw Subject: [PATCH][ARM] Remove DImode expansions for 1-bit shifts     A left shift of 1 can always be done using an add, so slightly adjust rtx cost for DImode left shift by 1 so

Re: [PATCH][ARM] Improve max_insns_skipped logic

2017-06-27 Thread Wilco Dijkstra
  ping From: Wilco Dijkstra Sent: 10 November 2016 17:19 To: GCC Patches Cc: nd Subject: [PATCH][ARM] Improve max_insns_skipped logic     Improve the logic when setting max_insns_skipped.  Limit the maximum size of IT to MAX_INSN_PER_IT_BLOCK as otherwise multiple IT instructions are needed

Re: [PATCH][ARM] Remove Thumb-2 iordi_not patterns

2017-06-27 Thread Wilco Dijkstra
  ping From: Wilco Dijkstra Sent: 17 January 2017 18:00 To: GCC Patches Cc: nd; Kyrylo Tkachov; Richard Earnshaw Subject: [PATCH][ARM] Remove Thumb-2 iordi_not patterns     After Bernd's DImode patch [1] almost all DImode operations are expanded early (except for -mfpu=neon). This mean

Re: [PATCH][ARM] Fix ldrd offsets

2017-06-27 Thread Wilco Dijkstra
    ping From: Wilco Dijkstra Sent: 03 November 2016 12:20 To: GCC Patches Cc: nd Subject: [PATCH][ARM] Fix ldrd offsets     Fix ldrd offsets of Thumb-2 - for TARGET_LDRD the range is +-1020, without -255..4091.  This reduces the number of addressing instructions when using DI mode operations

Re: [PATCH][ARM] Remove movdi_vfp_cortexa8

2017-06-27 Thread Wilco Dijkstra
  ping     Richard Earnshaw (lists) wrote: >  (define_insn "*movdi_vfp" > -  [(set (match_operand:DI 0 "nonimmediate_di_operand" > "=r,r,r,r,q,q,m,w,r,w,w, Uv") > +  [(set (match_operand:DI 0 "nonimmediate_di_operand" > "=r,r,r,r,q,q,m,w,!r,w,w, Uv") > Why have you introduced a no-reloads

Re: [PATCH][AArch64] Improve Cortex-A53 shift bypass

2017-06-27 Thread Wilco Dijkstra
ping   On Fri, May 05, 2017 at 05:02:46PM +0100, Wilco Dijkstra wrote: > Richard Earnshaw (lists) wrote: > > > --- a/gcc/config/arm/aarch-common.c > > +++ b/gcc/config/arm/aarch-common.c > > @@ -254,12 +254,7 @@ arm_no_early_alu_shift_dep (rtx producer, rtx con

Re: [PATCH][ARM] Update max_cond_insns settings

2017-06-27 Thread Wilco Dijkstra
  ping     Richard Earnshaw (lists) wrote: > On 05/05/17 13:42, Wilco Dijkstra wrote: >> Richard Earnshaw (lists) wrote: >>> On 04/05/17 18:38, Wilco Dijkstra wrote: >>> > Richard Earnshaw wrote: >>> > >>>>> -  5, 

Re: [PATCH][AArch64] Improve Cortex-A53 shift bypass

2017-06-28 Thread Wilco Dijkstra
Ramana Radhakrishnan wrote: >  > I'm about to run home for the day but this came in from > https://gcc.gnu.org/ml/gcc-patches/2013-09/msg02109.html and James > said in that email that this was put in to ensure no segfaults on > cortex-a15 / cortex-a7 tuning. The code is historical - an older ve

Re: [patch][Ping #3] PR80929: Realistic PARALLEL cost in seq_cost.

2017-06-28 Thread Wilco Dijkstra
Georg-Johann Lay wrote: @@ -5300,6 +5300,9 @@ seq_cost (const rtx_insn *seq, bool spee set = single_set (seq); if (set) cost += set_rtx_cost (set, speed); + else if (INSN_P (seq) + && PARALLEL == GET_CODE (PATTERN (seq))) + cost += insn_rtx_cost (PATT

Re: [PATCH] Transform (m1 > m2) * d into m1> m2 ? d : 0

2017-06-29 Thread Wilco Dijkstra
Richard Biener wrote: > Hurugalawadi, Naveen wrote: > > The code (m1 > m2) * d code should be optimized as m1> m2 ? d : 0. > What's the reason of this transform? I expect that the HW multiplier > is quite fast given one operand is either zero or one and a multiplication > is a gimple operation th
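
The two forms under discussion can be put side by side. This is a small sketch with hypothetical function names, meant only to show that they compute the same value for integer operands; it says nothing about which form is faster on a given target, which is the point being debated.

```c
/* Multiply by the 0/1 result of the comparison. */
int mul_by_compare(int m1, int m2, int d)
{
    return (m1 > m2) * d;
}

/* Equivalent conditional-select form. */
int select_by_compare(int m1, int m2, int d)
{
    return m1 > m2 ? d : 0;
}
```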

Re: [PATCH] Transform (m1 > m2) * d into m1> m2 ? d : 0

2017-06-29 Thread Wilco Dijkstra
Richard Biener wrote: > int f (int m, int c) > { >  return (m & 1) * c; > } This case (integer[0,1] rather than boolean input) should be transformed into c & -(m & 1). Wilco
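
A minimal sketch of the rewrite suggested here, with hypothetical function names: since `m & 1` is either 0 or 1, its negation is either 0 or -1 (all bits set), so ANDing with it yields either 0 or `c`, the same result as the multiply.

```c
/* Original form: multiply by the low bit of m. */
int mul_by_low_bit(int m, int c)
{
    return (m & 1) * c;
}

/* Suggested form: mask c with 0 or all-ones instead of multiplying. */
int mask_by_low_bit(int m, int c)
{
    return c & -(m & 1);
}
```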

Re: [PATCH][AArch64] Fix ILP32 memory access

2017-07-04 Thread Wilco Dijkstra
Andreas Schwab wrote: > @@ -5207,6 +5209,7 @@ aarch64_print_operand (FILE *f, rtx x, int code) >  >    case MEM: >  output_address (GET_MODE (x), XEXP (x, 0)); > +   gcc_assert (GET_MODE (XEXP (x, 0)) == Pmode); >  break; >  >    case CONST: > That breaks a lot of gna

Re: [PATCH][AArch64] Fix ILP32 memory access

2017-07-04 Thread Wilco Dijkstra
Michael Matz wrote: > > You'll probably also have to set GNATBIND and GNATMAKE to the > appropriately suffixed variants.  Just saying, because that's what I'm > usually forgetting and end up with strange errors :) Configure seems to be able to find gnatbind/gnatmake as they are in /usr/bin. Com

Re: [PATCH] Set default to -fomit-frame-pointer

2017-11-08 Thread Wilco Dijkstra
Joseph Myers wrote: > On Fri, 3 Nov 2017, Wilco Dijkstra wrote: > > > Almost all targets add an explicit -fomit-frame-pointer in the target > > specific > > options.  Rather than doing this in a target-specific way, do this in the > > Which targets do not?  You shou

Re: [PATCH] Set default to -fomit-frame-pointer

2017-11-08 Thread Wilco Dijkstra
Jeff Law wrote: > I'd actually prefer to deprecate the H8 and M68k.  But assuming that's > not going to happen in the immediate future I think dropping frame > pointers on those targets is appropriate as long as we're generating > dwarf frame info. Is there a way to check a target does not genera

Re: [PATCH] Canonicalize constant multiplies in division

2017-11-15 Thread Wilco Dijkstra
Richard Biener wrote: > On Tue, Oct 17, 2017 at 6:32 PM, Wilco Dijkstra > wrote: >>  (if (flag_reciprocal_math) >> - /* Convert (A/B)/C to A/(B*C)  */ >> + /* Convert (A/B)/C to A/(B*C). */ >>   (simplify >>    (rdiv (rdiv:s @0 @1) @2) >> -   (rdiv @0

Re: [PATCH] Simplify floating point comparisons

2017-11-15 Thread Wilco Dijkstra
Richard Biener wrote: > On Tue, Oct 17, 2017 at 6:28 PM, Wilco Dijkstra > wrote: >> +(if (flag_unsafe_math_optimizations) >> +  /* Simplify (C / x op 0.0) to x op 0.0 for C > 0.  */ >> +  (for op (lt le gt ge) >> +   neg_op (gt ge lt le) >> +    (sim
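
The simplification quoted above, (C / x op 0.0) to x op 0.0 for C > 0, can be sketched in plain C as follows. The function names and the choice of C = 2.0 are hypothetical. The two forms agree for finite nonzero x; they can differ at zero (for example 2.0 / 0.0 > 0.0 is true under IEEE arithmetic while 0.0 > 0.0 is false), which is presumably why the rule sits behind flag_unsafe_math_optimizations.

```c
#include <stdbool.h>

/* Comparison with the division still in place (C = 2.0). */
bool lt_zero_with_division(double x)
{
    return 2.0 / x < 0.0;
}

/* Simplified form: the sign of 2.0 / x is the sign of x. */
bool lt_zero_simplified(double x)
{
    return x < 0.0;
}

/* Same pair for the 'greater than' operator. */
bool gt_zero_with_division(double x)
{
    return 2.0 / x > 0.0;
}

bool gt_zero_simplified(double x)
{
    return x > 0.0;
}
```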

Re: [PATCH] Set default to -fomit-frame-pointer

2017-11-15 Thread Wilco Dijkstra
Sandra Loosemore wrote: > I'd prefer that you remove the reference to configure options entirely > here.  Nowadays most GCC users install a package provided by their OS > distribution, Linaro, etc, rather than trying to build GCC from scratch. OK, I've removed that reference. Similarly the FRAM

[COMMITTED][AArch64] Fix frame tests

2017-11-16 Thread Wilco Dijkstra
Improve the AArch64 frame tests - add -f(no-)omit-frame-pointer, update checks and add missing tests. As a result all tests now pass. Committed as obvious. ChangeLog: 2017-11-16 Wilco Dijkstra * gcc.target/aarch64/lr_free_2.c: Fix test. * gcc.target/aarch64/spill_1.c

Re: [PATCH] Factor out division by squares and remove division around comparisons (2/2)

2017-11-16 Thread Wilco Dijkstra
ping From: Jackson Woodruff Sent: 06 September 2017 10:55 To: Richard Biener Cc: Wilco Dijkstra; kyrylo.tkac...@foss.arm.com; Joseph S. Myers; GCC Patches Subject: Re: [PATCH] Factor out division by squares and remove division around comparisons (2/2)   Hi all, A minor improvement came to

[PATCH] Disable -ftrapping-math by default

2017-11-16 Thread Wilco Dijkstra
ase that should cause a FP exception: void f(void) { 0.0 / 0.0; } Compiles to: f: ret OK for commit? 2017-11-16 Wilco Dijkstra * common.opt (ftrapping-math): Change default to 0. * doc/invoke.texi (-ftrapping-math): Update documentation. -- diff --git a/gcc/c

Re: [PATCH] Disable -ftrapping-math by default

2017-11-16 Thread Wilco Dijkstra
Richard Biener wrote: > We are generally not preserving traps but we guard any transform that > might introduce traps with -ftrapping-math.  That's similar to how we treat > -ftrapv and pointer dereferences. Right. It appears it's mostly concerned about division - if it is about division by zero

[PATCH][AArch64] Remove remaining uses of * in patterns

2017-11-17 Thread Wilco Dijkstra
Passes regress & bootstrap, OK for commit? ChangeLog: 2017-11-17 Wilco Dijkstra * config/aarch64/aarch64.md (mov): Remove '*' in alternatives. (movsi_aarch64): Likewise. (load_pairsi): Likewise. (load_pairdi): Likewise. (store_p

[PATCH][AArch64] Set SLOW_BYTE_ACCESS

2017-11-17 Thread Wilco Dijkstra
for commit until we get rid of it? ChangeLog: 2017-11-17 Wilco Dijkstra gcc/ * config/aarch64/aarch64.h (SLOW_BYTE_ACCESS): Set to 1. -- diff --git a/gcc/config/aarch64/aarch64.h b/gcc/config/aarch64/aarch64.h index 056110afb228fb919e837c04aa5e55

[RFC][PATCH] Remove SLOW_BYTE_ACCESS

2017-11-17 Thread Wilco Dijkstra
at way you could pass the size/alignment/volatile and decide per bitfield access. What do people think? ChangeLog: 2017-11-17 Wilco Dijkstra * config/aarch64/aarch64.h: Remove SLOW_BYTE_ACCESS. * config/alpha/alpha.h: Likewise. * config/arc/arc.h: Likewise. *

[RFC][PATCH] Change default to -fcommon

2017-11-17 Thread Wilco Dijkstra
any packages fail to get an idea how feasible it is. We could keep defaulting to -fcommon with -std=c89 if necessary. 2017-11-17 Wilco Dijkstra * common.opt (fcommon): Change init to 1. * doc/invoke.texi (-fcommon): Update documentation. -- diff --git a/gcc/common.opt b/

Re: [RFC][PATCH] Change default to -fcommon

2017-11-20 Thread Wilco Dijkstra
Richard Biener wrote: > A target specific default might be a good idea if we decide to revert. > > Note I proposed this change a few times already, but the fear was always > we'll break too much legacy code. It will definitely break some code, but new warnings with -Werror might too... > Note y

Re: [RFC][PATCH] Change default to -fcommon

2017-11-21 Thread Wilco Dijkstra
Michael Matz wrote: > bss _sections_ != bss-like segments in the executable.  Targets might not > have a bss section that could be named in the asm file, or no way to > switch to it without disrupting surrounding code, but they might have > common symbols, which ultimately might or might not be

[PATCH][AArch64] Fix ICE due to store_pair_lanes

2017-11-27 Thread Wilco Dijkstra
that applies to store_pair_lanes, uses PARALLEL when calling aarch64_classify_address so that it knows it is an STP. Also add the 'z' specifier for future use by load/store pair instructions. Passes regress, OK for commit? ChangeLog: 2017-11-27 Wilco Dijkstra *

Re: [RFA][PATCH] Stack clash protection 07/08 -- V4 (aarch64 bits)

2017-11-27 Thread Wilco Dijkstra
Szabolcs Nagy wrote: >On 28/10/17 05:08, Jeff Law wrote: > >> My hope would be that we simply don't ever use the params.  They were >> done as much for *you* to experiment with as anything.  I'd happy just >> delete them as there's essentially no guard rails to ensure their values >> are sane. > >

[PATCH][AArch64] Fix address printing on ILP32

2017-11-30 Thread Wilco Dijkstra
the ICE in https://gcc.gnu.org/ml/gcc-patches/2017-11/msg02509.html. ChangeLog: 2017-11-30 Wilco Dijkstra gcc/ * config/aarch64/aarch64.md (call_insn): Use %c rather than %a. (call_value_insn): Likewise. (sibcall_insn): Likewise. (sibcall_value_

Re: [PATCH][Middle-end]2nd patch of PR78809 and PR83026

2017-12-15 Thread Wilco Dijkstra
Hi Qing, Just looking at a very high level, I have a few comments: 1. Constant folding str(n)cmp - folding is done separately in fold-const-call.c and gimple-fold.c. There is already code for folding strcmp and strncmp, so we shouldn't need to add new foldings. Or do you have an example t
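
As an illustration of the kind of call that constant folding of str(n)cmp applies to, here is a small sketch. It is not taken from the patch under review; the function names are hypothetical, and whether a given GCC version folds these calls depends on the optimization level.

```c
#include <string.h>

/* Both arguments are string literals, so the result is known at compile
   time and the call can be replaced by a constant.  */
int compare_literals(void)
{
    return strcmp("abc", "abd");
}

/* Only the first 3 characters are compared; they are equal, so the result
   is 0 even though the full strings differ.  */
int compare_prefix(void)
{
    return strncmp("abcdef", "abcxyz", 3);
}
```

The C standard only fixes the sign of the result, so compare_literals returns some negative value, while compare_prefix returns exactly 0.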

[PATCH] Fix PR83491

2017-12-20 Thread Wilco Dijkstra
comments more readable. Bootstrap OK, OK for trunk? ChangeLog: 2017-12-20 Wilco Dijkstra gcc/ PR tree-optimization/83491 * tree-ssa-math-opts.c (execute_cse_reciprocals_1): Check for SSA_NAME before walking uses. Improve coding style and comments. gcc/testsuite

Re: [PATCH] Simplify floating point comparisons

2018-01-04 Thread Wilco Dijkstra
ping (note also Jeff's reply https://gcc.gnu.org/ml/gcc-patches/2017-11/msg01916.html) From: Wilco Dijkstra Sent: 15 November 2017 15:36 To: Richard Biener Cc: GCC Patches; nd Subject: Re: [PATCH] Simplify floating point comparisons   Richard Biener wrote: > On Tue, Oct 17, 2017 at

[PATCH][AArch64] Improve register allocation of fma

2018-01-04 Thread Wilco Dijkstra
? ChangeLog: 2018-01-04 Wilco Dijkstra gcc/ * config/aarch64/aarch64.md (fma4): Change into expand pattern. (fnma4): Likewise. (fms4): Likewise. (fnms4): Likewise. (aarch64_fma4): Rename insn, reorder accumulator operand. (aarch64_fnma4): Likewise
