[AArch64][1/14] ARMv8.2-A FP16 data processing intrinsics

2016-07-07 Thread Jiong Wang
Several data-processing instructions are agnostic to the type of their operands. This patch adds the mapping between them and those bit- and lane-manipulation instructions. No ARMv8.2-A FP16 extension hardware support is required for these intrinsics. gcc/ 2016-07-07 Jiong Wang

[AArch64][6/14] ARMv8.2-A FP16 reduction vector intrinsics

2016-07-07 Thread Jiong Wang
This patch adds ARMv8.2-A FP16 reduction vector intrinsics. gcc/ 2016-07-07 Jiong Wang * config/aarch64/arm_neon.h (vmaxv_f16): New. (vmaxvq_f16): Likewise. (vminv_f16): Likewise. (vminvq_f16): Likewise. (vmaxnmv_f16): Likewise. (vmaxnmvq_f16

[AArch64][8/14] ARMv8.2-A FP16 two operands scalar intrinsics

2016-07-07 Thread Jiong Wang
This patch adds ARMv8.2-A FP16 two operands scalar intrinsics. 2016-07-07 Jiong Wang gcc/ * config/aarch64/aarch64-simd-builtins.def: Register new builtins. * config/aarch64/aarch64.md (hf3): New. (hf3): Likewise. (add3): Likewise. (sub3): Likewise

[AArch64][9/14] ARMv8.2-A FP16 three operands scalar intrinsics

2016-07-07 Thread Jiong Wang
This patch adds ARMv8.2-A FP16 three operands scalar intrinsics. gcc/ 2016-07-07 Jiong Wang * config/aarch64/aarch64-simd-builtins.def: Register new builtins. * config/aarch64/aarch64.md (fma): New for HF. (fnma): Likewise. * config/aarch64/arm_fp16.h (vfmah_f16

[AArch64][10/14] ARMv8.2-A FP16 lane scalar intrinsics

2016-07-07 Thread Jiong Wang
This patch adds ARMv8.2-A FP16 lane scalar intrinsics. gcc/ 2016-07-07 Jiong Wang * config/aarch64/arm_neon.h (vfmah_lane_f16): New. (vfmah_laneq_f16): Likewise. (vfmsh_lane_f16): Likewise. (vfmsh_laneq_f16): Likewise. (vmulh_lane_f16): Likewise

[AArch64][11/14] ARMv8.2-A FP16 testsuite selector

2016-07-07 Thread Jiong Wang
directives arm_v8_2a_fp16_scalar_ok, arm_v8_2a_fp16_scalar_hw, arm_v8_2a_fp16_neon_ok and arm_v8_2a_fp16_neon_hw to check for target and hardware support of FP16 instructions on AArch64. gcc/testsuite/ 2016-07-07 Matthew Wahab Jiong Wang * target-supports.exp

[AArch64][12/14] ARMv8.2-A testsuite for new data movement intrinsics

2016-07-07 Thread Jiong Wang
This patch contains testcases for those new scalar intrinsics which are only available for AArch64. gcc/testsuite/ 2016-07-07 Jiong Wang * gcc.target/aarch64/advsimd-intrinsics/arm-neon-ref.h (FP16_SUPPORTED): Enable AArch64. * gcc.target/aarch64/advsimd-intrinsics

[AArch64][13/14] ARMv8.2-A testsuite for new vector intrinsics

2016-07-07 Thread Jiong Wang
This patch contains testcases for those new vector intrinsics which are only available for AArch64. gcc/testsuite/ 2016-07-07 Jiong Wang * gcc.target/aarch64/advsimd-intrinsics/vdiv_f16_1.c: New. * gcc.target/aarch64/advsimd-intrinsics/vfmas_lane_f16_1.c: New

[AArch64][14/14] ARMv8.2-A testsuite for new scalar intrinsics

2016-07-07 Thread Jiong Wang
This patch contains testcases for those new scalar intrinsics which are only available for AArch64. gcc/testsuite/ 2016-07-07 Jiong Wang * gcc.target/aarch64/advsimd-intrinsics/unary_scalar_op.inc: Support FMT64. * gcc.target/aarch64/advsimd-intrinsics/vabdh_f16_1.c: New

[AArch64][3/14] ARMv8.2-A FP16 two operands vector intrinsics

2016-07-07 Thread Jiong Wang
This patch adds ARMv8.2-A FP16 two operands vector intrinsics. gcc/ 2016-07-07 Jiong Wang * config/aarch64/aarch64-simd-builtins.def: Register new builtins. * config/aarch64/aarch64-simd.md (aarch64_rsqrts): Extend to HF modes. (fabd3): Likewise. (3): Likewise

[AArch64][2/14] ARMv8.2-A FP16 one operand vector intrinsics

2016-07-07 Thread Jiong Wang
VDQF are with new FP16 support, thus we introduced new, temporary iterators, and only apply the new iterators to those patterns which do have FP16 support. gcc/ 2016-07-07 Jiong Wang * config/aarch64/aarch64-builtins.c (TYPES_BINOP_USS): New. * config/aarch64/aarch64-simd-builtin

[AArch64][5/14] ARMv8.2-A FP16 lane vector intrinsics

2016-07-07 Thread Jiong Wang
intrinsics with vdup intrinsics 2016-07-07 Jiong Wang gcc/ * config/aarch64/aarch64-simd.md (*aarch64_mulx_elt_to_64v2df): Rename to "*aarch64_mulx_elt_from_dup". (*aarch64_mul3_elt): Update schedule type. (*aarch64_mul3_elt_from_dup)

[AArch64][7/14] ARMv8.2-A FP16 one operand scalar intrinsics

2016-07-07 Thread Jiong Wang
This patch adds ARMv8.2-A FP16 one operand scalar intrinsics. Scalar intrinsics are kept in arm_fp16.h instead of arm_neon.h. gcc/ 2016-07-07 Jiong Wang * config.gcc (aarch64*-*-*): Install arm_fp16.h. * config/aarch64/aarch64-builtins.c (hi_UP): New. * config/aarch64

[AArch64][4/14] ARMv8.2-A FP16 three operands vector intrinsics

2016-07-07 Thread Jiong Wang
This patch adds ARMv8.2-A FP16 three operands vector intrinsics. Three operands intrinsics only contain fma and fms. 2016-07-07 Jiong Wang gcc/ * config/aarch64/aarch64-simd-builtins.def: Register new builtins. * config/aarch64/aarch64-simd.md (fma4): Extend to HF modes

[COMMITTED][AArch64] Fix simd intrinsics bug on float vminnm/vmaxnm

2016-07-08 Thread Jiong Wang
On 07/07/16 10:34, James Greenhalgh wrote: To make backporting easier, could you please write a very simple standalone test that exposes this bug, and submit this patch with just that simple test? I've already OKed the functional part of this patch, and I'm happy to pre-approve a simple testcase

[AArch64][1/3] Migrate aarch64_add_constant to new interface & kill aarch64_build_constant

2016-07-20 Thread Jiong Wang
ll the old aarch64_build_constant. OK for trunk? gcc/ 2016-07-20 Jiong Wang * config/aarch64/aarch64.c (aarch64_add_constant): New parameter "mode". Use aarch64_internal_mov_immediate instead of aarch64_build_constant. (aarch64_build_cons

[AArch64][2/3] Optimize aarch64_add_constant to generate better addition sequences

2016-07-20 Thread Jiong Wang
't do this if it fits into a single move instruction, in which case move the immediate to a scratch register first, then generate one addition to add the scratch register to the destination register. * Otherwise invoke general constant build function. OK for trunk? gcc/ 20

[AArch64][3/3] Migrate aarch64_expand_prologue/epilogue to aarch64_add_constant

2016-07-20 Thread Jiong Wang
ch as aarch64_add_constant has better utilization of scratch register. OK for trunk? gcc/ 2016-07-20 Jiong Wang * config/aarch64/aarch64.c (aarch64_add_constant): Mark instruction as frame related when it is. Generate CFA annotation when it's

Re: [AArch64][3/3] Migrate aarch64_expand_prologue/epilogue to aarch64_add_constant

2016-07-20 Thread Jiong Wang
On 20/07/16 15:18, Richard Earnshaw (lists) wrote: On 20/07/16 14:03, Jiong Wang wrote: Those stack adjustment sequences inside aarch64_expand_prologue/epilogue are doing exactly what aarch64_add_constant offers, except they also need to be aware of dwarf generation. This patch

Re: [AArch64][2/14] ARMv8.2-A FP16 one operand vector intrinsics

2016-07-20 Thread Jiong Wang
On 07/07/16 17:14, Jiong Wang wrote: This patch adds ARMv8.2-A FP16 one operand vector intrinsics. We introduced new mode iterators to cover HF modes; qualified patterns which were using the old mode iterators are switched to the new ones. We can't simply extend old iterator like VDQF to cover HF

Re: [AArch64][3/14] ARMv8.2-A FP16 two operands vector intrinsics

2016-07-20 Thread Jiong Wang
On 07/07/16 17:15, Jiong Wang wrote: This patch adds ARMv8.2-A FP16 two operands vector intrinsics. The updated patch resolves the conflict with https://gcc.gnu.org/ml/gcc-patches/2016-06/msg00309.html The change is to let aarch64_emit_approx_div return false for V4HFmode and V8HFmode. gcc

Re: [AArch64][7/14] ARMv8.2-A FP16 one operand scalar intrinsics

2016-07-20 Thread Jiong Wang
On 07/07/16 17:17, Jiong Wang wrote: This patch adds ARMv8.2-A FP16 one operand scalar intrinsics. Scalar intrinsics are kept in arm_fp16.h instead of arm_neon.h. The updated patch resolves the conflict with https://gcc.gnu.org/ml/gcc-patches/2016-06/msg00308.html The change is to let

Re: [AArch64][8/14] ARMv8.2-A FP16 two operands scalar intrinsics

2016-07-20 Thread Jiong Wang
On 07/07/16 17:17, Jiong Wang wrote: This patch adds ARMv8.2-A FP16 two operands scalar intrinsics. The updated patch resolves the conflict with https://gcc.gnu.org/ml/gcc-patches/2016-06/msg00309.html The change is to let aarch64_emit_approx_div return false for HFmode. gcc/ 2016-07-20

Re: [AArch64][3/3] Migrate aarch64_expand_prologue/epilogue to aarch64_add_constant

2016-07-25 Thread Jiong Wang
On 21/07/16 11:08, Richard Earnshaw (lists) wrote: On 20/07/16 16:02, Jiong Wang wrote: Richard, Thanks for the review, yes, I believe using aarch64_add_constant is unconditionally safe here. Because we have generated a stack tie to clobber the whole memory thus prevent any instruction

Re: [5.0 Backport][AArch64] Fix simd intrinsics bug on float vminnm/vmaxnm

2016-07-29 Thread Jiong Wang
Jiong Wang writes: > On 07/07/16 10:34, James Greenhalgh wrote: >> >> To make backporting easier, could you please write a very simple >> standalone test that exposes this bug, and submit this patch with just >> that simple test? I've already OKed the functiona

Re: [Patch] Don't expand targetm.stack_protect_fail if it's NULL_TREE

2016-11-24 Thread Jiong Wang
gcc/ 2016-11-11 Jiong Wang * function.c (expand_function_end): Guard stack_protect_epilogue with ENABLE_DEFAULT_SSP_RUNTIME. * cfgexpand.c (pass_expand::execute): Likewise guard for stack_protect_prologue. * defaults.h (ENABLE_DEFAULT_SSP_RUNTIME): New

Re: [1/9][RFC][DWARF] Reserve three DW_OP numbers in vendor extension space

2016-11-30 Thread Jiong Wang
On 16/11/16 14:02, Jakub Jelinek wrote: On Wed, Nov 16, 2016 at 02:54:56PM +0100, Mark Wielaard wrote: On Wed, 2016-11-16 at 10:00 +, Jiong Wang wrote: The two operations DW_OP_AARCH64_paciasp and DW_OP_AARCH64_paciasp_deref were designed as shortcut operations when LR is signed with A

Re: [1/9][RFC][DWARF] Reserve three DW_OP numbers in vendor extension space

2016-12-01 Thread Jiong Wang
:54:56PM +0100, Mark Wielaard wrote: On Wed, 2016-11-16 at 10:00 +, Jiong Wang wrote: The two operations DW_OP_AARCH64_paciasp and DW_OP_AARCH64_paciasp_deref were designed as shortcut operations when LR is signed with A key and using function's CFA as salt. This is the default beha

Re: [Ping~][1/9][RFC][DWARF] Reserve three DW_OP numbers in vendor extension space

2016-12-12 Thread Jiong Wang
Jiong Wang writes: > On 16/11/16 14:02, Jakub Jelinek wrote: >> On Wed, Nov 16, 2016 at 02:54:56PM +0100, Mark Wielaard wrote: >>> On Wed, 2016-11-16 at 10:00 +, Jiong Wang wrote: >>>> The two operations DW_OP_AARCH64_paciasp and DW_OP_AARCH64_paciasp_der

[Ping^2][1/9][RFC][DWARF] Reserve three DW_OP numbers in vendor extension space

2016-12-19 Thread Jiong Wang
Jiong Wang writes: > Jiong Wang writes: > >> On 16/11/16 14:02, Jakub Jelinek wrote: >>> On Wed, Nov 16, 2016 at 02:54:56PM +0100, Mark Wielaard wrote: >>>> On Wed, 2016-11-16 at 10:00 +, Jiong Wang wrote: >>>>>

[AArch64][0/4] Improve variable argument (vaarg) support

2016-05-06 Thread Jiong Wang
issues. AArch64 bootstrap OK, no regression, new testcases passed. --- Jiong Wang (4) Enable tree-stdarg pass for AArch64 by defining counter fields PR63596, honor tree-stdarg analysis result to improve VAARG codegen Don't generate redundant checks when there is no composite arg Simplif

[AArch64][1/4] Enable tree-stdarg pass for AArch64 by defining counter fields

2016-05-06 Thread Jiong Wang
ok for upstream? 2016-05-06 Jiong Wang gcc/ * config/aarch64/aarch64.c (aarch64_build_builtin_va_list): Initialize va_list_gpr_counter_field and va_list_fpr_counter_field. gcc/testsuite/ * gcc.dg/tree-ssa/stdarg-2.c: Enable all testcases for AArch64. * gcc.dg/tree-ssa/stdarg-3.c: Likew

[AArch64][2/4] PR63596, honor tree-stdarg analysis result to improve VAARG codegen

2016-05-06 Thread Jiong Wang
s optimized into: f: ret OK for trunk? 2016-05-06 Jiong Wang gcc/ PR63596 * config/aarch64/aarch64.c (aarch64_expand_builtin_va_start): Honor tree-stdarg analysis results. (aarch64_setup_incoming_varargs): Likewise. gcc/testsuite/ PR63596 * gcc.target/aarch64/va_arg_1.c: New test

[AArch64][3/4] Don't generate redundant checks when there is no composite arg

2016-05-06 Thread Jiong Wang
from incoming_stack. And this simplified version actually is the most usual case. For example, this patch reduces the instruction count from about 130 to 100 for the included testcase. ok for trunk? 2016-05-06 Jiong Wang gcc/ * config/aarch64/aarch64.c (aarch64_gimplify_va_arg_expr):

[AArch64][4/4] Simplify cfg during vaarg gimplification

2016-05-06 Thread Jiong Wang
rg_offset + arg_size > 0)) fetch from stack else fetch from register OK for trunk? 2016-05-06 Alan Lawrence Jiong Wang gcc/ * config/aarch64/aarch64.c (aarch64_gimplify_va_arg_expr): Use TRUTH_ORIF_EXPR. gcc/testsuite/ * gcc.target/aarch64/va_arg_5.c: New
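The snippet above shows only the tail of the merged condition. As a minimal sketch in plain C (not GCC internals), the following models how two separate checks that lead to the same "fetch from stack" path can be folded into one short-circuit OR, which is the shape a TRUTH_ORIF_EXPR expresses. The function names are hypothetical, and the first operand (`arg_offset >= 0`) is an assumption, because the snippet truncates it.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model, not GCC code. Only the second operand
   (arg_offset + arg_size > 0) is visible in the snippet; the first
   operand is an assumption made for illustration. */

/* Two separate checks, each ending in the same "fetch from stack" path. */
bool fetch_from_stack_nested(int arg_offset, int arg_size)
{
    if (arg_offset >= 0)
        return true;
    if (arg_offset + arg_size > 0)
        return true;
    return false;
}

/* One short-circuit OR: the second operand is evaluated only when the
   first one is false, so both versions take the same decisions. */
bool fetch_from_stack_merged(int arg_offset, int arg_size)
{
    return arg_offset >= 0 || arg_offset + arg_size > 0;
}

/* Exhaustively compare both versions over a small range of inputs. */
void check_versions_agree(void)
{
    for (int off = -64; off <= 64; off++)
        for (int size = 0; size <= 64; size++)
            assert(fetch_from_stack_nested(off, size)
                   == fetch_from_stack_merged(off, size));
}
```

Both versions return the same answer for every input; the merged form simply has one branch target for the stack path instead of two.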

Re: [PATCH 3/3] shrink-wrap: Remove complicated simple_return manipulations

2016-05-11 Thread Jiong Wang
On 09/05/16 16:08, Segher Boessenkool wrote: Hi Christophe, On Mon, May 09, 2016 at 03:54:26PM +0200, Christophe Lyon wrote: After this patch, I've noticed that gcc.target/arm/pr43920-2.c now fails at: /* { dg-final { scan-assembler-times "pop" 2 } } */ Before the patch, the generated code w

Re: [Patch ARM/AArch64 09/11] Add missing vrnd{,a,m,n,p,x} tests.

2016-05-12 Thread Jiong Wang
On 11/05/16 14:23, Christophe Lyon wrote: 2016-05-02 Christophe Lyon * gcc/testsuite/gcc.target/aarch64/advsimd-intrinsics/vrnd.c: New. * gcc/testsuite/gcc.target/aarch64/advsimd-intrinsics/vrndX.inc: New. * gcc/testsuite/gcc.target/aarch64/advsimd-intrinsics/vrnda.c

[Patch, lra] relax the restriction on subreg reload for wide mode

2016-05-12 Thread Jiong Wang
ps://gcc.gnu.org/bugzilla/show_bug.cgi?id=70904) 2016-05-12 Jiong Wang gcc/ PR target/70904 * lra-constraint.c (process_addr_reg): Relax the restriction on subreg reload for wide mode. Index: gcc/lra-constraints.c ===

Re: [Patch ARM/AArch64 09/11] Add missing vrnd{,a,m,n,p,x} tests.

2016-05-13 Thread Jiong Wang
On 12/05/16 13:56, Christophe Lyon wrote: On 12 May 2016 at 10:45, Jiong Wang wrote: On 11/05/16 14:23, Christophe Lyon wrote: 2016-05-02 Christophe Lyon * gcc/testsuite/gcc.target/aarch64/advsimd-intrinsics/vrnd.c: New. * gcc/testsuite/gcc.target/aarch64/advsimd

[Patch, testsuite] PR70227, skip g++.dg/lto/pr69589_0.C on targets without -rdynamic support

2016-05-13 Thread Jiong Wang
dg-xfail-if because the latter is not supported inside lto.exp. OK for trunk? 2016-05-13 Jiong Wang gcc/testsuite/ PR testsuite/70227 * g++.dg/lto/pr69589_0.C: Skip arm and aarch64 bare-metal targets. diff --git a/gcc/testsuite/g++.dg/lto/pr69589_0.C b/gcc/testsuite/g++.dg/lto/pr69589_0.

[Patch, ARM] PR71061, length pop* pattern in epilogue correctly

2016-05-13 Thread Jiong Wang
. For the fix, I think we should extend arm_attr_length_push_multi to pop* pattern. OK for trunk? 2016-05-13 Jiong Wang gcc/ PR target/71061 * config/arm/arm-protos.h (arm_attr_length_push_multi): Rename to "arm_attr_length_pp_multi". Add one parameter "first_index"

[AArch64, 1/4] Add the missing support of vfms_n_f32, vfmsq_n_f32, vfmsq_n_f64

2016-05-16 Thread Jiong Wang
__c) { return __builtin_aarch64_fmav2sf (__b, (float32x2_t) {__c, __c}, __a); } before (-O2) === vfma_n_f32: dup v2.2s, v2.s[0] fmla v0.2s, v1.2s, v2.2s ret after === vfma_n_f32: fmla v0.2s, v1.2s, v2.s[0] ret OK for trunk? 2016-05-
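The semantics of the `_n` form in the snippet above can be modelled portably. This is a hedged sketch using GCC/Clang vector extensions rather than the real `arm_neon.h` types; `v2sf_model` and `fma_n_model` are hypothetical names introduced here. It shows the scalar being duplicated into both lanes before a per-lane multiply-add, which is what the `dup` + `fmla` sequence does and what the by-element `fmla` does in one instruction.

```c
#include <assert.h>

/* Portable model using GCC/Clang vector extensions. v2sf_model and
   fma_n_model are hypothetical names, not the arm_neon.h definitions. */
typedef float v2sf_model __attribute__((vector_size(8)));

/* Computes a + b * c per lane, with the scalar c duplicated into both
   lanes, mirroring the (float32x2_t) {__c, __c} operand in the snippet. */
v2sf_model fma_n_model(v2sf_model a, v2sf_model b, float c)
{
    v2sf_model dup = {c, c};
    return a + b * dup;
}
```

On AArch64 the patch lets the compiler fold the duplication into the multiply-accumulate itself; the model only captures the arithmetic, not the instruction selection.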

[AArch64, 2/4] Extend vector mutiply by element to all supported modes

2016-05-16 Thread Jiong Wang
mul v1.2s, v1.2s, v2.2s + ldr s1, [x3, 160] + mul v1.2s, v0.2s, v1.s[0] OK for trunk? 2016-05-16 Jiong Wang gcc/ * config/aarch64/aarch64-simd.md (*aarch64_mul3_elt_to_128df): Extend to all supported modes. Rename to "*aarch64_mul3_elt_from_dup"

[AArch64, 3/4] Reimplement multiply by element to get rid of inline assembly

2016-05-16 Thread Jiong Wang
This patch reimplements vector multiply by element on top of the existing vmul_lane* intrinsics instead of inline assembly. There is no code generation change from this patch. OK for trunk? 2016-05-16 Jiong Wang gcc/ * config/aarch64/aarch64-simd.md (vmul_n_f32): Remove inline assembly

[AArch64, 4/4] Reimplement vmvn* intrinscis, remove inline assembly

2016-05-16 Thread Jiong Wang
on on the existing advsimd-intrinsics/vmvn.c. 2016-05-16 Jiong Wang gcc/ * config/aarch64/arm_neon.h (vmvn_s8): Reimplement using C operator. Remove inline assembly. (vmvn_s16): Likewise. (vmvn_s32): Likewise. (vmvn_u8): Likewise. (vmvn_u16): Likewise. (vmvn_u32): Likewise. (vmvn
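"Reimplement using C operator" can be illustrated without AArch64 hardware. The sketch below is an assumption-laden model, not the `arm_neon.h` code: it uses GCC/Clang vector extensions, and `v8qi_model` and `mvn_model` are hypothetical names. The point is that a bitwise NOT on a vector type is ordinary C that the compiler can see through, unlike an opaque inline-assembly block.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for int8x8_t, built with the generic vector
   extension so that it compiles on any GCC/Clang target. */
typedef int8_t v8qi_model __attribute__((vector_size(8)));

/* Bitwise NOT of every lane, written with the plain C '~' operator. */
v8qi_model mvn_model(v8qi_model a)
{
    return ~a;
}
```

Because the operation is expressed in C, the optimizers can fold, combine, or schedule it like any other expression.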

[Patch] PR rtl-optimization/71150, guard in_class_p check with REG_P

2016-05-17 Thread Jiong Wang
le testcase for x86 can reproduce this bug. long foo (long a) { return (unsigned) foo; } OK for trunk? x86-64 bootstrap OK and no regression on check-gcc/g++. 2016-05-17 Jiong Wang gcc/ PR rtl-optimization/71150 * lra-constraint (process_addr_reg): Guard "in_class_p" with R

Re: [Patch] PR rtl-optimization/71150, guard in_class_p check with REG_P

2016-05-17 Thread Jiong Wang
On 17/05/16 11:23, Uros Bizjak wrote: On Tue, May 17, 2016 at 12:17 PM, Uros Bizjak wrote: Hello! This bug is introduced by my commit r236181 where the inner rtx of SUBREG hasn't been checked while it should as "in_class_p" only works with REG, and SUBREG_REG is actually not always REG. If

Re: [AArch64, 1/4] Add the missing support of vfms_n_f32, vfmsq_n_f32, vfmsq_n_f64

2016-05-18 Thread Jiong Wang
te/ 2016-05-18 Jiong Wang * gcc.target/aarch64/advsimd-intrinsics/arm-neon-ref.h: Guard float64_t with __aarch64__. * gcc.target/aarch64/advsimd-intrinsics/vfms_vfma_n.c: Guard variable declaration under __aarch64__ and __ARM_FEATURE_FMA. diff --git a/gc

Re: [AArch64, 2/4] Extend vector mutiply by element to all supported modes

2016-05-18 Thread Jiong Wang
On 18/05/16 09:17, Christophe Lyon wrote: On 17 May 2016 at 14:27, James Greenhalgh wrote: On Mon, May 16, 2016 at 10:09:31AM +0100, Jiong Wang wrote: AArch64 supports vector multiply by element for V2DF, V2SF, V4SF, V2SI, V4SI, V4HI, V8HI. All above are well supported by "*aarch64_mul

Re: [Patch, ARM] PR71061, length pop* pattern in epilogue correctly

2016-05-19 Thread Jiong Wang
On 13/05/16 14:54, Jiong Wang wrote: For thumb mode, this is causing wrong size calculation and may affect some rtl passes, for example bb-order where copy_bb_p needs accurate insn length info. This was eventually part of the reason for https://gcc.gnu.org/ml/gcc-patches/2016-05/msg00639.html

Re: [PATCH 9/17][ARM] Add NEON FP16 arithmetic instructions.

2016-05-19 Thread Jiong Wang
On 18/05/16 01:58, Joseph Myers wrote: On Tue, 17 May 2016, Matthew Wahab wrote: As with the VFP FP16 arithmetic instructions, operations on __fp16 values are done by conversion to single-precision. Any new optimization supported by the instruction descriptions can only apply to code generate

[AArch64, 0/6] Remove inline assembly in arm_neon.h

2016-05-24 Thread Jiong Wang
so no regression on big-endian bare-metal tests. --- Jiong Wang (6) Reimplement scalar fixed-point intrinsics Reimplement vector fixed-point intrinsics Reimplement frsqrte intrinsics Reimplement frsqrts intrinsics Reimplement fabd intrinsics & merge rtl patterns Reimplement vpadd i

[AArch64, 1/6] Reimplement scalar fixed-point intrinsics

2016-05-24 Thread Jiong Wang
Jiong Wang * config/aarch64/aarch64-builtins.c (TYPES_BINOP_USS): New (TYPES_BINOP_SUS): Likewise. (aarch64_simd_builtin_data): Update include file name. (aarch64_builtins): Likewise. * config/aarch64/aarch64-simd-builtins.def: Rename to aarch64

[AArch64, 3/6] Reimplement frsqrte intrinsics

2016-05-24 Thread Jiong Wang
These intrinsics were implemented before the instruction pattern "aarch64_rsqrte" was added, so they were implemented through inline assembly. This patch migrates the implementation to builtins. gcc/ 2016-05-23 Jiong Wang * config/aarch64/aarch64-builtins.def (rs

[AArch64, 4/6] Reimplement frsqrts intrinsics

2016-05-24 Thread Jiong Wang
Similar to [3/6], these intrinsics were implemented before the instruction pattern "aarch64_rsqrts" was added, so they were implemented through inline assembly. This patch migrates the implementation to builtins. gcc/ 2016-05-23 Jiong Wang * config/aarch64/aarch64-bu

[AArch64, 2/6] Reimplement vector fixed-point intrinsics

2016-05-24 Thread Jiong Wang
Based on top of [1/6], this patch reimplements vector intrinsics for conversion between floating-point and fixed-point. gcc/ 2016-05-23 Jiong Wang * config/aarch64/aarch64-builtins.def (scvtf): New builtins for vector types. (ucvtf): Likewise. (fcvtzs): Likewise

[AArch64, 5/6] Reimplement fabd intrinsics & merge rtl patterns

2016-05-24 Thread Jiong Wang
These intrinsics were implemented before "fabd_3" was introduced. Meanwhile the patterns "fabd_3" and "*fabd_scalar3" can be merged into a single "fabd3" using VALLF. This patch migrates the implementation to builtins backed by this pattern. gcc/ 2016-05-2

[AArch64, 6/6] Reimplement vpadd intrinsics & extend rtl patterns to all modes

2016-05-24 Thread Jiong Wang
These intrinsics were implemented by inline assembly using the "faddp" instruction. There was a pattern "aarch64_addpv4sf" which supports V4SF mode only, while we can extend this pattern to support VDQF mode, then we can reimplement these intrinsics through builtins. gcc/ 20

Re: [COMMITTED][AArch64][sibcall]Tighten direct call pattern to repair -fno-plt

2015-08-07 Thread Jiong Wang
James Greenhalgh writes: > On Thu, Aug 06, 2015 at 05:16:33PM +0100, Jiong Wang wrote: > > Hi Jiong, > > The new testcases introduced in this and the related patch are failing > for me on aarch64-none-elf: > > aarch64-none-elf > > NA->FAIL: gcc.target

Re: [COMMITTED][AArch64] Improve TLS Descriptor pattern to release RTL loop IV opt

2015-08-10 Thread Jiong Wang
Andreas Schwab writes: > Jiong Wang writes: > >> Index: gcc/ChangeLog >> === >> --- gcc/ChangeLog (revision 226682) >> +++ gcc/ChangeLog (working copy) >> @@ -1,3 +1,16 @@ >

Re: [COMMITTED][AArch64] Improve TLS Descriptor pattern to release RTL loop IV opt

2015-08-10 Thread Jiong Wang
Jiong Wang writes: > Andreas Schwab writes: > >> Jiong Wang writes: >> >>> Index: gcc/ChangeLog >>> === >>> --- gcc/ChangeLog (revision 226682) >>> +++ gcc/ChangeLog (workin

Re: [COMMITTED][AArch64] Improve TLS Descriptor pattern to release RTL loop IV opt

2015-08-10 Thread Jiong Wang
Andreas Schwab writes: > Jiong Wang writes: > >> And I just finished two round of native aarch64 build/check w/wo my patch. > > Did you rebuild everything? No. Just applied the patch, then "make all" and re-check > > Andreas. -- Regards, Jiong

Re: [COMMITTED][AArch64] Improve TLS Descriptor pattern to release RTL loop IV opt

2015-08-10 Thread Jiong Wang
Andreas Schwab writes: > Jiong Wang writes: > >> Andreas Schwab writes: >> >>> Jiong Wang writes: >>> >>>> And I just finished two round of native aarch64 build/check w/wo my patch. >>> >>> Did you rebuild everything? >

Re: [COMMITTED][AArch64] Improve TLS Descriptor pattern to release RTL loop IV opt

2015-08-11 Thread Jiong Wang
Andreas Schwab writes: > Jiong Wang writes: > >> I Just finished several round of rebuild & testing on clean >> environment. > > How did you even manage to compile it? > > ../../gcc/ira.c: In function 'void print_translated_classes(FILE*, bool)':

Re: [COMMITTED][AArch64] Improve TLS Descriptor pattern to release RTL loop IV opt

2015-08-11 Thread Jiong Wang
Andreas Schwab writes: > Jiong Wang writes: > >> I Just finished several round of rebuild & testing on clean >> environment. > > How did you even manage to compile it? > > ../../gcc/ira.c: In function 'void print_translated_classes(FILE*, bool)':

[COMMITTED][AArch64] Add the missing "," for enumeration element

2015-08-11 Thread Jiong Wang
Jiong Wang writes: > Andreas Schwab writes: > >> Jiong Wang writes: >> >>> I Just finished several round of rebuild & testing on clean >>> environment. >> >> How did you even manage to compile it? >> >> ../../gcc/ira.c: In fun

Re: [Patch/rtl-expand] Take tree range info into account to improve LSHIFT_EXP expanding

2015-08-14 Thread Jiong Wang
Jeff Law writes: > On 04/29/2015 03:36 PM, Jiong Wang wrote: >> >> Jeff Law writes: >> >>> On 04/27/2015 02:21 PM, Jiong Wang wrote: >>> >>>> Jeff, >>>> >>>> Sorry, I can't understand the meaning of "ove

Re: [Patch/rtl-expand] Take tree range info into account to improve LSHIFT_EXP expanding

2015-08-14 Thread Jiong Wang
Jeff Law writes: > On 08/14/2015 11:40 AM, Jiong Wang wrote: >> >>* Figuring out whether the shift source is coming from sign extension >> by checking SSA_NAME_DEF_STMT instead of deducing from tree range >> info. I feel checking the gimple sta

Re: [Patch/rtl-expand] Take tree range info into account to improve LSHIFT_EXP expanding

2015-08-18 Thread Jiong Wang
Jiong Wang writes: > Jeff Law writes: > >> On 08/14/2015 11:40 AM, Jiong Wang wrote: >>> >>>* Figuring out whether the shift source is coming from sign extension >>> by checking SSA_NAME_DEF_STMT instead of deducing from tree range >>>

[COMMITTED][AArch64] Cleanup whitespace in aarch64.c

2015-08-19 Thread Jiong Wang
This whitespace was introduced by my commit r225017. It should be replaced with tabs according to the GNU coding style. Committed as obvious (r227005), after cross build aarch64-elf OK. 2015-08-19 Jiong Wang gcc/ * config/aarch64/aarch64.c (aarch64_load_symref_appropriately

[AArch64][TLSLE][1/3] Add the option "-mtls-size" for AArch64

2015-08-19 Thread Jiong Wang
Marcus Shawcroft writes: > On 21 May 2015 at 17:44, Jiong Wang wrote: >> >> This patch adds the -mtls-size option for AArch64. This option lets the user exercise >> finer control over code generation for the various TLS models on AArch64. >> >> For example, for TLS LE, user

[AArch64][TLSLE][2/3] Add the option "-mtls-size" for AArch64

2015-08-19 Thread Jiong Wang
support for other symbol types. OK for trunk? 2015-08-19 Jiong Wang gcc/ * config/aarch64/aarch64-protos.h (aarch64_symbol_type): Rename SYMBOL_TLSLE to SYMBOL_TLSLE24. * config/aarch64/aarch64.c (aarch64_load_symref_appropriately): Likewise (aarch64_expand_mov_immediate): Likewise

[AArch64][TLSLE][3/3] Implement local executable mode for all memory model

2015-08-19 Thread Jiong Wang
Marcus Shawcroft writes: > On 21 May 2015 at 17:49, Jiong Wang wrote: > >> 2015-05-14 Jiong Wang >> gcc/ >> * config/aarch64/aarch64.c (aarch64_print_operand): Support tls_size. >> * config/aarch64/aarch64.md (tlsle): Choose proper instruction >

[AArch64][TLSLE][2/3] Rename SYMBOL_TLSLE to SYMBOL_TLSLE24

2015-08-19 Thread Jiong Wang
Jiong Wang writes: > As we have added -mtls-size support, there should be four types TLSLE > symbols: > > SYMBOL_TLSLE12 > SYMBOL_TLSLE24 > SYMBOL_TLSLE32 > SYMBOL_TLSLE48 > > which reflect the maximum address bits needed to address this symbol. > >
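The quoted message names four symbol types by the maximum number of address bits each needs. A minimal sketch of that idea follows, under stated assumptions: the enum and function names are hypothetical, and the rule mapping an offset width to a type is an illustration of "maximum address bits", not the selection logic GCC actually uses.

```c
#include <assert.h>

/* Hypothetical model of the four TLS local-exec symbol types listed in
   the message; the names are not GCC's enumerators. */
enum tlsle_kind_model {
    TLSLE12_MODEL,
    TLSLE24_MODEL,
    TLSLE32_MODEL,
    TLSLE48_MODEL
};

/* Pick the smallest type whose address width covers the requested number
   of offset bits. This mapping is an assumption made for illustration. */
enum tlsle_kind_model classify_tlsle_model(unsigned offset_bits)
{
    if (offset_bits <= 12)
        return TLSLE12_MODEL;
    if (offset_bits <= 24)
        return TLSLE24_MODEL;
    if (offset_bits <= 32)
        return TLSLE32_MODEL;
    return TLSLE48_MODEL;
}
```

A narrower type generally allows a shorter addressing sequence, which is why distinguishing the four cases is useful.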

Re: [Patch/rtl-expand] Take tree range info into account to improve LSHIFT_EXP expanding

2015-08-19 Thread Jiong Wang
-- Regards, Jiong Index: gcc/ChangeLog === --- gcc/ChangeLog (revision 227017) +++ gcc/ChangeLog (working copy) @@ -1,3 +1,8 @@ +2015-08-19 Jiong Wang + + * expr.c (expand_expr_r

Re: [PATCH] Fix PRs 66502 and 67167

2015-08-21 Thread Jiong Wang
Richard Biener writes: > Given there is now PR67167 I am going forward with the earlier posted > patch to switch SCCVN to PHI elimination in favor of another PHI > (to remove IVs) rather than in favor of its only executable edge value. > > I still see no way to capture both cases without detectin

Re: [PATCH] Fix PRs 66502 and 67167

2015-08-21 Thread Jiong Wang
Richard Biener writes: > I see the following ICE: > > t.c:13:1: internal compiler error: in decompose_normal_address, at > rtlanal.c:6090 > } > ^ > 0xc94a37 decompose_normal_address > /space/rguenther/tramp3d/trunk/gcc/rtlanal.c:6090 > 0xc94d25 decompose_address(address_info*, rtx_def*

Re: [AArch64][TLSLE][1/3] Add the option "-mtls-size" for AArch64

2015-08-25 Thread Jiong Wang
Marcus Shawcroft writes: > On 19 August 2015 at 15:26, Jiong Wang wrote: > >> 2015-08-19 Jiong Wang >> >> gcc/ >> * config/aarch64/aarch64.opt (mtls-size): New entry. >> * config/aarch64/aarch64.c (initialize_aarch64_tls_size): New function. >>

[AArch64/testsuite] Add more TLS local executable testcases

2015-08-26 Thread Jiong Wang
This patch covers tlsle tiny model tests; tls size truncation for tiny & small model is also included. All testcases pass native test. OK for trunk? 2015-08-26 Jiong Wang gcc/testsuite/ * gcc.target/aarch64/tlsle12_tiny_1.c: New testcase for tiny model. * gcc.target/aarch64/tlsle24_ti

Re: [AArch64][TLSLE][3/3] Implement local executable mode for all memory model

2015-08-27 Thread Jiong Wang
Christophe Lyon writes: > On 19 August 2015 at 16:21, Jiong Wang wrote: >> >> Marcus Shawcroft writes: >> >>> On 21 May 2015 at 17:49, Jiong Wang wrote: >>> >>>> 2015-05-14 Jiong Wang >>>> gcc/ >>>> * config/a

Re: [AArch64][TLSLE][3/3] Implement local executable mode for all memory model

2015-08-27 Thread Jiong Wang
Christophe Lyon writes: > On 27 August 2015 at 10:35, Jiong Wang wrote: >> >> Christophe Lyon writes: >> >>> On 19 August 2015 at 16:21, Jiong Wang wrote: >>>> >>>> Marcus Shawcroft writes: >>>> >>>>> On 21 May 2

[AArch64][TLSGD][1/2] Remove unncessary define_expand for TLS GD traditional

2015-08-27 Thread Jiong Wang
Jiong Wang writes: > Currently, there is only small model support for TLS GD on > AArch64. While TLS Global Dynamic (Traditional) is actually the same for > all memory mode. > > For TLSGD, the code logic is always the following: > > RegA = GOT descriptor address of

[AArch64][TLSGD][2/2] Implement TLS GD traditional for tiny code model

2015-08-27 Thread Jiong Wang
As described, this is the main implementation patch. 2015-08-26 Jiong Wang gcc/ * configure.ac: Add check for binutils global dynamic tiny code model relocation support. * configure: Regenerate. * config.in: Regenerate. * config/aarch64/aarch64.md (tlsgd_tiny): New define_insn

Re: [AArch64][TLSLE][3/3] Implement local executable mode for all memory model

2015-08-27 Thread Jiong Wang
Jiong Wang writes: >>> >>> Those relocation types required by tls-size 12 & 24 are supported by >>> binutils-2.25 already, and you have passed compilation and failed at >>> execution, so there is something wrong I guess. >>> >>> E

Re: [AArch64][TLSLE][3/3] Implement local executable mode for all memory model

2015-08-27 Thread Jiong Wang
Christophe Lyon writes: > On 27 August 2015 at 12:03, Jiong Wang wrote: >> >> Jiong Wang writes: >>>>> >>>>> Those relocation types required by tls-size 12 & 24 are supported by >>>>> binutils-2.25 already, and you have pa

[COMMITTED][AArch64] Rename SYMBOL_SMALL_GOTTPREL to SYMBOL_SMALL_TLSIE

2015-08-28 Thread Jiong Wang
SYMBOL_SMALL_GOTTPREL is for the TLS IE model, while it is the only symbol name which does not follow the naming convention SYMBOL_[code model]_TLS[tls model]. This patch fixes this. Committed as obvious. 2015-08-28 Jiong Wang gcc/ * config/aarch64/aarch64-protos.h (aarch64_symbol_context

Re: [PATCH] PR 62173, re-shuffle insns for RTL loop invariant hoisting

2015-09-02 Thread Jiong Wang
Jeff Law writes: > On 05/21/2015 02:46 PM, Jiong Wang wrote: >> >> Thanks for these thoughts. >> >> I tried but still can't prove this transformation will not introduce >> extra pointer overflow even given it's reassociation with vfp, although >>

[PATCH] PR67421, Cost instruction sequences when doing left wide shift

2015-09-03 Thread Jiong Wang
gen_* directly, instead I reused "expand_variable_shift" to let it handle all the left work. wide-shift-64 passes on sparc under the option "-mv8plus -mcpu=v9" now, and arm32 also generates better code for wide-shift-64. OK for trunk? 2015-09-03 Jiong Wang gcc/ P

[PATCH] Fix seq_cost prototype to use signed int

2015-09-08 Thread Jiong Wang
All other cost helper functions use signed int to hold cost while seq_cost uses unsigned int. This patch fixes that. Bootstrap OK on x86. OK for trunk? 2015-09-08 Jiong Wang gcc/ * rtl.h (seq_cost): Change return type from "unsigned" to "int". * rtlanal.c
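The message only states that seq_cost was inconsistent with the other cost helpers. As a generic illustration of why mixing the two types can matter (not a claim about the patch's motivation), the sketch below shows how C's usual arithmetic conversions change the result of a comparison when one side is unsigned. Both function names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical helpers, not GCC code. With an unsigned cost, the int
   budget is converted to unsigned before the comparison, so a negative
   budget becomes a very large value. */
bool within_budget_unsigned(unsigned cost, int budget)
{
    return cost < budget;
}

/* With a signed cost, both operands stay int and the comparison behaves
   as written. */
bool within_budget_signed(int cost, int budget)
{
    return cost < budget;
}
```

For a cost of 5 and a budget of -1, the unsigned version reports that the cost is within budget while the signed version does not.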

[AArch64] Delete aarch64_symbol_context which is not used

2015-09-08 Thread Jiong Wang
The concept of aarch64_symbol_context is not used in AArch64; this patch removes it and all relevant code. OK for trunk? 2015-09-08 Jiong Wang gcc/ * config/aarch64/aarch64-protos.h (aarch64_symbol_context): Delete. * config/aarch64/aarch64.c (aarch64_expand_mov_immediate): Likewise

[AArch64] Handle const address in aarch64_print_operand

2015-09-08 Thread Jiong Wang
ble error, as aarch64_print_operand doesn't support the "const" wrapper when there is no output modifier, which is wrong. This problem does not exist on other backends like arm and mips because they use a "default" case to support all remaining situations, which include address

Re: [PATCH] Fix seq_cost prototype to use signed int

2015-09-08 Thread Jiong Wang
Jeff Law writes: > On 09/08/2015 06:17 AM, Jiong Wang wrote: >> >> All other cost helper functions use signed int to hold cost, >> while seq_cost uses unsigned int. >> >> This patch fixes this. Bootstrap OK on x86. >> >> OK for trunk? >> &

[COMMITTED][AArch64] Skip tiny and large code model on gcc.target/aarch64/pic-small.c

2015-09-10 Thread Jiong Wang
The testcase is written for the small model. If a customized local test environment adds -mcmodel=tiny explicitly, it will override what's passed in dg-options. We need to apply the same restriction as got_mem_hoist_1.c. Committed as obvious. gcc/testsuite/ * gcc.target/aarch64/pic-small.c (dg-ski

[AArch64] Simplify TLS pattern by hardcoding relocation modifiers into pattern

2015-09-10 Thread Jiong Wang
TLS instruction sequences always have a fixed format, so there is no need to use operand modifiers. We can hardcode the relocation modifiers into the instruction patterns, and all the redundant checks in aarch64_print_operand can be removed. OK for trunk? 2015-09-10 Jiong Wang gcc/ * config

Re: [PATCH 2/2] shrink-wrap: Rewrite try_shrink_wrapping

2015-09-10 Thread Jiong Wang
Segher Boessenkool writes: > 2015-09-10 Segher Boessenkool > > * shrink-wrap.c (requires_stack_frame_p): Fix formatting. > (dup_block_and_redirect): Delete function. > (can_dup_for_shrink_wrapping): New function. > (fix_fake_fallthrough_edge): New function. >

Re: [PATCH 2/2] shrink-wrap: Rewrite try_shrink_wrapping

2015-09-11 Thread Jiong Wang
Segher Boessenkool writes: > On Thu, Sep 10, 2015 at 08:14:29AM -0700, Segher Boessenkool wrote: >> This patch rewrites the shrink-wrapping algorithm, allowing non-linear >> pieces of CFG to be duplicated for use without prologue instead of just >> linear pieces. > >> Bootstrapped and regression

Re: [PATCH 2/2] shrink-wrap: Rewrite try_shrink_wrapping

2015-09-11 Thread Jiong Wang
Segher Boessenkool writes: > On Fri, Sep 11, 2015 at 10:24:42AM +0100, Jiong Wang wrote: >> >> Segher Boessenkool writes: >> >> > On Thu, Sep 10, 2015 at 08:14:29AM -0700, Segher Boessenkool wrote: >> >> This patch rewrites the shrink-wrapping algorith

Re: [Patch, rtl] PR middle-end/78016, keep REG_NOTE order during insn copy

2016-11-07 Thread Jiong Wang
On 07/11/16 17:04, Bernd Schmidt wrote: On 11/03/2016 03:00 PM, Eric Botcazou wrote: FWIW here's a more complete version of my patch which I'm currently testing. Let me know if you think it's at least a good enough intermediate step to be installed. It is, thanks. Testing showed the same i

[0/9] Support ARMv8.3-A Pointer Authentication Extension

2016-11-11 Thread Jiong Wang
flow protect on risky functions --- sign LR+ 1.82% + 2.18% LR protect on All Please review this patchset. Thanks. Jiong Wang (9): [RFC] Reserve three DW_OP number in vendor extension space Encoding supp

[1/9][RFC][DWARF] Reserve three DW_OP numbers in vendor extension space

2016-11-11 Thread Jiong Wang
include/ 2016-11-09 Richard Earnshaw Jiong Wang * dwarf2.def (DW_OP_AARCH64_pauth): Reserve the number 0xea. (DW_OP_AARCH64_paciasp): Reserve the number 0xeb. (DW_OP_AARCH64_paciasp_deref): Reserve the number 0xec. diff --git a/include/dwarf2.
