[PATCH, aarch64 1/4] aarch64: Add movprfx alternatives for unpredicated patterns

2018-07-01 Thread Richard Henderson
* config/aarch64/aarch64.md (movprfx): New attr. (length): Default movprfx to 8. * config/aarch64/aarch64-sve.md (*mul3): Add movprfx alt. (*madd, *msubmul3_highpart): Likewise. (*3): Likewise. (*v3): Likewise. (*3): Likewise. (*3): Li

[PATCH, aarch64 0/4] Add movprfx patterns and alternatives

2018-07-01 Thread Richard Henderson
These don't fire very often, but at least a few times within the testsuite. Enough to test my qemu implementation of the insns. r~ Richard Henderson (4): aarch64: Add movprfx alternatives for unpredicated patterns aarch64: Remove predicate from inside SVE_COND_FP_BINARY aarch64

[PATCH, aarch64 2/4] aarch64: Remove predicate from inside SVE_COND_FP_BINARY

2018-07-01 Thread Richard Henderson
The predicate is present within the containing UNSPEC_SEL; there is no need to duplicate it. * config/aarch64/aarch64-sve.md (cond_): Remove match_dup 1 from the inner unspec. (*cond_): Likewise. --- gcc/config/aarch64/aarch64-sve.md | 9 +++-- 1 file changed, 3 insert

[PATCH, aarch64 4/4] aarch64: Add movprfx patterns for zero and unmatched select

2018-07-01 Thread Richard Henderson
* config/aarch64/aarch64-protos.h, config/aarch64/aarch64.c (aarch64_sve_prepare_conditional_op): Remove. * config/aarch64/aarch64-sve.md (cond_): Allow aarch64_simd_reg_or_zero as select operand; remove the aarch64_sve_prepare_conditional_op call. (c

[PATCH, aarch64 3/4] aarch64: Add movprfx alternatives for predicate patterns

2018-07-01 Thread Richard Henderson
* config/aarch64/iterators.md (SVE_INT_BINARY_REV): Remove. (SVE_COND_FP_BINARY_REV): Remove. (sve_int_op_rev, sve_fp_op_rev): New. * config/aarch64/aarch64-sve.md (*cond__0): New. (*cond__0): New. (*cond__0): New. (*cond__2): Rename, add movp

Re: [PATCH, aarch64 3/4] aarch64: Add movprfx alternatives for predicate patterns

2018-07-02 Thread Richard Henderson
On 07/02/2018 04:55 AM, Richard Sandiford wrote: >> +;; Predicated floating-point operations with select matching output. >> +(define_insn "*cond__0" >> + [(set (match_operand:SVE_F 0 "register_operand" "+w, w, ?&w") >> (unspec:SVE_F >> - [(match_operand: 1 "register_operand" "Upl") >> +

[PATCH] alpha: Use TARGET_COMPUTE_FRAME_LAYOUT

2018-07-09 Thread Richard Henderson
At the same time, merge several related frame computing functions. Recall that HWI is now always 64-bit, so merge IMASK and FMASK, which allows merging of several loops within prologue and epilogue. Full regression testing will take some time, but a quick browse suggests no change in generated cod

Re: [PATCH] alpha: Use TARGET_COMPUTE_FRAME_LAYOUT

2018-07-10 Thread Richard Henderson
On 07/10/2018 12:05 AM, Richard Biener wrote: > On Mon, Jul 9, 2018 at 9:05 PM Richard Henderson wrote: >> >> At the same time, merge several related frame computing functions. >> Recall that HWI is now always 64-bit, so merge IMASK and FMASK, >> which allows merg

Re: [GCC][PATCH][Aarch64] Exploiting BFXIL when OR-ing two AND-operations with appropriate bitmasks

2018-07-16 Thread Richard Henderson
On 07/16/2018 10:10 AM, Sam Tebbs wrote: > +++ b/gcc/config/aarch64/aarch64.c > @@ -1439,6 +1439,14 @@ aarch64_hard_regno_caller_save_mode (unsigned regno, > unsigned, > return SImode; > } > > +/* Implement IS_LEFT_CONSECUTIVE. Check if an integer's bits are consecutive > + ones from th

Re: [GCC][PATCH][Aarch64] Exploiting BFXIL when OR-ing two AND-operations with appropriate bitmasks

2018-07-17 Thread Richard Henderson
On 07/17/2018 06:33 AM, Richard Earnshaw (lists) wrote: >> (define_predicate "pow2m1_operand" >> (and (match_code "const_int") >>(match_test "exact_pow2 (INTVAL(op) + 1) > 0"))) > ITYM exact_log2() Yes of course. Thanks, r~
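For readers unfamiliar with the predicate under review: the corrected test accepts constants of the form 2^n - 1 with n >= 1. The following is only a standalone sketch of that arithmetic, not the GCC source. `sketch_exact_log2` and `is_pow2m1` are hypothetical names, and the helper is assumed to behave like GCC's `exact_log2` (log2 of a power of two, otherwise -1).

```c
#include <stdint.h>

/* Hypothetical stand-in for exact_log2: returns log2(x) when x is a
   power of two, and -1 otherwise (including x == 0).  */
static int sketch_exact_log2(uint64_t x)
{
    if (x == 0 || (x & (x - 1)) != 0)
        return -1;

    int n = 0;
    while ((x >>= 1) != 0)
        n++;
    return n;
}

/* Nonzero when v == 2^n - 1 for some n >= 1, i.e. v is a run of ones
   starting at bit 0.  The "> 0" rejects v == 0, since v + 1 == 1 has
   log2 equal to 0.  An all-ones v wraps to 0 and is rejected too.  */
static int is_pow2m1(uint64_t v)
{
    return sketch_exact_log2(v + 1) > 0;
}
```

With this sketch, 1, 3 and 0xff are accepted, while 0, 2 and 5 are rejected.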

[PATCH 0/6] Implement asm flag outputs for arm + aarch64

2019-11-08 Thread Richard Henderson
done now and I could just use it in the kernel... ;-) r~ Richard Henderson (6): aarch64: Add "c" constraint arm: Fix the "c" constraint arm: Rename CC_NOOVmode to CC_NZmode arm, aarch64: Add support for __GCC_ASM_FLAG_OUTPUTS__ arm: Add testsuite checks fo

[PATCH 1/6] aarch64: Add "c" constraint

2019-11-08 Thread Richard Henderson
Mirror arm in letting "c" match the condition code register. * config/aarch64/constraints.md (c): New constraint. --- gcc/config/aarch64/constraints.md | 4 1 file changed, 4 insertions(+) diff --git a/gcc/config/aarch64/constraints.md b/gcc/config/aarch64/constraints.md index d0c3

[PATCH 4/6] arm, aarch64: Add support for __GCC_ASM_FLAG_OUTPUTS__

2019-11-08 Thread Richard Henderson
Since all but a couple of lines is shared between the two targets, enable them both at once. * config/arm/aarch-common-protos.h (arm_md_asm_adjust): Declare. * config/arm/aarch-common.c (arm_md_asm_adjust): New. * config/arm/arm-c.c (arm_cpu_builtins): Define __GCC_

[PATCH 3/6] arm: Rename CC_NOOVmode to CC_NZmode

2019-11-08 Thread Richard Henderson
CC_NZmode is a more accurate description of what we require from the mode, and matches up with the definition in aarch64. Rename noov_comparison_operator to nz_comparison_operator in order to match. * config/arm/arm-modes.def (CC_NZ): Rename from CC_NOOV. * config/arm/predicates.m

[PATCH 2/6] arm: Fix the "c" constraint

2019-11-08 Thread Richard Henderson
The existing definition using register class CC_REG does not work because CC_REGNUM does not support normal modes, and so fails to match register_operand. Use a non-register constraint and the cc_register predicate instead. * config/arm/constraints.md (c): Use cc_register predicate. ---

[PATCH 5/6] arm: Add testsuite checks for asm-flag

2019-11-08 Thread Richard Henderson
Inspired by the tests in gcc.target/i386. Testing code generation, diagnostics, and execution. * gcc.target/arm/asm-flag-1.c: New test. * gcc.target/arm/asm-flag-3.c: New test. * gcc.target/arm/asm-flag-5.c: New test. * gcc.target/arm/asm-flag-6.c: New test. --- g

[PATCH 6/6] aarch64: Add testsuite checks for asm-flag

2019-11-08 Thread Richard Henderson
Inspired by the tests in gcc.target/i386. Testing code generation, diagnostics, and execution. * gcc.target/aarch64/asm-flag-1.c: New test. * gcc.target/aarch64/asm-flag-3.c: New test. * gcc.target/aarch64/asm-flag-5.c: New test. * gcc.target/aarch64/asm-flag-6.c:

Re: [PATCH 4/6] arm, aarch64: Add support for __GCC_ASM_FLAG_OUTPUTS__

2019-11-08 Thread Richard Henderson
On 11/8/19 11:54 AM, Richard Henderson wrote: > +@table @code > +@item eq > +``equal'' or Z flag set > +@item ne > +``not equal'' or Z flag clear > +@item cs > +``carry'' or C flag set > +@item cc > +C flag clear > +@item mi > +

Re: [PATCH][arm][1/X] Add initial support for saturation intrinsics

2019-11-09 Thread Richard Henderson
> +;; define_subst and associated attributes > + > +(define_subst "add_setq" > + [(set (match_operand:SI 0 "" "") > +(match_operand:SI 1 "" ""))] > + "" > + [(set (match_dup 0) > +(match_dup 1)) > + (set (reg:CC APSRQ_REGNUM) > + (unspec:CC [(reg:CC APSRQ_REGNUM)] UNSPEC_Q_

Re: [Committed] IBM Z: Add pattern for load truth value of comparison into reg

2019-11-11 Thread Richard Henderson
On 11/7/19 12:52 PM, Andreas Krebbel wrote: > +; Such patterns get directly emitted by noce_emit_store_flag. > +(define_insn_and_split "*cstorecc_z13" > + [(set (match_operand:GPR 0 "register_operand""=&d") > + (match_operator:GPR 1 "s390_comparison" > +

Re: [Committed] IBM Z: Add pattern for load truth value of comparison into reg

2019-11-11 Thread Richard Henderson
On 11/11/19 4:03 PM, Andreas Krebbel wrote: > On 11.11.19 15:39, Richard Henderson wrote: >> On 11/7/19 12:52 PM, Andreas Krebbel wrote: >>> +; Such patterns get directly emitted by noce_emit_store_flag. >>> +(define_insn_and_split "*cstorecc_z13"

Re: [PATCH 0/6] Implement asm flag outputs for arm + aarch64

2019-11-13 Thread Richard Henderson
On 11/12/19 9:21 PM, Richard Sandiford wrote: > Apart from the vc/vs thing you mentioned in the follow-up for 4/6, > it looks like 4/6, 5/6 and 6/6 are missing "hs" and "lo". OK for > aarch64 with those added. Are those aliases for two of the other conditions? They're not in the list within the

Re: [PATCH i386 3/8] [AVX512] Add AVX-512 patterns.

2013-08-20 Thread Richard Henderson
On 08/20/2013 07:04 AM, Kirill Yukhin wrote: > 2013-08-20 Kirill Yukhin > > * config/i386/sse.md (V16): Rename to... > (VMOVE): this. > (mov): Update iterator name. > (*mov_internal): Ditto. > (push1): Ditto. > (movmisalign): Ditto. This is ok.

Re: [PATCH i386 1/8] [AVX512] Adjust register classes.

2013-08-20 Thread Richard Henderson
On 08/20/2013 10:48 AM, Kirill Yukhin wrote: > @@ -34589,8 +34649,20 @@ ix86_hard_regno_mode_ok (int regno, enum > machine_mode mode) > { >/* We implement the move patterns for all vector modes into and >out of SSE registers, even when no operation instructions > - are av

Re: [PATCH] libitm: Add custom HTM fast path for RTM on x86_64.

2013-08-21 Thread Richard Henderson
> -#if defined(USE_HTM_FASTPATH) && !defined(HTM_CUSTOM_FASTPATH) > +#ifdef USE_HTM_FASTPATH >// HTM fastpath. Only chosen in the absence of transaction_cancel to allow >// using an uninstrumented code path. >// The fastpath is enabled only by dispatch_htm's method group, which uses >

Re: [PATCH] libitm: Add custom HTM fast path for RTM on x86_64.

2013-08-21 Thread Richard Henderson
On 08/21/2013 10:14 AM, Andi Kleen wrote: > The rest seems reasonable to me, although I haven't tried to untangle > the full dependencies between C++ and asm code for retries. > It would be likely cleaner to just keep the retries fully > in C++ like the original patch did. There's no advantage > of

Re: [PATCH i386 1/8] [AVX512] Adjust register classes.

2013-08-21 Thread Richard Henderson
On 08/21/2013 11:28 AM, Kirill Yukhin wrote: >>> + && (mode == XImode >>> + || VALID_AVX512F_REG_MODE (mode) >>> + || VALID_AVX512F_SCALAR_MODE (mode))) >>> + return true; >>> + >>> + /* In xmm16-xmm31 we can store only 512 bit modes. */ >>> + if (EXT_REX_SSE_REGNO_

Re: [PATCH i386 2/8] [AVX512] Add mask registers.

2013-08-22 Thread Richard Henderson
On 08/22/2013 02:35 AM, Kirill Yukhin wrote: > Despite of generic OR, mask version of OR do not clobber FLAGS_REG. > Of course, we may conservatively think that it is, but I believe > this is not good idea. I believe that having two different patterns is a worse idea. You can always split away th

Re: [PATCH] libitm: Add custom HTM fast path for RTM on x86_64.

2013-08-22 Thread Richard Henderson
On 08/22/2013 11:39 AM, Torvald Riegel wrote: > + /* Store edi for future HTM fast path retries. We use a stack slot > +lower than the jmpbuf so that the jmpbuf's rip field will overlap > +with the proper return address on the stack. */ > + movl%edi, -64(%rsp) You hav

Re: [PATCH i386 1/8] [AVX512] Adjust register classes.

2013-08-22 Thread Richard Henderson
On 08/22/2013 11:56 AM, Kirill Yukhin wrote: > ChangeLog: > 2013-08-22 Kirill Yukhin > > * gcc/config/i386/i386.md (*movti_internal): Use > predicate to determine if EVEX is needed. > (*movsi_internal): Ditto. > (*movdf_internal): Ditto. > (*movsf_internal): Ditto.

Re: Add overload for register_pass

2013-08-26 Thread Richard Henderson
On 08/24/2013 02:33 PM, Oleg Endo wrote: > gcc/ChangeLog: > * passes.c (register_pass): Add overload. > * tree-pass.h (register_pass): Forward declare it. > Add comment. Ok. r~

Re: [Ping^4] [Patch, AArch64, ILP32] 3/5 Minor change in function.c:assign_parm_find_data_types()

2013-08-26 Thread Richard Henderson
On 08/15/2013 11:21 AM, Yufeng Zhang wrote: > Ping^4~ > > I am aware that it is currently holiday season, but it would be really nice if > this tiny patch can get some further comments even if it is not an approval. > > The original RFA email is here: > http://gcc.gnu.org/ml/gcc-patches/2013-06/m

Re: RFA: prefer double over same-size float as conversion result

2013-08-26 Thread Richard Henderson
On 08/26/2013 09:07 AM, Joern Rennecke wrote: > 2013-05-14 Joern Rennecke > > * c-typeck.c (c_common_type): Prefer double_type_node over > other REAL_TYPE types with the same precision. > (convert_arguments): Likewise. Ok. r~

Re: [PATCH i386 2/8] [AVX512] Add mask registers.

2013-08-26 Thread Richard Henderson
On 08/26/2013 09:13 AM, Kirill Yukhin wrote: > +(define_split > + [(set (match_operand:SWI12 0 "mask_reg_operand") > + (any_logic:SWI12 (match_operand:SWI12 1 "mask_reg_operand") > + (match_operand:SWI12 2 "mask_reg_operand"))) > + (clobber (reg:CC FLAGS_REG))] > + "TAR

Re: [PATCH] libitm: Add custom HTM fast path for RTM on x86_64.

2013-08-26 Thread Richard Henderson
On 08/22/2013 02:57 PM, Torvald Riegel wrote: > On Thu, 2013-08-22 at 12:05 -0700, Richard Henderson wrote: >> On 08/22/2013 11:39 AM, Torvald Riegel wrote: >>> + /* Store edi for future HTM fast path retries. We use a stack slot >>> + lower than the jmpbuf so

Re: PING: Fwd: Re: [patch] implement Cilk Plus simd loops on trunk

2013-08-26 Thread Richard Henderson
> +static tree > +c_check_cilk_loop_incr (location_t loc, tree decl, tree incr) > +{ > + if (EXPR_HAS_LOCATION (incr)) > +loc = EXPR_LOCATION (incr); > + > + if (!incr) > +{ > + error_at (loc, "missing increment"); > + return error_mark_node; > +} Either these tests are swa

Re: [PATCH i386 2/8] [AVX512] Add mask registers.

2013-08-27 Thread Richard Henderson
On 08/27/2013 11:11 AM, Kirill Yukhin wrote: >> > What happened to the bmi andn alternative we discussed? > BMI only supported for 4- and 8- byte integers, while > kandw - for HI/QI > We're talking about values in registers. Ignoring the high bits of the andn result still produces the correct re
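The point about high bits can be checked with a small standalone sketch: an and-not is bitwise, so the low 16 bits of a 32-bit result depend only on the low 16 bits of the inputs. The function names below are hypothetical and model only the arithmetic (~a & b), not the GCC patterns or the actual instructions.

```c
#include <stdint.h>

/* Reference: and-not computed directly on 16-bit values.  */
static uint16_t andn16(uint16_t a, uint16_t b)
{
    return (uint16_t)(~a & b);
}

/* The same operation done in 32-bit registers whose upper halves may
   hold arbitrary data; only the low 16 bits of the result are kept.  */
static uint16_t andn32_low16(uint32_t a_wide, uint32_t b_wide)
{
    return (uint16_t)(~a_wide & b_wide);
}
```

Whatever is placed in the upper halves of `a_wide` and `b_wide`, `andn32_low16` agrees with `andn16` on the low halves.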

Re: [PATCH i386 2/8] [AVX512] Add mask registers.

2013-08-28 Thread Richard Henderson
On 08/28/2013 10:45 AM, Kirill Yukhin wrote: > Hello Richard, > > On 27 Aug 13:07, Richard Henderson wrote: >> On 08/27/2013 11:11 AM, Kirill Yukhin wrote: >>>>> What happened to the bmi andn alternative we discussed? >>> BMI only supported for 4- and 8- byt

Re: [PATCH i386 2/8] [AVX512] Add mask registers.

2013-08-28 Thread Richard Henderson
On 08/28/2013 11:38 AM, Kirill Yukhin wrote: >> When combine puts the AND and the NOT together, we don't know what registers >> we >> want the data in. If we do not supply the general register alternative, with >> the clobber, then we will be FORCED to implement the operation in the mask >> regis

Re: [PATCH i386 2/8] [AVX512] Add mask registers.

2013-09-09 Thread Richard Henderson
On 08/29/2013 04:59 AM, Kirill Yukhin wrote: > @@ -7616,10 +7677,10 @@ >[(set_attr "type" "alu") > (set_attr "mode" "SI")]) > > -(define_insn "*andhi_1" > - [(set (match_operand:HI 0 "nonimmediate_operand" "=rm,r,Ya") > - (and:HI (match_operand:HI 1 "nonimmediate_operand" "%0,0,qm")

Re: [PATCH] Handle target specific memory models in C frontend

2013-09-09 Thread Richard Henderson
On 08/10/2013 12:40 PM, Andi Kleen wrote: > On Fri, Nov 09, 2012 at 07:08:07AM -0800, Richard Henderson wrote: >> On 2012-11-09 07:03, Andi Kleen wrote: >>> PR55139 >>> * c-common.c (get_atomic_generic_size): Mask with >>> MEMMODEL_MASK >>

Re: [PATCH i386 2/8] [AVX512] Add mask registers.

2013-09-10 Thread Richard Henderson
On 09/10/2013 05:57 AM, Kirill Yukhin wrote: > Do you still think we need "*"? No, I suppose that's fine. r~

Re: [PATCH i386 2/8] [AVX512] Add mask registers.

2013-09-10 Thread Richard Henderson
On 09/10/2013 11:25 AM, Kirill Yukhin wrote: > Hello, > On 10 Sep 09:17, Richard Henderson wrote: >> On 09/10/2013 05:57 AM, Kirill Yukhin wrote: >>> + { OPTION_MASK_ISA_AVX512F, CODE_FOR_andhi_1, "__builtin_ia32_kandhi", >>> IX86_BUILT

Re: [PATCH i386 2/8] [AVX512] Add mask registers.

2013-09-10 Thread Richard Henderson
On 09/10/2013 05:57 AM, Kirill Yukhin wrote: > + { OPTION_MASK_ISA_AVX512F, CODE_FOR_andhi_1, "__builtin_ia32_kandhi", > IX86_BUILTIN_KAND16, UNKNOWN, (int) HI_FTYPE_HI_HI }, Alternately, why not use the standard CODE_FOR_andhi3 expander? r~

Re: [PATCH i386 3/8] [AVX512] [1/n] Add AVX-512 patterns: VF iterator extended.

2013-09-24 Thread Richard Henderson
On 08/27/2013 11:37 AM, Kirill Yukhin wrote: > Hello, > >> This patch is still far too large. >> >> I think you should split it up based on every single mode iterator that >> you need to add or change. > > Problem is that some iterators are depend on each other, so patches are > not going to be t

Re: [PATCH]: Fix use of __builtin_eh_pointer in EH_ELSE

2013-09-24 Thread Richard Henderson
On 09/03/2013 07:08 AM, Tristan Gingold wrote: > Hi, > > The field state->ehp_region wasn't updated before lowering constructs in the > eh > path of EH_ELSE. As a consequence, __builtin_eh_pointer is lowered to 0 (or > possibly to a wrong region number) in this path. > > The only user of EH_ELS

Re: [gomp4] Library side of depend clause support

2013-09-26 Thread Richard Henderson
On 09/26/2013 11:36 AM, Jakub Jelinek wrote: > +struct gomp_task; > struct gomp_taskgroup; > +struct htab; > + > +struct gomp_task_depend_entry > +{ > + void *addr; > + struct gomp_task_depend_entry *next; > + struct gomp_task_depend_entry *prev; > + struct gomp_task *task; > + bool is_in; >

Re: [PATCH]: Fix use of __builtin_eh_pointer in EH_ELSE

2013-09-30 Thread Richard Henderson
On 09/30/2013 03:24 AM, Tristan Gingold wrote: > 2013-09-03 Tristan Gingold > > * tree.c (set_call_expr_flags): Reject ECF_TM_PURE. > (build_common_builtin_nodes): Set "transaction_pure" > attribute on __builtin_eh_pointer function type (and not on > its declaration). O

Re: PR/54893: allow volatiles inside relaxed transactions

2012-10-11 Thread Richard Henderson
On 10/11/2012 01:56 PM, Aldy Hernandez wrote: > PR middle-end/54893 > * trans-mem.c (diagnose_tm_1_op): Allow volatiles inside relaxed > transactions. Ok. r~

Re: [PATCH] TARGET_ support, was [PATCH] Rs6000 infrastructure cleanup

2012-10-16 Thread Richard Henderson
On 2012-10-17 08:21, Michael Meissner wrote: > Now, the x86 actually maps the OPTION_ISA_xxx switches to TARGET_xxx switches, > so it is easy to change all of the defines from OPTION_ISA_ to > TARGET_ISA_. For Android, it was simple to change the one line of > reference. > > 2012-10-16 Michael M

Re: PR/54893: allow volatiles inside relaxed transactions

2012-10-16 Thread Richard Henderson
On 2012-10-12 20:42, Aldy Hernandez wrote: > PR middle-end/54893 > * trans-mem.c (diagnose_tm_1_op): Allow volatiles inside relaxed > transactions. Ok. r~

Re: [path] PR 54900: store data race in if-conversion pass

2012-10-16 Thread Richard Henderson
On 2012-10-17 09:53, Aldy Hernandez wrote: > +/* Like memory_modified_in_insn_p, but return TRUE if INSN will > + *SURELY* modify the memory contents of MEM. */ > +bool > +memory_surely_modified_in_insn_p (const_rtx mem, const_rtx insn) I don't like the word "surely". Are we certain or not? I

Re: [patch] libitm: Clarify ABI requirements for data-logging functions.

2012-10-23 Thread Richard Henderson
On 2012-10-24 01:50, Torvald Riegel wrote: > Clarify ABI requirements for data-logging functions. > > * libitm.texi: Clarify ABI requirements for data-logging functions. Ok. r~

Re: [Patch] libitm: Ask dispatch whether it requires serial mode.

2012-10-23 Thread Richard Henderson
On 2012-10-24 01:48, Torvald Riegel wrote: > Ask dispatch whether it requires serial mode. > > * retry.cc (gtm_thread::decide_begin_dispatch): Ask dispatch whether > it requires serial mode instead of assuming that for certain > dispatchs. > * dispatch.h (abi_dispat

Re: [patch] move GIMPLE_TRANSACTION expansion to tmmark pass

2012-10-30 Thread Richard Henderson
On 2012-10-30 05:32, Aldy Hernandez wrote: > + // If we have a ``_transaction_cancel [[outer]]'', there is only > + // one abnormal edge: to the transaction marked OUTER. > + tree arg = gimple_call_arg (stmt, 0); > + if (TREE_CODE (arg) == INTEGER_CST) > + { > + if (TR

Re: [PATCH]: Fix PR58542, Arguments of __atomic_* functions are converted in unsigned mode

2013-10-09 Thread Richard Henderson
On 10/08/2013 11:37 AM, Uros Bizjak wrote: > > As shown in the attached testcase, arguments of various __atomic > builtins should be converted as signed, so the immediates get properly > extended. > > 2013-10-08 Uros Bizjak > > * optabs.c (maybe_emit_atomic_exchange): Convert operands as

Re: [PATCH i386 3/8] [AVX512] [2/n] Add AVX-512 patterns: Fix missing `v' constraint.

2013-10-09 Thread Richard Henderson
On 10/09/2013 03:24 AM, Kirill Yukhin wrote: > Here's 2nd subpatch. It fixes missing `v' constraints. And one v constraint that shouldn't have been. Ok. r~

Re: [PATCH i386 3/8] [AVX512] [3/n] Add AVX-512 patterns: VF1 and VI iterators.

2013-10-09 Thread Richard Henderson
On 10/09/2013 03:24 AM, Kirill Yukhin wrote: > Here's 3rd subpatch. It extends VF1 and VI iterators. Ok. r~

Re: [PATCH i386 3/8] [AVX512] [4/n] Add AVX-512 patterns: V iterator.

2013-10-09 Thread Richard Henderson
On 10/09/2013 03:25 AM, Kirill Yukhin wrote: > Here's 4th subpatch. It extends V iterator. And much much more that's totally unrelated to changing V. That said, I didn't see anything wrong in there. Ok. r~

Re: [PATCH i386 3/8] [AVX512] [5/n] Add AVX-512 patterns: Introduce `multdiv' code iterator.

2013-10-09 Thread Richard Henderson
On 10/09/2013 03:25 AM, Kirill Yukhin wrote: > Here's 5th subpatch. It introduces `multdiv' code iterator. This is the sort of patch I like to see. It's the first one you've sent that's done exactly one thing. Congratulations. Ok. r~

Re: [PATCH i386 3/8] [AVX512] [6/n] Add AVX-512 patterns: VI2 and VI124 iterators.

2013-10-09 Thread Richard Henderson
On 10/09/2013 03:26 AM, Kirill Yukhin wrote: > Here's 6th subpatch. It extends VI2 and VI124 iterators. Ok. r~

Re: [PATCH i386 3/8] [AVX512] [7/n] Add AVX-512 patterns: VI4 and VI8 iterators.

2013-10-09 Thread Richard Henderson
On 10/09/2013 03:26 AM, Kirill Yukhin wrote: > Here's 7th subpatch. It extends VI4 and VI8 iterators. Ok. r~

Re: [PATCH i386 3/8] [AVX512] [8/n] Add AVX-512 patterns: VI48 and VI48_AVX2 iterators.

2013-10-09 Thread Richard Henderson
On 10/09/2013 03:27 AM, Kirill Yukhin wrote: > Here's 8th subpatch. It extends VI48 and VI48_AVX2 iterators. Ok. r~

Re: [PATCH i386 3/8] [AVX512] [9/n] Add AVX-512 patterns: VI124_AVX2, VI8F iterators.

2013-10-09 Thread Richard Henderson
On 10/09/2013 03:27 AM, Kirill Yukhin wrote: > Here's 9th subpatch. It extends VI124_AVX2_48 and VI8F iterators. Ok. r~

Re: [PATCH i386 3/8] [AVX512] [10/n] Add AVX-512 patterns: VI248_AVX2_8_AVX512F and VI124_256_48_AVX512F iterators.

2013-10-09 Thread Richard Henderson
On 10/09/2013 03:27 AM, Kirill Yukhin wrote: > Here's 10th subpatch. It introduces VI248_AVX2_8_AVX512F and VI124_256_48_512 > iterators. Ok. r~

Re: [PATCH i386 3/8] [AVX512] [11/n] Add AVX-512 patterns: FMA.

2013-10-09 Thread Richard Henderson
On 10/09/2013 03:28 AM, Kirill Yukhin wrote: > +;; CPUID bit AVX512F enables evex encoded scalar and 512-bit fma. It doesn't > +;; care about FMA bit, so we enable fma for TARGET_AVX512F even when > TARGET_FMA > +;; and TARGET_FMA4 are both false. How do you force an evex encoding of the instruc

Re: [PATCH i386 3/8] [AVX512] [12/n] Add AVX-512 patterns: V_512 and VI_512 iterators.

2013-10-09 Thread Richard Henderson
On 10/09/2013 03:28 AM, Kirill Yukhin wrote: > Here's 12th subpatch. It introduces VF_512 and VI_512 iterators. Ok. r~

Re: [PATCH i386 3/8] [AVX512] [13/n] Add AVX-512 patterns: VI4_AVX iterator.

2013-10-09 Thread Richard Henderson
On 10/09/2013 03:29 AM, Kirill Yukhin wrote: > Here's 13th subpatch. It introduces VI4_AVX iterator. Ok. r~

Re: [PATCH i386 3/8] [AVX512] [14/n] Add AVX-512 patterns: VI48F_256_512 iterator.

2013-10-09 Thread Richard Henderson
On 10/09/2013 03:29 AM, Kirill Yukhin wrote: > Here's 14th subpatch. It introduces VI48F_256_512 iterator. Ok. r~

Re: [PATCH] Fix asm goto miscompilation of Linux kernel (PR middle-end/58670)

2013-10-10 Thread Richard Henderson
On 10/10/2013 03:22 AM, Jakub Jelinek wrote: > Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk and 4.8? > > 2013-10-10 Jakub Jelinek > > PR middle-end/58670 > * stmt.c (expand_asm_operands): Add FALLTHRU_BB argument, > if any labels are in FALLTHRU_BB, use

Re: [PATCH i386 3/8] [AVX512] [11/n] Add AVX-512 patterns: FMA.

2013-10-10 Thread Richard Henderson
On 10/10/2013 07:27 AM, Kirill Yukhin wrote: > Currently, from HW point of view, there're no CPUs which > feature AVX-512, but not AVX2. So, I believe we may put > a `TODO` in comment, like this: > > +;; CPUID bit AVX512F enables evex encoded scalar and 512-bit fma. It doesn't > +;; care about FM

Re: [PATCH i386 3/8] [AVX512] [15/n] Add AVX-512 patterns: VI48F_512 iterator.

2013-10-11 Thread Richard Henderson
On 10/09/2013 03:29 AM, Kirill Yukhin wrote: > +(define_insn "avx512f_vec_dup_mem" > + [(set (match_operand:VI48F_512 0 "register_operand" "=x") > + (vec_duplicate:VI48F_512 > + (match_operand: 1 "nonimmediate_operand" "xm")))] > + "TARGET_AVX512F" > + "vbroadcast\t{%1, %0|%0, %1}" > +

Re: [PATCH i386 3/8] [AVX512] [16/n] Add AVX-512 patterns: VI48_512 and VI4F_128 iterators.

2013-10-11 Thread Richard Henderson
On 10/09/2013 03:30 AM, Kirill Yukhin wrote: > +;; Return true if OP is either -1 constant or stored in register. > +(define_predicate "register_or_constm1_operand" > + (ior (match_operand 0 "register_operand") > + (match_test "op == constm1_rtx"))) This won't do the right thing, because yo

Re: [PATCH i386 3/8] [AVX512] [17/n] Add AVX-512 patterns: V8FI and V16FI iterators.

2013-10-14 Thread Richard Henderson
On 10/09/2013 03:30 AM, Kirill Yukhin wrote: > Hello, > >> This patch is still far too large. >> >> I think you should split it up based on every single mode iterator that >> you need to add or change. > > Here's 17th subpatch. It introduces V8FI and V16FI iterators. Ok. r~

Re: [PATCH i386 3/8] [AVX512] [18/n] Add AVX-512 patterns: various RCPs and SQRTs.

2013-10-14 Thread Richard Henderson
On 10/09/2013 03:31 AM, Kirill Yukhin wrote: > Hello, > >> This patch is still far too large. >> >> I think you should split it up based on every single mode iterator that >> you need to add or change. > > Here's 18th subpatch. It introduces various new insns. Ok. r~

Re: [PATCH i386 3/8] [AVX512] [18/n] Add AVX-512 patterns: various RCPs and SQRTs.

2013-10-14 Thread Richard Henderson
On 10/09/2013 03:31 AM, Kirill Yukhin wrote: > +(define_mode_attr ssefixupmode > + [(V16SF "V16SI") (V4SF "V4SI") (V8DF "V8DI") (V2DF "V2DI")]) > + Oh, I forgot. How is this different from sseintvecmode? r~

Re: [PATCH i386 3/8] [AVX512] [18/n] Add AVX-512 patterns: various RCPs and SQRTs.

2013-10-15 Thread Richard Henderson
On 10/15/2013 06:57 AM, Kirill Yukhin wrote: > Hello, > On 14 Oct 13:10, Richard Henderson wrote: >> On 10/09/2013 03:31 AM, Kirill Yukhin wrote: >>> +(define_mode_attr ssefixupmode >>> + [(V16SF "V16SI") (V4SF "V4SI") (V8DF "V8DI") (V2D

Re: [PATCH i386 3/8] [AVX512] [19/n] Add AVX-512 patterns: Extracts and converts.

2013-10-15 Thread Richard Henderson
On 10/09/2013 03:31 AM, Kirill Yukhin wrote: > + rtx op1 = operands[1]; > + if (REG_P (op1)) > +op1 = gen_rtx_REG (V16HImode, REGNO (op1)); > + else > +op1 = gen_lowpart (V16HImode, op1); The IF case is incorrect. You need to use gen_lowpart always. > +(define_insn "*avx512f_unpcklpd5

Re: [PATCH i386 3/8] [AVX512] [20/n] Add AVX-512 patterns: Misc.

2013-10-15 Thread Richard Henderson
On 10/09/2013 03:31 AM, Kirill Yukhin wrote: > + else if (TARGET_AVX512PF && (write || !TARGET_PREFETCH_SSE)) > +operands[2] = GEN_INT (1); I don't believe you want the TARGET_PREFETCH_SSE check there. That was really to select between SSE and 3dNow prefetch. If we have AVX, we're guaranteed

Re: [PATCH i386 3/8] [AVX512] [19/n] Add AVX-512 patterns: Extracts and converts.

2013-10-16 Thread Richard Henderson
On 10/16/2013 09:07 AM, Kirill Yukhin wrote: > I suspect gen_lowpart is bad turn when reload is completed, as > far as it can create new pseudo. gen_lowpart () may call > gen_reg_rtx (), which contain corresponding gcc_assert (). False. gen_lowpart is perfectly safe post-reload. Indeed, taking th

Re: [PATCH i386 AVX2] Remove redundant expands.

2013-10-16 Thread Richard Henderson
On 10/16/2013 09:47 AM, Uros Bizjak wrote: > On Wed, Oct 16, 2013 at 6:06 PM, Kirill Yukhin > wrote: > >> It seems that gang of AVX* patterns were copy and >> pasted from SSE, however as far as they are NDD, >> we may remove corresponding expands which sort operands. > > OTOH, I have some secon

Re: [PATCH i386 4/8] [AVX512] [1/n] Add substed patterns.

2013-10-21 Thread Richard Henderson
On 10/17/2013 07:15 AM, Kirill Yukhin wrote: > +(define_mode_attr ssescalarsize > + [(V8DI "64") (V4DI "64") (V2DI "64") > + (V32HI "16") (V16HI "16") (V8HI "16") > + (V16SI "32") (V8SI "32") (V4SI "32") > + (V16SF "16") (V8DF "64")]) Error on V16SF. Probably better to fill this out. >

Re: [PATCH i386 4/8] [AVX512] [1/n] Add substed patterns.

2013-10-22 Thread Richard Henderson
On 10/22/2013 07:42 AM, Kirill Yukhin wrote: > Hello Richard, > Thanks for remarks, they all seems reasonable. > > One question > > On 21 Oct 16:01, Richard Henderson wrote: >>> +(define_insn "avx512f_moves_mask" >>> + [(set

Re: [PATCH, i386, MPX, 1/X] Support of Intel MPX ISA. 1/2 Bound type and modes

2013-10-22 Thread Richard Henderson
On 10/21/2013 11:10 AM, Jeff Law wrote: > So why are bounds distinct modes? Is there some inherent reason why bounds > are something other than an integer mode (MODE_INT)? I suggested the distinct modes during the NDA phase. The primary reason for this is that MPX is designed to be kind of bac

Re: [PATCH, i386, MPX, 1/X] Support of Intel MPX ISA. 1/2 Bound type and modes

2013-10-22 Thread Richard Henderson
On 10/22/2013 12:18 PM, Jeff Law wrote: >> The only way I could think to positively ensure that normal operations >> didn't get implemented via mpx insns is to describe the new patterns >> with distinct modes. > Presumably once we have a distinct mode, we do the right magic in > HARD_REGNO_MODE_OK

Re: [PATCH 1/n] Add conditional compare support

2013-10-23 Thread Richard Henderson
> +static enum rtx_code > +arm_ccmode_to_code (enum machine_mode mode) > +{ > + switch (mode) > +{ > +case CC_DNEmode: > + return NE; Why would you need to encode comparisons in CCmodes? That looks like a mis-design to me. > +Conditional compare instruction. Operand 2 and 5 are RTL

Re: [PATCH, MPX, 2/X] Pointers Checker [2/25] Builtins

2013-10-23 Thread Richard Henderson
On 10/23/2013 02:41 PM, Jeff Law wrote: > Out of curiosity, did you consider and/or discuss with Richard whether or not > to make these target-dependent or target-independent builtins? I realize it's > a bit problematic with Richard being involved during the NDA portion and > someone else during t

[RFC] PR 58542: const_int vs lost modes

2013-10-23 Thread Richard Henderson
In this pr, we have a -1 in type __int128. Since this value can be represented in a HOST_WIDE_INT, we expand this to a const_int. The expansion from tree to rtl happens in expand_builtin_atomic_store. And as with most of our builtins, we then pass off the rtl to another routine for expansion. W

Re: [PATCH] Generate fused widening multiply-and-accumulate operations only when the widening multiply has single use

2013-10-23 Thread Richard Henderson
On 10/21/2013 03:01 PM, Yufeng Zhang wrote: > > This patch changes the widening_mul pass to fuse the widening multiply with > accumulate only when the multiply has single use. The widening_mul pass > currently does the conversion regardless of the number of the uses, which can > cause poor code-g

Re: [PATCH 1/n] Add conditional compare support

2013-10-24 Thread Richard Henderson
On 10/24/2013 01:11 AM, Zhenqiang Chen wrote: >> Why would you need to encode comparisons in CCmodes? >> That looks like a mis-design to me. > > The CCmodes are used to check whether the result of a previous conditional > compare can combine with current compare. By changing it to rtx_code, I can

Re: [PATCH 1/n] Add conditional compare support

2013-10-24 Thread Richard Henderson
On 10/24/2013 09:24 AM, Richard Earnshaw wrote: > On 24/10/13 17:15, Richard Henderson wrote: >> On 10/24/2013 01:11 AM, Zhenqiang Chen wrote: >>>> Why would you need to encode comparisons in CCmodes? >>>> That looks like a mis-design to me. >>> >>>

Re: [PATCH 1/n] Add conditional compare support

2013-10-24 Thread Richard Henderson
On 10/24/2013 09:37 AM, Richard Earnshaw wrote: > It still needs to put out the right final condition based on the > comparisons that were previously done. At least traditionally (in the > existing ARM code) the comparison was just EQ or NE even if the original > tests were inequalities; so the onl

Re: [RFC] PR 58542: const_int vs lost modes

2013-10-24 Thread Richard Henderson
On 10/24/2013 05:02 AM, Richard Sandiford wrote: > Do we actually need to do a conversion here at all? It looks like the > modes of "expected" and "desired" should already match "mem", so we could > just use create_input_operand. This works. I've committed the following to mainline, and will tes

Re: [PATCH 1/n] Add conditional compare support

2013-10-28 Thread Richard Henderson
On 10/28/2013 01:32 AM, Zhenqiang Chen wrote: > Patch is updated according to your comments. Main changes are: > * Add two hooks: legitimize_cmp_combination and legitimize_ccmp_combination > * Improve document. No, these are not the hooks I proposed. You should *not* have a ccmp_optab, because th

Re: [PATCH i386 4/8] [AVX512] [1/n] Add substed patterns.

2013-10-28 Thread Richard Henderson
On 10/28/2013 03:24 AM, Kirill Yukhin wrote: > Hello Richard, > On 22 Oct 08:16, Richard Henderson wrote: >> On 10/22/2013 07:42 AM, Kirill Yukhin wrote: >>> Hello Richard, >>> Thanks for remarks, they all seems reasonable. >>> >>> One question

Re: [PATCH i386 4/8] [AVX512] Add substed patterns: mask_scalar_merge subst.

2013-10-28 Thread Richard Henderson
On 10/28/2013 07:53 AM, Kirill Yukhin wrote: > This patch introduces "mask_scalar_merge" subst. > > Is it ok to commit to main trunk? Ok. r~

Re: [PATCH i386 4/8] [AVX512] [1/n] Add substed patterns.

2013-10-28 Thread Richard Henderson
On 10/28/2013 01:58 PM, Kirill Yukhin wrote: > Hello Richard, > On 28 Oct 08:20, Richard Henderson wrote: >> Why is a masked *scalar* operation useful? > > The reason the instructions exist is so that > you can do fully fault correct predicated scalar algorithms. Using VEC_M

Re: [PATCH i386 4/8] [AVX512] [1/n] Add substed patterns.

2013-10-29 Thread Richard Henderson
On 10/29/2013 03:02 AM, Kirill Yukhin wrote: > Hello Richard, > > On 28 Oct 14:45, Richard Henderson wrote: >> On 10/28/2013 01:58 PM, Kirill Yukhin wrote: >>> Hello Richard, >>> On 28 Oct 08:20, Richard Henderson wrote: >>>> Why is a masked *sca

Re: [RFC PATCH] For TARGET_AVX use *mov_internal for misaligned loads

2013-10-30 Thread Richard Henderson
On 10/30/2013 02:47 AM, Jakub Jelinek wrote: > 2013-10-30 Jakub Jelinek > > * config/i386/i386.c (ix86_avx256_split_vector_move_misalign): If > op1 is misaligned_operand, just use *mov_internal insn > rather than UNSPEC_LOADU load. > (ix86_expand_vector_move_misalign): L

Re: [PATCH 1/n] Add conditional compare support

2013-10-30 Thread Richard Henderson
> +/* RCODE0, RCODE1 and a valid return value should be enum rtx_code. > + TCODE should be enum tree_code. > + Check whether two compares are a valid combination in the target to > generate > + a conditional compare. If valid, return the new compare after > combination. > + */ > +DEFHOOK
