Re: [PATCH] Add emulated gather capability to the vectorizer

2021-08-02 Thread Kewen.Lin via Gcc-patches
on 2021/8/2 5:11 PM, Richard Biener wrote: > On Mon, 2 Aug 2021, Kewen.Lin wrote: > >> on 2021/8/2 3:09 PM, Richard Biener wrote: >>> On Mon, 2 Aug 2021, Kewen.Lin wrote: >>> >>>> on 2021/7/30 10:04 PM, Kewen.Lin via Gcc-patches wrote: >>>>>

[PATCH] Add cond_add/sub/mul for vector integer modes.

2021-08-02 Thread liuhongt via Gcc-patches
Hi: This is a follow up of [1]. Bootstrapped and regtested on x86_64-linux-gnu{-m32,}. Pushed to trunk. [1] https://gcc.gnu.org/pipermail/gcc-patches/2021-August/576514.html gcc/ChangeLog: * config/i386/sse.md (cond_): New expander. (cond_mul): Ditto. gcc/testsuite

[PATCH v3] Make loops_list support an optional loop_p root

2021-08-03 Thread Kewen.Lin via Gcc-patches
> - Keep the linear search for LI_ONLY_INNERMOST with >> tree_root of fn loops. >> - Use class loop * instead of loop_p. >> >> Bootstrapped & regtested on powerpc64le-linux-gnu Power9 >> (with/without the hunk for LI_ONLY_INNERMOST linear search, >>

Ping: [PATCH v2] Analyze niter for until-wrap condition [PR101145]

2021-08-03 Thread guojiufu via Gcc-patches
Hi, I would like to have a ping on this. https://gcc.gnu.org/pipermail/gcc-patches/2021-July/574596.html BR, Jiufu On 2021-07-15 08:17, guojiufu via Gcc-patches wrote: Hi, I would like to have an early ping on this with more mail addresses. BR, Jiufu. On 2021-07-07 20:47, Jiufu Guo wrote

[PATCH] [i386] Refine predicate of peephole2 to general_reg_operand. [PR target/101743]

2021-08-03 Thread liuhongt via Gcc-patches
believe that the PR problem should be solved by this patch. Bootstrapped and regtested on x86_64-linux-gnu{-m32,}. Ok for trunk? gcc/ChangeLog: PR target/101743 * config/i386/i386.md (peephole2): Refine predicate from register_operand to general_reg_operand. --- gcc/config

[PATCH] [i386] Support cond_{fma, fms, fnma, fnms} for vector float/double under AVX512.

2021-08-03 Thread liuhongt via Gcc-patches
Hi: This patch adds expanders cond_{fma,fms,fnma,fnms} for vector float/double modes. Bootstrapped and regtested on x86_64-linux-gnu{-m32,}. Pushed to trunk. gcc/ChangeLog: * config/i386/sse.md (cond_fma): New expander. (cond_fms): Ditto. (cond_fnma): Ditto

[PATCH] Add dg-require-effective-target for testcases.

2021-08-03 Thread liuhongt via Gcc-patches
Hi: Pushed to trunk as an obvious fix. gcc/testsuite/ChangeLog: * gcc.target/i386/cond_op_addsubmul_d-2.c: Add dg-require-effective-target for avx512. * gcc.target/i386/cond_op_addsubmul_q-2.c: Ditto. * gcc.target/i386/cond_op_addsubmul_w-2.c: Ditto

Re: [PATCH v3] Make loops_list support an optional loop_p root

2021-08-04 Thread Kewen.Lin via Gcc-patches
ee_root->num == 0 >>> >>> and the walk_loop_tree could simply do >>> >>> class loop *exclude = flags & LI_INCLUDE_ROOT ? NULL : root; >>> >>> and pointer test aloop against exclude. That avoids the idea that >>> 'mn' is a

[PATCH 0/3] [i386] Support cond_{smax, smin, umax, umin, xor, ior, and} for vector modes under AVX512

2021-08-04 Thread liuhongt via Gcc-patches
under AVX512. [i386] Support cond_{smax,smin} for vector float/double modes under AVX512. [i386] Support cond_{xor,ior,and} for vector integer mode under AVX512. gcc/config/i386/sse.md| 54 + .../gcc.target/i386/cond_op_anylogic_d-1.c| 38

[PATCH 1/3] [i386] Support cond_{smax, smin, umax, umin} for vector integer modes under AVX512.

2021-08-04 Thread liuhongt via Gcc-patches
gcc/ChangeLog: * config/i386/sse.md (cond_): New expander. gcc/testsuite/ChangeLog: * gcc.target/i386/cond_op_maxmin_b-1.c: New test. * gcc.target/i386/cond_op_maxmin_b-2.c: New test. * gcc.target/i386/cond_op_maxmin_d-1.c: New test. * gcc.target/i386

[PATCH 3/3] [i386] Support cond_{xor, ior, and} for vector integer mode under AVX512.

2021-08-04 Thread liuhongt via Gcc-patches
gcc/ChangeLog: * config/i386/sse.md (cond_): New expander. gcc/testsuite/ChangeLog: * gcc.target/i386/cond_op_anylogic_d-1.c: New test. * gcc.target/i386/cond_op_anylogic_d-2.c: New test. * gcc.target/i386/cond_op_anylogic_q-1.c: New test. * gcc.target

[PATCH 2/3] [i386] Support cond_{smax, smin} for vector float/double modes under AVX512.

2021-08-04 Thread liuhongt via Gcc-patches
gcc/ChangeLog: * config/i386/sse.md (cond_): New expander. gcc/testsuite/ChangeLog: * gcc.target/i386/cond_op_maxmin_double-1.c: New test. * gcc.target/i386/cond_op_maxmin_double-2.c: New test. * gcc.target/i386/cond_op_maxmin_float-1.c: New test

[PATCH] rs6000: Add vec_unpacku_{hi,lo}_v4si

2021-08-04 Thread Kewen.Lin via Gcc-patches
tested on powerpc64le-linux-gnu P9 and powerpc64-linux-gnu P8. btw, the loop in unpack-vectorize-2.c doesn't get vectorized without this patch, unpack-vectorize-[13]* is to verify the vector merging and simplification works as expected. Is it ok for trunk? BR, Kewen - gcc/ChangeLog:

Re: [PATCH v3] Make loops_list support an optional loop_p root

2021-08-05 Thread Kewen.Lin via Gcc-patches
on 2021/8/4 8:04 PM, Richard Biener wrote: > On Wed, Aug 4, 2021 at 12:47 PM Kewen.Lin wrote: >> >> on 2021/8/4 6:01 PM, Richard Biener wrote: >>> On Wed, Aug 4, 2021 at 4:36 AM Kewen.Lin wrote: on 2021/8/3 8:08 PM, Richard Biener wrote: > On Fri, Jul 30, 2021 at 7:20 AM Kewen.Lin wro

[PATCH] Make sure we're playing with integral modes before call extract_integral_bit_field.

2021-08-05 Thread liuhongt via Gcc-patches
call to extract_integral_bit_field, extracting in an integer mode with the same size as 'mode' and then converting the result as (subreg:HF (reg:HI ...)). --- This is a separate patch as a follow up of upper comments. gcc/ChangeLog: * expmed.c (extract_bit_field_1): Wrap the call to

[PATCH] [rtl-optimization] Simplify vector shift/rotate with const_vec_duplicate to vector shift/rotate with const_int element.

2021-08-06 Thread liuhongt via Gcc-patches
Hi: Bootstrapped and regtested on x86_64-linux-gnu{-m32,} Ok for trunk? gcc/ChangeLog: PR rtl-optimization/101796 * simplify-rtx.c (simplify_context::simplify_binary_operation_1): Simplify vector shift/rotate with const_vec_duplicate to vector shift

[r12-2729 Regression] FAIL: g++.dg/cpp2a/concepts-pr67774.C -std=c++2a (test for excess errors) on Linux/x86_64

2021-08-06 Thread sunil.k.pandey via Gcc-patches
/concepts-pr67774.C -std=c++2a (test for excess errors) with GCC configured with To reproduce: $ cd {build_dir}/gcc && make check RUNTESTFLAGS="dg.exp=g++.dg/cpp2a/concepts-pr67774.C --target_board='unix{-m32}'" $ cd {build_dir}/gcc && make check RUNTESTFLAGS="

[r12-2733 Regression] FAIL: gcc.target/i386/vect-gather-1.c scan-tree-dump vect "loop vectorized" on Linux/x86_64

2021-08-06 Thread sunil.k.pandey via Gcc-patches
scan-tree-dump vect "loop vectorized" with GCC configured with To reproduce: $ cd {build_dir}/gcc && make check RUNTESTFLAGS="i386.exp=gcc.target/i386/vect-gather-1.c --target_board='unix{-m32}'" (Please do not reply to this email, for question about this r

[r12-2730 Regression] FAIL: g++.old-deja/g++.other/inline7.C -std=gnu++2a (test for excess errors) on Linux/x86_64

2021-08-06 Thread sunil.k.pandey via Gcc-patches
-std=gnu++17 (test for excess errors) FAIL: g++.old-deja/g++.other/inline7.C -std=gnu++2a (test for excess errors) with GCC configured with To reproduce: $ cd {build_dir}/gcc && make check RUNTESTFLAGS="old-deja.exp=g++.old-deja/g++.other/inline7.C --target_board='

[r12-2789 Regression] FAIL: gcc.dg/tree-ssa/gen-vect-11b.c scan-tree-dump-times vect "vectorized 0 loops" 1 on Linux/x86_64

2021-08-06 Thread sunil.k.pandey via Gcc-patches
-vect-11b.c scan-tree-dump-times vect "vectorized 0 loops" 1 with GCC configured with To reproduce: $ cd {build_dir}/gcc && make check RUNTESTFLAGS="tree-ssa.exp=gcc.dg/tree-ssa/gen-vect-11b.c --target_board='unix{-m32}'" $ cd {build_dir}/gcc &&

[r12-2766 Regression] FAIL: g++.dg/warn/Wstringop-overflow-6.C -std=gnu++2a (test for excess errors) on Linux/x86_64

2021-08-06 Thread sunil.k.pandey via Gcc-patches
/Wstringop-overflow-6.C -std=gnu++2a (test for excess errors) with GCC configured with To reproduce: $ cd {build_dir}/gcc && make check RUNTESTFLAGS="dg.exp=g++.dg/warn/Wstringop-overflow-6.C --target_board='unix{-m32}'" $ cd {build_dir}/gcc && make che

[PATCH v2] rs6000: Add vec_unpacku_{hi,lo}_v4si

2021-08-08 Thread Kewen.Lin via Gcc-patches
nux-gnu P9 and >> powerpc64-linux-gnu P8. >> >> btw, the loop in unpack-vectorize-2.c doesn't get vectorized >> without this patch, unpack-vectorize-[13]* is to verify >> the vector merging and simplification works expectedly. >> >> Is it ok for trunk? >>

[PATCH AArch64]Fix expanding of %w for *extend... pattern

2021-08-08 Thread bin.cheng via Gcc-patches
Hi, When playing with std::experimental::simd, I found a bug newly introduced in AArch64 backend. As commit message describes: Pattern "*extend2_aarch64" is duplicated from the corresponding zero_extend pattern, however % needs to be expanded according to its mode iterator

[PATCH] [i386] Support cond_ashr/lshr/ashl for vector integer modes under AVX512.

2021-08-09 Thread liuhongt via Gcc-patches
Hi: Bootstrapped and regtested on x86_64-linux-gnu{-m32,}. gcc/ChangeLog: * config/i386/sse.md (cond_): New expander. (VI248_AVX512VLBW): New mode iterator. * config/i386/predicates.md (nonimmediate_or_const_vec_dup_operand): New predicate. gcc/testsuite

[r12-2808 Regression] FAIL: gfortran.dg/ieee/pr77372.f90 -O (test for excess errors) on Linux/x86_64

2021-08-09 Thread sunil.k.pandey via Gcc-patches
: gfortran.dg/ieee/pr77372.f90 -O (test for excess errors) with GCC configured with To reproduce: $ cd {build_dir}/gcc && make check RUNTESTFLAGS="ieee.exp=gfortran.dg/ieee/dec_math_1.f90 --target_board='unix{-m32}'" $ cd {build_dir}/gcc && make check RUN

[PATCH] Extend ldexp{s, d}f3 to vscalefs{s, d} when TARGET_AVX512F and TARGET_SSE_MATH.

2021-08-10 Thread liuhongt via Gcc-patches
Hi: AVX512F supports vscalefs{s,d}, which is the same as ldexp except that the second operand should be floating point. Bootstrapped and regtested on x86_64-linux-gnu{-m32,}. gcc/ChangeLog: PR target/98309 * config/i386/i386.md (ldexp3): Extend to vscalefs[sd] when
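The relationship described in this snippet can be sketched in plain C. This is only an illustrative scalar model, not the patch's code: it assumes the scalef operation computes x * 2^floor(y), and the names floor_to_int, scalef_model and check_scalef_matches_ldexp are hypothetical.

```c
#include <assert.h>
#include <math.h>

/* Floor of y as an int. Hypothetical helper for this sketch only;
   it does no range checking and is meant for small |y|. */
static int floor_to_int(double y)
{
    int n = (int) y;                      /* truncates toward zero */
    return ((double) n > y) ? n - 1 : n;  /* step down for negative fractions */
}

/* Assumed scalar model of a scalef-style operation: x * 2^floor(y).
   The exponent arrives as a floating-point value, unlike ldexp. */
double scalef_model(double x, double y)
{
    int k = floor_to_int(y);
    double r = x;
    for (; k > 0; k--)
        r *= 2.0;   /* scaling by 2 is exact away from overflow */
    for (; k < 0; k++)
        r /= 2.0;   /* likewise exact away from underflow */
    return r;
}

/* With an integral exponent converted to double, the model agrees with ldexp. */
void check_scalef_matches_ldexp(void)
{
    for (int n = -8; n <= 8; n++)
        assert(scalef_model(1.5, (double) n) == ldexp(1.5, n));
}
```

Under this model, using the instruction for ldexp only needs the integer exponent converted to floating point first; a fractional second operand is floored, which ldexp itself never sees.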

[PATCH] rs6000: Add missing unsigned info for some P10 bifs

2021-08-10 Thread Kewen.Lin via Gcc-patches
tested on powerpc64le-linux-gnu P9 and powerpc64-linux-gnu P8. Is it ok for trunk? BR, Kewen - gcc/ChangeLog: * config/rs6000/rs6000-call.c (builtin_function_type): Add unsigned signedness for some Power10 bifs. --- gcc/config/rs6000/rs6000-call.c | 5 + 1 file c

[PATCH] [i386] Combine avx_vec_concatv16si and avx512f_zero_extendv16hiv16si2_1 to avx512f_zero_extendv16hiv16si2_2.

2021-08-10 Thread liuhongt via Gcc-patches
=6] avx_vec_concatv16si/2 vpmovzxwd %ymm0, %zmm0# 22[c=4 l=6] avx512f_zero_extendv16hiv16si2 ret # 25[c=0 l=1] simple_return_internal Bootstrapped and regtested on x86_64-linux-gnu{-m32,}. Ok for trunk? gcc/ChangeLog: PR target/101846

[PATCH] rs6000: Make some BIFs vectorized on P10

2021-08-10 Thread Kewen.Lin via Gcc-patches
ewen ----- gcc/ChangeLog: * config/rs6000/rs6000.c (rs6000_builtin_md_vectorized_function): Add support for some built-in functions vectorized on Power10. gcc/testsuite/ChangeLog: * gcc.target/powerpc/dive-vectorize-1.c: New test. * gcc.target/powerpc/dive-vectorize

Re: [PATCH] rs6000: Make some BIFs vectorized on P10

2021-08-11 Thread Kewen.Lin via Gcc-patches
This patch is to add the support to make vectorizer able to >> vectorize scalar version of some built-in functions with its >> corresponding vector version with Power10 support. >> >> Bootstrapped & regtested on powerpc64le-linux-gnu {P9,P10} >> and powerpc64-linux-gnu

[PATCH] [i386] Introduce a scalar version of avx512f_vmscalef and adjust ldexp3 for it.

2021-08-11 Thread liuhongt via Gcc-patches
Hi: This is the patch I'm going to check in. Bootstrapped and regtested on x86_64-linux-gnu{-m32,}; 2021-08-12 Uros Bizjak gcc/ChangeLog: PR target/98309 * config/i386/i386.md (avx512f_scalef2): New define_insn. (ldexp3): Adjust for new define

[PATCH] [i386] Optimize vec_perm_expr to match vpmov{dw,qd,wb}.

2021-08-11 Thread liuhongt via Gcc-patches
still keep this part of the code. Bootstrapped and regtested on x86_64-linux-gnu{-m32,}. Ok for trunk? gcc/ChangeLog: PR target/101846 * config/i386/i386-expand.c (expand_vec_perm_trunc_vinsert): New function. (ix86_vectorize_vec_perm_const): Call

Re: [PATCH] rs6000: Add missing unsigned info for some P10 bifs

2021-08-12 Thread Kewen.Lin via Gcc-patches
e won't have this kind of issue with your new support, nice!! FWIW, for now the bif vectorization still requires this type consistence to make type check happy. BR, Kewen > Thanks, > Bill > >> >> BR, >> Kewen >> - >> gcc/ChangeLog: >> >> * config/rs6000/rs6000-call.c (builtin_function_type): Add unsigned >> signedness for some Power10 bifs.

Re: [PATCH] rs6000: Make some BIFs vectorized on P10

2021-08-12 Thread Kewen.Lin via Gcc-patches
if (fn == MISC_BUILTIN_DIVDE) >> +vname = P10V_BUILTIN_DIVES_V2DI; >> + else >> +vname = P10V_BUILTIN_DIVEU_V2DI; >> + break; > > All of the above should not be builtin functions really, they are all > simple arithmetic :-( They should not be

Re: [PATCH] rs6000: Make some BIFs vectorized on P10

2021-08-12 Thread Kewen.Lin via Gcc-patches
on 2021/8/12 11:51 PM, Segher Boessenkool wrote: > On Thu, Aug 12, 2021 at 10:10:10AM +0800, Kewen.Lin wrote: >>> + enum rs6000_builtins vname = RS6000_BUILTIN_COUNT; >>> >>> Using this as a flag value looks unnecessary. Is this just being done to >>> silence a warning? >> >> Good question!

[r12-2898 Regression] FAIL: g++.dg/warn/uninit-1.C -std=gnu++98 (test for excess errors) on Linux/x86_64

2021-08-13 Thread sunil.k.pandey via Gcc-patches
=gnu++14 (test for excess errors) FAIL: g++.dg/warn/uninit-1.C -std=gnu++17 (test for excess errors) FAIL: g++.dg/warn/uninit-1.C -std=gnu++2a (test for excess errors) FAIL: g++.dg/warn/uninit-1.C -std=gnu++98 (test for excess errors) with GCC configured with To reproduce: $ cd {build_dir}/gcc

[PATCH] Fix PR c++/66590: incorrect warning "reaches end of non-void function" for switch

2021-08-13 Thread apinski--- via Gcc-patches
always fall through. Anyways this adds the code for the case of a CLEANUP_STMT that is only for !CLEANUP_EH_ONLY (the try/finally case). OK? Bootstrapped and tested on x86_64-linux-gnu with no regressions. gcc/cp/ChangeLog: * cp-objcp-common.c (cxx_block_may_fallthru): Handle

[PATCH 1/2] Add gimple_truth_valued_p to match.pd and use it

2021-08-13 Thread apinski--- via Gcc-patches
match pattern call gimple_truth_valued_p which matches on SSA_NAME and checks ssa_name_has_boolean_range. This is the first of the few cleanups I am going to do for match and simplify and boolean related changes. gcc/ChangeLog: * match.pd: New match, gimple_truth_valued_p. Use it for

[PATCH 2/2] Fix 101805: Simplify min/max of boolean arguments

2021-08-13 Thread apinski--- via Gcc-patches
From: Andrew Pinski I noticed this while Richard B. was fixing PR101756. Basically min of two bools is the same as doing an "and" and max of two bools is doing an "ior". gcc/ChangeLog: * match.pd: Add min/max patterns for bool types. gcc/testsuite/ChangeLog:
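The identity behind this patch can be checked exhaustively, since two bools have only four input combinations. The sketch below is an illustration in plain C, not the match.pd pattern itself; bool_min, bool_max and check_bool_minmax are hypothetical names.

```c
#include <assert.h>
#include <stdbool.h>

/* Minimum and maximum written the way a generic min/max would evaluate. */
bool bool_min(bool a, bool b) { return a < b ? a : b; }
bool bool_max(bool a, bool b) { return a > b ? a : b; }

/* Compare against bitwise and / inclusive or over all four inputs. */
void check_bool_minmax(void)
{
    for (int i = 0; i < 2; i++) {
        for (int j = 0; j < 2; j++) {
            bool a = (i != 0), b = (j != 0);
            assert(bool_min(a, b) == (a & b));  /* min is "and" */
            assert(bool_max(a, b) == (a | b));  /* max is "ior" */
        }
    }
}
```

The min is true only when both inputs are true, and the max is false only when both are false, which is exactly the truth table of "and" and "ior".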

[PATCH] Add range/nonzero info to generated ADD_OVERFLOW and simplify

2021-08-13 Thread apinski--- via Gcc-patches
From: Andrew Pinski Even though this does not change the generated code, it does improve the initial RTL generation. gcc/ChangeLog: * tree-ssa-math-opts.c (match_arith_overflow): Add range and nonzero bits information to the new overflow ssa name. Also fold the

Re: Ping: [PATCH v2] Analyze niter for until-wrap condition [PR101145]

2021-08-15 Thread Bin.Cheng via Gcc-patches
On Wed, Aug 4, 2021 at 10:42 AM guojiufu wrote: > > Hi, > > I would like to have a ping on this. > > https://gcc.gnu.org/pipermail/gcc-patches/2021-July/574596.html Sorry for being late in replying. > > BR, > Jiufu > > On 2021-07-15 08:17, guojiufu via Gcc-patch

[PATCH] [i386] Optimize __builtin_shuffle_vector.

2021-08-15 Thread liuhongt via Gcc-patches
d and regtested on x86_64-linux-gnu{-m32,}. Ok for trunk? gcc/ChangeLog: PR target/101846 * config/i386/i386-expand.c (ix86_expand_vec_perm_vpermt2): Support vpermi2b for V32QI/V16QImode. (ix86_extract_perm_from_pool_constant): New fun

[PATCH] vect: Add extraction cost for slp reduc

2021-08-15 Thread Kewen.Lin via Gcc-patches
and aarch64 is ongoing. Is it ok for trunk? BR, Kewen ----- gcc/ChangeLog: * tree-vect-slp.c (vectorizable_bb_reduc_epilogue): Add the cost for value extraction. diff --git a/gcc/tree-vect-slp.c b/gcc/tree-vect-slp.c index b9d88c2d943..841a0872afa 100644 --- a/gcc/tree-vect-s

Re: [PATCH] vect: Add extraction cost for slp reduc

2021-08-16 Thread Kewen.Lin via Gcc-patches
we cost the reduction as shuffle and reduc_op during SLP for now, I guess it's good to get vec_to_scalar considered here for consistency? Then it can be removed together when we have a better modeling in the end? BR, Kewen > > Richard. > >> BR, >> Kewen >>

[PATCH] [i386] Fix ICE.

2021-08-16 Thread liuhongt via Gcc-patches
Hi: avx512f_scalef2 only accepts register_operand for operands[1], so force it to reg in ldexp3. Bootstrapped and regtested on x86_64-linux-gnu{-m32,}. Ok for trunk. gcc/ChangeLog: PR target/101930 * config/i386/i386.md (ldexp3): Force operands[1] to reg. gcc

[r12-2919 Regression] FAIL: gcc.target/i386/pr82460-2.c scan-assembler-not \\mvpermi2b\\M on Linux/x86_64

2021-08-16 Thread sunil.k.pandey via Gcc-patches
\\mvpermi2b\\M with GCC configured with To reproduce: $ cd {build_dir}/gcc && make check RUNTESTFLAGS="i386.exp=gcc.target/i386/pr82460-2.c --target_board='unix{-m32\ -march=cascadelake}'" $ cd {build_dir}/gcc && make check RUNTESTFLAGS="i386.exp=gcc.target/i

[PATCH] [i386] Add x86 tune to enable v2df vector reduction by paddpd.

2021-08-17 Thread liuhongt via Gcc-patches
Hi: This patch adds a new x86 tune named X86_TUNE_V2DF_REDUCTION_PREFER_HADDPD to enable haddpd for v2df vector reduction, the tune is disabled by default. Bootstrapped and regtested on x86_64-linux-gnu{-m32,} Ok for trunk? gcc/ChangeLog: PR target/97147 * config/i386/i386

[PATCH] Revert "Add the member integer_to_sse to processor_cost as a cost simulation for movd/pinsrd. It will be used to calculate the cost of vec_construct."

2021-08-17 Thread liuhongt via Gcc-patches
This reverts commit 872da9a6f664a06d73c987aa0cb2e5b830158a10. PR target/101936 PR target/101929 Bootstrapped and regtested on x86_64-linux-gnu{-m32,} Pushed to master. --- gcc/config/i386/i386.c | 6 +- gcc/config/i386/i386.h | 1 - gcc/config/i386

[r12-3003 Regression] FAIL: g++.dg/analyzer/pr96641.C -std=c++98 (test for excess errors) on Linux/x86_64

2021-08-18 Thread sunil.k.pandey via Gcc-patches
++14 (test for excess errors) FAIL: g++.dg/analyzer/pr96641.C -std=c++17 (test for excess errors) FAIL: g++.dg/analyzer/pr96641.C -std=c++2a (test for excess errors) FAIL: g++.dg/analyzer/pr96641.C -std=c++98 (test for excess errors) with GCC configured with To reproduce: $ cd {build_dir}/gcc

[r12-3002 Regression] FAIL: gfortran.dg/analyzer/pr96949.f90 -O (test for excess errors) on Linux/x86_64

2021-08-18 Thread sunil.k.pandey via Gcc-patches
\) entry to 'make_boxed_int'.*\n\| NN \| \{.*\n\| NN \| boxed_int \*result = \(boxed_int \*\)wrapped_malloc \(sizeof \(boxed_int\)\);.*\n\| \| ~~~\n\| \|

Re: [PATCH] more warning code refactoring

2021-08-18 Thread Kewen.Lin via Gcc-patches
Hi David, on 2021/8/19 11:26 AM, David Edelsohn via Gcc-patches wrote: > Hi, Martin > > A few PowerPC-specific testcases started failing yesterday on AIX with > a strange failure mode: the compiler runs out of memory. As you may > expect from telling you this in an email reply t

Re: [PATCH] more warning code refactoring

2021-08-19 Thread Kewen.Lin via Gcc-patches
Hi Martin, on 2021/8/20 12:30 AM, Martin Sebor wrote: > On 8/19/21 9:03 AM, Martin Sebor wrote: >> On 8/18/21 11:56 PM, Kewen.Lin wrote: >>> Hi David, >>> >>> on 2021/8/19 11:26 AM, David Edelsohn via Gcc-patches wrote: >>>> Hi, Martin >>>

[r12-3052 Regression] FAIL: gcc.dg/analyzer/malloc-callbacks.c (test for excess errors) on Linux/x86_64

2021-08-21 Thread sunil.k.pandey via Gcc-patches
errors) with GCC configured with To reproduce: $ cd {build_dir}/gcc && make check RUNTESTFLAGS="analyzer.exp=gcc.dg/analyzer/malloc-callbacks.c --target_board='unix{-m32}'" $ cd {build_dir}/gcc && make check RUNTESTFLAGS="analyzer.exp=gcc.dg/analyzer

[PATCH] Disable slp in loop vectorizer when cost model is very-cheap.

2021-08-22 Thread liuhongt via Gcc-patches
mp2decoddata2 24.12 mp2decoddata3 10.83 mp2decoddata4 10.04 mp2decoddata5 10.07 Survived regression test. Ok for trunk? gcc/ChangeLog: PR tree-optimization/100089 * tree-vectorizer.c (try_vectorize_loop_1): Disable slp in loop vectorizer when cost model is

[PATCH] [i386] Fix ICE.

2021-08-23 Thread liuhongt via Gcc-patches
Bootstrapped and regtested on x86_64-linux-gnu{-m32,}. Pushed to trunk. gcc/ChangeLog: PR target/102016 * config/i386/sse.md (*avx512f_pshufb_truncv8hiv8qi_1): Add TARGET_AVX512BW to condition. gcc/testsuite/ChangeLog: PR target/102016 * gcc.target

[PATCH] [i386] Optimize (a & b) | (c & ~b) to vpternlog instruction.

2021-08-23 Thread liuhongt via Gcc-patches
should be equal to op1/op2 Bootstrapped and regtested on x86_64-linux-gnu{-m32,}. gcc/ChangeLog: PR target/101989 * config/i386/i386-protos.h (ix86_strip_reg_or_notreg_operand): New declare. * config/i386/i386.c (ix86_rtx_costs): Define cost for
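The truth-table immediate for an expression like the one in this subject can be derived with the usual constant trick: evaluate the expression on 0xF0, 0xCC and 0xAA. The sketch below is a software model under assumptions, not the patch's code: it assumes a, b and c map to the first, second and third lookup inputs (the operand order GCC actually chooses may differ), and select_bits, ternlog_imm_for_select and ternlog_model are hypothetical names.

```c
#include <assert.h>
#include <stdint.h>

/* Bitwise select: bits of a where b is 1, bits of c where b is 0. */
uint64_t select_bits(uint64_t a, uint64_t b, uint64_t c)
{
    return (a & b) | (c & ~b);
}

/* Truth-table byte for select_bits. Bit k of 0xF0, 0xCC and 0xAA holds
   the value of input a, b and c respectively for table row k. */
uint8_t ternlog_imm_for_select(void)
{
    return (uint8_t) select_bits(0xF0, 0xCC, 0xAA);
}

/* Software model of a three-input bitwise lookup: for every bit position,
   the three input bits form an index into the 8-bit immediate. */
uint64_t ternlog_model(uint64_t a, uint64_t b, uint64_t c, uint8_t imm)
{
    uint64_t r = 0;
    for (int i = 0; i < 64; i++) {
        unsigned idx = (unsigned) ((((a >> i) & 1) << 2)
                                   | (((b >> i) & 1) << 1)
                                   | ((c >> i) & 1));
        r |= (uint64_t) ((imm >> idx) & 1) << i;
    }
    return r;
}
```

With this input ordering the immediate works out to 0xE2, so one lookup replaces the and, andnot and or sequence; a different operand assignment would permute the table and give a different byte.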

Re: [PATCH][v2] Remove --param vect-inner-loop-cost-factor

2021-08-23 Thread Kewen.Lin via Gcc-patches
Hi Richi, on 2021/8/23 10:33 PM, Richard Biener via Gcc-patches wrote: > This removes --param vect-inner-loop-cost-factor in favor of looking > at the estimated number of iterations of the inner loop > when available and otherwise just assumes a single inner > iteration which is con

[PATCH] Fix a few problems with download_prerequisites.

2021-08-23 Thread apinski--- via Gcc-patches
"Darwin"|"FreeBSD"|"DragonFly"|"AIX") +chksum='shasum -a 512 --check' + ;; + "OpenBSD") +chksum='sha512 -c' + ;; + *) +chksum='sha512sum -c' + ;; +esac + ;; + md5) +

[PATCH] Change illegitimate constant into memref of constant pool in change_zero_ext.

2021-08-24 Thread liuhongt via Gcc-patches
u{-m32,}. Ok for trunk? gcc/ChangeLog: PR rtl-optimization/43147 * combine.c (recog_for_combine_1): Adjust comments of .. (change_zero_ext):.. this, and extend to change illegitimate constant into constant pool. gcc/testsuite/ChangeLog: PR rtl-

[PATCH] [i386] Enable avx512 embedded broadcast for vpternlog.

2021-08-24 Thread liuhongt via Gcc-patches
gcc/ChangeLog: PR target/101989 * config/i386/sse.md (_vternlog): Enable avx512 embedded broadcast. (*_vternlog_all): Ditto. (_vternlog_mask): Ditto. gcc/testsuite/ChangeLog: PR target/101989 * gcc.target/i386/pr101989-broadcast-1.c: New

[r12-3108 Regression] FAIL: gcc.target/i386/sse2-shiftqihi-constant-1.c scan-assembler-times pxor[^\n]*%xmm 1 on Linux/x86_64

2021-08-24 Thread sunil.k.pandey via Gcc-patches
hiftqihi-constant-1.c scan-assembler-times vpand[^\n]*%ymm 3 FAIL: gcc.target/i386/avx2-shiftqihi-constant-1.c scan-assembler-times vpxor[^\n]*%ymm 1 FAIL: gcc.target/i386/sse2-shiftqihi-constant-1.c scan-assembler-times pxor[^\n]*%xmm 1 with GCC configured with To reproduce: $ cd {build_dir}/gcc

[PATCH] Adjust testcases to avoid new failures brought by r12-3108 when compiled w -march=cascadelake.

2021-08-24 Thread liuhongt via Gcc-patches
Pushed to trunk as an obvious fix. gcc/testsuite/ChangeLog: PR target/101989 * gcc.target/i386/avx2-shiftqihi-constant-1.c: Add -mno-avx512f. * gcc.target/i386/sse2-shiftqihi-constant-1.c: Add -mno-avx --- gcc/testsuite/gcc.target/i386/avx2-shiftqihi-constant-1.c | 2

Re: Ping: [PATCH v2] Analyze niter for until-wrap condition [PR101145]

2021-08-24 Thread guojiufu via Gcc-patches
On 2021-08-16 09:33, Bin.Cheng wrote: On Wed, Aug 4, 2021 at 10:42 AM guojiufu wrote: ... >> diff --git a/gcc/testsuite/gcc.dg/vect/pr101145.inc >> b/gcc/testsuite/gcc.dg/vect/pr101145.inc >> new file mode 100644 >> index 000..6eed3fa8aca >> --- /d

Re: [PATCH v2] rs6000: Add vec_unpacku_{hi,lo}_v4si

2021-08-24 Thread Kewen.Lin via Gcc-patches
on 2021/8/24 9:02 PM, Segher Boessenkool wrote: > Hi Ke Wen, > > On Mon, Aug 09, 2021 at 10:53:00AM +0800, Kewen.Lin wrote: >> on 2021/8/6 9:10 PM, Bill Schmidt wrote: >>> On 8/4/21 9:06 PM, Kewen.Lin wrote: The existing vec_unpacku_{hi,lo} supports emulated unsigned unpacking for short an

Re: [PATCH] rs6000: Make some BIFs vectorized on P10

2021-08-24 Thread Kewen.Lin via Gcc-patches
imes we >> don't have a proper value for initialization, I agree it should be better to >> just leave it be, but IMHO it isn't the case here. :) > > ICEing is always wrong. A user should never see an ICE (not counting > "sorry"s as ICEs here -- not that those are good, but they tell the user > exactly what is going on). > Yeah, but here I was expecting the ICE happens when GCC developers are testing the newly added bif supports. :) BR, Kewen

Re: [PATCH] rs6000: Make some BIFs vectorized on P10

2021-08-24 Thread Kewen.Lin via Gcc-patches
es builtins that do not exist on 32-bit? >> >> Yeah, those bifs which are guarded with lp64 in their cases are only >> supported on 64-bit environment. > > It is a pity we cannot use "powerpc64" here (that selector does not test > what you would/could/should hope it

Re: Ping: [PATCH v2] Analyze niter for until-wrap condition [PR101145]

2021-08-24 Thread Bin.Cheng via Gcc-patches
On Wed, Aug 25, 2021 at 11:26 AM guojiufu wrote: > > On 2021-08-16 09:33, Bin.Cheng wrote: > > On Wed, Aug 4, 2021 at 10:42 AM guojiufu > > wrote: > >> > ... > >> >> diff --git a/gcc/testsuite/gcc.dg/vect/pr101145.inc > >> >> b/gcc/tes

Re: [PATCH, rs6000] Disable gimple fold for float or double vec_minmax when fast-math is not set

2021-08-25 Thread Kewen.Lin via Gcc-patches
Hi Haochen, on 2021/8/25 3:06 PM, HAO CHEN GUI via Gcc-patches wrote: > Hi, > > I refined the patch according to Bill's advice. I pasted the ChangeLog > and diff file here. If it doesn't work, please let me know. Thanks. > > 2021-08-25 Haochen Gui > > gcc

[r12-3144 Regression] FAIL: gcc.dg/tree-ssa/pr64130.c scan-tree-dump evrp "int \\[-8589934591, -2\\]" on Linux/x86_64

2021-08-25 Thread sunil.k.pandey via Gcc-patches
rp "\\[12, \\+INF" FAIL: gcc.dg/tree-ssa/pr64130.c scan-tree-dump evrp "int \\[2, 8589934591\\]" FAIL: gcc.dg/tree-ssa/pr64130.c scan-tree-dump evrp "int \\[-8589934591, -2\\]" with GCC configured with To reproduce: $ cd {build_dir}/gcc && make check RUNTES

[r12-3142 Regression] FAIL: 20_util/enable_shared_from_this/89303.cc (test for excess errors) on Linux/x86_64

2021-08-25 Thread sunil.k.pandey via Gcc-patches
(test for excess errors) FAIL: g++.dg/torture/pr89303.C -O1 (internal compiler error) FAIL: g++.dg/torture/pr89303.C -O1 (test for excess errors) with GCC configured with To reproduce: $ cd {build_dir}/x86_64-linux/libstdc++-v3/testsuite && make check RUNTESTFLAGS="conformanc

[PATCH] Fold more shuffle builtins to VEC_PERM_EXPR.

2021-08-25 Thread liuhongt via Gcc-patches
This patch is a follow-up to [1], it folds all shufps/shufpd builtins into gimple. Bootstrapped and regtested on x86_64-linux-gnu{-m32,}. [1] https://gcc.gnu.org/pipermail/gcc-patches/2019-May/521983.html gcc/ PR target/98167 PR target/43147 * config/i386/i386.c

[r12-3159 Regression] FAIL: gfortran.dg/pr68251.f90 -O (test for excess errors) on Linux/x86_64

2021-08-26 Thread sunil.k.pandey via Gcc-patches
-O (test for excess errors) with GCC configured with To reproduce: $ cd {build_dir}/gcc && make check RUNTESTFLAGS="compile.exp=gcc.c-torture/compile/960514-1.c --target_board='unix{-m32}'" $ cd {build_dir}/gcc && make check RUNTESTFLAGS="com

[PATCH] Check the type of mask while generating cond_op in gimple simplification.

2021-08-26 Thread liuhongt via Gcc-patches
than vectorized_internal_fn_supported_p. Bootstrapped and regtested on x86_64-linux-gnu{-m32,}. Ok for trunk? gcc/ChangeLog: PR middle-end/102080 * internal-fn.c (cond_vectorized_internal_fn_supported_p): New functions. * internal-fn.h (cond_vectorized_internal_fn_supported_p)

[r12-3159 Regression] FAIL: gfortran.dg/pr68251.f90 -O (test for excess errors) on Linux/x86_64

2021-08-28 Thread sunil.k.pandey via Gcc-patches
/debug/99402.cc (test for excess errors) FAIL: 25_algorithms/for_each/for_each_n_debug.cc (test for excess errors) with GCC configured with To reproduce: $ cd {build_dir}/x86_64-linux/libstdc++-v3/testsuite && make check RUNTESTFLAGS="conformance.exp=20_util/to_address/debug.cc -

Re: [PATCH] rs6000: Add missing unsigned info for some P10 bifs

2021-08-29 Thread Kewen.Lin via Gcc-patches
on 2021/8/11 1:44 PM, Kewen.Lin via Gcc-patches wrote: > Hi, > > This patch is to make prototypes of some Power10 built-in > functions consistent with what's in the documentation, as > well as the vector version. Otherwise, useless conversions > can be generated in gimple

Re: [PATCH] Set bound/cmp/control for until wrap loop.

2021-08-29 Thread guojiufu via Gcc-patches
define of them and requirements in determine_exit_conditions. This patch calculates niter->control, niter->bound and niter->cmp in number_of_iterations_until_wrap. The ICEs in the PR are fixed with this patch. Bootstrap and reg-tests pass on ppc64/ppc64le and x86. Is this ok for trunk? BR. Ji

[PATCH] [i386] Unify UNSPEC_MASKED_EQ/GT to the form of UNSPEC_PCMP.

2021-08-30 Thread liuhongt via Gcc-patches
define_insn_and_split to match the two forms respectively, this patch removes UNSPEC_MASKED_EQ/GT, unifying them into the form of UNSPEC_PCMP. Bootstrapped and regtested on x86_64-linux-gnu{-m32,}. Pushed to trunk. gcc/ChangeLog: * config/i386/sse.md (*_ucmp3_1): Change from

[PATCH] Fix PR 90142: contrib/download_prerequisites uses test ==

2021-08-30 Thread apinski--- via Gcc-patches
From: Andrew Pinski Since == is not portable, it is better to use = in contrib/ download_prerequisites. The only place == was used is inside the function md5_check which is used only on Mac OS X. Tested on Mac OS X as: ./contrib/download_prerequisites --md5 Both with all files having the correc

[PATCH] Fix PR driver/79181 (and others), not deleting some /tmp/cc* files for LTO.

2021-08-30 Thread apinski--- via Gcc-patches
regressions. gcc/ChangeLog: PR driver/79181 * collect-utils.c (setup_signals): New declaration. * collect-utils.h (setup_signals): New function. * collect2.c (handler): Delete. (main): Instead of manually setting up the signals, just call setup_signals

Re: [PATCH] Set bound/cmp/control for until wrap loop.

2021-08-30 Thread guojiufu via Gcc-patches
ICEs in the PR are fixed with this patch. > Bootstrap and reg-tests pass on ppc64/ppc64le and x86. > Is this ok for trunk? > > BR. > Jiufu Guo > Add ChangeLog: gcc/ChangeLog: 2021-08-30 Jiufu Guo PR tree-optimization/102087 * tree-ssa-loop-niter.c (number_of_

[PATCH] Fix gcc.dg/ipa/inline-8.c for -fPIC

2021-08-30 Thread apinski--- via Gcc-patches
From: Andrew Pinski The problem here is that with -fPIC, neither cmp nor move binds locally, so inlining them is not even attempted. This fixes the issue by marking both functions as static and now the testcase passes for both -fPIC and -fno-PIC cases. OK? Tested on x86_64-linux-gnu.
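The effect of the change can be illustrated with a small stand-alone sketch. It is not the contents of gcc.dg/ipa/inline-8.c: the bodies of cmp and move, and the names larger_of and check_larger_of, are hypothetical stand-ins. The point is only that helpers with internal linkage cannot be replaced by another definition at load time, so the compiler may consider them for inlining even in position-independent code.

```c
#include <assert.h>

/* Hypothetical helpers. 'static' gives them internal linkage, so they
   always bind locally regardless of -fPIC. */
static int cmp(int a, int b)
{
    return (a > b) - (a < b);   /* -1, 0 or 1 without overflow */
}

static void move(int *dst, const int *src)
{
    *dst = *src;
}

/* Externally visible caller; both helpers are inlining candidates here. */
int larger_of(int a, int b)
{
    int out;
    move(&out, cmp(a, b) >= 0 ? &a : &b);
    return out;
}

/* Small self-check of the caller's behavior. */
void check_larger_of(void)
{
    assert(larger_of(3, 7) == 7);
    assert(larger_of(9, 2) == 9);
}
```

Without static, the same two helpers built with -fPIC for a shared object could be interposed by another definition, which is why the inliner leaves calls to them alone.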

[PATCH 1/2] Revert "Make sure we're playing with integral modes before call extract_integral_bit_field."

2021-08-31 Thread liuhongt via Gcc-patches
This reverts commit 7218c2ec365ce95f5a1012a6eb425b0a36aec6bf. PR middle-end/102133 --- gcc/expmed.c | 103 +-- 1 file changed, 25 insertions(+), 78 deletions(-) diff --git a/gcc/expmed.c b/gcc/expmed.c index f083d6e86d0..3143f38e057 100644

[PATCH 0/2] Get rid of all float-int special cases in validate_subreg.

2021-08-31 Thread liuhongt via Gcc-patches
o see whether binaries are the same as HEAD~2, i guess they're the same. [1] https://gcc.gnu.org/pipermail/gcc-patches/2021-August/578189.html. liuhongt (2): Revert "Make sure we're playing with integral modes before call extract_integral_bit_field." Get rid of all f

[PATCH 2/2] Get rid of all float-int special cases in validate_subreg.

2021-08-31 Thread liuhongt via Gcc-patches
gcc/ChangeLog: * emit-rtl.c (validate_subreg): Get rid of all float-int special cases. --- gcc/emit-rtl.c | 40 1 file changed, 40 deletions(-) diff --git a/gcc/emit-rtl.c b/gcc/emit-rtl.c index ff3b4449b37..77ea8948ee8 100644 --- a/gcc

Re: [PATCH] testsuite: Fix gcc.dg/vect/pr101145* tests [PR101145]

2021-08-31 Thread guojiufu via Gcc-patches
PR tree-optimization/102072 * gcc.dg/vect/pr101145.c: Use dg-additional-options with just -O3 instead of dg-options with -O3 -fdump-tree-vect-details. * gcc.dg/vect/pr101145_1.c: Likewise. * gcc.dg/vect/pr101145_2.c: Likewise. * gcc.dg/vect/pr101145_3

Re:Re: [PATCH] libstdc++-v3: Check for TLS support on mingw

2021-08-31 Thread lhmouse via Gcc-patches
On 2021-08-31 17:02, Jonathan Wakely wrote: > It looks like my questions about this patch never got an answer, and > it never got applied. > > Could somebody say whether TLS is enabled for native *-*-mingw* > builds? If it is, then we definitely need to add GCC_CHECK_TLS to the > cross-compiler config

Re: [PATCH] libstdc++: use a link test to test for -Wl,-z,relro

2020-09-13 Thread JonY via Gcc-patches
On 9/10/20 2:23 PM, JonY wrote: > Do a link test instead of just a grep. The linker can > support multiple targets, but not all targets can use it. > > Cygwin/MinGW ld can support ELF but the PE format for Windows itself > does not support such a feature. Attached patch OK? > > I'm not confident

Re: [PATCH v2] rs6000: Expand vec_insert in expander instead of gimple [PR79251]

2020-09-14 Thread luoxhu via Gcc-patches
On 2020/9/10 18:08, Richard Biener wrote: > On Wed, Sep 9, 2020 at 6:03 PM Segher Boessenkool > wrote: >> >> On Wed, Sep 09, 2020 at 04:28:19PM +0200, Richard Biener wrote: >>> On Wed, Sep 9, 2020 at 3:49 PM Segher Boessenkool >>> wrote: Hi! On Tue, Sep 08, 2020 at 10:26:51

[PATCH]rs6000: Remove useless insns fed into lvx/stvx [PR97019]

2020-09-14 Thread Kewen.Lin via Gcc-patches
, they will remove all useless ANDs further. Bootstrapped/regtested on powerpc64le-linux-gnu P8. Is it OK for trunk? BR, Kewen - gcc/ChangeLog: * config/rs6000/rs6000-p8swap.c (insn_rtx_pair_t): New type. (find_alignment_op): Adjust to support multiple definitions which

[r11-3192 Regression] FAIL: libgomp.c++/udr-3.C execution test on Linux/x86_64 (-m64 -march=cascadelake)

2020-09-14 Thread sunil.k.pandey via Gcc-patches
execution test with GCC configured with ../../gcc/configure --prefix=/local/skpandey/gccwork/toolwork/gcc-bisect-master/master/r11-3192/usr --enable-clocale=gnu --with-system-zlib --with-demangler-in-ld --with-fpmath=sse --enable-languages=c,c++,fortran --enable-cet --without-isl --enable-libmpx

[r11-3192 Regression] FAIL: libgomp.c++/udr-13.C execution test on Linux/x86_64 (-m64 -march=cascadelake)

2020-09-14 Thread sunil.k.pandey via Gcc-patches
execution test with GCC configured with ../../gcc/configure --prefix=/local/skpandey/gccwork/toolwork/gcc-bisect-master/master/r11-3192/usr --enable-clocale=gnu --with-system-zlib --with-demangler-in-ld --with-fpmath=sse --enable-languages=c,c++,fortran --enable-cet --without-isl --enable-libmpx

[r11-3192 Regression] FAIL: libgomp.c++/udr-3.C execution test on Linux/x86_64 (-m64)

2020-09-14 Thread sunil.k.pandey via Gcc-patches
execution test with GCC configured with ../../gcc/configure --prefix=/local/skpandey/gccwork/toolwork/gcc-bisect-master/master/r11-3192/usr --enable-clocale=gnu --with-system-zlib --with-demangler-in-ld --with-fpmath=sse --enable-languages=c,c++,fortran --enable-cet --without-isl --enable-libmpx

[r11-3192 Regression] FAIL: libgomp.c++/udr-13.C execution test on Linux/x86_64 (-m64)

2020-09-14 Thread sunil.k.pandey via Gcc-patches
execution test with GCC configured with ../../gcc/configure --prefix=/local/skpandey/gccwork/toolwork/gcc-bisect-master/master/r11-3192/usr --enable-clocale=gnu --with-system-zlib --with-demangler-in-ld --with-fpmath=sse --enable-languages=c,c++,fortran --enable-cet --without-isl --enable-libmpx

Re: [PATCH v2] rs6000: Expand vec_insert in expander instead of gimple [PR79251]

2020-09-14 Thread luoxhu via Gcc-patches
"v" is global memory? I only see VAR_DECL and PARM_DECL, is there any function to check the tree variable is global? I added DECL_REGISTER, but the RTL still expands to stack: gcc/internal-fn.c: rtx to_rtx = expand_expr (view_op0, NULL_RTX, VOIDmode, EXPAND_WRITE); (gdb) p view_op0 $584 =

[PATCH v2] rs6000: Remove useless insns fed into lvx/stvx [PR97019]

2020-09-14 Thread Kewen.Lin via Gcc-patches
> - rtx and_operation = 0; >> + rtx and_operation = NULL_RTX; > > Don't change code randomly (to something arguably worse, even). Done. I may think too much and thought NULL_RTX may be preferred since it has the potential to be changed by defining it as nullptr in the current C++1

Re: [PATCH 3/4 v3] ivopts: Consider cost_step on different forms during unrolling

2020-09-15 Thread Kewen.Lin via Gcc-patches
Hi Hans, on 2020/9/6 10:47 AM, Hans-Peter Nilsson wrote: > On Tue, 1 Sep 2020, Bin.Cheng via Gcc-patches wrote: >>> Great idea! With explicitly specified -funroll-loops, it's bootstrapped >>> but the regression testing did show one failure (the only one): >>> &g

PING^2 [PATCH 1/4] unroll: Add middle-end unroll factor estimation

2020-09-15 Thread Kewen.Lin via Gcc-patches
Hi, Gentle ping this: https://gcc.gnu.org/pipermail/gcc-patches/2020-May/546698.html BR, Kewen on 2020/8/31 1:49 PM, Kewen.Lin via Gcc-patches wrote: > Hi, > > I'd like to gentle ping this since IVOPTs part is already to land. > > https://gcc.gnu.org/pipermail/gcc-patches/

[PATCH PR93334][RFC]Skip output dep if values stored are byte wise the same

2020-09-15 Thread bin.cheng via Gcc-patches
Hi, As suggested by PR93334 comments, this patch adds an interface identifying output dependence which can be skipped in terms of reordering and skip it in loop distribution. It also adds a new test case. Any comment? Thanks, bin 0001-Skip-output-dependence-if-values-stored-are-bytewise.patch D
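To illustrate the kind of dependence being described, here is a hedged sketch; it is not taken from the patch or its test case, and the function names are hypothetical. In `clear_both`, the two stores may hit the same byte when `a` and `b` overlap, which is an output dependence. Because both stores write the same byte value, the final memory contents do not depend on their order, so splitting the loop in two (as loop distribution would) gives the same result.

```c
#include <stddef.h>

/* Original form: a[i] and b[j] may refer to the same byte if the
   buffers overlap, so the two stores carry an output dependence. */
void clear_both(char *a, char *b, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        a[i] = 0;
        b[i] = 0;
    }
}

/* Distributed form: the stores are reordered across iterations.
   This is safe here only because every store writes the same byte
   value (zero), so the last writer of any byte does not matter. */
void clear_both_split(char *a, char *b, size_t n)
{
    for (size_t i = 0; i < n; i++)
        a[i] = 0;
    for (size_t i = 0; i < n; i++)
        b[i] = 0;
}
```

If the two stores wrote different values, the split form could leave a different byte in any overlapping position, and the dependence could not be skipped.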

[r11-3204 Regression] FAIL: g++.dg/vect/slp-pr87105.cc -std=c++2a scan-tree-dump-times slp2 "basic block part vectorized" 1 on Linux/x86_64 (-m64 -march=cascadelake)

2020-09-15 Thread sunil.k.pandey via Gcc-patches
an-tree-dump-times slp2 "basic block part vectorized" 1 FAIL: g++.dg/vect/slp-pr87105.cc -std=c++2a scan-tree-dump slp2 "vect_[^\rm]* = MIN" FAIL: g++.dg/vect/slp-pr87105.cc -std=c++2a scan-tree-dump-times slp2 "basic block part vectorized" 1 with GCC configured

[r11-3207 Regression] FAIL: gcc.dg/tree-ssa/20030807-10.c scan-tree-dump-times vrp1 " & 3" 1 on Linux/x86_64 (-m64 -march=cascadelake)

2020-09-15 Thread sunil.k.pandey via Gcc-patches
scan-tree-dump-times vrp1 " >> 2" 1 FAIL: gcc.dg/tree-ssa/20030807-10.c scan-tree-dump-times vrp1 " & 3" 1 with GCC configured with ../../gcc/configure --prefix=/local/skpandey/gccwork/toolwork/gcc-bisect-master/master/r11-3207/usr --enable-clocale=gnu --with-syst
