Re: Question about generating vpmovzxbd instruction without using the interfaces in immintrin.h

2024-05-30 Thread Hongtao Liu via Gcc
r to directly calling > _mm256_cvtepu8_epi32? > > Or would it be easier if I modified the GIMPLE directly? But it seems > that there is no relevant expression or interface directly > corresponding to `vpmovzxbd` in GIMPLE. https://gcc.gnu.org/pipermail/gcc-patches/2024-May/652484.html We're working on a patch to optimize __builtin_convertvector; after that, it can be as optimal as the Intel intrinsic. > > Thanks > Hanke Zhang -- BR, Hongtao
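For illustration, a minimal sketch of the __builtin_convertvector route mentioned in the reply, written with GCC's generic vector extensions instead of immintrin.h. The type names and the helper `widen_u8_to_u32` are hypothetical. Whether the compiler actually emits `vpmovzxbd` for it depends on the GCC version and on flags such as -O2 -mavx2, so this shows the source-level form only:

```c
#include <stdint.h>

/* Generic vector types (GCC/Clang extension); names are illustrative. */
typedef uint8_t  v8u8  __attribute__((vector_size(8)));
typedef uint32_t v8u32 __attribute__((vector_size(32)));

/* Zero-extend eight unsigned bytes to eight 32-bit lanes.
   Vectors are moved through memory so the function signature does not
   depend on the vector ABI of the target. */
void widen_u8_to_u32(const uint8_t *src, uint32_t *dst)
{
    v8u8 in;
    v8u32 out;

    __builtin_memcpy(&in, src, sizeof in);
    /* Element-wise conversion; unsigned source elements are zero-extended. */
    out = __builtin_convertvector(in, v8u32);
    __builtin_memcpy(dst, &out, sizeof out);
}
```

Without AVX2 enabled the same source still compiles; the compiler just lowers the conversion to narrower operations.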

Re: /home/toon/compilers/gcc/libgfortran/generated/matmul_i1.c:1781:1: internal compiler error: RTL check: expected elt 0 type 'i' or 'n', have 'w' (rtx const_int) in vpternlog_redundant_operand_mask,

2023-08-06 Thread Hongtao Liu via Gcc
On Mon, Aug 7, 2023 at 9:38 AM Hongtao Liu wrote: > > On Mon, Aug 7, 2023 at 9:35 AM Hongtao Liu wrote: > > > > On Mon, Aug 7, 2023 at 2:08 AM Toon Moene wrote: > > > > > > Wonder if I am the only one to see this: > > > > > > https://gcc.gnu

Re: /home/toon/compilers/gcc/libgfortran/generated/matmul_i1.c:1781:1: internal compiler error: RTL check: expected elt 0 type 'i' or 'n', have 'w' (rtx const_int) in vpternlog_redundant_operand_mask,

2023-08-06 Thread Hongtao Liu via Gcc
On Mon, Aug 7, 2023 at 9:35 AM Hongtao Liu wrote: > > On Mon, Aug 7, 2023 at 2:08 AM Toon Moene wrote: > > > > Wonder if I am the only one to see this: > > > > https://gcc.gnu.org/pipermail/gcc-testresults/2023-August/792616.html Could you share your GCC configure,

Re: /home/toon/compilers/gcc/libgfortran/generated/matmul_i1.c:1781:1: internal compiler error: RTL check: expected elt 0 type 'i' or 'n', have 'w' (rtx const_int) in vpternlog_redundant_operand_mask,

2023-08-06 Thread Hongtao Liu via Gcc
19 2023 +0300 i386: eliminate redundant operands of VPTERNLOG As mentioned in PR 110202, GCC may be presented with input where control word of the VPTERNLOG intrinsic implies that some of its operands do not affect the result. In that case, we can eliminate redundant operands of the instruction by substituting any other operand in their place. This removes false dependencies. -- BR, Hongtao
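To make the "redundant operand" idea in the commit message concrete, here is a small sketch that treats the 8-bit VPTERNLOG control word as a truth table indexed by (A<<2)|(B<<1)|C and reports which operands can influence the result. The helper name `ternlog_used_operands` and the `USES_*` constants are hypothetical; this is not the code of GCC's vpternlog_redundant_operand_mask, only the underlying observation:

```c
#include <stdint.h>

/* Bit flags for the three VPTERNLOG source operands (illustrative names). */
enum { USES_A = 4, USES_B = 2, USES_C = 1 };

/* The control word is a truth table: bit (A<<2 | B<<1 | C) of imm gives the
   result for that combination of input bits.  An operand is redundant when
   flipping it never changes the selected table entry. */
unsigned ternlog_used_operands(uint8_t imm)
{
    unsigned used = 0;

    /* A selects between the high and low nibble of the table. */
    if ((imm >> 4) != (imm & 0x0F))
        used |= USES_A;
    /* B selects between entries {2,3,6,7} and {0,1,4,5}. */
    if (((imm >> 2) & 0x33) != (imm & 0x33))
        used |= USES_B;
    /* C selects between odd and even entries. */
    if (((imm >> 1) & 0x55) != (imm & 0x55))
        used |= USES_C;

    return used;
}
```

For example, a control word of 0xC0 (A AND B) never looks at C, so C can be replaced by any other operand to avoid a false dependency, as the commit message describes.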

Re: x86: making better use of vpternlog{d,q}

2023-05-24 Thread Hongtao Liu via Gcc
") > + (set_attr "prefix" "orig,vex,evex,evex") > (set (attr "mode") > (cond [(match_test "TARGET_AVX2") > (const_string "") > @@ -17119,7 +17124,11 @@ > (match_test "optimize_function_for_size_p (cfun)")) > (const_string "V4SF") > ] > - (const_string "")))]) > + (const_string ""))) > + (set (attr "enabled") > + (if_then_else (eq_attr "alternative" "3") > + (symbol_ref " == 64 ? TARGET_AVX512F : > TARGET_AVX512VL") > + (const_string "*")))]) > > ;; PR target/100711: Split notl; vpbroadcastd; vpand as vpbroadcastd; vpandn > (define_split -- BR, Hongtao

Re: [Intel SPR] Progress of GCC support for Intel SPR features

2022-02-06 Thread Hongtao Liu via Gcc
are-development-emulator.html. And please use GCC12(main trunk, not released yet), and binutils 2.38(main trunk, not released yet). > Thanks! > > yancheng > > >> Thanks for all the help, > >> > >> yancheng > >> -- BR, Hongtao

Re: _Float16-related failures on x86_64-apple-darwin

2021-12-23 Thread Hongtao Liu via Gcc
> > Sucks to have to fix headers… and we certainly can’t fix people’s code that > may depend on __FLT_EVAL_METHOD__ have well-defined values. So not convinced > this is the right approach. > > FX -- BR, Hongtao

RE: GCC/OpenMP offloading for Intel GPUs?

2021-09-15 Thread Liu, Hongtao via Gcc
>From: Thomas Schwinge >Sent: Wednesday, September 15, 2021 7:20 PM >To: Liu, Hongtao >Cc: gcc@gcc.gnu.org; Jakub Jelinek ; Tobias Burnus >; Kirill Yukhin ; Richard >Biener >Subject: RE: GCC/OpenMP offloading for Intel GPUs? > >Hi! > >On 2021-09-15T02:00:33+, "

RE: GCC/OpenMP offloading for Intel GPUs?

2021-09-14 Thread Liu, Hongtao via Gcc
Thomas Schwinge >Sent: Wednesday, September 15, 2021 12:57 AM >To: gcc@gcc.gnu.org >Cc: Jakub Jelinek ; Tobias Burnus >; Kirill Yukhin ; Liu, >Hongtao >Subject: GCC/OpenMP offloading for Intel GPUs? > >Hi! > >I've had a person ask about GCC/OpenMP offloadi

Re: Enable the vectorizer at -O2 for GCC 12

2021-09-06 Thread Hongtao Liu via Gcc
via Gcc > > Sent: Monday, August 30, 2021 2:05 PM > > To: gcc@gcc.gnu.org > > Cc: ja...@redhat.com; Richard Earnshaw ; > > Segher Boessenkool ; Richard Sandiford > > ; premachandra.malla...@amd.com; > > Hongtao Liu > > Subject: Enable the vectorizer at -O2

Re: Enable the vectorizer at -O2 for GCC 12

2021-08-30 Thread Hongtao Liu via Gcc
think it's good to turn this on by default for Power. The Intel side is also willing to enable O2 vectorization after measuring the performance impact on SPEC2017 and EEMBC. Meanwhile, we are investigating PR101908/PR101909/PR101910/PR92740, which report that O2 vectorization regresses extra benchmarks on znver and kabylake. > > BR, > Kewen -- BR, Hongtao

Re: Why vectorization didn't turn on by -O2

2021-08-05 Thread Hongtao Liu via Gcc
but the > > underlying problem that the PR exposed. Enabling this “BB SLP in loop > > vectorisation” code can lead to the generation of scalar COND_EXPRs even > > though we know that ifcvt doesn't have a proper cost model for deciding > > whether scalar COND_EXPRs are a win. > > > > Introducing scalar COND_EXPRs at -O3 is arguably an acceptable risk > > (although still dubious), but I think it's something we need to avoid > > for -O2, even if that means losing the optimisation. > > Yeah -- -O2 should almost always do the right thing, while -O3 can do > bad things more often, it just has to be better "on average". > > > Segher Move thread to gcc-patches and gcc -- BR, Hongtao

Re: Suboptimal code generated for __buitlin_ceil on AMD64 without SS4_4.1

2021-08-05 Thread Hongtao Liu via Gcc
rgument) > # = ceil(argument) > 32: 66 0f 73 d0 3f psrlq xmm0, 63 > 37: 66 0f 73 f0 3f psllq xmm0, 63 # xmm0 = (argument & -0.0) ? -0.0 : 0.0 > 3c: 66 0f 56 c3 orpd xmm0, xmm3 # xmm0 = ceil(argument) > 40: c3 .L0: ret > .end > > regards > Stefan -- BR, Hongtao
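The tail of the quoted listing isolates the sign bit of the argument (shift right by 63, then left by 63) and ORs it into the rounded value, so that inputs in (-1, 0] yield -0.0. A rough C sketch of that idea, assuming IEEE 754 binary64 doubles; the function name `ceil_sketch` and the truncate-then-adjust rounding step are assumptions for illustration, not a reconstruction of the elided part of the message:

```c
#include <stdint.h>
#include <string.h>

/* Ceiling without SSE4.1 ROUNDSD: truncate, adjust, then restore the sign. */
double ceil_sketch(double x)
{
    /* NaN, infinities and |x| >= 2^52 are already integral (or not numbers). */
    if (!(x > -0x1p52 && x < 0x1p52))
        return x;

    double r = (double)(long long)x;   /* truncate toward zero */
    if (r < x)
        r += 1.0;                      /* positive fraction: round up */

    /* Copy the sign bit of x into r, as the psrlq/psllq/orpd sequence does.
       This only matters when r is zero and x is negative (result -0.0). */
    uint64_t xb, rb;
    memcpy(&xb, &x, sizeof xb);
    memcpy(&rb, &r, sizeof rb);
    rb |= (xb >> 63) << 63;
    memcpy(&r, &rb, sizeof r);
    return r;
}
```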

Re: How to detect user uses -masm=intel?

2021-07-28 Thread Hongtao Liu via Gcc
-- BR, Hongtao

Re: [Questions] Is there any bit in gimple/rtl to indicate this IR support fast-math or not?

2021-07-14 Thread Hongtao Liu via Gcc
On Wed, Jul 14, 2021 at 4:17 PM Richard Biener wrote: > > On Wed, Jul 14, 2021 at 10:11 AM Hongtao Liu wrote: > > > > On Wed, Jul 14, 2021 at 3:49 PM Matthias Kretz wrote: > > > > > > On Wednesday, 14 July 2021 09:39:42 CEST Richard Biener wrote: > >

Re: [Questions] Is there any bit in gimple/rtl to indicate this IR support fast-math or not?

2021-07-14 Thread Hongtao Liu via Gcc
kretz.github.io > GSI Helmholtz Centre for Heavy Ion Research https://gsi.de > std::experimental::simd https://github.com/VcDevel/std-simd > ── -- BR, Hongtao

Re: [Questions] Is there any bit in gimple/rtl to indicate this IR support fast-math or not?

2021-07-14 Thread Hongtao Liu via Gcc
On Wed, Jul 14, 2021 at 2:39 PM Matthias Kretz wrote: > > On Wednesday, 14 July 2021 07:18:29 CEST Hongtao Liu via Gcc-help wrote: > > On Wed, Jul 14, 2021 at 1:15 PM Hongtao Liu wrote: > > > Hi: > > > The original problem was that some users wanted the cmdline

Re: [Questions] Is there any bit in gimple/rtl to indicate this IR support fast-math or not?

2021-07-13 Thread Hongtao Liu via Gcc
On Wed, Jul 14, 2021 at 1:15 PM Hongtao Liu wrote: > > Hi: > The original problem was that some users wanted the cmdline option > -ffast-math not to act on intrinsic production code, i.e., for code > like > > #include > __m256d > foo2 (__m256d a, __m256d b, __m256d

[Questions] Is there any bit in gimple/rtl to indicate this IR support fast-math or not?

2021-07-13 Thread Hongtao Liu via Gcc
marked with "no-fast-math". It seems more flexible to support fast-math control of a region (inside a function). Does GCC have a similar mechanism? -- BR, Hongtao
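For context, the closest thing GCC offers that I am aware of works per function, not per region: the `optimize` function attribute (or `#pragma GCC optimize`). A minimal sketch, assuming a GCC build that accepts the option string below; the function `kahan_sum` is only an illustration of code that must not be reassociated, and the GCC manual cautions that the `optimize` attribute is intended for debugging and is not suitable for production code:

```c
#include <stddef.h>

/* Ask GCC to compile this one function as if -fno-fast-math were given,
   even when the rest of the translation unit uses -ffast-math.
   How completely this overrides the command line may vary by GCC version. */
__attribute__((optimize("-fno-fast-math")))
double kahan_sum(const double *v, size_t n)
{
    double sum = 0.0;
    double comp = 0.0;               /* running compensation term */

    for (size_t i = 0; i < n; i++) {
        double y = v[i] - comp;
        double t = sum + y;
        comp = (t - sum) - y;        /* reassociation would fold this to 0 */
        sum = t;
    }
    return sum;
}
```

This gives function granularity only; it does not answer the question of controlling fast-math for a region inside a function.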

Re: Hongtao Liu as x86 vectorization maintainer

2021-06-22 Thread Hongtao Liu via Gcc
On Tue, Jun 22, 2021 at 3:58 PM Jakub Jelinek via Gcc wrote: > > On Mon, Jun 21, 2021 at 02:49:56AM +, Liu, Hongtao via Gcc wrote: > > >-Original Message- > > >From: Jason Merrill > > >Sent: Monday, June 21, 2021 10:07 AM > > >To: Liu,

RE: Hongtao Liu as x86 vectorization maintainer

2021-06-20 Thread Liu, Hongtao via Gcc
>-Original Message- >From: Jason Merrill >Sent: Monday, June 21, 2021 10:07 AM >To: Liu, Hongtao >Cc: gcc Mailing List ; Marek Polacek >Subject: Hongtao Liu as x86 vectorization maintainer > >I am pleased to announce that the GCC Steering Committee has a

Re: State of AutoFDO in GCC

2021-04-26 Thread Hongtao Yu via Gcc
with extended uses. Thanks, Hongtao From: Xinliang David Li Date: Monday, April 26, 2021 at 11:05 AM To: Andi Kleen Cc: Jan Hubicka , gcc@gcc.gnu.org , Wei Mi , Eugene Rozenfeld , Wenlei He , Hongtao Yu Subject: Re: State of AutoFDO in GCC On Mon, Apr 26, 2021 at 11:00 AM Andi Kleen

Re: Help with PR97872

2020-12-09 Thread Hongtao Liu via Gcc
> > > On Mon, 7 Dec 2020 at 17:37, Hongtao Liu wrote: > > > > > > On Mon, Dec 7, 2020 at 7:11 PM Prathamesh Kulkarni > > > wrote: > > > > > > > > On Mon, 7 Dec 2020 at 16:15, Hongtao Liu wrote: > > > > > > > > >

Re: Help with PR97872

2020-12-07 Thread Hongtao Liu via Gcc
On Mon, Dec 7, 2020 at 7:11 PM Prathamesh Kulkarni wrote: > > On Mon, 7 Dec 2020 at 16:15, Hongtao Liu wrote: > > > > On Mon, Dec 7, 2020 at 5:47 PM Richard Biener wrote: > > > > > > On Mon, 7 Dec 2020, Prathamesh Kulkarni wrote: > > > > > >

Re: Help with PR97872

2020-12-07 Thread Hongtao Liu via Gcc
} > > > > > > > > Before patch: > > > > baz: > > > > pcmpeqq %xmm1, %xmm0 > > > > pcmpeqd %xmm1, %xmm1 > > > > pandn %xmm1, %xmm0 > > > > ret > > > > > > > > Afte

Re: [IMPORTANT] ChangeLog related changes

2020-05-25 Thread Hongtao Liu via Gcc
Great, thanks! On Tue, May 26, 2020 at 2:08 PM Martin Liška wrote: > > On 5/26/20 7:22 AM, Hongtao Liu via Gcc wrote: > > I committed a separate patch only for ChangeLog files; should I revert > > it? > > Hello. > > I've just done it. > > Martin -- BR, Hongtao

Re: [IMPORTANT] ChangeLog related changes

2020-05-25 Thread Hongtao Liu via Gcc
afterwards it can be fixed by adjusting the ChangeLog > files. But you can only touch the ChangeLog files in that case (and > shouldn't write a ChangeLog entry for that in the commit message). > > If anything goes wrong, please let me, other RMs and Martin Liška know. > > Jakub > -- BR, Hongtao

Re: [9/10 Regression] [PR87833] Intel MIC (emulated) offloading still broken (was: GCC 9.0.1 Status Report (2019-04-25))

2019-05-07 Thread Hongtao Liu
> > > Well, I'm actually not asking for review of the WIP patch, but rather > > looking for someone to take on ownership/maintenance of the functionality > > of Intel MIC offloading. > > That would be indeed greatly appreciated. > > Jakub I don't know this guy ilya.ver...@intel.com. Do you know him/her, H.J? -- BR, Hongtao

Re: About new project

2013-01-27 Thread Hongtao Yu
On 1/27/2013 5:04 PM, Gerald Pfeifer wrote: On Sat, 26 Jan 2013, Hongtao Yu wrote: How can I set up a new project under GCC and make it open-sourced? Thanks! That depends on what you mean by "under GCC", I'd say. If you have improvements for GCC, submitting those as patches a

About new project

2013-01-26 Thread Hongtao Yu
Hi All, How can I set up a new project under GCC and make it open-sourced? Thanks! Cheers, Hongtao

Symbolic range analysis

2010-11-14 Thread Hongtao Yu
Hi All, Does GCC 4.6 perform symbolic range analysis or array section analysis? Thanks, Hongtao Yu Purdue University

Re: ipa on all files together

2010-11-01 Thread Hongtao
On 11/01/10 20:35, Diego Novillo wrote: > On Mon, Nov 1, 2010 at 19:57, Hongtao wrote: >> Hi All, >> >> While using gcc-4.6 with option -flto, I found that interprocedural >> analysis was performed on each source file separately. For example for >> the pas

ipa on all files together

2010-11-01 Thread Hongtao
there a way that can perform IPA on all source files together? Thanks, Hongtao Purdue University

Options for dumping dependence checking results

2010-10-14 Thread Hongtao
Hi All, What's the option for dumping the results of loop dependence checking? such as dependence relations, direction vectors, etc. Thanks, Hongtao

Re: Insert new global declaration to Gimple

2010-10-12 Thread Hongtao
Sorry, I mean if I had built a VAR_DECL node as well as its DECL_INITIAL, where should I place the node? Since it is a global variable, can I just build the VAR_DECL node without placing it in any container, say symbol table or anywhere else? Thanks, Hongtao On 10/12/10 10:11, Hongtao wrote

Insert new global declaration to Gimple

2010-10-12 Thread Hongtao
Dear All, How can I build a new global variable as well as its initializer on Gimple? Thanks, Hongtao Yu Purdue University

Map tree to properties

2010-10-04 Thread Hongtao
Hi All, Do we have a mechanism to map a tree or gimple to a series of properties so that we can transfer information from one pass to another? Thanks, Hongtao Purdue University

Re: About DECL_UID

2010-09-25 Thread Hongtao
On 09/25/10 16:48, Diego Novillo wrote: > On Sat, Sep 25, 2010 at 16:40, Hongtao wrote: > >> May the DECL_UID of any two local variables of two separate functions >> be the same during LTO? > No. DECL_UIDs are unique within a single translation unit. > OK, thanks

About DECL_UID

2010-09-25 Thread Hongtao
Hi All, May the DECL_UID of any two local variables of two separate functions be the same during LTO? Thanks, Hongtao Yu Purdue University

Interprocedural points-to analysis

2010-09-23 Thread Hongtao
Hi All, Has the interprocedural points-to analysis (pass-ipa-pta) been put into practice, i.e. using the ipa points-to set to aid optimizations? Thanks, Hongtao

Re: How to dump SSA in lto

2010-09-21 Thread Hongtao
Thanks very much. But I still want an option to dump the SSA form during or after LTO optimizations, such as -fdump-tree-... Hongtao On 09/21/10 10:07, Richard Guenther wrote: > On Tue, Sep 21, 2010 at 3:31 PM, Hongtao wrote: >> Hi All, >> >> >> I'm program

How to dump SSA in lto

2010-09-21 Thread Hongtao
Hi All, I'm programming in the LTO phase. How can I dump the SSA representation after an optimization of LTO? For example, if I would like to know the effect of interprocedural pointer analysis (pass_ipa_pta), how can I dump the SSA form after the pass? Thanks, Hongtao Yu Purdue University

About LTO merging symbol tables

2010-09-16 Thread Hongtao
node? Thanks, Hongtao Yu Purdue University

Re: Errors when invoking refs_may_alias_p_1

2010-08-27 Thread Hongtao
On 08/27/10 14:29, Richard Guenther wrote: > On Fri, Aug 27, 2010 at 8:24 PM, Hongtao wrote: > >> On 08/27/10 12:35, Richard Guenther wrote: >> >>> On Fri, Aug 27, 2010 at 5:27 PM, Hongtao wrote: >>> >>> >>>> Hi all, >

Re: Errors when invoking refs_may_alias_p_1

2010-08-27 Thread Hongtao
On 08/27/10 12:35, Richard Guenther wrote: > On Fri, Aug 27, 2010 at 5:27 PM, Hongtao wrote: > >> Hi all, >> >> I have instrumented a function call like foo(&a,&b) into the gimple SSA >> representation (gcc-4.5) and the consequent optimizations can not

Errors when invoking refs_may_alias_p_1

2010-08-27 Thread Hongtao
src/gcc/toplev.c:1065 #24 0x0081a1c5 in do_compile () at ../../src/gcc/toplev.c:2417 #25 0x0081a286 in toplev_main (argc=21, argv=0x7fffe0f8) at ../../src/gcc/toplev.c:2459 #26 0x00519c6b in main (argc=21, argv=0x7fffe0f8) at ../../src/gcc/main.c:35 Thanks, Hongtao Purdue University