[PATCH] D151537: [NFC] Update cpu_specific test to use a newer CPU

2023-05-30 Thread LuoYuanke via Phabricator via cfe-commits
LuoYuanke added a comment. In D151537#4380763 , @erichkeane wrote: > I don't really see the justification here? Why do this change? If the > intent is to just test a newer architecture, we can add tests for that, not > change existing ones. KNL is d

[PATCH] D151537: [NFC] Update cpu_specific test to use a newer CPU

2023-05-29 Thread LuoYuanke via Phabricator via cfe-commits
LuoYuanke accepted this revision. LuoYuanke added a comment. This revision is now accepted and ready to land. LGTM Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D151537/new/ https://reviews.llvm.org/D151537 _

[PATCH] D147165: [Windows SEH] Fix catch+return crash for Windows -EHa

2023-03-30 Thread LuoYuanke via Phabricator via cfe-commits
LuoYuanke accepted this revision. LuoYuanke added a comment. This revision is now accepted and ready to land. LGTM, pls wait for 1 or 2 days in case there are comments from others Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D147165/new/ https://re

[PATCH] D141899: [IR][X86] Remove X86AMX type in LLVM IR instead of target extension

2023-02-26 Thread LuoYuanke via Phabricator via cfe-commits
LuoYuanke added a comment. > Deal. And because I am on busy for a long time and it is also better to let > intel guy handle x86-related feature, I am happy with the patch being > commandeered. @nikic's proposal looks like a promising solution; we can investigate it further. Repository: rG LLV

[PATCH] D143094: [clang] Change AMX macros to match names from GCC

2023-02-02 Thread LuoYuanke via Phabricator via cfe-commits
LuoYuanke added a comment. @pengfei, would you take a look to double-check? Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D143094/new/ https://reviews.llvm.org/D143094 ___ cfe-commits mailing list cfe-c

[PATCH] D143094: [clang] Change AMX macros to match names from GCC

2023-02-01 Thread LuoYuanke via Phabricator via cfe-commits
LuoYuanke accepted this revision. LuoYuanke added a comment. This revision is now accepted and ready to land. LGTM, thanks Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D143094/new/ https://reviews.llvm.org/D143094 _

[PATCH] D141899: [IR][X86] Remove X86AMX type in LLVM IR instead of target extension

2023-01-17 Thread LuoYuanke via Phabricator via cfe-commits
LuoYuanke added a comment. In D141899#4061237 , @zixuan-wu wrote: > With considering > https://llvm.org/docs/DeveloperPolicy.html#ir-backwards-compatibility I think > we need make consensus to choose one option from following 2 options. > > 1. Remove X8

[PATCH] D141899: [IR][X86] Remove X86AMX type in LLVM IR instead of target extension

2023-01-17 Thread LuoYuanke via Phabricator via cfe-commits
LuoYuanke added a comment. In D141899#4061150 , @zixuan-wu wrote: > In D141899#4058173 , @LuoYuanke > wrote: > >> @zixuan-wu, changing x86_amx would break our internal code. May I know the >> motivation to chang

[PATCH] D141899: [IR][X86] Remove X86AMX type in LLVM IR instead of target extension

2023-01-17 Thread LuoYuanke via Phabricator via cfe-commits
LuoYuanke added a comment. In D141899#4061150 , @zixuan-wu wrote: > In D141899#4058173 , @LuoYuanke > wrote: > >> @zixuan-wu, changing x86_amx would break our internal code. May I know the >> motivation to chang

[PATCH] D141899: [IR][X86] Remove X86AMX type in LLVM IR instead of target extension

2023-01-17 Thread LuoYuanke via Phabricator via cfe-commits
LuoYuanke added a comment. We need to consider how to stay compatible with existing software if we want to change the IR type. Some existing software is based on the existing type. For example, the AMX dialect of MLIR and TLX code are based on x86_amx; it would break them if w

[PATCH] D141899: [IR][X86] Remove X86AMX type in LLVM IR instead of target extension

2023-01-17 Thread LuoYuanke via Phabricator via cfe-commits
LuoYuanke added a comment. In D141899#4058683 , @lebedev.ri wrote: > In D141899#4058173 , @LuoYuanke > wrote: > >> @zixuan-wu, changing x86_amx would break our internal code. May I know the >> motivation to cha

[PATCH] D141899: [IR][X86] Remove X86AMX type in LLVM IR instead of target extension

2023-01-17 Thread LuoYuanke via Phabricator via cfe-commits
LuoYuanke added a comment. @zixuan-wu, changing x86_amx would break our internal code. May I know the motivation to change the type? Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D141899/new/ https://reviews.llvm.org/D141899 __

[PATCH] D99565: [X86] Support replacing aligned vector moves with unaligned moves when avx is enabled.

2023-01-12 Thread LuoYuanke via Phabricator via cfe-commits
LuoYuanke commandeered this revision. LuoYuanke edited reviewers, added: LiuChen3; removed: LuoYuanke. LuoYuanke added a subscriber: lebedev.ri. LuoYuanke added a comment. In D99565#4049330 , @lebedev.ri wrote: > This review seems to be stuck/dead, conside

[PATCH] D140281: [X86] Rename CMPCCXADD intrinsics.

2022-12-18 Thread LuoYuanke via Phabricator via cfe-commits
LuoYuanke accepted this revision. LuoYuanke added a comment. LGTM Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D140281/new/ https://reviews.llvm.org/D140281 ___ cfe-commits mailing list cfe-commits@lists

[PATCH] D138547: [X86][AMX] Fix typo of the headerfile.

2022-11-23 Thread LuoYuanke via Phabricator via cfe-commits
This revision was automatically updated to reflect the committed changes. Closed by commit rG55fceef61e0d: [X86][AMX] Fix typo of the headerfile. (authored by LuoYuanke). Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D138547/new/ https://reviews.llvm

[PATCH] D138547: [X86][AMX] Fix typo of the headerfile.

2022-11-23 Thread LuoYuanke via Phabricator via cfe-commits
LuoYuanke created this revision. Herald added a project: All. LuoYuanke requested review of this revision. Herald added a project: clang. Herald added a subscriber: cfe-commits. Repository: rG LLVM Github Monorepo https://reviews.llvm.org/D138547 Files: clang/lib/Headers/amxfp16intrin.h In

[PATCH] D111778: [WIP][X86] Update CPU_SPECIFIC list.

2022-10-24 Thread LuoYuanke via Phabricator via cfe-commits
LuoYuanke added a comment. It seems @craig.topper added support for __cpu_features2 in compiler-rt revision 94ccb2acbf2c5. Is there anything else that we need to address before landing the patch? Repository: rG LLVM Github Monorepo CHANG

[PATCH] D132329: [X86][RFC] Using `__bf16` for AVX512_BF16 intrinsics

2022-10-19 Thread LuoYuanke via Phabricator via cfe-commits
LuoYuanke accepted this revision. LuoYuanke added a comment. This revision is now accepted and ready to land. LGTM. Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D132329/new/ https://reviews.llvm.org/D132329

[PATCH] D132329: [X86][RFC] Using `__bf16` for AVX512_BF16 intrinsics

2022-10-18 Thread LuoYuanke via Phabricator via cfe-commits
LuoYuanke added inline comments. Comment at: clang/lib/Headers/avx512bf16intrin.h:13 +#ifdef __SSE2__ + What is this macro check used for? Comment at: clang/test/CodeGen/X86/avx512bf16-error.c:14 +__bfloat16 bar(__bfloat16 a, __bfloat16 b) {

[PATCH] D136040: [X86] Support PREFETCHI instructions

2022-10-17 Thread LuoYuanke via Phabricator via cfe-commits
LuoYuanke added inline comments. Comment at: clang/include/clang/Driver/Options.td:4651 def mno_popcnt : Flag<["-"], "mno-popcnt">, Group; +def mprefetchi : Flag<["-"], "mprefetchi">, Group; +def mno_prefetchi : Flag<["-"], "mno-prefetchi">, Group; I notice in l

[PATCH] D135930: [X86] Add AVX-NE-CONVERT instructions.

2022-10-13 Thread LuoYuanke via Phabricator via cfe-commits
LuoYuanke added inline comments. Comment at: clang/lib/Basic/Targets/X86.cpp:781 +Builder.defineMacro("__AVXNECONVERT__"); + Builder.defineMacro("__AVXNECONVERT_SUPPORTED__"); if (HasAVXVNNI) Do we need it here? Repository: rG LLVM Github Monorepo CH

[PATCH] D132329: [X86][RFC] Using `__bf16` for AVX512_BF16 intrinsics

2022-08-21 Thread LuoYuanke via Phabricator via cfe-commits
LuoYuanke added inline comments. Comment at: clang/lib/AST/MicrosoftMangle.cpp:2472 + case BuiltinType::BFloat16: +mangleArtificialTagType(TTK_Struct, "__bf16", {"__clang"}); This looks unrelated to the patch. Comment at: clang/test/Cod

[PATCH] D107082: [X86][RFC] Enable `_Float16` type support on X86 following the psABI

2022-06-10 Thread LuoYuanke via Phabricator via cfe-commits
LuoYuanke accepted this revision. LuoYuanke added a comment. This revision is now accepted and ready to land. LGTM, thanks. Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D107082/new/ https://reviews.llvm.org/D107082

[PATCH] D107082: [X86][RFC] Enable `_Float16` type support on X86 following the psABI

2022-06-08 Thread LuoYuanke via Phabricator via cfe-commits
LuoYuanke added inline comments. Comment at: llvm/test/CodeGen/X86/fpclamptosat.ll:569 ; CHECK-NEXT:cvttss2si %xmm0, %rax ; CHECK-NEXT:ucomiss {{\.?LCPI[0-9]+_[0-9]+}}(%rip), %xmm0 +; CHECK-NEXT:movabsq $-9223372036854775808, %rcx # imm = 0x8000

[PATCH] D107082: [X86][RFC] Enable `_Float16` type support on X86 following the psABI

2022-06-08 Thread LuoYuanke via Phabricator via cfe-commits
LuoYuanke added inline comments. Comment at: llvm/test/Analysis/CostModel/X86/fptoi_sat.ll:852 +; SSE2-NEXT: Cost Model: Found an estimated cost of 3 for instruction: %f16u1 = call i1 @llvm.fptoui.sat.i1.f16(half undef) +; SSE2-NEXT: Cost Model: Found an estimated cost of 5 fo

[PATCH] D107082: [X86][RFC] Enable `_Float16` type support on X86 following the psABI

2022-06-08 Thread LuoYuanke via Phabricator via cfe-commits
LuoYuanke added inline comments. Comment at: llvm/lib/Target/X86/X86ISelLowering.cpp:616 +setOperationAction(ISD::FROUNDEVEN, MVT::f16, Promote); +setOperationAction(ISD::FP_ROUND, MVT::f16, Expand); +setOperationAction(ISD::FP_EXTEND, MVT::f32, Expand); -

[PATCH] D127050: [Clang][FP16] Add 4 builtins for _Float16

2022-06-04 Thread LuoYuanke via Phabricator via cfe-commits
LuoYuanke added inline comments. Comment at: clang/test/CodeGen/builtin_Float16.c:7 +void test_float16_builtins(void) { + volatile _Float16 res; + pengfei wrote: > LuoYuanke wrote: > > Is _Float16 a legal type for target armv7a and aarch64? > Yes, see > https:/

[PATCH] D127050: [Clang][FP16] Add 4 builtins for _Float16

2022-06-04 Thread LuoYuanke via Phabricator via cfe-commits
LuoYuanke added inline comments. Comment at: clang/test/CodeGen/builtin_Float16.c:7 +void test_float16_builtins(void) { + volatile _Float16 res; + Is _Float16 a legal type for target armv7a and aarch64? Repository: rG LLVM Github Monorepo CHANGES SINCE LAST

[PATCH] D127050: [Clang][FP16] Add 4 builtins for _Float16

2022-06-04 Thread LuoYuanke via Phabricator via cfe-commits
LuoYuanke added inline comments. Comment at: clang/include/clang/Basic/Builtins.def:145 BUILTIN(__builtin_huge_vall, "Ld", "nc") +BUILTIN(__builtin_huge_valf16, "x", "nc") BUILTIN(__builtin_huge_valf128, "LLd", "nc") Are the builtins sorted in alphabetical order? R

[PATCH] D122567: [X86][AMX] enable amx cast intrinsics in FE.

2022-04-01 Thread LuoYuanke via Phabricator via cfe-commits
This revision was automatically updated to reflect the committed changes. Closed by commit rG979d876bb4e9: [X86][AMX] enable amx cast intrinsics in FE. (authored by LuoYuanke). Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D122567/new/ https://review

[PATCH] D122567: [X86][AMX] enable amx cast intrinsics in FE.

2022-03-29 Thread LuoYuanke via Phabricator via cfe-commits
LuoYuanke added inline comments. Comment at: clang/lib/CodeGen/CGBuiltin.cpp:5413-5415 +if (PTy->isX86_AMXTy()) + ArgValue = Builder.CreateIntrinsic(Intrinsic::x86_cast_vector_to_tile, + {ArgValue->getType()}, {ArgVal

[PATCH] D99152: [AMX] Prototype for vector and amx bitcast.

2022-03-28 Thread LuoYuanke via Phabricator via cfe-commits
LuoYuanke added a comment. Herald added a project: All. In D99152#3235546 , @lebedev.ri wrote: > What's the status here? Here is the patch D122567 . Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION htt

[PATCH] D122567: [X86][AMX] enable amx cast intrinsics in FE.

2022-03-28 Thread LuoYuanke via Phabricator via cfe-commits
LuoYuanke created this revision. Herald added a subscriber: pengfei. Herald added a project: All. LuoYuanke requested review of this revision. Herald added a project: clang. Herald added a subscriber: cfe-commits. We had some discussion in D99152 and llvm-dev an

[PATCH] D122104: [X86][regcall] Support passing / returning structures

2022-03-27 Thread LuoYuanke via Phabricator via cfe-commits
LuoYuanke accepted this revision. LuoYuanke added a comment. This revision is now accepted and ready to land. LGTM, thanks. Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D122104/new/ https://reviews.llvm.org/D122104

[PATCH] D122104: [X86][regcall] Support passing / returning structures

2022-03-26 Thread LuoYuanke via Phabricator via cfe-commits
LuoYuanke added inline comments. Comment at: clang/include/clang/CodeGen/CGFunctionInfo.h:590 + /// Log 2 of the maximum vector width. + unsigned MaxVectorWidth : 4; + I notice that some code indicates a log2 size with a Log2 suffix in the variable name. Do

[PATCH] D122104: [X86][regcall] Support passing / returning structures

2022-03-20 Thread LuoYuanke via Phabricator via cfe-commits
LuoYuanke added inline comments. Comment at: clang/include/clang/CodeGen/CGFunctionInfo.h:744 + void setMaxVectorWidth(unsigned Width) { + MaxVectorWidth = Width ? llvm::countTrailingZeros(Width) + 1 : 0; + } Use "Log2_32()"? Repository: rG LLVM Github
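
The setter quoted above stores a vector width as "log2 + 1" so that 0 can mean "no vectors". The following is a minimal standalone sketch of that encoding, not the code from the patch. The helper names countTrailingZeros32, log2Floor32, encodeMaxVectorWidth and decodeMaxVectorWidth are hypothetical stand-ins (the first two only approximate llvm::countTrailingZeros and llvm::Log2_32), and the decode direction is an assumption added for illustration. It suggests why the reviewer's Log2_32() proposal would be equivalent for power-of-two widths, which is presumably the only case that matters here.

```cpp
#include <cstdint>

// Hypothetical stand-in for llvm::countTrailingZeros on 32-bit values.
unsigned countTrailingZeros32(uint32_t V) {
  if (V == 0)
    return 32;
  unsigned N = 0;
  while ((V & 1u) == 0) {
    V >>= 1;
    ++N;
  }
  return N;
}

// Hypothetical stand-in for llvm::Log2_32 (floor of log2).
// Precondition in this sketch: V != 0.
unsigned log2Floor32(uint32_t V) {
  unsigned N = 0;
  while (V >>= 1)
    ++N;
  return N;
}

// Encode a width as log2(Width) + 1, reserving 0 for "no vector width".
unsigned encodeMaxVectorWidth(uint32_t Width) {
  return Width ? countTrailingZeros32(Width) + 1 : 0;
}

// Assumed inverse of the encoding above (not shown in the quoted snippet).
uint32_t decodeMaxVectorWidth(unsigned Encoded) {
  return Encoded ? (1u << (Encoded - 1)) : 0;
}
```

For a power of two, the trailing-zero count and the floor of log2 are the same number, so either helper yields the same encoded value. They differ for other inputs (for example 384), which is one reason a name carrying a Log2 suffix may state the intent more directly.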

[PATCH] D120307: [X86] Add helper enum for ternary intrinsics

2022-03-06 Thread LuoYuanke via Phabricator via cfe-commits
LuoYuanke accepted this revision. LuoYuanke added a comment. This revision is now accepted and ready to land. LGTM, but pls wait for 1 or 2 days to see if there are any comments from others. Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D120307/new/

[PATCH] D115199: [WIP][X86][AMX] Support amxpreserve attribute in clang.

2021-12-08 Thread LuoYuanke via Phabricator via cfe-commits
LuoYuanke updated this revision to Diff 393012. LuoYuanke added a comment. Herald added a subscriber: martong. Updating D115199 : [WIP][X86][AMX] Support amxpreserve attribute in clang. Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://

[PATCH] D115199: [X86][AMX] Support amxpreserve attribute in clang.

2021-12-06 Thread LuoYuanke via Phabricator via cfe-commits
LuoYuanke created this revision. Herald added a reviewer: aaron.ballman. LuoYuanke requested review of this revision. Herald added a project: clang. Herald added a subscriber: cfe-commits. Repository: rG LLVM Github Monorepo https://reviews.llvm.org/D115199 Files: clang/include/clang/Basic/A

[PATCH] D111037: [X86] Check if struct is blank before getting the inner types

2021-10-08 Thread LuoYuanke via Phabricator via cfe-commits
LuoYuanke accepted this revision. LuoYuanke added a comment. This revision is now accepted and ready to land. LGTM, thanks. Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D111037/new/ https://reviews.llvm.org/D111037

[PATCH] D111037: [X86] Check if struct is blank before getting the inner types

2021-10-07 Thread LuoYuanke via Phabricator via cfe-commits
LuoYuanke added inline comments. Comment at: clang/test/CodeGen/X86/avx512fp16-abi.c:207 +} pr52011() { + // CHECK-C: define{{.*}} { float, double } @pr52011 +} Why not test CPP as well? Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https:

[PATCH] D111037: [X86] Check if struct is blank before getting the inner types

2021-10-07 Thread LuoYuanke via Phabricator via cfe-commits
LuoYuanke added inline comments. Comment at: clang/test/CodeGen/X86/avx512fp16-abi.c:203 +struct { + float a; + struct {}; Add more cases for the struct composed of _Float16, float, double, struct {}? Repository: rG LLVM Github Monorepo CHANGES SINCE LAST

[PATCH] D109607: [X86] Refactor GetSSETypeAtOffset to fix pr51813

2021-09-15 Thread LuoYuanke via Phabricator via cfe-commits
LuoYuanke accepted this revision. LuoYuanke added a comment. This revision is now accepted and ready to land. LGTM, but pls wait 1 or 2 days for the comments from others. Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D109607/new/ https://reviews.llv

[PATCH] D109607: [X86] Refactor GetSSETypeAtOffset to fix pr51813

2021-09-15 Thread LuoYuanke via Phabricator via cfe-commits
LuoYuanke added inline comments. Comment at: clang/test/CodeGen/X86/avx512fp16-abi.c:153 +struct float2 { + struct {} s; + float a; Add a test case for "{ struct {}; half; struct {}; half;}"? Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION h

[PATCH] D109607: [X86] Refactor GetSSETypeAtOffset to fix pr51813

2021-09-15 Thread LuoYuanke via Phabricator via cfe-commits
LuoYuanke added inline comments. Comment at: clang/lib/CodeGen/TargetInfo.cpp:3421 +if (T0->isHalfTy()) + T1 = getFPTypeAtOffset(IRType, IROffset + 4, TD); +// If we can't get a second FP type, return a simple half or float. Not quite understanding w

[PATCH] D109658: [X86][FP16] Change the order of the operands in complex FMA intrinsics to allow swap between the mul operands.

2021-09-13 Thread LuoYuanke via Phabricator via cfe-commits
LuoYuanke added a comment. It seems that in this patch the builtin interface is aligned with the intrinsic interface. Since AVX512FP16 is pretty new, I assume nobody is using the GCC builtin. Can we ask the GCC developers to change their builtin interface? Repository: rG LLVM Github Monorepo CHANGES SINCE LAST A

[PATCH] D109607: [X86] Refactor GetSSETypeAtOffset to fix pr51813

2021-09-12 Thread LuoYuanke via Phabricator via cfe-commits
LuoYuanke added inline comments. Comment at: clang/lib/CodeGen/TargetInfo.cpp:3417 + llvm::Type *T1 = getFPTypeAtOffset(IRType, IROffset + NextFP, TD); + if (T1 == nullptr) { +if (NextFP == 2) Would you add comments on each case like previous code? ==

[PATCH] D109487: [X86] Support *_set1_pch(Float16 _Complex h)

2021-09-11 Thread LuoYuanke via Phabricator via cfe-commits
LuoYuanke accepted this revision. LuoYuanke added a comment. This revision is now accepted and ready to land. LGTM, thanks. Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D109487/new/ https://reviews.llvm.org/D109487

[PATCH] D105269: [X86] AVX512FP16 instructions enabling 6/6

2021-08-27 Thread LuoYuanke via Phabricator via cfe-commits
LuoYuanke accepted this revision. LuoYuanke added a comment. This revision is now accepted and ready to land. LGTM, but may wait 1 or 2 days for the comments from others. Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D105269/new/ https://reviews.llv

[PATCH] D105269: [X86] AVX512FP16 instructions enabling 6/6

2021-08-27 Thread LuoYuanke via Phabricator via cfe-commits
LuoYuanke added inline comments. Comment at: llvm/lib/Target/X86/X86ISelLowering.cpp:47419 + : X86ISD::VFCMADDC; + // FIXME: How we handle when FMF of FADD is different from CFMUL's? + CFmul = DAG.getNode(newOp, SDLoc(N), CV

[PATCH] D105269: [X86] AVX512FP16 instructions enabling 6/6

2021-08-27 Thread LuoYuanke via Phabricator via cfe-commits
LuoYuanke added inline comments. Comment at: llvm/lib/Target/X86/X86ISelLowering.cpp:47419 + : X86ISD::VFCMADDC; + // FIXME: How we handle when FMF of FADD is different from CFMUL's? + CFmul = DAG.getNode(newOp, SDLoc(N), CV

[PATCH] D105269: [X86] AVX512FP16 instructions enabling 6/6

2021-08-26 Thread LuoYuanke via Phabricator via cfe-commits
LuoYuanke added inline comments. Comment at: llvm/lib/Target/X86/X86InstrAVX512.td:13640 +(v4f32 (OpNode VR128X:$src1, VR128X:$src2)), +0, 0, 0, X86selects, "@earlyclobber $dst">, Sched<[sched.XMM]>; +defm rm : AVX512_maskable

[PATCH] D105269: [X86] AVX512FP16 instructions enabling 6/6

2021-08-26 Thread LuoYuanke via Phabricator via cfe-commits
LuoYuanke added inline comments. Comment at: llvm/lib/Target/X86/X86InstrFoldTables.cpp:1852 + { X86::VFCMULCPHZrr, X86::VFCMULCPHZrm, 0 }, + { X86::VFCMULCSHZrr, X86::VFCMULCSHZrm, TB_NO_REVERSE }, { X86::VFMADDPD4Yrr,

[PATCH] D105269: [X86] AVX512FP16 instructions enabling 6/6

2021-08-26 Thread LuoYuanke via Phabricator via cfe-commits
LuoYuanke added inline comments. Comment at: llvm/lib/Target/X86/X86InstrFoldTables.cpp:1852 + { X86::VFCMULCPHZrr, X86::VFCMULCPHZrm, 0 }, + { X86::VFCMULCSHZrr, X86::VFCMULCSHZrm, TB_NO_REVERSE }, { X86::VFMADDPD4Yrr,

[PATCH] D105269: [X86] AVX512FP16 instructions enabling 6/6

2021-08-26 Thread LuoYuanke via Phabricator via cfe-commits
LuoYuanke added inline comments. Comment at: clang/test/CodeGen/X86/avx512fp16-builtins.c:4223 + +// CFC ADD PH + MADD? Comment at: clang/test/CodeGen/X86/avx512fp16-builtins.c:4315 + +// CF ADD PH + MADD? Co

[PATCH] D105268: [X86] AVX512FP16 instructions enabling 5/6

2021-08-21 Thread LuoYuanke via Phabricator via cfe-commits
LuoYuanke accepted this revision. LuoYuanke added a comment. This revision is now accepted and ready to land. LGTM, may wait 1 or 2 days for comments from others. Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D105268/new/ https://reviews.llvm.org/D1

[PATCH] D108509: [X86][AMX] Add missing inline attributes in AMX intrinsics. NFCI

2021-08-21 Thread LuoYuanke via Phabricator via cfe-commits
LuoYuanke accepted this revision. LuoYuanke added a comment. This revision is now accepted and ready to land. LGTM, thanks! Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D108509/new/ https://reviews.llvm.org/D108509

[PATCH] D108422: [NFC][clang] Move remaining part of X86Target.def to llvm/Support/X86TargetParser.def

2021-08-21 Thread LuoYuanke via Phabricator via cfe-commits
LuoYuanke added a comment. >> Thanks for reminding. We've supported -march=${CPU}, but forgot to update >> this table. We will update it. > > Shall we get this patch committed first before making any changes? Yes, committing the patch first looks good to me. Repository: rG LLVM Github Monore

[PATCH] D108422: [NFC][clang] Move remaining part of X86Target.def to llvm/Support/X86TargetParser.def

2021-08-20 Thread LuoYuanke via Phabricator via cfe-commits
LuoYuanke added a comment. In D108422#2957541 , @erichkeane wrote: > In D108422#2957528 , @RKSimon wrote: > >> There's nothing later than CannonLake here - does Intel need to at least >> reference up to Tiger/Ro

[PATCH] D105267: [X86] AVX512FP16 instructions enabling 4/6

2021-08-19 Thread LuoYuanke via Phabricator via cfe-commits
LuoYuanke accepted this revision. LuoYuanke added a comment. This revision is now accepted and ready to land. LGTM, thanks. May wait 1 or 2 days for the comments from others. Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D105267/new/ https://reviews

[PATCH] D105268: [X86] AVX512FP16 instructions enabling 5/6

2021-08-19 Thread LuoYuanke via Phabricator via cfe-commits
LuoYuanke added a comment. I understand now. Thanks, Craig. Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D105268/new/ https://reviews.llvm.org/D105268 ___ cfe-commits mailing list cfe-commits@lists.llvm.

[PATCH] D105268: [X86] AVX512FP16 instructions enabling 5/6

2021-08-19 Thread LuoYuanke via Phabricator via cfe-commits
LuoYuanke added inline comments. Comment at: clang/include/clang/Basic/BuiltinsX86.def:2010 +TARGET_BUILTIN(__builtin_ia32_vfmaddph, "V8xV8xV8xV8x", "ncV:128:", "avx512fp16,avx512vl") +TARGET_BUILTIN(__builtin_ia32_vfmaddph256, "V16xV16xV16xV16x", "ncV:256:", "avx512fp16,avx512

[PATCH] D105267: [X86] AVX512FP16 instructions enabling 4/6

2021-08-18 Thread LuoYuanke via Phabricator via cfe-commits
LuoYuanke added inline comments. Comment at: llvm/lib/Target/X86/X86ISelLowering.cpp:1920 + setOperationAction(ISD::STRICT_FTRUNC, VT, Legal); + setOperationAction(ISD::FRINT, VT, Legal); + setOperationAction(ISD::STRICT_FRINT, VT, Legal); -

[PATCH] D105267: [X86] AVX512FP16 instructions enabling 4/6

2021-08-18 Thread LuoYuanke via Phabricator via cfe-commits
LuoYuanke added inline comments. Comment at: llvm/lib/Target/X86/X86ISelLowering.cpp:1920 + setOperationAction(ISD::STRICT_FTRUNC, VT, Legal); + setOperationAction(ISD::FRINT, VT, Legal); + setOperationAction(ISD::STRICT_FRINT, VT, Legal); -

[PATCH] D105267: [X86] AVX512FP16 instructions enabling 4/6

2021-08-17 Thread LuoYuanke via Phabricator via cfe-commits
LuoYuanke added inline comments. Comment at: clang/include/clang/Basic/BuiltinsX86.def:1897 + +TARGET_BUILTIN(__builtin_ia32_rndscaleph_128_mask, "V8xV8xIiV8xUc", "ncV:128:", "avx512fp16,avx512vl") +TARGET_BUILTIN(__builtin_ia32_rndscaleph_256_mask, "V16xV16xIiV16xUs", "ncV:256

[PATCH] D105265: [X86] AVX512FP16 instructions enabling 3/6

2021-08-16 Thread LuoYuanke via Phabricator via cfe-commits
LuoYuanke accepted this revision. LuoYuanke added a comment. This revision is now accepted and ready to land. LGTM, but wait 1 or 2 days for the comments from others. Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D105265/new/ https://reviews.llvm.or

[PATCH] D105331: [CFE][X86] Enable complex _Float16 support

2021-08-16 Thread LuoYuanke via Phabricator via cfe-commits
LuoYuanke accepted this revision. LuoYuanke added a comment. This revision is now accepted and ready to land. LGTM, but wait 1 or 2 days to see if there are any comments from others. Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D105331/new/ https://

[PATCH] D105331: [CFE][X86] Enable complex _Float16.

2021-08-16 Thread LuoYuanke via Phabricator via cfe-commits
LuoYuanke added a comment. Could you add a link to the ABI in the commit message? Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D105331/new/ https://reviews.llvm.org/D105331 ___ cfe-commits mailing list

[PATCH] D105265: [X86] AVX512FP16 instructions enabling 3/6

2021-08-16 Thread LuoYuanke via Phabricator via cfe-commits
LuoYuanke added a comment. Thanks to Craig for the clarification! Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D105265/new/ https://reviews.llvm.org/D105265 ___ cfe-commits mailing list cfe-commits@lists.llv

[PATCH] D105265: [X86] AVX512FP16 instructions enabling 3/6

2021-08-15 Thread LuoYuanke via Phabricator via cfe-commits
LuoYuanke added inline comments. Comment at: llvm/lib/Target/X86/X86ISelLowering.cpp:1955 setOperationAction(ISD::SCALAR_TO_VECTOR, MVT::v32f16, Custom); + setOperationAction(ISD::SINT_TO_FP, MVT::v32i16, Legal); + setOperationAction(ISD::STRICT

[PATCH] D105264: [X86] AVX512FP16 instructions enabling 2/6

2021-08-12 Thread LuoYuanke via Phabricator via cfe-commits
LuoYuanke accepted this revision. LuoYuanke added a comment. This revision is now accepted and ready to land. LGTM, but may wait 1 or 2 days for the comments from others. Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D105264/new/ https://reviews.llv

[PATCH] D105264: [X86] AVX512FP16 instructions enabling 2/6

2021-08-12 Thread LuoYuanke via Phabricator via cfe-commits
LuoYuanke added inline comments. Comment at: clang/lib/Headers/avx512vlfp16intrin.h:368 +_mm256_reduce_add_ph(__m256h __W) { + return __builtin_ia32_reduce_fadd_ph256(0.0f16, __W); +} From https://llvm.org/docs/LangRef.html#llvm-vector-reduce-add-intrinsic, -0.
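
The comment above is cut off at "-0.", so its exact wording is unknown. Assuming it refers to the LangRef note that negative zero is the neutral value for floating-point addition, the point can be illustrated with a small sketch. It uses float rather than _Float16 for portability, and reduceFAdd and reducesToNegativeZero are hypothetical helpers, not LLVM or Clang APIs. It models a sequential add reduction with an explicit start value, under default rounding and without fast-math.

```cpp
#include <cmath>
#include <vector>

// Hypothetical helper: sequential add reduction with an explicit start value.
float reduceFAdd(float Start, const std::vector<float> &V) {
  float Acc = Start;
  for (float X : V)
    Acc += X;
  return Acc;
}

// True if reducing V from Start yields negative zero specifically.
bool reducesToNegativeZero(float Start, const std::vector<float> &V) {
  float R = reduceFAdd(Start, V);
  return R == 0.0f && std::signbit(R);
}
```

With a start value of +0.0, a vector whose elements are all -0.0 reduces to +0.0, because (+0.0) + (-0.0) is +0.0 under round-to-nearest. Starting from -0.0 preserves the sign, which is why -0.0 rather than 0.0 acts as the identity here.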

[PATCH] D105265: [X86] AVX512FP16 instructions enabling 3/6

2021-08-12 Thread LuoYuanke via Phabricator via cfe-commits
LuoYuanke added inline comments. Comment at: clang/lib/Headers/avx512fp16intrin.h:1748 + +#define _mm_cvt_roundsh_i32(A, R) \ + (int)__builtin_ia32_vcvtsh2si32((__v8hf)(A), (int)(R)) Does it also return i32 in x86_64

[PATCH] D107946: [X86] Reverse *_set_ph and *_setr_ph 's set order.

2021-08-11 Thread LuoYuanke via Phabricator via cfe-commits
LuoYuanke added a comment. Any test case update? Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D107946/new/ https://reviews.llvm.org/D107946 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https:/

[PATCH] D105264: [X86] AVX512FP16 instructions enabling 2/6

2021-08-11 Thread LuoYuanke via Phabricator via cfe-commits
LuoYuanke added inline comments. Comment at: llvm/lib/Target/X86/X86InstrFoldTables.cpp:4838 { X86::VMULSDZrr_Intk,X86::VMULSDZrm_Intk, TB_NO_REVERSE }, + { X86::VMULSHZrr_Intk,X86::VMULSHZrm_Intk, TB_NO_REVERSE }, { X86::VMU

[PATCH] D105264: [X86] AVX512FP16 instructions enabling 2/6

2021-08-11 Thread LuoYuanke via Phabricator via cfe-commits
LuoYuanke added inline comments. Comment at: llvm/lib/Target/X86/AsmParser/X86AsmParser.cpp:3197 + else if (PatchedName.endswith("sh")) +PatchedName = IsVCMP ? "vcmpsh" : "cmpsh"; + else if (PatchedName.endswith("ph")) There is no cmpsh? =

[PATCH] D105264: [X86] AVX512FP16 instructions enabling 2/6

2021-08-10 Thread LuoYuanke via Phabricator via cfe-commits
LuoYuanke added inline comments. Comment at: clang/include/clang/Basic/BuiltinsX86.def:1860 +TARGET_BUILTIN(__builtin_ia32_minph512, "V32xV32xV32xIi", "ncV:512:", "avx512fp16") + +TARGET_BUILTIN(__builtin_ia32_minph256, "V16xV16xV16x", "ncV:256:", "avx512fp16,avx512vl

[PATCH] D105331: [CFE][X86] Enable complex _Float16.

2021-08-09 Thread LuoYuanke via Phabricator via cfe-commits
LuoYuanke added a comment. Would you check the failure of the test cases? Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D105331/new/ https://reviews.llvm.org/D105331 ___ cfe-commits mailing list cfe-commi

[PATCH] D105263: [X86] AVX512FP16 instructions enabling 1/6

2021-08-09 Thread LuoYuanke via Phabricator via cfe-commits
LuoYuanke accepted this revision. LuoYuanke added a comment. This revision is now accepted and ready to land. LGTM, but may wait 1 or 2 days for the comments from others. Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D105263/new/ https://reviews.llv

[PATCH] D105263: [X86] AVX512FP16 instructions enabling 1/6

2021-08-08 Thread LuoYuanke via Phabricator via cfe-commits
LuoYuanke added inline comments. Comment at: clang/lib/CodeGen/TargetInfo.cpp:3471 + ContainsFloatAtOffset(IRType, IROffset + 4, getDataLayout())) +return llvm::FixedVectorType::get(llvm::Type::getHalfTy(getVMContext()), 4); + For 2 float, return <2xflo

[PATCH] D105263: [X86] AVX512FP16 instructions enabling 1/6

2021-08-06 Thread LuoYuanke via Phabricator via cfe-commits
LuoYuanke added inline comments. Comment at: llvm/lib/Target/X86/X86InstrAVX512.td:4478 + let Predicates = [HasFP16] in { +def VMOVSHZrr_REV: AVX512<0x11, MRMDestReg, (outs VR128X:$dst), +(ins VR128X:$src1, VR128X:$src2), pengfei wrote: > craig.toppe

[PATCH] D105263: [X86] AVX512FP16 instructions enabling 1/6

2021-08-06 Thread LuoYuanke via Phabricator via cfe-commits
LuoYuanke added inline comments. Comment at: llvm/test/CodeGen/X86/vector-reduce-fmax-nnan.ll:374 +; SSE-NEXT:movl %edi, %ebp +; SSE-NEXT:movzwl %bx, %edi ; SSE-NEXT:callq __gnu_h2f_ieee@PLT Why does this test case change? Shall we add -mattr=+avx512fp16

[PATCH] D105263: [X86] AVX512FP16 instructions enabling 1/6

2021-08-06 Thread LuoYuanke via Phabricator via cfe-commits
LuoYuanke added inline comments. Comment at: llvm/lib/Target/X86/X86InstrAVX512.td:82 + PatFrags ScalarIntMemFrags = !if (!eq (EltTypeName, "f16"), + !cast("sse_load_f16"), + !if (!eq (EltTypeName, "f32"), -

[PATCH] D105263: [X86] AVX512FP16 instructions enabling 1/6

2021-08-05 Thread LuoYuanke via Phabricator via cfe-commits
LuoYuanke added inline comments. Comment at: llvm/lib/Target/X86/MCTargetDesc/X86MCCodeEmitter.cpp:801 // 0b00010: implied 0F 38 leading opcode bytes // 0b00011: implied 0F 3A leading opcode bytes // 0b00100-0b1: Reserved for future use Add commen

[PATCH] D105263: [X86] AVX512FP16 instructions enabling 1/6

2021-08-04 Thread LuoYuanke via Phabricator via cfe-commits
LuoYuanke added inline comments. Comment at: clang/lib/CodeGen/TargetInfo.cpp:3405 +/// half member at the specified offset. For example, {int,{half}} has a +/// float at offset 4. It is conservatively correct for this routine to return +/// false. float -> hal

[PATCH] D105269: [X86] AVX512FP16 instructions enabling 6/6

2021-07-14 Thread LuoYuanke via Phabricator via cfe-commits
LuoYuanke added inline comments. Comment at: llvm/test/CodeGen/X86/avx512cfma-intrinsics.ll:3 +; RUN: llc < %s -mtriple=x86_64-apple-darwin -mcpu=knl -mattr=+avx512bw -mattr=+avx512fp16 -mattr=+avx512vl | FileCheck %s + +declare <4 x float> @llvm.x86.avx512fp16.mask.vfmaddc.ph.1

[PATCH] D99675: [llvm][clang] Create new intrinsic llvm.arithmetic.fence to control FP optimization at expression level

2021-06-23 Thread LuoYuanke via Phabricator via cfe-commits
LuoYuanke accepted this revision. LuoYuanke added a comment. This revision is now accepted and ready to land. LGTM, but pls wait for 1 or 2 days to see if there is any more comments. Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D99675/new/ https://

[PATCH] D103784: [X86] Support __tile_stream_loadd intrinsic for new AMX interface

2021-06-09 Thread LuoYuanke via Phabricator via cfe-commits
LuoYuanke accepted this revision. LuoYuanke added a comment. This revision is now accepted and ready to land. LGTM. Thank you! Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D103784/new/ https://reviews.llvm.org/D103784 _

[PATCH] D103784: [X86] Support __tile_stream_loadd intrinsic for new AMX interface

2021-06-08 Thread LuoYuanke via Phabricator via cfe-commits
LuoYuanke added inline comments. Comment at: llvm/lib/Target/X86/X86FastTileConfig.cpp:124 bool X86FastTileConfig::isTileLoad(MachineInstr &MI) { + return MI.getOpcode() == X86::PTILELOADDV || Also add the stream load for X86PreAMXConfig.cpp: isTileLoad().

[PATCH] D99675: [llvm][clang] Create new intrinsic llvm.arith.fence to control FP optimization at expression level

2021-06-03 Thread LuoYuanke via Phabricator via cfe-commits
LuoYuanke added a comment. We may add description on the intrinsic in docs/LangRef.rst. Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D99675/new/ https://reviews.llvm.org/D99675 ___ cfe-commits mailing li

[PATCH] D101059: [X86][AMX] Add description for AMX new interface.

2021-04-27 Thread LuoYuanke via Phabricator via cfe-commits
This revision was landed with ongoing or failed builds. This revision was automatically updated to reflect the committed changes. Closed by commit rGd6c6db2feaab: [X86][AMX] Add description for AMX new interface. (authored by LuoYuanke). Repository: rG LLVM Github Monorepo CHANGES SINCE LAST A

[PATCH] D101059: [X86][AMX] Add description for AMX new interface.

2021-04-22 Thread LuoYuanke via Phabricator via cfe-commits
LuoYuanke updated this revision to Diff 339597. LuoYuanke added a comment. Fix some descriptions. Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D101059/new/ https://reviews.llvm.org/D101059 Files: clang/lib/Headers/amxintrin.h Index: clang/lib/H

[PATCH] D101059: [X86][AMX] Add description for AMX new interface.

2021-04-22 Thread LuoYuanke via Phabricator via cfe-commits
LuoYuanke created this revision. LuoYuanke requested review of this revision. Herald added a project: clang. Herald added a subscriber: cfe-commits. Repository: rG LLVM Github Monorepo https://reviews.llvm.org/D101059 Files: clang/lib/Headers/amxintrin.h Index: clang/lib/Headers/amxintrin.h

[PATCH] D99708: [X86] Enable compilation of user interrupt handlers.

2021-04-12 Thread LuoYuanke via Phabricator via cfe-commits
LuoYuanke accepted this revision. LuoYuanke added a comment. This revision is now accepted and ready to land. LGTM. But wait one or two days to see if there is more comments from Craig and HJ. CHANGES SINCE LAST ACTION https://reviews.llvm.org/D99708/new/ https://reviews.llvm.org/D99708 ___

[PATCH] D99708: [X86] Enable compilation of user interrupt handlers.

2021-04-06 Thread LuoYuanke via Phabricator via cfe-commits
LuoYuanke added a comment. LGTM. Thank you! CHANGES SINCE LAST ACTION https://reviews.llvm.org/D99708/new/ https://reviews.llvm.org/D99708 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-c

[PATCH] D99708: [X86] Enable compilation of user interrupt handlers.

2021-04-01 Thread LuoYuanke via Phabricator via cfe-commits
LuoYuanke added a comment. In D99708#2663989, @craig.topper wrote: > A user interrupt is different than a regular interrupt right? It doesn't make > sense that we would change the behavior of the interrupt calling convention > just because the user

[PATCH] D99152: [AMX] Prototype for vector and amx bitcast.

2021-03-31 Thread LuoYuanke via Phabricator via cfe-commits
LuoYuanke added a comment. > Unfortunately this is not possible to use an opaque type with the AMX > intrinsics at the moment, because of the way they are defined. It is possible > to use opaque types with intrinsics in general though, e.g. see > https://llvm.godbolt.org/z/Ezhf6535c > > My point

[PATCH] D99152: [AMX] Prototype for vector and amx bitcast.

2021-03-29 Thread LuoYuanke via Phabricator via cfe-commits
LuoYuanke added a comment. > Whether further optimizations are correct is a different problem, but we > need a specification for the builtins, intrinsics and the type before going > any further in that direction. > > I think you need to set the input to `LLVM IR`: > https://gcc.godbolt.org/z

[PATCH] D99152: [AMX] Prototype for vector and amx bitcast.

2021-03-29 Thread LuoYuanke via Phabricator via cfe-commits
LuoYuanke added a comment. > I think that point was not really clear during the discussion. Using `load > <256 x i32>` to lower `__tile_loadd() ` would indeed be incorrect. But I > don't think that's happening at the moment, at least going from a simple > example https://gcc.godbolt.org/z/KT5rc

[PATCH] D99152: [AMX] Prototype for vector and amx bitcast.

2021-03-24 Thread LuoYuanke via Phabricator via cfe-commits
LuoYuanke added a comment. In D99152#2647681, @fhahn wrote: > I can't see any `load <256 x i32>` in the linked example, just a store. Could > you check the example? I created another example at https://gcc.godbolt.org/z/v6od5ceEz. In bar() function, you
