[PATCH] D105265: [X86] AVX512FP16 instructions enabling 3/6

2021-08-15 Thread LuoYuanke via Phabricator via cfe-commits
LuoYuanke added inline comments. Comment at: llvm/lib/Target/X86/X86ISelLowering.cpp:1955 setOperationAction(ISD::SCALAR_TO_VECTOR, MVT::v32f16, Custom); + setOperationAction(ISD::SINT_TO_FP, MVT::v32i16, Legal); + setOperationAction(ISD::STRICT

[PATCH] D105265: [X86] AVX512FP16 instructions enabling 3/6

2021-08-16 Thread LuoYuanke via Phabricator via cfe-commits
LuoYuanke added a comment. Thanks, Craig, for the clarification! Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D105265/new/ https://reviews.llvm.org/D105265 ___ cfe-commits mailing list cfe-commits@lists.llv

[PATCH] D105331: [CFE][X86] Enable complex _Float16.

2021-08-16 Thread LuoYuanke via Phabricator via cfe-commits
LuoYuanke added a comment. Could you add a link to the ABI in the commit message? Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D105331/new/ https://reviews.llvm.org/D105331 ___ cfe-commits mailing list

[PATCH] D105331: [CFE][X86] Enable complex _Float16 support

2021-08-16 Thread LuoYuanke via Phabricator via cfe-commits
LuoYuanke accepted this revision. LuoYuanke added a comment. This revision is now accepted and ready to land. LGTM, but wait 1 or 2 days to see if there are any comments from others. Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D105331/new/ https://

[PATCH] D105265: [X86] AVX512FP16 instructions enabling 3/6

2021-08-16 Thread LuoYuanke via Phabricator via cfe-commits
LuoYuanke accepted this revision. LuoYuanke added a comment. This revision is now accepted and ready to land. LGTM, but wait 1 or 2 days for the comments from others. Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D105265/new/ https://reviews.llvm.or

[PATCH] D105267: [X86] AVX512FP16 instructions enabling 4/6

2021-08-17 Thread LuoYuanke via Phabricator via cfe-commits
LuoYuanke added inline comments. Comment at: clang/include/clang/Basic/BuiltinsX86.def:1897 + +TARGET_BUILTIN(__builtin_ia32_rndscaleph_128_mask, "V8xV8xIiV8xUc", "ncV:128:", "avx512fp16,avx512vl") +TARGET_BUILTIN(__builtin_ia32_rndscaleph_256_mask, "V16xV16xIiV16xUs", "ncV:256

[PATCH] D105267: [X86] AVX512FP16 instructions enabling 4/6

2021-08-18 Thread LuoYuanke via Phabricator via cfe-commits
LuoYuanke added inline comments. Comment at: llvm/lib/Target/X86/X86ISelLowering.cpp:1920 + setOperationAction(ISD::STRICT_FTRUNC, VT, Legal); + setOperationAction(ISD::FRINT, VT, Legal); + setOperationAction(ISD::STRICT_FRINT, VT, Legal); -

[PATCH] D105267: [X86] AVX512FP16 instructions enabling 4/6

2021-08-18 Thread LuoYuanke via Phabricator via cfe-commits
LuoYuanke added inline comments. Comment at: llvm/lib/Target/X86/X86ISelLowering.cpp:1920 + setOperationAction(ISD::STRICT_FTRUNC, VT, Legal); + setOperationAction(ISD::FRINT, VT, Legal); + setOperationAction(ISD::STRICT_FRINT, VT, Legal); -

[PATCH] D105268: [X86] AVX512FP16 instructions enabling 5/6

2021-08-19 Thread LuoYuanke via Phabricator via cfe-commits
LuoYuanke added inline comments. Comment at: clang/include/clang/Basic/BuiltinsX86.def:2010 +TARGET_BUILTIN(__builtin_ia32_vfmaddph, "V8xV8xV8xV8x", "ncV:128:", "avx512fp16,avx512vl") +TARGET_BUILTIN(__builtin_ia32_vfmaddph256, "V16xV16xV16xV16x", "ncV:256:", "avx512fp16,avx512

[PATCH] D93594: [X86] Pass to transform amx intrinsics to scalar operation.

2021-02-27 Thread LuoYuanke via Phabricator via cfe-commits
LuoYuanke added inline comments. Comment at: llvm/lib/Target/X86/X86LowerAMXIntrinsics.cpp:88 + +template Can we just use `template `? I think it also can reduce the > branch. Why do we need a template instead of passing a parameter `bool IsLoad`? Repository: rG LLVM Github

[PATCH] D93594: [X86] Pass to transform amx intrinsics to scalar operation.

2021-02-27 Thread LuoYuanke via Phabricator via cfe-commits
LuoYuanke added inline comments. Comment at: llvm/lib/Target/X86/X86LowerAMXIntrinsics.cpp:88 + +template LuoYuanke wrote: > > pengfei wrote: > > > Can we just use `template `? I think it also can reduce the > > > branch. > > Why do we need a template instead of passing a param

[PATCH] D93594: [X86] Pass to transform amx intrinsics to scalar operation.

2021-03-02 Thread LuoYuanke via Phabricator via cfe-commits
LuoYuanke added inline comments. Comment at: llvm/lib/Target/X86/X86LowerAMXIntrinsics.cpp:311 + Value *ResElt = B.CreateAdd(EltC, SubVecR); + Value *NewVecC = B.CreateInsertElement(VecCPhi, ResElt, IdxC); + Value *NewVecD = B.CreateInsertElement(VecDPhi, ResElt, IdxC); --

[PATCH] D93594: [X86] Pass to transform amx intrinsics to scalar operation.

2021-03-04 Thread LuoYuanke via Phabricator via cfe-commits
LuoYuanke added inline comments. Comment at: llvm/lib/Target/X86/X86LowerAMXIntrinsics.cpp:82 + DTU.applyUpdatesPermissive({ + {DominatorTree::Delete, Preheader, Tmp}, + {DominatorTree::Insert, Header, Body}, pengfei wrote: > Do we need to remove the s

[PATCH] D93594: [X86] Pass to transform amx intrinsics to scalar operation.

2021-03-04 Thread LuoYuanke via Phabricator via cfe-commits
LuoYuanke added a comment. LGTM too. Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D93594/new/ https://reviews.llvm.org/D93594 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.or

[PATCH] D132329: [X86][RFC] Using `__bf16` for AVX512_BF16 intrinsics

2022-10-18 Thread LuoYuanke via Phabricator via cfe-commits
LuoYuanke added inline comments. Comment at: clang/lib/Headers/avx512bf16intrin.h:13 +#ifdef __SSE2__ + What is this macro check used for? Comment at: clang/test/CodeGen/X86/avx512bf16-error.c:14 +__bfloat16 bar(__bfloat16 a, __bfloat16 b) {

[PATCH] D132329: [X86][RFC] Using `__bf16` for AVX512_BF16 intrinsics

2022-10-19 Thread LuoYuanke via Phabricator via cfe-commits
LuoYuanke accepted this revision. LuoYuanke added a comment. This revision is now accepted and ready to land. LGTM. Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D132329/new/ https://reviews.llvm.org/D132329

[PATCH] D111778: [WIP][X86] Update CPU_SPECIFIC list.

2022-10-24 Thread LuoYuanke via Phabricator via cfe-commits
LuoYuanke added a comment. It seems @craig.topper supported __cpu_features2 in compiler-rt revision 94ccb2acbf2c5 . Anything else that we need to address before landing the patch? Repository: rG LLVM Github Monorepo CHANG

[PATCH] D132329: [X86][RFC] Using `__bf16` for AVX512_BF16 intrinsics

2022-08-21 Thread LuoYuanke via Phabricator via cfe-commits
LuoYuanke added inline comments. Comment at: clang/lib/AST/MicrosoftMangle.cpp:2472 + case BuiltinType::BFloat16: +mangleArtificialTagType(TTK_Struct, "__bf16", {"__clang"}); This looks unrelated to the patch. Comment at: clang/test/Cod

[PATCH] D135930: [X86] Add AVX-NE-CONVERT instructions.

2022-10-13 Thread LuoYuanke via Phabricator via cfe-commits
LuoYuanke added inline comments. Comment at: clang/lib/Basic/Targets/X86.cpp:781 +Builder.defineMacro("__AVXNECONVERT__"); + Builder.defineMacro("__AVXNECONVERT_SUPPORTED__"); if (HasAVXVNNI) Do we need it here? Repository: rG LLVM Github Monorepo CH

[PATCH] D136040: [X86] Support PREFETCHI instructions

2022-10-17 Thread LuoYuanke via Phabricator via cfe-commits
LuoYuanke added inline comments. Comment at: clang/include/clang/Driver/Options.td:4651 def mno_popcnt : Flag<["-"], "mno-popcnt">, Group; +def mprefetchi : Flag<["-"], "mprefetchi">, Group; +def mno_prefetchi : Flag<["-"], "mno-prefetchi">, Group; I notice in l

[PATCH] D140281: [X86] Rename CMPCCXADD intrinsics.

2022-12-18 Thread LuoYuanke via Phabricator via cfe-commits
LuoYuanke accepted this revision. LuoYuanke added a comment. LGTM Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D140281/new/ https://reviews.llvm.org/D140281 ___ cfe-commits mailing list cfe-commits@lists

[PATCH] D143094: [clang] Change AMX macros to match names from GCC

2023-02-01 Thread LuoYuanke via Phabricator via cfe-commits
LuoYuanke accepted this revision. LuoYuanke added a comment. This revision is now accepted and ready to land. LGTM, thanks Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D143094/new/ https://reviews.llvm.org/D143094 _

[PATCH] D143094: [clang] Change AMX macros to match names from GCC

2023-02-02 Thread LuoYuanke via Phabricator via cfe-commits
LuoYuanke added a comment. @pengfei, would you take a look to double-check? Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D143094/new/ https://reviews.llvm.org/D143094 ___ cfe-commits mailing list cfe-c

[PATCH] D99565: [X86] Support replacing aligned vector moves with unaligned moves when avx is enabled.

2023-01-12 Thread LuoYuanke via Phabricator via cfe-commits
LuoYuanke commandeered this revision. LuoYuanke edited reviewers, added: LiuChen3; removed: LuoYuanke. LuoYuanke added a subscriber: lebedev.ri. LuoYuanke added a comment. In D99565#4049330 , @lebedev.ri wrote: > This review seems to be stuck/dead, conside

[PATCH] D141899: [IR][X86] Remove X86AMX type in LLVM IR instead of target extension

2023-01-17 Thread LuoYuanke via Phabricator via cfe-commits
LuoYuanke added a comment. @zixuan-wu, changing x86_amx would break our internal code. May I know the motivation to change the type? Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D141899/new/ https://reviews.llvm.org/D141899 __

[PATCH] D141899: [IR][X86] Remove X86AMX type in LLVM IR instead of target extension

2023-01-17 Thread LuoYuanke via Phabricator via cfe-commits
LuoYuanke added a comment. In D141899#4058683 , @lebedev.ri wrote: > In D141899#4058173 , @LuoYuanke > wrote: > >> @zixuan-wu, changing x86_amx would break our internal code. May I know the >> motivation to cha

[PATCH] D141899: [IR][X86] Remove X86AMX type in LLVM IR instead of target extension

2023-01-17 Thread LuoYuanke via Phabricator via cfe-commits
LuoYuanke added a comment. We need to consider how to stay compatible with existing software if we want to change the IR type. Some existing software is based on the existing type. For example, the AMX dialect of MLIR and TLX code are based on x86_amx; it would break them if w

[PATCH] D141899: [IR][X86] Remove X86AMX type in LLVM IR instead of target extension

2023-01-17 Thread LuoYuanke via Phabricator via cfe-commits
LuoYuanke added a comment. In D141899#4061150 , @zixuan-wu wrote: > In D141899#4058173 , @LuoYuanke > wrote: > >> @zixuan-wu, changing x86_amx would break our internal code. May I know the >> motivation to chang

[PATCH] D141899: [IR][X86] Remove X86AMX type in LLVM IR instead of target extension

2023-01-17 Thread LuoYuanke via Phabricator via cfe-commits
LuoYuanke added a comment. In D141899#4061150 , @zixuan-wu wrote: > In D141899#4058173 , @LuoYuanke > wrote: > >> @zixuan-wu, changing x86_amx would break our internal code. May I know the >> motivation to chang

[PATCH] D141899: [IR][X86] Remove X86AMX type in LLVM IR instead of target extension

2023-01-17 Thread LuoYuanke via Phabricator via cfe-commits
LuoYuanke added a comment. In D141899#4061237 , @zixuan-wu wrote: > With considering > https://llvm.org/docs/DeveloperPolicy.html#ir-backwards-compatibility I think > we need make consensus to choose one option from following 2 options. > > 1. Remove X8

[PATCH] D107082: [X86][RFC] Enable `_Float16` type support on X86 following the psABI

2022-06-10 Thread LuoYuanke via Phabricator via cfe-commits
LuoYuanke accepted this revision. LuoYuanke added a comment. This revision is now accepted and ready to land. LGTM, thanks. Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D107082/new/ https://reviews.llvm.org/D107082

[PATCH] D127050: [Clang][FP16] Add 4 builtins for _Float16

2022-06-04 Thread LuoYuanke via Phabricator via cfe-commits
LuoYuanke added inline comments. Comment at: clang/include/clang/Basic/Builtins.def:145 BUILTIN(__builtin_huge_vall, "Ld", "nc") +BUILTIN(__builtin_huge_valf16, "x", "nc") BUILTIN(__builtin_huge_valf128, "LLd", "nc") Are the builtins sorted in alphabetical order? R

[PATCH] D127050: [Clang][FP16] Add 4 builtins for _Float16

2022-06-04 Thread LuoYuanke via Phabricator via cfe-commits
LuoYuanke added inline comments. Comment at: clang/test/CodeGen/builtin_Float16.c:7 +void test_float16_builtins(void) { + volatile _Float16 res; + Is _Float16 a legal type for target armv7a and aarch64? Repository: rG LLVM Github Monorepo CHANGES SINCE LAST

[PATCH] D127050: [Clang][FP16] Add 4 builtins for _Float16

2022-06-04 Thread LuoYuanke via Phabricator via cfe-commits
LuoYuanke added inline comments. Comment at: clang/test/CodeGen/builtin_Float16.c:7 +void test_float16_builtins(void) { + volatile _Float16 res; + pengfei wrote: > LuoYuanke wrote: > > Is _Float16 a legal type for target armv7a and aarch64? > Yes, see > https:/

[PATCH] D107082: [X86][RFC] Enable `_Float16` type support on X86 following the psABI

2022-06-08 Thread LuoYuanke via Phabricator via cfe-commits
LuoYuanke added inline comments. Comment at: llvm/lib/Target/X86/X86ISelLowering.cpp:616 +setOperationAction(ISD::FROUNDEVEN, MVT::f16, Promote); +setOperationAction(ISD::FP_ROUND, MVT::f16, Expand); +setOperationAction(ISD::FP_EXTEND, MVT::f32, Expand); -

[PATCH] D107082: [X86][RFC] Enable `_Float16` type support on X86 following the psABI

2022-06-08 Thread LuoYuanke via Phabricator via cfe-commits
LuoYuanke added inline comments. Comment at: llvm/test/Analysis/CostModel/X86/fptoi_sat.ll:852 +; SSE2-NEXT: Cost Model: Found an estimated cost of 3 for instruction: %f16u1 = call i1 @llvm.fptoui.sat.i1.f16(half undef) +; SSE2-NEXT: Cost Model: Found an estimated cost of 5 fo

[PATCH] D107082: [X86][RFC] Enable `_Float16` type support on X86 following the psABI

2022-06-08 Thread LuoYuanke via Phabricator via cfe-commits
LuoYuanke added inline comments. Comment at: llvm/test/CodeGen/X86/fpclamptosat.ll:569 ; CHECK-NEXT:cvttss2si %xmm0, %rax ; CHECK-NEXT:ucomiss {{\.?LCPI[0-9]+_[0-9]+}}(%rip), %xmm0 +; CHECK-NEXT:movabsq $-9223372036854775808, %rcx # imm = 0x8000

[PATCH] D138547: [X86][AMX] Fix typo of the headerfile.

2022-11-23 Thread LuoYuanke via Phabricator via cfe-commits
LuoYuanke created this revision. Herald added a project: All. LuoYuanke requested review of this revision. Herald added a project: clang. Herald added a subscriber: cfe-commits. Repository: rG LLVM Github Monorepo https://reviews.llvm.org/D138547 Files: clang/lib/Headers/amxfp16intrin.h In

[PATCH] D138547: [X86][AMX] Fix typo of the headerfile.

2022-11-23 Thread LuoYuanke via Phabricator via cfe-commits
This revision was automatically updated to reflect the committed changes. Closed by commit rG55fceef61e0d: [X86][AMX] Fix typo of the headerfile. (authored by LuoYuanke). Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D138547/new/ https://reviews.llvm

[PATCH] D70157: Align branches within 32-Byte boundary

2019-12-06 Thread LuoYuanke via Phabricator via cfe-commits
LuoYuanke added a comment. In D70157#1773180 , @reames wrote: > Recording something so I don't forget it when we get back to the prefix > padding version. The write up on the bundle align mode stuff mentions a > concerning memory overhead for the featur

[PATCH] D141899: [IR][X86] Remove X86AMX type in LLVM IR instead of target extension

2023-02-26 Thread LuoYuanke via Phabricator via cfe-commits
LuoYuanke added a comment. > Deal. And because I am on busy for a long time and it is also better to let > intel guy handle x86-related feature, I am happy with the patch being > commandeered. @nikic's proposal looks like a promising solution; we can investigate it further. Repository: rG LLV

[PATCH] D147165: [Windows SEH] Fix catch+return crash for Windows -EHa

2023-03-30 Thread LuoYuanke via Phabricator via cfe-commits
LuoYuanke accepted this revision. LuoYuanke added a comment. This revision is now accepted and ready to land. LGTM, pls wait for 1 or 2 days in case there are comments from others Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D147165/new/ https://re

[PATCH] D151537: [NFC] Update cpu_specific test to use a newer CPU

2023-05-29 Thread LuoYuanke via Phabricator via cfe-commits
LuoYuanke accepted this revision. LuoYuanke added a comment. This revision is now accepted and ready to land. LGTM Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D151537/new/ https://reviews.llvm.org/D151537 _

[PATCH] D151537: [NFC] Update cpu_specific test to use a newer CPU

2023-05-30 Thread LuoYuanke via Phabricator via cfe-commits
LuoYuanke added a comment. In D151537#4380763 , @erichkeane wrote: > I don't really see the justification here? Why do this change? If the > intent is to just test a newer architecture, we can add tests for that, not > change existing ones. KNL is d

[PATCH] D120307: [X86] Add helper enum for ternary intrinsics

2022-03-06 Thread LuoYuanke via Phabricator via cfe-commits
LuoYuanke accepted this revision. LuoYuanke added a comment. This revision is now accepted and ready to land. LGTM, but pls wait for 1 or 2 days to see if there are any comments from others. Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D120307/new/

[PATCH] D101059: [X86][AMX] Add description for AMX new interface.

2021-04-22 Thread LuoYuanke via Phabricator via cfe-commits
LuoYuanke created this revision. LuoYuanke requested review of this revision. Herald added a project: clang. Herald added a subscriber: cfe-commits. Repository: rG LLVM Github Monorepo https://reviews.llvm.org/D101059 Files: clang/lib/Headers/amxintrin.h Index: clang/lib/Headers/amxintrin.h

[PATCH] D101059: [X86][AMX] Add description for AMX new interface.

2021-04-22 Thread LuoYuanke via Phabricator via cfe-commits
LuoYuanke updated this revision to Diff 339597. LuoYuanke added a comment. Fix some descriptions. Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D101059/new/ https://reviews.llvm.org/D101059 Files: clang/lib/Headers/amxintrin.h Index: clang/lib/H

[PATCH] D101059: [X86][AMX] Add description for AMX new interface.

2021-04-27 Thread LuoYuanke via Phabricator via cfe-commits
This revision was landed with ongoing or failed builds. This revision was automatically updated to reflect the committed changes. Closed by commit rGd6c6db2feaab: [X86][AMX] Add description for AMX new interface. (authored by LuoYuanke). Repository: rG LLVM Github Monorepo CHANGES SINCE LAST A

[PATCH] D115199: [X86][AMX] Support amxpreserve attribute in clang.

2021-12-06 Thread LuoYuanke via Phabricator via cfe-commits
LuoYuanke created this revision. Herald added a reviewer: aaron.ballman. LuoYuanke requested review of this revision. Herald added a project: clang. Herald added a subscriber: cfe-commits. Repository: rG LLVM Github Monorepo https://reviews.llvm.org/D115199 Files: clang/include/clang/Basic/A

[PATCH] D115199: [WIP][X86][AMX] Support amxpreserve attribute in clang.

2021-12-08 Thread LuoYuanke via Phabricator via cfe-commits
LuoYuanke updated this revision to Diff 393012. LuoYuanke added a comment. Herald added a subscriber: martong. Updating D115199 : [WIP][X86][AMX] Support amxpreserve attribute in clang. Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://

[PATCH] D98757: [AMX] Not fold constant bitcast into amx intrisic

2021-03-16 Thread LuoYuanke via Phabricator via cfe-commits
LuoYuanke added a comment. Would you add a test case for it? Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D98757/new/ https://reviews.llvm.org/D98757 ___ cfe-commits mailing list cfe-commits@lists.llvm.o

[PATCH] D98757: [AMX] Not fold constant bitcast into amx intrisic

2021-03-16 Thread LuoYuanke via Phabricator via cfe-commits
LuoYuanke added a comment. > at clang/test/CodeGen/X86/amx_api.c Probably we need a .ll test case for constant folding. Comment at: llvm/lib/Analysis/ConstantFolding.cpp:108 + // We won't fold bitcast for tile type, becasue there is no way to + // assigne a tmm reg from

[PATCH] D93594: [X86] Pass to transform amx intrinsics to scalar operation.

2021-03-16 Thread LuoYuanke via Phabricator via cfe-commits
LuoYuanke added a comment. I can reproduce the regression. I'll help to fix it. Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D93594/new/ https://reviews.llvm.org/D93594 ___ cfe-commits mailing list cfe-c

[PATCH] D93594: [X86] Pass to transform amx intrinsics to scalar operation.

2021-03-17 Thread LuoYuanke via Phabricator via cfe-commits
LuoYuanke added a comment. The fix is uploaded at https://reviews.llvm.org/D98773. Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D93594/new/ https://reviews.llvm.org/D93594 ___ cfe-commits mailing list cf

[PATCH] D87981: [X86] AMX programming model.

2021-03-18 Thread LuoYuanke via Phabricator via cfe-commits
LuoYuanke added inline comments. Comment at: llvm/lib/Target/X86/X86PreTileConfig.cpp:90 +INITIALIZE_PASS_BEGIN(X86PreTileConfig, "tilepreconfig", + "Tile Register Configure", false, false) +INITIALIZE_PASS_DEPENDENCY(MachineDominatorTree) yu

[PATCH] D99152: [AMX] Prototype for vector and amx bitcast.

2021-03-23 Thread LuoYuanke via Phabricator via cfe-commits
LuoYuanke created this revision. Herald added a subscriber: pengfei. LuoYuanke requested review of this revision. Herald added projects: clang, LLVM. Herald added subscribers: llvm-commits, cfe-commits. Introduce new intrinsic to cast vector and amx. This can prevent middle-end optimization on bit

[PATCH] D99152: [AMX] Prototype for vector and amx bitcast.

2021-03-23 Thread LuoYuanke via Phabricator via cfe-commits
LuoYuanke added a comment. @lebedev.ri, this patch is mainly for discussing the approach that Florian proposed, so I didn't polish my code. Nevertheless, your comments on amx_cast.c are right. __tile_loadd() loads a 2D tile from memory. There is an extra parameter, stride. As I explain i

[PATCH] D99152: [AMX] Prototype for vector and amx bitcast.

2021-03-23 Thread LuoYuanke via Phabricator via cfe-commits
LuoYuanke added a comment. @lebedev.ri, our goal is to seek an ideal solution, not to argue about who is right. I hope there is no bias during the discussion. I hope Florian and James set a role model for you. They are trying to understand the problem and helping solve the problem. I don't know if it i

[PATCH] D99152: [AMX] Prototype for vector and amx bitcast.

2021-03-23 Thread LuoYuanke via Phabricator via cfe-commits
LuoYuanke added a comment. In D99152#2644017, @lebedev.ri wrote: > load instruction loads > contiguous bytes. > If that is not what AMX is trying to use it for, then it is being used > incorrectly

[PATCH] D99152: [AMX] Prototype for vector and amx bitcast.

2021-03-23 Thread LuoYuanke via Phabricator via cfe-commits
LuoYuanke added a comment. > To be honest i don't really understand why `x86_amx` type is even there. > It seems to me that if you just directly used > `@llvm.x86.tileloadd64.internal` / `@llvm.x86.tilestored64.internal`, > and `s/x86_amx/<256 x i32>/`, none of these problems would be here. I ex

[PATCH] D99152: [AMX] Prototype for vector and amx bitcast.

2021-03-24 Thread LuoYuanke via Phabricator via cfe-commits
LuoYuanke added a comment. > IIUC you need this to transfer/convert data from a consecutive vector to an > `AMX` tile. To express that, emitting an intrinsic for the conversion instead > a `bit cast` seems the right thing to me. Yes. We need to transfer/convert data from a consecutive vector to

[PATCH] D99152: [AMX] Prototype for vector and amx bitcast.

2021-03-24 Thread LuoYuanke via Phabricator via cfe-commits
LuoYuanke added a comment. In D99152#2647681, @fhahn wrote: > I can't see any `load <256 x i32>` in the linked example, just a store. Could > you check the example? I created another example at https://gcc.godbolt.org/z/v6od5ceEz. In the bar() function, you

[PATCH] D99152: [AMX] Prototype for vector and amx bitcast.

2021-03-29 Thread LuoYuanke via Phabricator via cfe-commits
LuoYuanke added a comment. > I think that point was not really clear during the discussion. Using `load > <256 x i32>` to lower `__tile_loadd() ` would indeed be incorrect. But I > don't think that's happening at the moment, at least going from a simple > example https://gcc.godbolt.org/z/KT5rc

[PATCH] D99152: [AMX] Prototype for vector and amx bitcast.

2021-03-29 Thread LuoYuanke via Phabricator via cfe-commits
LuoYuanke added a comment. > Whether to further optimizations are correct is a different problem, but we > need a specification for the builtins, intrinsics and the type before going > any further in that direction. > > I think you need to set the input to `LLVM IR`: > https://gcc.godbolt.org/z

[PATCH] D99152: [AMX] Prototype for vector and amx bitcast.

2021-03-31 Thread LuoYuanke via Phabricator via cfe-commits
LuoYuanke added a comment. > Unfortunately this is not possible to use an opaque type with the AMX > intrinsics at the moment, because of the way they are define. It is possible > to use opaque types with intrinsics in general though, e.g. see > https://llvm.godbolt.org/z/Ezhf6535c > > My point

[PATCH] D99708: [X86] Enable compilation of user interrupt handlers.

2021-04-01 Thread LuoYuanke via Phabricator via cfe-commits
LuoYuanke added a comment. In D99708#2663989 , @craig.topper wrote: > A user interrupt is different than a regular interrupt right? It doesn't make > sense that we would change the behavior of the interrupt calling convention > just because the the user

[PATCH] D99708: [X86] Enable compilation of user interrupt handlers.

2021-04-06 Thread LuoYuanke via Phabricator via cfe-commits
LuoYuanke added a comment. LGTM. Thank you! CHANGES SINCE LAST ACTION https://reviews.llvm.org/D99708/new/ https://reviews.llvm.org/D99708 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-c

[PATCH] D99708: [X86] Enable compilation of user interrupt handlers.

2021-04-12 Thread LuoYuanke via Phabricator via cfe-commits
LuoYuanke accepted this revision. LuoYuanke added a comment. This revision is now accepted and ready to land. LGTM. But wait one or two days to see if there are more comments from Craig and HJ. CHANGES SINCE LAST ACTION https://reviews.llvm.org/D99708/new/ https://reviews.llvm.org/D99708 ___

[PATCH] D91927: [X86] Add x86_amx type for intel AMX.

2020-12-29 Thread LuoYuanke via Phabricator via cfe-commits
This revision was landed with ongoing or failed builds. This revision was automatically updated to reflect the committed changes. Closed by commit rG981a0bd85811: [X86] Add x86_amx type for intel AMX. (authored by LuoYuanke). Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION http

[PATCH] D92837: [X86] Support tilezero intrinsic and c interface for AMX.

2020-12-30 Thread LuoYuanke via Phabricator via cfe-commits
LuoYuanke updated this revision to Diff 314087. LuoYuanke added a comment. Rebase. Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D92837/new/ https://reviews.llvm.org/D92837 Files: clang/include/clang/Basic/BuiltinsX86_64.def clang/lib/Headers/a

[PATCH] D91927: [X86] Add x86_amx type for intel AMX.

2020-12-30 Thread LuoYuanke via Phabricator via cfe-commits
LuoYuanke added a comment. Thanks, @pengfei and @MaskRay. Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D91927/new/ https://reviews.llvm.org/D91927 ___ cfe-commits mailing list cfe-commits@lists.llvm.org ht

[PATCH] D92837: [X86] Support tilezero intrinsic and c interface for AMX.

2020-12-30 Thread LuoYuanke via Phabricator via cfe-commits
LuoYuanke updated this revision to Diff 314157. LuoYuanke added a comment. Add avx512f in test case. Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D92837/new/ https://reviews.llvm.org/D92837 Files: clang/include/clang/Basic/BuiltinsX86_64.def c

[PATCH] D92837: [X86] Support tilezero intrinsic and c interface for AMX.

2020-12-30 Thread LuoYuanke via Phabricator via cfe-commits
LuoYuanke updated this revision to Diff 314170. LuoYuanke added a comment. Rebase. Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D92837/new/ https://reviews.llvm.org/D92837 Files: clang/include/clang/Basic/BuiltinsX86_64.def clang/lib/Headers/a

[PATCH] D92837: [X86] Support tilezero intrinsic and c interface for AMX.

2020-12-30 Thread LuoYuanke via Phabricator via cfe-commits
This revision was landed with ongoing or failed builds. This revision was automatically updated to reflect the committed changes. Closed by commit rG08665b180568: Support tilezero intrinsic and c interface for AMX. (authored by LuoYuanke). Repository: rG LLVM Github Monorepo CHANGES SINCE LAST

[PATCH] D94943: [X86][AMX] Fix the typo.

2021-01-18 Thread LuoYuanke via Phabricator via cfe-commits
LuoYuanke created this revision. Herald added a subscriber: pengfei. LuoYuanke requested review of this revision. Herald added a project: clang. Herald added a subscriber: cfe-commits. The dpbsud should be dpbssd. Repository: rG LLVM Github Monorepo https://reviews.llvm.org/D94943 Files: c

[PATCH] D94943: [X86][AMX] Fix the typo.

2021-01-19 Thread LuoYuanke via Phabricator via cfe-commits
This revision was landed with ongoing or failed builds. This revision was automatically updated to reflect the committed changes. Closed by commit rG7e1d2224b42b: [X86][AMX] Fix the typo. (authored by LuoYuanke). Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.l

[PATCH] D111037: [X86] Check if struct is blank before getting the inner types

2021-10-07 Thread LuoYuanke via Phabricator via cfe-commits
LuoYuanke added inline comments. Comment at: clang/test/CodeGen/X86/avx512fp16-abi.c:203 +struct { + float a; + struct {}; Add more cases for the struct composed of _Float16, float, double, struct {}? Repository: rG LLVM Github Monorepo CHANGES SINCE LAST

[PATCH] D111037: [X86] Check if struct is blank before getting the inner types

2021-10-07 Thread LuoYuanke via Phabricator via cfe-commits
LuoYuanke added inline comments. Comment at: clang/test/CodeGen/X86/avx512fp16-abi.c:207 +} pr52011() { + // CHECK-C: define{{.*}} { float, double } @pr52011 +} Why not test CPP as well? Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https:

[PATCH] D111037: [X86] Check if struct is blank before getting the inner types

2021-10-08 Thread LuoYuanke via Phabricator via cfe-commits
LuoYuanke accepted this revision. LuoYuanke added a comment. This revision is now accepted and ready to land. LGTM, thanks. Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D111037/new/ https://reviews.llvm.org/D111037

[PATCH] D105268: [X86] AVX512FP16 instructions enabling 5/6

2021-08-19 Thread LuoYuanke via Phabricator via cfe-commits
LuoYuanke added a comment. I understand now. Thanks, Craig. Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D105268/new/ https://reviews.llvm.org/D105268 ___ cfe-commits mailing list cfe-commits@lists.llvm.

[PATCH] D105267: [X86] AVX512FP16 instructions enabling 4/6

2021-08-19 Thread LuoYuanke via Phabricator via cfe-commits
LuoYuanke accepted this revision. LuoYuanke added a comment. This revision is now accepted and ready to land. LGTM, thanks. May wait 1 or 2 days for the comments from others. Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D105267/new/ https://reviews

[PATCH] D108422: [NFC][clang] Move remaining part of X86Target.def to llvm/Support/X86TargetParser.def

2021-08-20 Thread LuoYuanke via Phabricator via cfe-commits
LuoYuanke added a comment. In D108422#2957541, @erichkeane wrote: > In D108422#2957528, @RKSimon wrote: > >> There's nothing later than CannonLake here - does Intel need to at least >> reference up to Tiger/Ro

[PATCH] D108422: [NFC][clang] Move remaining part of X86Target.def to llvm/Support/X86TargetParser.def

2021-08-21 Thread LuoYuanke via Phabricator via cfe-commits
LuoYuanke added a comment. >> Thanks for reminding. We've supported -march=${CPU}, but forgot to update >> this table. We will update it. > > Shall we get this patch committed first before making any changes? Yes, committing the patch first looks good to me. Repository: rG LLVM Github Monore

[PATCH] D108509: [X86][AMX] Add missing inline attributes in AMX intrinsics. NFCI

2021-08-21 Thread LuoYuanke via Phabricator via cfe-commits
LuoYuanke accepted this revision. LuoYuanke added a comment. This revision is now accepted and ready to land. LGTM, thanks! Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D108509/new/ https://reviews.llvm.org/D108509

[PATCH] D105268: [X86] AVX512FP16 instructions enabling 5/6

2021-08-21 Thread LuoYuanke via Phabricator via cfe-commits
LuoYuanke accepted this revision. LuoYuanke added a comment. This revision is now accepted and ready to land. LGTM, may wait 1 or 2 days for comments from others. Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D105268/new/ https://reviews.llvm.org/D1

[PATCH] D105269: [X86] AVX512FP16 instructions enabling 6/6

2021-08-26 Thread LuoYuanke via Phabricator via cfe-commits
LuoYuanke added inline comments. Comment at: clang/test/CodeGen/X86/avx512fp16-builtins.c:4223 + +// CFC ADD PH + MADD? Comment at: clang/test/CodeGen/X86/avx512fp16-builtins.c:4315 + +// CF ADD PH + MADD? Co

[PATCH] D105269: [X86] AVX512FP16 instructions enabling 6/6

2021-08-26 Thread LuoYuanke via Phabricator via cfe-commits
LuoYuanke added inline comments. Comment at: llvm/lib/Target/X86/X86InstrFoldTables.cpp:1852 + { X86::VFCMULCPHZrr, X86::VFCMULCPHZrm, 0 }, + { X86::VFCMULCSHZrr, X86::VFCMULCSHZrm, TB_NO_REVERSE }, { X86::VFMADDPD4Yrr,

[PATCH] D105269: [X86] AVX512FP16 instructions enabling 6/6

2021-08-26 Thread LuoYuanke via Phabricator via cfe-commits
LuoYuanke added inline comments. Comment at: llvm/lib/Target/X86/X86InstrFoldTables.cpp:1852 + { X86::VFCMULCPHZrr, X86::VFCMULCPHZrm, 0 }, + { X86::VFCMULCSHZrr, X86::VFCMULCSHZrm, TB_NO_REVERSE }, { X86::VFMADDPD4Yrr,

[PATCH] D105269: [X86] AVX512FP16 instructions enabling 6/6

2021-08-26 Thread LuoYuanke via Phabricator via cfe-commits
LuoYuanke added inline comments. Comment at: llvm/lib/Target/X86/X86InstrAVX512.td:13640 +(v4f32 (OpNode VR128X:$src1, VR128X:$src2)), +0, 0, 0, X86selects, "@earlyclobber $dst">, Sched<[sched.XMM]>; +defm rm : AVX512_maskable

[PATCH] D105269: [X86] AVX512FP16 instructions enabling 6/6

2021-08-27 Thread LuoYuanke via Phabricator via cfe-commits
LuoYuanke added inline comments. Comment at: llvm/lib/Target/X86/X86ISelLowering.cpp:47419 + : X86ISD::VFCMADDC; + // FIXME: How we handle when FMF of FADD is different from CFMUL's? + CFmul = DAG.getNode(newOp, SDLoc(N), CV

[PATCH] D105269: [X86] AVX512FP16 instructions enabling 6/6

2021-08-27 Thread LuoYuanke via Phabricator via cfe-commits
LuoYuanke added inline comments. Comment at: llvm/lib/Target/X86/X86ISelLowering.cpp:47419 + : X86ISD::VFCMADDC; + // FIXME: How we handle when FMF of FADD is different from CFMUL's? + CFmul = DAG.getNode(newOp, SDLoc(N), CV

[PATCH] D105269: [X86] AVX512FP16 instructions enabling 6/6

2021-08-27 Thread LuoYuanke via Phabricator via cfe-commits
LuoYuanke accepted this revision. LuoYuanke added a comment. This revision is now accepted and ready to land. LGTM, but may wait 1 or 2 days for the comments from others. Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D105269/new/ https://reviews.llv

[PATCH] D109487: [X86] Support *_set1_pch(Float16 _Complex h)

2021-09-11 Thread LuoYuanke via Phabricator via cfe-commits
LuoYuanke accepted this revision. LuoYuanke added a comment. This revision is now accepted and ready to land. LGTM, thanks. Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D109487/new/ https://reviews.llvm.org/D109487

[PATCH] D109607: [X86] Refactor GetSSETypeAtOffset to fix pr51813

2021-09-12 Thread LuoYuanke via Phabricator via cfe-commits
LuoYuanke added inline comments. Comment at: clang/lib/CodeGen/TargetInfo.cpp:3417 + llvm::Type *T1 = getFPTypeAtOffset(IRType, IROffset + NextFP, TD); + if (T1 == nullptr) { +if (NextFP == 2) Would you add comments on each case, like the previous code? ==

[PATCH] D109658: [X86][FP16] Change the order of the operands in complex FMA intrinsics to allow swap between the mul operands.

2021-09-13 Thread LuoYuanke via Phabricator via cfe-commits
LuoYuanke added a comment. It seems that in this patch the builtins interface is aligned to the intrinsics interface. Since AVX512FP16 is pretty new, I assume nobody is using the GCC builtin. Can we ask the GCC guys to change their builtin interface? Repository: rG LLVM Github Monorepo CHANGES SINCE LAST A

[PATCH] D109607: [X86] Refactor GetSSETypeAtOffset to fix pr51813

2021-09-15 Thread LuoYuanke via Phabricator via cfe-commits
LuoYuanke added inline comments. Comment at: clang/lib/CodeGen/TargetInfo.cpp:3421 +if (T0->isHalfTy()) + T1 = getFPTypeAtOffset(IRType, IROffset + 4, TD); +// If we can't get a second FP type, return a simple half or float. Not quite understanding w

[PATCH] D109607: [X86] Refactor GetSSETypeAtOffset to fix pr51813

2021-09-15 Thread LuoYuanke via Phabricator via cfe-commits
LuoYuanke added inline comments. Comment at: clang/test/CodeGen/X86/avx512fp16-abi.c:153 +struct float2 { + struct {} s; + float a; Add a test case for "{ struct {}; half; struct {}; half; }"? Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION h

[PATCH] D109607: [X86] Refactor GetSSETypeAtOffset to fix pr51813

2021-09-15 Thread LuoYuanke via Phabricator via cfe-commits
LuoYuanke accepted this revision. LuoYuanke added a comment. This revision is now accepted and ready to land. LGTM, but please wait 1 or 2 days for the comments from others. Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D109607/new/ https://reviews.llv
