[PATCH] D149716: clang: Use new frexp intrinsic for builtins and add f16 version

2023-06-28 Thread Matt Arsenault via Phabricator via cfe-commits
arsenm closed this revision. arsenm added a comment. 85bdea023f5116f789095b606554739403042a21 CHANGES SINCE LAST ACTION https://reviews.llvm.org/D149716/new/ https://reviews.llvm.org/D149716

[PATCH] D154000: HIP: Directly call round builtins

2023-06-28 Thread Matt Arsenault via Phabricator via cfe-commits
arsenm created this revision. arsenm added reviewers: JonChesterfield, yaxunl, jhuber6. Herald added a project: All. arsenm requested review of this revision. Herald added a subscriber: wdng. Pretty sure these lround->round cases are just wrong https://reviews.llvm.org/D154000 Files: clang/li

[PATCH] D154133: [amdgpu] start documenting amdgpu support by clang

2023-06-29 Thread Matt Arsenault via Phabricator via cfe-commits
arsenm added inline comments. Comment at: clang/docs/AMDGPUSupport.rst:54 + - Defined if FMAF instruction is available (deprecated). + * - ``FP_FAST_FMAF`` + - Defined if fast FMAF instruction is available. The fma macro was actually from the opencl spe

[PATCH] D154145: [HIP] Fix -mllvm option for device lld linker

2023-06-29 Thread Matt Arsenault via Phabricator via cfe-commits
arsenm added inline comments. Comment at: clang/lib/Driver/ToolChains/HIPAMD.cpp:164 +StringRef ArgVal = Arg->getValue(1); +if (ArgVal.startswith("-mllvm=")) { + ArgVal = ArgVal.substr(strlen("-mllvm=")); StringRef Prefix("-mllvm=") and then use the
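The inline comment above is cut off in this listing, so the reviewer's full suggestion is not visible. As a rough illustration of the general idea only (name the prefix once instead of repeating the literal in both the test and the `strlen` call), here is a standalone sketch in standard C++. It uses `std::string_view` rather than LLVM's `StringRef`, and `stripPrefix` is a hypothetical helper, not code from the patch.

```cpp
#include <string_view>

// Hypothetical helper (not from the patch): if Value starts with Prefix,
// return the remainder after the prefix; otherwise return Value unchanged.
std::string_view stripPrefix(std::string_view Value, std::string_view Prefix) {
  // substr clamps the length, so this is safe when Value is shorter than Prefix.
  if (Value.substr(0, Prefix.size()) == Prefix)
    return Value.substr(Prefix.size());
  return Value;
}
```

With a named prefix such as `constexpr std::string_view Prefix = "-mllvm=";`, the literal appears in one place and the length comes from the object itself.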

[PATCH] D154207: [AMDGPU] Rename predefined macro __AMDGCN_WAVEFRONT_SIZE

2023-06-30 Thread Matt Arsenault via Phabricator via cfe-commits
arsenm added inline comments. Comment at: clang/lib/Basic/Targets/AMDGPU.cpp:318 + Builder.defineMacro("__AMDGCN_WAVEFRONT_SIZE__", Twine(WavefrontSize)); + // ToDo: deprecate this macro for naming consistency. If you're renaming it anyway, might as well go f

[PATCH] D153725: [clang] Make amdgpu-arch tool work on Windows

2023-06-30 Thread Matt Arsenault via Phabricator via cfe-commits
arsenm added a comment. Unrelated but can we get this to start reporting xnack and ecc? Comment at: clang/tools/amdgpu-arch/AMDGPUArchByHIP.cpp:80 +if (err != hipSuccess) { + llvm::errs() << "Failed to get device id for ordinal " << i << "\n"; + return 1;

[PATCH] D153725: [clang] Make amdgpu-arch tool work on Windows

2023-06-30 Thread Matt Arsenault via Phabricator via cfe-commits
arsenm added inline comments. Comment at: clang/tools/amdgpu-arch/AMDGPUArchByHIP.cpp:91 +} +printf("%s\n", prop.gcnArchName); + } llvm::outs CHANGES SINCE LAST ACTION https://reviews.llvm.org/D153725/new/ https://reviews.llvm.org/D153725

[PATCH] D153725: [clang] Make amdgpu-arch tool work on Windows

2023-06-30 Thread Matt Arsenault via Phabricator via cfe-commits
arsenm added inline comments. Comment at: clang/tools/amdgpu-arch/AMDGPUArch.cpp:47 - // Attempt to load the HSA runtime. - if (llvm::Error Err = loadHSA()) { -logAllUnhandledErrors(std::move(Err), llvm::errs()); -return 1; - } - - hsa_status_t Status = hsa_init();

[PATCH] D152857: OpenMP: Don't use target regions in library function test

2023-06-30 Thread Matt Arsenault via Phabricator via cfe-commits
arsenm added a comment. ping CHANGES SINCE LAST ACTION https://reviews.llvm.org/D152857/new/ https://reviews.llvm.org/D152857 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[PATCH] D152829: clang: Add start of header test for __clang_hip_libdevice_declares

2023-06-30 Thread Matt Arsenault via Phabricator via cfe-commits
arsenm closed this revision. arsenm added a comment. 40bb302c451ec1a8f6a2b8238e0a56448b8e1a12 CHANGES SINCE LAST ACTION https://reviews.llvm.org/D152829/new/ https://reviews.llvm.org/D152829

[PATCH] D152857: OpenMP: Don't use target regions in library function test

2023-06-30 Thread Matt Arsenault via Phabricator via cfe-commits
arsenm closed this revision. arsenm added a comment. 917eddfdcb15bddf67a54ede1f1643d5fc83628d CHANGES SINCE LAST ACTION https://reviews.llvm.org/D152857/new/ https://reviews.llvm.org/D152857

[PATCH] D154133: [amdgpu] start documenting amdgpu support by clang

2023-06-30 Thread Matt Arsenault via Phabricator via cfe-commits
arsenm added inline comments. Comment at: clang/docs/AMDGPUSupport.rst:20 + +Clang supports OpenCL, HIP and OpenMP on amdgpu target. + "on amdgpu target" doesn't sound grammatical CHANGES SINCE LAST ACTION https://reviews.llvm.org/D154133/new/ https://review

[PATCH] D154207: [AMDGPU] Rename predefined macro __AMDGCN_WAVEFRONT_SIZE

2023-06-30 Thread Matt Arsenault via Phabricator via cfe-commits
arsenm added a comment. lgtm except for dropping test coverage Comment at: clang/test/Driver/hip-macros.hip:20 // RUN: -mwavefrontsize64 %s 2>&1 | FileCheck --check-prefixes=WAVE64 %s -// WAVE64-DAG: #define __AMDGCN_WAVEFRONT_SIZE 64 -// WAVE32-DAG: #define __AMDGCN_WAVEFRO

[PATCH] D142907: LangRef: Add "dynamic" option to "denormal-fp-math"

2023-03-31 Thread Matt Arsenault via Phabricator via cfe-commits
arsenm added a comment. ping CHANGES SINCE LAST ACTION https://reviews.llvm.org/D142907/new/ https://reviews.llvm.org/D142907

[PATCH] D141700: AMDGPU: Move enqueued block handling into clang

2023-04-04 Thread Matt Arsenault via Phabricator via cfe-commits
arsenm added inline comments. Comment at: llvm/lib/Target/AMDGPU/AMDGPUHSAMetadataStreamer.cpp:299 + + Attrs.mRuntimeHandle = getEnqueuedBlockSymbolName(TM, Func); } kzhuravl wrote: > Do we really need/want to update code object v2? as long as the code is here

[PATCH] D147572: [Clang][OpenMP] Fix failure with team-wide allocated variable

2023-04-04 Thread Matt Arsenault via Phabricator via cfe-commits
arsenm added inline comments. Comment at: llvm/lib/Target/AMDGPU/AMDGPUAsmPrinter.cpp:303 +if (GV->hasInitializer() && !(isa(GV->getInitializer()) || + isa(GV->getInitializer( { OutContext.reportError({}, Isa covers

[PATCH] D142907: LangRef: Add "dynamic" option to "denormal-fp-math"

2023-04-05 Thread Matt Arsenault via Phabricator via cfe-commits
arsenm added a comment. ping CHANGES SINCE LAST ACTION https://reviews.llvm.org/D142907/new/ https://reviews.llvm.org/D142907

[PATCH] D147732: [AMDGPU] Add f32 permlane{16, x16} builtin variants

2023-04-07 Thread Matt Arsenault via Phabricator via cfe-commits
arsenm requested changes to this revision. arsenm added a comment. This revision now requires changes to proceed. There is a benefit to not having bitcast noise in the IR Comment at: llvm/include/llvm/IR/IntrinsicsAMDGPU.td:1962-1963 +// llvm.amdgcn.permlanex16.f32 +def i

[PATCH] D147962: [RFC][clang] Pull experimental targets' info out of TargetInfo.cpp (NFC)

2023-04-10 Thread Matt Arsenault via Phabricator via cfe-commits
arsenm accepted this revision. arsenm added a comment. This revision is now accepted and ready to land. Herald added a subscriber: wdng. Seems like an obvious organizational improvement Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D147962/new/ http

[PATCH] D142907: LangRef: Add "dynamic" option to "denormal-fp-math"

2023-04-13 Thread Matt Arsenault via Phabricator via cfe-commits
arsenm added a comment. ping CHANGES SINCE LAST ACTION https://reviews.llvm.org/D142907/new/ https://reviews.llvm.org/D142907

[PATCH] D148851: Disable llvm-symbolizer on some of the driver tests that are timing out

2023-05-19 Thread Matt Arsenault via Phabricator via cfe-commits
arsenm added a comment. I also think we need to revert or disable https://github.com/llvm/llvm-project/commit/cead4eceb01b935fae07bf4a7e91911b344d2fec The symbolizer is unusably slow with it Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D148851/new

[PATCH] D151087: [Clang] Permit address space casts with 'reinterpret_cast' in C++

2023-05-22 Thread Matt Arsenault via Phabricator via cfe-commits
arsenm added a comment. In D151087#4360606, @jhuber6 wrote: > In D151087#4360577, @ebevhan wrote: > >> What would be the semantics of such an operation if the address spaces are >> disjoint? Or, if the underlyi

[PATCH] D151087: [Clang] Permit address space casts with 'reinterpret_cast' in C++

2023-05-22 Thread Matt Arsenault via Phabricator via cfe-commits
arsenm added a comment. In D151087#4360695, @jhuber6 wrote: > I don't think that's something we can diagnose here with just the address > space number. it would require information from the underlying target for the > expected pointer qualities to the

[PATCH] D151087: [Clang] Permit address space casts with 'reinterpret_cast' in C++

2023-05-22 Thread Matt Arsenault via Phabricator via cfe-commits
arsenm added a comment. In D151087#4360932, @jhuber6 wrote: > How are they broken? The expectation is just that they line up with what the > backend defines them as, which should be a stable target. We could > potentially use target info to map the num

[PATCH] D151087: [Clang] Permit address space casts with 'reinterpret_cast' in C++

2023-05-22 Thread Matt Arsenault via Phabricator via cfe-commits
arsenm added a comment. In D151087#4361237, @ebevhan wrote: > What is now a reinterpret_cast? An address space conversion, or a bitcast? > It's not as straightforward as it might seem. This is the most straightforward part. It's a bitcast. Repository

[PATCH] D150427: [AMDGPU] Non hostcall printf support for HIP

2023-05-23 Thread Matt Arsenault via Phabricator via cfe-commits
arsenm added inline comments. Comment at: llvm/lib/Transforms/Utils/AMDGPUEmitPrintf.cpp:219 + +static inline size_t alignUp(size_t Value, uint Alignment) { + return (Value + Alignment - 1) & ~(Alignment - 1); MathExtras already has alignTo Co
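For readers unfamiliar with the bit trick in the quoted `alignUp`, the following is a minimal standalone sketch of the same expression in standard C++. It assumes `Alignment` is a non-zero power of two, which the mask form requires; the quoted snippet does not state that precondition, so treat it as an assumption here. The review comment points at the existing `alignTo` in MathExtras as the preferred alternative inside LLVM; this sketch does not use it.

```cpp
#include <cstddef>

// Round Value up to the next multiple of Alignment.
// Assumption: Alignment is a non-zero power of two, so (Alignment - 1) is a
// mask of the low bits and ~(Alignment - 1) clears them after the add.
constexpr std::size_t alignUp(std::size_t Value, std::size_t Alignment) {
  return (Value + Alignment - 1) & ~(Alignment - 1);
}
```

For example, `alignUp(13, 4)` adds 3 to get 16 and then clears the low two bits, leaving 16; a value that is already aligned, such as `alignUp(8, 8)`, is returned unchanged.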

[PATCH] D151087: [Clang] Permit address space casts with 'reinterpret_cast' in C++

2023-05-23 Thread Matt Arsenault via Phabricator via cfe-commits
arsenm added a comment. In D151087#4363503, @ebevhan wrote: > In D151087#4362059, @aaron.ballman > wrote: > >> Based on all this, I think we should go with `__addrspace_cast` as a named >> cast and not allow t

[PATCH] D150427: [AMDGPU] Non hostcall printf support for HIP

2023-05-23 Thread Matt Arsenault via Phabricator via cfe-commits
arsenm added inline comments. Comment at: clang/test/CodeGenHIP/printf_nonhostcall.cpp:170-171 +// CHECK-NEXT:[[PRINTBUFFNEXTPTR5:%.*]] = getelementptr i8, ptr addrspace(1) [[PRINTBUFFNEXTPTR4]], i32 8 +// CHECK-NEXT:[[TMP13:%.*]] = bitcast double [[TMP4]] to i64 +// CHE

[PATCH] D151349: [HIP] emit macro `__HIP_NO_IMAGE_SUPPORT`

2023-05-25 Thread Matt Arsenault via Phabricator via cfe-commits
arsenm added a comment. You seem to be defining a new subtarget feature without actually defining the underlying feature. I thought the issue was specific missing image instructions, which is already covered by extended-image-insts Comment at: clang/lib/Basic/Targets/AMDGPU.h

[PATCH] D151349: [HIP] emit macro `__HIP_NO_IMAGE_SUPPORT`

2023-05-25 Thread Matt Arsenault via Phabricator via cfe-commits
arsenm added inline comments. Comment at: clang/test/OpenMP/amdgcn-attributes.cpp:36 // DEFAULT: attributes #0 = { convergent noinline norecurse nounwind optnone "kernel" "no-trapping-math"="true" "stack-protector-buffer-size"="8" "uniform-work-group-size"="true" } -// CPU: at

[PATCH] D150427: [AMDGPU] Non hostcall printf support for HIP

2023-05-30 Thread Matt Arsenault via Phabricator via cfe-commits
arsenm added inline comments. Comment at: clang/include/clang/Driver/Options.td:1030 NegFlag>; +def mprintf_kind_EQ : Joined<["-"], "mprintf-kind=">, Group, + HelpText<"Specify the printf lowering scheme (AMDGPU only), allowed values are " I'm a bit worried

[PATCH] D150427: [AMDGPU] Non hostcall printf support for HIP

2023-05-30 Thread Matt Arsenault via Phabricator via cfe-commits
arsenm added inline comments. Comment at: llvm/lib/Transforms/Utils/AMDGPUEmitPrintf.cpp:211 +struct StringData { + std::string Str = ""; + bool isConst = true; arsenm wrote: > Don't need = "" Can't you just use the raw StringRef out of getConstantStringInfo

[PATCH] D151349: [HIP] emit macro `__HIP_NO_IMAGE_SUPPORT`

2023-05-30 Thread Matt Arsenault via Phabricator via cfe-commits
arsenm added a comment. In D151349#4377792, @yaxunl wrote: > using ISA version to determine whether image is supported That’s backward. You can track the feature in clang separately and just not emit it. We do that for a few other features. You just st

[PATCH] D150427: [AMDGPU] Non hostcall printf support for HIP

2023-06-01 Thread Matt Arsenault via Phabricator via cfe-commits
arsenm added inline comments. Comment at: clang/include/clang/Driver/Options.td:1030 NegFlag>; +def mprintf_kind_EQ : Joined<["-"], "mprintf-kind=">, Group, + HelpText<"Specify the printf lowering scheme (AMDGPU only), allowed values are " vikramRH wrote: >

[PATCH] D150427: [AMDGPU] Non hostcall printf support for HIP

2023-06-01 Thread Matt Arsenault via Phabricator via cfe-commits
arsenm added a comment. Should also be mentioned in the release notes Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D150427/new/ https://reviews.llvm.org/D150427

[PATCH] D112932: Use llvm.is_fpclass to implement FP classification functions

2023-06-01 Thread Matt Arsenault via Phabricator via cfe-commits
arsenm added a comment. In D112932#4043545, @arsenm wrote: > This change itself LGTM but I think it should wait until after we get more > optimizations in to go back to fcmp, and after the release branch I think these optimizations are mostly in a good

[PATCH] D112932: Use llvm.is_fpclass to implement FP classification functions

2023-06-01 Thread Matt Arsenault via Phabricator via cfe-commits
arsenm added a comment. In D112932#4005339, @sepavloff wrote: > Remove __builtin_isfpclass I am interested in adding a __builtin_isfpclass, just in a separate patch Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llv

[PATCH] D112932: Use llvm.is_fpclass to implement FP classification functions

2023-06-01 Thread Matt Arsenault via Phabricator via cfe-commits
arsenm added inline comments. Comment at: clang/lib/CodeGen/CGBuiltin.cpp:2185-2187 + auto TestV = llvm::ConstantInt::get(CGF->Int32Ty, Test); + Function *F = CGF->CGM.getIntrinsic(Intrinsic::is_fpclass, V->getType()); + return CGF->Builder.CreateCall(F, {V, TestV}); -

[PATCH] D142907: LangRef: Add "dynamic" option to "denormal-fp-math"

2023-02-10 Thread Matt Arsenault via Phabricator via cfe-commits
arsenm added a comment. In D142907#4119339, @andrew.w.kaylor wrote: > In general, it seems like the denormal mode should be considered part of the > floating point environment (though as far as I know the C standard, at least, > doesn't document it as

[PATCH] D142507: [AMDGPU] Split dot7 feature

2023-02-14 Thread Matt Arsenault via Phabricator via cfe-commits
arsenm added inline comments. Comment at: clang/include/clang/Basic/BuiltinsAMDGPU.def:239 -TARGET_BUILTIN(__builtin_amdgcn_fdot2, "fV2hV2hfIb", "nc", "dot7-insts") +TARGET_BUILTIN(__builtin_amdgcn_fdot2, "fV2hV2hfIb", "nc", "dot10-insts") TARGET_BUILTIN(__builtin_amdgcn_fdot2

[PATCH] D142907: LangRef: Add "dynamic" option to "denormal-fp-math"

2023-02-16 Thread Matt Arsenault via Phabricator via cfe-commits
arsenm added a comment. In D142907#4132318, @jcranmer-intel wrote: > Not entirely sure where the best place to effect this (I think somewhere in > the clang driver code?), but on further reflection, it feels like strict > fp-model in clang should set

[PATCH] D142907: LangRef: Add "dynamic" option to "denormal-fp-math"

2023-02-16 Thread Matt Arsenault via Phabricator via cfe-commits
arsenm added a comment. In D142907#4132543, @kpn wrote: > What's the plan for tying this to strictfp? Because I don't think it should be tied > to cases where we use the constrained intrinsics but the exceptions are > ignored and the default rounding is in s

[PATCH] D142907: LangRef: Add "dynamic" option to "denormal-fp-math"

2023-02-17 Thread Matt Arsenault via Phabricator via cfe-commits
arsenm added inline comments. Comment at: llvm/docs/LangRef.rst:2166 + +If the mode is ``"dynamic"``, the behavior is derived from the +dynamic state of the floating-point environment. Transformations pengfei wrote: > 1. Does it mean users must specify `d

[PATCH] D142907: LangRef: Add "dynamic" option to "denormal-fp-math"

2023-02-17 Thread Matt Arsenault via Phabricator via cfe-commits
arsenm added a comment. In D142907#4132836, @jcranmer-intel wrote: > In D142907#4132430, @arsenm wrote: > >> I was thinking of changing the default in general to dynamic. I was going to >> at least change the

[PATCH] D144505: [Clang] Add options in LTO mode when cross compiling for AMDGPU

2023-02-22 Thread Matt Arsenault via Phabricator via cfe-commits
arsenm added inline comments. Comment at: clang/test/Driver/amdgpu-toolchain.c:7 // RUN: %clang -### -g --target=amdgcn-mesa-mesa3d -mcpu=kaveri %s 2>&1 | FileCheck -check-prefix=DWARF_VER %s // AS_LINK: "-cc1as" should add a test for thinlto? Repository:

[PATCH] D142934: clang: Use ptrmask for pointer alignment

2023-02-23 Thread Matt Arsenault via Phabricator via cfe-commits
arsenm updated this revision to Diff 499797. arsenm added a comment. Test updates CHANGES SINCE LAST ACTION https://reviews.llvm.org/D142934/new/ https://reviews.llvm.org/D142934 Files: clang/lib/CodeGen/TargetInfo.cpp clang/test/CodeGen/PowerPC/ppc-varargs-struct.c clang/test/CodeGen/

[PATCH] D142934: clang: Use ptrmask for pointer alignment

2023-02-23 Thread Matt Arsenault via Phabricator via cfe-commits
arsenm updated this revision to Diff 499798. CHANGES SINCE LAST ACTION https://reviews.llvm.org/D142934/new/ https://reviews.llvm.org/D142934 Files: clang/lib/CodeGen/TargetInfo.cpp clang/test/CodeGen/PowerPC/ppc-varargs-struct.c clang/test/CodeGen/arm-abi-vector.c clang/test/CodeGen/a

[PATCH] D142934: clang: Use ptrmask for pointer alignment

2023-02-23 Thread Matt Arsenault via Phabricator via cfe-commits
arsenm updated this revision to Diff 499799. CHANGES SINCE LAST ACTION https://reviews.llvm.org/D142934/new/ https://reviews.llvm.org/D142934 Files: clang/lib/CodeGen/TargetInfo.cpp clang/test/CodeGen/PowerPC/ppc-varargs-struct.c clang/test/CodeGen/arm-abi-vector.c clang/test/CodeGen/a

[PATCH] D140992: clang: Add __builtin_elementwise_fma

2023-02-23 Thread Matt Arsenault via Phabricator via cfe-commits
arsenm marked an inline comment as done. arsenm added inline comments. Comment at: clang/lib/Sema/SemaChecking.cpp:2615 QualType ArgTy = TheCall->getArg(0)->getType(); -QualType EltTy = ArgTy; - -if (auto *VecTy = EltTy->getAs()) - EltTy = VecTy->getElementType(

[PATCH] D140992: clang: Add __builtin_elementwise_fma

2023-02-23 Thread Matt Arsenault via Phabricator via cfe-commits
arsenm updated this revision to Diff 499805. arsenm added a comment. Loop merge, documentation CHANGES SINCE LAST ACTION https://reviews.llvm.org/D140992/new/ https://reviews.llvm.org/D140992 Files: clang/docs/LanguageExtensions.rst clang/include/clang/Basic/Builtins.def clang/include/

[PATCH] D149982: AMDGPU: Add basic gfx941 target

2023-05-09 Thread Matt Arsenault via Phabricator via cfe-commits
arsenm accepted this revision. arsenm added inline comments. This revision is now accepted and ready to land. Comment at: llvm/lib/Target/AMDGPU/AMDGPU.td:1231-1261 + [FeatureGFX9, + FeatureGFX90AInsts, + FeatureGFX940Insts, + FeatureFmaMixInsts, + FeatureLDSBankCount32,

[PATCH] D149986: AMDGPU: Force sc0 and sc1 on stores for gfx940 and gfx941

2023-05-09 Thread Matt Arsenault via Phabricator via cfe-commits
arsenm added a comment. Should this be a feature set by default in the subtarget constructor instead? Should you be able to turn this off? Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D149986/new/ https://reviews.llvm.org/D149986

[PATCH] D150043: [InferAddressSpaces] Handle vector of pointers type & Support intrinsic masked gather/scatter

2023-05-09 Thread Matt Arsenault via Phabricator via cfe-commits
arsenm added inline comments. Comment at: llvm/lib/Transforms/Scalar/InferAddressSpaces.cpp:289 + +static bool hasSameElementOfPtrOrVecPtrs(Type *Ty1, Type *Ty2) { + assert(isPtrOrVecOfPtrsType(Ty1) && isPtrOrVecOfPtrsType(Ty2)); arsenm wrote: > Ditto, only opaq

[PATCH] D149986: AMDGPU: Force sc0 and sc1 on stores for gfx940 and gfx941

2023-05-12 Thread Matt Arsenault via Phabricator via cfe-commits
arsenm added inline comments. Comment at: llvm/lib/TargetParser/TargetParser.cpp:330-332 case GK_GFX940: + Features["force-store-sc0-sc1"] = true; + [[fallthrough]]; I don't see a reason to set this here. There's no need to expose this to the IR.

[PATCH] D145343: [AMDGPU] Emit predefined macro `__AMDGCN_CUMODE__`

2023-05-12 Thread Matt Arsenault via Phabricator via cfe-commits
arsenm added inline comments. Comment at: clang/include/clang/Basic/DiagnosticDriverKinds.td:121-123 +def warn_drv_unsupported_option_for_processor : Warning< + "ignoring '%0' option as it is not currently supported for processor '%1'">, + InGroup; I'm surprise

[PATCH] D145343: [AMDGPU] Emit predefined macro `__AMDGCN_CUMODE__`

2023-05-12 Thread Matt Arsenault via Phabricator via cfe-commits
arsenm added inline comments. Comment at: clang/lib/Driver/ToolChains/CommonArgs.cpp:146 +return false; + return TargetFeature == "no-cumode"; +} yaxunl wrote: > arsenm wrote: > > I don't understand the use of "no-cumode". Where is this defined? > This funct

[PATCH] D150427: [AMDGPU] Non hostcall printf support for HIP

2023-05-12 Thread Matt Arsenault via Phabricator via cfe-commits
arsenm added inline comments. Comment at: clang/include/clang/Basic/LangOptions.def:274 LANGOPT(OffloadingNewDriver, 1, 0, "use the new driver for generating offloading code.") +ENUM_LANGOPT(AMDGPUPrintfKindVal, AMDGPUPrintfKind, 2, AMDGPUPrintfKind::Buffered, "printf lowering

[PATCH] D150043: [InferAddressSpaces] Handle vector of pointers type & Support intrinsic masked gather/scatter

2023-05-16 Thread Matt Arsenault via Phabricator via cfe-commits
arsenm accepted this revision. arsenm added a comment. This revision is now accepted and ready to land. LGTM CHANGES SINCE LAST ACTION https://reviews.llvm.org/D150043/new/ https://reviews.llvm.org/D150043

[PATCH] D150043: [InferAddressSpaces] Handle vector of pointers type & Support intrinsic masked gather/scatter

2023-05-16 Thread Matt Arsenault via Phabricator via cfe-commits
arsenm added inline comments. Comment at: llvm/test/Transforms/InferAddressSpaces/masked-gather-scatter.ll:3 + +; CHECK-LABEL: @masked_gather_inferas( +; CHECK: tail call <4 x i32> @llvm.masked.gather.v4i32.v4p1 CaprYang wrote: > arsenm wrote: > > Generate full c

[PATCH] D150723: clang/openmp: Fix alignment for ThreadID Address variables

2023-05-16 Thread Matt Arsenault via Phabricator via cfe-commits
arsenm added a comment. Needs test Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D150723/new/ https://reviews.llvm.org/D150723

[PATCH] D150427: [AMDGPU] Non hostcall printf support for HIP

2023-05-17 Thread Matt Arsenault via Phabricator via cfe-commits
arsenm added inline comments. Comment at: clang/include/clang/Basic/TargetOptions.h:99 + +/// pritnf lowering scheme involving implicit printf buffers, +Buffered = 1, Typo pritnf Comment at: llvm/lib/Transforms/Utils/AMDGPUEmitPrintf.c

[PATCH] D150043: [InferAddressSpaces] Handle vector of pointers type & Support intrinsic masked gather/scatter

2023-05-17 Thread Matt Arsenault via Phabricator via cfe-commits
arsenm closed this revision. arsenm added a comment. 44096e6904e10bb313fef2f6aaff25c25d1325f7 CHANGES SINCE LAST ACTION https://reviews.llvm.org/D150043/new/ https://reviews.llvm.org/D150043

[PATCH] D149716: clang: Use new frexp intrinsic for builtins and add f16 version

2023-05-18 Thread Matt Arsenault via Phabricator via cfe-commits
arsenm added a comment. ping CHANGES SINCE LAST ACTION https://reviews.llvm.org/D149716/new/ https://reviews.llvm.org/D149716

[PATCH] D148796: [AMDGPU][GFX908] Add builtin support for global add atomic f16/f32

2023-04-20 Thread Matt Arsenault via Phabricator via cfe-commits
arsenm added a comment. I thought we had separate _rtn* builtins for this? Comment at: clang/lib/Sema/SemaChecking.cpp:4484 +auto TargetID = Context.getTargetInfo().getTargetID(); +if (!TargetID || TargetID->find("gfx908") == std::string::npos) + return false;

[PATCH] D142907: LangRef: Add "dynamic" option to "denormal-fp-math"

2023-04-20 Thread Matt Arsenault via Phabricator via cfe-commits
arsenm added a comment. ping CHANGES SINCE LAST ACTION https://reviews.llvm.org/D142907/new/ https://reviews.llvm.org/D142907

[PATCH] D142907: LangRef: Add "dynamic" option to "denormal-fp-math"

2023-04-21 Thread Matt Arsenault via Phabricator via cfe-commits
arsenm added a comment. In D142907#4288247, @efriedma wrote: > Given that, I don't follow the whole "merging" thing... we should just be > setting whatever mode is active. The attribute setting should not depend on > whether the function is interposab

[PATCH] D142907: LangRef: Add "dynamic" option to "denormal-fp-math"

2023-04-25 Thread Matt Arsenault via Phabricator via cfe-commits
arsenm added a comment. In D142907#4288555, @efriedma wrote: > If you have a library function that's built with > "denormal-fp-math"="dynamic,dynamic", you can link it into code built in any > mode, and LTO should be able to propagate that mode from th

[PATCH] D148769: Split out `CodeGenTypes` from `CodeGen` for LLT/MVT

2023-04-25 Thread Matt Arsenault via Phabricator via cfe-commits
arsenm accepted this revision. arsenm added a comment. This revision is now accepted and ready to land. Herald added a subscriber: wdng. I feel like this should have documentation explaining the library split, but not sure where the best place to put that would be Repository: rG LLVM Github M

[PATCH] D142393: [OpenMP] Add 'amdgpu-flat-work-group-size' to OpenMP kernels

2023-04-26 Thread Matt Arsenault via Phabricator via cfe-commits
arsenm added inline comments. Comment at: clang/include/clang/Basic/LangOptions.def:271 LANGOPT(GPUAllowDeviceInit, 1, 0, "allowing device side global init functions for HIP") -LANGOPT(GPUMaxThreadsPerBlock, 32, 1024, "default max threads per block for kernel launch bounds for

[PATCH] D138394: HIP: Directly call fma builtins

2023-04-26 Thread Matt Arsenault via Phabricator via cfe-commits
arsenm closed this revision. arsenm added a comment. 43fd46fda3c90b014e8a73c62f67af9543ea4d59 CHANGES SINCE LAST ACTION https://reviews.llvm.org/D138394/new/ https://reviews.llvm.org/D138394

[PATCH] D142907: LangRef: Add "dynamic" option to "denormal-fp-math"

2023-04-29 Thread Matt Arsenault via Phabricator via cfe-commits
arsenm closed this revision. arsenm added a comment. bc37be1855773c1dcf8c6bf577a096a81fd58652 CHANGES SINCE LAST ACTION https://reviews.llvm.org/D142907/new/ https://reviews.llvm.org/D142907

[PATCH] D145343: [AMDGPU] Emit predefined macro `__AMDGCN_CUMODE`

2023-05-01 Thread Matt Arsenault via Phabricator via cfe-commits
arsenm added inline comments. Comment at: clang/lib/Basic/Targets/AMDGPU.cpp:318 Builder.defineMacro("__AMDGCN_WAVEFRONT_SIZE", Twine(WavefrontSize)); + Builder.defineMacro("__AMDGCN_CUMODE", Twine(CUMode)); } Why do we sometimes use __ on both sides, and so

[PATCH] D145441: [AMDGPU] Define data layout entries for buffers

2023-05-01 Thread Matt Arsenault via Phabricator via cfe-commits
arsenm accepted this revision. arsenm added a comment. This revision is now accepted and ready to land. LGTM. It may not be the end step and may be annoying to make further changes from this point, but on its own this isn't worse than before Comment at: llvm/lib/Target/AMDGPU/

[PATCH] D149716: clang: Use new frexp intrinsic for builtins and add f16 version

2023-05-02 Thread Matt Arsenault via Phabricator via cfe-commits
arsenm created this revision. arsenm added reviewers: yaxunl, jcranmer-intel, kpn, andrew.w.kaylor, sepavloff, tra. Herald added subscribers: kosarev, jdoerfert, pengfei, jvesely. Herald added a project: All. arsenm requested review of this revision. Herald added subscribers: llvm-commits, wdng. H

[PATCH] D149716: clang: Use new frexp intrinsic for builtins and add f16 version

2023-05-02 Thread Matt Arsenault via Phabricator via cfe-commits
arsenm updated this revision to Diff 518947. CHANGES SINCE LAST ACTION https://reviews.llvm.org/D149716/new/ https://reviews.llvm.org/D149716 Files: clang/include/clang/Basic/Builtins.def clang/lib/CodeGen/CGBuiltin.cpp clang/test/CodeGen/aix-builtin-mapping.c clang/test/CodeGen/builti

[PATCH] D151349: [HIP] emit macro `__HIP_NO_IMAGE_SUPPORT`

2023-06-05 Thread Matt Arsenault via Phabricator via cfe-commits
arsenm added a comment. In D151349#4381569, @yaxunl wrote: > In D151349#4381471, @arsenm wrote: > >> In D151349#4377792, @yaxunl wrote: >> >>> using ISA version to dete

[PATCH] D150427: [AMDGPU] Non hostcall printf support for HIP

2023-06-05 Thread Matt Arsenault via Phabricator via cfe-commits
arsenm added inline comments. Comment at: llvm/lib/Transforms/Utils/AMDGPUEmitPrintf.cpp:371 +// the offsets. +uint64_t DstAlign = (i == 0) ? 4 : 8; +Builder.CreateMemCpy(PtrToStore, /*DstAlign*/ Align(DstAlign), Args[i], I don't follow th

[PATCH] D150427: [AMDGPU] Non hostcall printf support for HIP

2023-06-05 Thread Matt Arsenault via Phabricator via cfe-commits
arsenm added inline comments. Comment at: clang/test/CodeGenHIP/printf_nonhostcall.cpp:137 + +__device__ float f1 = 3.14f; +__device__ double f2 = 2.71828; Also half Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D1

[PATCH] D150427: [AMDGPU] Non hostcall printf support for HIP

2023-06-05 Thread Matt Arsenault via Phabricator via cfe-commits
arsenm added a comment. Also should be noted in the release notes Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D150427/new/ https://reviews.llvm.org/D150427

[PATCH] D152251: [clang][CodeGen] Fix GPU-specific attributes being dropped by bitcode linking

2023-06-06 Thread Matt Arsenault via Phabricator via cfe-commits
arsenm added inline comments. Comment at: clang/test/CodeGenCUDA/Inputs/ocml-sample-target-attrs.cl:2 +__attribute__((target("gfx11-insts"))) +unsigned do_intrin_stuff(void) +{ Should really be ulong Comment at: clang/test/CodeGenCUDA/link-bui

[PATCH] D152251: [clang][CodeGen] Fix GPU-specific attributes being dropped by bitcode linking

2023-06-06 Thread Matt Arsenault via Phabricator via cfe-commits
arsenm accepted this revision. arsenm added inline comments. This revision is now accepted and ready to land. Comment at: clang/test/CodeGenCUDA/link-builtin-bitcode-gpu-attrs-preserved.cu:34 +// CHECK: define {{.*}} i32 @do_intrin_stuff() #[[ATTR:[0-9]+]] +// CHECK: attributes

[PATCH] D152226: [FunctionAttrs] Propagate some func/arg/ret attributes from caller to callsite (WIP)

2023-06-06 Thread Matt Arsenault via Phabricator via cfe-commits
arsenm added inline comments. Comment at: llvm/include/llvm/Transforms/Utils/InferCallsiteAttrs.h:1 +#ifndef LLVM_TRANSFORMS_UTILS_INFERCALLSITEATTRS_H +#define LLVM_TRANSFORMS_UTILS_INFERCALLSITEATTRS_H Missing license header Comment at: llvm/

[PATCH] D138473: clang/HIP: Inline frexp/frexpf implementations

2023-06-06 Thread Matt Arsenault via Phabricator via cfe-commits
arsenm added a comment. Herald added a subscriber: jplehr. After D149716 this can switch to use the new builtin, and thus restores the ye olde southern islands workaround CHANGES SINCE LAST ACTION https://reviews.llvm.org/D138473/new/ https://reviews.llvm.o

[PATCH] D138395: HIP: Directly call fmin/fmax builtins

2023-06-06 Thread Matt Arsenault via Phabricator via cfe-commits
arsenm added a comment. ping CHANGES SINCE LAST ACTION https://reviews.llvm.org/D138395/new/ https://reviews.llvm.org/D138395 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[PATCH] D138399: HIP: Directly call isinf builtins

2023-06-06 Thread Matt Arsenault via Phabricator via cfe-commits
arsenm added a comment. ping CHANGES SINCE LAST ACTION https://reviews.llvm.org/D138399/new/ https://reviews.llvm.org/D138399

[PATCH] D138396: HIP: Directly call signbit builtins

2023-06-06 Thread Matt Arsenault via Phabricator via cfe-commits
arsenm added a comment. ping CHANGES SINCE LAST ACTION https://reviews.llvm.org/D138396/new/ https://reviews.llvm.org/D138396

[PATCH] D138473: clang/HIP: Inline frexp/frexpf implementations

2023-06-06 Thread Matt Arsenault via Phabricator via cfe-commits
arsenm closed this revision. arsenm added a comment. e1fa30d005afa32be1a8d490e99652cd56440826 CHANGES SINCE LAST ACTION https://reviews.llvm.org/D138473/new/ https://reviews.llvm.org/D138473

[PATCH] D152312: HIP: Use frexp builtins in math headers

2023-06-06 Thread Matt Arsenault via Phabricator via cfe-commits
arsenm created this revision. arsenm added reviewers: yaxunl, JonChesterfield. Herald added a project: All. arsenm requested review of this revision. Herald added a subscriber: wdng. https://reviews.llvm.org/D152312 Files: clang/lib/Headers/__clang_hip_math.h clang/test/Headers/__clang_hip_ma

[PATCH] D142823: Intrinsics: Allow tablegen to mark parameters with dereferenceable

2023-06-06 Thread Matt Arsenault via Phabricator via cfe-commits
arsenm updated this revision to Diff 529074. arsenm added a comment. Split out amdgpu parts CHANGES SINCE LAST ACTION https://reviews.llvm.org/D142823/new/ https://reviews.llvm.org/D142823 Files: llvm/include/llvm/IR/Intrinsics.td llvm/test/TableGen/intrin-side-effects.td llvm/test/Tab

[PATCH] D152351: [clang] Add __builtin_isfpclass

2023-06-07 Thread Matt Arsenault via Phabricator via cfe-commits
arsenm added inline comments. Comment at: clang/test/CodeGen/isfpclass.c:1 +// RUN: %clang_cc1 -triple x86_64-linux-gnu -S -O1 -emit-llvm %s -o - | FileCheck %s + Use generated checks? Comment at: clang/test/CodeGen/isfpclass.c:2 +// RUN: %cla

[PATCH] D152351: [clang] Add __builtin_isfpclass

2023-06-07 Thread Matt Arsenault via Phabricator via cfe-commits
arsenm added a comment. Also should get mentioned in the builtin docs and release notes Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D152351/new/ https://reviews.llvm.org/D152351

[PATCH] D150427: [AMDGPU] Non hostcall printf support for HIP

2023-06-07 Thread Matt Arsenault via Phabricator via cfe-commits
arsenm added inline comments. Comment at: llvm/lib/Transforms/Utils/AMDGPUEmitPrintf.cpp:386-387 +} else { + auto IntTy = dyn_cast<IntegerType>(Args[i]->getType()); + if (IntTy && IntTy->getBitWidth() == 32) +WhatToStore.push_back( vikramRH wrote: > arse

[PATCH] D150427: [AMDGPU] Non hostcall printf support for HIP

2023-06-08 Thread Matt Arsenault via Phabricator via cfe-commits
arsenm added inline comments. Comment at: clang/docs/ReleaseNotes.rst:589 + ``hostcall`` - printing happens during kernel execution via series of hostcalls, + The scheme requires the system to support pcie atomics.(default) + ``buffered`` - Scheme uses a debug buffer to popul

[PATCH] D150427: [AMDGPU] Non hostcall printf support for HIP

2023-06-08 Thread Matt Arsenault via Phabricator via cfe-commits
arsenm added inline comments. Comment at: clang/test/CodeGenHIP/printf_nonhostcall.cpp:240 + return printf(s, 10); +} Test _BitInt for small and odd types, plus i128 Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D1

[PATCH] D138395: HIP: Directly call fmin/fmax builtins

2023-06-08 Thread Matt Arsenault via Phabricator via cfe-commits
arsenm closed this revision. arsenm added a comment. 66d4dbae02db2c0feb32ea844500f9758f078dd7 CHANGES SINCE LAST ACTION https://reviews.llvm.org/D138395/new/ https://reviews.llvm.org/D138395 ___ cfe-commits mailing list cfe-commits@lists.llvm.org h

[PATCH] D138396: HIP: Directly call signbit builtins

2023-06-08 Thread Matt Arsenault via Phabricator via cfe-commits
arsenm closed this revision. arsenm added a comment. b51ae6d31f86de2e47b1912eb045ea37a0c5de23 CHANGES SINCE LAST ACTION https://reviews.llvm.org/D138396/new/ https://reviews.llvm.org/D138396 __

[PATCH] D138399: HIP: Directly call isinf builtins

2023-06-08 Thread Matt Arsenault via Phabricator via cfe-commits
arsenm closed this revision. arsenm added a comment. d14ac1d11a5cb3994c63ecaaa5c950636a289fa1 CHANGES SINCE LAST ACTION https://reviews.llvm.org/D138399/new/ https://reviews.llvm.org/D138399 __

[PATCH] D138504: clang/HIP: Remove __llvm_amdgcn_* wrapper hacks

2023-06-08 Thread Matt Arsenault via Phabricator via cfe-commits
arsenm added a comment. ping CHANGES SINCE LAST ACTION https://reviews.llvm.org/D138504/new/ https://reviews.llvm.org/D138504

[PATCH] D150427: [AMDGPU] Non hostcall printf support for HIP

2023-06-09 Thread Matt Arsenault via Phabricator via cfe-commits
arsenm accepted this revision. arsenm added a comment. This revision is now accepted and ready to land. lgtm with nit Comment at: llvm/lib/Transforms/Utils/AMDGPUEmitPrintf.cpp:280 +} else { + auto AllocSize = M->getDataLayout().getTypeAllocSize(Args[i]->getType()); +

[PATCH] D152351: [clang] Add __builtin_isfpclass

2023-06-09 Thread Matt Arsenault via Phabricator via cfe-commits
arsenm added a comment. Typo builting in commit message Comment at: clang/docs/LanguageExtensions.rst:3418 + +This function never raises floating-point exceptions. + Maybe also mention it doesn't canonicalize its input Comment at: clang/test/
