[PATCH] D106891: [Remarks] Emit optimization remarks for atomics generating CAS loop

2021-08-13 Thread Matt Arsenault via Phabricator via cfe-commits
arsenm added inline comments. Comment at: llvm/lib/CodeGen/AtomicExpandPass.cpp:622 +return OptimizationRemark(DEBUG_TYPE, "Passed", AI->getFunction()) + << "A compare and swap loop was generated for an " + << AI->getOperationName(AI->getOperat

[PATCH] D106891: [Remarks] Emit optimization remarks for atomics generating CAS loop

2021-08-15 Thread Matt Arsenault via Phabricator via cfe-commits
arsenm added a comment. I don’t think constructing in the pass is the solution. Why exactly is this introducing such a big slowdown? Comment at: llvm/lib/CodeGen/AtomicExpandPass.cpp:180 + ORE = std::make_shared(&F); auto &TM = TPC->getTM(); There’s basi

[PATCH] D106891: [Remarks] Emit optimization remarks for atomics generating CAS loop

2021-08-16 Thread Matt Arsenault via Phabricator via cfe-commits
arsenm accepted this revision. arsenm added a comment. LGTM Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D106891/new/ https://reviews.llvm.org/D106891 ___ cfe-commits mailing list cfe-commits@lists.llvm.

[PATCH] D104946: [AMDGPU] Add builtin functions image_bvh_intersect_ray

2021-06-25 Thread Matt Arsenault via Phabricator via cfe-commits
arsenm added inline comments. Comment at: clang/include/clang/Basic/BuiltinsAMDGPU.def:221-224 +TARGET_BUILTIN(__builtin_amdgcn_image_bvh_intersect_ray, "V4UiUifV4fV4fV4fV4Ui", "nc", "gfx10-insts") +TARGET_BUILTIN(__builtin_amdgcn_image_bvh_intersect_ray_h, "V4UiUifV4fV4hV4hV4U

[PATCH] D112041: [InferAddressSpaces] Support assumed addrspaces from addrspace predicates.

2021-11-03 Thread Matt Arsenault via Phabricator via cfe-commits
arsenm added inline comments. Comment at: llvm/lib/Transforms/Scalar/InferAddressSpaces.cpp:240 + Optional getPredicatedAddrSpace(const Value &V, Value *Opnd) const; + The pass is already using UninitializedAddressSpace as a sentinel value; just use that inst

[PATCH] D80804: [AMDGPU] Introduce Clang builtins to be mapped to AMDGCN atomic inc/dec intrinsics

2021-11-08 Thread Matt Arsenault via Phabricator via cfe-commits
arsenm added inline comments. Comment at: clang/lib/CodeGen/CGBuiltin.cpp:14563 -if (isa(Order)) { - int ord = cast(Order)->getZExtValue(); +switch (BuiltinID) { +case AMDGPU::BI__builtin_amdgcn_atomic_inc32: RKSimon wrote: > @saiislam @arsenm

[PATCH] D55067: [HIP] Fix offset of kernel argument for AMDGPU target

2021-11-09 Thread Matt Arsenault via Phabricator via cfe-commits
arsenm requested changes to this revision. arsenm added a comment. This revision now requires changes to proceed. Is this still relevant? We want to move towards consistently using byref for kernel arguments anyway

[PATCH] D113538: OpenMP: Start calling setTargetAttributes for generated kernels

2021-11-09 Thread Matt Arsenault via Phabricator via cfe-commits
arsenm created this revision. arsenm added reviewers: jdoerfert, JonChesterfield, ronlieb, gregrodgers. Herald added subscribers: guansong, yaxunl, jvesely. arsenm requested review of this revision. Herald added subscribers: sstefan1, wdng. This wasn't setting any of the attributes the target woul

[PATCH] D113538: OpenMP: Start calling setTargetAttributes for generated kernels

2021-11-10 Thread Matt Arsenault via Phabricator via cfe-commits
arsenm added a comment. In D113538#3121062, @JonChesterfield wrote: > That seems important. What was the symptom of failing to set these? We may > now be redundantly setting some, e.g. > I think convergent is set somewhere else before this patch. A b

[PATCH] D113538: OpenMP: Start calling setTargetAttributes for generated kernels

2021-11-10 Thread Matt Arsenault via Phabricator via cfe-commits
arsenm updated this revision to Diff 386137. arsenm added a comment. Also test non-kernel CHANGES SINCE LAST ACTION https://reviews.llvm.org/D113538/new/ https://reviews.llvm.org/D113538 Files: clang/lib/CodeGen/CGOpenMPRuntime.cpp clang/lib/CodeGen/TargetInfo.cpp clang/test/OpenMP/amd

[PATCH] D98146: OpaquePtr: Turn inalloca into a type attribute

2021-03-07 Thread Matt Arsenault via Phabricator via cfe-commits
arsenm added inline comments. Comment at: llvm/lib/IR/Attributes.cpp:490 if (Type *Ty = getValueAsType()) { - raw_string_ostream OS(Result); + // FIXME: This should never be null Result += '('; dblaikie wrote: > Is it? Could you replace this

[PATCH] D110257: [CFE][Codegen] Make sure to maintain the contiguity of all the static allocas

2021-11-16 Thread Matt Arsenault via Phabricator via cfe-commits
arsenm added a comment. In D110257#3133866, @hsmhsm wrote: > This is not something specific to AMDGPU backend, but AMDGPU backend at > present requires this canonical form. I must emphasize this is not a hard requirement, just a nice to have

[PATCH] D110257: [CFE][Codegen] Make sure to maintain the contiguity of all the static allocas

2021-11-16 Thread Matt Arsenault via Phabricator via cfe-commits
arsenm added a comment. In D110257#3134001, @JonChesterfield wrote: > So you won't articulate or document the new invariant and you think there's a > llvm-dev discussion that says we can't verify the invariant which you won't > reference, but means yo

[PATCH] D113538: OpenMP: Start calling setTargetAttributes for generated kernels

2021-11-16 Thread Matt Arsenault via Phabricator via cfe-commits
arsenm added a comment. ping

[PATCH] D113538: OpenMP: Start calling setTargetAttributes for generated kernels

2021-11-18 Thread Matt Arsenault via Phabricator via cfe-commits
arsenm added inline comments. Comment at: clang/lib/CodeGen/TargetInfo.cpp:9288 + + const bool IsOpenMP = M.getLangOpts().OpenMP && !FD; + if ((IsOpenCLKernel || IsHIPKernel || IsOpenMP) && JonChesterfield wrote: > JonChesterfield wrote: > > JonChesterfield wro

[PATCH] D114533: LLVM IR should allow bitcast between address spaces with the same size.

2021-11-24 Thread Matt Arsenault via Phabricator via cfe-commits
arsenm added a comment. Patch description should include this avoids a need to introduce ptrtoint/inttoptr pairs

[PATCH] D114533: LLVM IR should allow bitcast between address spaces with the same size.

2021-11-24 Thread Matt Arsenault via Phabricator via cfe-commits
arsenm added a comment. In D114533#3151482, @lebedev.ri wrote: > In D114533#3151423, @arsenm wrote: > >> Patch description should include this avoids a need to introduce >> ptrtoint/inttoptr pairs > > That is

[PATCH] D114533: LLVM IR should allow bitcast between address spaces with the same size.

2021-11-25 Thread Matt Arsenault via Phabricator via cfe-commits
arsenm added a comment. In D114533#3153924, @jrtc27 wrote: > This seems like it should not apply to non-integral address spaces? No, it shouldn’t care about the individual address spaces. It’s a reinterpret of the bits, the type meanings don’t matter
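As a loose C++ analogy for "a reinterpret of the bits" (not LLVM IR, and not code from the patch), the sketch below copies the bytes of one 32-bit type into another of the same size without changing any bit. The helper names `bitsOf` and `floatFrom` are hypothetical, and the example assumes `float` is a 32-bit IEEE-754 value.

```cpp
#include <cstdint>
#include <cstring>

// Hypothetical helper: view the 32 bits of a float as an unsigned integer.
// No bit changes; only the type used to interpret the bits differs.
std::uint32_t bitsOf(float F) {
  static_assert(sizeof(std::uint32_t) == sizeof(float), "sizes must match");
  std::uint32_t U;
  std::memcpy(&U, &F, sizeof U);
  return U;
}

// Hypothetical helper: the inverse direction, again a pure byte copy.
float floatFrom(std::uint32_t U) {
  static_assert(sizeof(std::uint32_t) == sizeof(float), "sizes must match");
  float F;
  std::memcpy(&F, &U, sizeof F);
  return F;
}
```

Round-tripping a value through both helpers returns the original, which is the property a same-size reinterpretation is expected to have.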

[PATCH] D114865: [AMDGPU][OpenMP] Use -amdgpu-fixed-function-abi

2021-12-01 Thread Matt Arsenault via Phabricator via cfe-commits
arsenm requested changes to this revision. arsenm added a comment. This revision now requires changes to proceed. I have a patch to enable this by default and I do not want to spread uses of this flag around

[PATCH] D114957: [AMDGPU] Change llvm.amdgcn.image.bvh.intersect.ray to take vec3 args

2021-12-02 Thread Matt Arsenault via Phabricator via cfe-commits
arsenm added a comment. In D114957#3166858, @yaxunl wrote: > In D114957#3166817, @foad wrote: > >> This is a flag-day change to the signatures of the LLVM intrinsics and the >> OpenCL builtins. Is that OK? > >

[PATCH] D114957: [AMDGPU] Change llvm.amdgcn.image.bvh.intersect.ray to take vec3 args

2021-12-02 Thread Matt Arsenault via Phabricator via cfe-commits
arsenm added a comment. I think this macro is purely terrible and should not be added (and at least should be all caps?). If we can't just hard break users, I would rather just leave the builtin signatures broken

[PATCH] D114957: [AMDGPU] Change llvm.amdgcn.image.bvh.intersect.ray to take vec3 args

2021-12-02 Thread Matt Arsenault via Phabricator via cfe-commits
arsenm added a comment. In D114957#3167700, @arsenm wrote: > I think this macro is purely terrible and should not be added (and at least > should be all caps?). If we can't just hard break users, I would rather just > leave the builtin signatures broke

[PATCH] D113538: OpenMP: Start calling setTargetAttributes for generated kernels

2021-12-02 Thread Matt Arsenault via Phabricator via cfe-commits
arsenm closed this revision. arsenm added a comment. Committed as 6c27d389c8a00040aad998fe959f38ba709a8750, recommitted as 2f0a5714184cca9325004506a22a2a3193c825aa

[PATCH] D115153: clang/AMDGPU: Don't set implicit arg attribute to default size

2021-12-06 Thread Matt Arsenault via Phabricator via cfe-commits
arsenm created this revision. arsenm added reviewers: yaxunl, JonChesterfield. Herald added subscribers: jdoerfert, kerbowa, t-tye, tpr, dstuttard, nhaehnle, jvesely, kzhuravl. arsenm requested review of this revision. Herald added subscribers: sstefan1, wdng. Herald added a reviewer: jdoerfert.

[PATCH] D136145: [IR][RFC] Restrict read only when cache type of llvm.prefetch is instruction

2022-10-19 Thread Matt Arsenault via Phabricator via cfe-commits
arsenm added inline comments. Comment at: llvm/lib/IR/Verifier.cpp:5180 +int RW = cast(Call.getArgOperand(1))->getZExtValue(); +int Locality = cast(Call.getArgOperand(2))->getZExtValue(); +int Data = cast(Call.getArgOperand(3))->getZExtValue(); Should

[PATCH] D136959: clang: Improve errors for DiagnosticInfoResourceLimit

2022-10-28 Thread Matt Arsenault via Phabricator via cfe-commits
arsenm created this revision. arsenm added reviewers: nickdesaulniers, yaxunl, aaron.ballman, qcolombet, aeubanks, olista01, dnovillo, echristo, MaskRay. Herald added subscribers: kosarev, StephenFan, tpr. Herald added a project: All. arsenm requested review of this revision. Herald added a subscr

[PATCH] D136959: clang: Improve errors for DiagnosticInfoResourceLimit

2022-10-28 Thread Matt Arsenault via Phabricator via cfe-commits
arsenm closed this revision. arsenm added a comment. 0ebd4638af1f71788ca55f521ed8e1ed8cab518d

[PATCH] D130096: [Clang][AMDGPU] Emit AMDGPU library control constants in clang

2022-09-16 Thread Matt Arsenault via Phabricator via cfe-commits
arsenm added inline comments. Comment at: clang/lib/CodeGen/TargetInfo.cpp:9468 + // Control constants for math operations. + AddGlobal("__oclc_wavefrontsize64", Wavefront64, /*Size=*/8); + AddGlobal("__oclc_daz_opt", DenormAreZero, /*Size=*/8); yaxunl wrote:

[PATCH] D82087: AMDGPU/clang: Add builtins for llvm.amdgcn.ballot

2022-09-20 Thread Matt Arsenault via Phabricator via cfe-commits
arsenm added a comment. In D82087#3797883, @jdoerfert wrote: > Can we land this? I'd like to use the new intrinsics as I don't understand > the old ones. What do you think about using the two separate builtins, vs. one magic builtin that auto-changes t

[PATCH] D134355: [AMDGPU] Emit module flag for all code object versions

2022-09-21 Thread Matt Arsenault via Phabricator via cfe-commits
arsenm added inline comments. Comment at: clang/lib/CodeGen/CodeGenModule.cpp:590-591 // times 100. -// ToDo: Enable module flag for all code object version when ROCm device -// library is ready. -if (getTarget().getTargetOpts().CodeObjectVersion == TargetOptions

[PATCH] D130096: [Clang][AMDGPU] Emit AMDGPU library control constants in clang

2022-09-26 Thread Matt Arsenault via Phabricator via cfe-commits
arsenm added a comment. In D130096#3815529, @jhuber6 wrote: > The best solution would be to handle these per-TU variables in the backend. > Or maybe even all of these could be placed in the backend where the code > paths that currently require a contro

[PATCH] D129298: Add denormal-fp-math attribute for f16

2022-09-28 Thread Matt Arsenault via Phabricator via cfe-commits
arsenm added a comment. Reverse ping

[PATCH] D128907: [Clang] Disable noundef attribute for languages which allow uninitialized function arguments

2022-09-28 Thread Matt Arsenault via Phabricator via cfe-commits
arsenm added a comment. Can this be abandoned now?

[PATCH] D88976: [clang] Use correct address space for global variable debug info

2022-09-28 Thread Matt Arsenault via Phabricator via cfe-commits
arsenm added a comment. Herald added subscribers: mattd, gchakrabarti, asavonic. Herald added a project: All. ping?

[PATCH] D134872: AMDGPU: Add __builtin_amdgcn_permlane64

2022-09-29 Thread Matt Arsenault via Phabricator via cfe-commits
arsenm created this revision. arsenm added reviewers: yaxunl, AMDGPU. Herald added subscribers: kosarev, kerbowa, t-tye, tpr, dstuttard, jvesely, kzhuravl. Herald added a project: All. arsenm requested review of this revision. Herald added subscribers: llvm-commits, wdng. Herald added a project: L

[PATCH] D135155: [AMDGPU] Annotate the intrinsics to be default and nocallback

2022-10-04 Thread Matt Arsenault via Phabricator via cfe-commits
arsenm added inline comments. Comment at: llvm/include/llvm/IR/IntrinsicsAMDGPU.td:1581 ClangBuiltin<"__builtin_amdgcn_ds_swizzle">, - Intrinsic<[llvm_i32_ty], [llvm_i32_ty, llvm_i32_ty], -[IntrNoMem, IntrConvergent, IntrWillReturn, + DefaultAttrsIntrinsic<[llvm_

[PATCH] D135155: [AMDGPU] Annotate the intrinsics to be default and nocallback

2022-10-04 Thread Matt Arsenault via Phabricator via cfe-commits
arsenm accepted this revision. arsenm added inline comments. This revision is now accepted and ready to land. Comment at: llvm/include/llvm/IR/IntrinsicsAMDGPU.td:1226 def int_amdgcn_raw_tbuffer_store : Intrinsic < [], Default Comment a

[PATCH] D135374: [OpenMP][AMDGPU] Add 'uniform-work-group' attribute to OpenMP kernels

2022-10-06 Thread Matt Arsenault via Phabricator via cfe-commits
arsenm added inline comments. Comment at: clang/lib/CodeGen/TargetInfo.cpp:9424-9431 const bool IsHIPKernel = M.getLangOpts().HIP && FD && FD->hasAttr(); + const bool IsOpenMPkernel = + M.getLangOpts().OpenMPIsDevice && + (F->getCallingConv() == llvm::Calling

[PATCH] D135374: [OpenMP][AMDGPU] Add 'uniform-work-group' attribute to OpenMP kernels

2022-10-06 Thread Matt Arsenault via Phabricator via cfe-commits
arsenm added inline comments. Comment at: clang/lib/CodeGen/TargetInfo.cpp:9424-9431 const bool IsHIPKernel = M.getLangOpts().HIP && FD && FD->hasAttr(); + const bool IsOpenMPkernel = + M.getLangOpts().OpenMPIsDevice && + (F->getCallingConv() == llvm::Calling

[PATCH] D135374: [OpenMP][AMDGPU] Add 'uniform-work-group' attribute to OpenMP kernels

2022-10-06 Thread Matt Arsenault via Phabricator via cfe-commits
arsenm added inline comments. Comment at: clang/lib/CodeGen/TargetInfo.cpp:9424-9431 const bool IsHIPKernel = M.getLangOpts().HIP && FD && FD->hasAttr(); + const bool IsOpenMPkernel = + M.getLangOpts().OpenMPIsDevice && + (F->getCallingConv() == llvm::Calling

[PATCH] D135614: [OpenMP][CUDA][AMDGPU] Accept case insensitive subarchitecture names

2022-10-11 Thread Matt Arsenault via Phabricator via cfe-commits
arsenm added a comment. I don't really have an opinion here. I'd probably lean towards a "did you mean" kind of warning

[PATCH] D130096: [Clang][AMDGPU] Emit AMDGPU library control constants in clang

2022-10-11 Thread Matt Arsenault via Phabricator via cfe-commits
arsenm added a comment. In D130096#3850472, @jhuber6 wrote: > I don't like the fact that we need to have two different kinds of control > constants, one per-TU and others per-link job. I'm wondering how difficult it > would be to make the fast versions

[PATCH] D130096: [Clang][AMDGPU] Emit AMDGPU library control constants in clang

2022-10-11 Thread Matt Arsenault via Phabricator via cfe-commits
arsenm added a comment. In D130096#3850550, @b-sumner wrote: > There's the "small matter" of implementing the new device library functions. > Why is all that more likeable than two kinds of control constants? Different functions providing different be

[PATCH] D130096: [Clang][AMDGPU] Emit AMDGPU library control constants in clang

2022-10-11 Thread Matt Arsenault via Phabricator via cfe-commits
arsenm added a comment. In D130096#3850708, @b-sumner wrote: >> Different functions providing different behaviors can be handled at link >> time like any other function, instead of the same functions providing >> different behaviors per translation uni

[PATCH] D135551: [clang] replace `assert(0)` with `llvm_unreachable` NFC

2022-10-12 Thread Matt Arsenault via Phabricator via cfe-commits
arsenm added inline comments. Comment at: clang/lib/AST/Interp/ByteCodeExprGen.cpp:598 } else { -assert(false && "Unhandled type in array initializer initlist"); +llvm_unreachable("Unhandled type in array initializer initlist"); } aa
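A rough, LLVM-free sketch of the idea behind the quoted change: `assert(false && "...")` compiles to nothing when `NDEBUG` is defined, so control can silently fall through, whereas a dedicated unreachable marker tells the compiler the path never continues. `markUnreachable`, `ElemKind`, and `elementSize` are hypothetical stand-ins for illustration only; the real `llvm_unreachable` may behave differently depending on build configuration.

```cpp
#include <cstdio>
#include <cstdlib>

// Hypothetical stand-in for an "unreachable" marker: it always reports and
// aborts, and is [[noreturn]] so the compiler knows control never continues.
[[noreturn]] void markUnreachable(const char *Msg) {
  std::fprintf(stderr, "UNREACHABLE executed: %s\n", Msg);
  std::abort();
}

enum class ElemKind { Int, Float };

// With assert(false && "..."), an NDEBUG build would fall off the end of this
// non-void function; the [[noreturn]] call makes every path return or abort.
int elementSize(ElemKind K) {
  switch (K) {
  case ElemKind::Int:
    return 4;
  case ElemKind::Float:
    return 8;
  }
  markUnreachable("Unhandled type in array initializer initlist");
}
```

Because the marker is declared `[[noreturn]]`, compilers generally do not warn about a missing return after the `switch`, which is one practical reason to prefer it over a bare assertion.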

[PATCH] D134872: AMDGPU: Add __builtin_amdgcn_permlane64

2022-10-13 Thread Matt Arsenault via Phabricator via cfe-commits
arsenm closed this revision. arsenm added a comment. f59f116bd5c357b1cb4b04693c88d41484e168d5

[PATCH] D140294: clang: Replace implementation of __builtin_isnormal

2022-12-19 Thread Matt Arsenault via Phabricator via cfe-commits
arsenm created this revision. arsenm added reviewers: bkramer, foad, sepavloff, andrew.w.kaylor, kpn. Herald added a project: All. arsenm requested review of this revision. Herald added a subscriber: wdng. This was doing an explicit non-canonical isnan check, then two unordered comparisons. We can

[PATCH] D139629: clang: Stop emitting "strictfp"

2022-12-19 Thread Matt Arsenault via Phabricator via cfe-commits
arsenm added inline comments. Comment at: clang/lib/CodeGen/CodeGenModule.cpp:2135 -llvm::AttrBuilder FuncAttrs(F->getContext()); -FuncAttrs.addAttribute("strictfp"); -F->addFnAttrs(FuncAttrs); kpn wrote: > arsenm wrote: > > andrew.w.kaylor wrote: > >

[PATCH] D139564: clang: Don't emit "frame-pointer"="none"

2022-12-19 Thread Matt Arsenault via Phabricator via cfe-commits
arsenm updated this revision to Diff 483964. arsenm added a comment. Add unreachable CHANGES SINCE LAST ACTION https://reviews.llvm.org/D139564/new/ https://reviews.llvm.org/D139564 Files: clang/include/clang/Basic/CodeGenOptions.h clang/lib/CodeGen/CGCall.cpp clang/test/CodeGen/libcal

[PATCH] D138392: clang/HIP: Fix broken implementations of __make_mantissa* functions

2022-12-19 Thread Matt Arsenault via Phabricator via cfe-commits
arsenm requested review of this revision. arsenm added a comment. ping

[PATCH] D112932: Use llvm.is_fpclass to implement FP classification functions

2022-12-19 Thread Matt Arsenault via Phabricator via cfe-commits
arsenm added a comment. This will produce worse codegen in many situations. I'm working on a stack of patches to convert is.fpclass back to fcmp when legal

[PATCH] D138393: HIP: Directly call fabs builtins

2022-12-19 Thread Matt Arsenault via Phabricator via cfe-commits
arsenm closed this revision. arsenm added a comment. 81616561c1b376af85365c9eaf94d49ad184c623

[PATCH] D127221: [Clang] Enable -print-pipeline-passes in clang.

2022-12-19 Thread Matt Arsenault via Phabricator via cfe-commits
arsenm accepted this revision. arsenm added a comment. This revision is now accepted and ready to land. Herald added a subscriber: wdng. I have no idea how you're supposed to work with clang without this

[PATCH] D140294: clang: Replace implementation of __builtin_isnormal

2022-12-20 Thread Matt Arsenault via Phabricator via cfe-commits
arsenm added a comment. In D140294#4007073, @sepavloff wrote: > This change can have negative consequences in some cases. Some targets have > dedicated instruction to test FP class and often this instruction is faster > than arithmetic operations. Repl

[PATCH] D138868: AMDGPU/clang: Remove target features from address space test builtins

2022-12-20 Thread Matt Arsenault via Phabricator via cfe-commits
arsenm added a comment. ping

[PATCH] D140467: [X86][Reduce] Preserve fast math flags when change it. NFCI

2022-12-21 Thread Matt Arsenault via Phabricator via cfe-commits
arsenm added a comment. Needs tests. I couldn’t find any for the base builtins either

[PATCH] D140467: [X86][Reduce] Preserve fast math flags when change it. NFCI

2022-12-21 Thread Matt Arsenault via Phabricator via cfe-commits
arsenm requested changes to this revision. arsenm added a comment. This revision now requires changes to proceed. In D140467#4010378, @pengfei wrote: > Use FastMathFlagGuard instead, thanks @foad! > > In D140467#4010296

[PATCH] D140467: [X86][Reduce] Preserve fast math flags when change it. NFCI

2022-12-21 Thread Matt Arsenault via Phabricator via cfe-commits
arsenm added a comment. In D140467#4010507, @arsenm wrote: > In D140467#4010378, @pengfei wrote: > >> Use FastMathFlagGuard instead, thanks @foad! >> >> In D140467#4010296

[PATCH] D140467: [X86][Reduce] Preserve fast math flags when change it. NFCI

2022-12-21 Thread Matt Arsenault via Phabricator via cfe-commits
arsenm added a comment. In D140467#4010675, @pengfei wrote: > As I have explained, users are not suggested to use these builtins given we > have provided the more stable, well documented corresponding intrinsics. The > only case user has to use it is t

[PATCH] D139701: [Clang] Emit "min-legal-vector-width" attribute for X86 only

2022-12-21 Thread Matt Arsenault via Phabricator via cfe-commits
arsenm added inline comments. Comment at: llvm/docs/LangRef.rst:2235-2241 -``"min-legal-vector-width"=""`` -This attribute indicates the minimum legal vector width required by the -calling convension. It is the maximum width of vector arguments and -returnings in the

[PATCH] D140433: [Clang] Add `nvptx-arch` tool to query installed NVIDIA GPUs

2022-12-21 Thread Matt Arsenault via Phabricator via cfe-commits
arsenm added inline comments. Comment at: clang/tools/nvptx-arch/NVPTXArch.cpp:37 +return 1; + printf("CUDA error: %s\n", ErrStr); + return 1; stderr?
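The "stderr?" remark suggests sending the error text to the standard error stream instead of stdout, so a tool's regular output is not mixed with diagnostics. A minimal sketch of that idea follows; `reportError` is a hypothetical helper rather than code from the patch, and the message text simply mirrors the quoted line.

```cpp
#include <cstdio>

// Hypothetical helper (not from the patch): print a diagnostic to stderr and
// return a non-zero status, keeping stdout free for the tool's real output.
int reportError(const char *ErrStr) {
  std::fprintf(stderr, "CUDA error: %s\n", ErrStr);
  return 1;
}
```

A caller could then write `return reportError(ErrStr);` at each failure site, which keeps the exit status and the output stream consistent.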

[PATCH] D139640: clang: Add __builtin_elementwise canonicalize and copysign

2022-12-21 Thread Matt Arsenault via Phabricator via cfe-commits
arsenm closed this revision. arsenm added a comment. 437346abe18ec4fc982ae36f6821487dafc1a06e Comment at: clang/docs/LanguageExtensions.rst:644 magnitude than x

[PATCH] D138392: clang/HIP: Fix broken implementations of __make_mantissa* functions

2022-12-22 Thread Matt Arsenault via Phabricator via cfe-commits
arsenm closed this revision. arsenm added a comment. 4086ea331cad827d74542e52a86b7d7933376e7b

[PATCH] D138394: HIP: Directly call fma builtins

2022-12-22 Thread Matt Arsenault via Phabricator via cfe-commits
arsenm added a comment. ping

[PATCH] D140467: [X86][Reduce] Preserve fast math flags when change it. NFCI

2022-12-22 Thread Matt Arsenault via Phabricator via cfe-commits
arsenm added a comment. In D140467#4013107, @pengfei wrote: > Add test case to check FastMathFlagGuard works. > > Not to mention above cases. So it doesn't sound feasible to me. Testing is always feasible. You could even just generate all the combinatio

[PATCH] D140467: [X86][Reduce] Preserve fast math flags when change it. NFCI

2022-12-22 Thread Matt Arsenault via Phabricator via cfe-commits
arsenm added inline comments. Comment at: clang/test/CodeGen/builtins-x86-reduce.c:8 +} + +// CHECK: fadd pengfei wrote: > arsenm wrote: > > Should test the builtins from both sets > Do you mean this? Almost. You added the guard to 4 switch cases, so I would expe

[PATCH] D82087: AMDGPU/clang: Add builtins for llvm.amdgcn.ballot

2022-12-22 Thread Matt Arsenault via Phabricator via cfe-commits
arsenm updated this revision to Diff 484980. arsenm added a comment. Rebase. Use _w32/_w64 suffixes since some other wave specific builtins seem to have gone with that convention CHANGES SINCE LAST ACTION https://reviews.llvm.org/D82087/new/ https://reviews.llvm.org/D82087 Files: clang/in

[PATCH] D82087: AMDGPU/clang: Add builtins for llvm.amdgcn.ballot

2022-12-23 Thread Matt Arsenault via Phabricator via cfe-commits
arsenm added inline comments. Comment at: clang/test/CodeGenOpenCL/amdgpu-features.cl:7 +// RUN: %clang_cc1 -triple amdgcn -S -emit-llvm -o - %s | FileCheck --check-prefix=NOCPU %s +// RUN: %clang_cc1 -triple amdgcn -target-feature +wavefrontsize32 -S -emit-llvm -o - %s | FileC

[PATCH] D82087: AMDGPU/clang: Add builtins for llvm.amdgcn.ballot

2022-12-23 Thread Matt Arsenault via Phabricator via cfe-commits
arsenm planned changes to this revision. arsenm added a comment. This doesn't work correctly for unspecified wavesize for non-wave32 targets

[PATCH] D82087: AMDGPU/clang: Add builtins for llvm.amdgcn.ballot

2022-12-23 Thread Matt Arsenault via Phabricator via cfe-commits
arsenm updated this revision to Diff 485110. arsenm added a comment. Fix unknown target handling, diagnose some more of the errors CHANGES SINCE LAST ACTION https://reviews.llvm.org/D82087/new/ https://reviews.llvm.org/D82087 Files: clang/include/clang/Basic/BuiltinsAMDGPU.def clang/lib/

[PATCH] D140639: clang: Fix handling of __builtin_elementwise_copysign

2022-12-23 Thread Matt Arsenault via Phabricator via cfe-commits
arsenm created this revision. arsenm added a reviewer: fhahn. Herald added a project: All. arsenm requested review of this revision. Herald added a subscriber: wdng. I realized the handling of copysign made no sense at all. Only the type of the first operand should really matter, and it shou

[PATCH] D114865: [AMDGPU][OpenMP] Use -amdgpu-fixed-function-abi

2022-12-28 Thread Matt Arsenault via Phabricator via cfe-commits
arsenm resigned from this revision. arsenm added a comment. This revision is now accepted and ready to land. Herald added subscribers: kosarev, MaskRay. Herald added a project: All. This should be abandoned, the flag is gone

[PATCH] D138868: AMDGPU/clang: Remove target features from address space test builtins

2022-12-29 Thread Matt Arsenault via Phabricator via cfe-commits
arsenm added inline comments. Comment at: clang/test/CodeGenOpenCL/builtins-amdgcn-flat-address-space.cl:8 +// be initialized to something useful. The proper way to diagnose invalid flat +// usage is to forbid flat pointers on unsupported targets. + Joe_Nash wrot

[PATCH] D138868: AMDGPU/clang: Remove target features from address space test builtins

2022-12-29 Thread Matt Arsenault via Phabricator via cfe-commits
arsenm added inline comments. Comment at: clang/test/CodeGenOpenCL/builtins-amdgcn-flat-address-space.cl:8 +// be initialized to something useful. The proper way to diagnose invalid flat +// usage is to forbid flat pointers on unsupported targets. + Joe_Nash wrot

[PATCH] D82087: AMDGPU/clang: Add builtins for llvm.amdgcn.ballot

2022-12-29 Thread Matt Arsenault via Phabricator via cfe-commits
arsenm closed this revision. arsenm added a comment. f4bcd7f598331457cfe74e459b489d4098369511

[PATCH] D138868: AMDGPU/clang: Remove target features from address space test builtins

2022-12-29 Thread Matt Arsenault via Phabricator via cfe-commits
arsenm closed this revision. arsenm added a comment. e630d9b299822810bba8f3d0457004d1b4c39bef

[PATCH] D138870: clang/AMDGPU: Remove flat-address-space from feature map

2022-12-29 Thread Matt Arsenault via Phabricator via cfe-commits
arsenm updated this revision to Diff 485650. arsenm added a comment. Rebase CHANGES SINCE LAST ACTION https://reviews.llvm.org/D138870/new/ https://reviews.llvm.org/D138870 Files: clang/lib/Basic/Targets/AMDGPU.cpp clang/test/CodeGenOpenCL/amdgpu-features.cl clang/test/OpenMP/amdgcn-at

[PATCH] D138870: clang/AMDGPU: Remove flat-address-space from feature map

2022-12-30 Thread Matt Arsenault via Phabricator via cfe-commits
arsenm added a comment. In D138870#4020204, @Joe_Nash wrote: > The code looks fine, but as you say, the change visible in user code and > could break something. Do you want to handle that somehow? Maybe wait for > @b-sumner OpenMP assumes flat pointer

[PATCH] D139564: clang: Don't emit "frame-pointer"="none"

2023-01-03 Thread Matt Arsenault via Phabricator via cfe-commits
arsenm added a comment. ping

[PATCH] D139564: clang: Don't emit "frame-pointer"="none"

2023-01-03 Thread Matt Arsenault via Phabricator via cfe-commits
arsenm closed this revision. arsenm added a comment. ce6ae0b2a26b1ec2f770b2b9474cc4486d60c586

[PATCH] D140992: clang: Add __builtin_elementsize_fma

2023-01-04 Thread Matt Arsenault via Phabricator via cfe-commits
arsenm created this revision. arsenm added reviewers: fhahn, junaire, bob80905, python3kgae, RKSimon, aaron.ballman, erichkeane, scanon. Herald added a project: All. arsenm requested review of this revision. Herald added a subscriber: wdng. I didn't understand why the other builtins have promotio

[PATCH] D140992: clang: Add __builtin_elementwise_fma

2023-01-04 Thread Matt Arsenault via Phabricator via cfe-commits
arsenm added inline comments. Comment at: clang/lib/Sema/SemaChecking.cpp:2615 QualType ArgTy = TheCall->getArg(0)->getType(); -QualType EltTy = ArgTy; - -if (auto *VecTy = EltTy->getAs<VectorType>()) - EltTy = VecTy->getElementType(); -if (!EltTy->isFloatingType()) { -

[PATCH] D141008: [Clang][SPIR-V] Emit target extension types for OpenCL types on SPIR-V.

2023-01-04 Thread Matt Arsenault via Phabricator via cfe-commits
arsenm added inline comments. Comment at: clang/test/CodeGenOpenCL/cast_image.cl:2 // RUN: %clang_cc1 -no-opaque-pointers -emit-llvm -o - -triple amdgcn--amdhsa %s | FileCheck --check-prefix=AMDGCN %s -// RUN: %clang_cc1 -no-opaque-pointers -emit-llvm -o - -triple spir-unknown

[PATCH] D141013: amdgpu-arch: Prefer hsa/hsa.h over hsa.h

2023-01-04 Thread Matt Arsenault via Phabricator via cfe-commits
arsenm created this revision. arsenm added reviewers: yaxunl, JonChesterfield. Herald added subscribers: kosarev, kerbowa, tpr, dstuttard, jvesely, kzhuravl. Herald added a project: All. arsenm requested review of this revision. Herald added a subscriber: wdng. The header now prints a warning if y

[PATCH] D141013: amdgpu-arch: Prefer hsa/hsa.h over hsa.h

2023-01-04 Thread Matt Arsenault via Phabricator via cfe-commits
arsenm closed this revision. arsenm added a comment. 39a83ebd47fb56da5c1bd9dd976b70d551656cab CHANGES SINCE LAST ACTION https://reviews.llvm.org/D141013/new/ https://reviews.llvm.org/D141013

[PATCH] D138870: clang/AMDGPU: Remove flat-address-space from feature map

2023-01-05 Thread Matt Arsenault via Phabricator via cfe-commits
arsenm updated this revision to Diff 486561. arsenm added a comment. Rebase CHANGES SINCE LAST ACTION https://reviews.llvm.org/D138870/new/ https://reviews.llvm.org/D138870 Files: clang/lib/Basic/Targets/AMDGPU.cpp clang/test/CodeGenOpenCL/amdgpu-features.cl clang/test/OpenMP/amdgcn-at

[PATCH] D140639: clang: Fix handling of __builtin_elementwise_copysign

2023-01-05 Thread Matt Arsenault via Phabricator via cfe-commits
arsenm added a comment. ping CHANGES SINCE LAST ACTION https://reviews.llvm.org/D140639/new/ https://reviews.llvm.org/D140639

[PATCH] D140639: clang: Fix handling of __builtin_elementwise_copysign

2023-01-05 Thread Matt Arsenault via Phabricator via cfe-commits
arsenm added a comment. In D140639#4028883, @erichkeane wrote: > 1 nit, and 1 trying to see what is going on. I don't have a good feeling > what the purpose of this builtin is, The point of every builtin is direct access to llvm intrinsics, in this c

[PATCH] D140639: clang: Fix handling of __builtin_elementwise_copysign

2023-01-05 Thread Matt Arsenault via Phabricator via cfe-commits
arsenm added inline comments. Comment at: clang/lib/Sema/SemaChecking.cpp:2674 + +if (MagnitudeTy.getCanonicalType() != SignTy.getCanonicalType()) { + return Diag(Sign.get()->getBeginLoc(), erichkeane wrote: > curleys not used for single-statement if-sta

[PATCH] D141078: [CUDA][HIP] Support '--offload-arch=native' for the new driver

2023-01-05 Thread Matt Arsenault via Phabricator via cfe-commits
arsenm added inline comments. Comment at: clang/lib/Driver/Driver.cpp:4275 +TC->getDriver().Diag(diag::err_drv_undetermined_gpu_arch) +<< (TC->getTriple().isNVPTX() ? "NVPTX" : "AMDGPU") +<< llvm::toString(GPUsOrErr.takeError()) << "--o

[PATCH] D141078: [CUDA][HIP] Support '--offload-arch=native' for the new driver

2023-01-05 Thread Matt Arsenault via Phabricator via cfe-commits
arsenm added inline comments. Comment at: clang/lib/Driver/Driver.cpp:4275 +TC->getDriver().Diag(diag::err_drv_undetermined_gpu_arch) +<< (TC->getTriple().isNVPTX() ? "NVPTX" : "AMDGPU") +<< llvm::toString(GPUsOrErr.takeError()) << "--o

[PATCH] D140639: clang: Fix handling of __builtin_elementwise_copysign

2023-01-05 Thread Matt Arsenault via Phabricator via cfe-commits
arsenm added inline comments. Comment at: clang/lib/Sema/SemaChecking.cpp:2674 + +if (MagnitudeTy.getCanonicalType() != SignTy.getCanonicalType()) { + return Diag(Sign.get()->getBeginLoc(), erichkeane wrote: > arsenm wrote: > > erichkeane wrote: > > > cu

[PATCH] D140639: clang: Fix handling of __builtin_elementwise_copysign

2023-01-05 Thread Matt Arsenault via Phabricator via cfe-commits
arsenm added inline comments. Comment at: clang/lib/Sema/SemaChecking.cpp:2674 + +if (MagnitudeTy.getCanonicalType() != SignTy.getCanonicalType()) { + return Diag(Sign.get()->getBeginLoc(), arsenm wrote: > erichkeane wrote: > > arsenm wrote: > > > erichk

[PATCH] D140639: clang: Fix handling of __builtin_elementwise_copysign

2023-01-05 Thread Matt Arsenault via Phabricator via cfe-commits
arsenm updated this revision to Diff 486643. arsenm added a comment. Return S.Diag CHANGES SINCE LAST ACTION https://reviews.llvm.org/D140639/new/ https://reviews.llvm.org/D140639 Files: clang/lib/Sema/SemaChecking.cpp clang/test/CodeGen/builtins-elementwise-math.c clang/test/Sema/buil

[PATCH] D138870: clang/AMDGPU: Remove flat-address-space from feature map

2023-01-05 Thread Matt Arsenault via Phabricator via cfe-commits
arsenm closed this revision. arsenm added a comment. 81849497b42e1a865af8aff65ab768e56a301c87 CHANGES SINCE LAST ACTION https://reviews.llvm.org/D138870/new/ https://reviews.llvm.org/D138870

[PATCH] D112932: Use llvm.is_fpclass to implement FP classification functions

2023-01-06 Thread Matt Arsenault via Phabricator via cfe-commits
arsenm added inline comments. Comment at: clang/include/clang/Basic/Builtins.def:482-484 +BUILTIN(__builtin_issubnormal, "i.", "FnctE") +BUILTIN(__builtin_iszero, "i.", "FnctE") +BUILTIN(__builtin_issignaling, "i.", "FnctE") Sorry, I should have clarified, a

[PATCH] D140639: clang: Fix handling of __builtin_elementwise_copysign

2023-01-10 Thread Matt Arsenault via Phabricator via cfe-commits
arsenm added a comment. Herald added a subscriber: StephenFan. ping, I think this should get in before the branch date to fix the current broken behavior before this is in a release CHANGES SINCE LAST ACTION https://reviews.llvm.org/D140639/new/ https://reviews.llvm.org/D140639

[PATCH] D140639: clang: Fix handling of __builtin_elementwise_copysign

2023-01-10 Thread Matt Arsenault via Phabricator via cfe-commits
arsenm closed this revision. arsenm added a comment. 2ad4c3c88d884684a3efb42181e87fe305df51bd CHANGES SINCE LAST ACTION https://reviews.llvm.org/D140639/new/ https://reviews.llvm.org/D140639

[PATCH] D141447: clang/OpenCL: Don't use a Function for the block type

2023-01-10 Thread Matt Arsenault via Phabricator via cfe-commits
arsenm created this revision. arsenm added reviewers: yaxunl, Anastasia. Herald added subscribers: kosarev, tpr. Herald added a project: All. arsenm requested review of this revision. Herald added a subscriber: wdng. The AMDGPU value for this is not really a function. Currently we're emitting IR t

[PATCH] D141449: clang/OpenCL: Fix not setting convergent on block invoke kernels

2023-01-10 Thread Matt Arsenault via Phabricator via cfe-commits
arsenm created this revision. arsenm added reviewers: Anastasia, yaxunl, jdoerfert. Herald added subscribers: kosarev, kerbowa, jvesely. Herald added a project: All. arsenm requested review of this revision. Herald added a subscriber: wdng. Yet another example of how convergent not being the default
