[PATCH] D155773: [llvm][MemoryBuiltins] Add alloca support to getInitialValueOfAllocation

2023-08-11 Thread Matt Arsenault via Phabricator via cfe-commits
arsenm added inline comments. Comment at: llvm/lib/Target/AMDGPU/AMDGPUPromoteAlloca.cpp:809-811 + Updater.AddAvailableValue( + Alloca.getParent(), + getInitialValueOfAllocation(&Alloca, nullptr, VectorTy)); This is very specifically handling alloca, n

[PATCH] D156737: clang: Add __builtin_elementwise_sqrt

2023-08-11 Thread Matt Arsenault via Phabricator via cfe-commits
arsenm updated this revision to Diff 549490. arsenm added a comment. Release note CHANGES SINCE LAST ACTION https://reviews.llvm.org/D156737/new/ https://reviews.llvm.org/D156737 Files: clang/docs/LanguageExtensions.rst clang/docs/ReleaseNotes.rst clang/include/clang/Basic/Builtins.def

[PATCH] D157750: Properly handle -fsplit-machine-functions for fatbinary compilation

2023-08-11 Thread Matt Arsenault via Phabricator via cfe-commits
arsenm added inline comments. Comment at: llvm/lib/CodeGen/TargetPassConfig.cpp:1281-1282 +else + WithColor::warning() + << "-fsplit-machine-functions is only valid for X86.\n"; } You cannot spam warnings here. The other instance of printing

[PATCH] D156737: clang: Add __builtin_elementwise_sqrt

2023-08-11 Thread Matt Arsenault via Phabricator via cfe-commits
arsenm closed this revision. arsenm added a comment. 9e3d9c9eae03910d93e2312e1e0845433c779998 CHANGES SINCE LAST ACTION https://reviews.llvm.org/D156737/new/ https://reviews.llvm.org/D156737

[PATCH] D157911: clang: Add __builtin_exp10* and use new llvm.exp10 intrinsic

2023-08-14 Thread Matt Arsenault via Phabricator via cfe-commits
arsenm created this revision. arsenm added reviewers: jcranmer-intel, kpn, sepavloff, andrew.w.kaylor, foad, bob80905. Herald added a subscriber: StephenFan. Herald added a project: All. arsenm requested review of this revision. Herald added a subscriber: wdng. https://reviews.llvm.org/D157911 F

[PATCH] D157750: Properly handle -fsplit-machine-functions for fatbinary compilation

2023-08-14 Thread Matt Arsenault via Phabricator via cfe-commits
arsenm added inline comments. Comment at: llvm/lib/CodeGen/TargetPassConfig.cpp:1281-1282 +else + WithColor::warning() + << "-fsplit-machine-functions is only valid for X86.\n"; } shenhan wrote: > arsenm wrote: > > You cannot spam warnings he

[PATCH] D157917: clang/HIP: Use abs builtins instead of implementing them

2023-08-14 Thread Matt Arsenault via Phabricator via cfe-commits
arsenm created this revision. arsenm added reviewers: yaxunl, AlexVlx, JonChesterfield, jhuber6, doru1004. Herald added a project: All. arsenm requested review of this revision. Herald added a subscriber: wdng. InstCombine already put these back together so there's no visible change in the -O1 tes

[PATCH] D157738: [OpenMP] Emit offloading entries for indirect target variables

2023-08-14 Thread Matt Arsenault via Phabricator via cfe-commits
arsenm added inline comments. Comment at: clang/lib/CodeGen/CGOpenMPRuntime.cpp:1996-1997 +llvm::GlobalValue *GV) { + std::optional ActiveAttr = + OMPDeclareTargetDeclAttr::getActiveAttr(FD); + not a huge fan

[PATCH] D157917: clang/HIP: Use abs builtins instead of implementing them

2023-08-15 Thread Matt Arsenault via Phabricator via cfe-commits
arsenm closed this revision. arsenm added a comment. 43f314f5e6cebe02ff63d5197c8e5c25204b20d2 CHANGES SINCE LAST ACTION https://reviews.llvm.org/D157917/new/ https://reviews.llvm.org/D157917

[PATCH] D76283: [IRBuilder] Use preferred target type for len argument of memory intrinsic functions

2023-08-17 Thread Matt Arsenault via Phabricator via cfe-commits
arsenm requested changes to this revision. arsenm added a comment. This revision now requires changes to proceed. Herald added a project: All. I think any size type should be valid for the intrinsic. Legalization should have to cast the type to the target libcall if that's how it chooses to impl

[PATCH] D76283: [IRBuilder] Use preferred target type for len argument of memory intrinsic functions

2023-08-17 Thread Matt Arsenault via Phabricator via cfe-commits
arsenm added inline comments. Comment at: llvm/include/llvm/IR/IRBuilder.h:438-446 + ConstantInt *getIntPtrSize(Value *Ptr, uint64_t Size) { +assert(BB && "Must have a basic block to retrieve the module!"); + +Module *M = BB->getParent()->getParent(); +auto *PtrType

[PATCH] D145648: [clang][Driver] recognize `-ffp-contract=fast-honor-pragmas`

2023-08-18 Thread Matt Arsenault via Phabricator via cfe-commits
arsenm accepted this revision. arsenm added a comment. This revision is now accepted and ready to land. Herald added a subscriber: wdng. LGTM, not recognizing this in the driver is incomplete CHANGES SINCE LAST ACTION https://reviews.llvm.org/D145648/new/ https://reviews.llvm.org/D145648

[PATCH] D158246: [amdgpu] WIP variadics

2023-08-18 Thread Matt Arsenault via Phabricator via cfe-commits
arsenm added inline comments. Comment at: llvm/lib/CodeGen/ExpandVAIntrinsics.cpp:38 + +#include + Don't need Comment at: llvm/lib/CodeGen/ExpandVAIntrinsics.cpp:44-47 +static cl::opt +ApplyToAllOverride(DEBUG_TYPE "-all", cl::init(false),

[PATCH] D158246: [amdgpu] WIP variadics

2023-08-18 Thread Matt Arsenault via Phabricator via cfe-commits
arsenm added inline comments. Comment at: llvm/test/CodeGen/Generic/expand-variadic-intrinsics.ll:76 +} + + arsenm wrote: > Needs some indirect variadic call tests Also some metadata and signext/zeroext preservation tests Repository: rG LLVM Github Monorepo

[PATCH] D158246: [amdgpu] WIP variadics

2023-08-18 Thread Matt Arsenault via Phabricator via cfe-commits
arsenm added inline comments. Comment at: llvm/test/CodeGen/Generic/expand-variadic-intrinsics.ll:76 +} + + arsenm wrote: > arsenm wrote: > > Needs some indirect variadic call tests > Also some metadata and signext/zeroext preservation tests Also a case where the

[PATCH] D139730: [OpenMP][DeviceRTL][AMDGPU] Support code object version 5

2023-08-18 Thread Matt Arsenault via Phabricator via cfe-commits
arsenm added inline comments. Comment at: clang/lib/CodeGen/CGBuiltin.cpp:17057 + Constant *Offset, *OffsetOld; + Value *DP, *DP1; + Spell out to DispatchPtr? Comment at: clang/lib/CodeGen/CodeGenModule.cpp:1206-1208 + getTargetCodeGenInfo()

[PATCH] D139730: [OpenMP][DeviceRTL][AMDGPU] Support code object version 5

2023-08-18 Thread Matt Arsenault via Phabricator via cfe-commits
arsenm added inline comments. Comment at: clang/test/CodeGenCUDA/amdgpu-code-object-version-linking.cu:40-43 +__device__ void bar(int *out) +{ + *out = __builtin_amdgcn_workgroup_size_x(); +} test all the builtins? Repository: rG LLVM Github Monorepo CHANGE

[PATCH] D158246: [amdgpu] WIP variadics

2023-08-18 Thread Matt Arsenault via Phabricator via cfe-commits
arsenm added inline comments. Comment at: libc/config/gpu/entrypoints.txt:84-85 # stdio.h entrypoints +libc.src.stdio.snprintf +libc.src.stdio.vsnprintf libc.src.stdio.puts Split off the libc stuff into a separate patch, the lowering pass should

[PATCH] D158246: [amdgpu] WIP variadics

2023-08-18 Thread Matt Arsenault via Phabricator via cfe-commits
arsenm added inline comments. Comment at: llvm/lib/CodeGen/DesugarVariadics.cpp:22 +// 5/ Delete the remaining parts of the original functions +// +//===--===// Can you expand on the ABI requirem

[PATCH] D158246: [amdgpu] WIP variadics

2023-08-18 Thread Matt Arsenault via Phabricator via cfe-commits
arsenm added inline comments. Comment at: llvm/lib/CodeGen/DesugarVariadics.cpp:208-209 + +StructType *VarargsTy = StructType::create( +Ctx, LocalVarTypes, (Twine(NF->getName()) + ".vararg").str()); + Should we go for a packed struct forced to align 4

[PATCH] D158246: [amdgpu] WIP variadics

2023-08-18 Thread Matt Arsenault via Phabricator via cfe-commits
arsenm added inline comments. Comment at: llvm/lib/CodeGen/DesugarVariadics.cpp:297 +NF->copyAttributesFrom(&F); +NF->setComdat(F.getComdat()); +F.getParent()->getFunctionList().insert(F.getIterator(), NF); Test the comdat? Weird that copyAttributesFr

[PATCH] D158246: [amdgpu] WIP variadics

2023-08-18 Thread Matt Arsenault via Phabricator via cfe-commits
arsenm added inline comments. Comment at: llvm/lib/CodeGen/DesugarVariadics.cpp:74-77 +Value *Mask = ConstantInt::get(IntPtrTy, ~(DataAlignMinusOne)); +Value *vaListAligned = Builder.CreateIntToPtr( +Builder.CreateAnd(Builder.CreatePtrToInt(Incr, IntPtrTy), Mask),
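
The quoted lines mask a pointer-sized integer with `~(DataAlignMinusOne)`. The arithmetic behind that idiom can be sketched on plain integers as below. This is a minimal sketch, not the patch's code: `alignUp` is a hypothetical helper, and it assumes the value being masked has already been bumped by `Align - 1`, which the truncated excerpt does not show.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper (not from the patch): round Addr up to the next
// multiple of Align. Align must be a nonzero power of two.
inline std::uint64_t alignUp(std::uint64_t Addr, std::uint64_t Align) {
  assert(Align != 0 && (Align & (Align - 1)) == 0 &&
         "alignment must be a power of two");
  const std::uint64_t AlignMinusOne = Align - 1;
  // Step past the next boundary, then clear the low bits with the
  // inverted mask so the result lands exactly on the boundary.
  return (Addr + AlignMinusOne) & ~AlignMinusOne;
}
```

An address that is already aligned is returned unchanged, because adding `Align - 1` does not carry into the bits kept by the mask.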

[PATCH] D158246: [amdgpu] WIP variadics

2023-08-18 Thread Matt Arsenault via Phabricator via cfe-commits
arsenm added inline comments. Comment at: llvm/lib/CodeGen/DesugarVariadics.cpp:296 +// Note - same attribute handling as DeadArgumentElimination +NF->copyAttributesFrom(&F); +NF->setComdat(F.getComdat()); This might be missing copying the linkage R

[PATCH] D158246: [amdgpu] WIP variadics

2023-08-18 Thread Matt Arsenault via Phabricator via cfe-commits
arsenm added inline comments. Comment at: llvm/lib/CodeGen/DesugarVariadics.cpp:145 +for (Function &F : llvm::make_early_inc_range(M)) + if (Apply || canTransformFunctionInIsolation(F)) +Changed |= runOnFunction(F); I think you need to guard agai

[PATCH] D158367: [AMDGPU] Add target feature gds/gws to clang

2023-08-20 Thread Matt Arsenault via Phabricator via cfe-commits
arsenm added inline comments. Comment at: llvm/lib/TargetParser/TargetParser.cpp:289 Features["image-insts"] = true; + Features["gds"] = true; + Features["gws"] = true; Gds feature is unused CHANGES SINCE LAST ACTION https://reviews.llvm.org

[PATCH] D158367: [AMDGPU] Add target feature gds/gws to clang

2023-08-21 Thread Matt Arsenault via Phabricator via cfe-commits
arsenm added inline comments. Comment at: llvm/lib/TargetParser/TargetParser.cpp:289 Features["image-insts"] = true; + Features["gds"] = true; + Features["gws"] = true; yaxunl wrote: > arsenm wrote: > > Gds feature is unused > I am thinking to k

[PATCH] D139730: [OpenMP][DeviceRTL][AMDGPU] Support code object version 5

2023-08-21 Thread Matt Arsenault via Phabricator via cfe-commits
arsenm added inline comments. Comment at: clang/lib/CodeGen/CGBuiltin.cpp:17067 + +Value *Iscov5 = CGF.Builder.CreateICmpSGE( +ABIVersion, Capitalization is weird, IsCOV5? Comment at: clang/lib/CodeGen/CGBuiltin.cpp:17082-17083 +

[PATCH] D156357: clang: Add elementwise bitreverse builtin

2023-07-28 Thread Matt Arsenault via Phabricator via cfe-commits
arsenm added inline comments. Comment at: clang/docs/LanguageExtensions.rst:634 the most negative integer remains the most negative integer - T __builtin_elementwise_fma(T x, T y, T z) fused multiply add, (x * y) + z.

[PATCH] D156539: [Clang][CodeGen] `__builtin_alloca`s should care about address spaces too

2023-07-28 Thread Matt Arsenault via Phabricator via cfe-commits
arsenm added a comment. In D156539#4542836 , @rjmccall wrote: > We should probably write this code to work properly in case we add a target > that makes `__builtin_alloca` return a pointer in the private address space. > Could you recover the target AS

[PATCH] D156539: [Clang][CodeGen] `__builtin_alloca`s should care about address spaces too

2023-07-28 Thread Matt Arsenault via Phabricator via cfe-commits
arsenm added inline comments. Comment at: clang/lib/CodeGen/CGBuiltin.cpp:3540 + return RValue::get(Builder.CreateAddrSpaceCast(AI, CGM.Int8PtrTy)); +else + return RValue::get(AI); No return after else Comment at: clang/lib/CodeGe
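
The "No return after else" remark appears to refer to the LLVM Coding Standards rule against using `else` after a `return`. A small sketch of the preferred shape follows; `clampToNonNegative` is a hypothetical function chosen only to illustrate the style and has nothing to do with the code under review.

```cpp
// Hypothetical function, for illustration of the style point only.
int clampToNonNegative(int X) {
  if (X < 0)
    return 0;
  // No 'else' here: the branch above has already returned, so the
  // remaining code can sit at the outer indentation level.
  return X;
}
```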

[PATCH] D86154: AMDGPU: Add llvm.amdgcn.{read,readfirst,write}lane2 intrinsics with type overloads

2023-07-28 Thread Matt Arsenault via Phabricator via cfe-commits
arsenm requested changes to this revision. arsenm added a comment. This revision now requires changes to proceed. Herald added subscribers: nlopes, StephenFan. Herald added a project: All. Should be obsoleted by D147732 Repository: rG LLVM Github Monorepo CH

[PATCH] D156539: [Clang][CodeGen] `__builtin_alloca`s should care about address spaces too

2023-07-31 Thread Matt Arsenault via Phabricator via cfe-commits
arsenm added inline comments. Comment at: clang/test/CodeGen/dynamic-alloca-with-address-space.c:1 +// RUN: %clang_cc1 -triple amdgcn-amd-amdhsa -emit-llvm %s -o - | FileCheck %s + Can you add an opencl 1.2 and 2.0 run line too Comment at: clan

[PATCH] D156539: [Clang][CodeGen] `__builtin_alloca`s should care about address spaces too

2023-07-31 Thread Matt Arsenault via Phabricator via cfe-commits
arsenm added inline comments. Comment at: clang/test/CodeGen/dynamic-alloca-with-address-space.c:1 +// RUN: %clang_cc1 -triple amdgcn-amd-amdhsa -emit-llvm %s -o - | FileCheck %s + AlexVlx wrote: > arsenm wrote: > > Can you add an opencl 1.2 and 2.0 run line too

[PATCH] D156737: clang: Add __builtin_elementwise_sqrt

2023-07-31 Thread Matt Arsenault via Phabricator via cfe-commits
arsenm created this revision. arsenm added reviewers: yaxunl, fhahn, bob80905. Herald added subscribers: StephenFan, Anastasia. Herald added a project: All. arsenm requested review of this revision. Herald added a subscriber: wdng. This will be used in the opencl builtin headers to provide direct

[PATCH] D156743: [wip] clang/OpenCL: Add inline implementations of sqrt in builtin header

2023-07-31 Thread Matt Arsenault via Phabricator via cfe-commits
arsenm created this revision. arsenm added reviewers: yaxunl, svenvh, Anastasia. Herald added a subscriber: Naghasan. Herald added a project: All. arsenm requested review of this revision. Herald added a subscriber: wdng. We want the !fpmath metadata to be attached to the sqrt intrinsic to make it

[PATCH] D156737: clang: Add __builtin_elementwise_sqrt

2023-07-31 Thread Matt Arsenault via Phabricator via cfe-commits
arsenm added inline comments. Comment at: clang/lib/CodeGen/CGBuiltin.cpp:2548 +case Builtin::BI__builtin_sqrtf128: +case Builtin::BI__builtin_elementwise_sqrt: { llvm::Value *Call = emitUnaryMaybeConstrainedFPBuiltin( bob80905 wrote: > Nit: I thin

[PATCH] D156816: [Clang] Make generic aliases to OpenCL address spaces

2023-08-01 Thread Matt Arsenault via Phabricator via cfe-commits
arsenm added a comment. I don't really see the point of doing this. These introduce ambiguous terminology. The reason you need the attributes is basically for FFI to opencl code, so might as well make the specific meaning clearer with the opencl bit Repository: rG LLVM Github Monorepo CHANG

[PATCH] D156743: clang/OpenCL: Add inline implementations of sqrt in builtin header

2023-08-01 Thread Matt Arsenault via Phabricator via cfe-commits
arsenm updated this revision to Diff 546164. arsenm retitled this revision from "[wip] clang/OpenCL: Add inline implementations of sqrt in builtin header" to "clang/OpenCL: Add inline implementations of sqrt in builtin header". arsenm edited the summary of this revision. arsenm added a comment.

[PATCH] D156816: [Clang] Make generic aliases to OpenCL address spaces

2023-08-01 Thread Matt Arsenault via Phabricator via cfe-commits
arsenm added a comment. In D156816#4551409 , @Anastasia wrote: > Why not to just use target address space and define it to some macro with > desirable spelling? If you mean the numbered address spaces, that's the broken thing this is specifically tryin

[PATCH] D156928: [Clang][AMDGPU] Fix handling of -mcode-object-version=none arg

2023-08-02 Thread Matt Arsenault via Phabricator via cfe-commits
arsenm added a comment. missing tests Comment at: clang/lib/Driver/ToolChains/Clang.cpp:1066 +if (!IsCC1As) { + std::string CodeObjVerStr = (CodeObjVer ? Twine(CodeObjVer) : "none").str(); CmdArgs.insert(CmdArgs.begin() + 1, don't need to go th

[PATCH] D156989: FloatingPointMode: Use -1 for "Dynamic"

2023-08-03 Thread Matt Arsenault via Phabricator via cfe-commits
arsenm created this revision. arsenm added reviewers: sepavloff, rjmccall, kpn, cameron.mcinally, uweigand, scanon, jcranmer-intel, foad. Herald added subscribers: StephenFan, tpr. Herald added a project: All. arsenm requested review of this revision. Herald added a subscriber: wdng. Herald added

[PATCH] D156743: clang/OpenCL: Add inline implementations of sqrt in builtin header

2023-08-03 Thread Matt Arsenault via Phabricator via cfe-commits
arsenm updated this revision to Diff 546814. arsenm marked an inline comment as done. CHANGES SINCE LAST ACTION https://reviews.llvm.org/D156743/new/ https://reviews.llvm.org/D156743 Files: clang/lib/Headers/opencl-c-base.h clang/lib/Headers/opencl-c.h clang/lib/Sema/OpenCLBuiltins.td

[PATCH] D156743: clang/OpenCL: Add inline implementations of sqrt in builtin header

2023-08-03 Thread Matt Arsenault via Phabricator via cfe-commits
arsenm added inline comments. Comment at: clang/lib/Headers/opencl-c-base.h:832 + +inline float __ovld __cnfn sqrt(float __x) { + return __builtin_elementwise_sqrt(__x); svenvh wrote: > Anastasia wrote: > > Is this a generic implementation enough? Would some tar

[PATCH] D156989: FloatingPointMode: Use -1 for "Dynamic"

2023-08-03 Thread Matt Arsenault via Phabricator via cfe-commits
arsenm updated this revision to Diff 546815. CHANGES SINCE LAST ACTION https://reviews.llvm.org/D156989/new/ https://reviews.llvm.org/D156989 Files: clang/include/clang/Basic/FPOptions.def clang/include/clang/Basic/LangOptions.h clang/lib/AST/JSONNodeDumper.cpp clang/lib/AST/TextNodeDu

[PATCH] D156989: FloatingPointMode: Use -1 for "Dynamic"

2023-08-03 Thread Matt Arsenault via Phabricator via cfe-commits
arsenm added a comment. In D156989#4558133 , @sepavloff wrote: > Rounding mode is presented in FPOptions with 3 bits, so there is only 8 > values available for particular modes. 5 of them, which are specified in > IEEE-754, are listed in `RoundingMode`.
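
To make the 3-bit storage concern concrete, here is one way a value of -1 could round-trip through a 3-bit field. This is a sketch under stated assumptions, not what the patch does: the `Mode` enum, `encode`, and `decode` are hypothetical names, and the numbering simply follows the `FLT_ROUNDS` convention (0 to 3) plus 4 for ties-away and -1 for dynamic, as the thread title suggests.

```cpp
#include <cstdint>

// Hypothetical enum for illustration; -1 for Dynamic follows the thread title.
enum class Mode : std::int8_t {
  TowardZero = 0,
  NearestTiesToEven = 1,
  TowardPositive = 2,
  TowardNegative = 3,
  NearestTiesToAway = 4,
  Dynamic = -1,
};

// Pack a mode into the low 3 bits. -1 truncates to the bit pattern 0b111.
inline unsigned encode(Mode M) {
  return static_cast<unsigned>(static_cast<std::int8_t>(M)) & 0x7u;
}

// Unpack: the all-ones pattern is read back as Dynamic, everything else
// is taken as the stored small non-negative value.
inline Mode decode(unsigned Bits) {
  return Bits == 0x7u ? Mode::Dynamic : static_cast<Mode>(Bits);
}
```

With this encoding the five IEEE-754 modes occupy 0 through 4, and the pattern 7 is reserved for the dynamic case.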

[PATCH] D139730: [OpenMP][DeviceRTL][AMDGPU] Support code object version 5

2023-08-04 Thread Matt Arsenault via Phabricator via cfe-commits
arsenm added inline comments. Comment at: clang/lib/CodeGen/CGBuiltin.cpp:17146 +Value *ABIVersion; +if (ABIVersionC) { + ABIVersion = CGF.Builder.CreateAlignedLoad(CGF.Int32Ty, ABIVersionC, this must always pass Comment at: openmp

[PATCH] D139730: [OpenMP][DeviceRTL][AMDGPU] Support code object version 5

2023-08-04 Thread Matt Arsenault via Phabricator via cfe-commits
arsenm added a comment. In D139730#4561540 , @jhuber6 wrote: > Could you explain briefly what the approach here is? I'm confused as to > what's actually changed and how we're handling this difference. I thought if > this was just the definition of some

[PATCH] D139730: [OpenMP][DeviceRTL][AMDGPU] Support code object version 5

2023-08-04 Thread Matt Arsenault via Phabricator via cfe-commits
arsenm added a comment. In D139730#4561575 , @jhuber6 wrote: > In D139730#4561573 , @arsenm wrote: > >> In D139730#4561540 , @jhuber6 >> wrote: >> >>> Could you explain b

[PATCH] D139730: [OpenMP][DeviceRTL][AMDGPU] Support code object version 5

2023-08-04 Thread Matt Arsenault via Phabricator via cfe-commits
arsenm added a comment. In D139730#4561619 , @arsenm wrote: > In D139730#4561575 , @jhuber6 wrote: > >> In D139730#4561573 , @arsenm wrote: >> >>> In D139730#4561540

[PATCH] D156928: [Clang][AMDGPU] Fix handling of -mcode-object-version=none arg

2023-08-04 Thread Matt Arsenault via Phabricator via cfe-commits
arsenm added a comment. In D156928#4561811 , @JonChesterfield wrote: > What does code objects version= none mean? Handle any version Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D156928/new/ https://revi

[PATCH] D156989: FloatingPointMode: Use -1 for "Dynamic"

2023-08-04 Thread Matt Arsenault via Phabricator via cfe-commits
arsenm added a comment. In D156989#4558486 , @sepavloff wrote: > Support of rounding mode in C standard is based on IEEE-754 model, where > rounding mode is a global state and affects all FP operations. The case of > two rounding modes does not fit this

[PATCH] D139730: [OpenMP][DeviceRTL][AMDGPU] Support code object version 5

2023-08-07 Thread Matt Arsenault via Phabricator via cfe-commits
arsenm added inline comments. Comment at: clang/lib/CodeGen/CGBuiltin.cpp:17143-17145 + llvm::LoadInst *LD; + Constant *Offset, *Offset1; + Value *DP, *DP1; Move down to define and initialize Comment at: clang/lib/CodeGen/CGBuiltin.cpp:17163

[PATCH] D156743: clang/OpenCL: Add inline implementations of sqrt in builtin header

2023-08-07 Thread Matt Arsenault via Phabricator via cfe-commits
arsenm added inline comments. Comment at: clang/lib/Headers/opencl-c-base.h:832 + +inline float __ovld __cnfn sqrt(float __x) { + return __builtin_elementwise_sqrt(__x); Anastasia wrote: > arsenm wrote: > > svenvh wrote: > > > Anastasia wrote: > > > > Is this a

[PATCH] D155850: [Clang][CodeGen][RFC] Add codegen support for C++ Parallel Algorithm Offload

2023-08-08 Thread Matt Arsenault via Phabricator via cfe-commits
arsenm added inline comments. Comment at: clang/lib/CodeGen/BackendUtil.cpp:1101-1102 +MPM.addPass(StdParAcceleratorCodeSelectionPass()); +} +else if (LangOpts.HIPStdParInterposeAlloc) { + MPM.addPass(StdParAllocationInterpositionPass()); For

[PATCH] D157438: [OpenMP] Ensure wrapper headers are included on both host and device

2023-08-08 Thread Matt Arsenault via Phabricator via cfe-commits
arsenm added inline comments. Comment at: clang/lib/Driver/ToolChains/Clang.cpp:1190-1191 // the resource directory at clang/lib/Headers/llvm_libc_wrappers. -if (C.getActiveOffloadKinds() == Action::OFK_None) { +if ((getToolChain().getTriple().isNVPTX() || +

[PATCH] D156816: [Clang] Make generic aliases to OpenCL address spaces

2023-08-09 Thread Matt Arsenault via Phabricator via cfe-commits
arsenm requested changes to this revision. arsenm added a comment. This revision now requires changes to proceed. Probably should just wrap uses in macros for now Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D156816/new/ https://reviews.llvm.org/D1

[PATCH] D157911: clang: Add __builtin_exp10* and use new llvm.exp10 intrinsic

2023-09-09 Thread Matt Arsenault via Phabricator via cfe-commits
arsenm updated this revision to Diff 556343. arsenm added a comment. Release notes CHANGES SINCE LAST ACTION https://reviews.llvm.org/D157911/new/ https://reviews.llvm.org/D157911 Files: clang/docs/ReleaseNotes.rst clang/include/clang/Basic/Builtins.def clang/lib/CodeGen/CGBuiltin.cpp

[PATCH] D157911: clang: Add __builtin_exp10* and use new llvm.exp10 intrinsic

2023-09-09 Thread Matt Arsenault via Phabricator via cfe-commits
arsenm closed this revision. arsenm added a comment. 6a08cf12d9cbc960159bf40e47078a882ca510ce CHANGES SINCE LAST ACTION https://reviews.llvm.org/D157911/new/ https://reviews.llvm.org/D157911

[PATCH] D156743: clang/OpenCL: Add inline implementations of sqrt in builtin header

2023-09-12 Thread Matt Arsenault via Phabricator via cfe-commits
arsenm added a comment. ping CHANGES SINCE LAST ACTION https://reviews.llvm.org/D156743/new/ https://reviews.llvm.org/D156743 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[PATCH] D156989: FloatingPointMode: Use -1 for "Dynamic"

2023-09-12 Thread Matt Arsenault via Phabricator via cfe-commits
arsenm added a comment. ping. This enum should just match FLT_ROUNDS and designing ABI around whatever this was doing doesn't really make sense CHANGES SINCE LAST ACTION https://reviews.llvm.org/D156989/new/ https://reviews.llvm.org/D156989

[PATCH] D156743: clang/OpenCL: Add inline implementations of sqrt in builtin header

2023-09-12 Thread Matt Arsenault via Phabricator via cfe-commits
arsenm added a comment. In D156743#4644285 , @Anastasia wrote: > If we think there are no better alternatives and implementation is generic > enough for every vendor, LGTM! You could argue annotating the raw callsite is better but I don't know how to i

[PATCH] D156743: clang/OpenCL: Add inline implementations of sqrt in builtin header

2023-09-12 Thread Matt Arsenault via Phabricator via cfe-commits
arsenm closed this revision. arsenm added a comment. 15e0fe0b6122e32657b98daf74a1fce028d2e5bf CHANGES SINCE LAST ACTION https://reviews.llvm.org/D156743/new/ https://reviews.llvm.org/D156743

[PATCH] D158131: HIP: Directly use f32 sqrt intrinsic

2023-09-12 Thread Matt Arsenault via Phabricator via cfe-commits
arsenm closed this revision. arsenm added a comment. bca125569f33bd6a27c4c54815697966a823254e CHANGES SINCE LAST ACTION https://reviews.llvm.org/D158131/new/ https://reviews.llvm.org/D158131

[PATCH] D138507: HIP: Directly use sqrt builtins instead of calling ocml (f32 case)

2023-09-12 Thread Matt Arsenault via Phabricator via cfe-commits
arsenm abandoned this revision. arsenm added a comment. reposted D158131 CHANGES SINCE LAST ACTION https://reviews.llvm.org/D138507/new/ https://reviews.llvm.org/D138507

[PATCH] D154495: clang: Attach !fpmath metadata to __builtin_sqrt based on language flags

2023-07-05 Thread Matt Arsenault via Phabricator via cfe-commits
arsenm created this revision. arsenm added reviewers: yaxunl, Anastasia, jcranmer-intel, tra, jlebar, jhuber6. Herald added a project: All. arsenm requested review of this revision. Herald added subscribers: jplehr, sstefan1, wdng. Herald added a reviewer: jdoerfert. OpenCL and HIP have -cl-fp32-c

[PATCH] D154495: clang: Attach !fpmath metadata to __builtin_sqrt based on language flags

2023-07-05 Thread Matt Arsenault via Phabricator via cfe-commits
arsenm added inline comments. Comment at: clang/lib/CodeGen/CGExpr.cpp:5602 +// source are correctly rounded. +SetFPAccuracy(Val, 2.5); + } yaxunl wrote: > the spec says sqrt relative error is 3ULP > https://registry.khronos.org/OpenCL/specs/2.2/html/Op

[PATCH] D154495: clang: Attach !fpmath metadata to __builtin_sqrt based on language flags

2023-07-05 Thread Matt Arsenault via Phabricator via cfe-commits
arsenm added inline comments. Comment at: clang/lib/CodeGen/CGExpr.cpp:5602 +// source are correctly rounded. +SetFPAccuracy(Val, 2.5); + } arsenm wrote: > yaxunl wrote: > > the spec says sqrt relative error is 3ULP > > https://registry.khronos.org/Open

[PATCH] D154531: [AMDGPU] Support -mcpu=native for OpenCL

2023-07-05 Thread Matt Arsenault via Phabricator via cfe-commits
arsenm added inline comments. Comment at: clang/include/clang/Basic/DiagnosticDriverKinds.td:86 +def warn_drv_multi_gpu_arch : Warning< + "multiple %0 architecture are detected: %1; only the first one is used for " + "'%2'">, InGroup; s/architecture/architectur

[PATCH] D154495: clang: Attach !fpmath metadata to __builtin_sqrt based on language flags

2023-07-06 Thread Matt Arsenault via Phabricator via cfe-commits
arsenm updated this revision to Diff 537737. arsenm added a comment. Split div/sqrt handling since they have different values. Also cuda does have unimplemented flags to control these individually. Not sure it's worth trying to merge them into one function CHANGES SINCE LAST ACTION https://r

[PATCH] D147732: [AMDGPU] Add type mangling for {read, write, readfirst, perm}lane intrinsics

2023-07-06 Thread Matt Arsenault via Phabricator via cfe-commits
arsenm requested changes to this revision. arsenm added inline comments. This revision now requires changes to proceed. Herald added a subscriber: wangpc. Comment at: llvm/lib/Target/AMDGPU/AMDGPULateCodeGenPrepare.cpp:187 +Value *AMDGPULateCodeGenPrepare::buildLegalLaneIntrins

[PATCH] D154495: clang: Attach !fpmath metadata to __builtin_sqrt based on language flags

2023-07-07 Thread Matt Arsenault via Phabricator via cfe-commits
arsenm added a comment. In D154495#4479481 , @jdoerfert wrote: > FWIW, I assume we want this also for OpenMP offload. I'd be surprised if OpenMP let you do this by default CHANGES SINCE LAST ACTION https://reviews.llvm.org/D154495/new/ https://revie

[PATCH] D151087: [Clang] Permit address space casts with 'reinterpret_cast' in C++

2023-07-07 Thread Matt Arsenault via Phabricator via cfe-commits
arsenm requested changes to this revision. arsenm added a comment. This revision now requires changes to proceed. Conclusion seems to be this should have a separate cast operation Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D151087/new/ https://re

[PATCH] D154133: [amdgpu] start documenting amdgpu support by clang

2023-07-07 Thread Matt Arsenault via Phabricator via cfe-commits
arsenm accepted this revision. arsenm added a comment. This revision is now accepted and ready to land. Need to start somewhere CHANGES SINCE LAST ACTION https://reviews.llvm.org/D154133/new/ https://reviews.llvm.org/D154133

[PATCH] D153725: [clang] Make amdgpu-arch tool work on Windows

2023-07-07 Thread Matt Arsenault via Phabricator via cfe-commits
arsenm added inline comments. Comment at: clang/tools/amdgpu-arch/AMDGPUArch.cpp:50 +#else + return printGPUsByHSA(); +#endif The HIP path should work on linux too. I generally think we should build as much code as possible on all hosts, so how about ``` #ifnde

[PATCH] D139629: clang: Stop emitting "strictfp"

2023-07-07 Thread Matt Arsenault via Phabricator via cfe-commits
arsenm added a comment. ping CHANGES SINCE LAST ACTION https://reviews.llvm.org/D139629/new/ https://reviews.llvm.org/D139629

[PATCH] D154000: HIP: Directly call round builtins

2023-07-07 Thread Matt Arsenault via Phabricator via cfe-commits
arsenm closed this revision. arsenm added a comment. 9a317516a515f3c5b15f9060329a503e8f261c7f CHANGES SINCE LAST ACTION https://reviews.llvm.org/D154000/new/ https://reviews.llvm.org/D154000

[PATCH] D139629: clang: Stop emitting "strictfp"

2023-07-07 Thread Matt Arsenault via Phabricator via cfe-commits
arsenm accepted this revision. arsenm added a comment. This revision is now accepted and ready to land. self accept after latest comments CHANGES SINCE LAST ACTION https://reviews.llvm.org/D139629/new/ https://reviews.llvm.org/D139629

[PATCH] D139629: clang: Stop emitting "strictfp"

2023-07-07 Thread Matt Arsenault via Phabricator via cfe-commits
arsenm closed this revision. arsenm added a comment. 42d4c85ca83f25f993444fb5bbaa58525f724991 CHANGES SINCE LAST ACTION https://reviews.llvm.org/D139629/new/ https://reviews.llvm.org/D139629

[PATCH] D154123: [HIP] Start document HIP support by clang

2023-07-07 Thread Matt Arsenault via Phabricator via cfe-commits
arsenm added inline comments. Comment at: clang/docs/HIPSupport.rst:49 + +You can also use ``--offload-arch=native`` to let ``amdgpu-arch`` automatically detect the GPU architecture on your system: + s/architecture/architectures Comment at: cl

[PATCH] D154790: [HIP] Use native math functions for `-fcuda-approx-transcendentals`

2023-07-09 Thread Matt Arsenault via Phabricator via cfe-commits
arsenm added a comment. This would be a lot easier if we clang fp was fully featured. As far as I can tell it lets you set #pragma clang fp reassociate(on), and contract(fast), but doesn't have a way to set arcp, afn, ninf or nnan Comment at: clang/lib/Headers/__clang_hip_mat

[PATCH] D154790: [HIP] Use native math functions for `-fcuda-approx-transcendentals`

2023-07-09 Thread Matt Arsenault via Phabricator via cfe-commits
arsenm added inline comments. Comment at: clang/lib/Headers/__clang_hip_math.h:163 +__DEVICE__ +float __expf(float __x) { return __ocml_native_exp_f32(__x); } + arsenm wrote: > __builtin_expf Maybe this should just be __builtin_amdgcn_exp2f CHANGES SINCE LAST

[PATCH] D153725: [clang] Make amdgpu-arch tool work on Windows

2023-07-10 Thread Matt Arsenault via Phabricator via cfe-commits
arsenm added a comment. In D153725#4484711 , @JonChesterfield wrote: > The right thing to do on Linux for this is to query the driver directly. That > is, the kernel should populate some string under /sys that we read. That > isn't yet implemented. It

[PATCH] D153725: [clang] Make amdgpu-arch tool work on Windows

2023-07-10 Thread Matt Arsenault via Phabricator via cfe-commits
arsenm added inline comments. Comment at: clang/tools/amdgpu-arch/AMDGPUArchByHIP.cpp:80 +if (err != hipSuccess) { + llvm::errs() << "Failed to get device id for ordinal " << i << "\n"; + return 1; yaxunl wrote: > arsenm wrote: > > single quotes aro

[PATCH] D153725: [clang] Make amdgpu-arch tool work on Windows

2023-07-10 Thread Matt Arsenault via Phabricator via cfe-commits
arsenm added a comment. In D153725#4484754, @JonChesterfield wrote: > - if you open the driver too many times at once it fails to open, so running > a parallel build that uses this tool doesn't work on fast machines Why would this happen? Seems like a

[PATCH] D153725: [clang] Make amdgpu-arch tool work on Windows

2023-07-10 Thread Matt Arsenault via Phabricator via cfe-commits
arsenm added a comment. In D153725#4484754, @JonChesterfield wrote: > The problem with using the proper API via HSA or similar is twofold: > > - we use this tool to enable tests, which means HSA has to exist before > building clang or the tests don't r

[PATCH] D153725: [clang] Make amdgpu-arch tool work on Windows

2023-07-10 Thread Matt Arsenault via Phabricator via cfe-commits
arsenm added a comment. In D153725#4484973, @arsenm wrote: > In D153725#4484754, > @JonChesterfield wrote: >> The problem with using the proper API via HSA or similar is twofold: >> >> - we use this tool to e

[PATCH] D154991: [FPEnv][TableGen] Add strictfp attribute to constrained intrinsics by default.

2023-07-11 Thread Matt Arsenault via Phabricator via cfe-commits
arsenm added inline comments. Comment at: llvm/test/Feature/fp-intrinsics-attr.ll:1 +; RUN: opt -passes=verify -S < %s | FileCheck %s + Should move to test/Assembler and round trip through llvm-as and llvm-dis like other similar tests Repository: rG LLVM Gi

[PATCH] D154991: [FPEnv][TableGen] Add strictfp attribute to constrained intrinsics by default.

2023-07-11 Thread Matt Arsenault via Phabricator via cfe-commits
arsenm accepted this revision. arsenm added inline comments. This revision is now accepted and ready to land. Comment at: llvm/include/llvm/IR/Intrinsics.td:1102 -let IntrProperties = [IntrInaccessibleMemOnly, IntrWillReturn] in { +/// IntrStrictFP - The intrinsic is allowed to

[PATCH] D155081: Specify the developer policy around links to external resources

2023-07-12 Thread Matt Arsenault via Phabricator via cfe-commits
arsenm added inline comments. Comment at: llvm/docs/DeveloperPolicy.rst:359 + If the patch fixes a bug in GitHub Issues, we encourage adding + "Fixes https://github.com/llvm/llvm-project/issues/12345" to automate closing + the issue in GitHub. If the patch has been reviewed,

[PATCH] D141414: [clang] add warning on shifting boolean type

2023-07-12 Thread Matt Arsenault via Phabricator via cfe-commits
arsenm added a comment. ping? Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D141414/new/ https://reviews.llvm.org/D141414 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi

[PATCH] D155191: clang/HIP: Directly use f32 exp and log builtins

2023-07-13 Thread Matt Arsenault via Phabricator via cfe-commits
arsenm created this revision. arsenm added reviewers: AMDGPU, yaxunl, jhuber6, JonChesterfield. Herald added a project: All. arsenm requested review of this revision. Herald added a subscriber: wdng. These are now lowered correctly by the backend, and you get proper fast math flags when directly h

[PATCH] D152914: [Draft] Make __builtin_cpu builtins target-independent

2023-07-13 Thread Matt Arsenault via Phabricator via cfe-commits
arsenm added inline comments. Comment at: llvm/include/llvm/IR/Intrinsics.td:903-907 +// Load of a value provided by the system library at a fixed address. Used for +// accessing things like HWCAP word provided by GLIBC. +def int_fixed_addr_ld +: DefaultAttrsIntrinsic<[llvm_i

[PATCH] D152914: [Draft] Make __builtin_cpu builtins target-independent

2023-07-13 Thread Matt Arsenault via Phabricator via cfe-commits
arsenm added inline comments. Comment at: llvm/include/llvm/IR/Intrinsics.td:903-907 +// Load of a value provided by the system library at a fixed address. Used for +// accessing things like HWCAP word provided by GLIBC. +def int_fixed_addr_ld +: DefaultAttrsIntrinsic<[llvm_i

[PATCH] D154123: [HIP] Start document HIP support by clang

2023-07-14 Thread Matt Arsenault via Phabricator via cfe-commits
arsenm added inline comments. Comment at: clang/docs/HIPSupport.rst:33 + + clang -c --offload-arch=gfx906 -xhip test.cpp -o test.o + Aren't you supposed to use clang++? Also, could show that .hip is recognized? CHANGES SINCE LAST ACTION https://reviews.llvm

[PATCH] D155213: [HIP] Add `-fno-hip-uniform-block`

2023-07-14 Thread Matt Arsenault via Phabricator via cfe-commits
arsenm added inline comments. Comment at: clang/include/clang/Driver/Options.td:1092 ShouldParseIf; +defm hip_uniform_block : BoolFOption<"hip-uniform-block", + LangOpts<"HIPUniformBlock">, DefaultTrue, Can we avoid adding yet another language flag for someth

[PATCH] D155213: [HIP] Add `-fno-hip-uniform-block`

2023-07-14 Thread Matt Arsenault via Phabricator via cfe-commits
arsenm added inline comments. Comment at: clang/include/clang/Driver/Options.td:1092 ShouldParseIf; +defm hip_uniform_block : BoolFOption<"hip-uniform-block", + LangOpts<"HIPUniformBlock">, DefaultTrue, scchan wrote: > arsenm wrote: > > Can we avoid adding ye

[PATCH] D154495: clang: Attach !fpmath metadata to __builtin_sqrt based on language flags

2023-07-14 Thread Matt Arsenault via Phabricator via cfe-commits
arsenm added a comment. ping CHANGES SINCE LAST ACTION https://reviews.llvm.org/D154495/new/ https://reviews.llvm.org/D154495 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[PATCH] D154495: clang: Attach !fpmath metadata to __builtin_sqrt based on language flags

2023-07-14 Thread Matt Arsenault via Phabricator via cfe-commits
arsenm closed this revision. arsenm added a comment. bac2a075408377a8aa41f6626b17bb3e471221f3 CHANGES SINCE LAST ACTION https://reviews.llvm.org/D154495/new/ https://reviews.llvm.org/D154495 __

[PATCH] D139730: [OpenMP][DeviceRTL][AMDGPU] Support code object version 5

2023-08-23 Thread Matt Arsenault via Phabricator via cfe-commits
arsenm added a comment. Codegen parts LGTM, questions with the driver parts Comment at: clang/lib/Driver/ToolChain.cpp:1368 if (A->getOption().matches(options::OPT_m_Group)) { - if (SameTripleAsHost) + // Pass code objection version to device toolchain + //

[PATCH] D158695: [clang] Fix missing contract flag in sqrt intrinsic

2023-08-24 Thread Matt Arsenault via Phabricator via cfe-commits
arsenm added inline comments. Comment at: clang/test/CodeGen/fp-contract-fast-pragma.cpp:11 #pragma clang fp contract(fast) - return a * b + c; + return a * b + c + __builtin_sqrtf(a); } Should leave the existing test function alone and add a new one. Also ca
