[PATCH] D84345: [AMDGPU] Set the default globals address space to 1

2020-07-22 Thread Matt Arsenault via Phabricator via cfe-commits
arsenm added inline comments. Comment at: llvm/lib/IR/AutoUpgrade.cpp:4297 + // address space of 1. + if (T.isAMDGPU() && !DL.contains("-G") && !DL.startswith("G")) { +return DL.empty() ? std::string("G1") : (DL + "-G1").str(); I would expect datalayout upg

[PATCH] D79744: clang: Use byref for aggregate kernel arguments

2020-07-23 Thread Matt Arsenault via Phabricator via cfe-commits
arsenm added a comment. In D79744#2165929 , @rjmccall wrote: > Arguably we should add this attribute to all indirect arguments. I can > understand not wanting to update all the test cases, but you could probably > avoid adding a new IndirectByRef kind o

[PATCH] D79744: clang: Use byref for aggregate kernel arguments

2020-07-24 Thread Matt Arsenault via Phabricator via cfe-commits
arsenm updated this revision to Diff 280607. arsenm added a comment. Use distinct ABIArgInfo::Kind. Also don't enable this for OpenCL yet, since that requires fixing the callable kernel workaround CHANGES SINCE LAST ACTION https://reviews.llvm.org/D79744/new/ https://reviews.llvm.org/D79744

[PATCH] D83088: Introduce CfgTraits abstraction

2020-07-28 Thread Matt Arsenault via Phabricator via cfe-commits
arsenm added inline comments. Comment at: clang/include/clang/Analysis/Analyses/Dominators.h:50 + + // Clang's CFG contains nullpointers for unreachable succesors, e.g. when an + // if statement's condition is always false, it's 'then' branch is represented Mi

[PATCH] D79744: clang: Use byref for aggregate kernel arguments

2020-07-29 Thread Matt Arsenault via Phabricator via cfe-commits
arsenm added inline comments. Comment at: clang/lib/CodeGen/TargetInfo.cpp:1997 case ABIArgInfo::Ignore: + case ABIArgInfo::IndirectAliased: return false; rjmccall wrote: > In principle, this can be `inreg` just as much as Indirect can. The IR verifier c

[PATCH] D79744: clang: Use byref for aggregate kernel arguments

2020-07-29 Thread Matt Arsenault via Phabricator via cfe-commits
arsenm updated this revision to Diff 281629. arsenm marked 5 inline comments as done. arsenm added a comment. Address comments CHANGES SINCE LAST ACTION https://reviews.llvm.org/D79744/new/ https://reviews.llvm.org/D79744 Files: clang/include/clang/CodeGen/CGFunctionInfo.h clang/lib/Code

[PATCH] D57835: Fix -ftime-report with -x ir

2020-04-04 Thread Matt Arsenault via Phabricator via cfe-commits
arsenm updated this revision to Diff 255095. arsenm added a comment. Rebase, add new PM run line CHANGES SINCE LAST ACTION https://reviews.llvm.org/D57835/new/ https://reviews.llvm.org/D57835 Files: clang/lib/CodeGen/CodeGenAction.cpp clang/lib/Frontend/CompilerInstance.cpp clang/test/

[PATCH] D77683: [Docs] Make code review policy clearer about requested pre-commit reviews

2020-04-07 Thread Matt Arsenault via Phabricator via cfe-commits
arsenm added inline comments. Comment at: llvm/docs/CodeReview.rst:34 +uncertainty, a patch should be reviewed prior to being committed. If pre-commit +code reviewes in a particular area have been requested, code should clear a +significantly higher bar, e.g., fixes, to be commit

[PATCH] D59321: AMDGPU: Teach toolchain to link rocm device libs

2020-04-08 Thread Matt Arsenault via Phabricator via cfe-commits
arsenm added a comment. ping CHANGES SINCE LAST ACTION https://reviews.llvm.org/D59321/new/ https://reviews.llvm.org/D59321 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[PATCH] D77885: AMDGPU: Search for new ROCm bitcode library structure

2020-04-10 Thread Matt Arsenault via Phabricator via cfe-commits
arsenm created this revision. arsenm added reviewers: yaxunl, b-sumner, hliao. Herald added subscribers: kerbowa, t-tye, tpr, dstuttard, nhaehnle, wdng, jvesely, kzhuravl. The current install situation is a mess, but I'm working on fixing it. Search for the target layout instead of one of the N o

[PATCH] D77885: AMDGPU: Search for new ROCm bitcode library structure

2020-04-10 Thread Matt Arsenault via Phabricator via cfe-commits
arsenm updated this revision to Diff 256598. arsenm added a comment. Forgot to commit part CHANGES SINCE LAST ACTION https://reviews.llvm.org/D77885/new/ https://reviews.llvm.org/D77885 Files: clang/lib/Driver/ToolChains/AMDGPU.cpp clang/test/Driver/Inputs/rocm-device-libs/amdgcn/bitcode

[PATCH] D59321: AMDGPU: Teach toolchain to link rocm device libs

2020-04-10 Thread Matt Arsenault via Phabricator via cfe-commits
arsenm closed this revision. arsenm added a comment. 4593e4131affa84e61d7b6844be409ba46d29f11 CHANGES SINCE LAST ACTION https://reviews.llvm.org/D59321/new/ https://reviews.llvm.org/D59321 __

[PATCH] D77910: AMDGPU: Define cl_khr_gl_sharing as a supported extension

2020-04-10 Thread Matt Arsenault via Phabricator via cfe-commits
arsenm created this revision. arsenm added reviewers: yaxunl, Anastasia. Herald added subscribers: danielkiss, kerbowa, kristof.beyls, t-tye, tpr, dstuttard, nhaehnle, wdng, jvesely, kzhuravl. This doesn't seem to be that useful, since it doesn't change any device side functions. Report it since

[PATCH] D77923: OpenCL: Fix some missing predefined macros

2020-04-10 Thread Matt Arsenault via Phabricator via cfe-commits
arsenm created this revision. arsenm added reviewers: Anastasia, yaxunl, scott.linder. Herald added subscribers: kerbowa, hiraditya, nhaehnle, wdng, jvesely. Some of the device specific standard predefines were missing. __IMAGE_SUPPORT__ was only hardcoded for SPIR. __OPENCL_VERSION__ was

[PATCH] D78019: HIP: Fix handling of denormal mode

2020-04-13 Thread Matt Arsenault via Phabricator via cfe-commits
arsenm created this revision. arsenm added reviewers: yaxunl, tra. Herald added subscribers: kerbowa, nhaehnle, wdng, jvesely. arsenm added a child revision: D78020: clang/AMDGPU: Assume denormals are enabled for the default target.. I didn't realize HIP was a distinct offloading kind, so the sub

[PATCH] D78020: clang/AMDGPU: Assume denormals are enabled for the default target.

2020-04-13 Thread Matt Arsenault via Phabricator via cfe-commits
arsenm created this revision. arsenm added reviewers: yaxunl, rampitec, b-sumner, msearles. Herald added subscribers: kerbowa, t-tye, tpr, dstuttard, nhaehnle, wdng, jvesely, kzhuravl. arsenm added a parent revision: D78019: HIP: Fix handling of denormal mode. Since the default logic was based on

[PATCH] D77923: OpenCL: Fix some missing predefined macros

2020-04-13 Thread Matt Arsenault via Phabricator via cfe-commits
arsenm added a comment. In D77923#1976497 , @jvesely wrote: > OPENCL_VERSION is something that should be really set by the opencl driver > rather than the compiler. > coarse-grained SVM can be done without FEATURE_FLAT_ADDRESS_SPACE, so those > gpus can

[PATCH] D78019: HIP: Fix handling of denormal mode

2020-04-13 Thread Matt Arsenault via Phabricator via cfe-commits
arsenm marked an inline comment as done. arsenm added inline comments. Comment at: clang/test/Driver/cuda-flush-denormals-to-zero.cu:27 +// Test multiple offload archs with different defaults. +// RUN: %clang -x hip -no-canonical-prefixes -### -target x86_64-linux-gnu -c -march=

[PATCH] D78020: clang/AMDGPU: Assume denormals are enabled for the default target.

2020-04-13 Thread Matt Arsenault via Phabricator via cfe-commits
arsenm added a comment. In D78020#1978034 , @yaxunl wrote: > I think https://reviews.llvm.org/D78019 should fix the issue about HIP not > using correct default denormal value if no arch is specified. > > In that case, driver actually sets offloading arch

[PATCH] D78019: HIP: Fix handling of denormal mode

2020-04-13 Thread Matt Arsenault via Phabricator via cfe-commits
arsenm updated this revision to Diff 257011. arsenm added a comment. Remove leftover comment from before I used JobAction CHANGES SINCE LAST ACTION https://reviews.llvm.org/D78019/new/ https://reviews.llvm.org/D78019 Files: clang/include/clang/Driver/ToolChain.h clang/lib/Driver/ToolChai

[PATCH] D77923: OpenCL: Fix some missing predefined macros

2020-04-13 Thread Matt Arsenault via Phabricator via cfe-commits
arsenm added a comment. In D77923#1978172 , @scott.linder wrote: > https://www.khronos.org/registry/OpenCL/sdk/2.0/docs/man/xhtml/preprocessorDirectives.html > describes `__OPENCL_VERSION__` as "an integer reflecting the version number > of the OpenCL s

[PATCH] D78019: HIP: Fix handling of denormal mode

2020-04-13 Thread Matt Arsenault via Phabricator via cfe-commits
arsenm added a comment. In D78019#1978216 , @tra wrote: > LGTM. The patch appears to be an NFC one for CUDA. CUDA currently isn't changing the default FTZ mode based on the subtarget, which differs from nvcc according to the documentation CHANGES SINC

[PATCH] D78019: HIP: Fix handling of denormal mode

2020-04-13 Thread Matt Arsenault via Phabricator via cfe-commits
arsenm closed this revision. arsenm added a comment. dc89a3efb43feedec04facfa2206de011d2606e7 CHANGES SINCE LAST ACTION https://reviews.llvm.org/D78019/new/ https://reviews.llvm.org/D78019 __

[PATCH] D78020: clang/AMDGPU: Assume denormals are enabled for the default target.

2020-04-14 Thread Matt Arsenault via Phabricator via cfe-commits
arsenm added a comment. In D78020#1978034 , @yaxunl wrote: > I think https://reviews.llvm.org/D78019 should fix the issue about HIP not > using correct default denormal value if no arch is specified. > > In that case, driver actually sets offloading arch

[PATCH] D78020: clang/AMDGPU: Assume denormals are enabled for the default target.

2020-04-14 Thread Matt Arsenault via Phabricator via cfe-commits
arsenm updated this revision to Diff 257318. arsenm added a comment. Fix test CHANGES SINCE LAST ACTION https://reviews.llvm.org/D78020/new/ https://reviews.llvm.org/D78020 Files: clang/lib/Driver/ToolChains/AMDGPU.cpp clang/test/Driver/cl-denorms-are-zero.cl clang/test/Driver/cuda-flu

[PATCH] D76389: [NewPM] Run the Speculative Execution Pass only if the target has divergent branches

2020-04-14 Thread Matt Arsenault via Phabricator via cfe-commits
arsenm added inline comments. Comment at: llvm/include/llvm/Transforms/Scalar/SpeculativeExecution.h:97 + +class SpeculativeExecutionIfHasBranchDivergencePass +: public SpeculativeExecutionPassImpl, Still define a second pass Comment at: ll

[PATCH] D77923: OpenCL: Fix some missing predefined macros

2020-04-14 Thread Matt Arsenault via Phabricator via cfe-commits
arsenm added a comment. In D77923#1979286 , @jvesely wrote: > > In D77923#1976497 , @jvesely wrote: > > > >> OPENCL_VERSION is something that should be really set by the opencl driver > >> rather than the compiler

[PATCH] D77910: AMDGPU: Define cl_khr_gl_sharing as a supported extension

2020-04-14 Thread Matt Arsenault via Phabricator via cfe-commits
arsenm added a comment. In D77910#1976171 , @b-sumner wrote: > I don't think we can guarantee this is or will be supported on all devices. > The language runtime makes this decision. We don't need to worry about theoretical devices. We should know the

[PATCH] D77910: AMDGPU: Define cl_khr_gl_sharing as a supported extension

2020-04-14 Thread Matt Arsenault via Phabricator via cfe-commits
arsenm updated this revision to Diff 257470. arsenm added a comment. Check triple OS CHANGES SINCE LAST ACTION https://reviews.llvm.org/D77910/new/ https://reviews.llvm.org/D77910 Files: clang/lib/Basic/Targets/AMDGPU.h clang/test/Misc/amdgcn.languageOptsOpenCL.cl Index: clang/test/Mis

[PATCH] D78020: clang/AMDGPU: Assume denormals are enabled for the default target.

2020-04-15 Thread Matt Arsenault via Phabricator via cfe-commits
arsenm updated this revision to Diff 257676. arsenm added a comment. Fix comment CHANGES SINCE LAST ACTION https://reviews.llvm.org/D78020/new/ https://reviews.llvm.org/D78020 Files: clang/lib/Driver/ToolChains/AMDGPU.cpp clang/test/Driver/cl-denorms-are-zero.cl clang/test/Driver/cuda-

[PATCH] D77923: OpenCL: Fix some missing predefined macros

2020-04-15 Thread Matt Arsenault via Phabricator via cfe-commits
arsenm updated this revision to Diff 257681. arsenm added a comment. Check triple for support. Report 2.0 for -amdhsa and -amdpal with flat support, but 1.2 for clover/-mesa3d. Also require targets to explicitly set a value to define, rather than defaulting. CHANGES SINCE LAST ACTION https:/

[PATCH] D77923: OpenCL: Fix some missing predefined macros

2020-04-15 Thread Matt Arsenault via Phabricator via cfe-commits
arsenm updated this revision to Diff 257684. arsenm added a comment. Attach correct diff CHANGES SINCE LAST ACTION https://reviews.llvm.org/D77923/new/ https://reviews.llvm.org/D77923 Files: clang/include/clang/Basic/TargetInfo.h clang/lib/Basic/Targets/AMDGPU.cpp clang/lib/Basic/Targe

[PATCH] D76957: HIP: Merge builtin library handling

2020-04-15 Thread Matt Arsenault via Phabricator via cfe-commits
arsenm updated this revision to Diff 257694. arsenm added a comment. Update for new library structure CHANGES SINCE LAST ACTION https://reviews.llvm.org/D76957/new/ https://reviews.llvm.org/D76957 Files: clang/lib/Driver/ToolChains/AMDGPU.cpp clang/lib/Driver/ToolChains/AMDGPU.h clang/

[PATCH] D77885: AMDGPU: Search for new ROCm bitcode library structure

2020-04-15 Thread Matt Arsenault via Phabricator via cfe-commits
arsenm added a comment. ping CHANGES SINCE LAST ACTION https://reviews.llvm.org/D77885/new/ https://reviews.llvm.org/D77885 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[PATCH] D78020: clang/AMDGPU: Assume denormals are enabled for the default target.

2020-04-15 Thread Matt Arsenault via Phabricator via cfe-commits
arsenm closed this revision. arsenm added a comment. 3a612450508b314b7a6f4db142d0c619031d760e CHANGES SINCE LAST ACTION https://reviews.llvm.org/D78020/new/ https://reviews.llvm.org/D78020 __

[PATCH] D77885: AMDGPU: Search for new ROCm bitcode library structure

2020-04-15 Thread Matt Arsenault via Phabricator via cfe-commits
arsenm added a comment. In D77885#1983708 , @yaxunl wrote: > If we really want to do this, device lib change and hipcc change need to be > ready. Since once this is committed without corresponding device lib and > hipcc change, HIP will break. The libr

[PATCH] D80917: [OpenMP][AMDGCN] Support OpenMP offloading for AMDGCN architecture - Part 2

2020-06-03 Thread Matt Arsenault via Phabricator via cfe-commits
arsenm added inline comments. Comment at: clang/lib/Basic/Targets/AMDGPU.cpp:290 assert(DataLayout->getAllocaAddrSpace() == Private); + GridValues = &(llvm::GPU::AMDGPUGpuGridValues[0]); + LongGridValues = &(llvm::GPU::AMDGPUGpuLongGridValues[0]); Don't need

[PATCH] D80804: [AMDGPU] Introduce Clang builtins to be mapped to AMDGCN atomic inc/dec intrinsics

2020-06-04 Thread Matt Arsenault via Phabricator via cfe-commits
arsenm added inline comments. Comment at: clang/include/clang/Basic/BuiltinsAMDGPU.def:62-63 BUILTIN(__builtin_amdgcn_fence, "vUicC*", "n") +BUILTIN(__builtin_amdgcn_atomic_inc, "iiD*iUicC*", "n") +BUILTIN(__builtin_amdgcn_atomic_dec, "iiD*iUicC*", "n") My mai

[PATCH] D81311: [RFC] LangRef: Define inmem parameter attribute

2020-06-05 Thread Matt Arsenault via Phabricator via cfe-commits
arsenm created this revision. arsenm added reviewers: rjmccall, jdoerfert, efriedma, t-tye, yaxunl, scott.linder, rnk, spatel, lebedev.ri, nlopes, fhahn, hfinkel, Anastasia. Herald added subscribers: tpr, wdng. Herald added a project: LLVM. This allows tracking the in-memory type of a pointer arg

[PATCH] D81311: [RFC] LangRef: Define inmem parameter attribute

2020-06-05 Thread Matt Arsenault via Phabricator via cfe-commits
arsenm added a comment. In D81311#2077868 , @rjmccall wrote: > Most of the generalized optimization properties of this attribute seem to be > redundant with existing attributes. What are the properties you're trying to > convey exactly? The argument is

[PATCH] D80804: [AMDGPU] Introduce Clang builtins to be mapped to AMDGCN atomic inc/dec intrinsics

2020-06-08 Thread Matt Arsenault via Phabricator via cfe-commits
arsenm added inline comments. Comment at: clang/lib/CodeGen/CGBuiltin.cpp:14553 +case AMDGPU::BI__builtin_amdgcn_atomic_inc32: + IntType = Int32Ty; + BuiltinAtomicOp = Intrinsic::amdgcn_atomic_inc; This should be implied by the return type of the bu

[PATCH] D80804: [AMDGPU] Introduce Clang builtins to be mapped to AMDGCN atomic inc/dec intrinsics

2020-06-09 Thread Matt Arsenault via Phabricator via cfe-commits
arsenm accepted this revision. arsenm added a comment. This revision is now accepted and ready to land. LGTM Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D80804/new/ https://reviews.llvm.org/D80804 ___

[PATCH] D81311: [RFC] LangRef: Define inmem parameter attribute

2020-06-09 Thread Matt Arsenault via Phabricator via cfe-commits
arsenm added a comment. In D81311#2078235 , @rjmccall wrote: > I wonder if `byref` would be a better name for this, as a way to say that the > object is semantically a direct argument that's being passed by implicit > reference. This is probably a bett

[PATCH] D81422: Change filecheck default to dump input on failure

2020-06-10 Thread Matt Arsenault via Phabricator via cfe-commits
arsenm added a comment. I think this is a worse default for development for large tests. For some generated check lines, I'm seeing many thousand of line dumps on failure, which just makes it harder to see the point it actually failed at. Can we move this default into the buildbot defaults or s

[PATCH] D81422: Change filecheck default to dump input on failure

2020-06-10 Thread Matt Arsenault via Phabricator via cfe-commits
arsenm added a comment. In D81422#2083808 , @mehdi_amini wrote: > In D81422#2083761 , @arsenm wrote: > > > I think this is a worse default for development for large tests. > > > Maybe the issue is with large tests t

[PATCH] D76283: [IRBuilder] Use preferred target type for len argument of memory intrinsic functions

2020-06-10 Thread Matt Arsenault via Phabricator via cfe-commits
arsenm added a comment. In D76283#2084627 , @kschwarz wrote: > I brought this to the mailing list > (http://lists.llvm.org/pipermail/llvm-dev/2020-March/140032.html) but didn't > get an answer there. > > @arsenm, have you any comments regarding @rjmccall

cfe-commits@lists.llvm.org

2020-06-10 Thread Matt Arsenault via Phabricator via cfe-commits
arsenm added inline comments. Comment at: llvm/test/CodeGen/AArch64/aarch64-bf16-ldst-intrinsics.ll:264 +; Function Attrs: argmemonly nounwind readonly +declare { <8 x bfloat>, <8 x bfloat> } @llvm.aarch64.neon.ld2lane.v8bf16.p0i8(<8 x bfloat>, <8 x bfloat>, i64, i8*) #3 + -

[PATCH] D81311: [RFC] LangRef: Define inmem parameter attribute

2020-06-10 Thread Matt Arsenault via Phabricator via cfe-commits
arsenm added a comment. In D81311#2084066 , @rjmccall wrote: > In D81311#2083295 , @arsenm wrote: > > > In D81311#2078235 , @rjmccall > > wrote: > > > > > I wonder if `byref

[PATCH] D81670: [TTI] Expose isNoopAddrSpaceCast from TLI.[SROA] Teach SROA to recognize no-op addrspacecast.

2020-06-11 Thread Matt Arsenault via Phabricator via cfe-commits
arsenm added a comment. We should instead allow bitcast to perform no-op addrspacecasts Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D81670/new/ https://reviews.llvm.org/D81670 ___ cfe-commits mailing

[PATCH] D81670: [TTI] Expose isNoopAddrSpaceCast from TLI.[SROA] Teach SROA to recognize no-op addrspacecast.

2020-06-11 Thread Matt Arsenault via Phabricator via cfe-commits
arsenm added a comment. In D81670#2088304 , @hliao wrote: > In D81670#2087974 , @arsenm wrote: > > > We should instead allow bitcast to perform no-op addrspacecasts > > > That may be a little bit challenging as so f

[PATCH] D81627: [HIP] Do not call opt/llc for -fno-gpu-rdc

2020-06-11 Thread Matt Arsenault via Phabricator via cfe-commits
arsenm added a comment. It doesn't matter if we don't support isa linking. We should just use clang and default to -flto. LTO "just works" as is CHANGES SINCE LAST ACTION https://reviews.llvm.org/D81627/new/ https://reviews.llvm.org/D81627 ___ c

[PATCH] D81627: [HIP] Do not call opt/llc for -fno-gpu-rdc

2020-06-11 Thread Matt Arsenault via Phabricator via cfe-commits
arsenm added a comment. In D81627#2088500 , @arsenm wrote: > It doesn't matter if we don't support isa linking. We should just use clang > and default to -flto. LTO "just works" as is This is a step forward, but the lack of ISA linking shouldn't block e

[PATCH] D81311: [RFC] LangRef: Define inmem parameter attribute

2020-06-11 Thread Matt Arsenault via Phabricator via cfe-commits
arsenm added a comment. In D81311#2088568 , @jdoerfert wrote: > In D81311#2088075 , @rjmccall wrote: > > > In D81311#2087592 , @jdoerfert > > wrote: > > > > > In D81311#2086

[PATCH] D81713: [HIP] Fix rocm not found on rocm3.5

2020-06-12 Thread Matt Arsenault via Phabricator via cfe-commits
arsenm added a comment. Can you add tests for this? Is this also sufficient with the directory layout change? CHANGES SINCE LAST ACTION https://reviews.llvm.org/D81713/new/ https://reviews.llvm.org/D81713 ___ cfe-commits mailing list cfe-commits

[PATCH] D81713: [HIP] Fix rocm not found on rocm3.5

2020-06-12 Thread Matt Arsenault via Phabricator via cfe-commits
arsenm added a comment. In D81713#2089680 , @yaxunl wrote: > In D81713#2089672 , @arsenm wrote: > > > Can you add tests for this? Is this also sufficient with the directory > > layout change? > > > You mean does th

[PATCH] D81713: [HIP] Fix rocm not found on rocm3.5

2020-06-12 Thread Matt Arsenault via Phabricator via cfe-commits
arsenm added a comment. In D81713#2090265 , @tra wrote: > In D81713#2089760 , @arsenm wrote: > > > In D81713#2089680 , @yaxunl wrote: > > > > > In D81713#2089672

[PATCH] D81422: Change filecheck default to dump input on failure

2020-06-13 Thread Matt Arsenault via Phabricator via cfe-commits
arsenm added a comment. In D81422#2090425 , @mehdi_amini wrote: > I'm thinking about a possible improvement here: we could have FileCheck dump > the input for the current CHECK-LABEL section only: it seems like it could > reduce the output drastically wh

[PATCH] D81849: [clang][amdgpu] Prefer not using `fp16` conversion intrinsics.

2020-06-15 Thread Matt Arsenault via Phabricator via cfe-commits
arsenm added inline comments. Comment at: clang/test/CodeGenHIP/half.hip:6 + +// CHECK-LABEL: @_Z2d0DF16_ +// CHECK: fpext Making sure the argument type is "half" wouldn't hurt either Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://re

[PATCH] D81422: Change filecheck default to dump input on failure

2020-06-15 Thread Matt Arsenault via Phabricator via cfe-commits
arsenm added a comment. I'm growing to dislike the new behavior even more as I use it. I don't think restricting the dump to CHECK-LABEL prefixes will even help. The amount of context printed previously was good enough for development use. If I ever can't figure it out from the diff, I want to

[PATCH] D81861: [HIP] Do not use llvm-link/opt/llc for -fgpu-rdc

2020-06-15 Thread Matt Arsenault via Phabricator via cfe-commits
arsenm added a comment. Thanks for doing this, this has been sorely needed since forever. We also really need to switch opencl over to this Comment at: clang/lib/Driver/ToolChains/CommonArgs.cpp:207-209 + Arg *A = Args.getLastArg(options::OPT_march_EQ); + if (!A) +A = Ar

[PATCH] D81938: [SROA] Teach SROA to perform no-op pointer conversion.[InferAddressSpaces] Handle the pair of `ptrtoint`/`inttoptr`.

2020-06-16 Thread Matt Arsenault via Phabricator via cfe-commits
arsenm added a comment. Should be separate patches for each pass Comment at: llvm/test/Transforms/InferAddressSpaces/noop-ptrint-pair.ll:17 + ret void +} This is missing a number of test cases I would like to see. First, I would like to make sure if the inter

[PATCH] D81861: [HIP] Do not use llvm-link/opt/llc for -fgpu-rdc

2020-06-16 Thread Matt Arsenault via Phabricator via cfe-commits
arsenm added inline comments. Comment at: clang/test/Driver/hip-toolchain-no-rdc.hip:164 +// LKONLY-NOT: llvm-link +// LKONLY-NOT: opt +// LKONLY-NOT: llc yamauchi wrote: > Hi, this test seems to fail for me because I happen to have the string "opt" > in the "In

[PATCH] D81938: [InferAddressSpaces] Handle the pair of `ptrtoint`/`inttoptr`.

2020-06-16 Thread Matt Arsenault via Phabricator via cfe-commits
arsenm added a comment. I'm not entirely convinced this is safe in all contexts. I think you can argue that this is safe if it directly feeds a memory instruction, as the access would be undefined if it weren't valid to do the no-op cast. However, I'm not sure if this is safe if used purely in

[PATCH] D81959: [HIP] Enable -amdgpu-internalize-symbols

2020-06-16 Thread Matt Arsenault via Phabricator via cfe-commits
arsenm added a comment. Isn't the internalization implied by LTO? I thought part of the appeal of LTO is killing this off CHANGES SINCE LAST ACTION https://reviews.llvm.org/D81959/new/ https://reviews.llvm.org/D81959 ___ cfe-commits mailing list

[PATCH] D81959: [HIP] Enable -amdgpu-internalize-symbols

2020-06-16 Thread Matt Arsenault via Phabricator via cfe-commits
arsenm added inline comments. Comment at: clang/lib/Driver/ToolChains/HIP.cpp:64-65 +"-shared", +"-mllvm", +"-amdgpu-internalize-symbols", +"-o", We should probably ad

[PATCH] D81422: Change filecheck default to dump input on failure

2020-06-17 Thread Matt Arsenault via Phabricator via cfe-commits
arsenm added a comment. I would also expect a simple command line flag to llvm-lit to be able to control this, rather than having to set an environment variable Comment at: mlir/test/mlir-tblgen/op-format-spec.td:1 -// RUN: mlir-tblgen -gen-op-decls -asmformat-error-is-fatal=f

[PATCH] D81713: [HIP] Fix rocm not found on rocm3.5

2020-06-17 Thread Matt Arsenault via Phabricator via cfe-commits
arsenm accepted this revision. arsenm added a comment. This revision is now accepted and ready to land. This only solves compatability with the old layout in the case where the hipcc wrapper is pointing directly to the library directory. I think this will still fail with the native clang search

[PATCH] D81938: [InferAddressSpaces] Handle the pair of `ptrtoint`/`inttoptr`.

2020-06-17 Thread Matt Arsenault via Phabricator via cfe-commits
arsenm added a comment. In D81938#2096732 , @hliao wrote: > In D81938#2096500 , @arsenm wrote: > > > I'm not entirely convinced this is safe in all contexts. I think you can > > argue that this is safe if it direct

[PATCH] D82930: [HIP] Fix rocm detection

2020-07-01 Thread Matt Arsenault via Phabricator via cfe-commits
arsenm added inline comments. Comment at: clang/lib/Driver/ToolChains/AMDGPU.cpp:167 +llvm::ErrorOr> VersionFile = +FS.getBufferForFile(BinPath + "/.hipVersion"); +if (!VersionFile) I don't think the detection should fail if this is missing CHAN

[PATCH] D82930: [HIP] Fix rocm detection

2020-07-01 Thread Matt Arsenault via Phabricator via cfe-commits
arsenm added inline comments. Comment at: clang/lib/Driver/ToolChains/AMDGPU.cpp:167 +llvm::ErrorOr> VersionFile = +FS.getBufferForFile(BinPath + "/.hipVersion"); +if (!VersionFile) yaxunl wrote: > arsenm wrote: > > I don't think the detection sho

[PATCH] D82930: [HIP] Fix rocm detection

2020-07-01 Thread Matt Arsenault via Phabricator via cfe-commits
arsenm added inline comments. Comment at: clang/lib/Driver/ToolChains/AMDGPU.cpp:167 +llvm::ErrorOr> VersionFile = +FS.getBufferForFile(BinPath + "/.hipVersion"); +if (!VersionFile) arsenm wrote: > yaxunl wrote: > > arsenm wrote: > > > I don't thi

[PATCH] D83088: Introduce CfgTraits abstraction

2020-07-03 Thread Matt Arsenault via Phabricator via cfe-commits
arsenm added inline comments. Comment at: llvm/include/llvm/CodeGen/MachineCfgTraits.h:133 + ++m_def; + if (m_def == m_instr->defs().end()) { +++m_instr; != return early? Comment at: llvm/include/llvm/CodeGen/MachineCfgTraits.

[PATCH] D82087: AMDGPU/clang: Add builtins for llvm.amdgcn.ballot

2020-07-06 Thread Matt Arsenault via Phabricator via cfe-commits
arsenm added a comment. ping CHANGES SINCE LAST ACTION https://reviews.llvm.org/D82087/new/ https://reviews.llvm.org/D82087 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[PATCH] D80952: [FPEnv][Clang][Driver] Disable constrained floating point on targets lacking support.

2020-07-06 Thread Matt Arsenault via Phabricator via cfe-commits
arsenm added a comment. In D80952#2133693 , @MaskRay wrote: > Note also that @arsenm is still a blocking reviewer. It is generally expected > that all feedback is acknowledged. @kpn you should probably have waited for > @arsenm to explicitly clear the bl

[PATCH] D82930: [HIP] Fix rocm detection

2020-07-06 Thread Matt Arsenault via Phabricator via cfe-commits
arsenm added inline comments. Comment at: clang/lib/Driver/ToolChains/AMDGPU.cpp:167 +llvm::ErrorOr> VersionFile = +FS.getBufferForFile(BinPath + "/.hipVersion"); +if (!VersionFile) yaxunl wrote: > arsenm wrote: > > arsenm wrote: > > > yaxunl wrot

[PATCH] D83088: Introduce CfgTraits abstraction

2020-07-06 Thread Matt Arsenault via Phabricator via cfe-commits
arsenm added inline comments. Comment at: llvm/include/llvm/CodeGen/MachineCfgTraits.h:136-138 + // Prefer to avoid support for bundled instructions as long as we + // don't really need it. + assert(!m_instr->isBundle()); nhaehnle wrote

[PATCH] D82930: [HIP] Fix rocm detection

2020-07-07 Thread Matt Arsenault via Phabricator via cfe-commits
arsenm added inline comments. Comment at: clang/lib/Driver/ToolChains/AMDGPU.cpp:167 +llvm::ErrorOr> VersionFile = +FS.getBufferForFile(BinPath + "/.hipVersion"); +if (!VersionFile) yaxunl wrote: > yaxunl wrote: > > yaxunl wrote: > > > tra wrote:

[PATCH] D82930: [HIP] Fix rocm detection

2020-07-07 Thread Matt Arsenault via Phabricator via cfe-commits
arsenm added inline comments. Comment at: clang/lib/Driver/ToolChains/AMDGPU.cpp:167 +llvm::ErrorOr> VersionFile = +FS.getBufferForFile(BinPath + "/.hipVersion"); +if (!VersionFile) tra wrote: > arsenm wrote: > > yaxunl wrote: > > > yaxunl wrote:

[PATCH] D82767: clang-format: Explicitly use python3

2020-07-09 Thread Matt Arsenault via Phabricator via cfe-commits
arsenm added a comment. In D82767#2141962 , @serge-sans-paille wrote: > In D82767#2132903 , @MyDeveloperDay > wrote: > > > We may not be consistent across all of LLVM > > > > $ find . -name '*.py' -print -exec /

[PATCH] D82087: AMDGPU/clang: Add builtins for llvm.amdgcn.ballot

2020-07-10 Thread Matt Arsenault via Phabricator via cfe-commits
arsenm added a comment. In D82087#2140778 , @sameerds wrote: > The documentation for HIP __ballot seems to indicate that the user does not > have to explicitly specify the warp size. How is that achieved with these new > builtins? Can this be captured i

[PATCH] D82087: AMDGPU/clang: Add builtins for llvm.amdgcn.ballot

2020-07-10 Thread Matt Arsenault via Phabricator via cfe-commits
arsenm marked 2 inline comments as done. arsenm added inline comments. Comment at: clang/lib/Basic/Targets/AMDGPU.cpp:288 + if (!IsNullCPU) { +// Default to wave32 if available, or wave64 if not +if (Features.count("wavefrontsize32") == 0 && sameerds wro

[PATCH] D82087: AMDGPU/clang: Add builtins for llvm.amdgcn.ballot

2020-07-10 Thread Matt Arsenault via Phabricator via cfe-commits
arsenm added a comment. In D82087#2140778 , @sameerds wrote: > The documentation for HIP __ballot seems to indicate that the user does not > have to explicitly specify the warp size. How is that achieved with these new > builtins? Can this be captured i

[PATCH] D82087: AMDGPU/clang: Add builtins for llvm.amdgcn.ballot

2020-06-18 Thread Matt Arsenault via Phabricator via cfe-commits
arsenm created this revision. arsenm added reviewers: yaxunl, rampitec, b-sumner, foad, nhaehnle. Herald added subscribers: kerbowa, t-tye, tpr, dstuttard, wdng, jvesely, kzhuravl. I wasn't sure what the best strategy was for the wave size difference. I went for an explicit, enforced builtin for

[PATCH] D81311: [RFC] LangRef: Define byref parameter attribute

2020-06-18 Thread Matt Arsenault via Phabricator via cfe-commits
arsenm updated this revision to Diff 271732. arsenm retitled this revision from "[RFC] LangRef: Define inmem parameter attribute" to "[RFC] LangRef: Define byref parameter attribute". arsenm added a comment. Rename to byref. Specify more explicitly this is for the ABI, and should not be inferred

[PATCH] D82249: [HWASan] Disable GlobalISel/FastISel for HWASan Globals.

2020-06-19 Thread Matt Arsenault via Phabricator via cfe-commits
arsenm added a comment. Is the fallback not working correctly in this case for some reason? Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D82249/new/ https://reviews.llvm.org/D82249 ___ cfe-commits mail

[PATCH] D82087: AMDGPU/clang: Add builtins for llvm.amdgcn.ballot

2020-06-23 Thread Matt Arsenault via Phabricator via cfe-commits
arsenm added a comment. ping CHANGES SINCE LAST ACTION https://reviews.llvm.org/D82087/new/ https://reviews.llvm.org/D82087 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[PATCH] D82249: [HWASan] Disable GlobalISel/FastISel for HWASan Globals.

2020-06-23 Thread Matt Arsenault via Phabricator via cfe-commits
arsenm added a comment. In D82249#2107845 , @hctim wrote: > In D82249#2105036 , @arsenm wrote: > > > Is the fallback not working correctly in this case for some reason? > > > I'm fairly sure that G_GLOBAL_VALUE used

[PATCH] D81938: [InferAddressSpaces] Handle the pair of `ptrtoint`/`inttoptr`.

2020-06-24 Thread Matt Arsenault via Phabricator via cfe-commits
arsenm added inline comments. Comment at: clang/test/CodeGenCUDA/amdgpu-kernel-arg-pointer-type.cu:62-64 +// OPT-NOT: alloca +// OPT-NOT: ptrtoint +// OPT-NOT: inttoptr Positive checks are preferable (here and through the rest of the file) Comm

[PATCH] D81938: [InferAddressSpaces] Handle the pair of `ptrtoint`/`inttoptr`.

2020-06-24 Thread Matt Arsenault via Phabricator via cfe-commits
arsenm added inline comments. Comment at: clang/test/CodeGenCUDA/amdgpu-kernel-arg-pointer-type.cu:62-64 +// OPT-NOT: alloca +// OPT-NOT: ptrtoint +// OPT-NOT: inttoptr hliao wrote: > arsenm wrote: > > Positive checks are preferable (here and through the rest of

[PATCH] D81311: [RFC] LangRef: Define byref parameter attribute

2020-06-25 Thread Matt Arsenault via Phabricator via cfe-commits
arsenm added a comment. I did realize one edge case that is different between carrying the size and the type. You can have a zero sized type with an alignment, but 0 is naturally an invalid value for the attribute (and all the others that carry an integer) CHANGES SINCE LAST ACTION https://r

[PATCH] D81938: [InferAddressSpaces] Handle the pair of `ptrtoint`/`inttoptr`.

2020-06-25 Thread Matt Arsenault via Phabricator via cfe-commits
arsenm added inline comments. Comment at: clang/test/CodeGenCUDA/amdgpu-kernel-arg-pointer-type.cu:66 +// COMMON-LABEL: define amdgpu_kernel void @_Z7kernel41S(%struct.S.coerce %s.coerce) +// OPT: %0 = extractvalue %struct.S.coerce %s.coerce, 0 +// OPT: %1 = extractvalue %struct

[PATCH] D81938: [InferAddressSpaces] Handle the pair of `ptrtoint`/`inttoptr`.

2020-06-25 Thread Matt Arsenault via Phabricator via cfe-commits
arsenm added inline comments. Comment at: llvm/lib/Transforms/Scalar/InferAddressSpaces.cpp:238-239 +return false; + // Check it's really safe to treat that pair of `ptrtoint`/`inttoptr` is a + // no-op cast. Besides checking both of them are no-op casts, as the + // reint

[PATCH] D79732: AMDGPU/HIP: Don't replace pointer types in kernel argument structs

2020-06-25 Thread Matt Arsenault via Phabricator via cfe-commits
arsenm abandoned this revision. arsenm added a comment. I think this is obsoleted now CHANGES SINCE LAST ACTION https://reviews.llvm.org/D79732/new/ https://reviews.llvm.org/D79732 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://l

[PATCH] D82767: clang-format: Explicitly use python3

2020-06-29 Thread Matt Arsenault via Phabricator via cfe-commits
arsenm created this revision. arsenm added reviewers: djasper, compnerd. Herald added a subscriber: wdng. On Ubuntu 20.04, /usr/bin/env python always fails and requires explicitly choosing python2 or python3. Grep shows there are a lot of other places still relying on the old, sensible be

[PATCH] D81728: [InstCombine] Add target-specific inst combining

2020-06-29 Thread Matt Arsenault via Phabricator via cfe-commits
arsenm added a comment. I think an interface usable by InstructionSimplify would be helpful too, so I think that would be a separate thing from TTI Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D81728/new/ https://reviews.llvm.org/D81728 ___

[PATCH] D76389: [NewPM] Run the Speculative Execution Pass only if the target has divergent branches

2020-06-29 Thread Matt Arsenault via Phabricator via cfe-commits
arsenm added a comment. This seems like it covers a different case than D82735 ? Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D76389/new/ https://reviews.llvm.org/D76389 _

[PATCH] D82767: clang-format: Explicitly use python3

2020-06-30 Thread Matt Arsenault via Phabricator via cfe-commits
arsenm added a comment. In D82767#2123024 , @compnerd wrote: > While I really like this idea of migrating the scripts to python3, I believe > that the current plan is to allow until December for users to migrate, so > this might be a bit premature :-(.

[PATCH] D78190: Add Bfloat IR type

2020-04-16 Thread Matt Arsenault via Phabricator via cfe-commits
arsenm added inline comments. Comment at: llvm/test/Assembler/bfloat-constprop.ll:1 +; RUN: opt < %s -O3 -S | FileCheck %s +; RUN: verify-uselistorder %s This test is doing way too much. You can reduce the to just ret fadd K0, K1 Comment at: ll

[PATCH] D77910: AMDGPU: Define cl_khr_gl_sharing as a supported extension

2020-04-17 Thread Matt Arsenault via Phabricator via cfe-commits
arsenm added a comment. In D77910#1981828 , @b-sumner wrote: > In D77910#1981429 , @arsenm wrote: > > > In D77910#1976171 , @b-sumner > > wrote: > > > > > I don't think we c

[PATCH] D76957: HIP: Merge builtin library handling

2020-04-18 Thread Matt Arsenault via Phabricator via cfe-commits
arsenm updated this revision to Diff 258562. arsenm added a comment. Switch default back for correct sqrt. Also add more checks for all of the linked libs, and fix duplicating wave64 logic CHANGES SINCE LAST ACTION https://reviews.llvm.org/D76957/new/ https://reviews.llvm.org/D76957 Files:

<    1   2   3   4   5   6   7   8   9   10   >