[PATCH] D157452: [RFC][Clang][Codegen] `std::type_info` needs special care with explicit address spaces

2023-08-23 Thread Yaxun Liu via Phabricator via cfe-commits
yaxunl accepted this revision. yaxunl added a comment. This revision is now accepted and ready to land. LGTM. Thanks. CHANGES SINCE LAST ACTION https://reviews.llvm.org/D157452/new/ https://reviews.llvm.org/D157452 ___ cfe-commits mailing list cfe-

[PATCH] D155826: [HIP][Clang][Preprocessor][RFC] Add preprocessor support for C++ Parallel Algorithm Offload

2023-08-23 Thread Yaxun Liu via Phabricator via cfe-commits
yaxunl accepted this revision. yaxunl added a comment. This revision is now accepted and ready to land. LGTM. Thanks CHANGES SINCE LAST ACTION https://reviews.llvm.org/D155826/new/ https://reviews.llvm.org/D155826 ___ cfe-commits mailing list cfe-c

[PATCH] D139730: [OpenMP][DeviceRTL][AMDGPU] Support code object version 5

2023-08-23 Thread Yaxun Liu via Phabricator via cfe-commits
yaxunl added inline comments. Comment at: clang/test/CodeGenCUDA/amdgpu-code-object-version-linking.cu:12 +// RUN: llvm-link %t_0 %t_5 -o -| llvm-dis -o - | FileCheck -check-prefix=LINKED5 %s + +#include "Inputs/cuda.h" need to test using clang -cc1 with -O3 and

[PATCH] D158695: [clang] Fix missing contract flag in sqrt intrinsic

2023-08-23 Thread Yaxun Liu via Phabricator via cfe-commits
yaxunl created this revision. yaxunl added reviewers: arsenm, rjmccall. Herald added a project: All. yaxunl requested review of this revision. Herald added a subscriber: wdng. Fix: https://github.com/llvm/llvm-project/issues/64653 https://reviews.llvm.org/D158695 Files: clang/lib/CodeGen/CGBu

[PATCH] D158695: [clang] Fix missing contract flag in sqrt intrinsic

2023-08-24 Thread Yaxun Liu via Phabricator via cfe-commits
yaxunl marked 4 inline comments as done. yaxunl added inline comments. Comment at: clang/lib/CodeGen/CGBuiltin.cpp:501 if (CGF.Builder.getIsFPConstrained()) { CodeGenFunction::CGFPOptionsRAII FPOptsRAII(CGF, E); Function *F = CGF.CGM.getIntrinsic(ConstrainedIntrinsic

[PATCH] D158695: [clang] Fix missing contract flag in sqrt intrinsic

2023-08-24 Thread Yaxun Liu via Phabricator via cfe-commits
yaxunl updated this revision to Diff 553141. yaxunl marked 3 inline comments as done. yaxunl added a comment. revised by comments CHANGES SINCE LAST ACTION https://reviews.llvm.org/D158695/new/ https://reviews.llvm.org/D158695 Files: clang/lib/CodeGen/CGBuiltin.cpp clang/test/CodeGen/fp-

[PATCH] D158695: [clang] Fix missing contract flag in sqrt intrinsic

2023-08-24 Thread Yaxun Liu via Phabricator via cfe-commits
yaxunl updated this revision to Diff 553250. yaxunl added a comment. fix test for strict fp CHANGES SINCE LAST ACTION https://reviews.llvm.org/D158695/new/ https://reviews.llvm.org/D158695 Files: clang/lib/CodeGen/CGBuiltin.cpp clang/test/CodeGen/fp-contract-fast-pragma.cpp Index: clang

[PATCH] D158778: [CUDA] Propagate __float128 support from the host.

2023-08-24 Thread Yaxun Liu via Phabricator via cfe-commits
yaxunl accepted this revision. yaxunl added a comment. This revision is now accepted and ready to land. LGTM. Thanks Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D158778/new/ https://reviews.llvm.org/D158778 ___

[PATCH] D145648: [clang][Driver] recognize `-ffp-contract=fast-honor-pragmas`

2023-08-24 Thread Yaxun Liu via Phabricator via cfe-commits
This revision was landed with ongoing or failed builds. This revision was automatically updated to reflect the committed changes. Closed by commit rGe94c171ddb03: [clang][Driver] recognize `-ffp-contract=fast-honor-pragmas` (authored by yaxunl). Herald added a project: clang. Repository: rG LLV

[PATCH] D158367: [AMDGPU] Add target feature gws to clang

2023-08-25 Thread Yaxun Liu via Phabricator via cfe-commits
yaxunl updated this revision to Diff 553463. yaxunl retitled this revision from "[AMDGPU] Add target feature gds/gws to clang" to "[AMDGPU] Add target feature gws to clang". yaxunl added a comment. Herald added a reviewer: kiranchandramohan. remove gds feature since it is not used CHANGES SINCE

[PATCH] D158367: [AMDGPU] Add target feature gws to clang

2023-08-25 Thread Yaxun Liu via Phabricator via cfe-commits
This revision was automatically updated to reflect the committed changes. Closed by commit rGb8a9c50f2294: [AMDGPU] Add target feature gws to clang (authored by yaxunl). Herald added projects: clang, Flang. Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.or

[PATCH] D158247: [CUDA][HIP] Fix overloading resolution in global variable initializer

2023-08-28 Thread Yaxun Liu via Phabricator via cfe-commits
yaxunl added a comment. ping CHANGES SINCE LAST ACTION https://reviews.llvm.org/D158247/new/ https://reviews.llvm.org/D158247 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[PATCH] D139730: [OpenMP][DeviceRTL][AMDGPU] Support code object version 5

2023-08-28 Thread Yaxun Liu via Phabricator via cfe-commits
yaxunl accepted this revision. yaxunl added a comment. LGTM. Thanks Comment at: clang/test/CodeGenCUDA/amdgpu-code-object-version-linking.cu:12 +// RUN: llvm-link %t_0 %t_5 -o -| llvm-dis -o - | FileCheck -check-prefix=LINKED5 %s + +#include "Inputs/cuda.h" sa

[PATCH] D158247: [CUDA][HIP] Fix overloading resolution in global variable initializer

2023-08-29 Thread Yaxun Liu via Phabricator via cfe-commits
This revision was landed with ongoing or failed builds. This revision was automatically updated to reflect the committed changes. Closed by commit rGde0df639724b: [CUDA][HIP] Fix overloading resolution in global variable initializer (authored by yaxunl). Herald added a project: clang. Repository:

[PATCH] D158247: [CUDA][HIP] Fix overloading resolution in global variable initializer

2023-09-07 Thread Yaxun Liu via Phabricator via cfe-commits
yaxunl reopened this revision. yaxunl added a comment. This revision is now accepted and ready to land. The patch was reverted since it caused regressions on Windows for HIP. A reduced test case is: typedef void (__stdcall* funcTy)(); void invoke(funcTy f); static void __stdcall callee(

[PATCH] D158247: [CUDA][HIP] Fix overloading resolution in global variable initializer

2023-09-07 Thread Yaxun Liu via Phabricator via cfe-commits
yaxunl closed this revision. yaxunl added a comment. Phabricator no longer allows me to update the patch. Created PR in github https://github.com/llvm/llvm-project/pull/65606 Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D158247/new/ https://review

[PATCH] D138221: [HIP] Fix lld failure when devie object is empty

2022-11-18 Thread Yaxun Liu via Phabricator via cfe-commits
yaxunl updated this revision to Diff 476454. yaxunl added a comment. need to specify osabi for elf64_amdgpu CHANGES SINCE LAST ACTION https://reviews.llvm.org/D138221/new/ https://reviews.llvm.org/D138221 Files: clang/lib/Driver/ToolChains/HIPAMD.cpp clang/test/Driver/hip-toolchain-devic

[PATCH] D138221: [HIP] Fix lld failure when devie object is empty

2022-11-21 Thread Yaxun Liu via Phabricator via cfe-commits
yaxunl updated this revision to Diff 476892. yaxunl added a comment. Herald added subscribers: kerbowa, jvesely. add test emulation-amdgpu.s CHANGES SINCE LAST ACTION https://reviews.llvm.org/D138221/new/ https://reviews.llvm.org/D138221 Files: clang/lib/Driver/ToolChains/HIPAMD.cpp clan

[PATCH] D138221: [HIP] Fix lld failure when devie object is empty

2022-11-21 Thread Yaxun Liu via Phabricator via cfe-commits
yaxunl marked an inline comment as done. yaxunl added a comment. In D138221#3939384, @MaskRay wrote: >> Some host relocatable objects may not contain device relocatable objects, >> where an empty file is passed to lld, which causes lld to fail. > > How

[PATCH] D138221: [HIP] Fix lld failure when devie object is empty

2022-11-21 Thread Yaxun Liu via Phabricator via cfe-commits
yaxunl marked an inline comment as done. yaxunl added a comment. In D138221#3942095, @MaskRay wrote: > In D138221#3941173, @yaxunl wrote: > >> In D138221#3939384, @Mask

[PATCH] D138391: clang/HIP: Add new header test for math IR gen

2022-11-22 Thread Yaxun Liu via Phabricator via cfe-commits
yaxunl accepted this revision. yaxunl added a comment. LGTM. Thanks. CHANGES SINCE LAST ACTION https://reviews.llvm.org/D138391/new/ https://reviews.llvm.org/D138391 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi

[PATCH] D138221: [HIP] Fix lld failure when devie object is empty

2022-11-22 Thread Yaxun Liu via Phabricator via cfe-commits
This revision was automatically updated to reflect the committed changes. Closed by commit rG056ebadf5c75: [HIP] Fix lld failure when devie object is empty (authored by yaxunl). Herald added a project: clang. Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.

[PATCH] D138473: clang/HIP: Inline frexp/frexpf implementations

2022-11-22 Thread Yaxun Liu via Phabricator via cfe-commits
yaxunl added a comment. Are you sure these functions are equivalent? we do not have a comprehensive test for these functions regarding accuracy. CHANGES SINCE LAST ACTION https://reviews.llvm.org/D138473/new/ https://reviews.llvm.org/D138473 ___ c

[PATCH] D138509: clang/HIP: Add another math header test

2022-11-28 Thread Yaxun Liu via Phabricator via cfe-commits
yaxunl accepted this revision. yaxunl added a comment. This revision is now accepted and ready to land. LGTM. Thanks. CHANGES SINCE LAST ACTION https://reviews.llvm.org/D138509/new/ https://reviews.llvm.org/D138509 ___ cfe-commits mailing list cfe-

[PATCH] D138439: clang: Fix cast failure when using -fsanitize=undefined for HIP

2022-11-28 Thread Yaxun Liu via Phabricator via cfe-commits
yaxunl accepted this revision. yaxunl added a comment. This revision is now accepted and ready to land. LGTM. Thanks. CHANGES SINCE LAST ACTION https://reviews.llvm.org/D138439/new/ https://reviews.llvm.org/D138439 ___ cfe-commits mailing list cfe-

[PATCH] D138651: [CUDA][HIP] Don't diagnose use for __bf16

2022-11-28 Thread Yaxun Liu via Phabricator via cfe-commits
yaxunl added a comment. In D138651#3954179, @tra wrote: >> Builds were failing because "__bf16" wasn't allowed on the target. > > For CUDA/NVPTX we've solved the issue by implementing storage-only support > for NVPTX: https://reviews.llvm.org/D136311 >

[PATCH] D139045: [HIP] use detected GPU in --offload-arch

2022-11-30 Thread Yaxun Liu via Phabricator via cfe-commits
yaxunl created this revision. yaxunl added a reviewer: tra. Herald added subscribers: kosarev, kerbowa, jvesely. Herald added a project: All. yaxunl requested review of this revision. Herald added a subscriber: MaskRay. Currently HIP uses gfx803 as offload arch if not specified. This is not conven

[PATCH] D139045: [HIP] use detected GPU in --offload-arch

2022-11-30 Thread Yaxun Liu via Phabricator via cfe-commits
yaxunl added a comment. In D139045#3961931, @tra wrote: >> This patch detects system GPU and use them in --offload-arch if not >> specified. If system GPU cannot be detected clang will fall back to gfx803. > > I don't think auto-probing is something we

[PATCH] D139045: [HIP] use detected GPU in --offload-arch

2022-12-01 Thread Yaxun Liu via Phabricator via cfe-commits
yaxunl updated this revision to Diff 479365. yaxunl added a comment. fix error handling CHANGES SINCE LAST ACTION https://reviews.llvm.org/D139045/new/ https://reviews.llvm.org/D139045 Files: clang/lib/Driver/Driver.cpp clang/lib/Driver/ToolChains/AMDGPU.h Index: clang/lib/Driver/ToolC

[PATCH] D138393: HIP: Directly call fabs builtins

2022-12-01 Thread Yaxun Liu via Phabricator via cfe-commits
yaxunl accepted this revision. yaxunl added a comment. This revision is now accepted and ready to land. LGTM. Thanks CHANGES SINCE LAST ACTION https://reviews.llvm.org/D138393/new/ https://reviews.llvm.org/D138393 ___ cfe-commits mailing list cfe-c

[PATCH] D139045: [HIP] use detected GPU in --offload-arch

2022-12-01 Thread Yaxun Liu via Phabricator via cfe-commits
yaxunl added a comment. In D139045#3964875, @tra wrote: > In any case, I think this is something that may need a wider forum. Ask on > LLVM discourse? RFC opened on Discourse: https://discourse.llvm.org/t/rfc-let-clang-use-system-gpu-as-default-offload-

[PATCH] D76520: [CUDA][HIP] Add -Xarch_device and -Xarch_host options

2020-03-20 Thread Yaxun Liu via Phabricator via cfe-commits
yaxunl created this revision. yaxunl added a reviewer: tra. Herald added a reviewer: jdoerfert. The argument after -Xarch_device will be added to the arguments for CUDA/HIP device compilation and will be removed for host compilation. The argument after -Xarch_host will be added to the arguments f

[PATCH] D76455: [NFC] Refactor handling of Xarch option

2020-03-22 Thread Yaxun Liu via Phabricator via cfe-commits
This revision was automatically updated to reflect the committed changes. Closed by commit rG78957bab5515: [NFC] Refactor handling of Xarch option (authored by yaxunl). Herald added a project: clang. Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D7645

[PATCH] D76520: [CUDA][HIP] Add -Xarch_device and -Xarch_host options

2020-03-23 Thread Yaxun Liu via Phabricator via cfe-commits
yaxunl added a comment. In D76520#1934341, @tra wrote: > Does it handle options with values? E.g. if I want to pass > `-mframe-pointer=none` ? I vaguely recall the current -Xarch_* implementation > had some limitations. > It may be worth adding a test

[PATCH] D76520: [CUDA][HIP] Add -Xarch_device and -Xarch_host options

2020-03-23 Thread Yaxun Liu via Phabricator via cfe-commits
yaxunl updated this revision to Diff 252024. yaxunl added a comment. Add a test for passing options with value CHANGES SINCE LAST ACTION https://reviews.llvm.org/D76520/new/ https://reviews.llvm.org/D76520 Files: clang/include/clang/Driver/Options.td clang/include/clang/Driver/ToolChain.

[PATCH] D76520: [CUDA][HIP] Add -Xarch_device and -Xarch_host options

2020-03-23 Thread Yaxun Liu via Phabricator via cfe-commits
yaxunl added a comment. In D76520#1937217, @tra wrote: > In D76520#1936837, @yaxunl wrote: > > > -Xarch_ works with driver options having value, e.g. > > `-fcf-protection=branch`. I added a test for that. > > > >

[PATCH] D76631: [Clang] Fix HIP tests when running on Windows with the LLVM toolchain in the path

2020-03-23 Thread Yaxun Liu via Phabricator via cfe-commits
yaxunl added a comment. I am curious why opt and llc is not affected Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D76631/new/ https://reviews.llvm.org/D76631 ___ cfe-commits mailing list cfe-commits@li

[PATCH] D76520: [CUDA][HIP] Add -Xarch_device and -Xarch_host options

2020-03-23 Thread Yaxun Liu via Phabricator via cfe-commits
yaxunl updated this revision to Diff 252105. yaxunl added a comment. add TODO for fixing space separated arguments CHANGES SINCE LAST ACTION https://reviews.llvm.org/D76520/new/ https://reviews.llvm.org/D76520 Files: clang/include/clang/Driver/Options.td clang/include/clang/Driver/ToolCh

[PATCH] D76631: [Clang] Fix HIP tests when running on Windows with the LLVM toolchain in the path

2020-03-23 Thread Yaxun Liu via Phabricator via cfe-commits
yaxunl accepted this revision. yaxunl added a comment. This revision is now accepted and ready to land. LGTM. Thanks. Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D76631/new/ https://reviews.llvm.org/D76631 __

[PATCH] D76631: [Clang] Fix HIP tests when running on Windows with the LLVM toolchain in the path

2020-03-23 Thread Yaxun Liu via Phabricator via cfe-commits
yaxunl added a comment. In D76631#1937681, @aganea wrote: > In D76631#1937428, @yaxunl wrote: > > > I am curious why opt and llc is not affected > > > In one case (opt, llc, clang-offload-bundler) it finds those p

[PATCH] D76520: [CUDA][HIP] Add -Xarch_device and -Xarch_host options

2020-03-24 Thread Yaxun Liu via Phabricator via cfe-commits
This revision was automatically updated to reflect the committed changes. Closed by commit rG2ae25647d1a3: [CUDA][HIP] Add -Xarch_device and -Xarch_host options (authored by yaxunl). Herald added a project: clang. Changed prior to commit: https://reviews.llvm.org/D76520?vs=252105&id=252311#toc

[PATCH] D76772: [AMDGPU] Add __builtin_amdgcn_workgroup_size_x/y/z

2020-03-25 Thread Yaxun Liu via Phabricator via cfe-commits
yaxunl created this revision. yaxunl added reviewers: arsenm, b-sumner, cfang. Herald added subscribers: kerbowa, t-tye, tpr, dstuttard, nhaehnle, wdng, jvesely, kzhuravl. The main purpose of introducing these builtins is to add a range metadata [1, 1025) on the work group size loaded from dispa

[PATCH] D76772: [AMDGPU] Add __builtin_amdgcn_workgroup_size_x/y/z

2020-03-25 Thread Yaxun Liu via Phabricator via cfe-commits
yaxunl updated this revision to Diff 252621. yaxunl marked 9 inline comments as done. yaxunl added a comment. Revised by Matt's comments CHANGES SINCE LAST ACTION https://reviews.llvm.org/D76772/new/ https://reviews.llvm.org/D76772 Files: clang/include/clang/Basic/BuiltinsAMDGPU.def clan

[PATCH] D76772: [AMDGPU] Add __builtin_amdgcn_workgroup_size_x/y/z

2020-03-25 Thread Yaxun Liu via Phabricator via cfe-commits
yaxunl added inline comments. Comment at: clang/lib/CodeGen/CGBuiltin.cpp:13428 +return Call; + return CGF.Builder.CreateAddrSpaceCast(Call, RetTy); +} arsenm wrote: > Why is this necessary? The builtin always has the same return type? due to https://github

[PATCH] D76795: [HIP] Change default --gpu-max-threads-per-block value to 1024

2020-03-25 Thread Yaxun Liu via Phabricator via cfe-commits
yaxunl created this revision. yaxunl added reviewers: b-sumner, tra. Herald added subscribers: kerbowa, nhaehnle, jvesely. This better matches CUDA behavior. https://reviews.llvm.org/D76795 Files: clang/include/clang/Basic/LangOptions.def clang/lib/CodeGen/TargetInfo.cpp clang/test/CodeGe

[PATCH] D76795: [HIP] Change default --gpu-max-threads-per-block value to 1024

2020-03-25 Thread Yaxun Liu via Phabricator via cfe-commits
yaxunl marked 2 inline comments as done. yaxunl added inline comments. Comment at: clang/lib/CodeGen/TargetInfo.cpp:8123 +// --gpu-max-threads-per-block=n or its default value for HIP. +const unsigned OpenCLMaxWorkGroupSize = 256; +const unsigned MaxWorkGroupSize = --

[PATCH] D76795: [HIP] Change default --gpu-max-threads-per-block value to 1024

2020-03-25 Thread Yaxun Liu via Phabricator via cfe-commits
yaxunl updated this revision to Diff 252661. yaxunl added a comment. change variable names CHANGES SINCE LAST ACTION https://reviews.llvm.org/D76795/new/ https://reviews.llvm.org/D76795 Files: clang/include/clang/Basic/LangOptions.def clang/lib/CodeGen/TargetInfo.cpp clang/test/CodeGen

[PATCH] D76772: [AMDGPU] Add __builtin_amdgcn_workgroup_size_x/y/z

2020-03-26 Thread Yaxun Liu via Phabricator via cfe-commits
yaxunl marked 2 inline comments as done. yaxunl added inline comments. Comment at: clang/test/CodeGenCUDA/amdgpu-workgroup-size.cu:2 +// RUN: %clang_cc1 -triple amdgcn-amd-amdhsa \ +// RUN: -fcuda-is-device -emit-llvm -o - -x hip %s \ +// RUN: | FileCheck %s -

[PATCH] D76937: Fix infinite recursion in deferred diagnostic emitter

2020-03-27 Thread Yaxun Liu via Phabricator via cfe-commits
yaxunl created this revision. yaxunl added reviewers: rjmccall, ABataev. Herald added a reviewer: jdoerfert. Currently deferred diagnostic emitter checks variable decl in DeclRefExpr, which causes infinite recursion for cases like `long a = (long)&a;`. Deferred diagnostic emitter does not need ch

[PATCH] D76887: AMDGPU: Make HIPToolChain a subclass of ROCMToolChain

2020-03-27 Thread Yaxun Liu via Phabricator via cfe-commits
yaxunl added inline comments. Comment at: clang/lib/Driver/ToolChains/HIP.cpp:271 const ToolChain &HostTC, const ArgList &Args) -: ToolChain(D, Triple, Args), HostTC(HostTC) { +: ROCMToolChain(D, Triple, Args), HostTC(HostTC) { // Lookup b

[PATCH] D76772: [AMDGPU] Add __builtin_amdgcn_workgroup_size_x/y/z

2020-03-27 Thread Yaxun Liu via Phabricator via cfe-commits
This revision was automatically updated to reflect the committed changes. yaxunl marked an inline comment as done. Closed by commit rG369e26ca9e0d: [AMDGPU] Add __builtin_amdgcn_workgroup_size_x/y/z (authored by yaxunl). Herald added a project: clang. Repository: rG LLVM Github Monorepo CHANGE

[PATCH] D76987: Rename options --cuda-gpu-arch and --no-cuda-gpu-arch

2020-03-28 Thread Yaxun Liu via Phabricator via cfe-commits
yaxunl created this revision. yaxunl added a reviewer: tra. yaxunl edited the summary of this revision. Options --cuda-gpu-arch and --no-cuda-gpu-arch are shared between CUDA and HIP. It is desirable to rename them to more generic names to avoid confusion. One option is --gpu-arch and --no-gpu-a

[PATCH] D77013: [AMDGPU] Add options -mamdgpu-ieee -mno-amdgpu-ieee

2020-03-29 Thread Yaxun Liu via Phabricator via cfe-commits
yaxunl created this revision. yaxunl added reviewers: arsenm, b-sumner, rjmccall. Herald added subscribers: kerbowa, t-tye, tpr, dstuttard, nhaehnle, wdng, jvesely, kzhuravl. The AMDGPU backend needs to know whether IEEE754-2008 NaN-compliant instructions need to be emitted for a function, which is co

[PATCH] D77028: [NFC] Refactor DeferredDiagsEmitter and skip redundant visit

2020-03-29 Thread Yaxun Liu via Phabricator via cfe-commits
yaxunl created this revision. yaxunl added reviewers: rjmccall, ABataev. Move function emitDeferredDiags from Sema to DeferredDiagsEmitter since it is only used by DeferredDiagsEmitter. Also record number of diagnostics triggered by each function and skip a function if it is known not to emit any

[PATCH] D77013: [AMDGPU] Add options -mamdgpu-ieee -mno-amdgpu-ieee

2020-03-30 Thread Yaxun Liu via Phabricator via cfe-commits
yaxunl planned changes to this revision. yaxunl added a comment. This patch is put on hold due to some concerns. CHANGES SINCE LAST ACTION https://reviews.llvm.org/D77013/new/ https://reviews.llvm.org/D77013 ___ cfe-commits mailing list cfe-commi

[PATCH] D76887: AMDGPU: Make HIPToolChain a subclass of ROCMToolChain

2020-03-30 Thread Yaxun Liu via Phabricator via cfe-commits
yaxunl accepted this revision. yaxunl added a comment. This revision is now accepted and ready to land. LGTM. Thanks! CHANGES SINCE LAST ACTION https://reviews.llvm.org/D76887/new/ https://reviews.llvm.org/D76887 ___ cfe-commits mailing list cfe-

[PATCH] D76937: Fix infinite recursion in deferred diagnostic emitter

2020-03-30 Thread Yaxun Liu via Phabricator via cfe-commits
yaxunl added a comment. In D76937#1950077, @rjmccall wrote: > Can you explain what exactly the emission/semantic model is for variables? > Normal code-generation absolutely triggers the emission of many variables > lazily (e.g. internal-linkage globals

[PATCH] D76987: Rename options --cuda-gpu-arch and --no-cuda-gpu-arch

2020-03-30 Thread Yaxun Liu via Phabricator via cfe-commits
yaxunl added a comment. In D76987#1950366, @gregrodgers wrote: > This was discussed on llvm-dev three years ago. Here is the thread. > > http://lists.llvm.org/pipermail/llvm-dev/2017-February/109930.html > > The last name discussed was "--offload-arch".

[PATCH] D76987: Rename options --cuda-gpu-arch and --no-cuda-gpu-arch

2020-03-30 Thread Yaxun Liu via Phabricator via cfe-commits
This revision was automatically updated to reflect the committed changes. Closed by commit rG764f54bb857b: Rename options --cuda-gpu-arch and --no-cuda-gpu-arch (authored by yaxunl). Herald added a project: clang. Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.

[PATCH] D76862: HIP: Ensure new denormal mode attributes are set

2020-03-31 Thread Yaxun Liu via Phabricator via cfe-commits
yaxunl added a comment. Are there any other clang options affecting flushing denormals? If so, are they working properly after this change? Do we need to have tests for them? Thanks. CHANGES SINCE LAST ACTION https://reviews.llvm.org/D76862/new/ https://reviews.llvm.org/D76862 ___

[PATCH] D76862: HIP: Ensure new denormal mode attributes are set

2020-03-31 Thread Yaxun Liu via Phabricator via cfe-commits
yaxunl accepted this revision. yaxunl added a comment. This revision is now accepted and ready to land. LGTM. Thanks! CHANGES SINCE LAST ACTION https://reviews.llvm.org/D76862/new/ https://reviews.llvm.org/D76862 ___ cfe-commits mailing list cfe-

[PATCH] D76950: HIP: Link correct denormal mode library

2020-03-31 Thread Yaxun Liu via Phabricator via cfe-commits
yaxunl accepted this revision. yaxunl added a comment. This revision is now accepted and ready to land. LGTM. Thanks! Comment at: clang/test/Driver/hip-device-libs.hip:5 -// Test flush-denormals-to-zero enabled uses oclc_daz_opt_on +// Test if if oclc_daz_opt_on or if oclc_da

[PATCH] D59321: AMDGPU: Teach toolchain to link rocm device libs

2020-03-31 Thread Yaxun Liu via Phabricator via cfe-commits
yaxunl added inline comments. Comment at: clang/include/clang/Basic/DiagnosticDriverKinds.td:264 def err_drv_invalid_malign_branch_EQ : Error< "invalid argument '%0' to -malign-branch=; each element must be one of: %1">; could you please rebase your patch?

[PATCH] D76937: Fix infinite recursion in deferred diagnostic emitter

2020-04-01 Thread Yaxun Liu via Phabricator via cfe-commits
yaxunl updated this revision to Diff 254262. yaxunl added a comment. fix assert message CHANGES SINCE LAST ACTION https://reviews.llvm.org/D76937/new/ https://reviews.llvm.org/D76937 Files: clang/lib/Sema/Sema.cpp clang/lib/Sema/SemaDecl.cpp clang/test/OpenMP/nvptx_target_exceptions_me

[PATCH] D76937: Fix infinite recursion in deferred diagnostic emitter

2020-04-01 Thread Yaxun Liu via Phabricator via cfe-commits
yaxunl updated this revision to Diff 254261. yaxunl added a comment. Revised by John's comments. Also only check file scope variables. CHANGES SINCE LAST ACTION https://reviews.llvm.org/D76937/new/ https://reviews.llvm.org/D76937 Files: clang/lib/Sema/Sema.cpp clang/lib/Sema/SemaDecl.cpp

[PATCH] D59321: AMDGPU: Teach toolchain to link rocm device libs

2020-04-01 Thread Yaxun Liu via Phabricator via cfe-commits
yaxunl added inline comments. Comment at: clang/include/clang/Driver/Options.td:611 def fno_cuda_short_ptr : Flag<["-"], "fno-cuda-short-ptr">; +def rocm_path_EQ : Joined<["--"], "rocm-path=">, Group, + HelpText<"ROCm installation path">; HIP toolchain will als

[PATCH] D77234: clang/AMDGPU: Stop setting old denormal subtarget features

2020-04-01 Thread Yaxun Liu via Phabricator via cfe-commits
yaxunl accepted this revision. yaxunl added a comment. This revision is now accepted and ready to land. LGTM. Thanks! CHANGES SINCE LAST ACTION https://reviews.llvm.org/D77234/new/ https://reviews.llvm.org/D77234 ___ cfe-commits mailing list cfe-

[PATCH] D76937: Fix infinite recursion in deferred diagnostic emitter

2020-04-01 Thread Yaxun Liu via Phabricator via cfe-commits
This revision was automatically updated to reflect the committed changes. Closed by commit rG5767085c8de9: Fix infinite recursion in deferred diag emitter (authored by yaxunl). Herald added a project: clang. Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.o

[PATCH] D77028: [NFC] Refactor DeferredDiagsEmitter and skip redundant visit

2020-04-01 Thread Yaxun Liu via Phabricator via cfe-commits
yaxunl updated this revision to Diff 254407. yaxunl added a comment. rebase CHANGES SINCE LAST ACTION https://reviews.llvm.org/D77028/new/ https://reviews.llvm.org/D77028 Files: clang/include/clang/Sema/Sema.h clang/lib/Sema/Sema.cpp Index: clang/lib/Sema/Sema.cpp ==

[PATCH] D77028: [NFC] Refactor DeferredDiagsEmitter and skip redundant visit

2020-04-02 Thread Yaxun Liu via Phabricator via cfe-commits
yaxunl added inline comments. Comment at: clang/lib/Sema/Sema.cpp:1508 void checkFunc(SourceLocation Loc, FunctionDecl *FD) { +auto DiagsCountIt = DiagsCount.find(FD); FunctionDecl *Caller = UseStack.empty() ? nullptr : UseStack.back(); rjmccall wrote

[PATCH] D77028: [NFC] Refactor DeferredDiagsEmitter and skip redundant visit

2020-04-02 Thread Yaxun Liu via Phabricator via cfe-commits
yaxunl updated this revision to Diff 254537. yaxunl marked 2 inline comments as done. yaxunl added a comment. Herald added a reviewer: jdoerfert. added comments CHANGES SINCE LAST ACTION https://reviews.llvm.org/D77028/new/ https://reviews.llvm.org/D77028 Files: clang/include/clang/Sema/Se

[PATCH] D77329: [AMDGPU] Allow AGPR in inline asm

2020-04-02 Thread Yaxun Liu via Phabricator via cfe-commits
yaxunl created this revision. yaxunl added reviewers: rampitec, arsenm. Herald added subscribers: kerbowa, t-tye, tpr, dstuttard, nhaehnle, wdng, jvesely, kzhuravl. rampitec accepted this revision. rampitec added a comment. This revision is now accepted and ready to land. Thanks. Could you also

[PATCH] D77329: [AMDGPU] Allow AGPR in inline asm

2020-04-02 Thread Yaxun Liu via Phabricator via cfe-commits
yaxunl updated this revision to Diff 254619. yaxunl added a comment. added agprs to GCCRegNames and fixed types in test CHANGES SINCE LAST ACTION https://reviews.llvm.org/D77329/new/ https://reviews.llvm.org/D77329 Files: clang/lib/Basic/Targets/AMDGPU.cpp clang/lib/Basic/Targets/AMDGPU.

[PATCH] D77329: [AMDGPU] Allow AGPR in inline asm

2020-04-02 Thread Yaxun Liu via Phabricator via cfe-commits
yaxunl updated this revision to Diff 254668. yaxunl added a comment. fix test CHANGES SINCE LAST ACTION https://reviews.llvm.org/D77329/new/ https://reviews.llvm.org/D77329 Files: clang/lib/Basic/Targets/AMDGPU.cpp clang/lib/Basic/Targets/AMDGPU.h clang/test/CodeGenOpenCL/inline-asm-am

[PATCH] D74807: Add cl_khr_mipmap_image_writes as supported to AMDGPU

2020-02-18 Thread Yaxun Liu via Phabricator via cfe-commits
yaxunl created this revision. yaxunl added a reviewer: b-sumner. Herald added subscribers: kerbowa, t-tye, tpr, dstuttard, nhaehnle, wdng, jvesely, kzhuravl. https://reviews.llvm.org/D74807 Files: clang/lib/Basic/Targets/AMDGPU.h Index: clang/lib/Basic/Targets/AMDGPU.h ==

[PATCH] D70172: [CUDA][HIP][OpenMP] Emit deferred diagnostics by a post-parsing AST travese

2020-02-19 Thread Yaxun Liu via Phabricator via cfe-commits
yaxunl added a comment. In D70172#1883567, @ABataev wrote: > Seems to me, it causes some other issues. See > https://bugs.llvm.org/show_bug.cgi?id=44948 for example I will fix that bug. Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION

[PATCH] D70172: [CUDA][HIP][OpenMP] Emit deferred diagnostics by a post-parsing AST travese

2020-02-19 Thread Yaxun Liu via Phabricator via cfe-commits
yaxunl marked an inline comment as done. yaxunl added inline comments. Comment at: clang/lib/Sema/Sema.cpp:1514 + void visitUsedDecl(SourceLocation Loc, Decl *D) { +if (auto *TD = dyn_cast(D)) { + for (auto *DD : TD->decls()) { rjmccall wrote: > erichke

[PATCH] D74807: Add cl_khr_mipmap_image_writes as supported to AMDGPU

2020-02-19 Thread Yaxun Liu via Phabricator via cfe-commits
This revision was automatically updated to reflect the committed changes. Closed by commit rGed07c89fc50f: Add cl_khr_mipmap_image_writes as supported to AMDGPU (authored by yaxunl). Herald added a project: clang. Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.

[PATCH] D70172: [CUDA][HIP][OpenMP] Emit deferred diagnostics by a post-parsing AST travese

2020-02-20 Thread Yaxun Liu via Phabricator via cfe-commits
yaxunl marked an inline comment as done. yaxunl added inline comments. Comment at: clang/lib/Sema/Sema.cpp:1514 + void visitUsedDecl(SourceLocation Loc, Decl *D) { +if (auto *TD = dyn_cast(D)) { + for (auto *DD : TD->decls()) { rjmccall wrote: > yaxunl

[PATCH] D74910: [OpenCL] Remove spurious atomic_fetch_min/max builtins

2020-02-20 Thread Yaxun Liu via Phabricator via cfe-commits
yaxunl added a comment. Are you sure this change will not break OpenCL conformance tests? I remember they are there for some reason. Repository: rC Clang CHANGES SINCE LAST ACTION https://reviews.llvm.org/D74910/new/ https://reviews.llvm.org/D74910 _

[PATCH] D75028: Make __builtin_amdgcn_dispatch_ptr dereferenceable and align at 4

2020-02-23 Thread Yaxun Liu via Phabricator via cfe-commits
yaxunl created this revision. yaxunl added reviewers: arsenm, cfang, b-sumner. Herald added subscribers: llvm-commits, kerbowa, nhaehnle, wdng, jvesely. Herald added a project: LLVM. https://reviews.llvm.org/D75028 Files: clang/lib/CodeGen/CGBuiltin.cpp clang/test/CodeGenCUDA/builtins-amdgcn.

[PATCH] D75028: Make __builtin_amdgcn_dispatch_ptr dereferenceable and align at 4

2020-02-24 Thread Yaxun Liu via Phabricator via cfe-commits
yaxunl marked an inline comment as done. yaxunl added inline comments. Comment at: llvm/include/llvm/IR/IntrinsicsAMDGPU.td:144 def int_amdgcn_dispatch_ptr : - GCCBuiltin<"__builtin_amdgcn_dispatch_ptr">, Intrinsic<[LLVMQualPointerType], [], arsenm wrote: >

[PATCH] D75028: Make __builtin_amdgcn_dispatch_ptr dereferenceable and align at 4

2020-02-25 Thread Yaxun Liu via Phabricator via cfe-commits
This revision was automatically updated to reflect the committed changes. Closed by commit rGa57d9652a0dc: Make __builtin_amdgcn_dispatch_ptr dereferenceable and align at 4 (authored by yaxunl). Herald added a project: clang. Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION http

[PATCH] D70172: [CUDA][HIP][OpenMP] Emit deferred diagnostics by a post-parsing AST travese

2020-02-26 Thread Yaxun Liu via Phabricator via cfe-commits
yaxunl added a comment. I tried recording functions to be emitted during normal parsing and using it as starting point for the final traversal. It is quite promising. I only get one lit test failure for OpenMP: int foobar2(); #pragma omp declare target int (*B)() = &foobar2; #pragma

[PATCH] D70172: [CUDA][HIP][OpenMP] Emit deferred diagnostics by a post-parsing AST travese

2020-02-26 Thread Yaxun Liu via Phabricator via cfe-commits
yaxunl added a comment. Also, we cannot remove traversing of RecordDecl and CapturedDecl encountered in function body since we have OpenMP test like this: int main() { #pragma omp target { t1(0); } return 0; } This results in a kernel function embedded in a captured reco

[PATCH] D74910: [OpenCL] Remove spurious atomic_fetch_min/max builtins

2020-02-27 Thread Yaxun Liu via Phabricator via cfe-commits
yaxunl accepted this revision. yaxunl added a comment. This revision is now accepted and ready to land. LGTM. Thanks. Repository: rC Clang CHANGES SINCE LAST ACTION https://reviews.llvm.org/D74910/new/ https://reviews.llvm.org/D74910 ___ cfe-c

[PATCH] D75285: Mark restrict pointer or reference to const as invariant

2020-02-27 Thread Yaxun Liu via Phabricator via cfe-commits
yaxunl created this revision. yaxunl added a reviewer: rjmccall. We have seen users use `const int* restrict` to indicate that the memory pointed to by the pointer is invariant. This makes sense since restrict means the memory is not aliased by any other pointers whereas const means the memory d

[PATCH] D75285: Mark restrict pointer or reference to const as invariant

2020-02-27 Thread Yaxun Liu via Phabricator via cfe-commits
yaxunl added a comment. If this is not the right way to tell the compiler that the memory pointed to by a pointer is invariant, what is the recommended way? Can we introduce clang builtins for llvm.invariant.start and llvm.invariant.end to allow users to specify that? Thanks. CHANGES SINCE LAST ACTI

[PATCH] D75285: Mark restrict pointer or reference to const as invariant

2020-02-27 Thread Yaxun Liu via Phabricator via cfe-commits
yaxunl added a comment. In D75285#1896400 , @hfinkel wrote: > Unfortunately, we cannot do this kind of thing just because it seems to make > sense. The language semantics must be exactly satisfied by the IR-level > semantics. I certainly agree that it wo

[PATCH] D70172: [CUDA][HIP][OpenMP] Emit deferred diagnostics by a post-parsing AST travese

2020-02-27 Thread Yaxun Liu via Phabricator via cfe-commits
yaxunl added a comment. I still get an assertion failure when I use the built clang with check-mlir. The reduced testcase is class A { public: int foo(); }; static A a; struct B { B(int x = a.foo()); }; void test() { B x; } The assertion I got is: clang: /home/yaxun

[PATCH] D75285: Mark restrict pointer or reference to const as invariant

2020-02-27 Thread Yaxun Liu via Phabricator via cfe-commits
yaxunl added a comment. In D75285#1896458 , @rjmccall wrote: > Unfortunately, `const` also doesn't mean that the memory doesn't change. It > does mean it can't be changed through this pointer, but `restrict` allows you > to derive more pointers from it
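The first point in the quoted reply (that `const` only restricts writes through this particular pointer) can be illustrated with a small sketch. This is not code from the review: `read_twice_sum`, `bump` and `g` are hypothetical names chosen for the illustration, and `restrict` is left out because it is not standard C++.

```cpp
#include <cassert>

// Hypothetical global that is reachable both through the const pointer
// and directly by name.
static int g = 1;

// Hypothetical callback that modifies the pointee behind the reader's back.
static void bump() { g = 41; }

// Reads *p twice with a call in between. The const qualifier forbids
// writing through p, but it does not promise that *p stays the same,
// so the second read has to be performed again after mutate() returns.
int read_twice_sum(const int *p, void (*mutate)()) {
    int first = *p;   // sees the value before the callback
    mutate();         // may legally change the object p points to
    int second = *p;  // sees the value after the callback
    return first + second;
}
```

Calling `read_twice_sum(&g, bump)` reads two different values through the same pointer-to-const, which is why `const` on its own cannot be translated into an invariance guarantee.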

[PATCH] D70172: [CUDA][HIP][OpenMP] Emit deferred diagnostics by a post-parsing AST travese

2020-02-28 Thread Yaxun Liu via Phabricator via cfe-commits
yaxunl updated this revision to Diff 247275. yaxunl added a comment. Do not traverse the whole CU. Record potentially emitted functions and variables during normal parsing and traverse them instead. Also fixed bug 44948 and regression in check-mlir. CHANGES SINCE LAST ACTION https://reviews.
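The approach described in this update (record candidate roots while parsing, then walk only what is reachable from them) can be sketched as a plain worklist traversal. This is a toy model under stated assumptions, not the patch's implementation: `ToyUnit`, `recordedRoots`, `deferredDiag` and `emitDeferred` are hypothetical names and do not correspond to clang's Sema data structures.

```cpp
#include <cassert>
#include <map>
#include <set>
#include <string>
#include <vector>

// Toy stand-in for a translation unit. All names here are hypothetical.
struct ToyUnit {
    // Call graph: function name -> names of functions it calls.
    std::map<std::string, std::vector<std::string>> callees;
    // Diagnostics deferred on a function until it is known to be emitted.
    std::map<std::string, std::string> deferredDiag;
    // Functions recorded during "parsing" as potentially emitted.
    std::vector<std::string> recordedRoots;
};

// Visits only functions reachable from the recorded roots, instead of
// every declaration in the unit, and collects their deferred diagnostics.
std::vector<std::string> emitDeferred(const ToyUnit &tu) {
    std::vector<std::string> out;
    std::set<std::string> seen;
    std::vector<std::string> work(tu.recordedRoots.begin(),
                                  tu.recordedRoots.end());
    while (!work.empty()) {
        std::string fn = work.back();
        work.pop_back();
        if (!seen.insert(fn).second)
            continue; // already visited
        auto diag = tu.deferredDiag.find(fn);
        if (diag != tu.deferredDiag.end())
            out.push_back(diag->second);
        auto edges = tu.callees.find(fn);
        if (edges != tu.callees.end())
            for (const auto &callee : edges->second)
                work.push_back(callee);
    }
    return out;
}
```

In this model a function that carries a deferred diagnostic but is never reached from a recorded root produces no output, which is the property that makes starting from the recorded roots cheaper than walking the whole unit.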

[PATCH] D75285: Mark restrict pointer or reference to const as invariant

2020-02-28 Thread Yaxun Liu via Phabricator via cfe-commits
yaxunl added a comment. In D75285#1897502 , @jeroen.dobbelaere wrote: > I don't think that 'restrict' is a good match for this behavior. For c++, the > alias_set proposal > (http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2014/n4150.pdf) would be >

[PATCH] D75423: [OpenCL] Mark pointers to constant address space as invariant

2020-03-01 Thread Yaxun Liu via Phabricator via cfe-commits
yaxunl created this revision. yaxunl added reviewers: rjmccall, Anastasia, bader. OpenCL constant address space is immutable, therefore pointers to constant address space can be marked with llvm.invariant.start permanently. This should allow more optimization opportunities in LLVM passes. http

[PATCH] D75402: [HIP] Make sure, unused hip-pinned-shadow global var is kept within device code

2020-03-01 Thread Yaxun Liu via Phabricator via cfe-commits
yaxunl added a comment. LGTM. Thanks. Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D75402/new/ https://reviews.llvm.org/D75402 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm

[PATCH] D75423: [OpenCL] Mark pointers to constant address space as invariant

2020-03-02 Thread Yaxun Liu via Phabricator via cfe-commits
yaxunl added a comment. In D75423#1901206 , @hliao wrote: > invariant checking already takes account of loading from constant address > space or memory (AA::pointsToConstantMemory), that's almost equivalent to > adding invariant attributes. Why do we mar

[PATCH] D75423: [OpenCL] Mark pointers to constant address space as invariant

2020-03-02 Thread Yaxun Liu via Phabricator via cfe-commits
yaxunl added a comment. In D75423#1901254 , @rjmccall wrote: > In D75423#1901206 , @hliao wrote: > > > invariant checking already takes account of loading from constant address > > space or memory (AA::pointsToCons

[PATCH] D75402: [HIP] Make sure, unused hip-pinned-shadow global var is kept within device code

2020-03-02 Thread Yaxun Liu via Phabricator via cfe-commits
yaxunl added inline comments. Comment at: clang/lib/CodeGen/CodeGenModule.cpp:1919 void CodeGenModule::addUsedGlobal(llvm::GlobalValue *GV) { LLVMUsed.emplace_back(GV); hliao wrote: > This check should be removed completely instead it should be revised for

[PATCH] D75402: [HIP] Make sure, unused hip-pinned-shadow global var is kept within device code

2020-03-02 Thread Yaxun Liu via Phabricator via cfe-commits
yaxunl added a comment. In D75402#1901370 , @hsmhsm wrote: > In D75402#1901361 , @hliao wrote: > > > BTW, why that variable cannot have an initializer? Suppose that initializer > > is a trivial one, initializing to

[PATCH] D75423: [OpenCL] Mark pointers to constant address space as invariant

2020-03-02 Thread Yaxun Liu via Phabricator via cfe-commits
yaxunl added a comment. In D75423#1901362 , @rjmccall wrote: > Okay, then I have no problem taking a patch for this into IRGen. But I think > it should be fine to do this by adding the invariant-load metadata when > loading from an l-value instead of in
