[clang] [Clang][OpenCL][AMDGPU] Use `byref` for OpenCL kernel arguments (PR #134892)

2025-04-08 Thread Shilei Tian via cfe-commits
https://github.com/shiltian edited https://github.com/llvm/llvm-project/pull/134892 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[clang] [Clang][AMDGPU] Accept builtins in lambda declarations (PR #135027)

2025-04-09 Thread Shilei Tian via cfe-commits
@@ -27,7 +27,7 @@ bool SemaAMDGPU::CheckAMDGCNBuiltinFunctionCall(unsigned BuiltinID, // position of memory order and scope arguments in the builtin unsigned OrderIndex, ScopeIndex; - const auto *FD = SemaRef.getCurFunctionDecl(); + const auto *FD = SemaRef.getCurFuncti

[clang] [AMDGPU][Clang] Add builtins for gfx12 ray tracing intrinsics (PR #135224)

2025-04-10 Thread Shilei Tian via cfe-commits
https://github.com/shiltian created https://github.com/llvm/llvm-project/pull/135224 __builtin_amdgcn_image_bvh8_intersect_ray __builtin_amdgcn_image_bvh_dual_intersect_ray For the above two builtins, the second and third return values of the intrinsics are returned through pointer-type functio

[clang] [AMDGPU][Clang] Add builtins for gfx12 ray tracing intrinsics (PR #135224)

2025-04-10 Thread Shilei Tian via cfe-commits
shiltian wrote: * **#135224** https://app.graphite.dev/github/pr/llvm/llvm-project/135224 👈 https://app.graphite.dev/github/pr/llvm/llvm-project/135

[clang] [AMDGPU][Clang] Add builtins for gfx12 ray tracing intrinsics (PR #135224)

2025-04-10 Thread Shilei Tian via cfe-commits
https://github.com/shiltian edited https://github.com/llvm/llvm-project/pull/135224

[clang] [libc] [llvm] [AMDGPU] Use COV6 by default (PR #118515)

2025-03-31 Thread Shilei Tian via cfe-commits
shiltian wrote: > @shiltian Could you update MLIR infrastructure for the new default as well? > `mlir/lib/Target/LLVM/ROCDL/Target.cpp` and > `mlir/lib/Dialect/LLVMIR/IR/ROCDLDialect.cpp`, which both keep an ear on the > ABI version, partly for linking in device libraries https://github.com/l

[clang] [llvm] [OffloadBundler] Rework the ctor of `OffloadTargetInfo` to support AMDGPU's generic target (PR #122629)

2025-03-17 Thread Shilei Tian via cfe-commits
https://github.com/shiltian updated https://github.com/llvm/llvm-project/pull/122629 >From ab66262e163a8c63c980d8298480556aad9c5b4c Mon Sep 17 00:00:00 2001 From: Shilei Tian Date: Mon, 17 Mar 2025 12:31:06 -0400 Subject: [PATCH 1/3] [OffloadBundler] Rework the ctor of `OffloadTargetInfo` to s

[clang] [llvm] [OffloadBundler] Rework the ctor of `OffloadTargetInfo` to support AMDGPU's generic target (PR #122629)

2025-03-17 Thread Shilei Tian via cfe-commits
@@ -84,31 +84,27 @@ OffloadTargetInfo::OffloadTargetInfo(const StringRef Target, : BundlerConfig(BC) { // TODO: Add error checking from ClangOffloadBundler.cpp - auto TargetFeatures = Target.split(':'); - auto TripleOrGPU = TargetFeatures.first.rsplit('-'); - - if (cl

[clang] [llvm] [DataLayout] Introduce sentinel pointer value (PR #131557)

2025-03-17 Thread Shilei Tian via cfe-commits
https://github.com/shiltian updated https://github.com/llvm/llvm-project/pull/131557 >From 015964e72ebc223bcd191ceb306de8fdbca360f3 Mon Sep 17 00:00:00 2001 From: Shilei Tian Date: Mon, 17 Mar 2025 13:52:06 -0400 Subject: [PATCH] [DataLayout] Introduce sentinel pointer value MIME-Version: 1.0 C

[clang] [llvm] [DataLayout] Introduce sentinel pointer value (PR #131557)

2025-03-17 Thread Shilei Tian via cfe-commits
https://github.com/shiltian updated https://github.com/llvm/llvm-project/pull/131557 >From efda127ecd06ea966df89425d10bd837c0cafe4e Mon Sep 17 00:00:00 2001 From: Ryotaro Kasuga Date: Mon, 17 Mar 2025 13:45:09 +0900 Subject: [PATCH] [DataLayout] Introduce sentinel pointer value MIME-Version: 1.

[clang] [llvm] [DataLayout] Introduce sentinel pointer value (PR #131557)

2025-03-17 Thread Shilei Tian via cfe-commits
https://github.com/shiltian updated https://github.com/llvm/llvm-project/pull/131557 >From 053949d4dd8f28a2daa57a6143f0267c0bd3af6c Mon Sep 17 00:00:00 2001 From: Ryotaro Kasuga Date: Mon, 17 Mar 2025 13:45:09 +0900 Subject: [PATCH 1/2] [LoopVectorize] Add test for follow-up metadata for loops

[clang] [llvm] [OffloadBundler] Rework the ctor of `OffloadTargetInfo` to support AMDGPU's generic target (PR #122629)

2025-03-17 Thread Shilei Tian via cfe-commits
https://github.com/shiltian updated https://github.com/llvm/llvm-project/pull/122629 >From 16509121603e55539a5fa26420343d74c39b7963 Mon Sep 17 00:00:00 2001 From: Shilei Tian Date: Mon, 17 Mar 2025 12:31:06 -0400 Subject: [PATCH] [OffloadBundler] Rework the ctor of `OffloadTargetInfo` to suppo

[clang] [flang] [llvm] [NFC][AMDGPU] Replace more direct arch comparison with isAMDGCN() (PR #131379)

2025-03-14 Thread Shilei Tian via cfe-commits
https://github.com/shiltian created https://github.com/llvm/llvm-project/pull/131379 This is an extension of #131357. Hopefully this would be the last one. >From 59bc234d4a5c343e093417150688a3231a230961 Mon Sep 17 00:00:00 2001 From: Shilei Tian Date: Fri, 14 Mar 2025 15:06:30 -0400 Subject: [

[clang] [llvm] [DataLayout] Introduce sentinel pointer value (PR #131557)

2025-03-17 Thread Shilei Tian via cfe-commits
@@ -552,6 +553,11 @@ class DataLayout { /// /// This includes an explicitly requested alignment (if the global has one). Align getPreferredAlign(const GlobalVariable *GV) const; + + /// Returns the sentinel pointer value for a given address space. If the + /// address s

[clang] [Clang][OpenCL][AMDGPU] Use `byref` for OpenCL kernel arguments (PR #134892)

2025-04-08 Thread Shilei Tian via cfe-commits
shiltian wrote: > The question isn't byval or byref, we already don't use byval. The important > ABI piece is the alignment of a pointer value passed indirectly. > > We lose all parameter attributes by going through indirect passing, but some > of those can be recovered by putting the metadata

[clang] [Clang][OpenCL][AMDGPU] Use `byref` for OpenCL kernel arguments (PR #134892)

2025-04-08 Thread Shilei Tian via cfe-commits
https://github.com/shiltian created https://github.com/llvm/llvm-project/pull/134892 Due to a previous workaround allowing kernels to be called from other functions, Clang currently doesn't use the `byref` attribute for aggregate kernel arguments. The issue was recently resolved in https://githu

[clang] [Clang][AMDGPU] Enable `avail-extern-to-local` for ThinLTO in HIP (PR #134476)

2025-04-08 Thread Shilei Tian via cfe-commits
https://github.com/shiltian updated https://github.com/llvm/llvm-project/pull/134476 >From d508aa41f7eb7767953c3eec745300c678029c04 Mon Sep 17 00:00:00 2001 From: Shilei Tian Date: Tue, 8 Apr 2025 10:47:47 -0400 Subject: [PATCH] [Clang][AMDGPU] Enable `avail-extern-to-local` for ThinLTO in HIP

[clang] [AMDGPU][Clang] Add builtins for gfx12 ray tracing intrinsics (PR #135224)

2025-04-11 Thread Shilei Tian via cfe-commits
https://github.com/shiltian closed https://github.com/llvm/llvm-project/pull/135224

[clang] [llvm] [AMDGPU] vmem-to-lds-load-insts incoherence between TargetParser and AMDGPU.td (PR #135376)

2025-04-11 Thread Shilei Tian via cfe-commits
https://github.com/shiltian approved this pull request. https://github.com/llvm/llvm-project/pull/135376

[clang] [Clang][OpenCL][AMDGPU] Use `byref` for aggregate OpenCL kernel arguments (PR #134892)

2025-04-13 Thread Shilei Tian via cfe-commits
https://github.com/shiltian closed https://github.com/llvm/llvm-project/pull/134892

[clang] [Clang] Forward two linker options to `lld` when ThinLTO is enabled for AMDGPU (PR #135690)

2025-04-14 Thread Shilei Tian via cfe-commits
https://github.com/shiltian closed https://github.com/llvm/llvm-project/pull/135690

[clang] [llvm] [Clang][OpenMP] Support for dispatch construct (Sema & Codegen) support (PR #131838)

2025-04-23 Thread Shilei Tian via cfe-commits
shiltian wrote: Hmm, this PR is much shorter than it used to be. https://github.com/llvm/llvm-project/pull/131838

[clang] [llvm] [Clang][OpenMP] Support for dispatch construct (Sema & Codegen) support (PR #131838)

2025-04-23 Thread Shilei Tian via cfe-commits
@@ -423,8 +423,8 @@ static void instantiateOMPDeclareVariantAttr( auto *FD = cast<FunctionDecl>(New); auto *ThisContext = dyn_cast_or_null<CXXRecordDecl>(FD->getDeclContext()); - auto &&SubstExpr = [FD, ThisContext, &S, &TemplateArgs](Expr *E) { -if (auto *DRE = dyn_cast<DeclRefExpr>(E->IgnoreParenImpCasts())

[clang] [AMDGPU][Clang] Add builtins for gfx12 ray tracing intrinsics (PR #135224)

2025-04-10 Thread Shilei Tian via cfe-commits
shiltian wrote: FWIW, this is part of the gfx12 upstream. https://github.com/llvm/llvm-project/pull/135224

[clang] [Clang][AMDGPU] Enable `avail-extern-to-local` for ThinLTO in HIP (PR #134476)

2025-04-14 Thread Shilei Tian via cfe-commits
https://github.com/shiltian closed https://github.com/llvm/llvm-project/pull/134476

[clang] [Clang][AMDGPU] Enable `avail-extern-to-local` for ThinLTO in HIP (PR #134476)

2025-04-14 Thread Shilei Tian via cfe-commits
shiltian wrote: ping https://github.com/llvm/llvm-project/pull/134476

[clang] [Clang][Driver] Only enable internalization for OpenMP target offloading with ThinLTO on AMDGPU (PR #138547)

2025-05-05 Thread Shilei Tian via cfe-commits
https://github.com/shiltian created https://github.com/llvm/llvm-project/pull/138547 None

[clang] [Clang][Driver] Only enable internalization for OpenMP target offloading with ThinLTO on AMDGPU (PR #138547)

2025-05-05 Thread Shilei Tian via cfe-commits
shiltian wrote: * **#138547** https://app.graphite.dev/github/pr/llvm/llvm-project/138547 👈 https://app.graphite.dev/github/pr/llvm/llvm-project/138

[clang] [Clang][Driver] Only enable internalization for OpenMP target offloading with ThinLTO on AMDGPU (PR #138547)

2025-05-05 Thread Shilei Tian via cfe-commits
https://github.com/shiltian closed https://github.com/llvm/llvm-project/pull/138547

[clang] [Clang][NFC] Move OffloadArch enum to a generic location (PR #137070)

2025-04-24 Thread Shilei Tian via cfe-commits
https://github.com/shiltian edited https://github.com/llvm/llvm-project/pull/137070

[clang] [Clang] Disable RTTI for offloading at the frontend level (PR #127082)

2025-04-25 Thread Shilei Tian via cfe-commits
https://github.com/shiltian approved this pull request. The AMDGPU part looks good. Not sure if CUDA supports it. @Artem-B https://github.com/llvm/llvm-project/pull/127082

[clang] [Clang] Forward two linker options to `lld` when ThinLTO is enabled for AMDGPU (PR #135690)

2025-04-14 Thread Shilei Tian via cfe-commits
https://github.com/shiltian created https://github.com/llvm/llvm-project/pull/135690 None >From c1fd5f3f3493b4e5b553438f023fde77d721199b Mon Sep 17 00:00:00 2001 From: Shilei Tian Date: Mon, 14 Apr 2025 18:46:45 -0400 Subject: [PATCH] [Clang] Forward two linker options to `lld` when ThinLTO is

[clang] [Clang] Forward two linker options to `lld` when ThinLTO is enabled for AMDGPU (PR #135690)

2025-04-14 Thread Shilei Tian via cfe-commits
https://github.com/shiltian edited https://github.com/llvm/llvm-project/pull/135690

[clang] [Clang] Forward two linker options to `lld` when ThinLTO is enabled for AMDGPU (PR #135690)

2025-04-14 Thread Shilei Tian via cfe-commits
shiltian wrote: * **#135690** https://app.graphite.dev/github/pr/llvm/llvm-project/135690 👈 https://app.graphite.dev/github/pr/llvm/llvm-project/135

[clang] [AMDGPU] Support the OpenCL generic addrspace feature by default (PR #137636)

2025-04-28 Thread Shilei Tian via cfe-commits
@@ -146,3 +146,8 @@ #pragma OPENCL EXTENSION cl_khr_subgroups: enable // expected-warning@-1{{unsupported OpenCL extension 'cl_khr_subgroups' - ignoring}} +#ifdef __opencl_c_generic_address_space +#error "Incorrect __opencl_c_generic_address_space define" shi

[clang] [flang] [llvm] [TargetVerifier][AMDGPU] Add TargetVerifier. (PR #123609)

2025-04-26 Thread Shilei Tian via cfe-commits
@@ -0,0 +1,144 @@ +//===-- TargetVerifier.cpp - LLVM IR Target Verifier *- C++ -*-===// + +/ Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions. +/ See https://llvm.org/LICENSE.txt for license information. +/ SPDX-License

[clang] [llvm] [mlir] [AMDGPU] Add a new amdgcn.load.to.lds intrinsic (PR #137425)

2025-04-26 Thread Shilei Tian via cfe-commits
@@ -0,0 +1,75 @@ +; NOTE: Assertions have been autogenerated by utils/update_llc_test_checks.py UTC_ARGS: --version 5 +; RUN: llc -global-isel=0 -mtriple=amdgcn -mcpu=gfx950 < %s | FileCheck -check-prefixes=GFX950,GFX950-SDAG %s +; RUN: llc -global-isel=1 -mtriple=amdgcn -mcpu=g

[clang] [llvm] [mlir] [AMDGPU] Add a new amdgcn.load.to.lds intrinsic (PR #137425)

2025-04-26 Thread Shilei Tian via cfe-commits
https://github.com/shiltian edited https://github.com/llvm/llvm-project/pull/137425

[clang] [llvm] [mlir] [AMDGPU] Add a new amdgcn.load.to.lds intrinsic (PR #137425)

2025-04-26 Thread Shilei Tian via cfe-commits
https://github.com/shiltian edited https://github.com/llvm/llvm-project/pull/137425

[clang] [llvm] [clang][amdgpu] Add builtins for raw/struct buffer lds load (PR #137678)

2025-04-28 Thread Shilei Tian via cfe-commits
@@ -0,0 +1,12 @@ +// RUN: %clang_cc1 -triple amdgcn-unknown-unknown -target-cpu tahiti -S -verify -o - %s +// RUN: %clang_cc1 -triple amdgcn-unknown-unknown -target-cpu bonaire -S -verify -o - %s +// RUN: %clang_cc1 -triple amdgcn-unknown-unknown -target-cpu carrizo -S -verify

[clang] [llvm] [clang][amdgpu] Add builtins for raw/struct buffer lds load (PR #137678)

2025-04-28 Thread Shilei Tian via cfe-commits
shiltian wrote: Isn't it similar to what @krzysz00 is doing in another PR to some extent? https://github.com/llvm/llvm-project/pull/137678

[clang] [llvm] [clang][amdgpu] Add builtins for raw/struct buffer lds load (PR #137678)

2025-04-28 Thread Shilei Tian via cfe-commits
@@ -0,0 +1,12 @@ +// RUN: %clang_cc1 -triple amdgcn-unknown-unknown -target-cpu tahiti -S -verify -o - %s +// RUN: %clang_cc1 -triple amdgcn-unknown-unknown -target-cpu bonaire -S -verify -o - %s +// RUN: %clang_cc1 -triple amdgcn-unknown-unknown -target-cpu carrizo -S -verify

[clang] [flang] [llvm] [TargetVerifier][AMDGPU] Add TargetVerifier. (PR #123609)

2025-04-29 Thread Shilei Tian via cfe-commits
@@ -0,0 +1,84 @@ +//===-- llvm/Target/TargetVerifier.h - LLVM IR Target Verifier ---*- C++ -*-===// shiltian wrote: What's the value of this interface class instead of just making those target verifier a module/function pass? https://github.com/llvm/llvm-proje

[clang] [flang] [llvm] [TargetVerifier][AMDGPU] Add TargetVerifier. (PR #123609)

2025-04-29 Thread Shilei Tian via cfe-commits
@@ -841,6 +857,11 @@ class PassBuilder { // Callbacks to parse `filter` parameter in register allocation passes SmallVector, 2> RegClassFilterParsingCallbacks; + // Verifier callbacks + SmallVector, 2> + VerifierCallbacks; + SmallVector, 2> + FnVerifierCa

[clang] [flang] [llvm] [TargetVerifier][AMDGPU] Add TargetVerifier. (PR #123609)

2025-04-29 Thread Shilei Tian via cfe-commits
@@ -0,0 +1,34 @@ +set(LLVM_LINK_COMPONENTS shiltian wrote: I don't know why we want a dedicated tool for this https://github.com/llvm/llvm-project/pull/123609

[clang] [flang] [llvm] [TargetVerifier][AMDGPU] Add TargetVerifier. (PR #123609)

2025-04-29 Thread Shilei Tian via cfe-commits
@@ -172,6 +172,13 @@ class PassBuilder { /// additional analyses. void registerLoopAnalyses(LoopAnalysisManager &LAM); + /// Registers all available verifier passes. + /// + /// This is an interface that can be used to populate a + /// \c ModuleAnalysisManager with all

[clang] [flang] [llvm] [TargetVerifier][AMDGPU] Add TargetVerifier. (PR #123609)

2025-04-29 Thread Shilei Tian via cfe-commits
@@ -0,0 +1,144 @@ +//===-- TargetVerifier.cpp - LLVM IR Target Verifier *- C++ -*-===// + +/ Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions. +/ See https://llvm.org/LICENSE.txt for license information. +/ SPDX-License

[clang] [llvm] [clang][amdgpu] Add builtins for raw/struct buffer lds load (PR #137678)

2025-04-29 Thread Shilei Tian via cfe-commits
@@ -0,0 +1,12 @@ +// RUN: %clang_cc1 -triple amdgcn-unknown-unknown -target-cpu tahiti -S -verify -o - %s +// RUN: %clang_cc1 -triple amdgcn-unknown-unknown -target-cpu bonaire -S -verify -o - %s +// RUN: %clang_cc1 -triple amdgcn-unknown-unknown -target-cpu carrizo -S -verify

[clang] [Clang][Driver] Enable internalization by default for AMDGPU (PR #138365)

2025-05-02 Thread Shilei Tian via cfe-commits
@@ -9284,6 +9284,12 @@ void LinkerWrapper::ConstructJob(Compilation &C, const JobAction &JA, CmdArgs.push_back(Args.MakeArgString( "--device-linker=" + TC->getTripleString() + "=" + Arg)); + // Enable internalization for AMDGPU. + if (TC->getTrip

[clang] [llvm] [mlir] [AMDGPU] Add a new amdgcn.load.to.lds intrinsic (PR #137425)

2025-05-02 Thread Shilei Tian via cfe-commits
@@ -2641,6 +2641,28 @@ def int_amdgcn_perm : // GFX9 Intrinsics //===--===// +/// This is a general-purpose intrinsic for all operations that take a pointer +/// a base location in LDS, and a data size and us

[clang] [Clang][Driver] Enable internalization by default for AMDGPU (PR #138365)

2025-05-02 Thread Shilei Tian via cfe-commits
https://github.com/shiltian closed https://github.com/llvm/llvm-project/pull/138365

[clang] [Clang][Driver] Enable internalization by default for AMDGPU (PR #138365)

2025-05-02 Thread Shilei Tian via cfe-commits
shiltian wrote: * **#138365** https://app.graphite.dev/github/pr/llvm/llvm-project/138365 👈 https://app.graphite.dev/github/pr/llvm/llvm-project/138

[clang] [Clang][Driver] Enable internalization by default for AMDGPU (PR #138365)

2025-05-02 Thread Shilei Tian via cfe-commits
https://github.com/shiltian created https://github.com/llvm/llvm-project/pull/138365 None >From acc89cf6a85a8fb758b528a4e2ae587fccd61ce5 Mon Sep 17 00:00:00 2001 From: Shilei Tian Date: Fri, 2 May 2025 19:41:11 -0400 Subject: [PATCH] [Clang][Driver] Enable internalization by default for AMDGPU

[clang] [Clang][Driver] Enable internalization by default for AMDGPU (PR #138365)

2025-05-02 Thread Shilei Tian via cfe-commits
https://github.com/shiltian updated https://github.com/llvm/llvm-project/pull/138365 >From c9a8e6e2d67d8e7d029e58402c17d8c9419cd028 Mon Sep 17 00:00:00 2001 From: Shilei Tian Date: Fri, 2 May 2025 19:41:11 -0400 Subject: [PATCH] [Clang][Driver] Enable internalization by default for AMDGPU ---

[clang] [Clang][Driver] Enable internalization by default for AMDGPU (PR #138365)

2025-05-03 Thread Shilei Tian via cfe-commits
shiltian wrote: I don't think OpenMP is more special than HIP here. Anything exposed to the host should not be internalized. In addition, OpenMP also relies heavily on internalization in OpenMPOpt. It is likely that this change exposes something bad downstream. The motivati

[clang] [Clang][Driver] Enable internalization by default for AMDGPU (PR #138365)

2025-05-03 Thread Shilei Tian via cfe-commits
shiltian wrote: > also seeing > > "PluginInterface" error: Failure to look up global address: Error in > hsa_executable_get_symbol_by_name(grid_points): > HSA_STATUS_ERROR_INVALID_SYMBOL_NAME: There is no symbol with the given name. > > omptarget error: Failed to load symbol grid_points > >

[clang] [Clang][Driver] Enable internalization by default for AMDGPU (PR #138365)

2025-05-03 Thread Shilei Tian via cfe-commits
shiltian wrote: I understand that you are trying to make it generic. That's why I didn't roll back to using builtin bitcode as we did before. However, there is one limitation that we can't really work around: we don't support ABI linking. This is not a new topic at all and wh

[clang] [llvm] [TargetVerifier][AMDGPU] Add TargetVerifier. (PR #123609)

2025-04-19 Thread Shilei Tian via cfe-commits
shiltian wrote: The target hook approach won't work here because `VerifierPass` is part of `LLVMCore`, so it can't depend on target-specific components and doing so would introduce a circular dependency. Instead, I'm thinking of an alternative: make the target-dependent verifier run after the
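The layering concern in this comment can be illustrated with a small standalone sketch. It is not the PR's code and does not reproduce the (truncated) alternative proposed above; `ToyModule`, `ToyVerifierRegistry`, and `registerToyAMDGPUChecks` are hypothetical names used only for illustration. The point is that a core component can hold opaque callbacks without naming any target, so the dependency runs from target code to core and never the reverse.

```cpp
#include <cassert>
#include <functional>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for an IR module; this is not an LLVM type.
struct ToyModule {
  std::string TargetTriple;
  bool UsesUnsupportedFeature;
};

// "Core" layer: knows nothing about any target, it only stores callbacks.
class ToyVerifierRegistry {
public:
  using Callback = std::function<bool(const ToyModule &)>;

  void registerCallback(Callback CB) { Callbacks.push_back(std::move(CB)); }

  // Returns true when every registered check accepts the module.
  bool verify(const ToyModule &M) const {
    for (const Callback &CB : Callbacks)
      if (!CB(M))
        return false;
    return true;
  }

private:
  std::vector<Callback> Callbacks;
};

// "Target" layer: depends on the core layer, never the other way around.
inline void registerToyAMDGPUChecks(ToyVerifierRegistry &Registry) {
  Registry.registerCallback([](const ToyModule &M) {
    // Modules for other targets are not this check's concern.
    if (M.TargetTriple.rfind("amdgcn", 0) != 0)
      return true;
    return !M.UsesUnsupportedFeature;
  });
}
```

With this shape, the core never includes a target header; whoever links the target library calls the registration function.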

[clang] [llvm] [TargetVerifier][AMDGPU] Add TargetVerifier. (PR #123609)

2025-04-21 Thread Shilei Tian via cfe-commits
@@ -0,0 +1,175 @@ +#include "llvm/Target/TargetVerify/AMDGPUTargetVerifier.h" + +#include "llvm/Analysis/UniformityAnalysis.h" +#include "llvm/Analysis/PostDominators.h" +#include "llvm/Support/Debug.h" +#include "llvm/IR/Dominators.h" +#include "llvm/IR/Function.h" +#include "llv

[clang] [llvm] [TargetVerifier][AMDGPU] Add TargetVerifier. (PR #123609)

2025-04-21 Thread Shilei Tian via cfe-commits
@@ -2040,6 +2043,8 @@ void AMDGPUCodeGenPassBuilder::addIRPasses(AddIRPass &addPass) const { // but EarlyCSE can do neither of them. if (isPassEnabled(EnableScalarIRPasses)) addEarlyCSEOrGVNPass(addPass); + + addPass(AMDGPUTargetVerifierPass()); shilt

[clang] [llvm] [TargetVerifier][AMDGPU] Add TargetVerifier. (PR #123609)

2025-04-21 Thread Shilei Tian via cfe-commits
@@ -0,0 +1,175 @@ +//===--- llvm-isel-fuzzer.cpp - Fuzzer for instruction selection --===// shiltian wrote: the file name doesn't match the actual file name https://github.com/llvm/llvm-project/pull/123609

[clang] [llvm] [TargetVerifier][AMDGPU] Add TargetVerifier. (PR #123609)

2025-04-21 Thread Shilei Tian via cfe-commits
@@ -1298,6 +1299,8 @@ void AMDGPUPassConfig::addIRPasses() { addPass(createLICMPass()); } + //addPass(AMDGPUTargetVerifierPass()); shiltian wrote: left over https://github.com/llvm/llvm-project/pull/123609

[clang] [llvm] [TargetVerifier][AMDGPU] Add TargetVerifier. (PR #123609)

2025-04-21 Thread Shilei Tian via cfe-commits
@@ -0,0 +1,175 @@ +#include "llvm/Target/TargetVerify/AMDGPUTargetVerifier.h" shiltian wrote: This file doesn't have LLVM copyright header https://github.com/llvm/llvm-project/pull/123609

[clang] [llvm] [TargetVerifier][AMDGPU] Add TargetVerifier. (PR #123609)

2025-04-21 Thread Shilei Tian via cfe-commits
@@ -0,0 +1,175 @@ +//===--- llvm-isel-fuzzer.cpp - Fuzzer for instruction selection --===// shiltian wrote: Also, I wonder why we need an extra binary for this? https://github.com/llvm/llvm-project/pull/123609

[clang] [llvm] [LLVM][Triple][NFCI] Add function to test for offloading triples (PR #126956)

2025-02-17 Thread Shilei Tian via cfe-commits
https://github.com/shiltian approved this pull request. LGTM but I'd like to get a second stamp on it. https://github.com/llvm/llvm-project/pull/126956

[clang] [OpenMP] Silently accept `neon_vector_type` for the offloading device (PR #127439)

2025-02-17 Thread Shilei Tian via cfe-commits
@@ -0,0 +1,6 @@ +// RUN: %clang -fopenmp --offload-arch=sm_90 -nocudalib -target aarch64-unknown-linux-gnu -c -Xclang -verify %s shiltian wrote: Use `%clang_cc1` here instead of the driver. https://github.com/llvm/llvm-project/pull/127439

[clang] [libc] [Clang] Add handlers for 'match_any' and 'match_all' to `gpuintrin.h` (PR #127504)

2025-02-17 Thread Shilei Tian via cfe-commits
@@ -92,6 +92,14 @@ LIBC_INLINE uint32_t shuffle(uint64_t lane_mask, uint32_t idx, uint32_t x, return __gpu_shuffle_idx_u32(lane_mask, idx, x, width); } +LIBC_INLINE uint64_t match_any(uint64_t lane_mask, uint32_t x) { + return __gpu_match_any_u32(lane_mask, x); +} + +LIBC_
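For readers unfamiliar with the operation, the following host-side model sketches what a `match_any` primitive computes, assuming semantics analogous to CUDA's `__match_any_sync`: each lane receives the bitmask of participating lanes that hold the same value. `toyMatchAny` is a hypothetical name and a plain loop, not the libc wrapper or the `__gpu_match_any_u32` implementation.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Host-side model only: Values[i] is the value held by lane i.
// Returns the bitmask of lanes in LaneMask whose value equals the value
// held by lane Self. Lanes outside LaneMask are ignored.
inline uint64_t toyMatchAny(uint64_t LaneMask,
                            const std::vector<uint32_t> &Values,
                            unsigned Self) {
  uint64_t Result = 0;
  for (std::size_t Lane = 0; Lane < Values.size() && Lane < 64; ++Lane) {
    bool Active = ((LaneMask >> Lane) & 1u) != 0;
    if (Active && Values[Lane] == Values[Self])
      Result |= uint64_t{1} << Lane;
  }
  return Result;
}
```

On real hardware every lane evaluates this simultaneously; the model takes the calling lane as the explicit `Self` argument.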

[clang] [llvm] [LLVM][Triple][NFCI] Add function to test for offloading triples (PR #126956)

2025-02-17 Thread Shilei Tian via cfe-commits
https://github.com/shiltian edited https://github.com/llvm/llvm-project/pull/126956

[flang] [libc] [libclc] [llvm] [AMDGPU] Replace gfx940 and gfx941 with gfx942 in offload and libclc (PR #125826)

2025-02-18 Thread Shilei Tian via cfe-commits
https://github.com/shiltian approved this pull request. https://github.com/llvm/llvm-project/pull/125826

[clang] [libc] [Clang] Add handlers for 'match_any' and 'match_all' to `gpuintrin.h` (PR #127504)

2025-02-17 Thread Shilei Tian via cfe-commits
https://github.com/shiltian approved this pull request. https://github.com/llvm/llvm-project/pull/127504

[clang] [Offload] Always consider `flto` on for AMDGPU (PR #129118)

2025-02-27 Thread Shilei Tian via cfe-commits
shiltian wrote: We do have some framework teams that are still using non-LTO (or non-gpu-rdc) build. https://github.com/llvm/llvm-project/pull/129118

[clang] [llvm] [Clang][AMDGPU] Use 32-bit index for SWMMAC builtins (PR #129101)

2025-02-27 Thread Shilei Tian via cfe-commits
https://github.com/shiltian closed https://github.com/llvm/llvm-project/pull/129101

[clang] [Clang][OpenCL][AMDGPU] Allow a kernel to call another kernel (PR #115821)

2025-03-10 Thread Shilei Tian via cfe-commits
@@ -1582,6 +1582,26 @@ void CodeGenFunction::GenerateCode(GlobalDecl GD, llvm::Function *Fn, // Implicit copy-assignment gets the same special treatment as implicit // copy-constructors. emitImplicitAssignmentOperatorBody(Args); + } else if (FD->hasAttr() && +

[clang] [clang][AMDGPU] Enable module splitting by default (PR #128509)

2025-03-11 Thread Shilei Tian via cfe-commits
shiltian wrote: I'm okay with this change, but did you run a PSDB or even a full testing cycle? https://github.com/llvm/llvm-project/pull/128509

[clang] [llvm] [Clang][AMDGPU] Use 32-bit index for SWMMAC builtins (PR #129101)

2025-02-27 Thread Shilei Tian via cfe-commits
https://github.com/shiltian edited https://github.com/llvm/llvm-project/pull/129101

[clang] [llvm] [Clang][AMDGPU] Use 32-bit index for SWMMAC builtins (PR #129101)

2025-02-27 Thread Shilei Tian via cfe-commits
https://github.com/shiltian updated https://github.com/llvm/llvm-project/pull/129101 >From daec69f37a9b10f4bcf258f3a6f9e45cee72b64d Mon Sep 17 00:00:00 2001 From: Shilei Tian Date: Thu, 27 Feb 2025 14:00:05 -0500 Subject: [PATCH] [AMDGPU] Use 32-bit index for SWMMAC builtins Currently, the ind

[clang] [llvm] [AMDGPU] Use 32-bit index for SWMMAC builtins (PR #129101)

2025-02-27 Thread Shilei Tian via cfe-commits
https://github.com/shiltian created https://github.com/llvm/llvm-project/pull/129101 Currently, the index of SWMMAC builtins is of type `short`, likely based on the assumption that K can only be up to 32, meaning there are only 16 non-zero elements. However, this is not future-proof. This patch
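The "not future-proof" concern comes down to width: any index value that needs more than 16 bits cannot round-trip through a 16-bit type. The sketch below shows only that generic narrowing effect; `fitsIn16Bits` is a hypothetical helper and says nothing about how SWMMAC indices are actually encoded.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: reports whether an index value survives a round trip
// through a 16-bit unsigned type. Unsigned narrowing keeps the low 16 bits.
inline bool fitsIn16Bits(uint32_t IndexValue) {
  return static_cast<uint32_t>(static_cast<uint16_t>(IndexValue)) == IndexValue;
}
```

A 32-bit parameter type removes this limit without changing how values that already fit in 16 bits are represented.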

[clang] [llvm] [AMDGPU] Use 32-bit index for SWMMAC builtins (PR #129101)

2025-02-27 Thread Shilei Tian via cfe-commits
shiltian wrote: * **#129101** https://app.graphite.dev/github/pr/llvm/llvm-project/129101 👈 https://app.graphite.dev/github/pr/llvm/llvm-project/129

[clang] [libc] [llvm] Reapply "[AMDGPU] Use COV6 by default (#118515)" (PR #130963)

2025-03-12 Thread Shilei Tian via cfe-commits
shiltian wrote: > > CC @jdoerfert @ye-luo Once this is merged, ROCm 6.3 will be needed to run > > any program compiled for AMDGPU. > > Unless you pass `-mcode-object-version=5` right? Yes. https://github.com/llvm/llvm-project/pull/130963

[clang] [Clang][OpenCL] Fix Missing `-fdeclare-opencl-builtins` When Using `--save-temps` (PR #131017)

2025-03-12 Thread Shilei Tian via cfe-commits
shiltian wrote: * **#131017** https://app.graphite.dev/github/pr/llvm/llvm-project/131017 👈 https://app.graphite.dev/github/pr/llvm/llvm-project/131

[clang] [Clang][OpenCL] Fix Missing `-fdeclare-opencl-builtins` When Using `--save-temps` (PR #131017)

2025-03-12 Thread Shilei Tian via cfe-commits
https://github.com/shiltian created https://github.com/llvm/llvm-project/pull/131017 When compiling an OpenCL program directly with `clang` using `--save-temps`, an error may occur if the program contains OpenCL builtins: ``` test.cl:3:21: error: use of undeclared identifier 'get_global_id'

[clang] [libc] [llvm] Reapply "[AMDGPU] Use COV6 by default (#118515)" (PR #130963)

2025-03-12 Thread Shilei Tian via cfe-commits
shiltian wrote: * **#130963** https://app.graphite.dev/github/pr/llvm/llvm-project/130963 👈 https://app.graphite.dev/github/pr/llvm/llvm-project/130

[clang] [libc] [llvm] Reapply "[AMDGPU] Use COV6 by default (#118515)" (PR #130963)

2025-03-12 Thread Shilei Tian via cfe-commits
https://github.com/shiltian created https://github.com/llvm/llvm-project/pull/130963

This reverts commit 68bcba6d7a1cc18996c0bcb7c62267c62d2040d0.

From 0f831a4a78fefcdf0ac973173397325a1f53d393 Mon Sep 17 00:00:00 2001
From: Shilei Tian
Date: Wed, 12 Mar 2025 09:39:45 -0400
Subject: [PATCH] Re

[clang] [libc] [llvm] Reapply "[AMDGPU] Use COV6 by default (#118515)" (PR #130963)

2025-03-12 Thread Shilei Tian via cfe-commits
shiltian wrote: We will need to wait for the AMD bots to be ready. https://github.com/llvm/llvm-project/pull/130963

[clang] [libc] [llvm] Reapply "[AMDGPU] Use COV6 by default (#118515)" (PR #130963)

2025-03-12 Thread Shilei Tian via cfe-commits
shiltian wrote: CC @jdoerfert @ye-luo Once this is merged, ROCm 6.3 will be needed to run any program compiled for AMDGPU. https://github.com/llvm/llvm-project/pull/130963

[clang] [libc] [llvm] Reapply "[AMDGPU] Use COV6 by default (#118515)" (PR #130963)

2025-03-12 Thread Shilei Tian via cfe-commits
shiltian wrote:
> The OpenMP runtime doesn't know how to handle `generic` ISAs right?
I think people are working on it? https://github.com/llvm/llvm-project/pull/130963

[clang] [Clang][OpenCL] Fix Missing `-fdeclare-opencl-builtins` When Using `--save-temps` (PR #131017)

2025-03-12 Thread Shilei Tian via cfe-commits
shiltian wrote: Yeah, these are implemented in a bitcode file, so the front end needs to be able to recognize them instead of treating them as unknown symbols. https://github.com/llvm/llvm-project/pull/131017

[clang] [Clang][OpenCL] Fix Missing `-fdeclare-opencl-builtins` When Using `--save-temps` (PR #131017)

2025-03-12 Thread Shilei Tian via cfe-commits
https://github.com/shiltian closed https://github.com/llvm/llvm-project/pull/131017

[clang] [llvm] [LLVM][Triple][NFCI] Add function to test for GPU offloading triples (PR #126956)

2025-02-12 Thread Shilei Tian via cfe-commits
@@ -1338,9 +1338,7 @@ struct InformationCache {
   bool stackIsAccessibleByOtherThreads() { return !targetIsGPU(); }
   /// Return true if the target is a GPU.
-  bool targetIsGPU() {
-    return TargetTriple.isAMDGPU() || TargetTriple.isNVPTX();
-  }
+  bool targetIsGPU() { ret

[clang] [llvm] [LLVM][Triple][NFCI] Add function to test for GPU offloading triples (PR #126956)

2025-02-12 Thread Shilei Tian via cfe-commits
@@ -2624,9 +2624,8 @@ void CGOpenMPRuntime::emitDistributeStaticInit( emitUpdateLocation(CGF, Loc, OMP_IDENT_WORK_DISTRIBUTE); llvm::Value *ThreadId = getThreadID(CGF, Loc); llvm::FunctionCallee StaticInitFunction; - bool isGPUDistribute = - CGM.getLangOpts().Op

[clang] [llvm] [LLVM][Triple][NFCI] Add function to test for GPU offloading triples (PR #126956)

2025-02-12 Thread Shilei Tian via cfe-commits
@@ -1109,6 +1109,11 @@ class Triple {
            Env == llvm::Triple::EABIHF;
   }
+  /// Tests if the target represents a GPU which can be offloaded to.
+  bool isOffloadingTargetGPU() const {

shiltian wrote: Is it better to use `isOffloadingTarget`? https:/
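The snippet above cuts off before the body of `isOffloadingTargetGPU()`, so the following is only a minimal sketch of the idea, not the code from the PR. It assumes the new predicate simply combines the two checks that the removed `targetIsGPU()` body performed (`isAMDGPU() || isNVPTX()`). `MiniTriple` is a hypothetical stand-in for `llvm::Triple`, reduced to an architecture string so the example is self-contained.

```cpp
#include <string>

// Hypothetical stand-in for llvm::Triple; only the architecture name is kept.
struct MiniTriple {
  std::string Arch;

  bool isAMDGPU() const { return Arch == "amdgcn" || Arch == "r600"; }
  bool isNVPTX() const { return Arch == "nvptx" || Arch == "nvptx64"; }

  // Assumption: the real function may cover more targets than these two.
  bool isOffloadingTargetGPU() const { return isAMDGPU() || isNVPTX(); }
};

// Callers that used to spell out both checks can delegate to one predicate.
bool targetIsGPU(const MiniTriple &T) { return T.isOffloadingTargetGPU(); }
```

Centralizing the check in the triple type means a newly supported offloading target only has to be added in one place instead of at every call site.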

[clang] [llvm] [AMDGPU] Add builtins for wave reduction intrinsics (PR #127013)

2025-02-21 Thread Shilei Tian via cfe-commits
@@ -277,16 +277,31 @@ def : GCNPat <(vt (int_amdgcn_set_inactive vt:$src, vt:$inactive)), def : GCNPat<(i32 (int_amdgcn_set_inactive_chain_arg i32:$src, i32:$inactive)), (V_SET_INACTIVE_B32 0, VGPR_32:$src, 0, VGPR_32:$inactive, (IMPLICIT_DEF))>; -let usesCustomInserter

[clang] [libc] [Clang] Fix cross-lane scan when given divergent lanes (PR #127703)

2025-02-19 Thread Shilei Tian via cfe-commits
@@ -188,6 +186,32 @@ __DO_LANE_SCAN(float, uint32_t, f32);    // float __gpu_lane_scan_f32(m, x)
 __DO_LANE_SCAN(double, uint64_t, f64); // double __gpu_lane_scan_f64(m, x)
 #undef __DO_LANE_SCAN
+// Gets the sum of all lanes inside the warp or wavefront.
shi
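To make the semantics under discussion concrete, here is a CPU reference model of a masked lane scan and lane sum. It is a sketch only and not the `gpuintrin` implementation: it assumes that lanes whose bit is clear in the mask `m` contribute nothing and receive 0, and the names `laneScanSum` and `laneSum` are hypothetical.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Inclusive prefix sum over the active lanes only. Bit i of `mask` marks
// lane i as active; inactive lanes are skipped and keep the value 0.
std::vector<uint32_t> laneScanSum(uint64_t mask,
                                  const std::vector<uint32_t> &x) {
  std::vector<uint32_t> out(x.size(), 0);
  uint32_t running = 0;
  for (std::size_t lane = 0; lane < x.size() && lane < 64; ++lane) {
    if (((mask >> lane) & 1u) == 0)
      continue; // divergent (inactive) lane
    running += x[lane];
    out[lane] = running;
  }
  return out;
}

// Sum of all active lanes, i.e. the scan result of the last active lane.
uint32_t laneSum(uint64_t mask, const std::vector<uint32_t> &x) {
  uint32_t total = 0;
  for (std::size_t lane = 0; lane < x.size() && lane < 64; ++lane)
    if ((mask >> lane) & 1u)
      total += x[lane];
  return total;
}
```

A model like this can serve as an oracle when testing a device implementation with masks that have gaps, which is the divergent case the PR title refers to.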

[clang] [clang][AMDGPU] Enable module splitting by default (PR #128509)

2025-02-25 Thread Shilei Tian via cfe-commits
@@ -7417,7 +7419,7 @@ def fuse_register_sized_bitfield_access: Flag<["-"], "fuse-register-sized-bitfie def relaxed_aliasing : Flag<["-"], "relaxed-aliasing">, HelpText<"Turn off Type Based Alias Analysis">, MarshallingInfoFlag>; -defm pointer_tbaa: BoolOption<"", "pointer-

[clang] [clang][AMDGPU] Enable module splitting by default (PR #128509)

2025-02-25 Thread Shilei Tian via cfe-commits
@@ -708,6 +712,34 @@ void amdgpu::getAMDGPUTargetFeatures(const Driver &D, options::OPT_m_amdgpu_Features_Group); } +static unsigned GetFullLTOPartitions(const Driver &D, const ArgList &Args) { + const Arg *A = Args.getLastArg(options::OPT_flto_par

[clang] [clang][AMDGPU] Enable module splitting by default (PR #128509)

2025-02-25 Thread Shilei Tian via cfe-commits
@@ -708,6 +712,34 @@ void amdgpu::getAMDGPUTargetFeatures(const Driver &D, options::OPT_m_amdgpu_Features_Group); } +static unsigned GetFullLTOPartitions(const Driver &D, const ArgList &Args) { shiltian wrote: ```suggestion static

[clang] [clang][AMDGPU] Enable module splitting by default (PR #128509)

2025-02-25 Thread Shilei Tian via cfe-commits
@@ -708,6 +712,34 @@ void amdgpu::getAMDGPUTargetFeatures(const Driver &D, options::OPT_m_amdgpu_Features_Group); } +static unsigned GetFullLTOPartitions(const Driver &D, const ArgList &Args) { + const Arg *A = Args.getLastArg(options::OPT_flto_par

[clang] [llvm] [clang][IR] Overload @llvm.thread.pointer to support non-AS0 targets (PR #132489)

2025-03-22 Thread Shilei Tian via cfe-commits
shiltian wrote:
> (assuming this intrinsic is supported there)
The intrinsic is at least not supported by AMDGPU. :-) https://github.com/llvm/llvm-project/pull/132489

[clang] [llvm] [Clang][AMDGPU] Expose buffer load lds as a clang builtin (PR #132048)

2025-03-20 Thread Shilei Tian via cfe-commits
shiltian wrote:
> I've also seen that gfx11 seems to have some kind of BUFFER_LOAD_LDS_(SIZE) instruction (different from the BUFFER_LOAD_(SIZE)_LDS instructions associated with th

[clang] [NFC][clang] Split clang/lib/CodeGen/CGBuiltin.cpp into target-specific files (PR #132252)

2025-03-20 Thread Shilei Tian via cfe-commits
https://github.com/shiltian commented: I'm super happy to see this change. The AMDGPU part looks good to me! Thanks! https://github.com/llvm/llvm-project/pull/132252
