https://github.com/arsenm edited
https://github.com/llvm/llvm-project/pull/157633
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
@@ -66,13 +65,11 @@ bool __attribute__((noinline))
__clc_runtime_has_hw_fma32(void);
#define LOG_MAGIC_NUM_SP32 (1 + NUMEXPBITS_SP32 - EXPBIAS_SP32)
_CLC_OVERLOAD _CLC_INLINE float __clc_flush_denormal_if_not_supported(float x)
{
- int ix = __clc_as_int(x);
- if (!__clc_fp
https://github.com/arsenm ready_for_review
https://github.com/llvm/llvm-project/pull/157591
@@ -3680,6 +3681,7 @@ bool SelectionDAGLegalize::ExpandNode(SDNode *Node) {
Results.push_back(Tmp1);
break;
}
+ case ISD::STACKADDRESS:
case ISD::STACKSAVE:
// Expand to CopyFromReg if the target set
// StackPointerRegisterToSaveRestore.
https://github.com/arsenm closed
https://github.com/llvm/llvm-project/pull/157591
https://github.com/arsenm closed
https://github.com/llvm/llvm-project/pull/157321
https://github.com/arsenm updated
https://github.com/llvm/llvm-project/pull/157321
>From 5f2205d454e38e63ab6d9ed2a41ff8d8b674ec6b Mon Sep 17 00:00:00 2001
From: Matt Arsenault
Date: Sun, 7 Sep 2025 09:03:22 +0900
Subject: [PATCH 1/2] MC: Add Triple overloads for more MC constructors
Avoids mor
https://github.com/arsenm auto_merge_enabled
https://github.com/llvm/llvm-project/pull/157321
https://github.com/arsenm updated
https://github.com/llvm/llvm-project/pull/157321
>From 5f2205d454e38e63ab6d9ed2a41ff8d8b674ec6b Mon Sep 17 00:00:00 2001
From: Matt Arsenault
Date: Sun, 7 Sep 2025 09:03:22 +0900
Subject: [PATCH] MC: Add Triple overloads for more MC constructors
Avoids more Tr
arsenm wrote:
> Perhaps the TargetParser directory is more suitable for them. Not sure what
> they are used for.
It's closer but not quite right. I think there probably should be some kind of
ABI-information for this sort of stuff. TargetParser is more specific to
frontend-backend interaction
https://github.com/arsenm created
https://github.com/llvm/llvm-project/pull/157321
Avoids more Triple->string->Triple round trip. This
is a continuation of f137c3d592e96330e450a8fd63ef7e8877fc1908
>From 233037d81eee84cc6aafd0708758f898e6b96593 Mon Sep 17 00:00:00 2001
From: Matt Arsenault
Date
https://github.com/arsenm ready_for_review
https://github.com/llvm/llvm-project/pull/157321
https://github.com/arsenm approved this pull request.
https://github.com/llvm/llvm-project/pull/157055
https://github.com/arsenm commented:
At some point we should split Support into "ReallySupport" and
"RandomStuffThatIRAndCodeGenUse"
https://github.com/llvm/llvm-project/pull/157057
@@ -1557,6 +1557,23 @@ def HIPManaged : InheritableAttr {
let Documentation = [HIPManagedAttrDocs];
}
+def CUDAClusterDims : InheritableAttr {
+ let Spellings = [GNU<"cluster_dims">, Declspec<"__cluster_dims__">];
+ let Args = [ExprArgument<"X">, ExprArgument<"Y", 1>, Expr
https://github.com/arsenm commented:
https://godbolt.org/z/Y6vYeYvaW
This shows my main concern. We should not be skipping the mandatory passes on
this path. The SPIRV consumer should not be taking on the responsibility of
handling the IR-lowered-in-frontend features (mainly always-inline and
https://github.com/arsenm approved this pull request.
https://github.com/llvm/llvm-project/pull/153883
arsenm wrote:
> Not sure what loop(loop-idiom-vectorize) or globaldce are doing there at -O0,
> those seem like bugs
The loop-idiom-recognize [appears to be an aarch64 specific
bug](https://github.com/llvm/llvm-project/issues/156787). I would also consider
the globaldce to be a bug, but I ass
@@ -56,10 +56,19 @@ static bool printOp(const DWARFExpression::Operation *Op,
raw_ostream &OS,
assert(!Name.empty() && "DW_OP has no name!");
OS << Name;
+ std::optional SubOpcode = Op->getSubCode();
+ if (SubOpcode) {
+StringRef SubName = SubOperationEncodingString
https://github.com/arsenm approved this pull request.
https://github.com/llvm/llvm-project/pull/156778
@@ -70,10 +79,8 @@ static bool printOp(const DWARFExpression::Operation *Op,
raw_ostream &OS,
unsigned Signed = Size & DWARFExpression::Operation::SignBit;
if (Size == DWARFExpression::Operation::SizeSubOpLEB) {
- StringRef SubName =
- SubOperationEncodi
@@ -6776,6 +6776,28 @@ void Verifier::visitIntrinsicCall(Intrinsic::ID ID,
CallBase &Call) {
"invalid vector type for format", &Call, Src1,
Call.getArgOperand(2));
break;
}
+ case Intrinsic::amdgcn_cooperative_atomic_load_32x4B:
+ case Intrinsic::amdgcn_coop
https://github.com/arsenm approved this pull request.
https://github.com/llvm/llvm-project/pull/152275
https://github.com/arsenm approved this pull request.
https://github.com/llvm/llvm-project/pull/155740
arsenm wrote:
flang is also adding it
https://github.com/llvm/llvm-project/pull/155740
@@ -961,3 +961,51 @@ static_assert(fmaDouble1[3] == 26.0);
constexpr float fmaArray[] = {2.0f, 2.0f, 2.0f, 2.0f};
constexpr float fmaResult = __builtin_elementwise_fma(fmaArray[1],
fmaArray[2], fmaArray[3]);
static_assert(fmaResult == 6.0f, "");
+
+static_assert(__builtin_elem
https://github.com/arsenm approved this pull request.
https://github.com/llvm/llvm-project/pull/154681
@@ -0,0 +1,731 @@
+// NOTE: Assertions have been autogenerated by utils/update_cc_test_checks.py
UTC_ARGS: --version 5
+// RUN: %clang_cc1 -triple amdgcn-- -target-cpu gfx1100 %s -emit-llvm -o - |
FileCheck %s
+
+typedef int int8 __attribute__((ext_vector_type(8)));
+typedef flo
arsenm wrote:
> Dropping in after the fact, is there a reason we called this
> `__builtin_elementwise_ctlz` instead of `__builtin_elementwise_clzg`? The
> builtin is just `clzg` done on each element so the name is confusing me.
It matches the llvm intrinsic name, and the second argument is a d
@@ -774,6 +774,18 @@ static bool isWave32Capable(StringRef GPU, const Triple
&T) {
return IsWave32Capable;
}
+static bool isWave64Capable(StringRef GPU, const Triple &T) {
+ if (T.isAMDGCN()) {
arsenm wrote:
Yes
https://github.com/llvm/llvm-project/pull
arsenm wrote:
> > I'd rather just fix clang. This is a workaround that shouldn't be
> > necessary, and seems to spread the knowledge of the dodgy
> > !__opencl_c_generic_address_space address space handling case to a new place
>
> By fixing clang you mean the cast `(void [[clang::address_space
https://github.com/arsenm edited
https://github.com/llvm/llvm-project/pull/140210
@@ -0,0 +1,731 @@
+// NOTE: Assertions have been autogenerated by utils/update_cc_test_checks.py
UTC_ARGS: --version 5
+// RUN: %clang_cc1 -triple amdgcn-- -target-cpu gfx1100 %s -emit-llvm -o - |
FileCheck %s
+
+typedef int int8 __attribute__((ext_vector_type(8)));
+typedef flo
https://github.com/arsenm commented:
The main issue now is I think this should start out using an opaque type,
similar to the buffer intrinsics, for the image descriptor instead of the
integer vector
https://github.com/llvm/llvm-project/pull/140210
@@ -160,6 +163,27 @@ static Value *EmitAMDGCNBallotForExec(CodeGenFunction
&CGF, const CallExpr *E,
return Call;
}
+template
arsenm wrote:
This doesn't need to be a template function. You can just check E->getNumArgs
for the loop bounds, and use the know
@@ -112,11 +112,12 @@ bool SemaAMDGPU::CheckAMDGCNBuiltinFunctionCall(unsigned
BuiltinID,
case AMDGPU::BI__builtin_amdgcn_image_load_mip_3d_v4f16_i32:
case AMDGPU::BI__builtin_amdgcn_image_load_mip_cube_v4f32_i32:
case AMDGPU::BI__builtin_amdgcn_image_load_mip_cube_v4f16
https://github.com/arsenm approved this pull request.
I think the only issue is documentation phrasing, but the existing builtins
already have internally inconsistent phrasing, so that can be addressed later
https://github.com/llvm/llvm-project/pull/131995
https://github.com/arsenm approved this pull request.
https://github.com/llvm/llvm-project/pull/152919
https://github.com/arsenm approved this pull request.
https://github.com/llvm/llvm-project/pull/154474
@@ -11658,6 +11658,29 @@ bool VectorExprEvaluator::VisitCallExpr(const CallExpr
*E) {
return Success(APValue(ResultElements.data(), ResultElements.size()), E);
}
+ case Builtin::BI__builtin_elementwise_fma: {
+APValue SourceX, SourceY, SourceZ;
+if (!EvaluateAs
@@ -774,6 +774,18 @@ static bool isWave32Capable(StringRef GPU, const Triple
&T) {
return IsWave32Capable;
}
+static bool isWave64Capable(StringRef GPU, const Triple &T) {
+ if (T.isAMDGCN()) {
arsenm wrote:
Everything should be feature driven, not random
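A minimal sketch of what a feature-driven check could look like, as opposed to matching on GPU names. `FeatureMap` and `hasFeature` are hypothetical stand-ins for illustration, and the feature strings used with them are made up; this is not the actual TargetParser code.

```
#include <map>
#include <string>

// Hypothetical stand-in for a per-target feature map (feature name -> enabled).
using FeatureMap = std::map<std::string, bool>;

// Hypothetical helper: a capability query answered purely from the feature map.
// A feature that is absent from the map is treated as disabled.
static bool hasFeature(const FeatureMap &Features, const std::string &Name) {
  auto It = Features.find(Name);
  return It != Features.end() && It->second;
}
```

With this shape, a capability question becomes a lookup of one named feature rather than a list of processor names that has to be kept up to date by hand.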
arsenm wrote:
I'd rather just fix clang. This is a workaround that shouldn't be necessary,
and seems to spread the knowledge of the dodgy
!__opencl_c_generic_address_space address space handling case to a new place
https://github.com/llvm/llvm-project/pull/152314
https://github.com/arsenm approved this pull request.
https://github.com/llvm/llvm-project/pull/153785
@@ -1011,6 +1018,7 @@ LLVM_ABI StringRef IndexString(unsigned Idx);
LLVM_ABI StringRef FormatString(DwarfFormat Format);
LLVM_ABI StringRef FormatString(bool IsDWARF64);
LLVM_ABI StringRef RLEString(unsigned RLE);
+LLVM_ABI StringRef AddressSpaceString(unsigned AS, llvm::Triple
@@ -120,6 +120,46 @@ inline bool isConstantAddressSpace(unsigned AS) {
return false;
}
}
+
+namespace DWARFAS {
+enum : unsigned {
+ GLOBAL = 0,
+ GENERIC = 1,
+ REGION = 2,
+ LOCAL = 3,
+ PRIVATE_LANE = 5,
+ PRIVATE_WAVE = 6,
+ DEFAULT = GLOBAL,
+};
+} // namespac
https://github.com/arsenm approved this pull request.
https://github.com/llvm/llvm-project/pull/153785
https://github.com/arsenm approved this pull request.
https://github.com/llvm/llvm-project/pull/153784
@@ -774,6 +774,18 @@ static bool isWave32Capable(StringRef GPU, const Triple
&T) {
return IsWave32Capable;
}
+static bool isWave64Capable(StringRef GPU, const Triple &T) {
+ if (T.isAMDGCN()) {
arsenm wrote:
This should be added to the feature flags in th
@@ -304,7 +304,7 @@ set_source_files_properties(
${CMAKE_CURRENT_SOURCE_DIR}/opencl/lib/generic/math/native_sin.cl
${CMAKE_CURRENT_SOURCE_DIR}/opencl/lib/generic/math/native_sqrt.cl
${CMAKE_CURRENT_SOURCE_DIR}/opencl/lib/generic/math/native_tan.cl
- PROPERTIES COMPILE_OP
arsenm wrote:
I think fp contract should be globally enabled in the build, and selectively
disabled in the handful of places that it is problematic (namely specific
blocks in expF, sinbF, and trig reductions)
https://github.com/llvm/llvm-project/pull/153137
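As a rough illustration of that approach, assuming Clang's `#pragma clang fp contract`: contraction stays at the build default and is switched off only in the block that needs separately rounded operations. The function names are hypothetical and not taken from libclc.

```
// Hypothetical helper: default path. When contraction is enabled for the
// build, the compiler may fuse a * b + c into a single FMA.
static float mulAddDefault(float a, float b, float c) {
  return a * b + c;
}

// Hypothetical helper: contraction is disabled for this block only, so the
// multiply and the add are rounded separately under Clang.
static float mulAddUnfused(float a, float b, float c) {
#ifdef __clang__
#pragma clang fp contract(off)
#endif
  return a * b + c;
}
```

The pragma has to appear at the start of the compound statement (or at file scope), which keeps the opt-out local to the code that is sensitive to fusion.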
@@ -257,11 +257,13 @@ namespace llvm {
assert(Other.segmentSet == nullptr &&
"Copying of LiveRanges with active SegmentSets is not supported");
// Duplicate valnos.
+ auto FirstNewVNIIdx = valnos.size();
arsenm wrote:
```suggestio
https://github.com/arsenm commented:
Probably do need to drop to a unit test for this
https://github.com/llvm/llvm-project/pull/148790
https://github.com/arsenm edited
https://github.com/llvm/llvm-project/pull/148790
@@ -691,9 +696,12 @@ Error runNVLink(ArrayRef Files, const ArgList
&Args) {
if (Args.hasArg(OPT_lto_emit_asm) || Args.hasArg(OPT_lto_emit_llvm))
return Error::success();
- std::string CudaPath = Args.getLastArgValue(OPT_cuda_path_EQ).str();
- Expected NVLinkPath =
-
@@ -141,6 +141,16 @@ static void diagnoseNonConstexprBuiltin(InterpState &S,
CodePtr OpPC,
S.CCEDiag(Loc, diag::note_invalid_subexpr_in_const_expr);
}
+// Same implementation as Compiler::getRoundingMode.
+static llvm::RoundingMode getRoundingMode(const InterpState &S, co
@@ -2725,6 +2727,22 @@ void OMPClausePrinter::VisitOMPXDynCGroupMemClause(
OS << ")";
}
+void OMPClausePrinter::VisitOMPDynGroupprivateClause(OMPDynGroupprivateClause
*Node) {
+ OS << "dyn_groupprivate(";
+ if (Node->getFirstDynGroupprivateModifier() !=
OMPC_SCHEDULE_MOD
@@ -0,0 +1,21 @@
+//===--===//
+//
+// Part of the LLVM Project, under the Apache License v2.0 with LLVM
Exceptions.
+// See https://llvm.org/LICENSE.txt for license information.
+// SPDX-License-Identifier: Apac
https://github.com/arsenm approved this pull request.
https://github.com/llvm/llvm-project/pull/152440
https://github.com/arsenm commented:
ocml rsqrt should be deleted, this is implementable by
```
float rsqrt(float x) {
#pragma clang fp contract(fast)
return 1.0f / __builtin_elementwise_sqrt(x);
}
```
https://github.com/llvm/llvm-project/pull/152436
@@ -0,0 +1,21 @@
+//===--===//
+//
+// Part of the LLVM Project, under the Apache License v2.0 with LLVM
Exceptions.
+// See https://llvm.org/LICENSE.txt for license information.
+// SPDX-License-Identifier: Apac
https://github.com/arsenm approved this pull request.
https://github.com/llvm/llvm-project/pull/152259
https://github.com/arsenm approved this pull request.
https://github.com/llvm/llvm-project/pull/149216
arsenm wrote:
I'd view assertively stating it's undefined behavior as a move away from the
status quo. For `__builtin_clz`, GCC states `the result is undefined`. Clang
appears to not have separate documentation for that exact spelling. It does
have documentation for `__builtin_clzg`, which has
https://github.com/arsenm approved this pull request.
https://github.com/llvm/llvm-project/pull/151446
@@ -0,0 +1,37 @@
+//===--===//
+//
+// Part of the LLVM Project, under the Apache License v2.0 with LLVM
Exceptions.
+// See https://llvm.org/LICENSE.txt for license information.
+// SPDX-License-Identifier: Apac
https://github.com/arsenm approved this pull request.
https://github.com/llvm/llvm-project/pull/152088
@@ -165,8 +165,9 @@ int printGPUsByHIP() {
llvm::sys::DynamicLibrary::getPermanentLibrary(DynamicHIPPath.c_str(),
&ErrMsg));
if (!DynlibHandle->isValid()) {
-llvm::errs() << "Failed to load " << DynamicHIPPath <<
https://github.com/arsenm edited
https://github.com/llvm/llvm-project/pull/151964
https://github.com/arsenm approved this pull request.
https://github.com/llvm/llvm-project/pull/151918
https://github.com/arsenm approved this pull request.
https://github.com/llvm/llvm-project/pull/151852
https://github.com/arsenm approved this pull request.
https://github.com/llvm/llvm-project/pull/151804
arsenm wrote:
> My literal reading of this is already that it's always undefined behaviour
> for all targets (as you're requesting),
That's not what I'm requesting, it would be an undefined *value*. It is not
instant undefined behavior, you would get UB on use of that value
https://github.co
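One way to sidestep the question in user code is to never pass zero to the builtin. A minimal sketch, assuming GCC/Clang's `__builtin_clz`; `clzOr` is a hypothetical helper, not a compiler builtin.

```
#include <climits>

// Hypothetical helper: returns Fallback for a zero input instead of
// evaluating __builtin_clz(0), whose result is undefined.
static int clzOr(unsigned X, int Fallback) {
  return X == 0 ? Fallback : __builtin_clz(X);
}
```

The caller picks the fallback explicitly, so nothing downstream ever consumes an undefined result.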
@@ -0,0 +1,7 @@
+// RUN: %clang_cc1 -triple amdgcn-unknown-unknown -S -verify=expected -o - %s
+// REQUIRES: amdgpu-registered-target
+
+void test_raw_ptr_atomics(__amdgpu_buffer_rsrc_t rsrc, float f32, double f64,
int offset, int soffset) {
+ f32 = __builtin_amdgcn_raw_ptr_buff
@@ -459,6 +459,8 @@ void AMDGPU::fillAMDGPUFeatureMap(StringRef GPU, const
Triple &T,
Features["atomic-global-pk-add-bf16-inst"] = true;
Features["atomic-ds-pk-add-16-insts"] = true;
Features["setprio-inc-wg-inst"] = true;
+ Features["atomic-fmin-fmax-gl
@@ -163,6 +163,13 @@ BUILTIN(__builtin_amdgcn_raw_buffer_load_b64,
"V2UiQbiiIi", "n")
BUILTIN(__builtin_amdgcn_raw_buffer_load_b96, "V3UiQbiiIi", "n")
BUILTIN(__builtin_amdgcn_raw_buffer_load_b128, "V4UiQbiiIi", "n")
+BUILTIN(__builtin_amdgcn_raw_ptr_buffer_atomic_add_i32, "i
@@ -0,0 +1,9 @@
+// RUN: %clang_cc1 -triple amdgcn-unknown-unknown -target-cpu gfx90a
-target-feature +atomic-fmin-fmax-global-f32 -target-feature
+atomic-fmin-fmax-global-f64 -S -verify=expected -o - %s
+// RUN: %clang_cc1 -triple amdgcn-unknown-unknown -target-cpu gfx942
-tar
@@ -0,0 +1,24 @@
+// NOTE: Assertions have been autogenerated by utils/update_cc_test_checks.py
UTC_ARGS: --version 5
+// RUN: %clang_cc1 -triple amdgcn-unknown-unknown -target-feature
+atomic-fmin-fmax-global-f32 -target-feature +atomic-fmin-fmax-global-f64
-emit-llvm -o - %s
@@ -0,0 +1,36 @@
+//===--===//
+//
+// Part of the LLVM Project, under the Apache License v2.0 with LLVM
Exceptions.
+// See https://llvm.org/LICENSE.txt for license information.
+// SPDX-License-Identifier: Apac
https://github.com/arsenm commented:
Missing test
https://github.com/llvm/llvm-project/pull/151239
arsenm wrote:
> as it could potentially override functions intended to be provided by the
> ROCm device libraries.
These don't provide any of the standard entry point names
https://github.com/llvm/llvm-project/pull/151239
https://github.com/arsenm approved this pull request.
https://github.com/llvm/llvm-project/pull/151278
arsenm wrote:
> HIP does not only not need the C device runtime,
It does. The language bakes in the assumption that all of the libc and libm
calls exist in the ambient environment. A header file wrapper is not sufficient
(e.g. we need workarounds like 77de8a0c0abc9d245a7c6278670554b47ae183ea)
@@ -74,30 +74,6 @@
// O0-CGO2-SAME: "-O0"
// O0-CGO2-NOT: "--lto-CGO2"
-// ALL-NOT: "{{.*}}opt"
arsenm wrote:
If the old driver isn't being removed, the tests should have run lines with the
new and old version
https://github.com/llvm/llvm-project/pull/1233
@@ -4886,6 +4886,10 @@ Action *Driver::BuildOffloadingActions(Compilation &C,
// individually.
for (Action *&A : DeviceActions) {
if ((A->getType() != types::TY_Object &&
+ !(A->getOffloadingToolChain() &&
+ A->getOffloadingToolChain()->getTr
@@ -6627,6 +6627,54 @@ void Verifier::visitIntrinsicCall(Intrinsic::ID ID,
CallBase &Call) {
"invalid vector type for format", &Call, Src1,
Call.getArgOperand(5));
break;
}
+ case Intrinsic::amdgcn_wmma_f32_16x16x128_f8f6f4: {
+Value *Src0 = Call.getArgOp
https://github.com/arsenm approved this pull request.
https://github.com/llvm/llvm-project/pull/149763
https://github.com/arsenm approved this pull request.
https://github.com/llvm/llvm-project/pull/149450
@@ -77,6 +79,82 @@ class RocmInstallationDetector {
SPACKReleaseStr(SPACKReleaseStr.str()) {}
};
+ struct CommonBitcodeLibsPreferences {
+CommonBitcodeLibsPreferences(const Driver &D,
+ const llvm::opt::ArgList &DriverArgs,
+
@@ -25,4 +25,27 @@ define amdgpu_ps void @llvm_log2_bf16_s(ptr addrspace(1)
%out, bfloat inreg %src
ret void
}
+define amdgpu_ps void @llvm_exp2_bf16_v(ptr addrspace(1) %out, bfloat %src) {
+; GCN-LABEL: llvm_exp2_bf16_v:
+; GCN: ; %bb.0:
+; GCN-NEXT:v_exp_bf16_e3
@@ -121,6 +121,11 @@ enum NodeType {
/// function calling this intrinsic.
SPONENTRY,
+ /// STACKADDR - Represents the llvm.stackaddr intrinsic. Takes no argument
+ /// and returns the starting address of the stack region that may be used
+ /// by called functions.
+ ST
https://github.com/arsenm approved this pull request.
https://github.com/llvm/llvm-project/pull/149253
@@ -0,0 +1,240 @@
+; NOTE: Assertions have been autogenerated by utils/update_llc_test_checks.py
UTC_ARGS: --version 5
arsenm wrote:
Can you give this a bf16 suffix instead of bfloat, and add the other targets
where it's not legal
https://github.com/llvm/llvm
https://github.com/arsenm commented:
Testcase?
https://github.com/llvm/llvm-project/pull/148790
https://github.com/arsenm approved this pull request.
https://github.com/llvm/llvm-project/pull/148762
@@ -704,12 +704,12 @@ void diagnoseUnknownMMRAASName(const MachineInstr &MI,
StringRef AS) {
DiagnosticInfoUnsupported(Fn, Str.str(), MI.getDebugLoc(), DS_Warning));
}
-/// Reads \p MI's MMRAs to parse the "amdgpu-as" MMRA.
+/// Reads \p MI's MMRAs to parse the "amdgpu-