[llvm-branch-commits] [clang] [clang][FMV][AArch64] Improve streaming mode compatibility (PR #101007)
https://github.com/tru updated https://github.com/llvm/llvm-project/pull/101007 >From b9203b6067c868d4305400f0964dbac8e15285db Mon Sep 17 00:00:00 2001 From: Alexandros Lamprineas Date: Tue, 23 Jul 2024 19:24:41 +0100 Subject: [PATCH 1/5] [clang][FMV][AArch64] Improve streaming mode compatibility. * Allow arm-streaming if all the functions versions adhere to it. * Allow arm-streaming-compatible if all the functions versions adhere to it. * Allow arm-locally-streaming regardless of the other functions versions. When the caller needs to toggle the streaming mode all the function versions of the callee must adhere to the same mode, otherwise the call will yield a runtime error. Imagine the versions of the callee live in separate TUs. The version that is visible to the caller will determine the calling convention used when generating code for the callsite. Therefore we cannot support mixing streaming with non-streaming function versions. Imagine TU1 has a streaming caller and calls foo._sme which is streaming-compatible. The codegen for the callsite will not switch off the streaming mode. Then in TU2 we have a version which is non-streaming and could potentially be called in streaming mode. Similarly if the caller is non-streaming and the called version is streaming-compatible the codegen for the callsite will not switch on the streaming mode, but other versions may be streaming. --- clang/include/clang/AST/ASTContext.h | 16 +++ .../clang/Basic/DiagnosticSemaKinds.td| 2 - clang/lib/AST/ASTContext.cpp | 3 +- clang/lib/Sema/SemaDecl.cpp | 27 ++-- clang/lib/Sema/SemaDeclAttr.cpp | 7 --- clang/test/Sema/aarch64-fmv-streaming.c | 43 +++ clang/test/Sema/aarch64-sme-func-attrs.c | 42 -- 7 files changed, 83 insertions(+), 57 deletions(-) create mode 100644 clang/test/Sema/aarch64-fmv-streaming.c diff --git a/clang/include/clang/AST/ASTContext.h b/clang/include/clang/AST/ASTContext.h index 6d1c8ca8a2f96..a86394a51db16 100644 --- a/clang/include/clang/AST/ASTContext.h +++ b/clang/include/clang/AST/ASTContext.h @@ -3189,6 +3189,22 @@ class ASTContext : public RefCountedBase { const FunctionDecl *FD, llvm::function_ref Pred) const; + bool areFMVCompatible(const FunctionDecl *FD1, +const FunctionDecl *FD2) const { +if (!hasSameType(FD1->getReturnType(), FD2->getReturnType())) + return false; + +if (FD1->getNumParams() != FD2->getNumParams()) + return false; + +for (unsigned I = 0; I < FD1->getNumParams(); ++I) + if (!hasSameType(FD1->getParamDecl(I)->getOriginalType(), + FD2->getParamDecl(I)->getOriginalType())) +return false; + +return true; + } + const CXXConstructorDecl * getCopyConstructorForExceptionObject(CXXRecordDecl *RD); diff --git a/clang/include/clang/Basic/DiagnosticSemaKinds.td b/clang/include/clang/Basic/DiagnosticSemaKinds.td index 95ce4166ceb66..8a00fe21a08ce 100644 --- a/clang/include/clang/Basic/DiagnosticSemaKinds.td +++ b/clang/include/clang/Basic/DiagnosticSemaKinds.td @@ -3811,8 +3811,6 @@ def warn_sme_locally_streaming_has_vl_args_returns : Warning< InGroup, DefaultIgnore; def err_conflicting_attributes_arm_state : Error< "conflicting attributes for state '%0'">; -def err_sme_streaming_cannot_be_multiversioned : Error< - "streaming function cannot be multi-versioned">; def err_unknown_arm_state : Error< "unknown state '%0'">; def err_missing_arm_state : Error< diff --git a/clang/lib/AST/ASTContext.cpp b/clang/lib/AST/ASTContext.cpp index 7af9ea7105bb0..8f121ed0fe86c 100644 --- a/clang/lib/AST/ASTContext.cpp +++ b/clang/lib/AST/ASTContext.cpp @@ -12451,8 +12451,7 @@ void ASTContext::forEachMultiversionedFunctionVersion( for (auto *CurDecl : FD->getDeclContext()->getRedeclContext()->lookup(FD->getDeclName())) { FunctionDecl *CurFD = CurDecl->getAsFunction()->getMostRecentDecl(); -if (CurFD && hasSameType(CurFD->getType(), FD->getType()) && -!SeenDecls.contains(CurFD)) { +if (CurFD && areFMVCompatible(CurFD, FD) && !SeenDecls.contains(CurFD)) { SeenDecls.insert(CurFD); Pred(CurFD); } diff --git a/clang/lib/Sema/SemaDecl.cpp b/clang/lib/Sema/SemaDecl.cpp index f60cc78be4f92..77331f0ca0997 100644 --- a/clang/lib/Sema/SemaDecl.cpp +++ b/clang/lib/Sema/SemaDecl.cpp @@ -11014,6 +11014,9 @@ static bool AttrCompatibleWithMultiVersion(attr::Kind Kind, switch (Kind) { default: return false; + case attr::ArmLocallyStreaming: +return MVKind == MultiVersionKind::TargetVersion || + MVKind == MultiVersionKind::TargetClones; case attr::Used: return MVKind == MultiVersionKind::Target; case attr::NonNull: @@ -11150,7 +11153,24 @@ bool Sema::areMultiversionVariantFunctionsCompatible( FunctionType::ExtInfo OldTypeInfo = OldType->getExtInfo(); FunctionType::ExtI
[llvm-branch-commits] [clang] b9203b6 - [clang][FMV][AArch64] Improve streaming mode compatibility.
Author: Alexandros Lamprineas Date: 2024-08-01T09:02:41+02:00 New Revision: b9203b6067c868d4305400f0964dbac8e15285db URL: https://github.com/llvm/llvm-project/commit/b9203b6067c868d4305400f0964dbac8e15285db DIFF: https://github.com/llvm/llvm-project/commit/b9203b6067c868d4305400f0964dbac8e15285db.diff LOG: [clang][FMV][AArch64] Improve streaming mode compatibility. * Allow arm-streaming if all the functions versions adhere to it. * Allow arm-streaming-compatible if all the functions versions adhere to it. * Allow arm-locally-streaming regardless of the other functions versions. When the caller needs to toggle the streaming mode all the function versions of the callee must adhere to the same mode, otherwise the call will yield a runtime error. Imagine the versions of the callee live in separate TUs. The version that is visible to the caller will determine the calling convention used when generating code for the callsite. Therefore we cannot support mixing streaming with non-streaming function versions. Imagine TU1 has a streaming caller and calls foo._sme which is streaming-compatible. The codegen for the callsite will not switch off the streaming mode. Then in TU2 we have a version which is non-streaming and could potentially be called in streaming mode. Similarly if the caller is non-streaming and the called version is streaming-compatible the codegen for the callsite will not switch on the streaming mode, but other versions may be streaming. Added: clang/test/Sema/aarch64-fmv-streaming.c Modified: clang/include/clang/AST/ASTContext.h clang/include/clang/Basic/DiagnosticSemaKinds.td clang/lib/AST/ASTContext.cpp clang/lib/Sema/SemaDecl.cpp clang/lib/Sema/SemaDeclAttr.cpp clang/test/Sema/aarch64-sme-func-attrs.c Removed: diff --git a/clang/include/clang/AST/ASTContext.h b/clang/include/clang/AST/ASTContext.h index 6d1c8ca8a2f96..a86394a51db16 100644 --- a/clang/include/clang/AST/ASTContext.h +++ b/clang/include/clang/AST/ASTContext.h @@ -3189,6 +3189,22 @@ class ASTContext : public RefCountedBase { const FunctionDecl *FD, llvm::function_ref Pred) const; + bool areFMVCompatible(const FunctionDecl *FD1, +const FunctionDecl *FD2) const { +if (!hasSameType(FD1->getReturnType(), FD2->getReturnType())) + return false; + +if (FD1->getNumParams() != FD2->getNumParams()) + return false; + +for (unsigned I = 0; I < FD1->getNumParams(); ++I) + if (!hasSameType(FD1->getParamDecl(I)->getOriginalType(), + FD2->getParamDecl(I)->getOriginalType())) +return false; + +return true; + } + const CXXConstructorDecl * getCopyConstructorForExceptionObject(CXXRecordDecl *RD); diff --git a/clang/include/clang/Basic/DiagnosticSemaKinds.td b/clang/include/clang/Basic/DiagnosticSemaKinds.td index 95ce4166ceb66..8a00fe21a08ce 100644 --- a/clang/include/clang/Basic/DiagnosticSemaKinds.td +++ b/clang/include/clang/Basic/DiagnosticSemaKinds.td @@ -3811,8 +3811,6 @@ def warn_sme_locally_streaming_has_vl_args_returns : Warning< InGroup, DefaultIgnore; def err_conflicting_attributes_arm_state : Error< "conflicting attributes for state '%0'">; -def err_sme_streaming_cannot_be_multiversioned : Error< - "streaming function cannot be multi-versioned">; def err_unknown_arm_state : Error< "unknown state '%0'">; def err_missing_arm_state : Error< diff --git a/clang/lib/AST/ASTContext.cpp b/clang/lib/AST/ASTContext.cpp index 7af9ea7105bb0..8f121ed0fe86c 100644 --- a/clang/lib/AST/ASTContext.cpp +++ b/clang/lib/AST/ASTContext.cpp @@ -12451,8 +12451,7 @@ void ASTContext::forEachMultiversionedFunctionVersion( for (auto *CurDecl : FD->getDeclContext()->getRedeclContext()->lookup(FD->getDeclName())) { FunctionDecl *CurFD = CurDecl->getAsFunction()->getMostRecentDecl(); -if (CurFD && hasSameType(CurFD->getType(), FD->getType()) && -!SeenDecls.contains(CurFD)) { +if (CurFD && areFMVCompatible(CurFD, FD) && !SeenDecls.contains(CurFD)) { SeenDecls.insert(CurFD); Pred(CurFD); } diff --git a/clang/lib/Sema/SemaDecl.cpp b/clang/lib/Sema/SemaDecl.cpp index f60cc78be4f92..77331f0ca0997 100644 --- a/clang/lib/Sema/SemaDecl.cpp +++ b/clang/lib/Sema/SemaDecl.cpp @@ -11014,6 +11014,9 @@ static bool AttrCompatibleWithMultiVersion(attr::Kind Kind, switch (Kind) { default: return false; + case attr::ArmLocallyStreaming: +return MVKind == MultiVersionKind::TargetVersion || + MVKind == MultiVersionKind::TargetClones; case attr::Used: return MVKind == MultiVersionKind::Target; case attr::NonNull: @@ -11150,7 +11153,24 @@ bool Sema::areMultiversionVariantFunctionsCompatible( FunctionType::ExtInfo OldTypeInfo = OldType->getExtInfo(); FunctionType::ExtInfo NewTypeInfo = NewType->getExtInfo(); -if (OldTypeInfo.getCC
[llvm-branch-commits] [clang] 00d9703 - Changes from last revision:
Author: Alexandros Lamprineas Date: 2024-08-01T09:02:41+02:00 New Revision: 00d97039a6dacd17beebce32b727e8c23900eeae URL: https://github.com/llvm/llvm-project/commit/00d97039a6dacd17beebce32b727e8c23900eeae DIFF: https://github.com/llvm/llvm-project/commit/00d97039a6dacd17beebce32b727e8c23900eeae.diff LOG: Changes from last revision: * Disregard declarations with different variadic type. Note that we are not diagnosing such differences, we just do not consider the two declarations part of the same declaration chain. As a result the diagnostic comes upon use: "ambiguous call". This is NFC. * Added a sema test for variadic type mismatch. * Added a codegen test for the calling conventions. Added: clang/test/CodeGen/aarch64-fmv-streaming.c Modified: clang/include/clang/AST/ASTContext.h clang/test/Sema/attr-target-version.c Removed: diff --git a/clang/include/clang/AST/ASTContext.h b/clang/include/clang/AST/ASTContext.h index a86394a51db16..419104059838f 100644 --- a/clang/include/clang/AST/ASTContext.h +++ b/clang/include/clang/AST/ASTContext.h @@ -3194,6 +3194,9 @@ class ASTContext : public RefCountedBase { if (!hasSameType(FD1->getReturnType(), FD2->getReturnType())) return false; +if (FD1->isVariadic() != FD2->isVariadic()) + return false; + if (FD1->getNumParams() != FD2->getNumParams()) return false; diff --git a/clang/test/CodeGen/aarch64-fmv-streaming.c b/clang/test/CodeGen/aarch64-fmv-streaming.c new file mode 100644 index 0..e777a53b2f038 --- /dev/null +++ b/clang/test/CodeGen/aarch64-fmv-streaming.c @@ -0,0 +1,106 @@ +// RUN: %clang_cc1 -triple aarch64-none-linux-gnu -target-feature +sme -emit-llvm -o - %s | FileCheck %s + + +// CHECK-LABEL: define {{[^@]+}}@n_callee._Msve +// CHECK-SAME: () #[[ATTR0:[0-9]+]] { +// +// CHECK-LABEL: define {{[^@]+}}@n_callee._Msimd +// CHECK-SAME: () #[[ATTR1:[0-9]+]] { +// +__arm_locally_streaming __attribute__((target_clones("sve", "simd"))) void n_callee(void) {} +// CHECK-LABEL: define {{[^@]+}}@n_callee._Msme2 +// CHECK-SAME: () #[[ATTR2:[0-9]+]] { +// +__attribute__((target_version("sme2"))) void n_callee(void) {} +// CHECK-LABEL: define {{[^@]+}}@n_callee.default +// CHECK-SAME: () #[[ATTR3:[0-9]+]] { +// +__attribute__((target_version("default"))) void n_callee(void) {} + + +// CHECK-LABEL: define {{[^@]+}}@s_callee._Msve +// CHECK-SAME: () #[[ATTR4:[0-9]+]] { +// +// CHECK-LABEL: define {{[^@]+}}@s_callee._Msimd +// CHECK-SAME: () #[[ATTR5:[0-9]+]] { +// +__attribute__((target_clones("sve", "simd"))) void s_callee(void) __arm_streaming {} +// CHECK-LABEL: define {{[^@]+}}@s_callee._Msme2 +// CHECK-SAME: () #[[ATTR6:[0-9]+]] { +// +__arm_locally_streaming __attribute__((target_version("sme2"))) void s_callee(void) {} +// CHECK-LABEL: define {{[^@]+}}@s_callee.default +// CHECK-SAME: () #[[ATTR7:[0-9]+]] { +// +__attribute__((target_version("default"))) void s_callee(void) __arm_streaming {} + + +// CHECK-LABEL: define {{[^@]+}}@sc_callee._Msve +// CHECK-SAME: () #[[ATTR8:[0-9]+]] { +// +// CHECK-LABEL: define {{[^@]+}}@sc_callee._Msimd +// CHECK-SAME: () #[[ATTR9:[0-9]+]] { +// +__attribute__((target_clones("sve", "simd"))) void sc_callee(void) __arm_streaming_compatible {} +// CHECK-LABEL: define {{[^@]+}}@sc_callee._Msme2 +// CHECK-SAME: () #[[ATTR6:[0-9]+]] { +// +__arm_locally_streaming __attribute__((target_version("sme2"))) void sc_callee(void) {} +// CHECK-LABEL: define {{[^@]+}}@sc_callee.default +// CHECK-SAME: () #[[ATTR10:[0-9]+]] { +// +__attribute__((target_version("default"))) void sc_callee(void) __arm_streaming_compatible {} + + +// CHECK-LABEL: define {{[^@]+}}@n_caller +// CHECK-SAME: () #[[ATTR3:[0-9]+]] { +// CHECK:call void @n_callee() +// CHECK:call void @s_callee() #[[ATTR11:[0-9]+]] +// CHECK:call void @sc_callee() #[[ATTR12:[0-9]+]] +// +void n_caller(void) { + n_callee(); + s_callee(); + sc_callee(); +} + + +// CHECK-LABEL: define {{[^@]+}}@s_caller +// CHECK-SAME: () #[[ATTR7:[0-9]+]] { +// CHECK:call void @n_callee() +// CHECK:call void @s_callee() #[[ATTR11]] +// CHECK:call void @sc_callee() #[[ATTR12]] +// +void s_caller(void) __arm_streaming { + n_callee(); + s_callee(); + sc_callee(); +} + + +// CHECK-LABEL: define {{[^@]+}}@sc_caller +// CHECK-SAME: () #[[ATTR10:[0-9]+]] { +// CHECK:call void @n_callee() +// CHECK:call void @s_callee() #[[ATTR11]] +// CHECK:call void @sc_callee() #[[ATTR12]] +// +void sc_caller(void) __arm_streaming_compatible { + n_callee(); + s_callee(); + sc_callee(); +} + + +// CHECK: attributes #[[ATTR0:[0-9]+]] = {{.*}} "aarch64_pstate_sm_body" +// CHECK: attributes #[[ATTR1:[0-9]+]] = {{.*}} "aarch64_pstate_sm_body" +// CHECK: attributes #[[ATTR2:[0-9]+]] = {{.*}} +// CHECK: attributes #[[ATTR3]] = {{.*}} +// CHECK: attributes #[[ATTR4:[0-9]+]] = {{.*}} "aarch64_pstate_sm_enabled
[llvm-branch-commits] [clang] 84de157 - Changes from last revision:
Author: Alexandros Lamprineas Date: 2024-08-01T09:02:41+02:00 New Revision: 84de15796052e629a2276bcf1d502d1a8163e32b URL: https://github.com/llvm/llvm-project/commit/84de15796052e629a2276bcf1d502d1a8163e32b DIFF: https://github.com/llvm/llvm-project/commit/84de15796052e629a2276bcf1d502d1a8163e32b.diff LOG: Changes from last revision: Made __arm_locally_streaming require the same calling convention as the rest of the callee versions and updated the tests. Added: Modified: clang/lib/Sema/SemaDecl.cpp clang/test/CodeGen/aarch64-fmv-streaming.c clang/test/Sema/aarch64-fmv-streaming.c Removed: diff --git a/clang/lib/Sema/SemaDecl.cpp b/clang/lib/Sema/SemaDecl.cpp index 77331f0ca0997..4dc72063e54c0 100644 --- a/clang/lib/Sema/SemaDecl.cpp +++ b/clang/lib/Sema/SemaDecl.cpp @@ -11157,9 +11157,7 @@ bool Sema::areMultiversionVariantFunctionsCompatible( const auto *NewFPT = NewFD->getType()->getAs(); bool ArmStreamingCCMismatched = false; -// Locally streaming does not affect the calling convention. -if (OldFPT && NewFPT && !OldFD->hasAttr() && -!NewFD->hasAttr()) { +if (OldFPT && NewFPT) { unsigned Diff = OldFPT->getAArch64SMEAttributes() ^ NewFPT->getAArch64SMEAttributes(); // Streaming versions cannot be mixed with non-streaming versions. diff --git a/clang/test/CodeGen/aarch64-fmv-streaming.c b/clang/test/CodeGen/aarch64-fmv-streaming.c index e777a53b2f038..e549ccda59ad8 100644 --- a/clang/test/CodeGen/aarch64-fmv-streaming.c +++ b/clang/test/CodeGen/aarch64-fmv-streaming.c @@ -28,7 +28,7 @@ __attribute__((target_clones("sve", "simd"))) void s_callee(void) __arm_streamin // CHECK-LABEL: define {{[^@]+}}@s_callee._Msme2 // CHECK-SAME: () #[[ATTR6:[0-9]+]] { // -__arm_locally_streaming __attribute__((target_version("sme2"))) void s_callee(void) {} +__arm_locally_streaming __attribute__((target_version("sme2"))) void s_callee(void) __arm_streaming {} // CHECK-LABEL: define {{[^@]+}}@s_callee.default // CHECK-SAME: () #[[ATTR7:[0-9]+]] { // @@ -43,11 +43,11 @@ __attribute__((target_version("default"))) void s_callee(void) __arm_streaming { // __attribute__((target_clones("sve", "simd"))) void sc_callee(void) __arm_streaming_compatible {} // CHECK-LABEL: define {{[^@]+}}@sc_callee._Msme2 -// CHECK-SAME: () #[[ATTR6:[0-9]+]] { +// CHECK-SAME: () #[[ATTR10:[0-9]+]] { // -__arm_locally_streaming __attribute__((target_version("sme2"))) void sc_callee(void) {} +__arm_locally_streaming __attribute__((target_version("sme2"))) void sc_callee(void) __arm_streaming_compatible {} // CHECK-LABEL: define {{[^@]+}}@sc_callee.default -// CHECK-SAME: () #[[ATTR10:[0-9]+]] { +// CHECK-SAME: () #[[ATTR11:[0-9]+]] { // __attribute__((target_version("default"))) void sc_callee(void) __arm_streaming_compatible {} @@ -55,8 +55,8 @@ __attribute__((target_version("default"))) void sc_callee(void) __arm_streaming_ // CHECK-LABEL: define {{[^@]+}}@n_caller // CHECK-SAME: () #[[ATTR3:[0-9]+]] { // CHECK:call void @n_callee() -// CHECK:call void @s_callee() #[[ATTR11:[0-9]+]] -// CHECK:call void @sc_callee() #[[ATTR12:[0-9]+]] +// CHECK:call void @s_callee() #[[ATTR12:[0-9]+]] +// CHECK:call void @sc_callee() #[[ATTR13:[0-9]+]] // void n_caller(void) { n_callee(); @@ -68,8 +68,8 @@ void n_caller(void) { // CHECK-LABEL: define {{[^@]+}}@s_caller // CHECK-SAME: () #[[ATTR7:[0-9]+]] { // CHECK:call void @n_callee() -// CHECK:call void @s_callee() #[[ATTR11]] -// CHECK:call void @sc_callee() #[[ATTR12]] +// CHECK:call void @s_callee() #[[ATTR12]] +// CHECK:call void @sc_callee() #[[ATTR13]] // void s_caller(void) __arm_streaming { n_callee(); @@ -79,10 +79,10 @@ void s_caller(void) __arm_streaming { // CHECK-LABEL: define {{[^@]+}}@sc_caller -// CHECK-SAME: () #[[ATTR10:[0-9]+]] { +// CHECK-SAME: () #[[ATTR11:[0-9]+]] { // CHECK:call void @n_callee() -// CHECK:call void @s_callee() #[[ATTR11]] -// CHECK:call void @sc_callee() #[[ATTR12]] +// CHECK:call void @s_callee() #[[ATTR12]] +// CHECK:call void @sc_callee() #[[ATTR13]] // void sc_caller(void) __arm_streaming_compatible { n_callee(); @@ -97,10 +97,11 @@ void sc_caller(void) __arm_streaming_compatible { // CHECK: attributes #[[ATTR3]] = {{.*}} // CHECK: attributes #[[ATTR4:[0-9]+]] = {{.*}} "aarch64_pstate_sm_enabled" // CHECK: attributes #[[ATTR5:[0-9]+]] = {{.*}} "aarch64_pstate_sm_enabled" -// CHECK: attributes #[[ATTR6:[0-9]+]] = {{.*}} "aarch64_pstate_sm_body" +// CHECK: attributes #[[ATTR6:[0-9]+]] = {{.*}} "aarch64_pstate_sm_body" "aarch64_pstate_sm_enabled" // CHECK: attributes #[[ATTR7]] = {{.*}} "aarch64_pstate_sm_enabled" // CHECK: attributes #[[ATTR8:[0-9]+]] = {{.*}} "aarch64_pstate_sm_compatible" // CHECK: attributes #[[ATTR9:[0-9]+]] = {{.*}} "aarch64_pstate_sm_compatible" -// CHECK: attri
[llvm-branch-commits] [clang] 196fb42 - Changes fro last revision:
Author: Alexandros Lamprineas Date: 2024-08-01T09:02:41+02:00 New Revision: 196fb42d2ef10cc6b3c9732c2612d2cd2973d340 URL: https://github.com/llvm/llvm-project/commit/196fb42d2ef10cc6b3c9732c2612d2cd2973d340 DIFF: https://github.com/llvm/llvm-project/commit/196fb42d2ef10cc6b3c9732c2612d2cd2973d340.diff LOG: Changes fro last revision: Combined two separate SME_PState bitmask checks into one as suggested. Added: Modified: clang/lib/Sema/SemaDecl.cpp Removed: diff --git a/clang/lib/Sema/SemaDecl.cpp b/clang/lib/Sema/SemaDecl.cpp index 4dc72063e54c0..01231f8e385ef 100644 --- a/clang/lib/Sema/SemaDecl.cpp +++ b/clang/lib/Sema/SemaDecl.cpp @@ -11160,11 +11160,10 @@ bool Sema::areMultiversionVariantFunctionsCompatible( if (OldFPT && NewFPT) { unsigned Diff = OldFPT->getAArch64SMEAttributes() ^ NewFPT->getAArch64SMEAttributes(); - // Streaming versions cannot be mixed with non-streaming versions. - if (Diff & FunctionType::SME_PStateSMEnabledMask) -ArmStreamingCCMismatched = true; - // Streaming-compatible versions cannot be mixed with anything else. - if (Diff & FunctionType::SME_PStateSMCompatibleMask) + // Arm-streaming, arm-streaming-compatible and non-streaming versions + // cannot be mixed. + if (Diff & (FunctionType::SME_PStateSMEnabledMask | + FunctionType::SME_PStateSMCompatibleMask)) ArmStreamingCCMismatched = true; } ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
[llvm-branch-commits] [clang] bb412b7 - Changes from last revision:
Author: Alexandros Lamprineas Date: 2024-08-01T09:02:41+02:00 New Revision: bb412b723139a7a7f20f768857d9d4c656ac6fbb URL: https://github.com/llvm/llvm-project/commit/bb412b723139a7a7f20f768857d9d4c656ac6fbb DIFF: https://github.com/llvm/llvm-project/commit/bb412b723139a7a7f20f768857d9d4c656ac6fbb.diff LOG: Changes from last revision: * Replaced areFMVCompatible with hasSameType. Looks like this change was unnecessary in the first place. Most likely a residue from my WIP before I raised the PR. Thanks Sander for finding this! * Removed the corresponding sema test for variadic type mismatch. Added: Modified: clang/include/clang/AST/ASTContext.h clang/lib/AST/ASTContext.cpp clang/test/Sema/attr-target-version.c Removed: diff --git a/clang/include/clang/AST/ASTContext.h b/clang/include/clang/AST/ASTContext.h index 419104059838f..6d1c8ca8a2f96 100644 --- a/clang/include/clang/AST/ASTContext.h +++ b/clang/include/clang/AST/ASTContext.h @@ -3189,25 +3189,6 @@ class ASTContext : public RefCountedBase { const FunctionDecl *FD, llvm::function_ref Pred) const; - bool areFMVCompatible(const FunctionDecl *FD1, -const FunctionDecl *FD2) const { -if (!hasSameType(FD1->getReturnType(), FD2->getReturnType())) - return false; - -if (FD1->isVariadic() != FD2->isVariadic()) - return false; - -if (FD1->getNumParams() != FD2->getNumParams()) - return false; - -for (unsigned I = 0; I < FD1->getNumParams(); ++I) - if (!hasSameType(FD1->getParamDecl(I)->getOriginalType(), - FD2->getParamDecl(I)->getOriginalType())) -return false; - -return true; - } - const CXXConstructorDecl * getCopyConstructorForExceptionObject(CXXRecordDecl *RD); diff --git a/clang/lib/AST/ASTContext.cpp b/clang/lib/AST/ASTContext.cpp index 8f121ed0fe86c..7af9ea7105bb0 100644 --- a/clang/lib/AST/ASTContext.cpp +++ b/clang/lib/AST/ASTContext.cpp @@ -12451,7 +12451,8 @@ void ASTContext::forEachMultiversionedFunctionVersion( for (auto *CurDecl : FD->getDeclContext()->getRedeclContext()->lookup(FD->getDeclName())) { FunctionDecl *CurFD = CurDecl->getAsFunction()->getMostRecentDecl(); -if (CurFD && areFMVCompatible(CurFD, FD) && !SeenDecls.contains(CurFD)) { +if (CurFD && hasSameType(CurFD->getType(), FD->getType()) && +!SeenDecls.contains(CurFD)) { SeenDecls.insert(CurFD); Pred(CurFD); } diff --git a/clang/test/Sema/attr-target-version.c b/clang/test/Sema/attr-target-version.c index 91c89cfd1e7b0..88a927a58f991 100644 --- a/clang/test/Sema/attr-target-version.c +++ b/clang/test/Sema/attr-target-version.c @@ -112,15 +112,3 @@ int unspec_args_implicit_default_first(); // expected-note@+1 {{function multiversioning caused by this declaration}} int __attribute__((target_version("aes"))) unspec_args_implicit_default_first() { return -1; } int __attribute__((target_version("default"))) unspec_args_implicit_default_first() { return 0; } - -void __attribute__((target_version("default"))) variadic_ok(int x, ...) {} -void __attribute__((target_version("fp"))) variadic_ok(int x, ...) {} -// expected-note@+1 {{candidate function}} -void __attribute__((target_version("default"))) variadic_bad(int x) {} -void __attribute__((target_version("fp"))) variadic_bad(int x, ...) {} - -void calls_variadic() { - variadic_ok(3); - //expected-error@+1 {{call to 'variadic_bad' is ambiguous}} - variadic_bad(3); -} ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
[llvm-branch-commits] [clang] [clang][FMV][AArch64] Improve streaming mode compatibility (PR #101007)
https://github.com/tru closed https://github.com/llvm/llvm-project/pull/101007 ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
[llvm-branch-commits] [clang] [clang][FMV][AArch64] Improve streaming mode compatibility (PR #101007)
github-actions[bot] wrote: @labrinea (or anyone else). If you would like to add a note about this fix in the release notes (completely optional). Please reply to this comment with a one or two sentence description of the fix. When you are done, please add the release:note label to this PR. https://github.com/llvm/llvm-project/pull/101007 ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
[llvm-branch-commits] [clang] 32b786c - [clang][FMV][AArch64] Improve streaming mode compatibility.
Author: Alexandros Lamprineas Date: 2024-08-01T09:04:44+02:00 New Revision: 32b786c92f0ae52201888dcfba5c3ac789afbb3a URL: https://github.com/llvm/llvm-project/commit/32b786c92f0ae52201888dcfba5c3ac789afbb3a DIFF: https://github.com/llvm/llvm-project/commit/32b786c92f0ae52201888dcfba5c3ac789afbb3a.diff LOG: [clang][FMV][AArch64] Improve streaming mode compatibility. * Allow arm-streaming if all the functions versions adhere to it. * Allow arm-streaming-compatible if all the functions versions adhere to it. * Allow arm-locally-streaming regardless of the other functions versions. When the caller needs to toggle the streaming mode all the function versions of the callee must adhere to the same mode, otherwise the call will yield a runtime error. Imagine the versions of the callee live in separate TUs. The version that is visible to the caller will determine the calling convention used when generating code for the callsite. Therefore we cannot support mixing streaming with non-streaming function versions. Imagine TU1 has a streaming caller and calls foo._sme which is streaming-compatible. The codegen for the callsite will not switch off the streaming mode. Then in TU2 we have a version which is non-streaming and could potentially be called in streaming mode. Similarly if the caller is non-streaming and the called version is streaming-compatible the codegen for the callsite will not switch on the streaming mode, but other versions may be streaming. Added: clang/test/CodeGen/aarch64-fmv-streaming.c clang/test/Sema/aarch64-fmv-streaming.c Modified: clang/include/clang/Basic/DiagnosticSemaKinds.td clang/lib/Sema/SemaDecl.cpp clang/lib/Sema/SemaDeclAttr.cpp clang/test/Sema/aarch64-sme-func-attrs.c Removed: diff --git a/clang/include/clang/Basic/DiagnosticSemaKinds.td b/clang/include/clang/Basic/DiagnosticSemaKinds.td index 95ce4166ceb66..8a00fe21a08ce 100644 --- a/clang/include/clang/Basic/DiagnosticSemaKinds.td +++ b/clang/include/clang/Basic/DiagnosticSemaKinds.td @@ -3811,8 +3811,6 @@ def warn_sme_locally_streaming_has_vl_args_returns : Warning< InGroup, DefaultIgnore; def err_conflicting_attributes_arm_state : Error< "conflicting attributes for state '%0'">; -def err_sme_streaming_cannot_be_multiversioned : Error< - "streaming function cannot be multi-versioned">; def err_unknown_arm_state : Error< "unknown state '%0'">; def err_missing_arm_state : Error< diff --git a/clang/lib/Sema/SemaDecl.cpp b/clang/lib/Sema/SemaDecl.cpp index f60cc78be4f92..01231f8e385ef 100644 --- a/clang/lib/Sema/SemaDecl.cpp +++ b/clang/lib/Sema/SemaDecl.cpp @@ -11014,6 +11014,9 @@ static bool AttrCompatibleWithMultiVersion(attr::Kind Kind, switch (Kind) { default: return false; + case attr::ArmLocallyStreaming: +return MVKind == MultiVersionKind::TargetVersion || + MVKind == MultiVersionKind::TargetClones; case attr::Used: return MVKind == MultiVersionKind::Target; case attr::NonNull: @@ -11150,7 +11153,21 @@ bool Sema::areMultiversionVariantFunctionsCompatible( FunctionType::ExtInfo OldTypeInfo = OldType->getExtInfo(); FunctionType::ExtInfo NewTypeInfo = NewType->getExtInfo(); -if (OldTypeInfo.getCC() != NewTypeInfo.getCC()) +const auto *OldFPT = OldFD->getType()->getAs(); +const auto *NewFPT = NewFD->getType()->getAs(); + +bool ArmStreamingCCMismatched = false; +if (OldFPT && NewFPT) { + unsigned Diff = + OldFPT->getAArch64SMEAttributes() ^ NewFPT->getAArch64SMEAttributes(); + // Arm-streaming, arm-streaming-compatible and non-streaming versions + // cannot be mixed. + if (Diff & (FunctionType::SME_PStateSMEnabledMask | + FunctionType::SME_PStateSMCompatibleMask)) +ArmStreamingCCMismatched = true; +} + +if (OldTypeInfo.getCC() != NewTypeInfo.getCC() || ArmStreamingCCMismatched) return Diag(DiffDiagIDAt.first, DiffDiagIDAt.second) << CallingConv; QualType OldReturnType = OldType->getReturnType(); @@ -11170,9 +11187,8 @@ bool Sema::areMultiversionVariantFunctionsCompatible( if (!CLinkageMayDiffer && OldFD->isExternC() != NewFD->isExternC()) return Diag(DiffDiagIDAt.first, DiffDiagIDAt.second) << LanguageLinkage; -if (CheckEquivalentExceptionSpec( -OldFD->getType()->getAs(), OldFD->getLocation(), -NewFD->getType()->getAs(), NewFD->getLocation())) +if (CheckEquivalentExceptionSpec(OldFPT, OldFD->getLocation(), NewFPT, + NewFD->getLocation())) return true; } return false; diff --git a/clang/lib/Sema/SemaDeclAttr.cpp b/clang/lib/Sema/SemaDeclAttr.cpp index 10bacc17a07ca..e2eada24f9fcc 100644 --- a/clang/lib/Sema/SemaDeclAttr.cpp +++ b/clang/lib/Sema/SemaDeclAttr.cpp @@ -3024,9 +3024,6 @@ bool Sema::checkTargetVersionAttr(SourceLocation LiteralLoc, Decl *D,
[llvm-branch-commits] [compiler-rt] release/19.x: [Sanitizers] Avoid overload ambiguity for interceptors (#100986) (PR #101150)
https://github.com/tru updated https://github.com/llvm/llvm-project/pull/101150 >From 742576dc3b332d0f67e883b445f482a51ea1feec Mon Sep 17 00:00:00 2001 From: Nikita Popov Date: Tue, 30 Jul 2024 09:25:03 +0200 Subject: [PATCH] [Sanitizers] Avoid overload ambiguity for interceptors (#100986) Since glibc 2.40 some functions like openat make use of overloads when built with `-D_FORTIFY_SOURCE=2`, see: https://github.com/bminor/glibc/blob/master/io/bits/fcntl2.h This means that doing something like `(uintptr_t) openat` or `(void *) openat` is now ambiguous, breaking the compiler-rt build on new glibc versions. Fix this by explicitly casting the symbol to the expected function type before casting it to an intptr. The expected type is obtained as `decltype(REAL(func))` so we don't have to repeat the signature from INTERCEPTOR in the INTERCEPT_FUNTION macro. Fixes https://github.com/llvm/llvm-project/issues/100754. (cherry picked from commit 155b7a12820ec45095988b6aa6e057afaf2bc892) --- .../lib/interception/interception_linux.h| 16 +--- 1 file changed, 9 insertions(+), 7 deletions(-) diff --git a/compiler-rt/lib/interception/interception_linux.h b/compiler-rt/lib/interception/interception_linux.h index 433a3d9bd7fa7..2e01ff44578c3 100644 --- a/compiler-rt/lib/interception/interception_linux.h +++ b/compiler-rt/lib/interception/interception_linux.h @@ -28,12 +28,14 @@ bool InterceptFunction(const char *name, const char *ver, uptr *ptr_to_real, uptr func, uptr trampoline); } // namespace __interception -#define INTERCEPT_FUNCTION_LINUX_OR_FREEBSD(func) \ - ::__interception::InterceptFunction(\ - #func, \ - (::__interception::uptr *)&REAL(func), \ - (::__interception::uptr)&(func),\ - (::__interception::uptr)&TRAMPOLINE(func)) +// Cast func to type of REAL(func) before casting to uptr in case it is an +// overloaded function, which is the case for some glibc functions when +// _FORTIFY_SOURCE is used. This disambiguates which overload to use. +#define INTERCEPT_FUNCTION_LINUX_OR_FREEBSD(func)\ + ::__interception::InterceptFunction( \ + #func, (::__interception::uptr *)&REAL(func), \ + (::__interception::uptr)(decltype(REAL(func)))&(func), \ + (::__interception::uptr) &TRAMPOLINE(func)) // dlvsym is a GNU extension supported by some other platforms. #if SANITIZER_GLIBC || SANITIZER_FREEBSD || SANITIZER_NETBSD @@ -41,7 +43,7 @@ bool InterceptFunction(const char *name, const char *ver, uptr *ptr_to_real, ::__interception::InterceptFunction(\ #func, symver, \ (::__interception::uptr *)&REAL(func), \ - (::__interception::uptr)&(func),\ + (::__interception::uptr)(decltype(REAL(func)))&(func), \ (::__interception::uptr)&TRAMPOLINE(func)) #else #define INTERCEPT_FUNCTION_VER_LINUX_OR_FREEBSD(func, symver) \ ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
[llvm-branch-commits] [compiler-rt] release/19.x: [Sanitizers] Avoid overload ambiguity for interceptors (#100986) (PR #101150)
https://github.com/tru closed https://github.com/llvm/llvm-project/pull/101150 ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
[llvm-branch-commits] [compiler-rt] 742576d - [Sanitizers] Avoid overload ambiguity for interceptors (#100986)
Author: Nikita Popov Date: 2024-08-01T09:13:44+02:00 New Revision: 742576dc3b332d0f67e883b445f482a51ea1feec URL: https://github.com/llvm/llvm-project/commit/742576dc3b332d0f67e883b445f482a51ea1feec DIFF: https://github.com/llvm/llvm-project/commit/742576dc3b332d0f67e883b445f482a51ea1feec.diff LOG: [Sanitizers] Avoid overload ambiguity for interceptors (#100986) Since glibc 2.40 some functions like openat make use of overloads when built with `-D_FORTIFY_SOURCE=2`, see: https://github.com/bminor/glibc/blob/master/io/bits/fcntl2.h This means that doing something like `(uintptr_t) openat` or `(void *) openat` is now ambiguous, breaking the compiler-rt build on new glibc versions. Fix this by explicitly casting the symbol to the expected function type before casting it to an intptr. The expected type is obtained as `decltype(REAL(func))` so we don't have to repeat the signature from INTERCEPTOR in the INTERCEPT_FUNTION macro. Fixes https://github.com/llvm/llvm-project/issues/100754. (cherry picked from commit 155b7a12820ec45095988b6aa6e057afaf2bc892) Added: Modified: compiler-rt/lib/interception/interception_linux.h Removed: diff --git a/compiler-rt/lib/interception/interception_linux.h b/compiler-rt/lib/interception/interception_linux.h index 433a3d9bd7fa7..2e01ff44578c3 100644 --- a/compiler-rt/lib/interception/interception_linux.h +++ b/compiler-rt/lib/interception/interception_linux.h @@ -28,12 +28,14 @@ bool InterceptFunction(const char *name, const char *ver, uptr *ptr_to_real, uptr func, uptr trampoline); } // namespace __interception -#define INTERCEPT_FUNCTION_LINUX_OR_FREEBSD(func) \ - ::__interception::InterceptFunction(\ - #func, \ - (::__interception::uptr *)&REAL(func), \ - (::__interception::uptr)&(func),\ - (::__interception::uptr)&TRAMPOLINE(func)) +// Cast func to type of REAL(func) before casting to uptr in case it is an +// overloaded function, which is the case for some glibc functions when +// _FORTIFY_SOURCE is used. This disambiguates which overload to use. +#define INTERCEPT_FUNCTION_LINUX_OR_FREEBSD(func)\ + ::__interception::InterceptFunction( \ + #func, (::__interception::uptr *)&REAL(func), \ + (::__interception::uptr)(decltype(REAL(func)))&(func), \ + (::__interception::uptr) &TRAMPOLINE(func)) // dlvsym is a GNU extension supported by some other platforms. #if SANITIZER_GLIBC || SANITIZER_FREEBSD || SANITIZER_NETBSD @@ -41,7 +43,7 @@ bool InterceptFunction(const char *name, const char *ver, uptr *ptr_to_real, ::__interception::InterceptFunction(\ #func, symver, \ (::__interception::uptr *)&REAL(func), \ - (::__interception::uptr)&(func),\ + (::__interception::uptr)(decltype(REAL(func)))&(func), \ (::__interception::uptr)&TRAMPOLINE(func)) #else #define INTERCEPT_FUNCTION_VER_LINUX_OR_FREEBSD(func, symver) \ ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
[llvm-branch-commits] [compiler-rt] release/19.x: [Sanitizers] Avoid overload ambiguity for interceptors (#100986) (PR #101150)
llvmbot wrote: @llvm/pr-subscribers-compiler-rt-sanitizer Author: None (llvmbot) Changes Backport 155b7a12820ec45095988b6aa6e057afaf2bc892 Requested by: @nikic --- Full diff: https://github.com/llvm/llvm-project/pull/101150.diff 1 Files Affected: - (modified) compiler-rt/lib/interception/interception_linux.h (+9-7) ``diff diff --git a/compiler-rt/lib/interception/interception_linux.h b/compiler-rt/lib/interception/interception_linux.h index 433a3d9bd7fa7..2e01ff44578c3 100644 --- a/compiler-rt/lib/interception/interception_linux.h +++ b/compiler-rt/lib/interception/interception_linux.h @@ -28,12 +28,14 @@ bool InterceptFunction(const char *name, const char *ver, uptr *ptr_to_real, uptr func, uptr trampoline); } // namespace __interception -#define INTERCEPT_FUNCTION_LINUX_OR_FREEBSD(func) \ - ::__interception::InterceptFunction(\ - #func, \ - (::__interception::uptr *)&REAL(func), \ - (::__interception::uptr)&(func),\ - (::__interception::uptr)&TRAMPOLINE(func)) +// Cast func to type of REAL(func) before casting to uptr in case it is an +// overloaded function, which is the case for some glibc functions when +// _FORTIFY_SOURCE is used. This disambiguates which overload to use. +#define INTERCEPT_FUNCTION_LINUX_OR_FREEBSD(func)\ + ::__interception::InterceptFunction( \ + #func, (::__interception::uptr *)&REAL(func), \ + (::__interception::uptr)(decltype(REAL(func)))&(func), \ + (::__interception::uptr) &TRAMPOLINE(func)) // dlvsym is a GNU extension supported by some other platforms. #if SANITIZER_GLIBC || SANITIZER_FREEBSD || SANITIZER_NETBSD @@ -41,7 +43,7 @@ bool InterceptFunction(const char *name, const char *ver, uptr *ptr_to_real, ::__interception::InterceptFunction(\ #func, symver, \ (::__interception::uptr *)&REAL(func), \ - (::__interception::uptr)&(func),\ + (::__interception::uptr)(decltype(REAL(func)))&(func), \ (::__interception::uptr)&TRAMPOLINE(func)) #else #define INTERCEPT_FUNCTION_VER_LINUX_OR_FREEBSD(func, symver) \ `` https://github.com/llvm/llvm-project/pull/101150 ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
[llvm-branch-commits] [llvm] release/19.x: Revert "[MC] Compute fragment offsets eagerly" (PR #101254)
https://github.com/tru updated https://github.com/llvm/llvm-project/pull/101254 >From 03ae9f9fc62b0283505d2d363118b04dd5d947a8 Mon Sep 17 00:00:00 2001 From: Fangrui Song Date: Tue, 30 Jul 2024 14:52:29 -0700 Subject: [PATCH] Revert "[MC] Compute fragment offsets eagerly" This reverts commit 1a47f3f3db66589c11f8ddacfeaecc03fb80c510. Fix #100283 This commit is actually a trigger of other preexisting problems: * Size change of fill fragments does not influence the fixed-point iteration. * The `invalid number of bytes` error is reported too early. Since `.zero A-B` might have temporary negative values in the first few iterations. However, the problems appeared at least "benign" (did not affect the Linux kernel builds) before this commit. (cherry picked from commit 4eb5450f630849ee0518487de38d857fbe5b1aee) --- llvm/include/llvm/MC/MCAsmBackend.h | 5 +- llvm/include/llvm/MC/MCAssembler.h| 4 +- llvm/include/llvm/MC/MCSection.h | 5 ++ llvm/lib/MC/MCAssembler.cpp | 77 +-- llvm/lib/MC/MCSection.cpp | 4 +- .../MCTargetDesc/HexagonAsmBackend.cpp| 4 +- .../Target/X86/MCTargetDesc/X86AsmBackend.cpp | 26 +-- 7 files changed, 71 insertions(+), 54 deletions(-) diff --git a/llvm/include/llvm/MC/MCAsmBackend.h b/llvm/include/llvm/MC/MCAsmBackend.h index d1d1814dd8b52..3f88ac02cd92a 100644 --- a/llvm/include/llvm/MC/MCAsmBackend.h +++ b/llvm/include/llvm/MC/MCAsmBackend.h @@ -217,9 +217,8 @@ class MCAsmBackend { virtual bool writeNopData(raw_ostream &OS, uint64_t Count, const MCSubtargetInfo *STI) const = 0; - // Return true if fragment offsets have been adjusted and an extra layout - // iteration is needed. - virtual bool finishLayout(const MCAssembler &Asm) const { return false; } + /// Give backend an opportunity to finish layout after relaxation + virtual void finishLayout(MCAssembler const &Asm) const {} /// Handle any target-specific assembler flags. By default, do nothing. virtual void handleAssemblerFlag(MCAssemblerFlag Flag) {} diff --git a/llvm/include/llvm/MC/MCAssembler.h b/llvm/include/llvm/MC/MCAssembler.h index d9752912ee66a..c6fa48128d189 100644 --- a/llvm/include/llvm/MC/MCAssembler.h +++ b/llvm/include/llvm/MC/MCAssembler.h @@ -111,7 +111,6 @@ class MCAssembler { /// Check whether the given fragment needs relaxation. bool fragmentNeedsRelaxation(const MCRelaxableFragment *IF) const; - void layoutSection(MCSection &Sec); /// Perform one layout iteration and return true if any offsets /// were adjusted. bool layoutOnce(); @@ -148,9 +147,10 @@ class MCAssembler { uint64_t computeFragmentSize(const MCFragment &F) const; void layoutBundle(MCFragment *Prev, MCFragment *F) const; + void ensureValid(MCSection &Sec) const; // Get the offset of the given fragment inside its containing section. - uint64_t getFragmentOffset(const MCFragment &F) const { return F.Offset; } + uint64_t getFragmentOffset(const MCFragment &F) const; uint64_t getSectionAddressSize(const MCSection &Sec) const; uint64_t getSectionFileSize(const MCSection &Sec) const; diff --git a/llvm/include/llvm/MC/MCSection.h b/llvm/include/llvm/MC/MCSection.h index 1289d6f6f9f65..dcdcd094fa17b 100644 --- a/llvm/include/llvm/MC/MCSection.h +++ b/llvm/include/llvm/MC/MCSection.h @@ -99,6 +99,8 @@ class MCSection { /// Whether this section has had instructions emitted into it. bool HasInstructions : 1; + bool HasLayout : 1; + bool IsRegistered : 1; bool IsText : 1; @@ -167,6 +169,9 @@ class MCSection { bool hasInstructions() const { return HasInstructions; } void setHasInstructions(bool Value) { HasInstructions = Value; } + bool hasLayout() const { return HasLayout; } + void setHasLayout(bool Value) { HasLayout = Value; } + bool isRegistered() const { return IsRegistered; } void setIsRegistered(bool Value) { IsRegistered = Value; } diff --git a/llvm/lib/MC/MCAssembler.cpp b/llvm/lib/MC/MCAssembler.cpp index ceeb7af0fecc4..c3da4bb5cc363 100644 --- a/llvm/lib/MC/MCAssembler.cpp +++ b/llvm/lib/MC/MCAssembler.cpp @@ -432,6 +432,28 @@ void MCAssembler::layoutBundle(MCFragment *Prev, MCFragment *F) const { DF->Offset = EF->Offset; } +void MCAssembler::ensureValid(MCSection &Sec) const { + if (Sec.hasLayout()) +return; + Sec.setHasLayout(true); + MCFragment *Prev = nullptr; + uint64_t Offset = 0; + for (MCFragment &F : Sec) { +F.Offset = Offset; +if (isBundlingEnabled() && F.hasInstructions()) { + layoutBundle(Prev, &F); + Offset = F.Offset; +} +Offset += computeFragmentSize(F); +Prev = &F; + } +} + +uint64_t MCAssembler::getFragmentOffset(const MCFragment &F) const { + ensureValid(*F.getParent()); + return F.Offset; +} + // Simple getSymbolOffset helper for the non-variable case. static bool getLabelOffset(const MCAssembler &Asm, const MCSymbol &S,
[llvm-branch-commits] [llvm] 03ae9f9 - Revert "[MC] Compute fragment offsets eagerly"
Author: Fangrui Song Date: 2024-08-01T09:14:24+02:00 New Revision: 03ae9f9fc62b0283505d2d363118b04dd5d947a8 URL: https://github.com/llvm/llvm-project/commit/03ae9f9fc62b0283505d2d363118b04dd5d947a8 DIFF: https://github.com/llvm/llvm-project/commit/03ae9f9fc62b0283505d2d363118b04dd5d947a8.diff LOG: Revert "[MC] Compute fragment offsets eagerly" This reverts commit 1a47f3f3db66589c11f8ddacfeaecc03fb80c510. Fix #100283 This commit is actually a trigger of other preexisting problems: * Size change of fill fragments does not influence the fixed-point iteration. * The `invalid number of bytes` error is reported too early. Since `.zero A-B` might have temporary negative values in the first few iterations. However, the problems appeared at least "benign" (did not affect the Linux kernel builds) before this commit. (cherry picked from commit 4eb5450f630849ee0518487de38d857fbe5b1aee) Added: Modified: llvm/include/llvm/MC/MCAsmBackend.h llvm/include/llvm/MC/MCAssembler.h llvm/include/llvm/MC/MCSection.h llvm/lib/MC/MCAssembler.cpp llvm/lib/MC/MCSection.cpp llvm/lib/Target/Hexagon/MCTargetDesc/HexagonAsmBackend.cpp llvm/lib/Target/X86/MCTargetDesc/X86AsmBackend.cpp Removed: diff --git a/llvm/include/llvm/MC/MCAsmBackend.h b/llvm/include/llvm/MC/MCAsmBackend.h index d1d1814dd8b52..3f88ac02cd92a 100644 --- a/llvm/include/llvm/MC/MCAsmBackend.h +++ b/llvm/include/llvm/MC/MCAsmBackend.h @@ -217,9 +217,8 @@ class MCAsmBackend { virtual bool writeNopData(raw_ostream &OS, uint64_t Count, const MCSubtargetInfo *STI) const = 0; - // Return true if fragment offsets have been adjusted and an extra layout - // iteration is needed. - virtual bool finishLayout(const MCAssembler &Asm) const { return false; } + /// Give backend an opportunity to finish layout after relaxation + virtual void finishLayout(MCAssembler const &Asm) const {} /// Handle any target-specific assembler flags. By default, do nothing. virtual void handleAssemblerFlag(MCAssemblerFlag Flag) {} diff --git a/llvm/include/llvm/MC/MCAssembler.h b/llvm/include/llvm/MC/MCAssembler.h index d9752912ee66a..c6fa48128d189 100644 --- a/llvm/include/llvm/MC/MCAssembler.h +++ b/llvm/include/llvm/MC/MCAssembler.h @@ -111,7 +111,6 @@ class MCAssembler { /// Check whether the given fragment needs relaxation. bool fragmentNeedsRelaxation(const MCRelaxableFragment *IF) const; - void layoutSection(MCSection &Sec); /// Perform one layout iteration and return true if any offsets /// were adjusted. bool layoutOnce(); @@ -148,9 +147,10 @@ class MCAssembler { uint64_t computeFragmentSize(const MCFragment &F) const; void layoutBundle(MCFragment *Prev, MCFragment *F) const; + void ensureValid(MCSection &Sec) const; // Get the offset of the given fragment inside its containing section. - uint64_t getFragmentOffset(const MCFragment &F) const { return F.Offset; } + uint64_t getFragmentOffset(const MCFragment &F) const; uint64_t getSectionAddressSize(const MCSection &Sec) const; uint64_t getSectionFileSize(const MCSection &Sec) const; diff --git a/llvm/include/llvm/MC/MCSection.h b/llvm/include/llvm/MC/MCSection.h index 1289d6f6f9f65..dcdcd094fa17b 100644 --- a/llvm/include/llvm/MC/MCSection.h +++ b/llvm/include/llvm/MC/MCSection.h @@ -99,6 +99,8 @@ class MCSection { /// Whether this section has had instructions emitted into it. bool HasInstructions : 1; + bool HasLayout : 1; + bool IsRegistered : 1; bool IsText : 1; @@ -167,6 +169,9 @@ class MCSection { bool hasInstructions() const { return HasInstructions; } void setHasInstructions(bool Value) { HasInstructions = Value; } + bool hasLayout() const { return HasLayout; } + void setHasLayout(bool Value) { HasLayout = Value; } + bool isRegistered() const { return IsRegistered; } void setIsRegistered(bool Value) { IsRegistered = Value; } diff --git a/llvm/lib/MC/MCAssembler.cpp b/llvm/lib/MC/MCAssembler.cpp index ceeb7af0fecc4..c3da4bb5cc363 100644 --- a/llvm/lib/MC/MCAssembler.cpp +++ b/llvm/lib/MC/MCAssembler.cpp @@ -432,6 +432,28 @@ void MCAssembler::layoutBundle(MCFragment *Prev, MCFragment *F) const { DF->Offset = EF->Offset; } +void MCAssembler::ensureValid(MCSection &Sec) const { + if (Sec.hasLayout()) +return; + Sec.setHasLayout(true); + MCFragment *Prev = nullptr; + uint64_t Offset = 0; + for (MCFragment &F : Sec) { +F.Offset = Offset; +if (isBundlingEnabled() && F.hasInstructions()) { + layoutBundle(Prev, &F); + Offset = F.Offset; +} +Offset += computeFragmentSize(F); +Prev = &F; + } +} + +uint64_t MCAssembler::getFragmentOffset(const MCFragment &F) const { + ensureValid(*F.getParent()); + return F.Offset; +} + // Simple getSymbolOffset helper for the non-variable case. static bool getLabelOffset(const MCAssembler
[llvm-branch-commits] [compiler-rt] release/19.x: [Sanitizers] Avoid overload ambiguity for interceptors (#100986) (PR #101150)
github-actions[bot] wrote: @nikic (or anyone else). If you would like to add a note about this fix in the release notes (completely optional). Please reply to this comment with a one or two sentence description of the fix. When you are done, please add the release:note label to this PR. https://github.com/llvm/llvm-project/pull/101150 ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
[llvm-branch-commits] [llvm] release/19.x: Revert "[MC] Compute fragment offsets eagerly" (PR #101254)
https://github.com/tru closed https://github.com/llvm/llvm-project/pull/101254 ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
[llvm-branch-commits] [llvm] release/19.x: Revert "[MC] Compute fragment offsets eagerly" (PR #101254)
github-actions[bot] wrote: @MaskRay (or anyone else). If you would like to add a note about this fix in the release notes (completely optional). Please reply to this comment with a one or two sentence description of the fix. When you are done, please add the release:note label to this PR. https://github.com/llvm/llvm-project/pull/101254 ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
[llvm-branch-commits] [compiler-rt] [compiler-rt] [rtsan] Revert openat interceptor that breaks fortify-source builds (PR #100876)
nikic wrote: As https://github.com/llvm/llvm-project/pull/101150 has been merged, this one shouldn't be needed on the release branch anymore. https://github.com/llvm/llvm-project/pull/100876 ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
[llvm-branch-commits] [compiler-rt] Revert "[compiler-rt][RISCV] Implement __init_riscv_feature_bits (#85… (PR #101320)
https://github.com/tru updated https://github.com/llvm/llvm-project/pull/101320 >From b14801954e346a3d2f89f4047f0b0bf457bb0194 Mon Sep 17 00:00:00 2001 From: Piyou Chen Date: Wed, 31 Jul 2024 00:54:03 -0700 Subject: [PATCH] Revert "[compiler-rt][RISCV] Implement __init_riscv_feature_bits (#85790)" This reverts commit a41a4ac78294c728fb70a51623c602ea7f3e308a. --- compiler-rt/lib/builtins/CMakeLists.txt | 1 - compiler-rt/lib/builtins/riscv/feature_bits.c | 298 -- 2 files changed, 299 deletions(-) delete mode 100644 compiler-rt/lib/builtins/riscv/feature_bits.c diff --git a/compiler-rt/lib/builtins/CMakeLists.txt b/compiler-rt/lib/builtins/CMakeLists.txt index 88a5998fd4610..abea8c498f7bd 100644 --- a/compiler-rt/lib/builtins/CMakeLists.txt +++ b/compiler-rt/lib/builtins/CMakeLists.txt @@ -739,7 +739,6 @@ endif() set(powerpc64le_SOURCES ${powerpc64_SOURCES}) set(riscv_SOURCES - riscv/feature_bits.c riscv/fp_mode.c riscv/save.S riscv/restore.S diff --git a/compiler-rt/lib/builtins/riscv/feature_bits.c b/compiler-rt/lib/builtins/riscv/feature_bits.c deleted file mode 100644 index 77422935bd2d3..0 --- a/compiler-rt/lib/builtins/riscv/feature_bits.c +++ /dev/null @@ -1,298 +0,0 @@ -//=== feature_bits.c - Update RISC-V Feature Bits Structure -*- C -*-=// -// -// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions. -// See https://llvm.org/LICENSE.txt for license information. -// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception -// -//===--===// - -#define RISCV_FEATURE_BITS_LENGTH 1 -struct { - unsigned length; - unsigned long long features[RISCV_FEATURE_BITS_LENGTH]; -} __riscv_feature_bits __attribute__((visibility("hidden"), nocommon)); - -#define RISCV_VENDOR_FEATURE_BITS_LENGTH 1 -struct { - unsigned vendorID; - unsigned length; - unsigned long long features[RISCV_VENDOR_FEATURE_BITS_LENGTH]; -} __riscv_vendor_feature_bits __attribute__((visibility("hidden"), nocommon)); - -// NOTE: Should sync-up with RISCVFeatures.td -// TODO: Maybe generate a header from tablegen then include it. -#define A_GROUPID 0 -#define A_BITMASK (1ULL << 0) -#define C_GROUPID 0 -#define C_BITMASK (1ULL << 2) -#define D_GROUPID 0 -#define D_BITMASK (1ULL << 3) -#define F_GROUPID 0 -#define F_BITMASK (1ULL << 5) -#define I_GROUPID 0 -#define I_BITMASK (1ULL << 8) -#define M_GROUPID 0 -#define M_BITMASK (1ULL << 12) -#define V_GROUPID 0 -#define V_BITMASK (1ULL << 21) -#define ZACAS_GROUPID 0 -#define ZACAS_BITMASK (1ULL << 26) -#define ZBA_GROUPID 0 -#define ZBA_BITMASK (1ULL << 27) -#define ZBB_GROUPID 0 -#define ZBB_BITMASK (1ULL << 28) -#define ZBC_GROUPID 0 -#define ZBC_BITMASK (1ULL << 29) -#define ZBKB_GROUPID 0 -#define ZBKB_BITMASK (1ULL << 30) -#define ZBKC_GROUPID 0 -#define ZBKC_BITMASK (1ULL << 31) -#define ZBKX_GROUPID 0 -#define ZBKX_BITMASK (1ULL << 32) -#define ZBS_GROUPID 0 -#define ZBS_BITMASK (1ULL << 33) -#define ZFA_GROUPID 0 -#define ZFA_BITMASK (1ULL << 34) -#define ZFH_GROUPID 0 -#define ZFH_BITMASK (1ULL << 35) -#define ZFHMIN_GROUPID 0 -#define ZFHMIN_BITMASK (1ULL << 36) -#define ZICBOZ_GROUPID 0 -#define ZICBOZ_BITMASK (1ULL << 37) -#define ZICOND_GROUPID 0 -#define ZICOND_BITMASK (1ULL << 38) -#define ZIHINTNTL_GROUPID 0 -#define ZIHINTNTL_BITMASK (1ULL << 39) -#define ZIHINTPAUSE_GROUPID 0 -#define ZIHINTPAUSE_BITMASK (1ULL << 40) -#define ZKND_GROUPID 0 -#define ZKND_BITMASK (1ULL << 41) -#define ZKNE_GROUPID 0 -#define ZKNE_BITMASK (1ULL << 42) -#define ZKNH_GROUPID 0 -#define ZKNH_BITMASK (1ULL << 43) -#define ZKSED_GROUPID 0 -#define ZKSED_BITMASK (1ULL << 44) -#define ZKSH_GROUPID 0 -#define ZKSH_BITMASK (1ULL << 45) -#define ZKT_GROUPID 0 -#define ZKT_BITMASK (1ULL << 46) -#define ZTSO_GROUPID 0 -#define ZTSO_BITMASK (1ULL << 47) -#define ZVBB_GROUPID 0 -#define ZVBB_BITMASK (1ULL << 48) -#define ZVBC_GROUPID 0 -#define ZVBC_BITMASK (1ULL << 49) -#define ZVFH_GROUPID 0 -#define ZVFH_BITMASK (1ULL << 50) -#define ZVFHMIN_GROUPID 0 -#define ZVFHMIN_BITMASK (1ULL << 51) -#define ZVKB_GROUPID 0 -#define ZVKB_BITMASK (1ULL << 52) -#define ZVKG_GROUPID 0 -#define ZVKG_BITMASK (1ULL << 53) -#define ZVKNED_GROUPID 0 -#define ZVKNED_BITMASK (1ULL << 54) -#define ZVKNHA_GROUPID 0 -#define ZVKNHA_BITMASK (1ULL << 55) -#define ZVKNHB_GROUPID 0 -#define ZVKNHB_BITMASK (1ULL << 56) -#define ZVKSED_GROUPID 0 -#define ZVKSED_BITMASK (1ULL << 57) -#define ZVKSH_GROUPID 0 -#define ZVKSH_BITMASK (1ULL << 58) -#define ZVKT_GROUPID 0 -#define ZVKT_BITMASK (1ULL << 59) - -#if defined(__linux__) - -static long syscall_impl_5_args(long number, long arg1, long arg2, long arg3, -long arg4, long arg5) { - register long a7 __asm__("a7") = number; - register long a0 __asm__("a0") = arg1; - register long a1 __asm__("a1") = arg2; - register long a2 __asm__("a2") = arg3; - register long a3 __asm__("a3") = arg4; - regist
[llvm-branch-commits] [compiler-rt] b148019 - Revert "[compiler-rt][RISCV] Implement __init_riscv_feature_bits (#85790)"
Author: Piyou Chen Date: 2024-08-01T09:16:47+02:00 New Revision: b14801954e346a3d2f89f4047f0b0bf457bb0194 URL: https://github.com/llvm/llvm-project/commit/b14801954e346a3d2f89f4047f0b0bf457bb0194 DIFF: https://github.com/llvm/llvm-project/commit/b14801954e346a3d2f89f4047f0b0bf457bb0194.diff LOG: Revert "[compiler-rt][RISCV] Implement __init_riscv_feature_bits (#85790)" This reverts commit a41a4ac78294c728fb70a51623c602ea7f3e308a. Added: Modified: compiler-rt/lib/builtins/CMakeLists.txt Removed: compiler-rt/lib/builtins/riscv/feature_bits.c diff --git a/compiler-rt/lib/builtins/CMakeLists.txt b/compiler-rt/lib/builtins/CMakeLists.txt index 88a5998fd4610..abea8c498f7bd 100644 --- a/compiler-rt/lib/builtins/CMakeLists.txt +++ b/compiler-rt/lib/builtins/CMakeLists.txt @@ -739,7 +739,6 @@ endif() set(powerpc64le_SOURCES ${powerpc64_SOURCES}) set(riscv_SOURCES - riscv/feature_bits.c riscv/fp_mode.c riscv/save.S riscv/restore.S diff --git a/compiler-rt/lib/builtins/riscv/feature_bits.c b/compiler-rt/lib/builtins/riscv/feature_bits.c deleted file mode 100644 index 77422935bd2d3..0 --- a/compiler-rt/lib/builtins/riscv/feature_bits.c +++ /dev/null @@ -1,298 +0,0 @@ -//=== feature_bits.c - Update RISC-V Feature Bits Structure -*- C -*-=// -// -// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions. -// See https://llvm.org/LICENSE.txt for license information. -// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception -// -//===--===// - -#define RISCV_FEATURE_BITS_LENGTH 1 -struct { - unsigned length; - unsigned long long features[RISCV_FEATURE_BITS_LENGTH]; -} __riscv_feature_bits __attribute__((visibility("hidden"), nocommon)); - -#define RISCV_VENDOR_FEATURE_BITS_LENGTH 1 -struct { - unsigned vendorID; - unsigned length; - unsigned long long features[RISCV_VENDOR_FEATURE_BITS_LENGTH]; -} __riscv_vendor_feature_bits __attribute__((visibility("hidden"), nocommon)); - -// NOTE: Should sync-up with RISCVFeatures.td -// TODO: Maybe generate a header from tablegen then include it. -#define A_GROUPID 0 -#define A_BITMASK (1ULL << 0) -#define C_GROUPID 0 -#define C_BITMASK (1ULL << 2) -#define D_GROUPID 0 -#define D_BITMASK (1ULL << 3) -#define F_GROUPID 0 -#define F_BITMASK (1ULL << 5) -#define I_GROUPID 0 -#define I_BITMASK (1ULL << 8) -#define M_GROUPID 0 -#define M_BITMASK (1ULL << 12) -#define V_GROUPID 0 -#define V_BITMASK (1ULL << 21) -#define ZACAS_GROUPID 0 -#define ZACAS_BITMASK (1ULL << 26) -#define ZBA_GROUPID 0 -#define ZBA_BITMASK (1ULL << 27) -#define ZBB_GROUPID 0 -#define ZBB_BITMASK (1ULL << 28) -#define ZBC_GROUPID 0 -#define ZBC_BITMASK (1ULL << 29) -#define ZBKB_GROUPID 0 -#define ZBKB_BITMASK (1ULL << 30) -#define ZBKC_GROUPID 0 -#define ZBKC_BITMASK (1ULL << 31) -#define ZBKX_GROUPID 0 -#define ZBKX_BITMASK (1ULL << 32) -#define ZBS_GROUPID 0 -#define ZBS_BITMASK (1ULL << 33) -#define ZFA_GROUPID 0 -#define ZFA_BITMASK (1ULL << 34) -#define ZFH_GROUPID 0 -#define ZFH_BITMASK (1ULL << 35) -#define ZFHMIN_GROUPID 0 -#define ZFHMIN_BITMASK (1ULL << 36) -#define ZICBOZ_GROUPID 0 -#define ZICBOZ_BITMASK (1ULL << 37) -#define ZICOND_GROUPID 0 -#define ZICOND_BITMASK (1ULL << 38) -#define ZIHINTNTL_GROUPID 0 -#define ZIHINTNTL_BITMASK (1ULL << 39) -#define ZIHINTPAUSE_GROUPID 0 -#define ZIHINTPAUSE_BITMASK (1ULL << 40) -#define ZKND_GROUPID 0 -#define ZKND_BITMASK (1ULL << 41) -#define ZKNE_GROUPID 0 -#define ZKNE_BITMASK (1ULL << 42) -#define ZKNH_GROUPID 0 -#define ZKNH_BITMASK (1ULL << 43) -#define ZKSED_GROUPID 0 -#define ZKSED_BITMASK (1ULL << 44) -#define ZKSH_GROUPID 0 -#define ZKSH_BITMASK (1ULL << 45) -#define ZKT_GROUPID 0 -#define ZKT_BITMASK (1ULL << 46) -#define ZTSO_GROUPID 0 -#define ZTSO_BITMASK (1ULL << 47) -#define ZVBB_GROUPID 0 -#define ZVBB_BITMASK (1ULL << 48) -#define ZVBC_GROUPID 0 -#define ZVBC_BITMASK (1ULL << 49) -#define ZVFH_GROUPID 0 -#define ZVFH_BITMASK (1ULL << 50) -#define ZVFHMIN_GROUPID 0 -#define ZVFHMIN_BITMASK (1ULL << 51) -#define ZVKB_GROUPID 0 -#define ZVKB_BITMASK (1ULL << 52) -#define ZVKG_GROUPID 0 -#define ZVKG_BITMASK (1ULL << 53) -#define ZVKNED_GROUPID 0 -#define ZVKNED_BITMASK (1ULL << 54) -#define ZVKNHA_GROUPID 0 -#define ZVKNHA_BITMASK (1ULL << 55) -#define ZVKNHB_GROUPID 0 -#define ZVKNHB_BITMASK (1ULL << 56) -#define ZVKSED_GROUPID 0 -#define ZVKSED_BITMASK (1ULL << 57) -#define ZVKSH_GROUPID 0 -#define ZVKSH_BITMASK (1ULL << 58) -#define ZVKT_GROUPID 0 -#define ZVKT_BITMASK (1ULL << 59) - -#if defined(__linux__) - -static long syscall_impl_5_args(long number, long arg1, long arg2, long arg3, -long arg4, long arg5) { - register long a7 __asm__("a7") = number; - register long a0 __asm__("a0") = arg1; - register long a1 __asm__("a1") = arg2; - register long a2 __asm__("a2") =
[llvm-branch-commits] [compiler-rt] Revert "[compiler-rt][RISCV] Implement __init_riscv_feature_bits (#85… (PR #101320)
https://github.com/tru closed https://github.com/llvm/llvm-project/pull/101320 ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
[llvm-branch-commits] [compiler-rt] release/19.x: [Sanitizers] Avoid overload ambiguity for interceptors (#100986) (PR #101150)
nikic wrote: Release note: > Fixed compiler-rt rtsan build with glibc 2.40 when `_FORTIFY_SOURCE` is > enabled. https://github.com/llvm/llvm-project/pull/101150 ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
[llvm-branch-commits] [compiler-rt] Revert "[compiler-rt][RISCV] Implement __init_riscv_feature_bits (#85… (PR #101320)
github-actions[bot] wrote: @BeMg (or anyone else). If you would like to add a note about this fix in the release notes (completely optional). Please reply to this comment with a one or two sentence description of the fix. When you are done, please add the release:note label to this PR. https://github.com/llvm/llvm-project/pull/101320 ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
[llvm-branch-commits] [llvm] release/19.x: [NVPTX] Fix DwarfFrameBase construction (#101000) (PR #101145)
nikic wrote: Release note: > Fixed test failures in llvm/test/DebugInfo/NVPTX on 32-bit and big endian > architectures. https://github.com/llvm/llvm-project/pull/101145 ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
[llvm-branch-commits] [clang] release/19.x: Reland: "[Clang] Demote always_inline error to warning for mismatching SME attrs" (#100991) (#100996) (PR #101303)
tru wrote: Is this safe enough to reland? Have it lived without a problem in main for a bit? https://github.com/llvm/llvm-project/pull/101303 ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
[llvm-branch-commits] [libcxx] [libc++] Revert "Use GCC type traits builtins for remove_cv and remove_cvref (#81386)" (PR #101345)
https://github.com/tru updated https://github.com/llvm/llvm-project/pull/101345 >From 0e615206e3b2c5f329cd612c09f3237c6060c06e Mon Sep 17 00:00:00 2001 From: Louis Dionne Date: Wed, 31 Jul 2024 10:40:14 -0400 Subject: [PATCH] [libc++] Revert "Use GCC type traits builtins for remove_cv and remove_cvref (#81386)" This reverts commit 55357160d0e151c32f86e1d6683b4bddbb706aa1. This is only being reverted from the LLVM 19 branch as a convenience to avoid breaking some IDEs which were not ready for that change. Fixes #99464 --- libcxx/include/__type_traits/remove_cv.h| 15 +++ libcxx/include/__type_traits/remove_cvref.h | 15 +-- 2 files changed, 16 insertions(+), 14 deletions(-) diff --git a/libcxx/include/__type_traits/remove_cv.h b/libcxx/include/__type_traits/remove_cv.h index 50e9f3e8aa78d..c4bf612794bd5 100644 --- a/libcxx/include/__type_traits/remove_cv.h +++ b/libcxx/include/__type_traits/remove_cv.h @@ -10,6 +10,8 @@ #define _LIBCPP___TYPE_TRAITS_REMOVE_CV_H #include <__config> +#include <__type_traits/remove_const.h> +#include <__type_traits/remove_volatile.h> #if !defined(_LIBCPP_HAS_NO_PRAGMA_SYSTEM_HEADER) # pragma GCC system_header @@ -17,18 +19,23 @@ _LIBCPP_BEGIN_NAMESPACE_STD +#if __has_builtin(__remove_cv) && !defined(_LIBCPP_COMPILER_GCC) template struct remove_cv { using type _LIBCPP_NODEBUG = __remove_cv(_Tp); }; -#if defined(_LIBCPP_COMPILER_GCC) template -using __remove_cv_t = typename remove_cv<_Tp>::type; +using __remove_cv_t = __remove_cv(_Tp); #else template -using __remove_cv_t = __remove_cv(_Tp); -#endif +struct _LIBCPP_TEMPLATE_VIS remove_cv { + typedef __remove_volatile_t<__remove_const_t<_Tp> > type; +}; + +template +using __remove_cv_t = __remove_volatile_t<__remove_const_t<_Tp> >; +#endif // __has_builtin(__remove_cv) #if _LIBCPP_STD_VER >= 14 template diff --git a/libcxx/include/__type_traits/remove_cvref.h b/libcxx/include/__type_traits/remove_cvref.h index 55f894dbd1d81..e8e8745ab0960 100644 --- a/libcxx/include/__type_traits/remove_cvref.h +++ b/libcxx/include/__type_traits/remove_cvref.h @@ -20,26 +20,21 @@ _LIBCPP_BEGIN_NAMESPACE_STD -#if defined(_LIBCPP_COMPILER_GCC) +#if __has_builtin(__remove_cvref) && !defined(_LIBCPP_COMPILER_GCC) template -struct __remove_cvref_gcc { - using type = __remove_cvref(_Tp); -}; - -template -using __remove_cvref_t _LIBCPP_NODEBUG = typename __remove_cvref_gcc<_Tp>::type; +using __remove_cvref_t _LIBCPP_NODEBUG = __remove_cvref(_Tp); #else template -using __remove_cvref_t _LIBCPP_NODEBUG = __remove_cvref(_Tp); +using __remove_cvref_t _LIBCPP_NODEBUG = __remove_cv_t<__libcpp_remove_reference_t<_Tp> >; #endif // __has_builtin(__remove_cvref) template -using __is_same_uncvref = _IsSame<__remove_cvref_t<_Tp>, __remove_cvref_t<_Up> >; +struct __is_same_uncvref : _IsSame<__remove_cvref_t<_Tp>, __remove_cvref_t<_Up> > {}; #if _LIBCPP_STD_VER >= 20 template struct remove_cvref { - using type _LIBCPP_NODEBUG = __remove_cvref(_Tp); + using type _LIBCPP_NODEBUG = __remove_cvref_t<_Tp>; }; template ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
[llvm-branch-commits] [libcxx] 0e61520 - [libc++] Revert "Use GCC type traits builtins for remove_cv and remove_cvref (#81386)"
Author: Louis Dionne Date: 2024-08-01T09:21:02+02:00 New Revision: 0e615206e3b2c5f329cd612c09f3237c6060c06e URL: https://github.com/llvm/llvm-project/commit/0e615206e3b2c5f329cd612c09f3237c6060c06e DIFF: https://github.com/llvm/llvm-project/commit/0e615206e3b2c5f329cd612c09f3237c6060c06e.diff LOG: [libc++] Revert "Use GCC type traits builtins for remove_cv and remove_cvref (#81386)" This reverts commit 55357160d0e151c32f86e1d6683b4bddbb706aa1. This is only being reverted from the LLVM 19 branch as a convenience to avoid breaking some IDEs which were not ready for that change. Fixes #99464 Added: Modified: libcxx/include/__type_traits/remove_cv.h libcxx/include/__type_traits/remove_cvref.h Removed: diff --git a/libcxx/include/__type_traits/remove_cv.h b/libcxx/include/__type_traits/remove_cv.h index 50e9f3e8aa78d..c4bf612794bd5 100644 --- a/libcxx/include/__type_traits/remove_cv.h +++ b/libcxx/include/__type_traits/remove_cv.h @@ -10,6 +10,8 @@ #define _LIBCPP___TYPE_TRAITS_REMOVE_CV_H #include <__config> +#include <__type_traits/remove_const.h> +#include <__type_traits/remove_volatile.h> #if !defined(_LIBCPP_HAS_NO_PRAGMA_SYSTEM_HEADER) # pragma GCC system_header @@ -17,18 +19,23 @@ _LIBCPP_BEGIN_NAMESPACE_STD +#if __has_builtin(__remove_cv) && !defined(_LIBCPP_COMPILER_GCC) template struct remove_cv { using type _LIBCPP_NODEBUG = __remove_cv(_Tp); }; -#if defined(_LIBCPP_COMPILER_GCC) template -using __remove_cv_t = typename remove_cv<_Tp>::type; +using __remove_cv_t = __remove_cv(_Tp); #else template -using __remove_cv_t = __remove_cv(_Tp); -#endif +struct _LIBCPP_TEMPLATE_VIS remove_cv { + typedef __remove_volatile_t<__remove_const_t<_Tp> > type; +}; + +template +using __remove_cv_t = __remove_volatile_t<__remove_const_t<_Tp> >; +#endif // __has_builtin(__remove_cv) #if _LIBCPP_STD_VER >= 14 template diff --git a/libcxx/include/__type_traits/remove_cvref.h b/libcxx/include/__type_traits/remove_cvref.h index 55f894dbd1d81..e8e8745ab0960 100644 --- a/libcxx/include/__type_traits/remove_cvref.h +++ b/libcxx/include/__type_traits/remove_cvref.h @@ -20,26 +20,21 @@ _LIBCPP_BEGIN_NAMESPACE_STD -#if defined(_LIBCPP_COMPILER_GCC) +#if __has_builtin(__remove_cvref) && !defined(_LIBCPP_COMPILER_GCC) template -struct __remove_cvref_gcc { - using type = __remove_cvref(_Tp); -}; - -template -using __remove_cvref_t _LIBCPP_NODEBUG = typename __remove_cvref_gcc<_Tp>::type; +using __remove_cvref_t _LIBCPP_NODEBUG = __remove_cvref(_Tp); #else template -using __remove_cvref_t _LIBCPP_NODEBUG = __remove_cvref(_Tp); +using __remove_cvref_t _LIBCPP_NODEBUG = __remove_cv_t<__libcpp_remove_reference_t<_Tp> >; #endif // __has_builtin(__remove_cvref) template -using __is_same_uncvref = _IsSame<__remove_cvref_t<_Tp>, __remove_cvref_t<_Up> >; +struct __is_same_uncvref : _IsSame<__remove_cvref_t<_Tp>, __remove_cvref_t<_Up> > {}; #if _LIBCPP_STD_VER >= 20 template struct remove_cvref { - using type _LIBCPP_NODEBUG = __remove_cvref(_Tp); + using type _LIBCPP_NODEBUG = __remove_cvref_t<_Tp>; }; template ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
[llvm-branch-commits] [libcxx] [libc++] Revert "Use GCC type traits builtins for remove_cv and remove_cvref (#81386)" (PR #101345)
https://github.com/tru closed https://github.com/llvm/llvm-project/pull/101345 ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
[llvm-branch-commits] [libcxx] [libc++] Revert "Use GCC type traits builtins for remove_cv and remove_cvref (#81386)" (PR #101345)
github-actions[bot] wrote: @ldionne (or anyone else). If you would like to add a note about this fix in the release notes (completely optional). Please reply to this comment with a one or two sentence description of the fix. When you are done, please add the release:note label to this PR. https://github.com/llvm/llvm-project/pull/101345 ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
[llvm-branch-commits] [llvm] release/19.x: [SLP] Order clustered load base pointers by ascending offsets (#100653) (PR #101033)
tru wrote: Is this still valid to be backported to 19? https://github.com/llvm/llvm-project/pull/101033 ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
[llvm-branch-commits] [compiler-rt] release/19.x: [InstrProf] Remove duplicate definition of IntPtrT (PR #101061)
tru wrote: @minglotus-6 can you review this? https://github.com/llvm/llvm-project/pull/101061 ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
[llvm-branch-commits] [llvm] release/19.x: Revert "[llvm][Bazel] Adapt to 4eb30cfb3474e3770b465cdb39db3b7f6404c3ef" (PR #101102)
tru wrote: Should this still be merged? I can squash it as I merge it if that's what we want. I am unsure if this bazel fix is important for the 19 release? What's the impact of this? https://github.com/llvm/llvm-project/pull/101102 ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
[llvm-branch-commits] [llvm] release/19.x: [RISCV] Use APInt in isSimpleVIDSequence to account for index overflow (#100072) (PR #101124)
tru wrote: @preames should this be backported? Is it a regression fix and what's the impact of this? https://github.com/llvm/llvm-project/pull/101124 ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
[llvm-branch-commits] [compiler-rt] release/19.x: [sanitizer_common][test] Fix SanitizerIoctl/KVM_GET_* tests on Linux/… (#100532) (PR #101136)
tru wrote: @vitalybuka is this safe to backport to 19? https://github.com/llvm/llvm-project/pull/101136 ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
[llvm-branch-commits] [compiler-rt] release/19.x: [sanitizer_common] Don't use syscall(SYS_clone) on Linux/sparc64 (#100534) (PR #101137)
tru wrote: @vitalybuka is this safe to backport to 19? https://github.com/llvm/llvm-project/pull/101137 ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
[llvm-branch-commits] [compiler-rt] release/19.x: [sanitizer_common] Fix signal_line.cpp on SPARC (#100535) (PR #101140)
tru wrote: @vitalybuka is this safe to backport to 19? https://github.com/llvm/llvm-project/pull/101140 ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
[llvm-branch-commits] [compiler-rt] release/19.x: [sanitizer_common] Adjust signal_send.cpp for Linux/sparc64 (#100538) (PR #101141)
tru wrote: @vitalybuka same here ... safe to backport to 19? https://github.com/llvm/llvm-project/pull/101141 ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
[llvm-branch-commits] [llvm] release/19.x: [SLP] Order clustered load base pointers by ascending offsets (#100653) (PR #101033)
davemgreen wrote: Hi - What stage are we in for backports? Is it still OK to get this in if I add #101144 to it? https://github.com/llvm/llvm-project/pull/101033 ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
[llvm-branch-commits] [llvm] release/19.x: [RISCV] Use APInt in isSimpleVIDSequence to account for index overflow (#100072) (PR #101124)
lukel97 wrote: It's a miscompile, but it wasn't a regression since it looks like we've had it since LLVM 16 https://github.com/llvm/llvm-project/pull/101124 ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
[llvm-branch-commits] [llvm] release/19.x: [SLP] Order clustered load base pointers by ascending offsets (#100653) (PR #101033)
tru wrote: Is it a regression/bug fix, finishing something unfinished or something completely new? https://github.com/llvm/llvm-project/pull/101033 ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
[llvm-branch-commits] [llvm] release/19.x: [RISCV] Use APInt in isSimpleVIDSequence to account for index overflow (#100072) (PR #101124)
tru wrote: I am fine with taking a miscompile at this point if @preames agrees and it's fairly safe. https://github.com/llvm/llvm-project/pull/101124 ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
[llvm-branch-commits] [compiler-rt] [sanitizer_common] Fix internal_*stat on Linux/sparc64 (PR #101236)
rorth wrote: I know, that's how I did it for the original PR. However, when it turned out 16e9bb9cd7f50ae2ec7f29a80bc3b95f528bfdbf was necessary too to unbreak the Solaris/sparcv9 build, I added a separate cherry pick just for that one. I guess I should just have closed the original PR and added a new cherry pick referring to both the base commit and the adjustment. https://github.com/llvm/llvm-project/pull/101236 ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
[llvm-branch-commits] [clang] [clang][FMV][AArch64] Improve streaming mode compatibility (PR #101007)
labrinea wrote: > @labrinea (or anyone else). If you would like to add a note about this fix in > the release notes (completely optional). Please reply to this comment with a > one or two sentence description of the fix. When you are done, please add the > release:note label to this PR. Streaming mode compatibilty has been improved for Multiversioned Functions which reside in separate translation units. All the versions of a function must have the same calling convention. If not, the compiler emits diagnostics. https://github.com/llvm/llvm-project/pull/101007 ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
[llvm-branch-commits] [clang] release/19.x: Reland: "[Clang] Demote always_inline error to warning for mismatching SME attrs" (#100991) (#100996) (PR #101303)
sdesmalen-arm wrote: > Is this safe enough to reland? Have it lived without a problem in main for a > bit? Thanks for checking. The only failures I would have expected are from lit tests, but the PR was merged on Monday and I've not seen any buildbot failures, so I believe it is safe. There should also be no impact to code that doesn't explicitly use these target-specific attributes. For the case where the attributes are used, the behaviour seems correct (e.g. when I try some examples in godbolt) and the change is specific and trivial enough not to cause any unrelated/unexpected issues either. I hope that answers your question. https://github.com/llvm/llvm-project/pull/101303 ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
[llvm-branch-commits] [llvm] release/19.x: [VP] Refactor VectorBuilder to avoid layering violation. NFC (#99276) (PR #101102)
https://github.com/chapuni edited https://github.com/llvm/llvm-project/pull/101102 ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
[llvm-branch-commits] [llvm] release/19.x: [VP] Refactor VectorBuilder to avoid layering violation. NFC (#99276) (PR #101102)
https://github.com/chapuni edited https://github.com/llvm/llvm-project/pull/101102 ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
[llvm-branch-commits] [llvm] release/19.x: [VP] Refactor VectorBuilder to avoid layering violation. NFC (#99276) (PR #101102)
chapuni wrote: @tru I didn't know this is editable since this was made automatically. This is not intended for bazel changes, but the `llvm/IR` header change. I supposed and hope individual cherry-picks (aka rebase) but it'd be okay with squashed. Please treat bazel changes as "cosmetic changes". https://github.com/llvm/llvm-project/pull/101102 ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
[llvm-branch-commits] [llvm] [RISCV] Fix vmerge.vvm/vmv.v.v getting folded into ops with mismatching EEW (PR #101464)
https://github.com/lukel97 milestoned https://github.com/llvm/llvm-project/pull/101464 ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
[llvm-branch-commits] [llvm] [RISCV] Fix vmerge.vvm/vmv.v.v getting folded into ops with mismatching EEW (PR #101464)
https://github.com/lukel97 created https://github.com/llvm/llvm-project/pull/101464 This is a backport of #101152 which fixes a miscompile on RISC-V, albeit not a regression. >From 6b7c614ad8a69dfb610ed02da541fb8d3bf009e3 Mon Sep 17 00:00:00 2001 From: Luke Lau Date: Wed, 31 Jul 2024 00:28:52 +0800 Subject: [PATCH] [RISCV] Fix vmerge.vvm/vmv.v.v getting folded into ops with mismatching EEW (#101152) As noted in https://github.com/llvm/llvm-project/pull/100367/files#r1695448771, we currently fold in vmerge.vvms and vmv.v.vs into their ops even if the EEW is different which leads to an incorrect transform. This checks the op's EEW via its simple value type for now since there doesn't seem to be any existing information about the EEW size of instructions. We'll probably need to encode this at some point if we want to be able to access it at the MachineInstr level in #100367 --- llvm/lib/Target/RISCV/RISCVISelDAGToDAG.cpp | 4 llvm/test/CodeGen/RISCV/rvv/combine-vmv.ll| 14 + .../RISCV/rvv/rvv-peephole-vmerge-vops.ll | 21 +++ 3 files changed, 39 insertions(+) diff --git a/llvm/lib/Target/RISCV/RISCVISelDAGToDAG.cpp b/llvm/lib/Target/RISCV/RISCVISelDAGToDAG.cpp index eef6ae677ac85..db949f3476e2b 100644 --- a/llvm/lib/Target/RISCV/RISCVISelDAGToDAG.cpp +++ b/llvm/lib/Target/RISCV/RISCVISelDAGToDAG.cpp @@ -3721,6 +3721,10 @@ bool RISCVDAGToDAGISel::performCombineVMergeAndVOps(SDNode *N) { assert(!Mask || cast(Mask)->getReg() == RISCV::V0); assert(!Glue || Glue.getValueType() == MVT::Glue); + // If the EEW of True is different from vmerge's SEW, then we can't fold. + if (True.getSimpleValueType() != N->getSimpleValueType(0)) +return false; + // We require that either merge and false are the same, or that merge // is undefined. if (Merge != False && !isImplicitDef(Merge)) diff --git a/llvm/test/CodeGen/RISCV/rvv/combine-vmv.ll b/llvm/test/CodeGen/RISCV/rvv/combine-vmv.ll index ec03f773c7108..dfc2b2bdda026 100644 --- a/llvm/test/CodeGen/RISCV/rvv/combine-vmv.ll +++ b/llvm/test/CodeGen/RISCV/rvv/combine-vmv.ll @@ -168,3 +168,17 @@ define @unfoldable_vredsum( %passthru, @llvm.riscv.vmv.v.v.nxv2i32( %passthru, %a, iXLen 1) ret %b } + +define @unfoldable_mismatched_sew( %passthru, %x, %y, iXLen %avl) { +; CHECK-LABEL: unfoldable_mismatched_sew: +; CHECK: # %bb.0: +; CHECK-NEXT:vsetvli zero, a0, e64, m1, ta, ma +; CHECK-NEXT:vadd.vv v9, v9, v10 +; CHECK-NEXT:vsetvli zero, a0, e32, m1, tu, ma +; CHECK-NEXT:vmv.v.v v8, v9 +; CHECK-NEXT:ret + %a = call @llvm.riscv.vadd.nxv1i64.nxv1i64( poison, %x, %y, iXLen %avl) + %a.bitcast = bitcast %a to + %b = call @llvm.riscv.vmv.v.v.nxv2i32( %passthru, %a.bitcast, iXLen %avl) + ret %b +} diff --git a/llvm/test/CodeGen/RISCV/rvv/rvv-peephole-vmerge-vops.ll b/llvm/test/CodeGen/RISCV/rvv/rvv-peephole-vmerge-vops.ll index a08bcae074b9b..259515f160048 100644 --- a/llvm/test/CodeGen/RISCV/rvv/rvv-peephole-vmerge-vops.ll +++ b/llvm/test/CodeGen/RISCV/rvv/rvv-peephole-vmerge-vops.ll @@ -1196,3 +1196,24 @@ define @true_mask_vmerge_implicit_passthru( ) ret %b } + + +define @unfoldable_mismatched_sew( %passthru, %x, %y, %mask, i64 %avl) { +; CHECK-LABEL: unfoldable_mismatched_sew: +; CHECK: # %bb.0: +; CHECK-NEXT:vsetvli zero, a0, e64, m1, ta, ma +; CHECK-NEXT:vadd.vv v9, v9, v10 +; CHECK-NEXT:vsetvli zero, a0, e32, m1, tu, ma +; CHECK-NEXT:vmv.v.v v8, v9 +; CHECK-NEXT:ret + %a = call @llvm.riscv.vadd.nxv1i64.nxv1i64( poison, %x, %y, i64 %avl) + %a.bitcast = bitcast %a to + %b = call @llvm.riscv.vmerge.nxv2i32.nxv2i32( + %passthru, + %passthru, + %a.bitcast, + splat (i1 true), +i64 %avl + ) + ret %b +} ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
[llvm-branch-commits] [llvm] [RISCV] Fix vmerge.vvm/vmv.v.v getting folded into ops with mismatching EEW (PR #101464)
https://github.com/lukel97 edited https://github.com/llvm/llvm-project/pull/101464 ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
[llvm-branch-commits] [llvm] [RISCV] Fix vmerge.vvm/vmv.v.v getting folded into ops with mismatching EEW (PR #101464)
llvmbot wrote: @llvm/pr-subscribers-backend-risc-v Author: Luke Lau (lukel97) Changes This is a backport of #101152 which fixes a miscompile on RISC-V, albeit not a regression. --- Full diff: https://github.com/llvm/llvm-project/pull/101464.diff 3 Files Affected: - (modified) llvm/lib/Target/RISCV/RISCVISelDAGToDAG.cpp (+4) - (modified) llvm/test/CodeGen/RISCV/rvv/combine-vmv.ll (+14) - (modified) llvm/test/CodeGen/RISCV/rvv/rvv-peephole-vmerge-vops.ll (+21) ``diff diff --git a/llvm/lib/Target/RISCV/RISCVISelDAGToDAG.cpp b/llvm/lib/Target/RISCV/RISCVISelDAGToDAG.cpp index eef6ae677ac85..db949f3476e2b 100644 --- a/llvm/lib/Target/RISCV/RISCVISelDAGToDAG.cpp +++ b/llvm/lib/Target/RISCV/RISCVISelDAGToDAG.cpp @@ -3721,6 +3721,10 @@ bool RISCVDAGToDAGISel::performCombineVMergeAndVOps(SDNode *N) { assert(!Mask || cast(Mask)->getReg() == RISCV::V0); assert(!Glue || Glue.getValueType() == MVT::Glue); + // If the EEW of True is different from vmerge's SEW, then we can't fold. + if (True.getSimpleValueType() != N->getSimpleValueType(0)) +return false; + // We require that either merge and false are the same, or that merge // is undefined. if (Merge != False && !isImplicitDef(Merge)) diff --git a/llvm/test/CodeGen/RISCV/rvv/combine-vmv.ll b/llvm/test/CodeGen/RISCV/rvv/combine-vmv.ll index ec03f773c7108..dfc2b2bdda026 100644 --- a/llvm/test/CodeGen/RISCV/rvv/combine-vmv.ll +++ b/llvm/test/CodeGen/RISCV/rvv/combine-vmv.ll @@ -168,3 +168,17 @@ define @unfoldable_vredsum( %passthru, @llvm.riscv.vmv.v.v.nxv2i32( %passthru, %a, iXLen 1) ret %b } + +define @unfoldable_mismatched_sew( %passthru, %x, %y, iXLen %avl) { +; CHECK-LABEL: unfoldable_mismatched_sew: +; CHECK: # %bb.0: +; CHECK-NEXT:vsetvli zero, a0, e64, m1, ta, ma +; CHECK-NEXT:vadd.vv v9, v9, v10 +; CHECK-NEXT:vsetvli zero, a0, e32, m1, tu, ma +; CHECK-NEXT:vmv.v.v v8, v9 +; CHECK-NEXT:ret + %a = call @llvm.riscv.vadd.nxv1i64.nxv1i64( poison, %x, %y, iXLen %avl) + %a.bitcast = bitcast %a to + %b = call @llvm.riscv.vmv.v.v.nxv2i32( %passthru, %a.bitcast, iXLen %avl) + ret %b +} diff --git a/llvm/test/CodeGen/RISCV/rvv/rvv-peephole-vmerge-vops.ll b/llvm/test/CodeGen/RISCV/rvv/rvv-peephole-vmerge-vops.ll index a08bcae074b9b..259515f160048 100644 --- a/llvm/test/CodeGen/RISCV/rvv/rvv-peephole-vmerge-vops.ll +++ b/llvm/test/CodeGen/RISCV/rvv/rvv-peephole-vmerge-vops.ll @@ -1196,3 +1196,24 @@ define @true_mask_vmerge_implicit_passthru( ) ret %b } + + +define @unfoldable_mismatched_sew( %passthru, %x, %y, %mask, i64 %avl) { +; CHECK-LABEL: unfoldable_mismatched_sew: +; CHECK: # %bb.0: +; CHECK-NEXT:vsetvli zero, a0, e64, m1, ta, ma +; CHECK-NEXT:vadd.vv v9, v9, v10 +; CHECK-NEXT:vsetvli zero, a0, e32, m1, tu, ma +; CHECK-NEXT:vmv.v.v v8, v9 +; CHECK-NEXT:ret + %a = call @llvm.riscv.vadd.nxv1i64.nxv1i64( poison, %x, %y, i64 %avl) + %a.bitcast = bitcast %a to + %b = call @llvm.riscv.vmerge.nxv2i32.nxv2i32( + %passthru, + %passthru, + %a.bitcast, + splat (i1 true), +i64 %avl + ) + ret %b +} `` https://github.com/llvm/llvm-project/pull/101464 ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
[llvm-branch-commits] [lldb] release/19.x: [lldb][FreeBSD] Fix NativeRegisterContextFreeBSD_{arm, mips64, powerpc} declarations (#101403) (PR #101465)
https://github.com/llvmbot created https://github.com/llvm/llvm-project/pull/101465 Backport 7088a5ed880f29129ec844c66068e8cb61ca98bf Requested by: @DimitryAndric >From 880f2b5c04979cf0793f65b31e5464ac562d1f02 Mon Sep 17 00:00:00 2001 From: Dimitry Andric Date: Thu, 1 Aug 2024 09:28:29 +0200 Subject: [PATCH] [lldb][FreeBSD] Fix NativeRegisterContextFreeBSD_{arm,mips64,powerpc} declarations (#101403) Similar to #97796, fix the type of the `native_thread` parameter for the arm, mips64 and powerpc variants of `NativeRegisterContextFreeBSD_*`. Otherwise, this leads to compile errors similar to: ``` lldb/source/Plugins/Process/FreeBSD/NativeRegisterContextFreeBSD_powerpc.cpp:85:39: error: out-of-line definition of 'NativeRegisterContextFreeBSD_powerpc' does not match any declaration in 'lldb_private::process_freebsd::NativeRegisterContextFreeBSD_powerpc' 85 | NativeRegisterContextFreeBSD_powerpc::NativeRegisterContextFreeBSD_powerpc( | ^~~~ ``` (cherry picked from commit 7088a5ed880f29129ec844c66068e8cb61ca98bf) --- .../Plugins/Process/FreeBSD/NativeRegisterContextFreeBSD_arm.h | 2 +- .../Process/FreeBSD/NativeRegisterContextFreeBSD_mips64.h | 2 +- .../Process/FreeBSD/NativeRegisterContextFreeBSD_powerpc.h | 2 +- 3 files changed, 3 insertions(+), 3 deletions(-) diff --git a/lldb/source/Plugins/Process/FreeBSD/NativeRegisterContextFreeBSD_arm.h b/lldb/source/Plugins/Process/FreeBSD/NativeRegisterContextFreeBSD_arm.h index 89ffa617294aa..b9537e6952f6c 100644 --- a/lldb/source/Plugins/Process/FreeBSD/NativeRegisterContextFreeBSD_arm.h +++ b/lldb/source/Plugins/Process/FreeBSD/NativeRegisterContextFreeBSD_arm.h @@ -30,7 +30,7 @@ class NativeProcessFreeBSD; class NativeRegisterContextFreeBSD_arm : public NativeRegisterContextFreeBSD { public: NativeRegisterContextFreeBSD_arm(const ArchSpec &target_arch, - NativeThreadProtocol &native_thread); + NativeThreadFreeBSD &native_thread); uint32_t GetRegisterSetCount() const override; diff --git a/lldb/source/Plugins/Process/FreeBSD/NativeRegisterContextFreeBSD_mips64.h b/lldb/source/Plugins/Process/FreeBSD/NativeRegisterContextFreeBSD_mips64.h index 0b4a508a7d5dd..286b4fd8d8b99 100644 --- a/lldb/source/Plugins/Process/FreeBSD/NativeRegisterContextFreeBSD_mips64.h +++ b/lldb/source/Plugins/Process/FreeBSD/NativeRegisterContextFreeBSD_mips64.h @@ -31,7 +31,7 @@ class NativeRegisterContextFreeBSD_mips64 : public NativeRegisterContextFreeBSD { public: NativeRegisterContextFreeBSD_mips64(const ArchSpec &target_arch, - NativeThreadProtocol &native_thread); + NativeThreadFreeBSD &native_thread); uint32_t GetRegisterSetCount() const override; diff --git a/lldb/source/Plugins/Process/FreeBSD/NativeRegisterContextFreeBSD_powerpc.h b/lldb/source/Plugins/Process/FreeBSD/NativeRegisterContextFreeBSD_powerpc.h index 3df371036f915..420db822acc0f 100644 --- a/lldb/source/Plugins/Process/FreeBSD/NativeRegisterContextFreeBSD_powerpc.h +++ b/lldb/source/Plugins/Process/FreeBSD/NativeRegisterContextFreeBSD_powerpc.h @@ -31,7 +31,7 @@ class NativeRegisterContextFreeBSD_powerpc : public NativeRegisterContextFreeBSD { public: NativeRegisterContextFreeBSD_powerpc(const ArchSpec &target_arch, - NativeThreadProtocol &native_thread); + NativeThreadFreeBSD &native_thread); uint32_t GetRegisterSetCount() const override; ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
[llvm-branch-commits] [lldb] release/19.x: [lldb][FreeBSD] Fix NativeRegisterContextFreeBSD_{arm, mips64, powerpc} declarations (#101403) (PR #101465)
https://github.com/llvmbot milestoned https://github.com/llvm/llvm-project/pull/101465 ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
[llvm-branch-commits] [lldb] release/19.x: [lldb][FreeBSD] Fix NativeRegisterContextFreeBSD_{arm, mips64, powerpc} declarations (#101403) (PR #101465)
llvmbot wrote: @emaste What do you think about merging this PR to the release branch? https://github.com/llvm/llvm-project/pull/101465 ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
[llvm-branch-commits] [lldb] release/19.x: [lldb][FreeBSD] Fix NativeRegisterContextFreeBSD_{arm, mips64, powerpc} declarations (#101403) (PR #101465)
https://github.com/llvmbot updated https://github.com/llvm/llvm-project/pull/101465 >From 6a9a4be6d040f7f0e2aae024ebff7555641f85d3 Mon Sep 17 00:00:00 2001 From: Dimitry Andric Date: Thu, 1 Aug 2024 09:28:29 +0200 Subject: [PATCH] [lldb][FreeBSD] Fix NativeRegisterContextFreeBSD_{arm,mips64,powerpc} declarations (#101403) Similar to #97796, fix the type of the `native_thread` parameter for the arm, mips64 and powerpc variants of `NativeRegisterContextFreeBSD_*`. Otherwise, this leads to compile errors similar to: ``` lldb/source/Plugins/Process/FreeBSD/NativeRegisterContextFreeBSD_powerpc.cpp:85:39: error: out-of-line definition of 'NativeRegisterContextFreeBSD_powerpc' does not match any declaration in 'lldb_private::process_freebsd::NativeRegisterContextFreeBSD_powerpc' 85 | NativeRegisterContextFreeBSD_powerpc::NativeRegisterContextFreeBSD_powerpc( | ^~~~ ``` (cherry picked from commit 7088a5ed880f29129ec844c66068e8cb61ca98bf) --- .../Plugins/Process/FreeBSD/NativeRegisterContextFreeBSD_arm.h | 2 +- .../Process/FreeBSD/NativeRegisterContextFreeBSD_mips64.h | 2 +- .../Process/FreeBSD/NativeRegisterContextFreeBSD_powerpc.h | 2 +- 3 files changed, 3 insertions(+), 3 deletions(-) diff --git a/lldb/source/Plugins/Process/FreeBSD/NativeRegisterContextFreeBSD_arm.h b/lldb/source/Plugins/Process/FreeBSD/NativeRegisterContextFreeBSD_arm.h index 89ffa617294aa..b9537e6952f6c 100644 --- a/lldb/source/Plugins/Process/FreeBSD/NativeRegisterContextFreeBSD_arm.h +++ b/lldb/source/Plugins/Process/FreeBSD/NativeRegisterContextFreeBSD_arm.h @@ -30,7 +30,7 @@ class NativeProcessFreeBSD; class NativeRegisterContextFreeBSD_arm : public NativeRegisterContextFreeBSD { public: NativeRegisterContextFreeBSD_arm(const ArchSpec &target_arch, - NativeThreadProtocol &native_thread); + NativeThreadFreeBSD &native_thread); uint32_t GetRegisterSetCount() const override; diff --git a/lldb/source/Plugins/Process/FreeBSD/NativeRegisterContextFreeBSD_mips64.h b/lldb/source/Plugins/Process/FreeBSD/NativeRegisterContextFreeBSD_mips64.h index 0b4a508a7d5dd..286b4fd8d8b99 100644 --- a/lldb/source/Plugins/Process/FreeBSD/NativeRegisterContextFreeBSD_mips64.h +++ b/lldb/source/Plugins/Process/FreeBSD/NativeRegisterContextFreeBSD_mips64.h @@ -31,7 +31,7 @@ class NativeRegisterContextFreeBSD_mips64 : public NativeRegisterContextFreeBSD { public: NativeRegisterContextFreeBSD_mips64(const ArchSpec &target_arch, - NativeThreadProtocol &native_thread); + NativeThreadFreeBSD &native_thread); uint32_t GetRegisterSetCount() const override; diff --git a/lldb/source/Plugins/Process/FreeBSD/NativeRegisterContextFreeBSD_powerpc.h b/lldb/source/Plugins/Process/FreeBSD/NativeRegisterContextFreeBSD_powerpc.h index 3df371036f915..420db822acc0f 100644 --- a/lldb/source/Plugins/Process/FreeBSD/NativeRegisterContextFreeBSD_powerpc.h +++ b/lldb/source/Plugins/Process/FreeBSD/NativeRegisterContextFreeBSD_powerpc.h @@ -31,7 +31,7 @@ class NativeRegisterContextFreeBSD_powerpc : public NativeRegisterContextFreeBSD { public: NativeRegisterContextFreeBSD_powerpc(const ArchSpec &target_arch, - NativeThreadProtocol &native_thread); + NativeThreadFreeBSD &native_thread); uint32_t GetRegisterSetCount() const override; ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
[llvm-branch-commits] [lldb] release/19.x: [lldb][FreeBSD] Fix NativeRegisterContextFreeBSD_{arm, mips64, powerpc} declarations (#101403) (PR #101465)
llvmbot wrote: @llvm/pr-subscribers-lldb Author: None (llvmbot) Changes Backport 7088a5ed880f29129ec844c66068e8cb61ca98bf Requested by: @DimitryAndric --- Full diff: https://github.com/llvm/llvm-project/pull/101465.diff 3 Files Affected: - (modified) lldb/source/Plugins/Process/FreeBSD/NativeRegisterContextFreeBSD_arm.h (+1-1) - (modified) lldb/source/Plugins/Process/FreeBSD/NativeRegisterContextFreeBSD_mips64.h (+1-1) - (modified) lldb/source/Plugins/Process/FreeBSD/NativeRegisterContextFreeBSD_powerpc.h (+1-1) ``diff diff --git a/lldb/source/Plugins/Process/FreeBSD/NativeRegisterContextFreeBSD_arm.h b/lldb/source/Plugins/Process/FreeBSD/NativeRegisterContextFreeBSD_arm.h index 89ffa617294aa..b9537e6952f6c 100644 --- a/lldb/source/Plugins/Process/FreeBSD/NativeRegisterContextFreeBSD_arm.h +++ b/lldb/source/Plugins/Process/FreeBSD/NativeRegisterContextFreeBSD_arm.h @@ -30,7 +30,7 @@ class NativeProcessFreeBSD; class NativeRegisterContextFreeBSD_arm : public NativeRegisterContextFreeBSD { public: NativeRegisterContextFreeBSD_arm(const ArchSpec &target_arch, - NativeThreadProtocol &native_thread); + NativeThreadFreeBSD &native_thread); uint32_t GetRegisterSetCount() const override; diff --git a/lldb/source/Plugins/Process/FreeBSD/NativeRegisterContextFreeBSD_mips64.h b/lldb/source/Plugins/Process/FreeBSD/NativeRegisterContextFreeBSD_mips64.h index 0b4a508a7d5dd..286b4fd8d8b99 100644 --- a/lldb/source/Plugins/Process/FreeBSD/NativeRegisterContextFreeBSD_mips64.h +++ b/lldb/source/Plugins/Process/FreeBSD/NativeRegisterContextFreeBSD_mips64.h @@ -31,7 +31,7 @@ class NativeRegisterContextFreeBSD_mips64 : public NativeRegisterContextFreeBSD { public: NativeRegisterContextFreeBSD_mips64(const ArchSpec &target_arch, - NativeThreadProtocol &native_thread); + NativeThreadFreeBSD &native_thread); uint32_t GetRegisterSetCount() const override; diff --git a/lldb/source/Plugins/Process/FreeBSD/NativeRegisterContextFreeBSD_powerpc.h b/lldb/source/Plugins/Process/FreeBSD/NativeRegisterContextFreeBSD_powerpc.h index 3df371036f915..420db822acc0f 100644 --- a/lldb/source/Plugins/Process/FreeBSD/NativeRegisterContextFreeBSD_powerpc.h +++ b/lldb/source/Plugins/Process/FreeBSD/NativeRegisterContextFreeBSD_powerpc.h @@ -31,7 +31,7 @@ class NativeRegisterContextFreeBSD_powerpc : public NativeRegisterContextFreeBSD { public: NativeRegisterContextFreeBSD_powerpc(const ArchSpec &target_arch, - NativeThreadProtocol &native_thread); + NativeThreadFreeBSD &native_thread); uint32_t GetRegisterSetCount() const override; `` https://github.com/llvm/llvm-project/pull/101465 ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
[llvm-branch-commits] [llvm] release/19.x: [VP] Refactor VectorBuilder to avoid layering violation. NFC (#99276) (PR #101102)
tru wrote: @topperc what do you think about this one? https://github.com/llvm/llvm-project/pull/101102 ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
[llvm-branch-commits] [llvm] release/19.x: [VP] Refactor VectorBuilder to avoid layering violation. NFC (#99276) (PR #101102)
https://github.com/tru updated https://github.com/llvm/llvm-project/pull/101102 >From 0d61c98e83882497e189b8f89d07a85521dafe45 Mon Sep 17 00:00:00 2001 From: NAKAMURA Takumi Date: Sun, 28 Jul 2024 16:48:23 +0900 Subject: [PATCH 1/3] [Bazel] Use PACKAGE_VERSION for version string. This enables "-rc" suffix in release branches. (cherry picked from commit 25efb746d907ce0ffdd9195d191ff0f6944ea3ca) --- utils/bazel/llvm-project-overlay/clang/BUILD.bazel | 6 +++--- utils/bazel/llvm-project-overlay/llvm/config.bzl | 4 ++-- 2 files changed, 5 insertions(+), 5 deletions(-) diff --git a/utils/bazel/llvm-project-overlay/clang/BUILD.bazel b/utils/bazel/llvm-project-overlay/clang/BUILD.bazel index 2d7ce8702a5d9..c50dc174a1def 100644 --- a/utils/bazel/llvm-project-overlay/clang/BUILD.bazel +++ b/utils/bazel/llvm-project-overlay/clang/BUILD.bazel @@ -4,10 +4,10 @@ load( "//:vars.bzl", -"LLVM_VERSION", "LLVM_VERSION_MAJOR", "LLVM_VERSION_MINOR", "LLVM_VERSION_PATCH", +"PACKAGE_VERSION", ) load("//:workspace_root.bzl", "workspace_root") load("//llvm:binary_alias.bzl", "binary_alias") @@ -553,12 +553,12 @@ genrule( "echo '#define CLANG_VERSION_MAJOR_STRING \"{major}\"' >> $@\n" + "echo '#define CLANG_VERSION_MINOR {minor}' >> $@\n" + "echo '#define CLANG_VERSION_PATCHLEVEL {patch}' >> $@\n" + -"echo '#define CLANG_VERSION_STRING \"{vers}git\"' >> $@\n" +"echo '#define CLANG_VERSION_STRING \"{vers}\"' >> $@\n" ).format( major = LLVM_VERSION_MAJOR, minor = LLVM_VERSION_MINOR, patch = LLVM_VERSION_PATCH, -vers = LLVM_VERSION, +vers = PACKAGE_VERSION, ), ) diff --git a/utils/bazel/llvm-project-overlay/llvm/config.bzl b/utils/bazel/llvm-project-overlay/llvm/config.bzl index 2e3bff53ead9d..9de966688eda5 100644 --- a/utils/bazel/llvm-project-overlay/llvm/config.bzl +++ b/utils/bazel/llvm-project-overlay/llvm/config.bzl @@ -6,10 +6,10 @@ load( "//:vars.bzl", -"LLVM_VERSION", "LLVM_VERSION_MAJOR", "LLVM_VERSION_MINOR", "LLVM_VERSION_PATCH", +"PACKAGE_VERSION", ) def native_arch_defines(arch, triple): @@ -108,7 +108,7 @@ llvm_config_defines = os_defines + builtin_thread_pointer + select({ "LLVM_VERSION_MAJOR={}".format(LLVM_VERSION_MAJOR), "LLVM_VERSION_MINOR={}".format(LLVM_VERSION_MINOR), "LLVM_VERSION_PATCH={}".format(LLVM_VERSION_PATCH), -r'LLVM_VERSION_STRING=\"{}git\"'.format(LLVM_VERSION), +r'LLVM_VERSION_STRING=\"{}\"'.format(PACKAGE_VERSION), # These shouldn't be needed by the C++11 standard, but are for some # platforms (e.g. glibc < 2.18. See # https://sourceware.org/bugzilla/show_bug.cgi?id=15366). These are also >From 540426f906fd7b6ef48a3cb2deaafd4c751c1f2d Mon Sep 17 00:00:00 2001 From: Mel Chen Date: Thu, 25 Jul 2024 15:14:39 +0800 Subject: [PATCH 2/3] [VP] Refactor VectorBuilder to avoid layering violation. NFC (#99276) This patch refactors the handling of reduction to eliminate layering violations. * Introduced `getReductionIntrinsicID` in LoopUtils.h for mapping recurrence kinds to llvm.vector.reduce.* intrinsic IDs. * Updated `VectorBuilder::createSimpleTargetReduction` to accept llvm.vector.reduce.* intrinsic directly. * New function `VPIntrinsic::getForIntrinsic` for mapping intrinsic ID to the same functional VP intrinsic ID. (cherry picked from commit 6d12b3f67df429b6e1953d9f55867d7e2469) --- llvm/include/llvm/IR/IntrinsicInst.h | 4 ++ llvm/include/llvm/IR/VectorBuilder.h | 5 +- .../include/llvm/Transforms/Utils/LoopUtils.h | 4 ++ llvm/lib/IR/IntrinsicInst.cpp | 19 +++ llvm/lib/IR/VectorBuilder.cpp | 57 ++- llvm/lib/Transforms/Utils/LoopUtils.cpp | 44 +- llvm/unittests/IR/VPIntrinsicTest.cpp | 53 + 7 files changed, 129 insertions(+), 57 deletions(-) diff --git a/llvm/include/llvm/IR/IntrinsicInst.h b/llvm/include/llvm/IR/IntrinsicInst.h index fe3f92da400f8..94c8fa092f45e 100644 --- a/llvm/include/llvm/IR/IntrinsicInst.h +++ b/llvm/include/llvm/IR/IntrinsicInst.h @@ -569,6 +569,10 @@ class VPIntrinsic : public IntrinsicInst { /// The llvm.vp.* intrinsics for this instruction Opcode static Intrinsic::ID getForOpcode(unsigned OC); + /// The llvm.vp.* intrinsics for this intrinsic ID \p Id. Return \p Id if it + /// is already a VP intrinsic. + static Intrinsic::ID getForIntrinsic(Intrinsic::ID Id); + // Whether \p ID is a VP intrinsic ID. static bool isVPIntrinsic(Intrinsic::ID); diff --git a/llvm/include/llvm/IR/VectorBuilder.h b/llvm/include/llvm/IR/VectorBuilder.h index 6af7f6075551d..dbb9f4c7336d5 100644 --- a/llvm/include/llvm/IR/VectorBuilder.h +++ b/llvm/include/llvm/IR/VectorBuilder.h @@ -15,7 +15,6 @@ #ifndef LLVM_IR_VECTORBUILDER_H #define LLVM_IR_VECTORBUILDER_H -#include #include #include #include @@ -100,11 +99,11 @@ class
[llvm-branch-commits] [libc] [AArch64] - cannot build from release/18.x (PR #101358)
tru wrote: Hi! 18.x release is done and we don't accept any additional patches for that branch. Please test if this works as expected on LLVM 19.x https://github.com/llvm/llvm-project/pull/101358 ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
[llvm-branch-commits] [libc] [AArch64] - cannot build from release/18.x (PR #101358)
https://github.com/tru closed https://github.com/llvm/llvm-project/pull/101358 ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
[llvm-branch-commits] [clang] [llvm] [Clang][OMPX] Add the code generation for multi-dim `num_teams` (PR #101407)
@@ -9576,6 +9576,20 @@ static void genMapInfo(const OMPExecutableDirective &D, CodeGenFunction &CGF, MappedVarSet, CombinedInfo); genMapInfo(MEHandler, CGF, CombinedInfo, OMPBuilder, MappedVarSet); } + +static void emitNumTeamsForBareTargetDirective( +CodeGenFunction &CGF, const OMPExecutableDirective &D, +llvm::SmallVectorImpl &NumTeams) { alexey-bataev wrote: ```suggestion llvm::ArrayRef NumTeams) { ``` https://github.com/llvm/llvm-project/pull/101407 ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
[llvm-branch-commits] [llvm] release/19.x: [SLP] Order clustered load base pointers by ascending offsets (#100653) (PR #101033)
davemgreen wrote: It is fixing a performance regression introduced in #98025 under AArch64. My only worry in it is the sorting algorithm, where there is a chance it isn't quite strict-weak. The regression wasn't huge, lets leave this one and we can pick it up in the next release. It's a bit of a shame to get the decreased performance but it might be better to be careful, considering this was already wrong once. https://github.com/llvm/llvm-project/pull/101033 ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
[llvm-branch-commits] [llvm] release/19.x: [SLP] Order clustered load base pointers by ascending offsets (#100653) (PR #101033)
https://github.com/davemgreen closed https://github.com/llvm/llvm-project/pull/101033 ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
[llvm-branch-commits] [flang] [flang] Lower omp.workshare to other omp constructs (PR #101446)
@@ -0,0 +1,259 @@ +//===- LowerWorkshare.cpp - special cases for bufferization ---===// +// +// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions. +// See https://llvm.org/LICENSE.txt for license information. +// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception +// +//===--===// +// Lower omp workshare construct. +//===--===// + +#include "flang/Optimizer/Dialect/FIROps.h" +#include "flang/Optimizer/Dialect/FIRType.h" +#include "flang/Optimizer/OpenMP/Passes.h" +#include "mlir/Dialect/OpenMP/OpenMPDialect.h" +#include "mlir/IR/BuiltinOps.h" +#include "mlir/IR/IRMapping.h" +#include "mlir/IR/OpDefinition.h" +#include "mlir/IR/PatternMatch.h" +#include "mlir/Support/LLVM.h" +#include "mlir/Transforms/GreedyPatternRewriteDriver.h" +#include "llvm/ADT/STLExtras.h" +#include "llvm/ADT/iterator_range.h" + +#include + +namespace flangomp { +#define GEN_PASS_DEF_LOWERWORKSHARE +#include "flang/Optimizer/OpenMP/Passes.h.inc" +} // namespace flangomp + +#define DEBUG_TYPE "lower-workshare" + +using namespace mlir; + +namespace flangomp { +bool shouldUseWorkshareLowering(Operation *op) { + auto workshare = dyn_cast(op->getParentOp()); + if (!workshare) +return false; + return workshare->getParentOfType(); +} +} // namespace flangomp + +namespace { + +struct SingleRegion { + Block::iterator begin, end; +}; + +static bool isSupportedByFirAlloca(Type ty) { + return !isa(ty); +} + +static bool isSafeToParallelize(Operation *op) { + if (isa(op)) +return true; + + llvm::SmallVector effects; + MemoryEffectOpInterface interface = dyn_cast(op); + if (!interface) { +return false; + } + interface.getEffects(effects); + if (effects.empty()) +return true; + + return false; +} + +/// Lowers workshare to a sequence of single-thread regions and parallel loops +/// +/// For example: +/// +/// omp.workshare { +/// %a = fir.allocmem +/// omp.wsloop {} +/// fir.call Assign %b %a +/// fir.freemem %a +/// } +/// +/// becomes +/// +/// omp.single { +/// %a = fir.allocmem +/// fir.store %a %tmp +/// } +/// %a_reloaded = fir.load %tmp +/// omp.wsloop {} +/// omp.single { +/// fir.call Assign %b %a_reloaded +/// fir.freemem %a_reloaded +/// } +/// +/// Note that we allocate temporary memory for values in omp.single's which need +/// to be accessed in all threads in the closest omp.parallel +/// +/// TODO currently we need to be able to access the encompassing omp.parallel so +/// that we can allocate temporaries accessible by all threads outside of it. +/// In case we do not find it, we fall back to converting the omp.workshare to +/// omp.single. +/// To better handle this we should probably enable yielding values out of an +/// omp.single which will be supported by the omp runtime. tblah wrote: Could you use the copyprivate clause to broadcast values from the omp.single to all other threads? https://github.com/llvm/llvm-project/pull/101446 ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
[llvm-branch-commits] [flang] [flang] Lower omp.workshare to other omp constructs (PR #101446)
https://github.com/tblah edited https://github.com/llvm/llvm-project/pull/101446 ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
[llvm-branch-commits] [flang] [flang] Lower omp.workshare to other omp constructs (PR #101446)
https://github.com/tblah commented: Thank you for your work so far. This is a great start. What is the plan for transforming do loops generated by lowering (e.g. that do not become hlfir.elemental operations and are not generated by hlfir bufferization)? https://github.com/llvm/llvm-project/pull/101446 ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
[llvm-branch-commits] [flang] [flang] Lower omp.workshare to other omp constructs (PR #101446)
@@ -0,0 +1,259 @@ +//===- LowerWorkshare.cpp - special cases for bufferization ---===// +// +// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions. +// See https://llvm.org/LICENSE.txt for license information. +// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception +// +//===--===// +// Lower omp workshare construct. +//===--===// + +#include "flang/Optimizer/Dialect/FIROps.h" +#include "flang/Optimizer/Dialect/FIRType.h" +#include "flang/Optimizer/OpenMP/Passes.h" +#include "mlir/Dialect/OpenMP/OpenMPDialect.h" +#include "mlir/IR/BuiltinOps.h" +#include "mlir/IR/IRMapping.h" +#include "mlir/IR/OpDefinition.h" +#include "mlir/IR/PatternMatch.h" +#include "mlir/Support/LLVM.h" +#include "mlir/Transforms/GreedyPatternRewriteDriver.h" +#include "llvm/ADT/STLExtras.h" +#include "llvm/ADT/iterator_range.h" + +#include + +namespace flangomp { +#define GEN_PASS_DEF_LOWERWORKSHARE +#include "flang/Optimizer/OpenMP/Passes.h.inc" +} // namespace flangomp + +#define DEBUG_TYPE "lower-workshare" + +using namespace mlir; + +namespace flangomp { +bool shouldUseWorkshareLowering(Operation *op) { + auto workshare = dyn_cast(op->getParentOp()); + if (!workshare) +return false; + return workshare->getParentOfType(); +} +} // namespace flangomp + +namespace { + +struct SingleRegion { + Block::iterator begin, end; +}; + +static bool isSupportedByFirAlloca(Type ty) { + return !isa(ty); +} + +static bool isSafeToParallelize(Operation *op) { + if (isa(op)) +return true; + + llvm::SmallVector effects; + MemoryEffectOpInterface interface = dyn_cast(op); + if (!interface) { +return false; + } + interface.getEffects(effects); + if (effects.empty()) +return true; + + return false; +} + +/// Lowers workshare to a sequence of single-thread regions and parallel loops +/// +/// For example: +/// +/// omp.workshare { +/// %a = fir.allocmem +/// omp.wsloop {} +/// fir.call Assign %b %a +/// fir.freemem %a +/// } +/// +/// becomes +/// +/// omp.single { +/// %a = fir.allocmem +/// fir.store %a %tmp +/// } +/// %a_reloaded = fir.load %tmp +/// omp.wsloop {} +/// omp.single { +/// fir.call Assign %b %a_reloaded +/// fir.freemem %a_reloaded +/// } +/// +/// Note that we allocate temporary memory for values in omp.single's which need +/// to be accessed in all threads in the closest omp.parallel +/// +/// TODO currently we need to be able to access the encompassing omp.parallel so +/// that we can allocate temporaries accessible by all threads outside of it. +/// In case we do not find it, we fall back to converting the omp.workshare to +/// omp.single. +/// To better handle this we should probably enable yielding values out of an +/// omp.single which will be supported by the omp runtime. +void lowerWorkshare(mlir::omp::WorkshareOp wsOp) { + assert(wsOp.getRegion().getBlocks().size() == 1); + + Location loc = wsOp->getLoc(); + + omp::ParallelOp parallelOp = wsOp->getParentOfType(); + if (!parallelOp) { +wsOp.emitWarning("cannot handle workshare, converting to single"); +Operation *terminator = wsOp.getRegion().front().getTerminator(); +wsOp->getBlock()->getOperations().splice( +wsOp->getIterator(), wsOp.getRegion().front().getOperations()); +terminator->erase(); +return; + } + + OpBuilder allocBuilder(parallelOp); + OpBuilder rootBuilder(wsOp); + IRMapping rootMapping; + + omp::SingleOp singleOp = nullptr; + + auto mapReloadedValue = [&](Value v, OpBuilder singleBuilder, + IRMapping singleMapping) { +if (auto reloaded = rootMapping.lookupOrNull(v)) + return; +Type llvmPtrTy = LLVM::LLVMPointerType::get(allocBuilder.getContext()); +Type ty = v.getType(); +Value alloc, reloaded; +if (isSupportedByFirAlloca(ty)) { + alloc = allocBuilder.create(loc, ty); + singleBuilder.create(loc, singleMapping.lookup(v), alloc); + reloaded = rootBuilder.create(loc, ty, alloc); +} else { tblah wrote: What types are you seeing used which cannot be supported by fir alloca? https://github.com/llvm/llvm-project/pull/101446 ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
[llvm-branch-commits] [flang] [flang] Lower omp.workshare to other omp constructs (PR #101446)
@@ -0,0 +1,259 @@ +//===- LowerWorkshare.cpp - special cases for bufferization ---===// +// +// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions. +// See https://llvm.org/LICENSE.txt for license information. +// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception +// +//===--===// +// Lower omp workshare construct. +//===--===// + +#include "flang/Optimizer/Dialect/FIROps.h" +#include "flang/Optimizer/Dialect/FIRType.h" +#include "flang/Optimizer/OpenMP/Passes.h" +#include "mlir/Dialect/OpenMP/OpenMPDialect.h" +#include "mlir/IR/BuiltinOps.h" +#include "mlir/IR/IRMapping.h" +#include "mlir/IR/OpDefinition.h" +#include "mlir/IR/PatternMatch.h" +#include "mlir/Support/LLVM.h" +#include "mlir/Transforms/GreedyPatternRewriteDriver.h" +#include "llvm/ADT/STLExtras.h" +#include "llvm/ADT/iterator_range.h" + +#include + +namespace flangomp { +#define GEN_PASS_DEF_LOWERWORKSHARE +#include "flang/Optimizer/OpenMP/Passes.h.inc" +} // namespace flangomp + +#define DEBUG_TYPE "lower-workshare" + +using namespace mlir; + +namespace flangomp { +bool shouldUseWorkshareLowering(Operation *op) { + auto workshare = dyn_cast(op->getParentOp()); + if (!workshare) +return false; + return workshare->getParentOfType(); +} +} // namespace flangomp + +namespace { + +struct SingleRegion { + Block::iterator begin, end; +}; + +static bool isSupportedByFirAlloca(Type ty) { + return !isa(ty); +} + +static bool isSafeToParallelize(Operation *op) { + if (isa(op)) +return true; + + llvm::SmallVector effects; + MemoryEffectOpInterface interface = dyn_cast(op); + if (!interface) { +return false; + } + interface.getEffects(effects); + if (effects.empty()) +return true; + + return false; +} + +/// Lowers workshare to a sequence of single-thread regions and parallel loops +/// +/// For example: +/// +/// omp.workshare { +/// %a = fir.allocmem +/// omp.wsloop {} +/// fir.call Assign %b %a +/// fir.freemem %a +/// } +/// +/// becomes +/// +/// omp.single { +/// %a = fir.allocmem +/// fir.store %a %tmp +/// } +/// %a_reloaded = fir.load %tmp +/// omp.wsloop {} +/// omp.single { +/// fir.call Assign %b %a_reloaded +/// fir.freemem %a_reloaded +/// } +/// +/// Note that we allocate temporary memory for values in omp.single's which need +/// to be accessed in all threads in the closest omp.parallel +/// +/// TODO currently we need to be able to access the encompassing omp.parallel so +/// that we can allocate temporaries accessible by all threads outside of it. +/// In case we do not find it, we fall back to converting the omp.workshare to +/// omp.single. +/// To better handle this we should probably enable yielding values out of an +/// omp.single which will be supported by the omp runtime. +void lowerWorkshare(mlir::omp::WorkshareOp wsOp) { + assert(wsOp.getRegion().getBlocks().size() == 1); + + Location loc = wsOp->getLoc(); + + omp::ParallelOp parallelOp = wsOp->getParentOfType(); + if (!parallelOp) { +wsOp.emitWarning("cannot handle workshare, converting to single"); +Operation *terminator = wsOp.getRegion().front().getTerminator(); +wsOp->getBlock()->getOperations().splice( +wsOp->getIterator(), wsOp.getRegion().front().getOperations()); +terminator->erase(); +return; + } + + OpBuilder allocBuilder(parallelOp); + OpBuilder rootBuilder(wsOp); + IRMapping rootMapping; + + omp::SingleOp singleOp = nullptr; + + auto mapReloadedValue = [&](Value v, OpBuilder singleBuilder, + IRMapping singleMapping) { +if (auto reloaded = rootMapping.lookupOrNull(v)) + return; +Type llvmPtrTy = LLVM::LLVMPointerType::get(allocBuilder.getContext()); +Type ty = v.getType(); +Value alloc, reloaded; +if (isSupportedByFirAlloca(ty)) { + alloc = allocBuilder.create(loc, ty); + singleBuilder.create(loc, singleMapping.lookup(v), alloc); + reloaded = rootBuilder.create(loc, ty, alloc); +} else { + auto one = allocBuilder.create( + loc, allocBuilder.getI32Type(), 1); + alloc = + allocBuilder.create(loc, llvmPtrTy, llvmPtrTy, one); + Value toStore = singleBuilder + .create( + loc, llvmPtrTy, singleMapping.lookup(v)) + .getResult(0); + singleBuilder.create(loc, toStore, alloc); + reloaded = rootBuilder.create(loc, llvmPtrTy, alloc); + reloaded = + rootBuilder.create(loc, ty, reloaded) + .getResult(0); +} +rootMapping.map(v, reloaded); + }; + + auto moveToSingle = [&](SingleRegion sr, OpBuilder singleBuilder) { +IRMapping singleMapping = rootMapping; + +for (Operation &op : llvm::make_range(sr.begin, sr.end)) { + singleBuilder.clone(op, singleMapping); + if (i
[llvm-branch-commits] [flang] [flang] Lower omp.workshare to other omp constructs (PR #101446)
@@ -344,6 +345,7 @@ inline void createHLFIRToFIRPassPipeline( pm.addPass(hlfir::createLowerHLFIRIntrinsics()); pm.addPass(hlfir::createBufferizeHLFIR()); pm.addPass(hlfir::createConvertHLFIRtoFIR()); + pm.addPass(flangomp::createLowerWorkshare()); tblah wrote: The other OpenMP passes are added in `createOpenMPFIRPassPipeline`, which is only called when `-fopenmp` is used. It would be convenient if this new pass could stay with the other OpenMP passes. Currently those passes are run immediately after lowering. There are comments which say they have to be run immediately after lowering, but at a glance it isn't obvious why they couldn't be run here after HLFIR. @agozillon what do you think? https://github.com/llvm/llvm-project/pull/101446 ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
[llvm-branch-commits] [flang] [flang] Lower omp.workshare to other omp constructs (PR #101446)
@@ -0,0 +1,18 @@ +//===-- Passes.td - HLFIR pass definition file -*- tablegen -*-===// +// +// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions. +// See https://llvm.org/LICENSE.txt for license information. +// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception +// +//===--===// + +#ifndef FORTRAN_DIALECT_OPENMP_PASSES +#define FORTRAN_DIALECT_OPENMP_PASSES + +include "mlir/Pass/PassBase.td" + +def LowerWorkshare : Pass<"lower-workshare"> { tblah wrote: This pass doesn't have an operation type associated with it and so `pm.addPass` will run it on every operation in the module (and the module itself). I think we can only get workshare operations inside of functions so maybe this should be run on func.func. Maybe for more future proofing you could run it on all top level operations (e.g. `addNestedPassToAllTopLevelOperations` instead of `pm.addPass`). I think the pass has to be run on the parent of the workshare loop not on the workshare loop operation itself because operations are inserted and removed from that parent. https://github.com/llvm/llvm-project/pull/101446 ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
[llvm-branch-commits] [flang] [flang] Lower omp.workshare to other omp constructs (PR #101446)
@@ -0,0 +1,259 @@ +//===- LowerWorkshare.cpp - special cases for bufferization ---===// +// +// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions. +// See https://llvm.org/LICENSE.txt for license information. +// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception +// +//===--===// +// Lower omp workshare construct. +//===--===// + +#include "flang/Optimizer/Dialect/FIROps.h" +#include "flang/Optimizer/Dialect/FIRType.h" +#include "flang/Optimizer/OpenMP/Passes.h" +#include "mlir/Dialect/OpenMP/OpenMPDialect.h" +#include "mlir/IR/BuiltinOps.h" +#include "mlir/IR/IRMapping.h" +#include "mlir/IR/OpDefinition.h" +#include "mlir/IR/PatternMatch.h" +#include "mlir/Support/LLVM.h" +#include "mlir/Transforms/GreedyPatternRewriteDriver.h" +#include "llvm/ADT/STLExtras.h" +#include "llvm/ADT/iterator_range.h" + +#include + +namespace flangomp { +#define GEN_PASS_DEF_LOWERWORKSHARE +#include "flang/Optimizer/OpenMP/Passes.h.inc" +} // namespace flangomp + +#define DEBUG_TYPE "lower-workshare" + +using namespace mlir; + +namespace flangomp { +bool shouldUseWorkshareLowering(Operation *op) { + auto workshare = dyn_cast(op->getParentOp()); + if (!workshare) +return false; + return workshare->getParentOfType(); +} +} // namespace flangomp + +namespace { + +struct SingleRegion { + Block::iterator begin, end; +}; + +static bool isSupportedByFirAlloca(Type ty) { + return !isa(ty); +} + +static bool isSafeToParallelize(Operation *op) { + if (isa(op)) +return true; + + llvm::SmallVector effects; + MemoryEffectOpInterface interface = dyn_cast(op); + if (!interface) { +return false; + } + interface.getEffects(effects); + if (effects.empty()) +return true; + + return false; +} + +/// Lowers workshare to a sequence of single-thread regions and parallel loops +/// +/// For example: +/// +/// omp.workshare { +/// %a = fir.allocmem +/// omp.wsloop {} +/// fir.call Assign %b %a +/// fir.freemem %a +/// } +/// +/// becomes +/// +/// omp.single { +/// %a = fir.allocmem +/// fir.store %a %tmp +/// } +/// %a_reloaded = fir.load %tmp +/// omp.wsloop {} +/// omp.single { +/// fir.call Assign %b %a_reloaded +/// fir.freemem %a_reloaded +/// } +/// +/// Note that we allocate temporary memory for values in omp.single's which need +/// to be accessed in all threads in the closest omp.parallel +/// +/// TODO currently we need to be able to access the encompassing omp.parallel so +/// that we can allocate temporaries accessible by all threads outside of it. +/// In case we do not find it, we fall back to converting the omp.workshare to +/// omp.single. +/// To better handle this we should probably enable yielding values out of an +/// omp.single which will be supported by the omp runtime. +void lowerWorkshare(mlir::omp::WorkshareOp wsOp) { + assert(wsOp.getRegion().getBlocks().size() == 1); + + Location loc = wsOp->getLoc(); + + omp::ParallelOp parallelOp = wsOp->getParentOfType(); + if (!parallelOp) { +wsOp.emitWarning("cannot handle workshare, converting to single"); +Operation *terminator = wsOp.getRegion().front().getTerminator(); +wsOp->getBlock()->getOperations().splice( +wsOp->getIterator(), wsOp.getRegion().front().getOperations()); +terminator->erase(); +return; + } + + OpBuilder allocBuilder(parallelOp); + OpBuilder rootBuilder(wsOp); + IRMapping rootMapping; + + omp::SingleOp singleOp = nullptr; + + auto mapReloadedValue = [&](Value v, OpBuilder singleBuilder, + IRMapping singleMapping) { +if (auto reloaded = rootMapping.lookupOrNull(v)) + return; +Type llvmPtrTy = LLVM::LLVMPointerType::get(allocBuilder.getContext()); +Type ty = v.getType(); +Value alloc, reloaded; +if (isSupportedByFirAlloca(ty)) { + alloc = allocBuilder.create(loc, ty); + singleBuilder.create(loc, singleMapping.lookup(v), alloc); tblah wrote: What about more complicated types e.g. an allocatable array? If you only shallow copy the descriptor won't every thread in the parallel region try to free the same allocation? https://github.com/llvm/llvm-project/pull/101446 ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
[llvm-branch-commits] [flang] [flang] Lower omp.workshare to other omp constructs (PR #101446)
@@ -2,3 +2,4 @@ add_subdirectory(CodeGen) add_subdirectory(Dialect) add_subdirectory(HLFIR) add_subdirectory(Transforms) +add_subdirectory(OpenMP) tblah wrote: There are some other OpenMP passes already in `flang/lib/Optimizer/Transforms/OMP*.cpp`. I prefer creating a separate subdirectory as you have done here. Please could you move the other passes here too. https://github.com/llvm/llvm-project/pull/101446 ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
[llvm-branch-commits] [flang] [flang] Lower omp.workshare to other omp constructs (PR #101446)
@@ -792,7 +793,8 @@ struct ElementalOpConversion // Generate a loop nest looping around the fir.elemental shape and clone // fir.elemental region inside the inner loop. hlfir::LoopNest loopNest = -hlfir::genLoopNest(loc, builder, extents, !elemental.isOrdered()); +hlfir::genLoopNest(loc, builder, extents, !elemental.isOrdered(), + flangomp::shouldUseWorkshareLowering(elemental)); tblah wrote: I'm a bit worried about hlfir.elementals with the isOrdered flag being lowered as workshare loops. This could happen for example for elemental calls of impure functions. Does the standard say how these should be handled? https://github.com/llvm/llvm-project/pull/101446 ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
[llvm-branch-commits] [flang] [flang] Lower omp.workshare to other omp constructs (PR #101446)
@@ -0,0 +1,259 @@ +//===- LowerWorkshare.cpp - special cases for bufferization ---===// +// +// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions. +// See https://llvm.org/LICENSE.txt for license information. +// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception +// +//===--===// +// Lower omp workshare construct. +//===--===// + +#include "flang/Optimizer/Dialect/FIROps.h" +#include "flang/Optimizer/Dialect/FIRType.h" +#include "flang/Optimizer/OpenMP/Passes.h" +#include "mlir/Dialect/OpenMP/OpenMPDialect.h" +#include "mlir/IR/BuiltinOps.h" +#include "mlir/IR/IRMapping.h" +#include "mlir/IR/OpDefinition.h" +#include "mlir/IR/PatternMatch.h" +#include "mlir/Support/LLVM.h" +#include "mlir/Transforms/GreedyPatternRewriteDriver.h" +#include "llvm/ADT/STLExtras.h" +#include "llvm/ADT/iterator_range.h" + +#include + +namespace flangomp { +#define GEN_PASS_DEF_LOWERWORKSHARE +#include "flang/Optimizer/OpenMP/Passes.h.inc" +} // namespace flangomp + +#define DEBUG_TYPE "lower-workshare" + +using namespace mlir; + +namespace flangomp { +bool shouldUseWorkshareLowering(Operation *op) { + auto workshare = dyn_cast(op->getParentOp()); tblah wrote: why does the workshare op have to be the immediate parent? Couldn't there be another operation in between (e.g a `fir.if`?) https://github.com/llvm/llvm-project/pull/101446 ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
[llvm-branch-commits] [flang] [flang] Lower omp.workshare to other omp constructs (PR #101446)
@@ -0,0 +1,259 @@ +//===- LowerWorkshare.cpp - special cases for bufferization ---===// +// +// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions. +// See https://llvm.org/LICENSE.txt for license information. +// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception +// +//===--===// +// Lower omp workshare construct. +//===--===// + +#include "flang/Optimizer/Dialect/FIROps.h" +#include "flang/Optimizer/Dialect/FIRType.h" +#include "flang/Optimizer/OpenMP/Passes.h" +#include "mlir/Dialect/OpenMP/OpenMPDialect.h" +#include "mlir/IR/BuiltinOps.h" +#include "mlir/IR/IRMapping.h" +#include "mlir/IR/OpDefinition.h" +#include "mlir/IR/PatternMatch.h" +#include "mlir/Support/LLVM.h" +#include "mlir/Transforms/GreedyPatternRewriteDriver.h" +#include "llvm/ADT/STLExtras.h" +#include "llvm/ADT/iterator_range.h" + +#include + +namespace flangomp { +#define GEN_PASS_DEF_LOWERWORKSHARE +#include "flang/Optimizer/OpenMP/Passes.h.inc" +} // namespace flangomp + +#define DEBUG_TYPE "lower-workshare" + +using namespace mlir; + +namespace flangomp { +bool shouldUseWorkshareLowering(Operation *op) { + auto workshare = dyn_cast(op->getParentOp()); + if (!workshare) +return false; + return workshare->getParentOfType(); +} +} // namespace flangomp + +namespace { + +struct SingleRegion { + Block::iterator begin, end; +}; + +static bool isSupportedByFirAlloca(Type ty) { + return !isa(ty); +} + +static bool isSafeToParallelize(Operation *op) { + if (isa(op)) +return true; + + llvm::SmallVector effects; + MemoryEffectOpInterface interface = dyn_cast(op); + if (!interface) { +return false; + } + interface.getEffects(effects); + if (effects.empty()) +return true; + + return false; +} + +/// Lowers workshare to a sequence of single-thread regions and parallel loops +/// +/// For example: +/// +/// omp.workshare { +/// %a = fir.allocmem +/// omp.wsloop {} +/// fir.call Assign %b %a +/// fir.freemem %a +/// } +/// +/// becomes +/// +/// omp.single { +/// %a = fir.allocmem +/// fir.store %a %tmp +/// } +/// %a_reloaded = fir.load %tmp +/// omp.wsloop {} +/// omp.single { +/// fir.call Assign %b %a_reloaded +/// fir.freemem %a_reloaded +/// } +/// +/// Note that we allocate temporary memory for values in omp.single's which need +/// to be accessed in all threads in the closest omp.parallel +/// +/// TODO currently we need to be able to access the encompassing omp.parallel so +/// that we can allocate temporaries accessible by all threads outside of it. +/// In case we do not find it, we fall back to converting the omp.workshare to +/// omp.single. +/// To better handle this we should probably enable yielding values out of an +/// omp.single which will be supported by the omp runtime. +void lowerWorkshare(mlir::omp::WorkshareOp wsOp) { + assert(wsOp.getRegion().getBlocks().size() == 1); + + Location loc = wsOp->getLoc(); + + omp::ParallelOp parallelOp = wsOp->getParentOfType(); + if (!parallelOp) { +wsOp.emitWarning("cannot handle workshare, converting to single"); +Operation *terminator = wsOp.getRegion().front().getTerminator(); +wsOp->getBlock()->getOperations().splice( +wsOp->getIterator(), wsOp.getRegion().front().getOperations()); +terminator->erase(); +return; + } + + OpBuilder allocBuilder(parallelOp); + OpBuilder rootBuilder(wsOp); + IRMapping rootMapping; + + omp::SingleOp singleOp = nullptr; + + auto mapReloadedValue = [&](Value v, OpBuilder singleBuilder, + IRMapping singleMapping) { +if (auto reloaded = rootMapping.lookupOrNull(v)) + return; +Type llvmPtrTy = LLVM::LLVMPointerType::get(allocBuilder.getContext()); +Type ty = v.getType(); +Value alloc, reloaded; +if (isSupportedByFirAlloca(ty)) { + alloc = allocBuilder.create(loc, ty); + singleBuilder.create(loc, singleMapping.lookup(v), alloc); + reloaded = rootBuilder.create(loc, ty, alloc); +} else { + auto one = allocBuilder.create( + loc, allocBuilder.getI32Type(), 1); + alloc = + allocBuilder.create(loc, llvmPtrTy, llvmPtrTy, one); + Value toStore = singleBuilder + .create( + loc, llvmPtrTy, singleMapping.lookup(v)) + .getResult(0); + singleBuilder.create(loc, toStore, alloc); + reloaded = rootBuilder.create(loc, llvmPtrTy, alloc); + reloaded = + rootBuilder.create(loc, ty, reloaded) + .getResult(0); +} +rootMapping.map(v, reloaded); + }; + + auto moveToSingle = [&](SingleRegion sr, OpBuilder singleBuilder) { +IRMapping singleMapping = rootMapping; + +for (Operation &op : llvm::make_range(sr.begin, sr.end)) { + singleBuilder.clone(op, singleMapping); + if (i
[llvm-branch-commits] [flang] [llvm] [mlir] [MLIR][OpenMP][OMPIRBuilder] Add lowering support for omp.target_triples (PR #100156)
@@ -7053,13 +7053,30 @@ OpenMPIRBuilder::InsertPointTy OpenMPIRBuilder::emitTargetTask( << "\n"); return Builder.saveIP(); } + static void emitTargetCall( OpenMPIRBuilder &OMPBuilder, IRBuilderBase &Builder, OpenMPIRBuilder::InsertPointTy AllocaIP, Function *OutlinedFn, Constant *OutlinedFnID, int32_t NumTeams, int32_t NumThreads, SmallVectorImpl &Args, OpenMPIRBuilder::GenMapInfoCallbackTy GenMapInfoCB, SmallVector Dependencies = {}) { + // Generate a function call to the host fallback implementation of the target + // region. This is called by the host when no offload entry was generated for + // the target region and when the offloading call fails at runtime. + auto &&EmitTargetCallFallbackCB = skatrak wrote: Thank you for pointing out this issue. I just pushed some changes to hopefully address it, though I think these should get a second review from you since you're more familiar with the handling of `target depend`. https://github.com/llvm/llvm-project/pull/100156 ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
[llvm-branch-commits] [clang] [Serialization] Code cleanups and polish 83233 (PR #83237)
alexfh wrote: > bool operator==(const LazySpecializationInfo &Other) Making this `operator==` `const` helps with the warning in C++20. And it's the right thing to do anyway. https://github.com/llvm/llvm-project/pull/83237 ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
[llvm-branch-commits] [lldb] release/19.x: [lldb][FreeBSD] Fix NativeRegisterContextFreeBSD_{arm, mips64, powerpc} declarations (#101403) (PR #101465)
tru wrote: @emaste @JDevlieghere what do you think? safe for backport? (it sounds that way to me). https://github.com/llvm/llvm-project/pull/101465 ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
[llvm-branch-commits] [llvm] AMDGPU: Handle new atomicrmw metadata for fadd case (PR #96760)
arsenm wrote: ping https://github.com/llvm/llvm-project/pull/96760 ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
[llvm-branch-commits] [clang] [llvm] clang/AMDGPU: Emit atomicrmw for __builtin_amdgcn_global_atomic_fadd_{f32|f64} (PR #96872)
arsenm wrote: ping https://github.com/llvm/llvm-project/pull/96872 ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
[llvm-branch-commits] [clang] clang/AMDGPU: Emit atomicrmw from {global|flat}_atomic_fadd_v2f16 builtins (PR #96873)
arsenm wrote: ping https://github.com/llvm/llvm-project/pull/96873 ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
[llvm-branch-commits] [clang] clang/AMDGPU: Emit atomicrmw from flat_atomic_{f32|f64} builtins (PR #96874)
arsenm wrote: ping https://github.com/llvm/llvm-project/pull/96874 ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
[llvm-branch-commits] [clang] clang/AMDGPU: Emit atomicrmw for __builtin_amdgcn_global_atomic_fadd_{f32|f64} (PR #96872)
https://github.com/arsenm updated https://github.com/llvm/llvm-project/pull/96872 >From 2e27b153cf40498f64ef9f13b69e80804c45a6a4 Mon Sep 17 00:00:00 2001 From: Matt Arsenault Date: Tue, 11 Jun 2024 10:58:44 +0200 Subject: [PATCH 1/2] clang/AMDGPU: Emit atomicrmw for __builtin_amdgcn_global_atomic_fadd_{f32|f64} Need to emit syncscope and new metadata to get the native instruction, most of the time. --- clang/lib/CodeGen/CGBuiltin.cpp | 39 +-- .../CodeGenOpenCL/builtins-amdgcn-gfx11.cl| 2 +- .../builtins-fp-atomics-gfx12.cl | 4 +- .../builtins-fp-atomics-gfx90a.cl | 4 +- .../builtins-fp-atomics-gfx940.cl | 4 +- 5 files changed, 34 insertions(+), 19 deletions(-) diff --git a/clang/lib/CodeGen/CGBuiltin.cpp b/clang/lib/CodeGen/CGBuiltin.cpp index 0c2ee446aa303..02f85f340893d 100644 --- a/clang/lib/CodeGen/CGBuiltin.cpp +++ b/clang/lib/CodeGen/CGBuiltin.cpp @@ -58,6 +58,7 @@ #include "llvm/IR/MDBuilder.h" #include "llvm/IR/MatrixBuilder.h" #include "llvm/IR/MemoryModelRelaxationAnnotations.h" +#include "llvm/Support/AMDGPUAddrSpace.h" #include "llvm/Support/ConvertUTF.h" #include "llvm/Support/MathExtras.h" #include "llvm/Support/ScopedPrinter.h" @@ -18776,8 +18777,6 @@ Value *CodeGenFunction::EmitAMDGPUBuiltinExpr(unsigned BuiltinID, Function *F = CGM.getIntrinsic(Intrin, { Src0->getType() }); return Builder.CreateCall(F, { Src0, Builder.getFalse() }); } - case AMDGPU::BI__builtin_amdgcn_global_atomic_fadd_f64: - case AMDGPU::BI__builtin_amdgcn_global_atomic_fadd_f32: case AMDGPU::BI__builtin_amdgcn_global_atomic_fadd_v2f16: case AMDGPU::BI__builtin_amdgcn_global_atomic_fmin_f64: case AMDGPU::BI__builtin_amdgcn_global_atomic_fmax_f64: @@ -18789,18 +18788,11 @@ Value *CodeGenFunction::EmitAMDGPUBuiltinExpr(unsigned BuiltinID, Intrinsic::ID IID; llvm::Type *ArgTy = llvm::Type::getDoubleTy(getLLVMContext()); switch (BuiltinID) { -case AMDGPU::BI__builtin_amdgcn_global_atomic_fadd_f32: - ArgTy = llvm::Type::getFloatTy(getLLVMContext()); - IID = Intrinsic::amdgcn_global_atomic_fadd; - break; case AMDGPU::BI__builtin_amdgcn_global_atomic_fadd_v2f16: ArgTy = llvm::FixedVectorType::get( llvm::Type::getHalfTy(getLLVMContext()), 2); IID = Intrinsic::amdgcn_global_atomic_fadd; break; -case AMDGPU::BI__builtin_amdgcn_global_atomic_fadd_f64: - IID = Intrinsic::amdgcn_global_atomic_fadd; - break; case AMDGPU::BI__builtin_amdgcn_global_atomic_fmin_f64: IID = Intrinsic::amdgcn_global_atomic_fmin; break; @@ -19223,7 +19215,9 @@ Value *CodeGenFunction::EmitAMDGPUBuiltinExpr(unsigned BuiltinID, case AMDGPU::BI__builtin_amdgcn_ds_atomic_fadd_v2bf16: case AMDGPU::BI__builtin_amdgcn_ds_faddf: case AMDGPU::BI__builtin_amdgcn_ds_fminf: - case AMDGPU::BI__builtin_amdgcn_ds_fmaxf: { + case AMDGPU::BI__builtin_amdgcn_ds_fmaxf: + case AMDGPU::BI__builtin_amdgcn_global_atomic_fadd_f32: + case AMDGPU::BI__builtin_amdgcn_global_atomic_fadd_f64: { llvm::AtomicRMWInst::BinOp BinOp; switch (BuiltinID) { case AMDGPU::BI__builtin_amdgcn_atomic_inc32: @@ -19239,6 +19233,8 @@ Value *CodeGenFunction::EmitAMDGPUBuiltinExpr(unsigned BuiltinID, case AMDGPU::BI__builtin_amdgcn_ds_atomic_fadd_f32: case AMDGPU::BI__builtin_amdgcn_ds_atomic_fadd_v2f16: case AMDGPU::BI__builtin_amdgcn_ds_atomic_fadd_v2bf16: +case AMDGPU::BI__builtin_amdgcn_global_atomic_fadd_f32: +case AMDGPU::BI__builtin_amdgcn_global_atomic_fadd_f64: BinOp = llvm::AtomicRMWInst::FAdd; break; case AMDGPU::BI__builtin_amdgcn_ds_fminf: @@ -19273,8 +19269,13 @@ Value *CodeGenFunction::EmitAMDGPUBuiltinExpr(unsigned BuiltinID, ProcessOrderScopeAMDGCN(EmitScalarExpr(E->getArg(2)), EmitScalarExpr(E->getArg(3)), AO, SSID); } else { - // The ds_atomic_fadd_* builtins do not have syncscope/order arguments. - SSID = llvm::SyncScope::System; + // Most of the builtins do not have syncscope/order arguments. For DS + // atomics the scope doesn't really matter, as they implicitly operate at + // workgroup scope. + // + // The global/flat cases need to use agent scope to consistently produce + // the native instruction instead of a cmpxchg expansion. + SSID = getLLVMContext().getOrInsertSyncScopeID("agent"); AO = AtomicOrdering::SequentiallyConsistent; // The v2bf16 builtin uses i16 instead of a natural bfloat type. @@ -19289,6 +19290,20 @@ Value *CodeGenFunction::EmitAMDGPUBuiltinExpr(unsigned BuiltinID, Builder.CreateAtomicRMW(BinOp, Ptr, Val, AO, SSID); if (Volatile) RMW->setVolatile(true); + +unsigned AddrSpace = Ptr.getType()->getAddressSpace(); +if (AddrSpace != llvm::AMDGPUAS::LOCAL_ADDRESS) { + // Most targets require "amdgpu.no.fine.grained.memory" to emit the nativ
[llvm-branch-commits] [clang] clang/AMDGPU: Emit atomicrmw from flat_atomic_{f32|f64} builtins (PR #96874)
https://github.com/arsenm updated https://github.com/llvm/llvm-project/pull/96874 >From c8a9e8de2d0faf678ab8d67c85c4efd8312d5d10 Mon Sep 17 00:00:00 2001 From: Matt Arsenault Date: Wed, 26 Jun 2024 19:15:26 +0200 Subject: [PATCH] clang/AMDGPU: Emit atomicrmw from flat_atomic_{f32|f64} builtins --- clang/lib/CodeGen/CGBuiltin.cpp | 17 ++--- .../CodeGenOpenCL/builtins-fp-atomics-gfx90a.cl | 6 -- .../CodeGenOpenCL/builtins-fp-atomics-gfx940.cl | 3 ++- 3 files changed, 12 insertions(+), 14 deletions(-) diff --git a/clang/lib/CodeGen/CGBuiltin.cpp b/clang/lib/CodeGen/CGBuiltin.cpp index ef4bd9fb4af09..c19a80921beaf 100644 --- a/clang/lib/CodeGen/CGBuiltin.cpp +++ b/clang/lib/CodeGen/CGBuiltin.cpp @@ -18779,10 +18779,8 @@ Value *CodeGenFunction::EmitAMDGPUBuiltinExpr(unsigned BuiltinID, } case AMDGPU::BI__builtin_amdgcn_global_atomic_fmin_f64: case AMDGPU::BI__builtin_amdgcn_global_atomic_fmax_f64: - case AMDGPU::BI__builtin_amdgcn_flat_atomic_fadd_f64: case AMDGPU::BI__builtin_amdgcn_flat_atomic_fmin_f64: - case AMDGPU::BI__builtin_amdgcn_flat_atomic_fmax_f64: - case AMDGPU::BI__builtin_amdgcn_flat_atomic_fadd_f32: { + case AMDGPU::BI__builtin_amdgcn_flat_atomic_fmax_f64: { Intrinsic::ID IID; llvm::Type *ArgTy = llvm::Type::getDoubleTy(getLLVMContext()); switch (BuiltinID) { @@ -18792,19 +18790,12 @@ Value *CodeGenFunction::EmitAMDGPUBuiltinExpr(unsigned BuiltinID, case AMDGPU::BI__builtin_amdgcn_global_atomic_fmax_f64: IID = Intrinsic::amdgcn_global_atomic_fmax; break; -case AMDGPU::BI__builtin_amdgcn_flat_atomic_fadd_f64: - IID = Intrinsic::amdgcn_flat_atomic_fadd; - break; case AMDGPU::BI__builtin_amdgcn_flat_atomic_fmin_f64: IID = Intrinsic::amdgcn_flat_atomic_fmin; break; case AMDGPU::BI__builtin_amdgcn_flat_atomic_fmax_f64: IID = Intrinsic::amdgcn_flat_atomic_fmax; break; -case AMDGPU::BI__builtin_amdgcn_flat_atomic_fadd_f32: - ArgTy = llvm::Type::getFloatTy(getLLVMContext()); - IID = Intrinsic::amdgcn_flat_atomic_fadd; - break; } llvm::Value *Addr = EmitScalarExpr(E->getArg(0)); llvm::Value *Val = EmitScalarExpr(E->getArg(1)); @@ -19207,7 +19198,9 @@ Value *CodeGenFunction::EmitAMDGPUBuiltinExpr(unsigned BuiltinID, case AMDGPU::BI__builtin_amdgcn_global_atomic_fadd_f32: case AMDGPU::BI__builtin_amdgcn_global_atomic_fadd_f64: case AMDGPU::BI__builtin_amdgcn_global_atomic_fadd_v2f16: - case AMDGPU::BI__builtin_amdgcn_flat_atomic_fadd_v2f16: { + case AMDGPU::BI__builtin_amdgcn_flat_atomic_fadd_v2f16: + case AMDGPU::BI__builtin_amdgcn_flat_atomic_fadd_f32: + case AMDGPU::BI__builtin_amdgcn_flat_atomic_fadd_f64: { llvm::AtomicRMWInst::BinOp BinOp; switch (BuiltinID) { case AMDGPU::BI__builtin_amdgcn_atomic_inc32: @@ -19227,6 +19220,8 @@ Value *CodeGenFunction::EmitAMDGPUBuiltinExpr(unsigned BuiltinID, case AMDGPU::BI__builtin_amdgcn_global_atomic_fadd_f64: case AMDGPU::BI__builtin_amdgcn_global_atomic_fadd_v2f16: case AMDGPU::BI__builtin_amdgcn_flat_atomic_fadd_v2f16: +case AMDGPU::BI__builtin_amdgcn_flat_atomic_fadd_f32: +case AMDGPU::BI__builtin_amdgcn_flat_atomic_fadd_f64: BinOp = llvm::AtomicRMWInst::FAdd; break; case AMDGPU::BI__builtin_amdgcn_ds_fminf: diff --git a/clang/test/CodeGenOpenCL/builtins-fp-atomics-gfx90a.cl b/clang/test/CodeGenOpenCL/builtins-fp-atomics-gfx90a.cl index cd10777dbe079..02e289427238f 100644 --- a/clang/test/CodeGenOpenCL/builtins-fp-atomics-gfx90a.cl +++ b/clang/test/CodeGenOpenCL/builtins-fp-atomics-gfx90a.cl @@ -45,7 +45,8 @@ void test_global_max_f64(__global double *addr, double x){ } // CHECK-LABEL: test_flat_add_local_f64 -// CHECK: call double @llvm.amdgcn.flat.atomic.fadd.f64.p3.f64(ptr addrspace(3) %{{.*}}, double %{{.*}}) +// CHECK: = atomicrmw fadd ptr addrspace(3) %{{.+}}, double %{{.+}} syncscope("agent") seq_cst, align 8{{$}} + // GFX90A-LABEL: test_flat_add_local_f64$local // GFX90A: ds_add_rtn_f64 void test_flat_add_local_f64(__local double *addr, double x){ @@ -54,7 +55,8 @@ void test_flat_add_local_f64(__local double *addr, double x){ } // CHECK-LABEL: test_flat_global_add_f64 -// CHECK: call double @llvm.amdgcn.flat.atomic.fadd.f64.p1.f64(ptr addrspace(1) %{{.*}}, double %{{.*}}) +// CHECK: = atomicrmw fadd ptr addrspace(1) {{.+}}, double %{{.+}} syncscope("agent") seq_cst, align 8, !amdgpu.no.fine.grained.memory !{{[0-9]+$}} + // GFX90A-LABEL: test_flat_global_add_f64$local // GFX90A: global_atomic_add_f64 void test_flat_global_add_f64(__global double *addr, double x){ diff --git a/clang/test/CodeGenOpenCL/builtins-fp-atomics-gfx940.cl b/clang/test/CodeGenOpenCL/builtins-fp-atomics-gfx940.cl index 589dcd406630d..bd9b8c7268e06 100644 --- a/clang/test/CodeGenOpenCL/builtins-fp-atomics-gfx940.cl +++ b/clang/test/CodeGenOpenCL/builtins-fp-atomics-gfx940.cl @@ -10,7 +10,8 @@ typedef half _
[llvm-branch-commits] [clang] clang/AMDGPU: Emit atomicrmw from {global|flat}_atomic_fadd_v2f16 builtins (PR #96873)
https://github.com/arsenm updated https://github.com/llvm/llvm-project/pull/96873 >From 7305c0477711f7b26e4ebad3cca0afa33e1defa9 Mon Sep 17 00:00:00 2001 From: Matt Arsenault Date: Wed, 26 Jun 2024 19:12:59 +0200 Subject: [PATCH] clang/AMDGPU: Emit atomicrmw from {global|flat}_atomic_fadd_v2f16 builtins --- clang/lib/CodeGen/CGBuiltin.cpp | 20 ++- .../builtins-fp-atomics-gfx12.cl | 9 ++--- .../builtins-fp-atomics-gfx90a.cl | 2 +- .../builtins-fp-atomics-gfx940.cl | 3 ++- 4 files changed, 15 insertions(+), 19 deletions(-) diff --git a/clang/lib/CodeGen/CGBuiltin.cpp b/clang/lib/CodeGen/CGBuiltin.cpp index ad4cce77221a6..ef4bd9fb4af09 100644 --- a/clang/lib/CodeGen/CGBuiltin.cpp +++ b/clang/lib/CodeGen/CGBuiltin.cpp @@ -18777,22 +18777,15 @@ Value *CodeGenFunction::EmitAMDGPUBuiltinExpr(unsigned BuiltinID, Function *F = CGM.getIntrinsic(Intrin, { Src0->getType() }); return Builder.CreateCall(F, { Src0, Builder.getFalse() }); } - case AMDGPU::BI__builtin_amdgcn_global_atomic_fadd_v2f16: case AMDGPU::BI__builtin_amdgcn_global_atomic_fmin_f64: case AMDGPU::BI__builtin_amdgcn_global_atomic_fmax_f64: case AMDGPU::BI__builtin_amdgcn_flat_atomic_fadd_f64: case AMDGPU::BI__builtin_amdgcn_flat_atomic_fmin_f64: case AMDGPU::BI__builtin_amdgcn_flat_atomic_fmax_f64: - case AMDGPU::BI__builtin_amdgcn_flat_atomic_fadd_f32: - case AMDGPU::BI__builtin_amdgcn_flat_atomic_fadd_v2f16: { + case AMDGPU::BI__builtin_amdgcn_flat_atomic_fadd_f32: { Intrinsic::ID IID; llvm::Type *ArgTy = llvm::Type::getDoubleTy(getLLVMContext()); switch (BuiltinID) { -case AMDGPU::BI__builtin_amdgcn_global_atomic_fadd_v2f16: - ArgTy = llvm::FixedVectorType::get( - llvm::Type::getHalfTy(getLLVMContext()), 2); - IID = Intrinsic::amdgcn_global_atomic_fadd; - break; case AMDGPU::BI__builtin_amdgcn_global_atomic_fmin_f64: IID = Intrinsic::amdgcn_global_atomic_fmin; break; @@ -18812,11 +18805,6 @@ Value *CodeGenFunction::EmitAMDGPUBuiltinExpr(unsigned BuiltinID, ArgTy = llvm::Type::getFloatTy(getLLVMContext()); IID = Intrinsic::amdgcn_flat_atomic_fadd; break; -case AMDGPU::BI__builtin_amdgcn_flat_atomic_fadd_v2f16: - ArgTy = llvm::FixedVectorType::get( - llvm::Type::getHalfTy(getLLVMContext()), 2); - IID = Intrinsic::amdgcn_flat_atomic_fadd; - break; } llvm::Value *Addr = EmitScalarExpr(E->getArg(0)); llvm::Value *Val = EmitScalarExpr(E->getArg(1)); @@ -19217,7 +19205,9 @@ Value *CodeGenFunction::EmitAMDGPUBuiltinExpr(unsigned BuiltinID, case AMDGPU::BI__builtin_amdgcn_ds_fminf: case AMDGPU::BI__builtin_amdgcn_ds_fmaxf: case AMDGPU::BI__builtin_amdgcn_global_atomic_fadd_f32: - case AMDGPU::BI__builtin_amdgcn_global_atomic_fadd_f64: { + case AMDGPU::BI__builtin_amdgcn_global_atomic_fadd_f64: + case AMDGPU::BI__builtin_amdgcn_global_atomic_fadd_v2f16: + case AMDGPU::BI__builtin_amdgcn_flat_atomic_fadd_v2f16: { llvm::AtomicRMWInst::BinOp BinOp; switch (BuiltinID) { case AMDGPU::BI__builtin_amdgcn_atomic_inc32: @@ -19235,6 +19225,8 @@ Value *CodeGenFunction::EmitAMDGPUBuiltinExpr(unsigned BuiltinID, case AMDGPU::BI__builtin_amdgcn_ds_atomic_fadd_v2bf16: case AMDGPU::BI__builtin_amdgcn_global_atomic_fadd_f32: case AMDGPU::BI__builtin_amdgcn_global_atomic_fadd_f64: +case AMDGPU::BI__builtin_amdgcn_global_atomic_fadd_v2f16: +case AMDGPU::BI__builtin_amdgcn_flat_atomic_fadd_v2f16: BinOp = llvm::AtomicRMWInst::FAdd; break; case AMDGPU::BI__builtin_amdgcn_ds_fminf: diff --git a/clang/test/CodeGenOpenCL/builtins-fp-atomics-gfx12.cl b/clang/test/CodeGenOpenCL/builtins-fp-atomics-gfx12.cl index 6b8a6d14575db..07e63a8711c7f 100644 --- a/clang/test/CodeGenOpenCL/builtins-fp-atomics-gfx12.cl +++ b/clang/test/CodeGenOpenCL/builtins-fp-atomics-gfx12.cl @@ -48,7 +48,8 @@ void test_local_add_2f16_noret(__local half2 *addr, half2 x) { } // CHECK-LABEL: test_flat_add_2f16 -// CHECK: call <2 x half> @llvm.amdgcn.flat.atomic.fadd.v2f16.p0.v2f16(ptr %{{.*}}, <2 x half> %{{.*}}) +// CHECK: [[RMW:%.+]] = atomicrmw fadd ptr %{{.+}}, <2 x half> %{{.+}} syncscope("agent") seq_cst, align 4, !amdgpu.no.fine.grained.memory !{{[0-9]+$}} + // GFX12-LABEL: test_flat_add_2f16 // GFX12: flat_atomic_pk_add_f16 half2 test_flat_add_2f16(__generic half2 *addr, half2 x) { @@ -64,7 +65,8 @@ short2 test_flat_add_2bf16(__generic short2 *addr, short2 x) { } // CHECK-LABEL: test_global_add_half2 -// CHECK: call <2 x half> @llvm.amdgcn.global.atomic.fadd.v2f16.p1.v2f16(ptr addrspace(1) %{{.*}}, <2 x half> %{{.*}}) +// CHECK: [[RMW:%.+]] = atomicrmw fadd ptr addrspace(1) %{{.+}}, <2 x half> %{{.+}} syncscope("agent") seq_cst, align 4, !amdgpu.no.fine.grained.memory !{{[0-9]+$}} + // GFX12-LABEL: test_global_add_half2 // GFX12: global_atomic_pk_add_f16 v2, v[0:1], v2, off
[llvm-branch-commits] [llvm] AMDGPU: Remove global/flat atomic fadd intrinics (PR #97051)
@@ -322,4 +322,36 @@ define <2 x i16> @upgrade_amdgcn_global_atomic_fadd_v2bf16_p1(ptr addrspace(1) % ret <2 x i16> %result } +declare <2 x half> @llvm.amdgcn.flat.atomic.fadd.v2f16.p0.v2f16(ptr nocapture, <2 x half>) #0 Pierre-vh wrote: nit: could we auto-generate this test? Maybe as a future patch or just precommit it directly. https://github.com/llvm/llvm-project/pull/97051 ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
[llvm-branch-commits] [llvm] AMDGPU: Remove global/flat atomic fadd intrinics (PR #97051)
@@ -75,6 +75,11 @@ Changes to the AArch64 Backend Changes to the AMDGPU Backend - +* Removed ``llvm.amdgcn.flat.atomic.fadd`` and + ``llvm.amdgcn.global.atomic.fadd`` intrinsics. Users should use the + :ref:`atomicrmw ` instruction with `fadd` and Pierre-vh wrote: Does `i_atomicrmw` work here? Did you try building the docs? https://github.com/llvm/llvm-project/pull/97051 ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
[llvm-branch-commits] [llvm] AMDGPU: Remove global/flat atomic fadd intrinics (PR #97051)
@@ -1017,29 +1015,6 @@ main_body: ret void } -define amdgpu_kernel void @global_atomic_fadd_f64_noret(ptr addrspace(1) %ptr, double %data) { Pierre-vh wrote: Why are some tests deleted, and some others changed to use atomicrmw? https://github.com/llvm/llvm-project/pull/97051 ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
[llvm-branch-commits] [llvm] AMDGPU: Remove global/flat atomic fadd intrinics (PR #97051)
@@ -322,4 +322,36 @@ define <2 x i16> @upgrade_amdgcn_global_atomic_fadd_v2bf16_p1(ptr addrspace(1) % ret <2 x i16> %result } +declare <2 x half> @llvm.amdgcn.flat.atomic.fadd.v2f16.p0.v2f16(ptr nocapture, <2 x half>) #0 arsenm wrote: Yes, but also no. These tests should use llvm-as/llvm-dis instead of opt, and the update scripts don't understand that https://github.com/llvm/llvm-project/pull/97051 ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
[llvm-branch-commits] [llvm] AMDGPU: Remove global/flat atomic fadd intrinics (PR #97051)
@@ -1017,29 +1015,6 @@ main_body: ret void } -define amdgpu_kernel void @global_atomic_fadd_f64_noret(ptr addrspace(1) %ptr, double %data) { arsenm wrote: Depends if they are redundant or not. Some cases already tested atomicrmw, and had the intrinsic alongside it. We still have a lot of redundancy spread across multiple files https://github.com/llvm/llvm-project/pull/97051 ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
[llvm-branch-commits] [clang-tools-extra] release/19.x: [clang-tidy] Fix crash in modernize-use-ranges (#100427) (PR #101482)
https://github.com/llvmbot created https://github.com/llvm/llvm-project/pull/101482 Backport 0762db6533eda3453158c7b9b0631542c47093a8 Requested by: @njames93 >From 49228b7feaf8b132d4008fa88e02df0cec267a63 Mon Sep 17 00:00:00 2001 From: Nathan James Date: Thu, 25 Jul 2024 16:25:37 +0100 Subject: [PATCH] [clang-tidy] Fix crash in modernize-use-ranges (#100427) Crash seems to be caused by the check function not handling inline namespaces correctly for some instances. Changed how the Replacer is got from the MatchResult now which should alleviate any potential issues Fixes #100406 (cherry picked from commit 0762db6533eda3453158c7b9b0631542c47093a8) --- .../clang-tidy/utils/UseRangesCheck.cpp | 64 +-- .../clang-tidy/utils/UseRangesCheck.h | 2 +- .../modernize/Inputs/use-ranges/fake_std.h| 17 +++-- 3 files changed, 43 insertions(+), 40 deletions(-) diff --git a/clang-tools-extra/clang-tidy/utils/UseRangesCheck.cpp b/clang-tools-extra/clang-tidy/utils/UseRangesCheck.cpp index e2daa5010e2ae..aba4d17ccd035 100644 --- a/clang-tools-extra/clang-tidy/utils/UseRangesCheck.cpp +++ b/clang-tools-extra/clang-tidy/utils/UseRangesCheck.cpp @@ -39,12 +39,6 @@ static constexpr const char ArgName[] = "ArgName"; namespace clang::tidy::utils { -static bool operator==(const UseRangesCheck::Indexes &L, - const UseRangesCheck::Indexes &R) { - return std::tie(L.BeginArg, L.EndArg, L.ReplaceArg) == - std::tie(R.BeginArg, R.EndArg, R.ReplaceArg); -} - static std::string getFullPrefix(ArrayRef Signature) { std::string Output; llvm::raw_string_ostream OS(Output); @@ -54,15 +48,6 @@ static std::string getFullPrefix(ArrayRef Signature) { return Output; } -static llvm::hash_code hash_value(const UseRangesCheck::Indexes &Indexes) { - return llvm::hash_combine(Indexes.BeginArg, Indexes.EndArg, -Indexes.ReplaceArg); -} - -static llvm::hash_code hash_value(const UseRangesCheck::Signature &Sig) { - return llvm::hash_combine_range(Sig.begin(), Sig.end()); -} - namespace { AST_MATCHER(Expr, hasSideEffects) { @@ -123,24 +108,26 @@ makeMatcherPair(StringRef State, const UseRangesCheck::Indexes &Indexes, } void UseRangesCheck::registerMatchers(MatchFinder *Finder) { - Replaces = getReplacerMap(); + auto Replaces = getReplacerMap(); ReverseDescriptor = getReverseDescriptor(); auto BeginEndNames = getFreeBeginEndMethods(); llvm::SmallVector BeginNames{ llvm::make_first_range(BeginEndNames)}; llvm::SmallVector EndNames{ llvm::make_second_range(BeginEndNames)}; - llvm::DenseSet> Seen; + Replacers.clear(); + llvm::DenseSet SeenRepl; for (auto I = Replaces.begin(), E = Replaces.end(); I != E; ++I) { -const ArrayRef &Signatures = -I->getValue()->getReplacementSignatures(); -if (!Seen.insert(Signatures).second) +auto Replacer = I->getValue(); +if (!SeenRepl.insert(Replacer.get()).second) continue; -assert(!Signatures.empty() && - llvm::all_of(Signatures, [](auto Index) { return !Index.empty(); })); +Replacers.push_back(Replacer); +assert(!Replacer->getReplacementSignatures().empty() && + llvm::all_of(Replacer->getReplacementSignatures(), +[](auto Index) { return !Index.empty(); })); std::vector Names(1, I->getKey()); for (auto J = std::next(I); J != E; ++J) - if (J->getValue()->getReplacementSignatures() == Signatures) + if (J->getValue() == Replacer) Names.push_back(J->getKey()); std::vector TotalMatchers; @@ -148,7 +135,7 @@ void UseRangesCheck::registerMatchers(MatchFinder *Finder) { // signatures in order of length(longest to shortest). This way any // signature that is a subset of another signature will be matched after the // other. -SmallVector SigVec(Signatures); +SmallVector SigVec(Replacer->getReplacementSignatures()); llvm::sort(SigVec, [](auto &L, auto &R) { return R.size() < L.size(); }); for (const auto &Signature : SigVec) { std::vector Matchers; @@ -163,7 +150,8 @@ void UseRangesCheck::registerMatchers(MatchFinder *Finder) { } Finder->addMatcher( callExpr( -callee(functionDecl(hasAnyName(std::move(Names))).bind(FuncDecl)), +callee(functionDecl(hasAnyName(std::move(Names))) + .bind((FuncDecl + Twine(Replacers.size() - 1).str(, ast_matchers::internal::DynTypedMatcher::constructVariadic( ast_matchers::internal::DynTypedMatcher::VO_AnyOf, ASTNodeKind::getFromNodeKind(), @@ -205,21 +193,33 @@ static void removeFunctionArgs(DiagnosticBuilder &Diag, const CallExpr &Call, } void UseRangesCheck::check(const MatchFinder::MatchResult &Result) { - const auto *Function = Result.Nodes.getNodeAs(FuncDecl); - std::string Qualified = "::" + Function->getQualifiedNameAsString(); - auto Iter = Replaces.find
[llvm-branch-commits] [clang-tools-extra] release/19.x: [clang-tidy] Fix crash in modernize-use-ranges (#100427) (PR #101482)
https://github.com/llvmbot milestoned https://github.com/llvm/llvm-project/pull/101482 ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
[llvm-branch-commits] [clang-tools-extra] release/19.x: [clang-tidy] Fix crash in modernize-use-ranges (#100427) (PR #101482)
llvmbot wrote: @PiotrZSL What do you think about merging this PR to the release branch? https://github.com/llvm/llvm-project/pull/101482 ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
[llvm-branch-commits] [clang-tools-extra] release/19.x: [clang-tidy] Fix crash in modernize-use-ranges (#100427) (PR #101482)
llvmbot wrote: @llvm/pr-subscribers-clang-tidy Author: None (llvmbot) Changes Backport 0762db6533eda3453158c7b9b0631542c47093a8 Requested by: @njames93 --- Full diff: https://github.com/llvm/llvm-project/pull/101482.diff 3 Files Affected: - (modified) clang-tools-extra/clang-tidy/utils/UseRangesCheck.cpp (+32-32) - (modified) clang-tools-extra/clang-tidy/utils/UseRangesCheck.h (+1-1) - (modified) clang-tools-extra/test/clang-tidy/checkers/modernize/Inputs/use-ranges/fake_std.h (+10-7) ``diff diff --git a/clang-tools-extra/clang-tidy/utils/UseRangesCheck.cpp b/clang-tools-extra/clang-tidy/utils/UseRangesCheck.cpp index e2daa5010e2ae..aba4d17ccd035 100644 --- a/clang-tools-extra/clang-tidy/utils/UseRangesCheck.cpp +++ b/clang-tools-extra/clang-tidy/utils/UseRangesCheck.cpp @@ -39,12 +39,6 @@ static constexpr const char ArgName[] = "ArgName"; namespace clang::tidy::utils { -static bool operator==(const UseRangesCheck::Indexes &L, - const UseRangesCheck::Indexes &R) { - return std::tie(L.BeginArg, L.EndArg, L.ReplaceArg) == - std::tie(R.BeginArg, R.EndArg, R.ReplaceArg); -} - static std::string getFullPrefix(ArrayRef Signature) { std::string Output; llvm::raw_string_ostream OS(Output); @@ -54,15 +48,6 @@ static std::string getFullPrefix(ArrayRef Signature) { return Output; } -static llvm::hash_code hash_value(const UseRangesCheck::Indexes &Indexes) { - return llvm::hash_combine(Indexes.BeginArg, Indexes.EndArg, -Indexes.ReplaceArg); -} - -static llvm::hash_code hash_value(const UseRangesCheck::Signature &Sig) { - return llvm::hash_combine_range(Sig.begin(), Sig.end()); -} - namespace { AST_MATCHER(Expr, hasSideEffects) { @@ -123,24 +108,26 @@ makeMatcherPair(StringRef State, const UseRangesCheck::Indexes &Indexes, } void UseRangesCheck::registerMatchers(MatchFinder *Finder) { - Replaces = getReplacerMap(); + auto Replaces = getReplacerMap(); ReverseDescriptor = getReverseDescriptor(); auto BeginEndNames = getFreeBeginEndMethods(); llvm::SmallVector BeginNames{ llvm::make_first_range(BeginEndNames)}; llvm::SmallVector EndNames{ llvm::make_second_range(BeginEndNames)}; - llvm::DenseSet> Seen; + Replacers.clear(); + llvm::DenseSet SeenRepl; for (auto I = Replaces.begin(), E = Replaces.end(); I != E; ++I) { -const ArrayRef &Signatures = -I->getValue()->getReplacementSignatures(); -if (!Seen.insert(Signatures).second) +auto Replacer = I->getValue(); +if (!SeenRepl.insert(Replacer.get()).second) continue; -assert(!Signatures.empty() && - llvm::all_of(Signatures, [](auto Index) { return !Index.empty(); })); +Replacers.push_back(Replacer); +assert(!Replacer->getReplacementSignatures().empty() && + llvm::all_of(Replacer->getReplacementSignatures(), +[](auto Index) { return !Index.empty(); })); std::vector Names(1, I->getKey()); for (auto J = std::next(I); J != E; ++J) - if (J->getValue()->getReplacementSignatures() == Signatures) + if (J->getValue() == Replacer) Names.push_back(J->getKey()); std::vector TotalMatchers; @@ -148,7 +135,7 @@ void UseRangesCheck::registerMatchers(MatchFinder *Finder) { // signatures in order of length(longest to shortest). This way any // signature that is a subset of another signature will be matched after the // other. -SmallVector SigVec(Signatures); +SmallVector SigVec(Replacer->getReplacementSignatures()); llvm::sort(SigVec, [](auto &L, auto &R) { return R.size() < L.size(); }); for (const auto &Signature : SigVec) { std::vector Matchers; @@ -163,7 +150,8 @@ void UseRangesCheck::registerMatchers(MatchFinder *Finder) { } Finder->addMatcher( callExpr( -callee(functionDecl(hasAnyName(std::move(Names))).bind(FuncDecl)), +callee(functionDecl(hasAnyName(std::move(Names))) + .bind((FuncDecl + Twine(Replacers.size() - 1).str(, ast_matchers::internal::DynTypedMatcher::constructVariadic( ast_matchers::internal::DynTypedMatcher::VO_AnyOf, ASTNodeKind::getFromNodeKind(), @@ -205,21 +193,33 @@ static void removeFunctionArgs(DiagnosticBuilder &Diag, const CallExpr &Call, } void UseRangesCheck::check(const MatchFinder::MatchResult &Result) { - const auto *Function = Result.Nodes.getNodeAs(FuncDecl); - std::string Qualified = "::" + Function->getQualifiedNameAsString(); - auto Iter = Replaces.find(Qualified); - assert(Iter != Replaces.end()); + Replacer *Replacer = nullptr; + const FunctionDecl *Function = nullptr; + for (auto [Node, Value] : Result.Nodes.getMap()) { +StringRef NodeStr(Node); +if (!NodeStr.consume_front(FuncDecl)) + continue; +Function = Value.get(); +size_t Index; +if (NodeStr.getAsInteger(10, Index)) { + llvm_unreac
[llvm-branch-commits] [llvm] AMDGPU: Remove global/flat atomic fadd intrinics (PR #97051)
@@ -75,6 +75,11 @@ Changes to the AArch64 Backend Changes to the AMDGPU Backend - +* Removed ``llvm.amdgcn.flat.atomic.fadd`` and + ``llvm.amdgcn.global.atomic.fadd`` intrinsics. Users should use the + :ref:`atomicrmw ` instruction with `fadd` and arsenm wrote: This refers to i_atomicrmw? The documentation bot passes. https://github.com/llvm/llvm-project/pull/97051 ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
[llvm-branch-commits] [clang] [llvm] [LLVM][PassBuilder] Extend the function signature of callback for optimizer pipeline extension point (PR #100953)
shiltian wrote: ping if it is preferred to split the AMDGPU related changes to another PR, I can do that. https://github.com/llvm/llvm-project/pull/100953 ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
[llvm-branch-commits] [llvm] release/19.x: [Support] Silence warnings when retrieving exported functions (#97905) (PR #101266)
https://github.com/tru updated https://github.com/llvm/llvm-project/pull/101266 >From c3004032c244cb5264790dc535437b9c3b93acb6 Mon Sep 17 00:00:00 2001 From: Alexandre Ganea Date: Tue, 30 Jul 2024 19:06:03 -0400 Subject: [PATCH] [Support] Silence warnings when retrieving exported functions (#97905) Since functions exported from DLLs are type-erased, before this patch I was seeing the new Clang 19 warning `-Wcast-function-type-mismatch`. This happens when building LLVM on Windows. Following discussion in https://github.com/llvm/llvm-project/commit/593f708118aef792f434185547f74fedeaf51dd4#commitcomment-143905744 (cherry picked from commit 39e192b379362e9e645427631c35450d55ed517d) --- llvm/lib/Support/Windows/Process.inc | 3 ++- llvm/lib/Support/Windows/Signals.inc | 38 +++- 2 files changed, 23 insertions(+), 18 deletions(-) diff --git a/llvm/lib/Support/Windows/Process.inc b/llvm/lib/Support/Windows/Process.inc index 34d294b232c32..d525f5b16e862 100644 --- a/llvm/lib/Support/Windows/Process.inc +++ b/llvm/lib/Support/Windows/Process.inc @@ -482,7 +482,8 @@ static RTL_OSVERSIONINFOEXW GetWindowsVer() { HMODULE hMod = ::GetModuleHandleW(L"ntdll.dll"); assert(hMod); -auto getVer = (RtlGetVersionPtr)::GetProcAddress(hMod, "RtlGetVersion"); +auto getVer = +(RtlGetVersionPtr)(void *)::GetProcAddress(hMod, "RtlGetVersion"); assert(getVer); RTL_OSVERSIONINFOEXW info{}; diff --git a/llvm/lib/Support/Windows/Signals.inc b/llvm/lib/Support/Windows/Signals.inc index 29ebf7c696e04..f11ad09f37139 100644 --- a/llvm/lib/Support/Windows/Signals.inc +++ b/llvm/lib/Support/Windows/Signals.inc @@ -171,23 +171,27 @@ static bool load64BitDebugHelp(void) { HMODULE hLib = ::LoadLibraryExA("Dbghelp.dll", NULL, LOAD_LIBRARY_SEARCH_SYSTEM32); if (hLib) { -fMiniDumpWriteDump = -(fpMiniDumpWriteDump)::GetProcAddress(hLib, "MiniDumpWriteDump"); -fStackWalk64 = (fpStackWalk64)::GetProcAddress(hLib, "StackWalk64"); -fSymGetModuleBase64 = -(fpSymGetModuleBase64)::GetProcAddress(hLib, "SymGetModuleBase64"); -fSymGetSymFromAddr64 = -(fpSymGetSymFromAddr64)::GetProcAddress(hLib, "SymGetSymFromAddr64"); -fSymGetLineFromAddr64 = -(fpSymGetLineFromAddr64)::GetProcAddress(hLib, "SymGetLineFromAddr64"); -fSymGetModuleInfo64 = -(fpSymGetModuleInfo64)::GetProcAddress(hLib, "SymGetModuleInfo64"); -fSymFunctionTableAccess64 = (fpSymFunctionTableAccess64)::GetProcAddress( -hLib, "SymFunctionTableAccess64"); -fSymSetOptions = (fpSymSetOptions)::GetProcAddress(hLib, "SymSetOptions"); -fSymInitialize = (fpSymInitialize)::GetProcAddress(hLib, "SymInitialize"); -fEnumerateLoadedModules = (fpEnumerateLoadedModules)::GetProcAddress( -hLib, "EnumerateLoadedModules64"); +fMiniDumpWriteDump = (fpMiniDumpWriteDump)(void *)::GetProcAddress( +hLib, "MiniDumpWriteDump"); +fStackWalk64 = (fpStackWalk64)(void *)::GetProcAddress(hLib, "StackWalk64"); +fSymGetModuleBase64 = (fpSymGetModuleBase64)(void *)::GetProcAddress( +hLib, "SymGetModuleBase64"); +fSymGetSymFromAddr64 = (fpSymGetSymFromAddr64)(void *)::GetProcAddress( +hLib, "SymGetSymFromAddr64"); +fSymGetLineFromAddr64 = (fpSymGetLineFromAddr64)(void *)::GetProcAddress( +hLib, "SymGetLineFromAddr64"); +fSymGetModuleInfo64 = (fpSymGetModuleInfo64)(void *)::GetProcAddress( +hLib, "SymGetModuleInfo64"); +fSymFunctionTableAccess64 = +(fpSymFunctionTableAccess64)(void *)::GetProcAddress( +hLib, "SymFunctionTableAccess64"); +fSymSetOptions = +(fpSymSetOptions)(void *)::GetProcAddress(hLib, "SymSetOptions"); +fSymInitialize = +(fpSymInitialize)(void *)::GetProcAddress(hLib, "SymInitialize"); +fEnumerateLoadedModules = +(fpEnumerateLoadedModules)(void *)::GetProcAddress( +hLib, "EnumerateLoadedModules64"); } return isDebugHelpInitialized(); } ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
[llvm-branch-commits] [llvm] c300403 - [Support] Silence warnings when retrieving exported functions (#97905)
Author: Alexandre Ganea Date: 2024-08-01T15:18:32+02:00 New Revision: c3004032c244cb5264790dc535437b9c3b93acb6 URL: https://github.com/llvm/llvm-project/commit/c3004032c244cb5264790dc535437b9c3b93acb6 DIFF: https://github.com/llvm/llvm-project/commit/c3004032c244cb5264790dc535437b9c3b93acb6.diff LOG: [Support] Silence warnings when retrieving exported functions (#97905) Since functions exported from DLLs are type-erased, before this patch I was seeing the new Clang 19 warning `-Wcast-function-type-mismatch`. This happens when building LLVM on Windows. Following discussion in https://github.com/llvm/llvm-project/commit/593f708118aef792f434185547f74fedeaf51dd4#commitcomment-143905744 (cherry picked from commit 39e192b379362e9e645427631c35450d55ed517d) Added: Modified: llvm/lib/Support/Windows/Process.inc llvm/lib/Support/Windows/Signals.inc Removed: diff --git a/llvm/lib/Support/Windows/Process.inc b/llvm/lib/Support/Windows/Process.inc index 34d294b232c32..d525f5b16e862 100644 --- a/llvm/lib/Support/Windows/Process.inc +++ b/llvm/lib/Support/Windows/Process.inc @@ -482,7 +482,8 @@ static RTL_OSVERSIONINFOEXW GetWindowsVer() { HMODULE hMod = ::GetModuleHandleW(L"ntdll.dll"); assert(hMod); -auto getVer = (RtlGetVersionPtr)::GetProcAddress(hMod, "RtlGetVersion"); +auto getVer = +(RtlGetVersionPtr)(void *)::GetProcAddress(hMod, "RtlGetVersion"); assert(getVer); RTL_OSVERSIONINFOEXW info{}; diff --git a/llvm/lib/Support/Windows/Signals.inc b/llvm/lib/Support/Windows/Signals.inc index 29ebf7c696e04..f11ad09f37139 100644 --- a/llvm/lib/Support/Windows/Signals.inc +++ b/llvm/lib/Support/Windows/Signals.inc @@ -171,23 +171,27 @@ static bool load64BitDebugHelp(void) { HMODULE hLib = ::LoadLibraryExA("Dbghelp.dll", NULL, LOAD_LIBRARY_SEARCH_SYSTEM32); if (hLib) { -fMiniDumpWriteDump = -(fpMiniDumpWriteDump)::GetProcAddress(hLib, "MiniDumpWriteDump"); -fStackWalk64 = (fpStackWalk64)::GetProcAddress(hLib, "StackWalk64"); -fSymGetModuleBase64 = -(fpSymGetModuleBase64)::GetProcAddress(hLib, "SymGetModuleBase64"); -fSymGetSymFromAddr64 = -(fpSymGetSymFromAddr64)::GetProcAddress(hLib, "SymGetSymFromAddr64"); -fSymGetLineFromAddr64 = -(fpSymGetLineFromAddr64)::GetProcAddress(hLib, "SymGetLineFromAddr64"); -fSymGetModuleInfo64 = -(fpSymGetModuleInfo64)::GetProcAddress(hLib, "SymGetModuleInfo64"); -fSymFunctionTableAccess64 = (fpSymFunctionTableAccess64)::GetProcAddress( -hLib, "SymFunctionTableAccess64"); -fSymSetOptions = (fpSymSetOptions)::GetProcAddress(hLib, "SymSetOptions"); -fSymInitialize = (fpSymInitialize)::GetProcAddress(hLib, "SymInitialize"); -fEnumerateLoadedModules = (fpEnumerateLoadedModules)::GetProcAddress( -hLib, "EnumerateLoadedModules64"); +fMiniDumpWriteDump = (fpMiniDumpWriteDump)(void *)::GetProcAddress( +hLib, "MiniDumpWriteDump"); +fStackWalk64 = (fpStackWalk64)(void *)::GetProcAddress(hLib, "StackWalk64"); +fSymGetModuleBase64 = (fpSymGetModuleBase64)(void *)::GetProcAddress( +hLib, "SymGetModuleBase64"); +fSymGetSymFromAddr64 = (fpSymGetSymFromAddr64)(void *)::GetProcAddress( +hLib, "SymGetSymFromAddr64"); +fSymGetLineFromAddr64 = (fpSymGetLineFromAddr64)(void *)::GetProcAddress( +hLib, "SymGetLineFromAddr64"); +fSymGetModuleInfo64 = (fpSymGetModuleInfo64)(void *)::GetProcAddress( +hLib, "SymGetModuleInfo64"); +fSymFunctionTableAccess64 = +(fpSymFunctionTableAccess64)(void *)::GetProcAddress( +hLib, "SymFunctionTableAccess64"); +fSymSetOptions = +(fpSymSetOptions)(void *)::GetProcAddress(hLib, "SymSetOptions"); +fSymInitialize = +(fpSymInitialize)(void *)::GetProcAddress(hLib, "SymInitialize"); +fEnumerateLoadedModules = +(fpEnumerateLoadedModules)(void *)::GetProcAddress( +hLib, "EnumerateLoadedModules64"); } return isDebugHelpInitialized(); } ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
[llvm-branch-commits] [llvm] release/19.x: [Support] Silence warnings when retrieving exported functions (#97905) (PR #101266)
https://github.com/tru closed https://github.com/llvm/llvm-project/pull/101266 ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits