[llvm-branch-commits] [llvm] ecd542d - Revert "Reland "[amdgpu] Add llvm.amdgcn.init.whole.wave intrinsic" (#108054)…"
Author: Diana Picus Date: 2024-09-12T09:51:27+02:00 New Revision: ecd542d0e8ee3a37e979ff761ab3c633bcda5baf URL: https://github.com/llvm/llvm-project/commit/ecd542d0e8ee3a37e979ff761ab3c633bcda5baf DIFF: https://github.com/llvm/llvm-project/commit/ecd542d0e8ee3a37e979ff761ab3c633bcda5baf.diff LOG: Revert "Reland "[amdgpu] Add llvm.amdgcn.init.whole.wave intrinsic" (#108054)…" This reverts commit 703ebca869e1e684147d316b7bdb15437c12206a. Added: Modified: llvm/include/llvm/IR/IntrinsicsAMDGPU.td llvm/lib/Target/AMDGPU/AMDGPUISelDAGToDAG.cpp llvm/lib/Target/AMDGPU/AMDGPUInstructionSelector.cpp llvm/lib/Target/AMDGPU/AMDGPUInstructionSelector.h llvm/lib/Target/AMDGPU/AMDGPUMachineFunction.h llvm/lib/Target/AMDGPU/AMDGPURegisterBankInfo.cpp llvm/lib/Target/AMDGPU/AMDGPUSearchableTables.td llvm/lib/Target/AMDGPU/AMDGPUTargetMachine.cpp llvm/lib/Target/AMDGPU/SIFrameLowering.cpp llvm/lib/Target/AMDGPU/SIInstructions.td llvm/lib/Target/AMDGPU/SIMachineFunctionInfo.h llvm/lib/Target/AMDGPU/SIWholeQuadMode.cpp llvm/test/CodeGen/AMDGPU/pei-amdgpu-cs-chain.mir llvm/test/CodeGen/MIR/AMDGPU/long-branch-reg-all-sgpr-used.ll llvm/test/CodeGen/MIR/AMDGPU/machine-function-info-after-pei.ll llvm/test/CodeGen/MIR/AMDGPU/machine-function-info-long-branch-reg-debug.ll llvm/test/CodeGen/MIR/AMDGPU/machine-function-info-long-branch-reg.ll llvm/test/CodeGen/MIR/AMDGPU/machine-function-info-no-ir.mir llvm/test/CodeGen/MIR/AMDGPU/machine-function-info.ll Removed: llvm/test/CodeGen/AMDGPU/llvm.amdgcn.init.whole.wave-w32.ll llvm/test/CodeGen/AMDGPU/llvm.amdgcn.init.whole.wave-w64.ll llvm/test/CodeGen/AMDGPU/si-init-whole-wave.mir diff --git a/llvm/include/llvm/IR/IntrinsicsAMDGPU.td b/llvm/include/llvm/IR/IntrinsicsAMDGPU.td index 4cd32a0502c66d..e20c26eb837875 100644 --- a/llvm/include/llvm/IR/IntrinsicsAMDGPU.td +++ b/llvm/include/llvm/IR/IntrinsicsAMDGPU.td @@ -208,16 +208,6 @@ def int_amdgcn_init_exec_from_input : Intrinsic<[], [IntrConvergent, IntrHasSideEffects, IntrNoMem, IntrNoCallback, 
IntrNoFree, IntrWillReturn, ImmArg>]>; -// Sets the function into whole-wave-mode and returns whether the lane was -// active when entering the function. A branch depending on this return will -// revert the EXEC mask to what it was when entering the function, thus -// resulting in a no-op. This pattern is used to optimize branches when function -// tails need to be run in whole-wave-mode. It may also have other consequences -// (mostly related to WWM CSR handling) that differentiate it from using -// a plain `amdgcn.init.exec -1`. -def int_amdgcn_init_whole_wave : Intrinsic<[llvm_i1_ty], [], [ -IntrHasSideEffects, IntrNoMem, IntrConvergent]>; - def int_amdgcn_wavefrontsize : ClangBuiltin<"__builtin_amdgcn_wavefrontsize">, DefaultAttrsIntrinsic<[llvm_i32_ty], [], [NoUndef, IntrNoMem, IntrSpeculatable]>; diff --git a/llvm/lib/Target/AMDGPU/AMDGPUISelDAGToDAG.cpp b/llvm/lib/Target/AMDGPU/AMDGPUISelDAGToDAG.cpp index 380dc7d3312f32..0daaf6b6576030 100644 --- a/llvm/lib/Target/AMDGPU/AMDGPUISelDAGToDAG.cpp +++ b/llvm/lib/Target/AMDGPU/AMDGPUISelDAGToDAG.cpp @@ -2738,11 +2738,6 @@ void AMDGPUDAGToDAGISel::SelectINTRINSIC_W_CHAIN(SDNode *N) { case Intrinsic::amdgcn_ds_bvh_stack_rtn: SelectDSBvhStackIntrinsic(N); return; - case Intrinsic::amdgcn_init_whole_wave: -CurDAG->getMachineFunction() -.getInfo() -->setInitWholeWave(); -break; } SelectCode(N); diff --git a/llvm/lib/Target/AMDGPU/AMDGPUInstructionSelector.cpp b/llvm/lib/Target/AMDGPU/AMDGPUInstructionSelector.cpp index 53085d423cefb8..4dfd3f087c1ae4 100644 --- a/llvm/lib/Target/AMDGPU/AMDGPUInstructionSelector.cpp +++ b/llvm/lib/Target/AMDGPU/AMDGPUInstructionSelector.cpp @@ -1772,14 +1772,6 @@ bool AMDGPUInstructionSelector::selectDSAppendConsume(MachineInstr &MI, return constrainSelectedInstRegOperands(*MIB, TII, TRI, RBI); } -bool AMDGPUInstructionSelector::selectInitWholeWave(MachineInstr &MI) const { - MachineFunction *MF = MI.getParent()->getParent(); - SIMachineFunctionInfo *MFInfo = MF->getInfo(); - -
MFInfo->setInitWholeWave(); - return selectImpl(MI, *CoverageInfo); -} - bool AMDGPUInstructionSelector::selectSBarrier(MachineInstr &MI) const { if (TM.getOptLevel() > CodeGenOptLevel::None) { unsigned WGSize = STI.getFlatWorkGroupSizes(MF->getFunction()).second; @@ -2107,8 +2099,6 @@ bool AMDGPUInstructionSelector::selectG_INTRINSIC_W_SIDE_EFFECTS( return selectDSAppendConsume(I, true); case Intrinsic::amdgcn_ds_consume: return selectDSAppendConsume(I, false); - case Intrinsic::amdgcn_init_whole_wave: -return selectInitWholeWave(I); case Intrinsic::amdgcn_s_barrier: return selectSBarrier(I); case Intrinsic::amdgcn_raw_buffer_load_lds: diff --git a/llvm/lib/Target/AMDGPU/AMDGP
[llvm-branch-commits] Test (PR #108349)
https://github.com/vitalybuka created https://github.com/llvm/llvm-project/pull/108349 None ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
[llvm-branch-commits] [sanitizer] Allow to override GetDTLSRange (PR #108348)
https://github.com/vitalybuka created https://github.com/llvm/llvm-project/pull/108348 And rename it to __sanitizer_get_dtls_size. The test will be in a separate patch, as I expect the test to be reverted.
[llvm-branch-commits] [llvm] release/19.x: [AVR] Fix 16-bit LDDs with immediate overflows (#104923) (PR #106993)
Patryk27 wrote: @benshi001 / @aykevl, is there something we can do to push this forward, or is it waiting on someone else? https://github.com/llvm/llvm-project/pull/106993
[llvm-branch-commits] [clang] [Clang] Fix handling of placeholder variables name in init captures (#107055) (PR #107214)
https://github.com/cor3ntin updated https://github.com/llvm/llvm-project/pull/107214 >From 8290ce0998788b6a575ed7b4988b093f48c25b3d Mon Sep 17 00:00:00 2001 From: cor3ntin Date: Tue, 3 Sep 2024 20:36:15 +0200 Subject: [PATCH] [Clang] Fix handling of placeholder variables name in init captures (#107055) We were incorrectly not deduplicating results when looking up `_` which, for a lambda init capture, would result in an ambiguous lookup. The same bug caused some diagnostic notes to be emitted twice. Fixes #107024 --- clang/docs/ReleaseNotes.rst | 1 + clang/lib/Sema/SemaLambda.cpp | 1 - clang/lib/Sema/SemaLookup.cpp | 2 +- clang/test/SemaCXX/cxx2c-placeholder-vars.cpp | 6 -- 4 files changed, 6 insertions(+), 4 deletions(-) diff --git a/clang/docs/ReleaseNotes.rst b/clang/docs/ReleaseNotes.rst index 53d819c6c44574..8c7a6ba70acd28 100644 --- a/clang/docs/ReleaseNotes.rst +++ b/clang/docs/ReleaseNotes.rst @@ -1122,6 +1122,7 @@ Bug Fixes to C++ Support - Fixed a crash-on-invalid bug involving extraneous template parameter with concept substitution. (#GH73885) - Fixed assertion failure by skipping the analysis of an invalid field declaration. (#GH99868) - Fix an issue with dependent source location expressions (#GH106428), (#GH81155), (#GH80210), (#GH85373) +- Fix handling of ``_`` as the name of a lambda's init capture variable. (#GH107024) Bug Fixes to AST Handling diff --git a/clang/lib/Sema/SemaLambda.cpp b/clang/lib/Sema/SemaLambda.cpp index 601077e9f3334d..809b94bb7412b9 100644 --- a/clang/lib/Sema/SemaLambda.cpp +++ b/clang/lib/Sema/SemaLambda.cpp @@ -1318,7 +1318,6 @@ void Sema::ActOnLambdaExpressionAfterIntroducer(LambdaIntroducer &Intro, if (C->Init.isUsable()) { addInitCapture(LSI, cast(Var), C->Kind == LCK_ByRef); - PushOnScopeChains(Var, CurScope, false); } else { TryCaptureKind Kind = C->Kind == LCK_ByRef ? 
TryCapture_ExplicitByRef : TryCapture_ExplicitByVal; diff --git a/clang/lib/Sema/SemaLookup.cpp b/clang/lib/Sema/SemaLookup.cpp index 7a6a64529f52ec..d3d4bf27ae7283 100644 --- a/clang/lib/Sema/SemaLookup.cpp +++ b/clang/lib/Sema/SemaLookup.cpp @@ -570,7 +570,7 @@ void LookupResult::resolveKind() { // For non-type declarations, check for a prior lookup result naming this // canonical declaration. -if (!D->isPlaceholderVar(getSema().getLangOpts()) && !ExistingI) { +if (!ExistingI) { auto UniqueResult = Unique.insert(std::make_pair(D, I)); if (!UniqueResult.second) { // We've seen this entity before. diff --git a/clang/test/SemaCXX/cxx2c-placeholder-vars.cpp b/clang/test/SemaCXX/cxx2c-placeholder-vars.cpp index 5cf66b48784e91..29ca3b5ef3df72 100644 --- a/clang/test/SemaCXX/cxx2c-placeholder-vars.cpp +++ b/clang/test/SemaCXX/cxx2c-placeholder-vars.cpp @@ -50,14 +50,16 @@ void f() { void lambda() { (void)[_ = 0, _ = 1] { // expected-warning {{placeholder variables are incompatible with C++ standards before C++2c}} \ - // expected-note 4{{placeholder declared here}} + // expected-note 2{{placeholder declared here}} (void)_++; // expected-error {{ambiguous reference to placeholder '_', which is defined multiple times}} }; { int _ = 12; -(void)[_ = 0]{}; // no warning (different scope) +(void)[_ = 0]{ return _;}; // no warning (different scope) } + +auto GH107024 = [_ = 42]() { return _; }(); } namespace global_var { ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
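The `SemaLookup.cpp` hunk above makes `LookupResult::resolveKind` deduplicate placeholder declarations the same way it does other non-type declarations. As a rough illustration only, the standalone sketch below models that deduplication with standard containers. `Decl`, `resolve`, and `isAmbiguous` are hypothetical stand-ins and not Clang's actual types or API.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <unordered_map>
#include <vector>

// Hypothetical stand-in for a declaration; not Clang's Decl class.
struct Decl {
  std::string name;
};

// Keep the first occurrence of each declaration and drop repeats,
// mirroring the "have we seen this entity before?" check in the patch.
std::vector<const Decl *> resolve(const std::vector<const Decl *> &found) {
  std::unordered_map<const Decl *, std::size_t> unique;
  std::vector<const Decl *> result;
  for (const Decl *d : found) {
    assert(d && "expected a non-null declaration");
    // insert() reports through .second whether the key was newly added.
    auto inserted = unique.insert({d, result.size()});
    if (inserted.second)
      result.push_back(d);
  }
  return result;
}

// In this model a lookup is ambiguous only if distinct declarations remain.
bool isAmbiguous(const std::vector<const Decl *> &found) {
  return resolve(found).size() > 1;
}
```

In this model, finding the same `_` twice collapses to a single result, while two distinct `_` declarations (as in `[_ = 0, _ = 1]` from the test) remain ambiguous.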
[llvm-branch-commits] [clang] [Clang] Fix handling of placeholder variables name in init captures (#107055) (PR #107214)
https://github.com/cor3ntin closed https://github.com/llvm/llvm-project/pull/107214
[llvm-branch-commits] [clang] [Clang] Fix handling of placeholder variables name in init captures (#107055) (PR #107214)
cor3ntin wrote: @tru done https://github.com/llvm/llvm-project/pull/107214
[llvm-branch-commits] [clang] [Clang] Fix handling of placeholder variables name in init captures (#107055) (PR #107214)
https://github.com/cor3ntin reopened https://github.com/llvm/llvm-project/pull/107214
[llvm-branch-commits] [mlir] [mlir][Transforms][NFC] Dialect conversion: Cache `UnresolvedMaterializationRewrite` (PR #108359)
https://github.com/matthias-springer created https://github.com/llvm/llvm-project/pull/108359 The dialect conversion maintains a set of unresolved materializations (`UnrealizedConversionCastOp`). Turn that set into a `DenseMap` that maps from ops to `UnresolvedMaterializationRewrite *`. This improves efficiency a bit, because an iteration over `ConversionPatternRewriterImpl::rewrites` can be avoided. Also delete some dead code. >From a9c69d1733662b3299bd3f4d41982422640dc034 Mon Sep 17 00:00:00 2001 From: Matthias Springer Date: Thu, 12 Sep 2024 12:45:44 +0200 Subject: [PATCH] [mlir][Transforms][NFC] Dialect conversion: Cache `UnresolvedMaterializationRewrite` The dialect conversion already maintains a set of unresolved materializations (`UnrealizedConversionCastOp`). Turn that set into a map that maps from ops to `UnresolvedMaterializationRewrite *`. This improves efficiency a bit, because an iteration over `ConversionPatternRewriterImpl::rewrites` can be avoided. Also delete some dead code. 
--- .../Transforms/Utils/DialectConversion.cpp| 60 +++ 1 file changed, 20 insertions(+), 40 deletions(-) diff --git a/mlir/lib/Transforms/Utils/DialectConversion.cpp b/mlir/lib/Transforms/Utils/DialectConversion.cpp index b58a95c3baf70a..ed15b571f01883 100644 --- a/mlir/lib/Transforms/Utils/DialectConversion.cpp +++ b/mlir/lib/Transforms/Utils/DialectConversion.cpp @@ -688,9 +688,7 @@ class UnresolvedMaterializationRewrite : public OperationRewrite { UnresolvedMaterializationRewrite( ConversionPatternRewriterImpl &rewriterImpl, UnrealizedConversionCastOp op, const TypeConverter *converter = nullptr, - MaterializationKind kind = MaterializationKind::Target) - : OperationRewrite(Kind::UnresolvedMaterialization, rewriterImpl, op), -converterAndKind(converter, kind) {} + MaterializationKind kind = MaterializationKind::Target); static bool classof(const IRRewrite *rewrite) { return rewrite->getKind() == Kind::UnresolvedMaterialization; @@ -730,26 +728,6 @@ static bool hasRewrite(R &&rewrites, Operation *op) { }); } -/// Find the single rewrite object of the specified type and block among the -/// given rewrites. In debug mode, asserts that there is mo more than one such -/// object. Return "nullptr" if no object was found. 
-template -static RewriteTy *findSingleRewrite(R &&rewrites, Block *block) { - RewriteTy *result = nullptr; - for (auto &rewrite : rewrites) { -auto *rewriteTy = dyn_cast(rewrite.get()); -if (rewriteTy && rewriteTy->getBlock() == block) { -#ifndef NDEBUG - assert(!result && "expected single matching rewrite"); - result = rewriteTy; -#else - return rewriteTy; -#endif // NDEBUG -} - } - return result; -} - //===--===// // ConversionPatternRewriterImpl //===--===// @@ -892,10 +870,6 @@ struct ConversionPatternRewriterImpl : public RewriterBase::Listener { bool wasErased(void *ptr) const { return erased.contains(ptr); } -bool wasErased(OperationRewrite *rewrite) const { - return wasErased(rewrite->getOperation()); -} - void notifyOperationErased(Operation *op) override { erased.insert(op); } void notifyBlockErased(Block *block) override { erased.insert(block); } @@ -935,8 +909,10 @@ struct ConversionPatternRewriterImpl : public RewriterBase::Listener { /// to modify/access them is invalid rewriter API usage. SetVector replacedOps; - /// A set of all unresolved materializations. - DenseSet unresolvedMaterializations; + /// A mapping of all unresolved materializations (UnrealizedConversionCastOp) + /// to the corresponding rewrite objects. + DenseMap + unresolvedMaterializations; /// The current type converter, or nullptr if no type converter is currently /// active. 
@@ -1058,6 +1034,14 @@ void CreateOperationRewrite::rollback() { op->erase(); } +UnresolvedMaterializationRewrite::UnresolvedMaterializationRewrite( +ConversionPatternRewriterImpl &rewriterImpl, UnrealizedConversionCastOp op, +const TypeConverter *converter, MaterializationKind kind) +: OperationRewrite(Kind::UnresolvedMaterialization, rewriterImpl, op), + converterAndKind(converter, kind) { + rewriterImpl.unresolvedMaterializations[op] = this; +} + void UnresolvedMaterializationRewrite::rollback() { if (getMaterializationKind() == MaterializationKind::Target) { for (Value input : op->getOperands()) @@ -1345,7 +1329,6 @@ Value ConversionPatternRewriterImpl::buildUnresolvedMaterialization( builder.setInsertionPoint(ip.getBlock(), ip.getPoint()); auto convertOp = builder.create(loc, outputType, inputs); - unresolvedMaterializations.insert(convertOp); appendRewrite(convertOp, converter, kind); return convertOp.getResult(0); } @@ -2499,15 +2482,12 @@ LogicalResult Operati
[llvm-branch-commits] [mlir] [mlir][Transforms][NFC] Dialect conversion: Cache `UnresolvedMaterializationRewrite` (PR #108359)
llvmbot wrote: @llvm/pr-subscribers-mlir-core Author: Matthias Springer (matthias-springer) Changes The dialect conversion maintains a set of unresolved materializations (`UnrealizedConversionCastOp`). Turn that set into a `DenseMap` that maps from ops to `UnresolvedMaterializationRewrite *`. This improves efficiency a bit, because an iteration over `ConversionPatternRewriterImpl::rewrites` can be avoided. Also delete some dead code. --- Full diff: https://github.com/llvm/llvm-project/pull/108359.diff 1 Files Affected: - (modified) mlir/lib/Transforms/Utils/DialectConversion.cpp (+20-40) ``diff diff --git a/mlir/lib/Transforms/Utils/DialectConversion.cpp b/mlir/lib/Transforms/Utils/DialectConversion.cpp index b58a95c3baf70a..ed15b571f01883 100644 --- a/mlir/lib/Transforms/Utils/DialectConversion.cpp +++ b/mlir/lib/Transforms/Utils/DialectConversion.cpp @@ -688,9 +688,7 @@ class UnresolvedMaterializationRewrite : public OperationRewrite { UnresolvedMaterializationRewrite( ConversionPatternRewriterImpl &rewriterImpl, UnrealizedConversionCastOp op, const TypeConverter *converter = nullptr, - MaterializationKind kind = MaterializationKind::Target) - : OperationRewrite(Kind::UnresolvedMaterialization, rewriterImpl, op), -converterAndKind(converter, kind) {} + MaterializationKind kind = MaterializationKind::Target); static bool classof(const IRRewrite *rewrite) { return rewrite->getKind() == Kind::UnresolvedMaterialization; @@ -730,26 +728,6 @@ static bool hasRewrite(R &&rewrites, Operation *op) { }); } -/// Find the single rewrite object of the specified type and block among the -/// given rewrites. In debug mode, asserts that there is mo more than one such -/// object. Return "nullptr" if no object was found. 
-template -static RewriteTy *findSingleRewrite(R &&rewrites, Block *block) { - RewriteTy *result = nullptr; - for (auto &rewrite : rewrites) { -auto *rewriteTy = dyn_cast(rewrite.get()); -if (rewriteTy && rewriteTy->getBlock() == block) { -#ifndef NDEBUG - assert(!result && "expected single matching rewrite"); - result = rewriteTy; -#else - return rewriteTy; -#endif // NDEBUG -} - } - return result; -} - //===--===// // ConversionPatternRewriterImpl //===--===// @@ -892,10 +870,6 @@ struct ConversionPatternRewriterImpl : public RewriterBase::Listener { bool wasErased(void *ptr) const { return erased.contains(ptr); } -bool wasErased(OperationRewrite *rewrite) const { - return wasErased(rewrite->getOperation()); -} - void notifyOperationErased(Operation *op) override { erased.insert(op); } void notifyBlockErased(Block *block) override { erased.insert(block); } @@ -935,8 +909,10 @@ struct ConversionPatternRewriterImpl : public RewriterBase::Listener { /// to modify/access them is invalid rewriter API usage. SetVector replacedOps; - /// A set of all unresolved materializations. - DenseSet unresolvedMaterializations; + /// A mapping of all unresolved materializations (UnrealizedConversionCastOp) + /// to the corresponding rewrite objects. + DenseMap + unresolvedMaterializations; /// The current type converter, or nullptr if no type converter is currently /// active. 
@@ -1058,6 +1034,14 @@ void CreateOperationRewrite::rollback() { op->erase(); } +UnresolvedMaterializationRewrite::UnresolvedMaterializationRewrite( +ConversionPatternRewriterImpl &rewriterImpl, UnrealizedConversionCastOp op, +const TypeConverter *converter, MaterializationKind kind) +: OperationRewrite(Kind::UnresolvedMaterialization, rewriterImpl, op), + converterAndKind(converter, kind) { + rewriterImpl.unresolvedMaterializations[op] = this; +} + void UnresolvedMaterializationRewrite::rollback() { if (getMaterializationKind() == MaterializationKind::Target) { for (Value input : op->getOperands()) @@ -1345,7 +1329,6 @@ Value ConversionPatternRewriterImpl::buildUnresolvedMaterialization( builder.setInsertionPoint(ip.getBlock(), ip.getPoint()); auto convertOp = builder.create(loc, outputType, inputs); - unresolvedMaterializations.insert(convertOp); appendRewrite(convertOp, converter, kind); return convertOp.getResult(0); } @@ -2499,15 +2482,12 @@ LogicalResult OperationConverter::convertOperations(ArrayRef ops) { // Gather all unresolved materializations. SmallVector allCastOps; - DenseMap rewriteMap; - for (std::unique_ptr &rewrite : rewriterImpl.rewrites) { -auto *mat = dyn_cast(rewrite.get()); -if (!mat) - continue; -if (rewriterImpl.eraseRewriter.wasErased(mat)) + const DenseMap + &materializations = rewriterImpl.unresolvedMaterializations; + for (auto it : materializations) { +if (rewriterImpl.eraseRewriter.wasErased(it.first))
[llvm-branch-commits] [compiler-rt] [TySan] Fixed false positive when accessing offset member variables (PR #95387)
https://github.com/gbMattN updated https://github.com/llvm/llvm-project/pull/95387 >From 8099113d68bd7c47c29f635bb10a048ddb99833b Mon Sep 17 00:00:00 2001 From: Matthew Nagy Date: Fri, 28 Jun 2024 16:12:31 + Subject: [PATCH 1/2] [TySan] Fixed false positive when accessing global object's member variables --- compiler-rt/lib/tysan/tysan.cpp | 19 +++- .../test/tysan/global-struct-members.c| 31 +++ 2 files changed, 49 insertions(+), 1 deletion(-) create mode 100644 compiler-rt/test/tysan/global-struct-members.c diff --git a/compiler-rt/lib/tysan/tysan.cpp b/compiler-rt/lib/tysan/tysan.cpp index f627851d049e6a..8235b0ec2b55e7 100644 --- a/compiler-rt/lib/tysan/tysan.cpp +++ b/compiler-rt/lib/tysan/tysan.cpp @@ -221,7 +221,24 @@ __tysan_check(void *addr, int size, tysan_type_descriptor *td, int flags) { OldTDPtr -= i; OldTD = *OldTDPtr; -if (!isAliasingLegal(td, OldTD)) +// When shadow memory is set for global objects, the entire object is tagged with the struct type +// This means that when you access a member variable, tysan reads that as you accessing a struct midway +// through, with 'i' being the offset +// Therefore, if you are accessing a struct, we need to find the member type. We can go through the +// members of the struct type and see if there is a member at the offset you are accessing the struct by. +// If there is indeed a member starting at offset 'i' in the struct, we should check aliasing legality +// with that type. If there isn't, we run alias checking on the struct with will give us the correct error. 
+tysan_type_descriptor *InternalMember = OldTD; +if (OldTD->Tag == TYSAN_STRUCT_TD) { + for (int j = 0; j < OldTD->Struct.MemberCount; j++) { +if (OldTD->Struct.Members[j].Offset == i) { + InternalMember = OldTD->Struct.Members[j].Type; + break; +} + } +} + +if (!isAliasingLegal(td, InternalMember)) reportError(addr, size, td, OldTD, AccessStr, "accesses part of an existing object", -i, pc, bp, sp); diff --git a/compiler-rt/test/tysan/global-struct-members.c b/compiler-rt/test/tysan/global-struct-members.c new file mode 100644 index 00..76ea3c431dd7bc --- /dev/null +++ b/compiler-rt/test/tysan/global-struct-members.c @@ -0,0 +1,31 @@ +// RUN: %clang_tysan -O0 %s -o %t && %run %t >%t.out 2>&1 +// RUN: FileCheck %s < %t.out + +#include + +struct X { + int a, b, c; +} x; + +static struct X xArray[2]; + +int main() { + x.a = 1; + x.b = 2; + x.c = 3; + + printf("%d %d %d\n", x.a, x.b, x.c); + // CHECK-NOT: ERROR: TypeSanitizer: type-aliasing-violation + + for (size_t i = 0; i < 2; i++) { +xArray[i].a = 1; +xArray[i].b = 1; +xArray[i].c = 1; + } + + struct X *xPtr = (struct X *)&(xArray[0].c); + xPtr->a = 1; + // CHECK: ERROR: TypeSanitizer: type-aliasing-violation + // CHECK: WRITE of size 4 at {{.*}} with type int (in X at offset 0) accesses an existing object of type int (in X at offset 8) + // CHECK: {{#0 0x.* in main .*struct-members.c:}}[[@LINE-3]] +} >From 83a368867533e316b4272c19d0bf61da842c5b4b Mon Sep 17 00:00:00 2001 From: Matthew Nagy Date: Thu, 12 Sep 2024 10:52:19 + Subject: [PATCH 2/2] Fix more member offset bugs --- compiler-rt/lib/tysan/tysan.cpp | 25 +-- .../tysan/struct-offset-different-base.cpp| 31 +++ 2 files changed, 47 insertions(+), 9 deletions(-) create mode 100644 compiler-rt/test/tysan/struct-offset-different-base.cpp diff --git a/compiler-rt/lib/tysan/tysan.cpp b/compiler-rt/lib/tysan/tysan.cpp index 8235b0ec2b55e7..abad429de7ed9b 100644 --- a/compiler-rt/lib/tysan/tysan.cpp +++ b/compiler-rt/lib/tysan/tysan.cpp @@ -128,8 +128,13 @@ static 
bool isAliasingLegalUp(tysan_type_descriptor *TDA, break; } - OffsetA -= TDA->Struct.Members[Idx].Offset; - TDA = TDA->Struct.Members[Idx].Type; + if (TDA->Struct.Members[Idx].Offset > OffsetA) { +OffsetA = TDA->Struct.Members[Idx].Offset - OffsetA; +TDA = TDA->Struct.Members[Idx - 1].Type; + } else { +OffsetA -= TDA->Struct.Members[Idx].Offset; +TDA = TDA->Struct.Members[Idx].Type; + } } else { DCHECK(0); break; @@ -221,13 +226,15 @@ __tysan_check(void *addr, int size, tysan_type_descriptor *td, int flags) { OldTDPtr -= i; OldTD = *OldTDPtr; -// When shadow memory is set for global objects, the entire object is tagged with the struct type -// This means that when you access a member variable, tysan reads that as you accessing a struct midway -// through, with 'i' being the offset -// Therefore, if you are accessing a struct, we need to find the member type. We can go through the -// members of the struct type and see if there is a member at the offset you are accessing the struct by. -// If there is indeed a m
[llvm-branch-commits] [mlir] [mlir][Transforms][NFC] Dialect conversion: Cache `UnresolvedMaterializationRewrite` (PR #108359)
@@ -935,8 +909,10 @@ struct ConversionPatternRewriterImpl : public RewriterBase::Listener { /// to modify/access them is invalid rewriter API usage. SetVector replacedOps; - /// A set of all unresolved materializations. - DenseSet unresolvedMaterializations; + /// A mapping of all unresolved materializations (UnrealizedConversionCastOp) + /// to the corresponding rewrite objects. + DenseMap
joker-eph wrote: Can the key be `UnrealizedConversionCastOp` directly? https://github.com/llvm/llvm-project/pull/108359
[llvm-branch-commits] [mlir] [mlir][Transforms][NFC] Dialect conversion: Cache `UnresolvedMaterializationRewrite` (PR #108359)
https://github.com/joker-eph approved this pull request. https://github.com/llvm/llvm-project/pull/108359
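The caching idea in PR #108359 (the rewrite object registers itself in a map keyed by its op, so later code can look it up without scanning the full rewrite list) can be sketched outside MLIR as follows. This is a minimal illustration under stated assumptions: `Op`, `Rewriter`, `MaterializationRewrite`, `create`, and `lookup` are hypothetical names, and `std::unordered_map` stands in for `DenseMap`.

```cpp
#include <cassert>
#include <memory>
#include <unordered_map>
#include <vector>

// Hypothetical stand-ins; not MLIR's Operation or rewrite classes.
struct Op {
  int id;
};

struct Rewriter;

struct MaterializationRewrite {
  MaterializationRewrite(Rewriter &rewriter, Op *op);
  Op *op;
};

struct Rewriter {
  // All rewrites, in creation order (owning).
  std::vector<std::unique_ptr<MaterializationRewrite>> rewrites;
  // Cache: op -> its rewrite object, filled by the constructor below.
  std::unordered_map<Op *, MaterializationRewrite *> unresolved;

  MaterializationRewrite *create(Op *op) {
    rewrites.push_back(std::make_unique<MaterializationRewrite>(*this, op));
    return rewrites.back().get();
  }

  // Hash lookup instead of a linear scan over `rewrites`.
  MaterializationRewrite *lookup(Op *op) const {
    auto it = unresolved.find(op);
    return it == unresolved.end() ? nullptr : it->second;
  }
};

// The rewrite registers itself, so no call site has to remember to do it.
MaterializationRewrite::MaterializationRewrite(Rewriter &rewriter, Op *op)
    : op(op) {
  assert(op && "expected a non-null op");
  rewriter.unresolved[op] = this;
}
```

Registering in the constructor is what lets the patch drop the separate `unresolvedMaterializations.insert(convertOp)` call at the creation site.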
[llvm-branch-commits] [compiler-rt] [TySan] Fixed false positive when accessing offset member variables (PR #95387)
https://github.com/gbMattN updated https://github.com/llvm/llvm-project/pull/95387 >From 8099113d68bd7c47c29f635bb10a048ddb99833b Mon Sep 17 00:00:00 2001 From: Matthew Nagy Date: Fri, 28 Jun 2024 16:12:31 + Subject: [PATCH] [TySan] Fixed false positive when accessing global object's member variables --- compiler-rt/lib/tysan/tysan.cpp | 19 +++- .../test/tysan/global-struct-members.c| 31 +++ 2 files changed, 49 insertions(+), 1 deletion(-) create mode 100644 compiler-rt/test/tysan/global-struct-members.c diff --git a/compiler-rt/lib/tysan/tysan.cpp b/compiler-rt/lib/tysan/tysan.cpp index f627851d049e6a..8235b0ec2b55e7 100644 --- a/compiler-rt/lib/tysan/tysan.cpp +++ b/compiler-rt/lib/tysan/tysan.cpp @@ -221,7 +221,24 @@ __tysan_check(void *addr, int size, tysan_type_descriptor *td, int flags) { OldTDPtr -= i; OldTD = *OldTDPtr; -if (!isAliasingLegal(td, OldTD)) +// When shadow memory is set for global objects, the entire object is tagged with the struct type +// This means that when you access a member variable, tysan reads that as you accessing a struct midway +// through, with 'i' being the offset +// Therefore, if you are accessing a struct, we need to find the member type. We can go through the +// members of the struct type and see if there is a member at the offset you are accessing the struct by. +// If there is indeed a member starting at offset 'i' in the struct, we should check aliasing legality +// with that type. If there isn't, we run alias checking on the struct with will give us the correct error. 
+tysan_type_descriptor *InternalMember = OldTD; +if (OldTD->Tag == TYSAN_STRUCT_TD) { + for (int j = 0; j < OldTD->Struct.MemberCount; j++) { +if (OldTD->Struct.Members[j].Offset == i) { + InternalMember = OldTD->Struct.Members[j].Type; + break; +} + } +} + +if (!isAliasingLegal(td, InternalMember)) reportError(addr, size, td, OldTD, AccessStr, "accesses part of an existing object", -i, pc, bp, sp); diff --git a/compiler-rt/test/tysan/global-struct-members.c b/compiler-rt/test/tysan/global-struct-members.c new file mode 100644 index 00..76ea3c431dd7bc --- /dev/null +++ b/compiler-rt/test/tysan/global-struct-members.c @@ -0,0 +1,31 @@ +// RUN: %clang_tysan -O0 %s -o %t && %run %t >%t.out 2>&1 +// RUN: FileCheck %s < %t.out + +#include + +struct X { + int a, b, c; +} x; + +static struct X xArray[2]; + +int main() { + x.a = 1; + x.b = 2; + x.c = 3; + + printf("%d %d %d\n", x.a, x.b, x.c); + // CHECK-NOT: ERROR: TypeSanitizer: type-aliasing-violation + + for (size_t i = 0; i < 2; i++) { +xArray[i].a = 1; +xArray[i].b = 1; +xArray[i].c = 1; + } + + struct X *xPtr = (struct X *)&(xArray[0].c); + xPtr->a = 1; + // CHECK: ERROR: TypeSanitizer: type-aliasing-violation + // CHECK: WRITE of size 4 at {{.*}} with type int (in X at offset 0) accesses an existing object of type int (in X at offset 8) + // CHECK: {{#0 0x.* in main .*struct-members.c:}}[[@LINE-3]] +} ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
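The member lookup that this patch adds to `__tysan_check` can be read in isolation as: if the old descriptor is a struct, find the member that starts exactly at the access offset, otherwise keep the struct itself. The sketch below restates that logic with simplified types. `TypeDesc`, `Member`, and `memberAtOffset` are hypothetical and do not match the real `tysan_type_descriptor` layout.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical, simplified descriptor; the real tysan_type_descriptor differs.
struct TypeDesc;

struct Member {
  std::size_t offset;   // byte offset of the member inside the struct
  const TypeDesc *type; // descriptor of the member's type
};

struct TypeDesc {
  bool isStruct;
  std::vector<Member> members; // empty for scalar types
};

// Return the type of the member that starts exactly at `offset`. If no
// member starts there, fall back to the struct itself so that the usual
// alias check still runs against the struct and reports the violation.
const TypeDesc *memberAtOffset(const TypeDesc *td, std::size_t offset) {
  assert(td && "expected a type descriptor");
  if (td->isStruct) {
    for (const Member &m : td->members)
      if (m.offset == offset)
        return m.type;
  }
  return td;
}
```

For a layout like `struct X { int a, b, c; }` with 4-byte `int` (an assumption about the target), offsets 0, 4, and 8 resolve to the `int` descriptor, while an access at offset 2 falls back to the struct descriptor.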
[llvm-branch-commits] [llvm] release/19.x: [AVR] Fix 16-bit LDDs with immediate overflows (#104923) (PR #106993)
aykevl wrote: I think it's up to the release managers now to merge this PR. https://github.com/llvm/llvm-project/pull/106993
[llvm-branch-commits] [mlir] [mlir][Transforms] Dialect conversion: Unify materialization of value replacements (PR #108381)
https://github.com/matthias-springer created https://github.com/llvm/llvm-project/pull/108381 PR #106760 aligned the handling of dropped block arguments and dropped op results. The two helper functions that insert source materializations for uses of replaced block arguments / op results that survived the conversion are now almost identical (`legalizeConvertedArgumentTypes` and `legalizeConvertedOpResultTypes`). This PR merges the two functions and moves the implementation directly into `finalize`. This PR simplifies the code base and improves the efficiency a bit: previously, `finalize` iterated over `ConversionPatternRewriterImpl::rewrites` twice. Now, only one iteration is needed. >From 1f215ac7861a76f653c9911a31bf484a5fd6dac4 Mon Sep 17 00:00:00 2001 From: Matthias Springer Date: Thu, 12 Sep 2024 14:49:23 +0200 Subject: [PATCH] [mlir][Transforms] Dialect conversion: Unify materialization of value replacements PR #106760 aligned the handling of dropped block arguments and dropped op results. The two helper functions that insert source materializations for uses of replaced block arguments / op results that survived the conversion are now almost identical (`legalizeConvertedArgumentTypes` and `legalizeConvertedOpResultTypes`). This PR merges the two functions and moves the implementation directly into `finalize`. This PR simplifies the code base and improves the efficiency a bit: previously, `finalize` iterates over `ConversionPatternRewriterImpl::rewrites` twice. Now, only one iteration is needed. 
--- .../Transforms/Utils/DialectConversion.cpp| 134 ++ .../VectorToSPIRV/vector-to-spirv.mlir| 4 +- 2 files changed, 44 insertions(+), 94 deletions(-) diff --git a/mlir/lib/Transforms/Utils/DialectConversion.cpp b/mlir/lib/Transforms/Utils/DialectConversion.cpp index ed15b571f01883..0556b4ab833c30 100644 --- a/mlir/lib/Transforms/Utils/DialectConversion.cpp +++ b/mlir/lib/Transforms/Utils/DialectConversion.cpp @@ -2336,17 +2336,6 @@ struct OperationConverter { /// remaining artifacts and complete the conversion. LogicalResult finalize(ConversionPatternRewriter &rewriter); - /// Legalize the types of converted block arguments. - LogicalResult - legalizeConvertedArgumentTypes(ConversionPatternRewriter &rewriter, - ConversionPatternRewriterImpl &rewriterImpl); - - /// Legalize the types of converted op results. - LogicalResult legalizeConvertedOpResultTypes( - ConversionPatternRewriter &rewriter, - ConversionPatternRewriterImpl &rewriterImpl, - DenseMap> &inverseMapping); - /// Dialect conversion configuration. ConversionConfig config; @@ -2510,19 +2499,6 @@ LogicalResult OperationConverter::convertOperations(ArrayRef ops) { return success(); } -LogicalResult -OperationConverter::finalize(ConversionPatternRewriter &rewriter) { - ConversionPatternRewriterImpl &rewriterImpl = rewriter.getImpl(); - if (failed(legalizeConvertedArgumentTypes(rewriter, rewriterImpl))) -return failure(); - DenseMap> inverseMapping = - rewriterImpl.mapping.getInverse(); - if (failed(legalizeConvertedOpResultTypes(rewriter, rewriterImpl, -inverseMapping))) -return failure(); - return success(); -} - /// Finds a user of the given value, or of any other value that the given value /// replaced, that was not replaced in the conversion process. 
static Operation *findLiveUserOfReplaced( @@ -2546,87 +2522,61 @@ static Operation *findLiveUserOfReplaced( return nullptr; } -LogicalResult OperationConverter::legalizeConvertedOpResultTypes( -ConversionPatternRewriter &rewriter, -ConversionPatternRewriterImpl &rewriterImpl, -DenseMap> &inverseMapping) { - // Process requested operation replacements. - for (unsigned i = 0; i < rewriterImpl.rewrites.size(); ++i) { -auto *opReplacement = -dyn_cast(rewriterImpl.rewrites[i].get()); -if (!opReplacement) - continue; -Operation *op = opReplacement->getOperation(); -for (OpResult result : op->getResults()) { - // If the type of this op result changed and the result is still live, - // we need to materialize a conversion. - if (rewriterImpl.mapping.lookupOrNull(result, result.getType())) +/// Helper function that returns the replaced values and the type converter if +/// the given rewrite object is an "operation replacement" or a "block type +/// conversion" (which corresponds to a "block replacement"). Otherwise, return +/// an empty ValueRange and a null type converter pointer. +static std::pair +getReplacedValues(IRRewrite *rewrite) { + if (auto *opRewrite = dyn_cast(rewrite)) +return std::make_pair(opRewrite->getOperation()->getResults(), + opRewrite->getConverter()); + if (auto *blockRewrite = dyn_cast(rewrite)) +return std::make_pair(blockRewrite->getOrigBlock()->getArguments(), + blockRewri
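The merge described in this PR can be pictured with a small standalone sketch. This is not the MLIR implementation: `Rewrite`, `OpReplacement`, `BlockConversion`, `OtherRewrite`, and `countReplacedValues` are hypothetical stand-ins, values are plain strings, and the type converter that the real helper also returns is left out. It only illustrates how a helper that maps either kind of rewrite to its replaced values lets a single loop do the work that previously took two.

```cpp
#include <cstddef>
#include <string>
#include <variant>
#include <vector>

// Hypothetical stand-ins for the rewrite kinds; values are just names here.
struct OpReplacement { std::vector<std::string> results; };
struct BlockConversion { std::vector<std::string> arguments; };
struct OtherRewrite {};
using Rewrite = std::variant<OpReplacement, BlockConversion, OtherRewrite>;

// Returns the values replaced by a rewrite, or an empty list if the rewrite
// is neither an op replacement nor a block conversion.
std::vector<std::string> getReplacedValues(const Rewrite &rewrite) {
  if (const auto *op = std::get_if<OpReplacement>(&rewrite))
    return op->results;
  if (const auto *block = std::get_if<BlockConversion>(&rewrite))
    return block->arguments;
  return {};
}

// One pass over all rewrites handles op results and block arguments alike.
std::size_t countReplacedValues(const std::vector<Rewrite> &rewrites) {
  std::size_t count = 0;
  for (const Rewrite &rewrite : rewrites)
    count += getReplacedValues(rewrite).size();
  return count;
}
```

In the sketch, the per-value work is reduced to counting; in the PR, that position is where the liveness check and the source materialization happen.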
[llvm-branch-commits] [mlir] [mlir][Transforms] Dialect conversion: Unify materialization of value replacements (PR #108381)
llvmbot wrote: @llvm/pr-subscribers-mlir-spirv @llvm/pr-subscribers-mlir @llvm/pr-subscribers-mlir-core Author: Matthias Springer (matthias-springer) Changes PR #106760 aligned the handling of dropped block arguments and dropped op results. The two helper functions that insert source materializations for uses of replaced block arguments / op results that survived the conversion are now almost identical (`legalizeConvertedArgumentTypes` and `legalizeConvertedOpResultTypes`). This PR merges the two functions and moves the implementation directly into `finalize`. This PR simplifies the code base and improves the efficiency a bit: previously, `finalize` iterated over `ConversionPatternRewriterImpl::rewrites` twice. Now, only one iteration is needed. --- Full diff: https://github.com/llvm/llvm-project/pull/108381.diff 2 Files Affected: - (modified) mlir/lib/Transforms/Utils/DialectConversion.cpp (+42-92) - (modified) mlir/test/Conversion/VectorToSPIRV/vector-to-spirv.mlir (+2-2) ``diff diff --git a/mlir/lib/Transforms/Utils/DialectConversion.cpp b/mlir/lib/Transforms/Utils/DialectConversion.cpp index ed15b571f01883..0556b4ab833c30 100644 --- a/mlir/lib/Transforms/Utils/DialectConversion.cpp +++ b/mlir/lib/Transforms/Utils/DialectConversion.cpp @@ -2336,17 +2336,6 @@ struct OperationConverter { /// remaining artifacts and complete the conversion. LogicalResult finalize(ConversionPatternRewriter &rewriter); - /// Legalize the types of converted block arguments. - LogicalResult - legalizeConvertedArgumentTypes(ConversionPatternRewriter &rewriter, - ConversionPatternRewriterImpl &rewriterImpl); - - /// Legalize the types of converted op results. - LogicalResult legalizeConvertedOpResultTypes( - ConversionPatternRewriter &rewriter, - ConversionPatternRewriterImpl &rewriterImpl, - DenseMap> &inverseMapping); - /// Dialect conversion configuration. 
ConversionConfig config; @@ -2510,19 +2499,6 @@ LogicalResult OperationConverter::convertOperations(ArrayRef ops) { return success(); } -LogicalResult -OperationConverter::finalize(ConversionPatternRewriter &rewriter) { - ConversionPatternRewriterImpl &rewriterImpl = rewriter.getImpl(); - if (failed(legalizeConvertedArgumentTypes(rewriter, rewriterImpl))) -return failure(); - DenseMap> inverseMapping = - rewriterImpl.mapping.getInverse(); - if (failed(legalizeConvertedOpResultTypes(rewriter, rewriterImpl, -inverseMapping))) -return failure(); - return success(); -} - /// Finds a user of the given value, or of any other value that the given value /// replaced, that was not replaced in the conversion process. static Operation *findLiveUserOfReplaced( @@ -2546,87 +2522,61 @@ static Operation *findLiveUserOfReplaced( return nullptr; } -LogicalResult OperationConverter::legalizeConvertedOpResultTypes( -ConversionPatternRewriter &rewriter, -ConversionPatternRewriterImpl &rewriterImpl, -DenseMap> &inverseMapping) { - // Process requested operation replacements. - for (unsigned i = 0; i < rewriterImpl.rewrites.size(); ++i) { -auto *opReplacement = -dyn_cast(rewriterImpl.rewrites[i].get()); -if (!opReplacement) - continue; -Operation *op = opReplacement->getOperation(); -for (OpResult result : op->getResults()) { - // If the type of this op result changed and the result is still live, - // we need to materialize a conversion. - if (rewriterImpl.mapping.lookupOrNull(result, result.getType())) +/// Helper function that returns the replaced values and the type converter if +/// the given rewrite object is an "operation replacement" or a "block type +/// conversion" (which corresponds to a "block replacement"). Otherwise, return +/// an empty ValueRange and a null type converter pointer. 
+static std::pair +getReplacedValues(IRRewrite *rewrite) { + if (auto *opRewrite = dyn_cast(rewrite)) +return std::make_pair(opRewrite->getOperation()->getResults(), + opRewrite->getConverter()); + if (auto *blockRewrite = dyn_cast(rewrite)) +return std::make_pair(blockRewrite->getOrigBlock()->getArguments(), + blockRewrite->getConverter()); + return std::make_pair(ValueRange(), nullptr); +} + +LogicalResult +OperationConverter::finalize(ConversionPatternRewriter &rewriter) { + ConversionPatternRewriterImpl &rewriterImpl = rewriter.getImpl(); + DenseMap> inverseMapping = + rewriterImpl.mapping.getInverse(); + + // Process requested value replacements. + for (unsigned i = 0, e = rewriterImpl.rewrites.size(); i < e; ++i) { +ValueRange replacedValues; +const TypeConverter *converter; +std::tie(replacedValues, converter) = +getReplacedValues(rewriterImpl.rewrites[i].get()); +for (Value originalValue : replacedValues) { + // If the type of this value changed and the value is st
[llvm-branch-commits] [mlir] [mlir][Transforms] Dialect conversion: Unify materialization of value replacements (PR #108381)
https://github.com/joker-eph approved this pull request. https://github.com/llvm/llvm-project/pull/108381 ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
[llvm-branch-commits] [mlir] [mlir][Transforms] Dialect conversion: Unify materialization of value replacements (PR #108381)
@@ -558,8 +558,8 @@ func.func @deinterleave(%a: vector<4xf32>) -> (vector<2xf32>, vector<2xf32>) {
 // CHECK-LABEL: func @deinterleave_scalar
 // CHECK-SAME: (%[[ARG0:.+]]: vector<2xf32>)
-// CHECK: %[[EXTRACT0:.*]] = spirv.CompositeExtract %[[ARG0]][0 : i32] : vector<2xf32>
-// CHECK: %[[EXTRACT1:.*]] = spirv.CompositeExtract %[[ARG0]][1 : i32] : vector<2xf32>
+// CHECK-DAG: %[[EXTRACT0:.*]] = spirv.CompositeExtract %[[ARG0]][0 : i32] : vector<2xf32>
+// CHECK-DAG: %[[EXTRACT1:.*]] = spirv.CompositeExtract %[[ARG0]][1 : i32] : vector<2xf32>

joker-eph wrote:

Can you just push this separately ahead as its own commit?

https://github.com/llvm/llvm-project/pull/108381
[llvm-branch-commits] [compiler-rt] [TySan] Fix struct access with different bases (PR #108385)
https://github.com/gbMattN created https://github.com/llvm/llvm-project/pull/108385

Fixes issue #105960

If a member of a struct is itself a struct, accessing a member partway through the inner struct currently causes a false positive. When checking aliasing, the access offset is greater than the starting offset of the inner struct, so the loop advances one more iteration and concludes that we are accessing the member after the inner struct. That next member's offset is greater than the offset we are looking for, so subtracting it from the access offset underflows.

To fix this, we check whether the member we think we are accessing has a greater offset than the offset we are looking for. If so, we step back one member. We cannot do this inside the loop, because the loop does not check the final member, so an access into the penultimate member would still cause false positives.

>From 2dffe46bc8af4ccd5627478ba9546647907104cc Mon Sep 17 00:00:00 2001
From: Matthew Nagy
Date: Thu, 12 Sep 2024 12:36:57 +
Subject: [PATCH] [TySan] Fix struct access with different bases

---
 compiler-rt/lib/tysan/tysan.cpp | 4 +++
 .../tysan/struct-offset-different-base.cpp| 31 +++
 2 files changed, 35 insertions(+)
 create mode 100644 compiler-rt/test/tysan/struct-offset-different-base.cpp

diff --git a/compiler-rt/lib/tysan/tysan.cpp b/compiler-rt/lib/tysan/tysan.cpp
index f627851d049e6a..f2cb6faddf45ac 100644
--- a/compiler-rt/lib/tysan/tysan.cpp
+++ b/compiler-rt/lib/tysan/tysan.cpp
@@ -128,6 +128,10 @@ static bool isAliasingLegalUp(tysan_type_descriptor *TDA,
 break;
 }
 
+ //You can't have negative offset, you must be partially inside the last type
+ if (TDA->Struct.Members[Idx].Offset > OffsetA)
+Idx -=1;
+
 OffsetA -= TDA->Struct.Members[Idx].Offset;
 TDA = TDA->Struct.Members[Idx].Type;
 } else {
diff --git a/compiler-rt/test/tysan/struct-offset-different-base.cpp b/compiler-rt/test/tysan/struct-offset-different-base.cpp
new file mode 100644
index 00..c1ef5f8669c280
--- /dev/null
+++ b/compiler-rt/test/tysan/struct-offset-different-base.cpp
@@ -0,0 +1,31 @@
+// RUN: %clangxx_tysan -O0 %s -o %t && %run %t >%t.out 2>&1
+// RUN: FileCheck %s < %t.out
+
+#include
+
+struct inner {
+ char buffer;
+ int i;
+};
+
+void init_inner(inner *list) {
+ list->i = 0;
+}
+
+struct outer {
+ inner foo;
+char buffer;
+};
+
+int main(void) {
+ outer *l = new outer();
+
+init_inner(&l->foo);
+
+int access_offsets_with_different_base = l->foo.i;
+ printf("%d\n", access_offsets_with_different_base);
+
+ return 0;
+}
+
+// CHECK-NOT: ERROR: TypeSanitizer: type-aliasing-violation
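The offset arithmetic behind this fix can be traced with a standalone sketch. It is not the TySan runtime code: `findContainingMember` and the plain vector of member offsets are hypothetical stand-ins for the type descriptor's member table, and the offsets are assumed to be sorted in ascending order. The loop mirrors the behavior described in the PR (it never inspects the final member), and the step back afterwards is the correction the patch adds.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Returns the index of the member that contains `accessOffset`, given the
// starting offsets of all members in ascending order (hypothetical helper).
std::size_t findContainingMember(const std::vector<std::uint64_t> &memberOffsets,
                                 std::uint64_t accessOffset) {
  std::size_t idx = 0;
  // Stop at the first member starting at or after the access offset.
  // The last member is never compared inside the loop.
  for (; idx + 1 < memberOffsets.size(); ++idx)
    if (memberOffsets[idx] >= accessOffset)
      break;
  // If the chosen member starts past the access, the access lies partway
  // through the previous member, so step back to avoid an underflow later.
  if (idx > 0 && memberOffsets[idx] > accessOffset)
    --idx;
  return idx;
}
```

For the test case in this PR, assuming `inner` occupies 8 bytes, `outer` has members at offsets 0 and 8 and `l->foo.i` is an access at offset 4. The loop alone ends on index 1, and `4 - 8` would underflow an unsigned offset; after the step back, the index is 0 and the remaining offset is 4.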
[llvm-branch-commits] [compiler-rt] [TySan] Fix struct access with different bases (PR #108385)
github-actions[bot] wrote:

Thank you for submitting a Pull Request (PR) to the LLVM Project!

This PR will be automatically labeled and the relevant teams will be notified.

If you wish to, you can add reviewers by using the "Reviewers" section on this page. If this is not working for you, it is probably because you do not have write permissions for the repository, in which case you can instead tag reviewers by name in a comment by using `@` followed by their GitHub username.

If you have received no comments on your PR for a week, you can request a review by "ping"ing the PR by adding a comment "Ping". The common courtesy "ping" rate is once a week. Please remember that you are asking for valuable time from other developers.

If you have further questions, they may be answered by the [LLVM GitHub User Guide](https://llvm.org/docs/GitHub.html). You can also ask questions in a comment on this PR, on the [LLVM Discord](https://discord.com/invite/xS7Z362) or on the [forums](https://discourse.llvm.org/).

https://github.com/llvm/llvm-project/pull/108385
[llvm-branch-commits] [compiler-rt] [TySan] Fix struct access with different bases (PR #108385)
llvmbot wrote: @llvm/pr-subscribers-compiler-rt-sanitizer Author: None (gbMattN) Changes Fixes issue #105960 If a member in a struct is also a struct, accessing a member partway through this inner struct currently causes a false positive. This is because when checking aliasing, the access offset is seen as greater than the starting offset of the inner struct, so the loop continues one iteration, and believes we are accessing the member after the inner struct. The next member's offset is greater than the offset we are looking for, so when we subtract the next member's offset from what we are looking for, the offset underflows. To fix this, we check if the member we think we are accessing has a greater offset than the offset we are looking for. If so, we take a step back. We cannot do this in the loop, since the loop does not check the final member. This means the penultimate member would still cause false positives. --- Full diff: https://github.com/llvm/llvm-project/pull/108385.diff 2 Files Affected: - (modified) compiler-rt/lib/tysan/tysan.cpp (+4) - (added) compiler-rt/test/tysan/struct-offset-different-base.cpp (+31) ``diff diff --git a/compiler-rt/lib/tysan/tysan.cpp b/compiler-rt/lib/tysan/tysan.cpp index f627851d049e6a..f2cb6faddf45ac 100644 --- a/compiler-rt/lib/tysan/tysan.cpp +++ b/compiler-rt/lib/tysan/tysan.cpp @@ -128,6 +128,10 @@ static bool isAliasingLegalUp(tysan_type_descriptor *TDA, break; } + //You can't have negative offset, you must be partially inside the last type + if (TDA->Struct.Members[Idx].Offset > OffsetA) +Idx -=1; + OffsetA -= TDA->Struct.Members[Idx].Offset; TDA = TDA->Struct.Members[Idx].Type; } else { diff --git a/compiler-rt/test/tysan/struct-offset-different-base.cpp b/compiler-rt/test/tysan/struct-offset-different-base.cpp new file mode 100644 index 00..c1ef5f8669c280 --- /dev/null +++ b/compiler-rt/test/tysan/struct-offset-different-base.cpp @@ -0,0 +1,31 @@ +// RUN: %clangxx_tysan -O0 %s -o %t && %run %t >%t.out 2>&1 +// RUN: 
FileCheck %s < %t.out + +#include + +struct inner { + char buffer; + int i; +}; + +void init_inner(inner *list) { + list->i = 0; +} + +struct outer { + inner foo; +char buffer; +}; + +int main(void) { + outer *l = new outer(); + +init_inner(&l->foo); + +int access_offsets_with_different_base = l->foo.i; + printf("%d\n", access_offsets_with_different_base); + + return 0; +} + +// CHECK-NOT: ERROR: TypeSanitizer: type-aliasing-violation `` https://github.com/llvm/llvm-project/pull/108385 ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
[llvm-branch-commits] [compiler-rt] [TySan] Fix struct access with different bases (PR #108385)
https://github.com/gbMattN updated https://github.com/llvm/llvm-project/pull/108385 >From d312bd99486dc7fbff79b880026512e949e7d212 Mon Sep 17 00:00:00 2001 From: Matthew Nagy Date: Thu, 12 Sep 2024 12:36:57 + Subject: [PATCH] [TySan] Fix struct access with different bases --- compiler-rt/lib/tysan/tysan.cpp | 4 +++ .../tysan/struct-offset-different-base.cpp| 31 +++ 2 files changed, 35 insertions(+) create mode 100644 compiler-rt/test/tysan/struct-offset-different-base.cpp diff --git a/compiler-rt/lib/tysan/tysan.cpp b/compiler-rt/lib/tysan/tysan.cpp index f627851d049e6a..f2cb6faddf45ac 100644 --- a/compiler-rt/lib/tysan/tysan.cpp +++ b/compiler-rt/lib/tysan/tysan.cpp @@ -128,6 +128,10 @@ static bool isAliasingLegalUp(tysan_type_descriptor *TDA, break; } + //You can't have negative offset, you must be partially inside the last type + if (TDA->Struct.Members[Idx].Offset > OffsetA) +Idx -=1; + OffsetA -= TDA->Struct.Members[Idx].Offset; TDA = TDA->Struct.Members[Idx].Type; } else { diff --git a/compiler-rt/test/tysan/struct-offset-different-base.cpp b/compiler-rt/test/tysan/struct-offset-different-base.cpp new file mode 100644 index 00..c1ef5f8669c280 --- /dev/null +++ b/compiler-rt/test/tysan/struct-offset-different-base.cpp @@ -0,0 +1,31 @@ +// RUN: %clangxx_tysan -O0 %s -o %t && %run %t >%t.out 2>&1 +// RUN: FileCheck %s < %t.out + +#include + +struct inner { + char buffer; + int i; +}; + +void init_inner(inner *list) { + list->i = 0; +} + +struct outer { + inner foo; +char buffer; +}; + +int main(void) { + outer *l = new outer(); + +init_inner(&l->foo); + +int access_offsets_with_different_base = l->foo.i; + printf("%d\n", access_offsets_with_different_base); + + return 0; +} + +// CHECK-NOT: ERROR: TypeSanitizer: type-aliasing-violation ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
[llvm-branch-commits] [compiler-rt] [TySan] Fix struct access with different bases (PR #108385)
gbMattN wrote: (Manually pinging potential reviewers) @tavianator @fhahn https://github.com/llvm/llvm-project/pull/108385
[llvm-branch-commits] [compiler-rt] [TySan] Fix struct access with different bases (PR #108385)
https://github.com/gbMattN updated https://github.com/llvm/llvm-project/pull/108385 >From 91f560d69a6dd21cf177f8969422b478cd4e5f5e Mon Sep 17 00:00:00 2001 From: Matthew Nagy Date: Thu, 12 Sep 2024 12:36:57 + Subject: [PATCH] [TySan] Fix struct access with different bases --- compiler-rt/lib/tysan/tysan.cpp | 4 +++ .../tysan/struct-offset-different-base.cpp| 31 +++ 2 files changed, 35 insertions(+) create mode 100644 compiler-rt/test/tysan/struct-offset-different-base.cpp diff --git a/compiler-rt/lib/tysan/tysan.cpp b/compiler-rt/lib/tysan/tysan.cpp index f627851d049e6a..f2cb6faddf45ac 100644 --- a/compiler-rt/lib/tysan/tysan.cpp +++ b/compiler-rt/lib/tysan/tysan.cpp @@ -128,6 +128,10 @@ static bool isAliasingLegalUp(tysan_type_descriptor *TDA, break; } + //You can't have negative offset, you must be partially inside the last type + if (TDA->Struct.Members[Idx].Offset > OffsetA) +Idx -=1; + OffsetA -= TDA->Struct.Members[Idx].Offset; TDA = TDA->Struct.Members[Idx].Type; } else { diff --git a/compiler-rt/test/tysan/struct-offset-different-base.cpp b/compiler-rt/test/tysan/struct-offset-different-base.cpp new file mode 100644 index 00..c091975c956d24 --- /dev/null +++ b/compiler-rt/test/tysan/struct-offset-different-base.cpp @@ -0,0 +1,31 @@ +// RUN: %clangxx_tysan -O0 %s -o %t && %run %t >%t.out 2>&1 +// RUN: FileCheck %s < %t.out + +#include + +struct inner { + char buffer; + int i; +}; + +void init_inner(inner *iPtr) { + iPtr->i = 0; +} + +struct outer { + inner foo; +char buffer; +}; + +int main(void) { + outer *l = new outer(); + +init_inner(&l->foo); + +int access_offsets_with_different_base = l->foo.i; + printf("%d\n", access_offsets_with_different_base); + + return 0; +} + +// CHECK-NOT: ERROR: TypeSanitizer: type-aliasing-violation ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
[llvm-branch-commits] [compiler-rt] [TySan] Fix struct access with different bases (PR #108385)
https://github.com/gbMattN updated https://github.com/llvm/llvm-project/pull/108385 >From 65f1e1fce67bfc9ae60f83abe6d3b487a174c6b1 Mon Sep 17 00:00:00 2001 From: Matthew Nagy Date: Thu, 12 Sep 2024 12:36:57 + Subject: [PATCH] [TySan] Fix struct access with different bases --- compiler-rt/lib/tysan/tysan.cpp | 4 +++ .../tysan/struct-offset-different-base.cpp| 31 +++ 2 files changed, 35 insertions(+) create mode 100644 compiler-rt/test/tysan/struct-offset-different-base.cpp diff --git a/compiler-rt/lib/tysan/tysan.cpp b/compiler-rt/lib/tysan/tysan.cpp index f627851d049e6a..f2cb6faddf45ac 100644 --- a/compiler-rt/lib/tysan/tysan.cpp +++ b/compiler-rt/lib/tysan/tysan.cpp @@ -128,6 +128,10 @@ static bool isAliasingLegalUp(tysan_type_descriptor *TDA, break; } + //You can't have negative offset, you must be partially inside the last type + if (TDA->Struct.Members[Idx].Offset > OffsetA) +Idx -=1; + OffsetA -= TDA->Struct.Members[Idx].Offset; TDA = TDA->Struct.Members[Idx].Type; } else { diff --git a/compiler-rt/test/tysan/struct-offset-different-base.cpp b/compiler-rt/test/tysan/struct-offset-different-base.cpp new file mode 100644 index 00..716d21f844f96c --- /dev/null +++ b/compiler-rt/test/tysan/struct-offset-different-base.cpp @@ -0,0 +1,31 @@ +// RUN: %clangxx_tysan -O0 %s -o %t && %run %t >%t.out 2>&1 +// RUN: FileCheck %s < %t.out + +#include + +struct inner { + char buffer; + int i; +}; + +void init_inner(inner *iPtr) { + iPtr->i = 0; +} + +struct outer { + inner foo; +char buffer; +}; + +int main(void) { +outer *l = new outer(); + +init_inner(&l->foo); + +int access_offsets_with_different_base = l->foo.i; +printf("%d\n", access_offsets_with_different_base); + +return 0; +} + +// CHECK-NOT: ERROR: TypeSanitizer: type-aliasing-violation ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
[llvm-branch-commits] [mlir] [mlir][Transforms] Dialect conversion: Unify materialization of value replacements (PR #108381)
https://github.com/matthias-springer updated https://github.com/llvm/llvm-project/pull/108381

>From 0fd4cb81dbf6e9c2766a086e2e3fdffd3cf67510 Mon Sep 17 00:00:00 2001
From: Matthias Springer
Date: Thu, 12 Sep 2024 14:49:23 +0200
Subject: [PATCH] [mlir][Transforms] Dialect conversion: Unify materialization of value replacements

PR #106760 aligned the handling of dropped block arguments and dropped op results. The two helper functions that insert source materializations for uses of replaced block arguments / op results that survived the conversion are now almost identical (`legalizeConvertedArgumentTypes` and `legalizeConvertedOpResultTypes`). This PR merges the two functions and moves the implementation directly into `finalize`.

This PR simplifies the code base and improves the efficiency a bit: previously, `finalize` iterated over `ConversionPatternRewriterImpl::rewrites` twice. Now, only one iteration is needed.

---
 .../Transforms/Utils/DialectConversion.cpp| 134 ++
 1 file changed, 42 insertions(+), 92 deletions(-)

diff --git a/mlir/lib/Transforms/Utils/DialectConversion.cpp b/mlir/lib/Transforms/Utils/DialectConversion.cpp
index ed15b571f01883..0556b4ab833c30 100644
--- a/mlir/lib/Transforms/Utils/DialectConversion.cpp
+++ b/mlir/lib/Transforms/Utils/DialectConversion.cpp
@@ -2336,17 +2336,6 @@ struct OperationConverter {
 /// remaining artifacts and complete the conversion.
 LogicalResult finalize(ConversionPatternRewriter &rewriter);
 
- /// Legalize the types of converted block arguments.
- LogicalResult
- legalizeConvertedArgumentTypes(ConversionPatternRewriter &rewriter,
- ConversionPatternRewriterImpl &rewriterImpl);
-
- /// Legalize the types of converted op results.
- LogicalResult legalizeConvertedOpResultTypes(
- ConversionPatternRewriter &rewriter,
- ConversionPatternRewriterImpl &rewriterImpl,
- DenseMap> &inverseMapping);
-
 /// Dialect conversion configuration.
ConversionConfig config; @@ -2510,19 +2499,6 @@ LogicalResult OperationConverter::convertOperations(ArrayRef ops) { return success(); } -LogicalResult -OperationConverter::finalize(ConversionPatternRewriter &rewriter) { - ConversionPatternRewriterImpl &rewriterImpl = rewriter.getImpl(); - if (failed(legalizeConvertedArgumentTypes(rewriter, rewriterImpl))) -return failure(); - DenseMap> inverseMapping = - rewriterImpl.mapping.getInverse(); - if (failed(legalizeConvertedOpResultTypes(rewriter, rewriterImpl, -inverseMapping))) -return failure(); - return success(); -} - /// Finds a user of the given value, or of any other value that the given value /// replaced, that was not replaced in the conversion process. static Operation *findLiveUserOfReplaced( @@ -2546,87 +2522,61 @@ static Operation *findLiveUserOfReplaced( return nullptr; } -LogicalResult OperationConverter::legalizeConvertedOpResultTypes( -ConversionPatternRewriter &rewriter, -ConversionPatternRewriterImpl &rewriterImpl, -DenseMap> &inverseMapping) { - // Process requested operation replacements. - for (unsigned i = 0; i < rewriterImpl.rewrites.size(); ++i) { -auto *opReplacement = -dyn_cast(rewriterImpl.rewrites[i].get()); -if (!opReplacement) - continue; -Operation *op = opReplacement->getOperation(); -for (OpResult result : op->getResults()) { - // If the type of this op result changed and the result is still live, - // we need to materialize a conversion. - if (rewriterImpl.mapping.lookupOrNull(result, result.getType())) +/// Helper function that returns the replaced values and the type converter if +/// the given rewrite object is an "operation replacement" or a "block type +/// conversion" (which corresponds to a "block replacement"). Otherwise, return +/// an empty ValueRange and a null type converter pointer. 
+static std::pair +getReplacedValues(IRRewrite *rewrite) { + if (auto *opRewrite = dyn_cast(rewrite)) +return std::make_pair(opRewrite->getOperation()->getResults(), + opRewrite->getConverter()); + if (auto *blockRewrite = dyn_cast(rewrite)) +return std::make_pair(blockRewrite->getOrigBlock()->getArguments(), + blockRewrite->getConverter()); + return std::make_pair(ValueRange(), nullptr); +} + +LogicalResult +OperationConverter::finalize(ConversionPatternRewriter &rewriter) { + ConversionPatternRewriterImpl &rewriterImpl = rewriter.getImpl(); + DenseMap> inverseMapping = + rewriterImpl.mapping.getInverse(); + + // Process requested value replacements. + for (unsigned i = 0, e = rewriterImpl.rewrites.size(); i < e; ++i) { +ValueRange replacedValues; +const TypeConverter *converter; +std::tie(replacedValues, converter) = +getReplacedValues(rewriterImpl.rewrites[i].get()); +for (Value originalValue : replacedValues) { + // If the type of
[llvm-branch-commits] [llvm] release/19.x: [DAGCombiner] cache negative result from getMergeStoreCandidates() (#106949) (PR #108397)
llvmbot wrote: @arsenm What do you think about merging this PR to the release branch? https://github.com/llvm/llvm-project/pull/108397
[llvm-branch-commits] [llvm] release/19.x: [DAGCombiner] cache negative result from getMergeStoreCandidates() (#106949) (PR #108397)
https://github.com/llvmbot milestoned https://github.com/llvm/llvm-project/pull/108397
[llvm-branch-commits] [llvm] release/19.x: [DAGCombiner] cache negative result from getMergeStoreCandidates() (#106949) (PR #108397)
https://github.com/llvmbot created https://github.com/llvm/llvm-project/pull/108397 Backport 8f77d37f256809766fd83a09c6d144b785e9165a Requested by: @nikic >From 3d14aafa9c51b816d9bc1792898de9df84cc2fd6 Mon Sep 17 00:00:00 2001 From: Princeton Ferro Date: Wed, 4 Sep 2024 07:18:53 -0700 Subject: [PATCH] [DAGCombiner] cache negative result from getMergeStoreCandidates() (#106949) Cache negative search result from getStoreMergeCandidates() so that mergeConsecutiveStores() does not iterate quadratically over a potentially long sequence of unmergeable stores. (cherry picked from commit 8f77d37f256809766fd83a09c6d144b785e9165a) --- llvm/lib/CodeGen/SelectionDAG/DAGCombiner.cpp | 83 --- 1 file changed, 51 insertions(+), 32 deletions(-) diff --git a/llvm/lib/CodeGen/SelectionDAG/DAGCombiner.cpp b/llvm/lib/CodeGen/SelectionDAG/DAGCombiner.cpp index 71cdec91e5f67a..7b1f1dc40211d5 100644 --- a/llvm/lib/CodeGen/SelectionDAG/DAGCombiner.cpp +++ b/llvm/lib/CodeGen/SelectionDAG/DAGCombiner.cpp @@ -191,6 +191,11 @@ namespace { // AA - Used for DAG load/store alias analysis. AliasAnalysis *AA; +/// This caches all chains that have already been processed in +/// DAGCombiner::getStoreMergeCandidates() and found to have no mergeable +/// stores candidates. +SmallPtrSet ChainsWithoutMergeableStores; + /// When an instruction is simplified, add all users of the instruction to /// the work lists because they might get more simplified now. void AddUsersToWorklist(SDNode *N) { @@ -776,11 +781,10 @@ namespace { bool UseTrunc); /// This is a helper function for mergeConsecutiveStores. Stores that -/// potentially may be merged with St are placed in StoreNodes. RootNode is -/// a chain predecessor to all store candidates. -void getStoreMergeCandidates(StoreSDNode *St, - SmallVectorImpl &StoreNodes, - SDNode *&Root); +/// potentially may be merged with St are placed in StoreNodes. On success, +/// returns a chain predecessor to all store candidates. 
+SDNode *getStoreMergeCandidates(StoreSDNode *St, +SmallVectorImpl &StoreNodes); /// Helper function for mergeConsecutiveStores. Checks if candidate stores /// have indirect dependency through their operands. RootNode is the @@ -1782,6 +1786,9 @@ void DAGCombiner::Run(CombineLevel AtLevel) { ++NodesCombined; +// Invalidate cached info. +ChainsWithoutMergeableStores.clear(); + // If we get back the same node we passed in, rather than a new node or // zero, we know that the node must have defined multiple values and // CombineTo was used. Since CombineTo takes care of the worklist @@ -20372,15 +20379,15 @@ bool DAGCombiner::mergeStoresOfConstantsOrVecElts( return true; } -void DAGCombiner::getStoreMergeCandidates( -StoreSDNode *St, SmallVectorImpl &StoreNodes, -SDNode *&RootNode) { +SDNode * +DAGCombiner::getStoreMergeCandidates(StoreSDNode *St, + SmallVectorImpl &StoreNodes) { // This holds the base pointer, index, and the offset in bytes from the base // pointer. We must have a base and an offset. Do not handle stores to undef // base pointers. BaseIndexOffset BasePtr = BaseIndexOffset::match(St, DAG); if (!BasePtr.getBase().getNode() || BasePtr.getBase().isUndef()) -return; +return nullptr; SDValue Val = peekThroughBitcasts(St->getValue()); StoreSource StoreSrc = getStoreSource(Val); @@ -20396,14 +20403,14 @@ void DAGCombiner::getStoreMergeCandidates( LoadVT = Ld->getMemoryVT(); // Load and store should be the same type. if (MemVT != LoadVT) - return; + return nullptr; // Loads must only have one use. if (!Ld->hasNUsesOfValue(1, 0)) - return; + return nullptr; // The memory operands must not be volatile/indexed/atomic. 
// TODO: May be able to relax for unordered atomics (see D66309) if (!Ld->isSimple() || Ld->isIndexed()) - return; + return nullptr; } auto CandidateMatch = [&](StoreSDNode *Other, BaseIndexOffset &Ptr, int64_t &Offset) -> bool { @@ -20471,6 +20478,27 @@ void DAGCombiner::getStoreMergeCandidates( return (BasePtr.equalBaseIndex(Ptr, DAG, Offset)); }; + // We are looking for a root node which is an ancestor to all mergable + // stores. We search up through a load, to our root and then down + // through all children. For instance we will find Store{1,2,3} if + // St is Store1, Store2. or Store3 where the root is not a load + // which always true for nonvolatile ops. TODO: Expand + // the search to find all valid candidates through multiple layers of loads. + // + // Root + // |---|---| + // LoadLoadStore3 + // | | + // Store1 Store2 + // + // FIXME: We should be a
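The caching idea in this backport can be shown in isolation. The sketch below is not DAGCombiner code: `Node`, `CandidateFinder`, `hasCandidates`, and `scanCount` are hypothetical names, and "mergeable" is reduced to a flag per child. It only illustrates the pattern the commit message describes: remember roots whose search found nothing so that repeated queries skip the scan, and clear that memory whenever the graph changes, as the patch does in `DAGCombiner::Run`.

```cpp
#include <unordered_set>
#include <vector>

// Hypothetical chain root: one flag per child saying whether it could merge.
struct Node {
  std::vector<bool> childMergeable;
};

class CandidateFinder {
public:
  // Returns true if `root` has at least two mergeable children.
  bool hasCandidates(const Node *root) {
    // A cached negative result short-circuits the scan.
    if (rootsWithoutCandidates.count(root))
      return false;
    ++scans;
    int mergeable = 0;
    for (bool isMergeable : root->childMergeable)
      mergeable += isMergeable ? 1 : 0;
    if (mergeable < 2) {
      rootsWithoutCandidates.insert(root);
      return false;
    }
    return true;
  }

  // Must be called after any change to the graph; cached answers may be stale.
  void invalidate() { rootsWithoutCandidates.clear(); }

  // Number of full scans performed so far.
  int scanCount() const { return scans; }

private:
  std::unordered_set<const Node *> rootsWithoutCandidates;
  int scans = 0;
};
```

With a long run of unmergeable stores sharing one root, each store would otherwise trigger a scan of all its siblings; with the cache, only the first query per root pays for the scan until the next invalidation.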
[llvm-branch-commits] [llvm] release/19.x: [DAGCombiner] cache negative result from getMergeStoreCandidates() (#106949) (PR #108397)
llvmbot wrote: @llvm/pr-subscribers-llvm-selectiondag Author: None (llvmbot) Changes Backport 8f77d37f256809766fd83a09c6d144b785e9165a Requested by: @nikic --- Full diff: https://github.com/llvm/llvm-project/pull/108397.diff 1 Files Affected: - (modified) llvm/lib/CodeGen/SelectionDAG/DAGCombiner.cpp (+51-32) ``diff diff --git a/llvm/lib/CodeGen/SelectionDAG/DAGCombiner.cpp b/llvm/lib/CodeGen/SelectionDAG/DAGCombiner.cpp index 71cdec91e5f67a..7b1f1dc40211d5 100644 --- a/llvm/lib/CodeGen/SelectionDAG/DAGCombiner.cpp +++ b/llvm/lib/CodeGen/SelectionDAG/DAGCombiner.cpp @@ -191,6 +191,11 @@ namespace { // AA - Used for DAG load/store alias analysis. AliasAnalysis *AA; +/// This caches all chains that have already been processed in +/// DAGCombiner::getStoreMergeCandidates() and found to have no mergeable +/// stores candidates. +SmallPtrSet ChainsWithoutMergeableStores; + /// When an instruction is simplified, add all users of the instruction to /// the work lists because they might get more simplified now. void AddUsersToWorklist(SDNode *N) { @@ -776,11 +781,10 @@ namespace { bool UseTrunc); /// This is a helper function for mergeConsecutiveStores. Stores that -/// potentially may be merged with St are placed in StoreNodes. RootNode is -/// a chain predecessor to all store candidates. -void getStoreMergeCandidates(StoreSDNode *St, - SmallVectorImpl &StoreNodes, - SDNode *&Root); +/// potentially may be merged with St are placed in StoreNodes. On success, +/// returns a chain predecessor to all store candidates. +SDNode *getStoreMergeCandidates(StoreSDNode *St, +SmallVectorImpl &StoreNodes); /// Helper function for mergeConsecutiveStores. Checks if candidate stores /// have indirect dependency through their operands. RootNode is the @@ -1782,6 +1786,9 @@ void DAGCombiner::Run(CombineLevel AtLevel) { ++NodesCombined; +// Invalidate cached info. 
+ChainsWithoutMergeableStores.clear(); + // If we get back the same node we passed in, rather than a new node or // zero, we know that the node must have defined multiple values and // CombineTo was used. Since CombineTo takes care of the worklist @@ -20372,15 +20379,15 @@ bool DAGCombiner::mergeStoresOfConstantsOrVecElts( return true; } -void DAGCombiner::getStoreMergeCandidates( -StoreSDNode *St, SmallVectorImpl &StoreNodes, -SDNode *&RootNode) { +SDNode * +DAGCombiner::getStoreMergeCandidates(StoreSDNode *St, + SmallVectorImpl &StoreNodes) { // This holds the base pointer, index, and the offset in bytes from the base // pointer. We must have a base and an offset. Do not handle stores to undef // base pointers. BaseIndexOffset BasePtr = BaseIndexOffset::match(St, DAG); if (!BasePtr.getBase().getNode() || BasePtr.getBase().isUndef()) -return; +return nullptr; SDValue Val = peekThroughBitcasts(St->getValue()); StoreSource StoreSrc = getStoreSource(Val); @@ -20396,14 +20403,14 @@ void DAGCombiner::getStoreMergeCandidates( LoadVT = Ld->getMemoryVT(); // Load and store should be the same type. if (MemVT != LoadVT) - return; + return nullptr; // Loads must only have one use. if (!Ld->hasNUsesOfValue(1, 0)) - return; + return nullptr; // The memory operands must not be volatile/indexed/atomic. // TODO: May be able to relax for unordered atomics (see D66309) if (!Ld->isSimple() || Ld->isIndexed()) - return; + return nullptr; } auto CandidateMatch = [&](StoreSDNode *Other, BaseIndexOffset &Ptr, int64_t &Offset) -> bool { @@ -20471,6 +20478,27 @@ void DAGCombiner::getStoreMergeCandidates( return (BasePtr.equalBaseIndex(Ptr, DAG, Offset)); }; + // We are looking for a root node which is an ancestor to all mergable + // stores. We search up through a load, to our root and then down + // through all children. For instance we will find Store{1,2,3} if + // St is Store1, Store2. or Store3 where the root is not a load + // which always true for nonvolatile ops. 
TODO: Expand + // the search to find all valid candidates through multiple layers of loads. + // + // Root + // |---|---| + // LoadLoadStore3 + // | | + // Store1 Store2 + // + // FIXME: We should be able to climb and + // descend TokenFactors to find candidates as well. + + SDNode *RootNode = St->getChain().getNode(); + // Bail out if we already analyzed this root node and found nothing. + if (ChainsWithoutMergeableStores.contains(RootNode)) +return nullptr; + // Check if the pair of StoreNode and the RootNode already bail out many // times which is over the limit in dependence check. auto OverLimit
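The caching idea in this backport can be sketched outside of LLVM as follows. This is a minimal illustration, not the DAGCombiner code: `Node`, `CandidateFinder`, and the "fewer than two candidates" rule for recording a negative result are assumptions made for the example. The quoted hunks only show the cache being consulted in getStoreMergeCandidates() and cleared in Run().

```cpp
#include <cassert>
#include <unordered_set>
#include <vector>

// Hypothetical stand-in for a chain root node; not an LLVM type.
struct Node {
  std::vector<int> stores; // illustrative payload: candidate stores under this root
};

class CandidateFinder {
public:
  // Returns the root on success, nullptr when nothing mergeable was found.
  const Node *find(const Node &root, std::vector<int> &out) {
    // Bail out early if this root was already analyzed without success.
    if (rootsWithoutCandidates.count(&root))
      return nullptr;
    ++scans; // counts how often the (potentially expensive) scan really runs
    out = root.stores;
    if (out.size() < 2) { // assumption: merging needs at least two candidates
      rootsWithoutCandidates.insert(&root);
      out.clear();
      return nullptr;
    }
    return &root;
  }

  // Must be called whenever the graph changes, since a cached "nothing to
  // merge" answer may no longer hold afterwards.
  void invalidate() { rootsWithoutCandidates.clear(); }

  int scans = 0;

private:
  std::unordered_set<const Node *> rootsWithoutCandidates;
};
```

Repeated queries for the same unproductive root then cost one set lookup instead of a full scan, at the price of having to clear the set after every change.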
[llvm-branch-commits] [mlir] [mlir][Transforms] Dialect conversion: Unify materialization of value replacements (PR #108381)
@@ -2546,87 +2522,61 @@ static Operation *findLiveUserOfReplaced(
   return nullptr;
 }
 
-LogicalResult OperationConverter::legalizeConvertedOpResultTypes(
-    ConversionPatternRewriter &rewriter,
-    ConversionPatternRewriterImpl &rewriterImpl,
-    DenseMap> &inverseMapping) {
-  // Process requested operation replacements.
-  for (unsigned i = 0; i < rewriterImpl.rewrites.size(); ++i) {
-    auto *opReplacement =
-        dyn_cast(rewriterImpl.rewrites[i].get());
-    if (!opReplacement)
-      continue;
-    Operation *op = opReplacement->getOperation();
-    for (OpResult result : op->getResults()) {
-      // If the type of this op result changed and the result is still live,
-      // we need to materialize a conversion.
-      if (rewriterImpl.mapping.lookupOrNull(result, result.getType()))
+/// Helper function that returns the replaced values and the type converter if
+/// the given rewrite object is an "operation replacement" or a "block type
+/// conversion" (which corresponds to a "block replacement"). Otherwise, return
+/// an empty ValueRange and a null type converter pointer.
+static std::pair
+getReplacedValues(IRRewrite *rewrite) {
+  if (auto *opRewrite = dyn_cast(rewrite))
+    return std::make_pair(opRewrite->getOperation()->getResults(),
+                          opRewrite->getConverter());
+  if (auto *blockRewrite = dyn_cast(rewrite))
+    return std::make_pair(blockRewrite->getOrigBlock()->getArguments(),
+                          blockRewrite->getConverter());
+  return std::make_pair(ValueRange(), nullptr);

kuhar wrote:

nit

```suggestion
    return {opRewrite->getOperation()->getResults(), opRewrite->getConverter()};
  if (auto *blockRewrite = dyn_cast(rewrite))
    return {blockRewrite->getOrigBlock()->getArguments(), blockRewrite->getConverter()};
  return {};
```

```suggestion
    return std::make_pair(opRewrite->getOperation()->getResults(),
                          opRewrite->getConverter());
  if (auto *blockRewrite = dyn_cast(rewrite))
    return std::make_pair(blockRewrite->getOrigBlock()->getArguments(),
                          blockRewrite->getConverter());
  return std::make_pair(ValueRange(), nullptr);
```

https://github.com/llvm/llvm-project/pull/108381
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
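For readers unfamiliar with the nit, here is a small standalone sketch of the two return styles. `Converter` and both function names are hypothetical and only mirror the shape of the code under review; whether the braced form compiles in the actual patch depends on the real element types, which are not visible in the quoted hunk.

```cpp
#include <cassert>
#include <utility>
#include <vector>

// Hypothetical stand-in for a type converter; not an MLIR type.
struct Converter {
  int id;
};

using Replaced = std::pair<std::vector<int>, const Converter *>;

// Style used in the patch: spell out std::make_pair on every return.
Replaced withMakePair(const std::vector<int> &values, const Converter *conv) {
  if (conv)
    return std::make_pair(values, conv);
  return std::make_pair(std::vector<int>(), nullptr);
}

// Style from the suggestion: list-initialize the declared return type.
Replaced withBraces(const std::vector<int> &values, const Converter *conv) {
  if (conv)
    return {values, conv};
  return {}; // value-initializes the pair: empty vector, null pointer
}
```

Both functions return equal pairs for the same inputs. The braced form is shorter because the return type is already stated once in the signature.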
[llvm-branch-commits] [mlir] [mlir][Transforms] Dialect conversion: Unify materialization of value replacements (PR #108381)
https://github.com/kuhar edited https://github.com/llvm/llvm-project/pull/108381
[llvm-branch-commits] [mlir] [mlir][Transforms] Dialect conversion: Unify materialization of value replacements (PR #108381)
@@ -2546,87 +2522,61 @@ static Operation *findLiveUserOfReplaced(
   return nullptr;
 }
 
-LogicalResult OperationConverter::legalizeConvertedOpResultTypes(
-    ConversionPatternRewriter &rewriter,
-    ConversionPatternRewriterImpl &rewriterImpl,
-    DenseMap> &inverseMapping) {
-  // Process requested operation replacements.
-  for (unsigned i = 0; i < rewriterImpl.rewrites.size(); ++i) {
-    auto *opReplacement =
-        dyn_cast(rewriterImpl.rewrites[i].get());
-    if (!opReplacement)
-      continue;
-    Operation *op = opReplacement->getOperation();
-    for (OpResult result : op->getResults()) {
-      // If the type of this op result changed and the result is still live,
-      // we need to materialize a conversion.
-      if (rewriterImpl.mapping.lookupOrNull(result, result.getType()))
+/// Helper function that returns the replaced values and the type converter if
+/// the given rewrite object is an "operation replacement" or a "block type
+/// conversion" (which corresponds to a "block replacement"). Otherwise, return
+/// an empty ValueRange and a null type converter pointer.
+static std::pair
+getReplacedValues(IRRewrite *rewrite) {
+  if (auto *opRewrite = dyn_cast(rewrite))
+    return std::make_pair(opRewrite->getOperation()->getResults(),
+                          opRewrite->getConverter());
+  if (auto *blockRewrite = dyn_cast(rewrite))
+    return std::make_pair(blockRewrite->getOrigBlock()->getArguments(),
+                          blockRewrite->getConverter());
+  return std::make_pair(ValueRange(), nullptr);
+}
+
+LogicalResult
+OperationConverter::finalize(ConversionPatternRewriter &rewriter) {
+  ConversionPatternRewriterImpl &rewriterImpl = rewriter.getImpl();
+  DenseMap> inverseMapping =
+      rewriterImpl.mapping.getInverse();
+
+  // Process requested value replacements.
+  for (unsigned i = 0, e = rewriterImpl.rewrites.size(); i < e; ++i) {

kuhar wrote:

Nit: use range for? I don't see `i` being used outside of indexing into `rewrites`.

https://github.com/llvm/llvm-project/pull/108381
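The range-for nit can be illustrated with a standalone sketch. `Rewrite` and both functions are hypothetical, and the sketch assumes the container is not modified while it is being iterated. If the loop body in the real patch can append to `rewrites`, the index-based form may be deliberate, because growing a vector invalidates the iterators a range-based loop relies on.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <vector>

// Hypothetical stand-in for a rewrite object; not an MLIR type.
struct Rewrite {
  int cost;
};

// Index-based loop in the style of the patch: `i` is only used for indexing.
int sumIndexed(const std::vector<std::unique_ptr<Rewrite>> &rewrites) {
  int total = 0;
  for (std::size_t i = 0, e = rewrites.size(); i < e; ++i)
    total += rewrites[i]->cost;
  return total;
}

// Range-based equivalent, valid as long as `rewrites` is not resized inside
// the loop body.
int sumRangeFor(const std::vector<std::unique_ptr<Rewrite>> &rewrites) {
  int total = 0;
  for (const std::unique_ptr<Rewrite> &rewrite : rewrites)
    total += rewrite->cost;
  return total;
}
```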
[llvm-branch-commits] [clang] [Serialization] Code cleanups and polish 83233 (PR #83237)
ilya-biryukov wrote: I got to a small reproducer that only uses STL, but it only produces an error in our environment and if I try it with this patch, the error goes away. I am probably missing something subtle, will dig deeper tomorrow. Sorry for another delay. https://github.com/llvm/llvm-project/pull/83237
[llvm-branch-commits] [compiler-rt] [TySan] Fix struct access with different bases (PR #108385)
@@ -128,6 +128,10 @@ static bool isAliasingLegalUp(tysan_type_descriptor *TDA,
         break;
     }
 
+    //You can't have negative offset, you must be partially inside the last type
+    if (TDA->Struct.Members[Idx].Offset > OffsetA)
+      Idx -=1;
+

tavianator wrote:

```suggestion
      Idx -= 1;
```

https://github.com/llvm/llvm-project/pull/108385
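The effect of the added check can be shown with a simplified model. This is a sketch under stated assumptions, not the TySan implementation: the function name, the plain vector of member offsets, and the shape of the search loop are all hypothetical, since the quoted hunk only shows the trailing `break` and the new adjustment.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Returns the index of the struct member that contains `offset`.
// Assumptions: `memberOffsets` is sorted ascending, non-empty, and starts at 0.
std::size_t findContainingMember(const std::vector<std::uint64_t> &memberOffsets,
                                 std::uint64_t offset) {
  assert(!memberOffsets.empty() && memberOffsets.front() == 0);
  std::size_t idx = 0;
  // Stop at the first member whose start is at or past `offset`, or at the
  // last member if there is none.
  for (; idx + 1 < memberOffsets.size(); ++idx)
    if (memberOffsets[idx] >= offset)
      break;
  // If that member starts after `offset`, the access lies partially inside
  // the previous member, so step back rather than compute a negative offset.
  if (memberOffsets[idx] > offset)
    idx -= 1;
  return idx;
}
```

With member offsets {0, 4, 8}, an access at offset 6 resolves to the member starting at 4, where it would otherwise have been attributed to the member starting at 8.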
[llvm-branch-commits] [compiler-rt] [TySan] Fix struct access with different bases (PR #108385)
https://github.com/tavianator commented: This fixes my reduced testcase but not the unreduced one. I'll try to make a new reduction. https://github.com/llvm/llvm-project/pull/108385
[llvm-branch-commits] [compiler-rt] [TySan] Fix struct access with different bases (PR #108385)
https://github.com/tavianator edited https://github.com/llvm/llvm-project/pull/108385
[llvm-branch-commits] [llvm] [RISCV] Add initial support of memcmp expansion (PR #107548)
lukel97 wrote: The run just finished. I'm seeing a 0.75% improvement on 500.perlbench_r and no regressions or improvements on the other benchmarks, as far as I can see. That seems to check out with the number of inlined memcmps reported for perlbench! https://github.com/llvm/llvm-project/pull/107548
[llvm-branch-commits] [sanitizer] Allow to override GetDTLSRange (PR #108348)
https://github.com/thurstond approved this pull request. https://github.com/llvm/llvm-project/pull/108348
[llvm-branch-commits] [sanitizer] Test for #108348 (PR #108349)
https://github.com/thurstond approved this pull request. https://github.com/llvm/llvm-project/pull/108349
[llvm-branch-commits] [sanitizer] Test for #108348 (PR #108349)
fmayer wrote: Maybe improve the message a bit so people don't have to look at another pull request to understand what this is about? https://github.com/llvm/llvm-project/pull/108349
[llvm-branch-commits] [llvm] release/19.x: workflows/release-binaries: Fix automatic upload (#107315) (PR #108422)
https://github.com/llvmbot created https://github.com/llvm/llvm-project/pull/108422

Backport ab96409180aaad5417030f06a386253722a99d71

Requested by: @tstellar

>From 5ec4f6033e5ad37f3a6f30ca48b74305770e5796 Mon Sep 17 00:00:00 2001
From: Tom Stellard
Date: Thu, 12 Sep 2024 09:50:57 -0700
Subject: [PATCH] workflows/release-binaries: Fix automatic upload (#107315)

(cherry picked from commit ab96409180aaad5417030f06a386253722a99d71)
---
 .github/workflows/release-binaries.yml | 15 +--
 1 file changed, 13 insertions(+), 2 deletions(-)

diff --git a/.github/workflows/release-binaries.yml b/.github/workflows/release-binaries.yml
index 509016e5b89c45..fcd371d49e6c91 100644
--- a/.github/workflows/release-binaries.yml
+++ b/.github/workflows/release-binaries.yml
@@ -450,11 +450,22 @@ jobs:
         name: ${{ needs.prepare.outputs.release-binary-filename }}-attestation
         path: ${{ needs.prepare.outputs.release-binary-filename }}.jsonl
 
+    - name: Checkout Release Scripts
+      uses: actions/checkout@b4ffde65f46336ab88eb53be808477a3936bae11 # v4.1.1
+      with:
+        sparse-checkout: |
+          llvm/utils/release/github-upload-release.py
+          llvm/utils/git/requirements.txt
+        sparse-checkout-cone-mode: false
+
+    - name: Install Python Requirements
+      run: |
+        pip install --require-hashes -r ./llvm/utils/git/requirements.txt
+
     - name: Upload Release
       shell: bash
       run: |
-        sudo apt install python3-github
-        ./llvm-project/llvm/utils/release/github-upload-release.py \
+        ./llvm/utils/release/github-upload-release.py \
         --token ${{ github.token }} \
         --release ${{ needs.prepare.outputs.release-version }} \
         upload \
[llvm-branch-commits] [llvm] release/19.x: workflows/release-binaries: Fix automatic upload (#107315) (PR #108422)
https://github.com/llvmbot milestoned https://github.com/llvm/llvm-project/pull/108422
[llvm-branch-commits] [llvm] release/19.x: workflows/release-binaries: Fix automatic upload (#107315) (PR #108422)
llvmbot wrote:

@llvm/pr-subscribers-github-workflow

Author: None (llvmbot)

Changes

Backport ab96409180aaad5417030f06a386253722a99d71

Requested by: @tstellar

---
Full diff: https://github.com/llvm/llvm-project/pull/108422.diff

1 Files Affected:

- (modified) .github/workflows/release-binaries.yml (+13-2)

```diff
diff --git a/.github/workflows/release-binaries.yml b/.github/workflows/release-binaries.yml
index 509016e5b89c45..fcd371d49e6c91 100644
--- a/.github/workflows/release-binaries.yml
+++ b/.github/workflows/release-binaries.yml
@@ -450,11 +450,22 @@ jobs:
         name: ${{ needs.prepare.outputs.release-binary-filename }}-attestation
         path: ${{ needs.prepare.outputs.release-binary-filename }}.jsonl
 
+    - name: Checkout Release Scripts
+      uses: actions/checkout@b4ffde65f46336ab88eb53be808477a3936bae11 # v4.1.1
+      with:
+        sparse-checkout: |
+          llvm/utils/release/github-upload-release.py
+          llvm/utils/git/requirements.txt
+        sparse-checkout-cone-mode: false
+
+    - name: Install Python Requirements
+      run: |
+        pip install --require-hashes -r ./llvm/utils/git/requirements.txt
+
     - name: Upload Release
       shell: bash
       run: |
-        sudo apt install python3-github
-        ./llvm-project/llvm/utils/release/github-upload-release.py \
+        ./llvm/utils/release/github-upload-release.py \
         --token ${{ github.token }} \
         --release ${{ needs.prepare.outputs.release-version }} \
         upload \
```

https://github.com/llvm/llvm-project/pull/108422
[llvm-branch-commits] [sanitizer] Test for __sanitizer_get_dtls_size (PR #108349)
https://github.com/vitalybuka edited https://github.com/llvm/llvm-project/pull/108349
[llvm-branch-commits] [sanitizer] Test for __sanitizer_get_dtls_size (PR #108349)
vitalybuka wrote:

> Maybe improve the message a bit so people don't have to look at another pull
> request to understand what this is about?

done

https://github.com/llvm/llvm-project/pull/108349
[llvm-branch-commits] [sanitizer] Allow to override GetDTLSRange (PR #108348)
https://github.com/vitalybuka updated https://github.com/llvm/llvm-project/pull/108348
[llvm-branch-commits] [sanitizer] Test for __sanitizer_get_dtls_size (PR #108349)
https://github.com/vitalybuka updated https://github.com/llvm/llvm-project/pull/108349
[llvm-branch-commits] [sanitizer] Allow to override GetDTLSRange (PR #108348)
https://github.com/vitalybuka updated https://github.com/llvm/llvm-project/pull/108348
[llvm-branch-commits] [llvm] [BOLT] Match blocks with pseudo probes (PR #99891)
https://github.com/aaupov updated https://github.com/llvm/llvm-project/pull/99891 >From 36197b175681d07b4704e576fb008cec3cc1e05e Mon Sep 17 00:00:00 2001 From: Amir Ayupov Date: Wed, 28 Aug 2024 21:10:25 +0200 Subject: [PATCH 1/2] Reworked block probe matching Use new probe ifaces Get all function probes at once Drop ProfileUsePseudoProbes Unify matchWithBlockPseudoProbes Distinguish exact and loose probe match --- bolt/include/bolt/Core/BinaryContext.h| 20 +- bolt/lib/Passes/BinaryPasses.cpp | 40 ++- bolt/lib/Profile/StaleProfileMatching.cpp | 404 ++ bolt/lib/Rewrite/PseudoProbeRewriter.cpp | 8 +- 4 files changed, 237 insertions(+), 235 deletions(-) diff --git a/bolt/include/bolt/Core/BinaryContext.h b/bolt/include/bolt/Core/BinaryContext.h index 3e20cb607e657b..3f7b2ac0bc6cf9 100644 --- a/bolt/include/bolt/Core/BinaryContext.h +++ b/bolt/include/bolt/Core/BinaryContext.h @@ -724,14 +724,26 @@ class BinaryContext { uint32_t NumStaleBlocks{0}; /// the number of exactly matched basic blocks uint32_t NumExactMatchedBlocks{0}; -/// the number of pseudo probe matched basic blocks -uint32_t NumPseudoProbeMatchedBlocks{0}; +/// the number of loosely matched basic blocks +uint32_t NumLooseMatchedBlocks{0}; +/// the number of exactly pseudo probe matched basic blocks +uint32_t NumPseudoProbeExactMatchedBlocks{0}; +/// the number of loosely pseudo probe matched basic blocks +uint32_t NumPseudoProbeLooseMatchedBlocks{0}; +/// the number of call matched basic blocks +uint32_t NumCallMatchedBlocks{0}; /// the total count of samples in the profile uint64_t StaleSampleCount{0}; /// the count of exactly matched samples uint64_t ExactMatchedSampleCount{0}; -/// the count of pseudo probe matched samples -uint64_t PseudoProbeMatchedSampleCount{0}; +/// the count of exactly matched samples +uint64_t LooseMatchedSampleCount{0}; +/// the count of exactly pseudo probe matched samples +uint64_t PseudoProbeExactMatchedSampleCount{0}; +/// the count of loosely pseudo probe matched samples 
+uint64_t PseudoProbeLooseMatchedSampleCount{0}; +/// the count of call matched samples +uint64_t CallMatchedSampleCount{0}; /// the number of stale functions that have matching number of blocks in /// the profile uint64_t NumStaleFuncsWithEqualBlockCount{0}; diff --git a/bolt/lib/Passes/BinaryPasses.cpp b/bolt/lib/Passes/BinaryPasses.cpp index b786f07a6a6651..8edbd58c3ed3de 100644 --- a/bolt/lib/Passes/BinaryPasses.cpp +++ b/bolt/lib/Passes/BinaryPasses.cpp @@ -1524,15 +1524,43 @@ Error PrintProgramStats::runOnFunctions(BinaryContext &BC) { 100.0 * BC.Stats.ExactMatchedSampleCount / BC.Stats.StaleSampleCount, BC.Stats.ExactMatchedSampleCount, BC.Stats.StaleSampleCount); BC.outs() << format( -"BOLT-INFO: inference found a pseudo probe match for %.2f%% of basic " +"BOLT-INFO: inference found an exact pseudo probe match for %.2f%% of " +"basic blocks (%zu out of %zu stale) responsible for %.2f%% samples" +" (%zu out of %zu stale)\n", +100.0 * BC.Stats.NumPseudoProbeExactMatchedBlocks / +BC.Stats.NumStaleBlocks, +BC.Stats.NumPseudoProbeExactMatchedBlocks, BC.Stats.NumStaleBlocks, +100.0 * BC.Stats.PseudoProbeExactMatchedSampleCount / +BC.Stats.StaleSampleCount, +BC.Stats.PseudoProbeExactMatchedSampleCount, BC.Stats.StaleSampleCount); +BC.outs() << format( +"BOLT-INFO: inference found a loose pseudo probe match for %.2f%% of " +"basic blocks (%zu out of %zu stale) responsible for %.2f%% samples" +" (%zu out of %zu stale)\n", +100.0 * BC.Stats.NumPseudoProbeLooseMatchedBlocks / +BC.Stats.NumStaleBlocks, +BC.Stats.NumPseudoProbeLooseMatchedBlocks, BC.Stats.NumStaleBlocks, +100.0 * BC.Stats.PseudoProbeLooseMatchedSampleCount / +BC.Stats.StaleSampleCount, +BC.Stats.PseudoProbeLooseMatchedSampleCount, BC.Stats.StaleSampleCount); +BC.outs() << format( +"BOLT-INFO: inference found a call match for %.2f%% of basic " "blocks" " (%zu out of %zu stale) responsible for %.2f%% samples" " (%zu out of %zu stale)\n", -100.0 * BC.Stats.NumPseudoProbeMatchedBlocks / 
BC.Stats.NumStaleBlocks, -BC.Stats.NumPseudoProbeMatchedBlocks, BC.Stats.NumStaleBlocks, -100.0 * BC.Stats.PseudoProbeMatchedSampleCount / -BC.Stats.StaleSampleCount, -BC.Stats.PseudoProbeMatchedSampleCount, BC.Stats.StaleSampleCount); +100.0 * BC.Stats.NumCallMatchedBlocks / BC.Stats.NumStaleBlocks, +BC.Stats.NumCallMatchedBlocks, BC.Stats.NumStaleBlocks, +100.0 * BC.Stats.CallMatchedSampleCount / BC.Stats.StaleSampleCount, +BC.Stats.CallMatchedSampleCount, BC.Stats.StaleSampleCount); +BC
[llvm-branch-commits] [llvm] [BOLT] Add pseudo probe inline tree to YAML profile (PR #107137)
@@ -58,8 +59,158 @@ const BinaryFunction *YAMLProfileWriter::setCSIDestination( return nullptr; } +std::vector +YAMLProfileWriter::collectInlineTree( +const MCPseudoProbeDecoder &Decoder, +const MCDecodedPseudoProbeInlineTree &Root) { + auto getHash = [&](const MCDecodedPseudoProbeInlineTree &Node) { +return Decoder.getFuncDescForGUID(Node.Guid)->FuncHash; + }; + std::vector InlineTree( + {InlineTreeNode{&Root, Root.Guid, getHash(Root), 0, 0}}); + uint32_t ParentId = 0; + while (ParentId != InlineTree.size()) { +const MCDecodedPseudoProbeInlineTree *Cur = InlineTree[ParentId].InlineTree; +for (const MCDecodedPseudoProbeInlineTree &Child : Cur->getChildren()) + InlineTree.emplace_back( + InlineTreeNode{&Child, Child.Guid, getHash(Child), ParentId, + std::get<1>(Child.getInlineSite())}); +++ParentId; + } + + return InlineTree; +} + +std::tuple +YAMLProfileWriter::convertPseudoProbeDesc(const MCPseudoProbeDecoder &Decoder) { + yaml::bolt::PseudoProbeDesc Desc; + InlineTreeDesc InlineTree; + + for (const MCDecodedPseudoProbeInlineTree &TopLev : + Decoder.getDummyInlineRoot().getChildren()) +InlineTree.TopLevelGUIDToInlineTree[TopLev.Guid] = &TopLev; + + for (const auto &FuncDesc : Decoder.getGUID2FuncDescMap()) +++InlineTree.HashIdxMap[FuncDesc.FuncHash]; + + InlineTree.GUIDIdxMap.reserve(Decoder.getGUID2FuncDescMap().size()); + for (const auto &Node : Decoder.getInlineTreeVec()) +++InlineTree.GUIDIdxMap[Node.Guid]; + + std::vector> GUIDFreqVec; + GUIDFreqVec.reserve(InlineTree.GUIDIdxMap.size()); + for (const auto [GUID, Cnt] : InlineTree.GUIDIdxMap) +GUIDFreqVec.emplace_back(Cnt, GUID); + llvm::sort(GUIDFreqVec); + + std::vector> HashFreqVec; + HashFreqVec.reserve(InlineTree.HashIdxMap.size()); + for (const auto [Hash, Cnt] : InlineTree.HashIdxMap) +HashFreqVec.emplace_back(Cnt, Hash); + llvm::sort(HashFreqVec); + + uint32_t Index = 0; + Desc.Hash.reserve(HashFreqVec.size()); + for (uint64_t Hash : llvm::make_second_range(llvm::reverse(HashFreqVec))) { 
+Desc.Hash.emplace_back(Hash); +InlineTree.HashIdxMap[Hash] = Index++; + } + + Index = 0; + Desc.GUID.reserve(GUIDFreqVec.size()); + for (uint64_t GUID : llvm::make_second_range(llvm::reverse(GUIDFreqVec))) { +Desc.GUID.emplace_back(GUID); +InlineTree.GUIDIdxMap[GUID] = Index++; +uint64_t Hash = Decoder.getFuncDescForGUID(GUID)->FuncHash; +Desc.GUIDHashIdx.emplace_back(InlineTree.HashIdxMap[Hash]); + } + + return {Desc, InlineTree}; +} + +std::vector +YAMLProfileWriter::convertNodeProbes(NodeIdToProbes &NodeProbes) { + struct BlockProbeInfoHasher { +size_t operator()(const yaml::bolt::PseudoProbeInfo &BPI) const { + auto HashCombine = [](auto &Range) { +return llvm::hash_combine_range(Range.begin(), Range.end()); + }; + return llvm::hash_combine(HashCombine(BPI.BlockProbes), +HashCombine(BPI.CallProbes), +HashCombine(BPI.IndCallProbes)); +} + }; + + // Check identical BlockProbeInfo structs and merge them + std::unordered_map, + BlockProbeInfoHasher> + BPIToNodes; + for (auto &[NodeId, Probes] : NodeProbes) { +yaml::bolt::PseudoProbeInfo BPI; +BPI.BlockProbes = std::vector(Probes[0].begin(), Probes[0].end()); +BPI.IndCallProbes = std::vector(Probes[1].begin(), Probes[1].end()); +BPI.CallProbes = std::vector(Probes[2].begin(), Probes[2].end()); +BPIToNodes[BPI].push_back(NodeId); + } + + auto handleMask = [](const auto &Ids, auto &Vec, auto &Mask) { +for (auto Id : Ids) + if (Id > 64) +Vec.emplace_back(Id); + else +Mask |= 1ull << (Id - 1); + }; + + // Add to YAML with merged nodes/block mask optimizations + std::vector YamlProbes; + YamlProbes.reserve(BPIToNodes.size()); + for (const auto &[BPI, Nodes] : BPIToNodes) { +auto &YamlBPI = YamlProbes.emplace_back(yaml::bolt::PseudoProbeInfo()); +YamlBPI.CallProbes = BPI.CallProbes; +YamlBPI.IndCallProbes = BPI.IndCallProbes; +if (Nodes.size() == 1) + YamlBPI.InlineTreeIndex = Nodes.front(); +else + YamlBPI.InlineTreeNodes = Nodes; +handleMask(BPI.BlockProbes, YamlBPI.BlockProbes, YamlBPI.BlockMask); + } + return 
YamlProbes; +} + +std::tuple, + YAMLProfileWriter::InlineTreeMapTy> +YAMLProfileWriter::convertBFInlineTree(const MCPseudoProbeDecoder &Decoder, + const InlineTreeDesc &InlineTree, + uint64_t GUID) { + DenseMap InlineTreeNodeId; + std::vector YamlInlineTree; + auto It = InlineTree.TopLevelGUIDToInlineTree.find(GUID); + if (It == InlineTree.TopLevelGUIDToInlineTree.end()) +return {YamlInlineTree, InlineTreeNodeId}; + const MCDecodedPseudoProbeInlineTree *Root = It->second; + assert(Root && "Malformed TopLevelGUIDToInlineTree"); +
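The `handleMask` lambda in the hunk above packs small probe IDs into a 64-bit mask and keeps larger ones in a vector. A standalone sketch of that encoding, together with a decoder, might look like this. `EncodedIds`, `encodeIds`, and `decodeIds` are hypothetical names, and the sketch assumes IDs are 1-based, because the shift by `Id - 1` would be undefined for an ID of 0.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical container for the two-part encoding; not a BOLT type.
struct EncodedIds {
  std::uint64_t mask = 0;              // bit (id - 1) set for ids in [1, 64]
  std::vector<std::uint64_t> overflow; // ids above 64, stored verbatim
};

EncodedIds encodeIds(const std::vector<std::uint64_t> &ids) {
  EncodedIds enc;
  for (std::uint64_t id : ids) {
    assert(id >= 1 && "ids are assumed to be 1-based");
    if (id > 64)
      enc.overflow.push_back(id);
    else
      enc.mask |= std::uint64_t{1} << (id - 1);
  }
  return enc;
}

// Inverse of encodeIds: mask bits first (ascending), then the overflow ids.
std::vector<std::uint64_t> decodeIds(const EncodedIds &enc) {
  std::vector<std::uint64_t> ids;
  for (unsigned bit = 0; bit < 64; ++bit)
    if (enc.mask & (std::uint64_t{1} << bit))
      ids.push_back(bit + 1);
  ids.insert(ids.end(), enc.overflow.begin(), enc.overflow.end());
  return ids;
}
```

Under this scheme, IDs 1 and 4 encode to the mask 0b1001, that is 9, so a block with only small probe IDs needs a single integer in the profile.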
[llvm-branch-commits] [llvm] [BOLT] Add pseudo probe inline tree to YAML profile (PR #107137)
@@ -14,29 +14,31 @@
 # RUN: FileCheck --input-file %t.yaml2 %s --check-prefix CHECK-YAML
 # CHECK-YAML: name: bar
 # CHECK-YAML: - bid: 0
-# CHECK-YAML: pseudo_probes: [ { guid: 0xE413754A191DB537, id: 1, type: 0 }, { guid: 0xE413754A191DB537, id: 4, type: 0 } ]
-# CHECK-YAML: guid: 0xE413754A191DB537
-# CHECK-YAML: pseudo_probe_desc_hash: 0x10E852DA94
+# CHECK-YAML: probes: [ { blx: 9 } ]

wlei-llvm wrote:

There is no call probe case. IIRC, noinline-cs-pseudoprobe.test should contain some call probes, so we can use that to create the test. There are still some cases not covered, I think, but I guess covering them would require creating a large binary, which we don't want to upload to the repo.

https://github.com/llvm/llvm-project/pull/107137
[llvm-branch-commits] [llvm] [BOLT] Add pseudo probe inline tree to YAML profile (PR #107137)
@@ -2421,11 +2433,14 @@ std::error_code DataAggregator::writeBATYAML(BinaryContext &BC,
         const uint32_t InputOffset = BAT->translate(
             FuncAddr, OutputAddress - FuncAddr, /*IsBranchSrc=*/true);
         const unsigned BlockIndex = getBlock(InputOffset).second;
-        YamlBF.Blocks[BlockIndex].PseudoProbes.emplace_back(
-            yaml::bolt::PseudoProbeInfo{Probe.getGuid(), Probe.getIndex(),
-                                        Probe.getType()});
+        BlockProbes[BlockIndex].emplace_back(Probe);
       }
     }
+
+    for (auto &[Block, Probes] : BlockProbes) {
+      YamlBF.Blocks[Block].PseudoProbes =
+          YAMLProfileWriter::writeBlockProbes(Probes, InlineTreeNodeId);

wlei-llvm wrote:

Thanks for the context.

https://github.com/llvm/llvm-project/pull/107137
[llvm-branch-commits] [llvm] [BOLT] Add pseudo probe inline tree to YAML profile (PR #107137)
@@ -58,8 +59,158 @@ const BinaryFunction *YAMLProfileWriter::setCSIDestination(
   return nullptr;
 }
 
+std::vector
+YAMLProfileWriter::collectInlineTree(
+    const MCPseudoProbeDecoder &Decoder,
+    const MCDecodedPseudoProbeInlineTree &Root) {
+  auto getHash = [&](const MCDecodedPseudoProbeInlineTree &Node) {
+    return Decoder.getFuncDescForGUID(Node.Guid)->FuncHash;
+  };
+  std::vector InlineTree(
+      {InlineTreeNode{&Root, Root.Guid, getHash(Root), 0, 0}});
+  uint32_t ParentId = 0;
+  while (ParentId != InlineTree.size()) {
+    const MCDecodedPseudoProbeInlineTree *Cur = InlineTree[ParentId].InlineTree;
+    for (const MCDecodedPseudoProbeInlineTree &Child : Cur->getChildren())
+      InlineTree.emplace_back(
+          InlineTreeNode{&Child, Child.Guid, getHash(Child), ParentId,
+                         std::get<1>(Child.getInlineSite())});
+    ++ParentId;
+  }
+
+  return InlineTree;
+}
+
+std::tuple
+YAMLProfileWriter::convertPseudoProbeDesc(const MCPseudoProbeDecoder &Decoder) {
+  yaml::bolt::PseudoProbeDesc Desc;
+  InlineTreeDesc InlineTree;
+
+  for (const MCDecodedPseudoProbeInlineTree &TopLev :
+       Decoder.getDummyInlineRoot().getChildren())
+    InlineTree.TopLevelGUIDToInlineTree[TopLev.Guid] = &TopLev;
+
+  for (const auto &FuncDesc : Decoder.getGUID2FuncDescMap())
+    ++InlineTree.HashIdxMap[FuncDesc.FuncHash];
+
+  InlineTree.GUIDIdxMap.reserve(Decoder.getGUID2FuncDescMap().size());
+  for (const auto &Node : Decoder.getInlineTreeVec())
+    ++InlineTree.GUIDIdxMap[Node.Guid];
+
+  std::vector> GUIDFreqVec;
+  GUIDFreqVec.reserve(InlineTree.GUIDIdxMap.size());
+  for (const auto [GUID, Cnt] : InlineTree.GUIDIdxMap)
+    GUIDFreqVec.emplace_back(Cnt, GUID);
+  llvm::sort(GUIDFreqVec);
+
+  std::vector> HashFreqVec;
+  HashFreqVec.reserve(InlineTree.HashIdxMap.size());
+  for (const auto [Hash, Cnt] : InlineTree.HashIdxMap)
+    HashFreqVec.emplace_back(Cnt, Hash);
+  llvm::sort(HashFreqVec);
+
+  uint32_t Index = 0;
+  Desc.Hash.reserve(HashFreqVec.size());
+  for (uint64_t Hash : llvm::make_second_range(llvm::reverse(HashFreqVec))) {
+    Desc.Hash.emplace_back(Hash);
+    InlineTree.HashIdxMap[Hash] = Index++;
+  }
+
+  Index = 0;
+  Desc.GUID.reserve(GUIDFreqVec.size());
+  for (uint64_t GUID : llvm::make_second_range(llvm::reverse(GUIDFreqVec))) {
+    Desc.GUID.emplace_back(GUID);
+    InlineTree.GUIDIdxMap[GUID] = Index++;
+    uint64_t Hash = Decoder.getFuncDescForGUID(GUID)->FuncHash;
+    Desc.GUIDHashIdx.emplace_back(InlineTree.HashIdxMap[Hash]);
+  }
+
+  return {Desc, InlineTree};
+}
+
+std::vector
+YAMLProfileWriter::convertNodeProbes(NodeIdToProbes &NodeProbes) {
+  struct BlockProbeInfoHasher {
+    size_t operator()(const yaml::bolt::PseudoProbeInfo &BPI) const {
+      auto HashCombine = [](auto &Range) {
+        return llvm::hash_combine_range(Range.begin(), Range.end());
+      };
+      return llvm::hash_combine(HashCombine(BPI.BlockProbes),
+                                HashCombine(BPI.CallProbes),
+                                HashCombine(BPI.IndCallProbes));
+    }
+  };
+
+  // Check identical BlockProbeInfo structs and merge them
+  std::unordered_map,
+                     BlockProbeInfoHasher>
+      BPIToNodes;
+  for (auto &[NodeId, Probes] : NodeProbes) {
+    yaml::bolt::PseudoProbeInfo BPI;
+    BPI.BlockProbes = std::vector(Probes[0].begin(), Probes[0].end());
+    BPI.IndCallProbes = std::vector(Probes[1].begin(), Probes[1].end());
+    BPI.CallProbes = std::vector(Probes[2].begin(), Probes[2].end());
+    BPIToNodes[BPI].push_back(NodeId);
+  }
+
+  auto handleMask = [](const auto &Ids, auto &Vec, auto &Mask) {
+    for (auto Id : Ids)
+      if (Id > 64)
+        Vec.emplace_back(Id);
+      else
+        Mask |= 1ull << (Id - 1);
+  };
+
+  // Add to YAML with merged nodes/block mask optimizations
+  std::vector YamlProbes;
+  YamlProbes.reserve(BPIToNodes.size());
+  for (const auto &[BPI, Nodes] : BPIToNodes) {
+    auto &YamlBPI = YamlProbes.emplace_back(yaml::bolt::PseudoProbeInfo());
+    YamlBPI.CallProbes = BPI.CallProbes;
+    YamlBPI.IndCallProbes = BPI.IndCallProbes;
+    if (Nodes.size() == 1)
+      YamlBPI.InlineTreeIndex = Nodes.front();
+    else
+      YamlBPI.InlineTreeNodes = Nodes;
+    handleMask(BPI.BlockProbes, YamlBPI.BlockProbes, YamlBPI.BlockMask);
+  }
+  return YamlProbes;
+}
+
+std::tuple,
+           YAMLProfileWriter::InlineTreeMapTy>
+YAMLProfileWriter::convertBFInlineTree(const MCPseudoProbeDecoder &Decoder,
+                                       const InlineTreeDesc &InlineTree,
+                                       uint64_t GUID) {
+  DenseMap InlineTreeNodeId;
+  std::vector YamlInlineTree;
+  auto It = InlineTree.TopLevelGUIDToInlineTree.find(GUID);
+  if (It == InlineTree.TopLevelGUIDToInlineTree.end())
+    return {YamlInlineTree, InlineTreeNodeId};
+  const MCDecodedPseudoProbeInlineTree *Root = It->second;
+  assert(Root && "Malformed TopLevelGUIDToInlineTree");
+
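Two steps of the quoted hunk can be sketched in isolation. This is a minimal standalone illustration, not BOLT's code: `Node`, `FlatNode`, `EncodedProbes`, `flattenInlineTree` and `encodeBlockProbes` are hypothetical names, and standard-library containers stand in for the LLVM types. The first function mirrors the breadth-first flattening in `collectInlineTree`, where each row records the index of its parent row. The second mirrors `handleMask`, where block probe ids 1 to 64 go into a 64-bit mask and larger ids are stored explicitly. The sketch assumes probe ids start at 1 (as the `Id - 1` shift suggests) and skips an id of 0, which the quoted lambda does not do.

```cpp
#include <cstdint>
#include <vector>

// Hypothetical stand-in for a decoded inline tree node.
struct Node {
  uint64_t Guid;
  uint32_t CallSite; // probe id of the call site in the parent
  std::vector<Node> Children;
};

// One row of the flattened tree.
struct FlatNode {
  uint64_t Guid;
  uint32_t ParentId; // index of the parent row; 0 for the root itself
  uint32_t CallSite;
  const Node *Src; // tree node this row was created from
};

// Breadth-first flattening: rows before ParentId are already expanded,
// rows from ParentId onwards act as the work queue.
std::vector<FlatNode> flattenInlineTree(const Node &Root) {
  std::vector<FlatNode> Flat;
  Flat.push_back(FlatNode{Root.Guid, 0, 0, &Root});
  for (uint32_t ParentId = 0; ParentId != Flat.size(); ++ParentId) {
    // Src points into the tree, not into Flat, so it stays valid when
    // push_back reallocates Flat.
    const Node *Cur = Flat[ParentId].Src;
    for (const Node &Child : Cur->Children)
      Flat.push_back(FlatNode{Child.Guid, ParentId, Child.CallSite, &Child});
  }
  return Flat;
}

struct EncodedProbes {
  uint64_t Mask = 0;              // bit (Id - 1) is set for each id in 1..64
  std::vector<uint64_t> Overflow; // ids above 64, stored explicitly
};

EncodedProbes encodeBlockProbes(const std::vector<uint64_t> &Ids) {
  EncodedProbes Out;
  for (uint64_t Id : Ids) {
    if (Id == 0)
      continue; // assumption of this sketch: id 0 is not a valid probe id
    if (Id > 64)
      Out.Overflow.push_back(Id);
    else
      Out.Mask |= 1ull << (Id - 1);
  }
  return Out;
}
```

Under this encoding, block probes with ids 1 and 4 produce the mask 9, which appears to be what the `blx: 9` value in the updated CHECK line quoted in this thread represents.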
[llvm-branch-commits] [compiler-rt] [TySan] Fix struct access with different bases (PR #108385)
tavianator wrote:

Here's the new testcase. Not sure if this bug is related or not. It has to do with `memcpy()`; if you replace the call with the commented-out line above it, it works.

```c
struct node {
    struct node *next;
};

struct list {
    struct node *head, **tail;
};

int main(void) {
    struct list *list = __builtin_malloc(sizeof(*list));
    list->head = 0;
    list->tail = &list->head;

    struct node *node = __builtin_malloc(sizeof(*node));
    node->next = 0;
    *list->tail = node;
    list->tail = &node->next;

    while (list->head) {
        struct node *node = list->head;
        // list->head = node->next;
        __builtin_memcpy(&list->head, &node->next, sizeof(list->head));
        node->next = 0;
    }

    return 0;
}
```

```console
tavianator@tachyon $ ~/code/llvm/llvm-project/build/bin/clang -Wall -g -fsanitize=type foo.c -o foo
tavianator@tachyon $ ./foo
==5885==ERROR: TypeSanitizer: type-aliasing-violation on address 0x55af02a8c2a0 (pc 0x55aef600fb36 bp 0x7ffcbf810cf0 sp 0x7ffcbf810c90 tid 5885)
READ of size 8 at 0x55af02a8c2a0 with type any pointer (in list at offset 0) accesses an existing object of type any pointer (in node at offset 0)
    #0 0x55aef600fb35 in main /home/tavianator/code/bfs/foo.c:20:15
```

https://github.com/llvm/llvm-project/pull/108385
[llvm-branch-commits] [llvm] [BOLT] Add pseudo probe inline tree to YAML profile (PR #107137)
https://github.com/aaupov updated https://github.com/llvm/llvm-project/pull/107137 >From 50c021b09950cf7d6a8f25b1ac0dec246f2325f5 Mon Sep 17 00:00:00 2001 From: Amir Ayupov Date: Tue, 3 Sep 2024 11:38:04 -0700 Subject: [PATCH 1/6] update pseudoprobe-decoding-inline.test Created using spr 1.3.4 --- .../test/X86/pseudoprobe-decoding-inline.test | 31 --- 1 file changed, 20 insertions(+), 11 deletions(-) diff --git a/bolt/test/X86/pseudoprobe-decoding-inline.test b/bolt/test/X86/pseudoprobe-decoding-inline.test index 1fdd00c7ef6c4b..629dd84ab8e1dc 100644 --- a/bolt/test/X86/pseudoprobe-decoding-inline.test +++ b/bolt/test/X86/pseudoprobe-decoding-inline.test @@ -14,29 +14,38 @@ # RUN: FileCheck --input-file %t.yaml2 %s --check-prefix CHECK-YAML # CHECK-YAML: name: bar # CHECK-YAML: - bid: 0 -# CHECK-YAML: pseudo_probes: [ { guid: 0xE413754A191DB537, id: 1, type: 0 }, { guid: 0xE413754A191DB537, id: 4, type: 0 } ] -# CHECK-YAML: guid: 0xE413754A191DB537 -# CHECK-YAML: pseudo_probe_desc_hash: 0x10E852DA94 +# CHECK-YAML: pseudo_probes: +# CHECK-YAML-NEXT: - { id: 1, type: 0 +# CHECK-YAML-NEXT: - { id: 4, type: 0 +# CHECK-YAML: inline_tree: +# CHECK-YAML-NEXT: - { guid: 0xE413754A191DB537, hash: 0x10E852DA94, id: 0 } # # CHECK-YAML: name: foo # CHECK-YAML: - bid: 0 -# CHECK-YAML: pseudo_probes: [ { guid: 0x5CF8C24CDB18BDAC, id: 1, type: 0 }, { guid: 0x5CF8C24CDB18BDAC, id: 2, type: 0 } ] -# CHECK-YAML: guid: 0x5CF8C24CDB18BDAC -# CHECK-YAML: pseudo_probe_desc_hash: 0x200205A19C5B4 +# CHECK-YAML: pseudo_probes: +# CHECK-YAML-NEXT: - { id: 1, type: 0 } +# CHECK-YAML-NEXT: - { id: 2, type: 0 } +# CHECK-YAML: inline_tree: +# CHECK-YAML-NEXT: - { guid: 0x5CF8C24CDB18BDAC, hash: 0x200205A19C5B4, id: 0 } +# CHECK-YAML-NEXT: - { guid: 0xE413754A191DB537, hash: 0x10E852DA94, id: 1, callsite: 8 } # # CHECK-YAML: name: main # CHECK-YAML: - bid: 0 -# CHECK-YAML: pseudo_probes: [ { guid: 0xDB956436E78DD5FA, id: 1, type: 0 }, { guid: 0x5CF8C24CDB18BDAC, id: 1, type: 0 }, { guid: 
0x5CF8C24CDB18BDAC, id: 2, type: 0 } ] -# CHECK-YAML: guid: 0xDB956436E78DD5FA -# CHECK-YAML: pseudo_probe_desc_hash: 0x1 +# CHECK-YAML: pseudo_probes: +# CHECK-YAML-NEXT: - { id: 1, type: 0 } +# CHECK-YAML-NEXT: - { id: 1, type: 0, inline_tree_id: 1 } +# CHECK-YAML-NEXT: - { id: 2, type: 0, inline_tree_id: 1 } +# CHECK-YAML: inline_tree: +# CHECK-YAML-NEXT: - { guid: 0xDB956436E78DD5FA, hash: 0x1, id: 0 } +# CHECK-YAML-NEXT: - { guid: 0x5CF8C24CDB18BDAC, hash: 0x200205A19C5B4, id: 1, callsite: 2 } +# CHECK-YAML-NEXT: - { guid: 0xE413754A191DB537, hash: 0x10E852DA94, id: 2, parent: 1, callsite: 8 } # ## Check that without --profile-write-pseudo-probes option, no pseudo probes are ## generated # RUN: perf2bolt %S/../../../llvm/test/tools/llvm-profgen/Inputs/inline-cs-pseudoprobe.perfbin -p %t.preagg --pa -w %t.yaml -o %t.fdata # RUN: FileCheck --input-file %t.yaml %s --check-prefix CHECK-NO-OPT # CHECK-NO-OPT-NOT: pseudo_probes -# CHECK-NO-OPT-NOT: guid -# CHECK-NO-OPT-NOT: pseudo_probe_desc_hash +# CHECK-NO-OPT-NOT: inline_tree CHECK: Report of decoding input pseudo probe binaries >From 6ec4cf6bf05551d02cbf17e9edbe8d6931588ff1 Mon Sep 17 00:00:00 2001 From: Amir Ayupov Date: Mon, 9 Sep 2024 21:37:28 -0700 Subject: [PATCH 2/6] clang-format Created using spr 1.3.4 --- bolt/lib/Profile/YAMLProfileWriter.cpp | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/bolt/lib/Profile/YAMLProfileWriter.cpp b/bolt/lib/Profile/YAMLProfileWriter.cpp index 70e5e09e2920e5..f2609de18ce63c 100644 --- a/bolt/lib/Profile/YAMLProfileWriter.cpp +++ b/bolt/lib/Profile/YAMLProfileWriter.cpp @@ -90,7 +90,7 @@ YAMLProfileWriter::convertPseudoProbeDesc(const MCPseudoProbeDecoder &Decoder) { InlineTreeDesc InlineTree; for (const MCDecodedPseudoProbeInlineTree &TopLev : - Decoder.getDummyInlineRoot().getChildren()) + Decoder.getDummyInlineRoot().getChildren()) InlineTree.TopLevelGUIDToInlineTree[TopLev.Guid] = &TopLev; for (const auto &FuncDesc : Decoder.getGUID2FuncDescMap()) 
>From 852eb07f345dd1d9e77a6faead8bf0f73ff64ba7 Mon Sep 17 00:00:00 2001 From: Amir Ayupov Date: Tue, 10 Sep 2024 12:26:11 -0700 Subject: [PATCH 3/6] Make pseudo_probe_desc optional Created using spr 1.3.4 --- bolt/include/bolt/Profile/ProfileYAMLMapping.h | 9 - bolt/test/X86/pseudoprobe-decoding-inline.test | 5 +++-- 2 files changed, 11 insertions(+), 3 deletions(-) diff --git a/bolt/include/bolt/Profile/ProfileYAMLMapping.h b/bolt/include/bolt/Profile/ProfileYAMLMapping.h index 588e2f59d67e01..9cc33264d70718 100644 --- a/bolt/include/bolt/Profile/ProfileYAMLMapping.h +++ b/bolt/include/bolt/Profile/ProfileYAMLMapping.h @@ -275,6 +275,12 @@ struct PseudoProbeDesc { std::vector GUID; std::vector Hash; std::vector GUIDHash; // Index of hash for that GUID in Hash + + bool operator==(const PseudoProbeDesc &Ot
[llvm-branch-commits] [llvm] [BOLT] Add pseudo probe inline tree to YAML profile (PR #107137)
@@ -14,29 +14,31 @@
 # RUN: FileCheck --input-file %t.yaml2 %s --check-prefix CHECK-YAML
 # CHECK-YAML: name: bar
 # CHECK-YAML: - bid: 0
-# CHECK-YAML: pseudo_probes: [ { guid: 0xE413754A191DB537, id: 1, type: 0 }, { guid: 0xE413754A191DB537, id: 4, type: 0 } ]
-# CHECK-YAML: guid: 0xE413754A191DB537
-# CHECK-YAML: pseudo_probe_desc_hash: 0x10E852DA94
+# CHECK-YAML: probes: [ { blx: 9 } ]

aaupov wrote:

Added bolt/test/X86/pseudoprobe-decoding-noinline.test. If there are any other binaries/tests in the llvm tree with pseudo probes, I can check them as well.

https://github.com/llvm/llvm-project/pull/107137
[llvm-branch-commits] [compiler-rt] [TySan] Fix struct access with different bases (PR #108385)
tavianator wrote:

I guess the bug there is that the memcpy() interceptor literally copies the dynamic type from `node->next` to `list->head`. Then `list->head` is accessed but tysan thinks the memory has type `struct node::next` which doesn't match.

https://github.com/llvm/llvm-project/pull/108385
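The hypothesis in the message above can be modelled with a toy shadow map. This is only a sketch of the suspected mechanism, assuming a sanitizer that records a type description per address and a copy routine that transfers that description verbatim; `Shadow`, `recordStore`, `copyShadow` and `accessOk` are hypothetical names, and real TySan shadow memory and type descriptors are organized differently.

```cpp
#include <cstdint>
#include <map>
#include <string>

// Toy shadow memory: address -> description of the type last stored there.
using Shadow = std::map<std::uintptr_t, std::string>;

static std::uintptr_t key(const void *Addr) {
  return reinterpret_cast<std::uintptr_t>(Addr);
}

// A typed store records the type of the accessed member.
void recordStore(Shadow &S, const void *Addr, const std::string &Type) {
  S[key(Addr)] = Type;
}

// Models an interceptor that copies the source's recorded type verbatim
// instead of leaving the destination's type alone.
void copyShadow(Shadow &S, const void *Dst, const void *Src) {
  auto It = S.find(key(Src));
  if (It != S.end())
    S[key(Dst)] = It->second;
}

// An access is accepted if nothing is recorded or the recorded type matches.
bool accessOk(const Shadow &S, const void *Addr, const std::string &Type) {
  auto It = S.find(key(Addr));
  return It == S.end() || It->second == Type;
}
```

In this model, storing to `list->head` and `node->next` records two different descriptions; after `copyShadow(&list->head, &node->next)` a read of `list->head` as a member of `list` no longer matches, which is the shape of the report in the testcase.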
[llvm-branch-commits] [CIR] Add .clang-tidy files to agree with our style convention (PR #108444)
https://github.com/lanza created https://github.com/llvm/llvm-project/pull/108444

https://llvm.github.io/clangir/GettingStarted/coding-guideline.html
[llvm-branch-commits] [CIR] Add .clang-tidy files to agree with our style convention (PR #108444)
https://github.com/lanza closed https://github.com/llvm/llvm-project/pull/108444
[llvm-branch-commits] [llvm] [BOLT] Add pseudo probe inline tree to YAML profile (PR #107137)
https://github.com/rafaelauler approved this pull request.

Not an expert but looks good. Why is operator== in struct InlineTreeInfo always returning false? Is this intentional?

https://github.com/llvm/llvm-project/pull/107137
[llvm-branch-commits] [llvm] [BOLT] Add pseudo probe inline tree to YAML profile (PR #107137)
rafaelauler wrote:

Sorry, didn't see lei was already reviewing this. Go ahead with his expert opinion, please.

https://github.com/llvm/llvm-project/pull/107137
[llvm-branch-commits] [llvm] [BOLT] Add pseudo probe inline tree to YAML profile (PR #107137)
aaupov wrote:

> Not an expert but looks good. Why is operator== in struct InlineTreeInfo
> always returning false? Is this intentional?

It's a quirk of YAML: `BinaryFunctionProfile` has `std::vector<InlineTreeNode> InlineTree` as an optional field. Optional fields are compared against the default value using `operator==`, which for a vector transitively requires `operator==` for `InlineTreeNode`. However, the default value for `InlineTree` is an empty vector, so no `InlineTreeNode` comparison is actually necessary. Hence we just make `InlineTreeNode::operator==` return false.

https://github.com/llvm/llvm-project/pull/107137
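The argument above can be checked with a small standalone sketch. It does not use LLVM's YAML I/O; `shouldEmitOptional` is a hypothetical stand-in for the "skip the field when it equals its default" check, and the `InlineTreeNode` here is a dummy type whose `operator==` counts its calls. It illustrates that comparing any vector against an empty default never reaches the element comparison, because either both vectors are empty or their sizes differ.

```cpp
#include <vector>

// Counts how often the element comparison is actually invoked.
int EqCalls = 0;

// Dummy stand-in for the real struct; only operator== matters here.
struct InlineTreeNode {
  unsigned Guid = 0;
  bool operator==(const InlineTreeNode &) const {
    ++EqCalls;
    return false; // never meaningful, mirrors the "always false" choice
  }
};

// Hypothetical model of an optional field: emit it only when the value
// differs from the default. The comparison requires T::operator== to exist.
template <typename T>
bool shouldEmitOptional(const T &Value, const T &Default) {
  return !(Value == Default);
}
```

With an empty default, an empty `std::vector<InlineTreeNode>` is skipped and a non-empty one is emitted, and `EqCalls` stays at zero in both cases. The operator is therefore needed only to make the comparison compile.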
[llvm-branch-commits] [llvm] [BOLT] Add pseudo probe inline tree to YAML profile (PR #107137)
https://github.com/wlei-llvm approved this pull request.

LGTM, thanks!

https://github.com/llvm/llvm-project/pull/107137
[llvm-branch-commits] [llvm] [BOLT] Add pseudo probe inline tree to YAML profile (PR #107137)
aaupov wrote:

Thanks for the review, @wlei-llvm, @rafaelauler, @WenleiHe!

https://github.com/llvm/llvm-project/pull/107137
[llvm-branch-commits] [llvm] [RISCV] Add initial support of memcmp expansion (PR #107548)
wangpc-pp wrote:

> The run just finished, I'm seeing a 0.75% improvement on 500.perlbench_r on
> the BPI F3 (-O3 -mcpu=spacemit-x60), no regressions or improvements on the
> other benchmarks as far as I can see. Seems to check out with the number of
> memcmps inlined reported for perlbench!

Thanks a lot! The result is in line with my expectations. Is it OK to merge? The next step will be tuning for vectors.

https://github.com/llvm/llvm-project/pull/107548
[llvm-branch-commits] [llvm] [RISCV] Add initial support of memcmp expansion (PR #107548)
@@ -2113,3 +2113,17 @@ bool RISCVTTIImpl::shouldConsiderAddressTypePromotion(
   }
   return Considerable;
 }
+
+RISCVTTIImpl::TTI::MemCmpExpansionOptions
+RISCVTTIImpl::enableMemCmpExpansion(bool OptSize, bool IsZeroCmp) const {
+  TTI::MemCmpExpansionOptions Options;
+  // FIXME: Vector haven't been tested.
+  Options.AllowOverlappingLoads =
+      (ST->enableUnalignedScalarMem() || ST->enableUnalignedVectorMem());
+  Options.MaxNumLoads = TLI->getMaxExpandSizeMemcmp(OptSize);
+  Options.NumLoadsPerBlock = Options.MaxNumLoads;
+  if (ST->is64Bit())

topperc wrote:

I wonder if this might be better

```
if (ST->is64Bit())
  Options.LoadSize = {8, 4, 2, 1};
else
  Options.LoadSize = {4, 2, 1};
```

https://github.com/llvm/llvm-project/pull/107548
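To see what the two load-size lists in the suggestion above imply, here is a minimal sketch of a greedy split of a compare size into loads. `greedyLoads` is a hypothetical helper written for this illustration; it is a simplified model and ignores other parts of the expansion options such as overlapping loads and the `MaxNumLoads` limit.

```cpp
#include <cstdint>
#include <vector>

// Splits Size bytes into a sequence of loads, always taking the largest
// allowed load that still fits. LoadSizes is expected in decreasing order
// and should contain 1 so that every size can be covered.
std::vector<unsigned> greedyLoads(uint64_t Size,
                                  const std::vector<unsigned> &LoadSizes) {
  std::vector<unsigned> Seq;
  for (unsigned LS : LoadSizes) {
    if (LS == 0)
      continue; // guard against a malformed list
    while (Size >= LS) {
      Seq.push_back(LS);
      Size -= LS;
    }
  }
  return Seq;
}
```

With `{8, 4, 2, 1}` a 16-byte compare becomes two 8-byte loads per operand, while with `{4, 2, 1}` it becomes four 4-byte loads; a 7-byte compare splits into 4, 2 and 1 with either list.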
[llvm-branch-commits] [llvm] [RISCV] Add initial support of memcmp expansion (PR #107548)
@@ -112,42 +104,46 @@ entry:
 define i32 @bcmp_size_2(ptr %s1, ptr %s2) nounwind optsize {
 ; CHECK-ALIGNED-RV32-LABEL: bcmp_size_2:
 ; CHECK-ALIGNED-RV32:       # %bb.0: # %entry
-; CHECK-ALIGNED-RV32-NEXT:    addi sp, sp, -16
-; CHECK-ALIGNED-RV32-NEXT:    sw ra, 12(sp) # 4-byte Folded Spill
-; CHECK-ALIGNED-RV32-NEXT:    li a2, 2
-; CHECK-ALIGNED-RV32-NEXT:    call bcmp
-; CHECK-ALIGNED-RV32-NEXT:    lw ra, 12(sp) # 4-byte Folded Reload
-; CHECK-ALIGNED-RV32-NEXT:    addi sp, sp, 16
+; CHECK-ALIGNED-RV32-NEXT:    lbu a2, 1(a0)

topperc wrote:

Would it be cheaper to Xor all the bytes individually and then Or the xor results together?

https://github.com/llvm/llvm-project/pull/107548
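The question above contrasts two ways of doing a 2-byte equality compare when only byte loads are allowed. A minimal sketch of both follows; the function names are hypothetical, and the operation counts in the comments are for the C++ expressions as written, not for any particular generated code.

```cpp
#include <cstdint>

// Strategy A: rebuild each 2-byte value from its bytes, then compare once.
// Per call: 4 byte loads, 2 shifts, 2 ORs, 1 XOR.
bool differsCombine(const unsigned char *A, const unsigned char *B) {
  uint16_t X = static_cast<uint16_t>(A[0] | (A[1] << 8));
  uint16_t Y = static_cast<uint16_t>(B[0] | (B[1] << 8));
  return (X ^ Y) != 0;
}

// Strategy B: XOR corresponding bytes, then OR the differences together.
// Per call: 4 byte loads, 2 XORs, 1 OR.
bool differsXorOr(const unsigned char *A, const unsigned char *B) {
  unsigned D = static_cast<unsigned>(A[0] ^ B[0]) |
               static_cast<unsigned>(A[1] ^ B[1]);
  return D != 0;
}
```

Both return the same answer for an equality-only compare such as `bcmp`. Strategy B does not give an ordering, so it would not apply directly to a `memcmp` whose result sign is used.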
[llvm-branch-commits] [llvm] [RISCV] Add initial support of memcmp expansion (PR #107548)
@@ -1144,42 +2872,116 @@ entry: define i32 @memcmp_size_4(ptr %s1, ptr %s2) nounwind { ; CHECK-ALIGNED-RV32-LABEL: memcmp_size_4: ; CHECK-ALIGNED-RV32: # %bb.0: # %entry -; CHECK-ALIGNED-RV32-NEXT:addi sp, sp, -16 -; CHECK-ALIGNED-RV32-NEXT:sw ra, 12(sp) # 4-byte Folded Spill -; CHECK-ALIGNED-RV32-NEXT:li a2, 4 -; CHECK-ALIGNED-RV32-NEXT:call memcmp -; CHECK-ALIGNED-RV32-NEXT:lw ra, 12(sp) # 4-byte Folded Reload -; CHECK-ALIGNED-RV32-NEXT:addi sp, sp, 16 +; CHECK-ALIGNED-RV32-NEXT:lbu a2, 0(a0) +; CHECK-ALIGNED-RV32-NEXT:lbu a3, 1(a0) +; CHECK-ALIGNED-RV32-NEXT:lbu a4, 3(a0) +; CHECK-ALIGNED-RV32-NEXT:lbu a0, 2(a0) +; CHECK-ALIGNED-RV32-NEXT:lbu a5, 0(a1) +; CHECK-ALIGNED-RV32-NEXT:lbu a6, 1(a1) +; CHECK-ALIGNED-RV32-NEXT:lbu a7, 3(a1) +; CHECK-ALIGNED-RV32-NEXT:lbu a1, 2(a1) +; CHECK-ALIGNED-RV32-NEXT:slli a0, a0, 8 +; CHECK-ALIGNED-RV32-NEXT:or a0, a0, a4 +; CHECK-ALIGNED-RV32-NEXT:slli a3, a3, 16 +; CHECK-ALIGNED-RV32-NEXT:slli a2, a2, 24 +; CHECK-ALIGNED-RV32-NEXT:or a2, a2, a3 +; CHECK-ALIGNED-RV32-NEXT:or a0, a2, a0 +; CHECK-ALIGNED-RV32-NEXT:slli a1, a1, 8 +; CHECK-ALIGNED-RV32-NEXT:or a1, a1, a7 +; CHECK-ALIGNED-RV32-NEXT:slli a6, a6, 16 +; CHECK-ALIGNED-RV32-NEXT:slli a5, a5, 24 +; CHECK-ALIGNED-RV32-NEXT:or a2, a5, a6 +; CHECK-ALIGNED-RV32-NEXT:or a1, a2, a1 +; CHECK-ALIGNED-RV32-NEXT:sltu a2, a1, a0 +; CHECK-ALIGNED-RV32-NEXT:sltu a0, a0, a1 +; CHECK-ALIGNED-RV32-NEXT:sub a0, a2, a0 ; CHECK-ALIGNED-RV32-NEXT:ret ; ; CHECK-ALIGNED-RV64-LABEL: memcmp_size_4: ; CHECK-ALIGNED-RV64: # %bb.0: # %entry -; CHECK-ALIGNED-RV64-NEXT:addi sp, sp, -16 -; CHECK-ALIGNED-RV64-NEXT:sd ra, 8(sp) # 8-byte Folded Spill -; CHECK-ALIGNED-RV64-NEXT:li a2, 4 -; CHECK-ALIGNED-RV64-NEXT:call memcmp -; CHECK-ALIGNED-RV64-NEXT:ld ra, 8(sp) # 8-byte Folded Reload -; CHECK-ALIGNED-RV64-NEXT:addi sp, sp, 16 +; CHECK-ALIGNED-RV64-NEXT:lbu a2, 0(a0) +; CHECK-ALIGNED-RV64-NEXT:lbu a3, 1(a0) +; CHECK-ALIGNED-RV64-NEXT:lbu a4, 2(a0) +; CHECK-ALIGNED-RV64-NEXT:lb a0, 3(a0) +; 
CHECK-ALIGNED-RV64-NEXT:lbu a5, 0(a1) +; CHECK-ALIGNED-RV64-NEXT:lbu a6, 1(a1) +; CHECK-ALIGNED-RV64-NEXT:lbu a7, 2(a1) +; CHECK-ALIGNED-RV64-NEXT:lb a1, 3(a1) +; CHECK-ALIGNED-RV64-NEXT:andi a0, a0, 255 +; CHECK-ALIGNED-RV64-NEXT:slli a4, a4, 8 +; CHECK-ALIGNED-RV64-NEXT:or a0, a4, a0 +; CHECK-ALIGNED-RV64-NEXT:slli a3, a3, 16 +; CHECK-ALIGNED-RV64-NEXT:slliw a2, a2, 24 +; CHECK-ALIGNED-RV64-NEXT:or a2, a2, a3 +; CHECK-ALIGNED-RV64-NEXT:or a0, a2, a0 +; CHECK-ALIGNED-RV64-NEXT:andi a1, a1, 255 +; CHECK-ALIGNED-RV64-NEXT:slli a7, a7, 8 +; CHECK-ALIGNED-RV64-NEXT:or a1, a7, a1 +; CHECK-ALIGNED-RV64-NEXT:slli a6, a6, 16 +; CHECK-ALIGNED-RV64-NEXT:slliw a2, a5, 24 +; CHECK-ALIGNED-RV64-NEXT:or a2, a2, a6 +; CHECK-ALIGNED-RV64-NEXT:or a1, a2, a1 +; CHECK-ALIGNED-RV64-NEXT:sltu a2, a1, a0 +; CHECK-ALIGNED-RV64-NEXT:sltu a0, a0, a1 +; CHECK-ALIGNED-RV64-NEXT:sub a0, a2, a0 ; CHECK-ALIGNED-RV64-NEXT:ret ; ; CHECK-UNALIGNED-RV32-LABEL: memcmp_size_4: ; CHECK-UNALIGNED-RV32: # %bb.0: # %entry -; CHECK-UNALIGNED-RV32-NEXT:addi sp, sp, -16 -; CHECK-UNALIGNED-RV32-NEXT:sw ra, 12(sp) # 4-byte Folded Spill -; CHECK-UNALIGNED-RV32-NEXT:li a2, 4 -; CHECK-UNALIGNED-RV32-NEXT:call memcmp -; CHECK-UNALIGNED-RV32-NEXT:lw ra, 12(sp) # 4-byte Folded Reload -; CHECK-UNALIGNED-RV32-NEXT:addi sp, sp, 16 +; CHECK-UNALIGNED-RV32-NEXT:lw a0, 0(a0) topperc wrote: What is this code doing? It seems way more complicated than a 4 byte memcmp should be. https://github.com/llvm/llvm-project/pull/107548 ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
[llvm-branch-commits] [llvm] [RISCV] Add initial support of memcmp expansion (PR #107548)
https://github.com/topperc edited https://github.com/llvm/llvm-project/pull/107548
[llvm-branch-commits] [llvm] [RISCV] Add initial support of memcmp expansion (PR #107548)
@@ -1144,42 +2872,116 @@ entry: define i32 @memcmp_size_4(ptr %s1, ptr %s2) nounwind { ; CHECK-ALIGNED-RV32-LABEL: memcmp_size_4: ; CHECK-ALIGNED-RV32: # %bb.0: # %entry -; CHECK-ALIGNED-RV32-NEXT:addi sp, sp, -16 -; CHECK-ALIGNED-RV32-NEXT:sw ra, 12(sp) # 4-byte Folded Spill -; CHECK-ALIGNED-RV32-NEXT:li a2, 4 -; CHECK-ALIGNED-RV32-NEXT:call memcmp -; CHECK-ALIGNED-RV32-NEXT:lw ra, 12(sp) # 4-byte Folded Reload -; CHECK-ALIGNED-RV32-NEXT:addi sp, sp, 16 +; CHECK-ALIGNED-RV32-NEXT:lbu a2, 0(a0) +; CHECK-ALIGNED-RV32-NEXT:lbu a3, 1(a0) +; CHECK-ALIGNED-RV32-NEXT:lbu a4, 3(a0) +; CHECK-ALIGNED-RV32-NEXT:lbu a0, 2(a0) +; CHECK-ALIGNED-RV32-NEXT:lbu a5, 0(a1) +; CHECK-ALIGNED-RV32-NEXT:lbu a6, 1(a1) +; CHECK-ALIGNED-RV32-NEXT:lbu a7, 3(a1) +; CHECK-ALIGNED-RV32-NEXT:lbu a1, 2(a1) +; CHECK-ALIGNED-RV32-NEXT:slli a0, a0, 8 +; CHECK-ALIGNED-RV32-NEXT:or a0, a0, a4 +; CHECK-ALIGNED-RV32-NEXT:slli a3, a3, 16 +; CHECK-ALIGNED-RV32-NEXT:slli a2, a2, 24 +; CHECK-ALIGNED-RV32-NEXT:or a2, a2, a3 +; CHECK-ALIGNED-RV32-NEXT:or a0, a2, a0 +; CHECK-ALIGNED-RV32-NEXT:slli a1, a1, 8 +; CHECK-ALIGNED-RV32-NEXT:or a1, a1, a7 +; CHECK-ALIGNED-RV32-NEXT:slli a6, a6, 16 +; CHECK-ALIGNED-RV32-NEXT:slli a5, a5, 24 +; CHECK-ALIGNED-RV32-NEXT:or a2, a5, a6 +; CHECK-ALIGNED-RV32-NEXT:or a1, a2, a1 +; CHECK-ALIGNED-RV32-NEXT:sltu a2, a1, a0 +; CHECK-ALIGNED-RV32-NEXT:sltu a0, a0, a1 +; CHECK-ALIGNED-RV32-NEXT:sub a0, a2, a0 ; CHECK-ALIGNED-RV32-NEXT:ret ; ; CHECK-ALIGNED-RV64-LABEL: memcmp_size_4: ; CHECK-ALIGNED-RV64: # %bb.0: # %entry -; CHECK-ALIGNED-RV64-NEXT:addi sp, sp, -16 -; CHECK-ALIGNED-RV64-NEXT:sd ra, 8(sp) # 8-byte Folded Spill -; CHECK-ALIGNED-RV64-NEXT:li a2, 4 -; CHECK-ALIGNED-RV64-NEXT:call memcmp -; CHECK-ALIGNED-RV64-NEXT:ld ra, 8(sp) # 8-byte Folded Reload -; CHECK-ALIGNED-RV64-NEXT:addi sp, sp, 16 +; CHECK-ALIGNED-RV64-NEXT:lbu a2, 0(a0) +; CHECK-ALIGNED-RV64-NEXT:lbu a3, 1(a0) +; CHECK-ALIGNED-RV64-NEXT:lbu a4, 2(a0) +; CHECK-ALIGNED-RV64-NEXT:lb a0, 3(a0) +; 
CHECK-ALIGNED-RV64-NEXT:lbu a5, 0(a1) +; CHECK-ALIGNED-RV64-NEXT:lbu a6, 1(a1) +; CHECK-ALIGNED-RV64-NEXT:lbu a7, 2(a1) +; CHECK-ALIGNED-RV64-NEXT:lb a1, 3(a1) +; CHECK-ALIGNED-RV64-NEXT:andi a0, a0, 255 +; CHECK-ALIGNED-RV64-NEXT:slli a4, a4, 8 +; CHECK-ALIGNED-RV64-NEXT:or a0, a4, a0 +; CHECK-ALIGNED-RV64-NEXT:slli a3, a3, 16 +; CHECK-ALIGNED-RV64-NEXT:slliw a2, a2, 24 +; CHECK-ALIGNED-RV64-NEXT:or a2, a2, a3 +; CHECK-ALIGNED-RV64-NEXT:or a0, a2, a0 +; CHECK-ALIGNED-RV64-NEXT:andi a1, a1, 255 +; CHECK-ALIGNED-RV64-NEXT:slli a7, a7, 8 +; CHECK-ALIGNED-RV64-NEXT:or a1, a7, a1 +; CHECK-ALIGNED-RV64-NEXT:slli a6, a6, 16 +; CHECK-ALIGNED-RV64-NEXT:slliw a2, a5, 24 +; CHECK-ALIGNED-RV64-NEXT:or a2, a2, a6 +; CHECK-ALIGNED-RV64-NEXT:or a1, a2, a1 +; CHECK-ALIGNED-RV64-NEXT:sltu a2, a1, a0 +; CHECK-ALIGNED-RV64-NEXT:sltu a0, a0, a1 +; CHECK-ALIGNED-RV64-NEXT:sub a0, a2, a0 ; CHECK-ALIGNED-RV64-NEXT:ret ; ; CHECK-UNALIGNED-RV32-LABEL: memcmp_size_4: ; CHECK-UNALIGNED-RV32: # %bb.0: # %entry -; CHECK-UNALIGNED-RV32-NEXT:addi sp, sp, -16 -; CHECK-UNALIGNED-RV32-NEXT:sw ra, 12(sp) # 4-byte Folded Spill -; CHECK-UNALIGNED-RV32-NEXT:li a2, 4 -; CHECK-UNALIGNED-RV32-NEXT:call memcmp -; CHECK-UNALIGNED-RV32-NEXT:lw ra, 12(sp) # 4-byte Folded Reload -; CHECK-UNALIGNED-RV32-NEXT:addi sp, sp, 16 +; CHECK-UNALIGNED-RV32-NEXT:lw a0, 0(a0) topperc wrote: I guess this is an expanded byteswap? https://github.com/llvm/llvm-project/pull/107548 ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
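Why a byte swap shows up in a 4-byte memcmp can be sketched in plain C++. memcmp orders buffers by their first differing byte, so on a little-endian target a native word load has to be converted to big-endian before an unsigned compare. The sketch below assumes a little-endian host; `bswap32` and `memcmp4` are illustrative names, not LLVM functions.

```cpp
#include <cstdint>
#include <cstring>

// Portable 32-bit byte swap written with shifts and masks, similar in
// spirit to what a target without a byte-reverse instruction has to expand.
uint32_t bswap32(uint32_t v) {
  return (v >> 24) | ((v >> 8) & 0x0000FF00u) |
         ((v << 8) & 0x00FF0000u) | (v << 24);
}

// Sketch of an inlined 4-byte memcmp. Assumption: little-endian host, so
// the native load must be swapped to get lexicographic byte order.
int memcmp4(const void *s1, const void *s2) {
  uint32_t a, b;
  std::memcpy(&a, s1, 4); // unaligned-safe word load
  std::memcpy(&b, s2, 4);
  a = bswap32(a);
  b = bswap32(b);
  // Same shape as the sltu/sltu/sub sequence in the checks: yields -1, 0, 1.
  return (a > b) - (a < b);
}
```

On the aligned configuration the same big-endian value is assembled from four byte loads with shifts and ors instead, which accounts for the long instruction sequences in the test.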
[llvm-branch-commits] [llvm] [RISCV] Add initial support of memcmp expansion (PR #107548)
topperc wrote:

> The run just finished, I'm seeing a 0.75% improvement on 500.perlbench_r on
> the BPI F3 (-O3 -mcpu=spacemit-x60), no regressions or improvements on the
> other benchmarks as far as I can see. Seems to check out with the number of
> memcmps inlined reported for perlbench!

Does spacemit-x60 support unaligned scalar memory and was your test with or without that enabled?

https://github.com/llvm/llvm-project/pull/107548
[llvm-branch-commits] [llvm] [RISCV] Add initial support of memcmp expansion (PR #107548)
wangpc-pp wrote:

> > The run just finished, I'm seeing a 0.75% improvement on 500.perlbench_r on
> > the BPI F3 (-O3 -mcpu=spacemit-x60), no regressions or improvements on the
> > other benchmarks as far as I can see. Seems to check out with the number of
> > memcmps inlined reported for perlbench!
>
> Does spacemit-x60 support unaligned scalar memory and was your test with or
> without that enabled?

It supports unaligned scalar but not unaligned vector. And it seems we don't add these features to `-mcpu=spacemit-x60`. So I think @lukel97 ran the SPEC without unaligned scalar.

https://github.com/llvm/llvm-project/pull/107548
[llvm-branch-commits] [llvm] [RISCV] Add initial support of memcmp expansion (PR #107548)
lukel97 wrote:

> > > The run just finished, I'm seeing a 0.75% improvement on 500.perlbench_r
> > > on the BPI F3 (-O3 -mcpu=spacemit-x60), no regressions or improvements on
> > > the other benchmarks as far as I can see. Seems to check out with the
> > > number of memcmps inlined reported for perlbench!
> >
> > Does spacemit-x60 support unaligned scalar memory and was your test with or
> > without that enabled?
>
> It supports unaligned scalar but not unaligned vector. And it seems we don't
> add these features to `-mcpu=spacemit-x60`. So I think @lukel97 ran the SPEC
> without unaligned scalar.

Yeah, -mno-strict-align gave a bus error. I ultimately built it without unaligned scalar since I wasn't sure if unaligned scalar was performant or not.

https://github.com/llvm/llvm-project/pull/107548
[llvm-branch-commits] [llvm] [RISCV] Add initial support of memcmp expansion (PR #107548)
wangpc-pp wrote:

> > > > The run just finished, I'm seeing a 0.75% improvement on
> > > > 500.perlbench_r on the BPI F3 (-O3 -mcpu=spacemit-x60), no regressions
> > > > or improvements on the other benchmarks as far as I can see. Seems to
> > > > check out with the number of memcmps inlined reported for perlbench!
> > >
> > > Does spacemit-x60 support unaligned scalar memory and was your test with
> > > or without that enabled?
> >
> > It supports unaligned scalar but not unaligned vector. And it seems we
> > don't add these features to `-mcpu=spacemit-x60`. So I think @lukel97 ran
> > the SPEC without unaligned scalar.
>
> Yeah, -mno-strict-align gave a bus error. I ultimately built it without
> unaligned scalar since I wasn't sure if unaligned scalar was performant or
> not.

IIRC, we have separated this into two options (scalar/vector) now. So maybe we can specify the scalar one.

https://github.com/llvm/llvm-project/pull/107548
[llvm-branch-commits] [llvm] [RISCV] Add initial support of memcmp expansion (PR #107548)
https://github.com/wangpc-pp updated https://github.com/llvm/llvm-project/pull/107548 >From f21cfcfc90330ee3856746b6315a81a00313b0e0 Mon Sep 17 00:00:00 2001 From: Wang Pengcheng Date: Fri, 6 Sep 2024 17:20:51 +0800 Subject: [PATCH 1/5] =?UTF-8?q?[=F0=9D=98=80=F0=9D=97=BD=F0=9D=97=BF]=20in?= =?UTF-8?q?itial=20version?= MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Created using spr 1.3.6-beta.1 --- .../Target/RISCV/RISCVTargetTransformInfo.cpp | 15 + .../Target/RISCV/RISCVTargetTransformInfo.h | 3 + llvm/test/CodeGen/RISCV/memcmp.ll | 932 ++ 3 files changed, 950 insertions(+) create mode 100644 llvm/test/CodeGen/RISCV/memcmp.ll diff --git a/llvm/lib/Target/RISCV/RISCVTargetTransformInfo.cpp b/llvm/lib/Target/RISCV/RISCVTargetTransformInfo.cpp index e809e15eacf696..ad532aadc83266 100644 --- a/llvm/lib/Target/RISCV/RISCVTargetTransformInfo.cpp +++ b/llvm/lib/Target/RISCV/RISCVTargetTransformInfo.cpp @@ -2113,3 +2113,18 @@ bool RISCVTTIImpl::shouldConsiderAddressTypePromotion( } return Considerable; } + +RISCVTTIImpl::TTI::MemCmpExpansionOptions +RISCVTTIImpl::enableMemCmpExpansion(bool OptSize, bool IsZeroCmp) const { + TTI::MemCmpExpansionOptions Options; + // FIXME: Vector haven't been tested. 
+ Options.AllowOverlappingLoads = + (ST->enableUnalignedScalarMem() || ST->enableUnalignedScalarMem()); + Options.MaxNumLoads = TLI->getMaxExpandSizeMemcmp(OptSize); + Options.NumLoadsPerBlock = Options.MaxNumLoads; + if (ST->is64Bit()) +Options.LoadSizes.push_back(8); + llvm::append_range(Options.LoadSizes, ArrayRef({4, 2, 1})); + Options.AllowedTailExpansions = {3, 5, 6}; + return Options; +} diff --git a/llvm/lib/Target/RISCV/RISCVTargetTransformInfo.h b/llvm/lib/Target/RISCV/RISCVTargetTransformInfo.h index 763b89bfec0a66..ee9bed09df97f3 100644 --- a/llvm/lib/Target/RISCV/RISCVTargetTransformInfo.h +++ b/llvm/lib/Target/RISCV/RISCVTargetTransformInfo.h @@ -404,6 +404,9 @@ class RISCVTTIImpl : public BasicTTIImplBase { shouldConsiderAddressTypePromotion(const Instruction &I, bool &AllowPromotionWithoutCommonHeader); std::optional getMinPageSize() const { return 4096; } + + TTI::MemCmpExpansionOptions enableMemCmpExpansion(bool OptSize, +bool IsZeroCmp) const; }; } // end namespace llvm diff --git a/llvm/test/CodeGen/RISCV/memcmp.ll b/llvm/test/CodeGen/RISCV/memcmp.ll new file mode 100644 index 00..652cd02e2c750a --- /dev/null +++ b/llvm/test/CodeGen/RISCV/memcmp.ll @@ -0,0 +1,932 @@ +; NOTE: Assertions have been autogenerated by utils/update_llc_test_checks.py UTC_ARGS: --version 5 +; RUN: sed 's/iXLen/i32/g' %s | llc -mtriple=riscv32 -O2 | FileCheck %s --check-prefix=CHECK-ALIGNED-RV32 +; RUN: sed 's/iXLen/i64/g' %s | llc -mtriple=riscv64 -O2 | FileCheck %s --check-prefix=CHECK-ALIGNED-RV64 +; RUN: sed 's/iXLen/i32/g' %s | llc -mtriple=riscv32 -mattr=+unaligned-scalar-mem -O2 \ +; RUN: | FileCheck %s --check-prefix=CHECK-UNALIGNED-RV32 +; RUN: sed 's/iXLen/i64/g' %s | llc -mtriple=riscv64 -mattr=+unaligned-scalar-mem -O2 \ +; RUN: | FileCheck %s --check-prefix=CHECK-UNALIGNED-RV64 + +declare i32 @bcmp(i8*, i8*, iXLen) nounwind readonly +declare i32 @memcmp(i8*, i8*, iXLen) nounwind readonly + +define i1 @bcmp_size_15(i8* %s1, i8* %s2) { +; 
CHECK-ALIGNED-RV32-LABEL: bcmp_size_15: +; CHECK-ALIGNED-RV32: # %bb.0: # %entry +; CHECK-ALIGNED-RV32-NEXT:lbu a2, 1(a0) +; CHECK-ALIGNED-RV32-NEXT:lbu a3, 0(a0) +; CHECK-ALIGNED-RV32-NEXT:lbu a4, 2(a0) +; CHECK-ALIGNED-RV32-NEXT:lbu a5, 3(a0) +; CHECK-ALIGNED-RV32-NEXT:slli a2, a2, 8 +; CHECK-ALIGNED-RV32-NEXT:or a2, a2, a3 +; CHECK-ALIGNED-RV32-NEXT:slli a4, a4, 16 +; CHECK-ALIGNED-RV32-NEXT:slli a5, a5, 24 +; CHECK-ALIGNED-RV32-NEXT:or a4, a5, a4 +; CHECK-ALIGNED-RV32-NEXT:or a2, a4, a2 +; CHECK-ALIGNED-RV32-NEXT:lbu a3, 1(a1) +; CHECK-ALIGNED-RV32-NEXT:lbu a4, 0(a1) +; CHECK-ALIGNED-RV32-NEXT:lbu a5, 2(a1) +; CHECK-ALIGNED-RV32-NEXT:lbu a6, 3(a1) +; CHECK-ALIGNED-RV32-NEXT:slli a3, a3, 8 +; CHECK-ALIGNED-RV32-NEXT:or a3, a3, a4 +; CHECK-ALIGNED-RV32-NEXT:slli a5, a5, 16 +; CHECK-ALIGNED-RV32-NEXT:slli a6, a6, 24 +; CHECK-ALIGNED-RV32-NEXT:or a4, a6, a5 +; CHECK-ALIGNED-RV32-NEXT:or a3, a4, a3 +; CHECK-ALIGNED-RV32-NEXT:xor a2, a2, a3 +; CHECK-ALIGNED-RV32-NEXT:lbu a3, 5(a0) +; CHECK-ALIGNED-RV32-NEXT:lbu a4, 4(a0) +; CHECK-ALIGNED-RV32-NEXT:lbu a5, 6(a0) +; CHECK-ALIGNED-RV32-NEXT:lbu a6, 7(a0) +; CHECK-ALIGNED-RV32-NEXT:slli a3, a3, 8 +; CHECK-ALIGNED-RV32-NEXT:or a3, a3, a4 +; CHECK-ALIGNED-RV32-NEXT:slli a5, a5, 16 +; CHECK-ALIGNED-RV32-NEXT:slli a6, a6, 24 +; CHECK-ALIGNED-RV32-NEXT:or a4, a6, a5 +; CHECK-ALIGNED-RV32-NEXT:or a3, a4, a3 +; CHECK-ALIGNED-RV32-NEXT:lbu a4, 5(a1) +; CHECK-ALIGNED-RV32-NEXT
[llvm-branch-commits] [llvm] release/19.x: [DAGCombiner] cache negative result from getMergeStoreCandidates() (#106949) (PR #108397)
https://github.com/arsenm approved this pull request. https://github.com/llvm/llvm-project/pull/108397
[llvm-branch-commits] [clang] 8290ce0 - [Clang] Fix handling of placeholder variables name in init captures (#107055)
Author: cor3ntin Date: 2024-09-12T12:41:44+02:00 New Revision: 8290ce0998788b6a575ed7b4988b093f48c25b3d URL: https://github.com/llvm/llvm-project/commit/8290ce0998788b6a575ed7b4988b093f48c25b3d DIFF: https://github.com/llvm/llvm-project/commit/8290ce0998788b6a575ed7b4988b093f48c25b3d.diff LOG: [Clang] Fix handling of placeholder variables name in init captures (#107055) We were incorrectly not deduplicating results when looking up `_` which, for a lambda init capture, would result in an ambiguous lookup. The same bug caused some diagnostic notes to be emitted twice. Fixes #107024 Added: Modified: clang/docs/ReleaseNotes.rst clang/lib/Sema/SemaLambda.cpp clang/lib/Sema/SemaLookup.cpp clang/test/SemaCXX/cxx2c-placeholder-vars.cpp Removed: diff --git a/clang/docs/ReleaseNotes.rst b/clang/docs/ReleaseNotes.rst index 53d819c6c44574..8c7a6ba70acd28 100644 --- a/clang/docs/ReleaseNotes.rst +++ b/clang/docs/ReleaseNotes.rst @@ -1122,6 +1122,7 @@ Bug Fixes to C++ Support - Fixed a crash-on-invalid bug involving extraneous template parameter with concept substitution. (#GH73885) - Fixed assertion failure by skipping the analysis of an invalid field declaration. (#GH99868) - Fix an issue with dependent source location expressions (#GH106428), (#GH81155), (#GH80210), (#GH85373) +- Fix handling of ``_`` as the name of a lambda's init capture variable. (#GH107024) Bug Fixes to AST Handling diff --git a/clang/lib/Sema/SemaLambda.cpp b/clang/lib/Sema/SemaLambda.cpp index 601077e9f3334d..809b94bb7412b9 100644 --- a/clang/lib/Sema/SemaLambda.cpp +++ b/clang/lib/Sema/SemaLambda.cpp @@ -1318,7 +1318,6 @@ void Sema::ActOnLambdaExpressionAfterIntroducer(LambdaIntroducer &Intro, if (C->Init.isUsable()) { addInitCapture(LSI, cast(Var), C->Kind == LCK_ByRef); - PushOnScopeChains(Var, CurScope, false); } else { TryCaptureKind Kind = C->Kind == LCK_ByRef ? 
TryCapture_ExplicitByRef : TryCapture_ExplicitByVal; diff --git a/clang/lib/Sema/SemaLookup.cpp b/clang/lib/Sema/SemaLookup.cpp index 7a6a64529f52ec..d3d4bf27ae7283 100644 --- a/clang/lib/Sema/SemaLookup.cpp +++ b/clang/lib/Sema/SemaLookup.cpp @@ -570,7 +570,7 @@ void LookupResult::resolveKind() { // For non-type declarations, check for a prior lookup result naming this // canonical declaration. -if (!D->isPlaceholderVar(getSema().getLangOpts()) && !ExistingI) { +if (!ExistingI) { auto UniqueResult = Unique.insert(std::make_pair(D, I)); if (!UniqueResult.second) { // We've seen this entity before. diff --git a/clang/test/SemaCXX/cxx2c-placeholder-vars.cpp b/clang/test/SemaCXX/cxx2c-placeholder-vars.cpp index 5cf66b48784e91..29ca3b5ef3df72 100644 --- a/clang/test/SemaCXX/cxx2c-placeholder-vars.cpp +++ b/clang/test/SemaCXX/cxx2c-placeholder-vars.cpp @@ -50,14 +50,16 @@ void f() { void lambda() { (void)[_ = 0, _ = 1] { // expected-warning {{placeholder variables are incompatible with C++ standards before C++2c}} \ - // expected-note 4{{placeholder declared here}} + // expected-note 2{{placeholder declared here}} (void)_++; // expected-error {{ambiguous reference to placeholder '_', which is defined multiple times}} }; { int _ = 12; -(void)[_ = 0]{}; // no warning (different scope) +(void)[_ = 0]{ return _;}; // no warning (different scope) } + +auto GH107024 = [_ = 42]() { return _; }(); } namespace global_var { ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
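The case covered by the new `GH107024` test line can be reproduced as a standalone snippet. With a single `_` init capture the name is an ordinary, unambiguous variable, so reading it inside the lambda body should compile and return the captured value. The wrapper function `capturedPlaceholder` is only there to make the sketch self-contained; it is not part of the Clang test.

```cpp
// A lambda whose only init capture is named `_`. Because `_` is declared
// exactly once in the lambda's scope, the reference in the body is not
// ambiguous, which is the behavior the fix restores.
int capturedPlaceholder() {
  auto value = [_ = 42]() { return _; }();
  return value;
}
```

Declaring `_` twice in the same capture list, as in the `[_ = 0, _ = 1]` test case, is the C++2c placeholder form; there a use of `_` in the body is still expected to be diagnosed as ambiguous.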
[llvm-branch-commits] [clang] [Clang] Fix handling of placeholder variables name in init captures (#107055) (PR #107214)
https://github.com/tru closed https://github.com/llvm/llvm-project/pull/107214
[llvm-branch-commits] [clang] [Clang] Fix handling of placeholder variables name in init captures (#107055) (PR #107214)
github-actions[bot] wrote: @cor3ntin (or anyone else). If you would like to add a note about this fix in the release notes (completely optional). Please reply to this comment with a one or two sentence description of the fix. When you are done, please add the release:note label to this PR. https://github.com/llvm/llvm-project/pull/107214 ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
[llvm-branch-commits] [llvm] release/19.x: workflows/release-binaries: Fix automatic upload (#107315) (PR #108422)
https://github.com/tru updated https://github.com/llvm/llvm-project/pull/108422 >From 32a8b56bbf0a3c7678d44ba690427915446a9a72 Mon Sep 17 00:00:00 2001 From: Tom Stellard Date: Thu, 12 Sep 2024 09:50:57 -0700 Subject: [PATCH] workflows/release-binaries: Fix automatic upload (#107315) (cherry picked from commit ab96409180aaad5417030f06a386253722a99d71) --- .github/workflows/release-binaries.yml | 15 +-- 1 file changed, 13 insertions(+), 2 deletions(-) diff --git a/.github/workflows/release-binaries.yml b/.github/workflows/release-binaries.yml index 509016e5b89c45..fcd371d49e6c91 100644 --- a/.github/workflows/release-binaries.yml +++ b/.github/workflows/release-binaries.yml @@ -450,11 +450,22 @@ jobs: name: ${{ needs.prepare.outputs.release-binary-filename }}-attestation path: ${{ needs.prepare.outputs.release-binary-filename }}.jsonl +- name: Checkout Release Scripts + uses: actions/checkout@b4ffde65f46336ab88eb53be808477a3936bae11 # v4.1.1 + with: +sparse-checkout: | + llvm/utils/release/github-upload-release.py + llvm/utils/git/requirements.txt +sparse-checkout-cone-mode: false + +- name: Install Python Requirements + run: | +pip install --require-hashes -r ./llvm/utils/git/requirements.txt + - name: Upload Release shell: bash run: | -sudo apt install python3-github -./llvm-project/llvm/utils/release/github-upload-release.py \ +./llvm/utils/release/github-upload-release.py \ --token ${{ github.token }} \ --release ${{ needs.prepare.outputs.release-version }} \ upload \ ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
[llvm-branch-commits] [llvm] release/19.x: workflows/release-binaries: Fix automatic upload (#107315) (PR #108422)
https://github.com/tru closed https://github.com/llvm/llvm-project/pull/108422
[llvm-branch-commits] [llvm] release/19.x: workflows/release-binaries: Fix automatic upload (#107315) (PR #108422)
github-actions[bot] wrote: @tstellar (or anyone else). If you would like to add a note about this fix in the release notes (completely optional). Please reply to this comment with a one or two sentence description of the fix. When you are done, please add the release:note label to this PR. https://github.com/llvm/llvm-project/pull/108422 ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
[llvm-branch-commits] [llvm] release/19.x: [SLP]Fix PR104422: Wrong value truncation (PR #104747)
https://github.com/tru updated https://github.com/llvm/llvm-project/pull/104747 >From e475814473c5990a1409e24d4ecd56ce01546fd0 Mon Sep 17 00:00:00 2001 From: Alexey Bataev Date: Thu, 15 Aug 2024 07:21:10 -0700 Subject: [PATCH 1/2] [SLP][NFC]Add a test with incorrect minbitwidth analysis for reduced operands (cherry picked from commit 65ac12d3c9877ecf5b97552364e7eead887d94eb) --- .../X86/operand-is-reduced-val.ll | 46 +++ 1 file changed, 46 insertions(+) create mode 100644 llvm/test/Transforms/SLPVectorizer/X86/operand-is-reduced-val.ll diff --git a/llvm/test/Transforms/SLPVectorizer/X86/operand-is-reduced-val.ll b/llvm/test/Transforms/SLPVectorizer/X86/operand-is-reduced-val.ll new file mode 100644 index 00..5fb93e27539d8e --- /dev/null +++ b/llvm/test/Transforms/SLPVectorizer/X86/operand-is-reduced-val.ll @@ -0,0 +1,46 @@ +; NOTE: Assertions have been autogenerated by utils/update_test_checks.py UTC_ARGS: --version 5 +; RUN: opt -S --passes=slp-vectorizer -mtriple=x86_64-unknown-linux < %s -slp-threshold=-10 | FileCheck %s + +define i64 @src(i32 %a) { +; CHECK-LABEL: define i64 @src( +; CHECK-SAME: i32 [[A:%.*]]) { +; CHECK-NEXT: [[ENTRY:.*:]] +; CHECK-NEXT:[[TMP17:%.*]] = sext i32 [[A]] to i64 +; CHECK-NEXT:[[TMP1:%.*]] = insertelement <4 x i32> poison, i32 [[A]], i32 0 +; CHECK-NEXT:[[TMP2:%.*]] = shufflevector <4 x i32> [[TMP1]], <4 x i32> poison, <4 x i32> zeroinitializer +; CHECK-NEXT:[[TMP3:%.*]] = add <4 x i32> [[TMP2]], +; CHECK-NEXT:[[TMP4:%.*]] = sext <4 x i32> [[TMP3]] to <4 x i64> +; CHECK-NEXT:[[TMP5:%.*]] = and <4 x i32> [[TMP3]], +; CHECK-NEXT:[[TMP6:%.*]] = zext <4 x i32> [[TMP5]] to <4 x i64> +; CHECK-NEXT:[[TMP18:%.*]] = call i64 @llvm.vector.reduce.add.v4i64(<4 x i64> [[TMP6]]) +; CHECK-NEXT:[[TMP16:%.*]] = call i64 @llvm.vector.reduce.add.v4i64(<4 x i64> [[TMP4]]) +; CHECK-NEXT:[[TMP19:%.*]] = add i64 [[TMP18]], [[TMP16]] +; CHECK-NEXT:[[OP_RDX1:%.*]] = add i64 [[TMP19]], 4294967297 +; CHECK-NEXT:[[TMP21:%.*]] = add i64 [[OP_RDX1]], [[TMP17]] 
+; CHECK-NEXT:ret i64 [[TMP21]] +; +entry: + %0 = sext i32 %a to i64 + %1 = add nsw i64 %0, 4294967297 + %2 = sext i32 %a to i64 + %3 = add nsw i64 %2, 4294967297 + %4 = add i64 %3, %1 + %5 = and i64 %3, 1 + %6 = add i64 %4, %5 + %7 = sext i32 %a to i64 + %8 = add nsw i64 %7, 4294967297 + %9 = add i64 %8, %6 + %10 = and i64 %8, 1 + %11 = add i64 %9, %10 + %12 = sext i32 %a to i64 + %13 = add nsw i64 %12, 4294967297 + %14 = add i64 %13, %11 + %15 = and i64 %13, 1 + %16 = add i64 %14, %15 + %17 = sext i32 %a to i64 + %18 = add nsw i64 %17, 4294967297 + %19 = add i64 %18, %16 + %20 = and i64 %18, 1 + %21 = add i64 %19, %20 + ret i64 %21 +} >From a6a1f2ba8cc54e674e0a9f9790c9f226b9cd6a2b Mon Sep 17 00:00:00 2001 From: Alexey Bataev Date: Thu, 15 Aug 2024 07:57:37 -0700 Subject: [PATCH 2/2] [SLP]Fix PR104422: Wrong value truncation The minbitwidth restrictions can be skipped only for immediate reduced values, for other nodes still need to check if external users allow bitwidth reduction. Fixes https://github.com/llvm/llvm-project/issues/104422 (cherry picked from commit 56140a8258a3498cfcd9f0f05c182457d43cbfd2) --- llvm/lib/Transforms/Vectorize/SLPVectorizer.cpp | 3 ++- .../SLPVectorizer/X86/operand-is-reduced-val.ll | 17 ++--- 2 files changed, 12 insertions(+), 8 deletions(-) diff --git a/llvm/lib/Transforms/Vectorize/SLPVectorizer.cpp b/llvm/lib/Transforms/Vectorize/SLPVectorizer.cpp index 2f3d6b27378aee..ab2b96cdc42db8 100644 --- a/llvm/lib/Transforms/Vectorize/SLPVectorizer.cpp +++ b/llvm/lib/Transforms/Vectorize/SLPVectorizer.cpp @@ -15211,7 +15211,8 @@ bool BoUpSLP::collectValuesToDemote( if (any_of(E.Scalars, [&](Value *V) { return !all_of(V->users(), [=](User *U) { return getTreeEntry(U) || - (UserIgnoreList && UserIgnoreList->contains(U)) || + (E.Idx == 0 && UserIgnoreList && + UserIgnoreList->contains(U)) || (!isa(U) && U->getType()->isSized() && !U->getType()->isScalableTy() && DL->getTypeSizeInBits(U->getType()) <= BitWidth); diff --git 
a/llvm/test/Transforms/SLPVectorizer/X86/operand-is-reduced-val.ll b/llvm/test/Transforms/SLPVectorizer/X86/operand-is-reduced-val.ll index 5fb93e27539d8e..5fcac3fbf3bafe 100644 --- a/llvm/test/Transforms/SLPVectorizer/X86/operand-is-reduced-val.ll +++ b/llvm/test/Transforms/SLPVectorizer/X86/operand-is-reduced-val.ll @@ -8,15 +8,18 @@ define i64 @src(i32 %a) { ; CHECK-NEXT:[[TMP17:%.*]] = sext i32 [[A]] to i64 ; CHECK-NEXT:[[TMP1:%.*]] = insertelement <4 x i32> poison, i32 [[A]], i32 0 ; CHECK-NEXT:[[TMP2:%.*]] = shufflevector <4 x i32> [[TMP1]], <4 x i32> poison, <4 x i32> zeroinitializer -; CHECK-NEXT:[[TMP3:%.*]] = add <4 x i32> [[TMP2]], -; CHECK-NEXT:[[TMP4:%.*]] = sext <4 x i32> [[TMP3]] to <4 x i
[llvm-branch-commits] [llvm] release/19.x: [SLP]Fix PR104422: Wrong value truncation (PR #104747)
https://github.com/tru updated https://github.com/llvm/llvm-project/pull/104747 >From 373180b440d04dc3cc0f6111b06684d18779d7c8 Mon Sep 17 00:00:00 2001 From: Alexey Bataev Date: Thu, 15 Aug 2024 07:21:10 -0700 Subject: [PATCH] [SLP]Fix PR104422: Wrong value truncation The minbitwidth restrictions can be skipped only for immediate reduced values, for other nodes still need to check if external users allow bitwidth reduction. Fixes https://github.com/llvm/llvm-project/issues/104422 (cherry picked from commit 56140a8258a3498cfcd9f0f05c182457d43cbfd2) --- .../Transforms/Vectorize/SLPVectorizer.cpp| 3 +- .../X86/operand-is-reduced-val.ll | 49 +++ 2 files changed, 51 insertions(+), 1 deletion(-) create mode 100644 llvm/test/Transforms/SLPVectorizer/X86/operand-is-reduced-val.ll diff --git a/llvm/lib/Transforms/Vectorize/SLPVectorizer.cpp b/llvm/lib/Transforms/Vectorize/SLPVectorizer.cpp index 2f3d6b27378aee..ab2b96cdc42db8 100644 --- a/llvm/lib/Transforms/Vectorize/SLPVectorizer.cpp +++ b/llvm/lib/Transforms/Vectorize/SLPVectorizer.cpp @@ -15211,7 +15211,8 @@ bool BoUpSLP::collectValuesToDemote( if (any_of(E.Scalars, [&](Value *V) { return !all_of(V->users(), [=](User *U) { return getTreeEntry(U) || - (UserIgnoreList && UserIgnoreList->contains(U)) || + (E.Idx == 0 && UserIgnoreList && + UserIgnoreList->contains(U)) || (!isa(U) && U->getType()->isSized() && !U->getType()->isScalableTy() && DL->getTypeSizeInBits(U->getType()) <= BitWidth); diff --git a/llvm/test/Transforms/SLPVectorizer/X86/operand-is-reduced-val.ll b/llvm/test/Transforms/SLPVectorizer/X86/operand-is-reduced-val.ll new file mode 100644 index 00..5fcac3fbf3bafe --- /dev/null +++ b/llvm/test/Transforms/SLPVectorizer/X86/operand-is-reduced-val.ll @@ -0,0 +1,49 @@ +; NOTE: Assertions have been autogenerated by utils/update_test_checks.py UTC_ARGS: --version 5 +; RUN: opt -S --passes=slp-vectorizer -mtriple=x86_64-unknown-linux < %s -slp-threshold=-10 | FileCheck %s + +define i64 @src(i32 %a) { +; CHECK-LABEL: 
define i64 @src( +; CHECK-SAME: i32 [[A:%.*]]) { +; CHECK-NEXT: [[ENTRY:.*:]] +; CHECK-NEXT:[[TMP17:%.*]] = sext i32 [[A]] to i64 +; CHECK-NEXT:[[TMP1:%.*]] = insertelement <4 x i32> poison, i32 [[A]], i32 0 +; CHECK-NEXT:[[TMP2:%.*]] = shufflevector <4 x i32> [[TMP1]], <4 x i32> poison, <4 x i32> zeroinitializer +; CHECK-NEXT:[[TMP3:%.*]] = sext <4 x i32> [[TMP2]] to <4 x i64> +; CHECK-NEXT:[[TMP4:%.*]] = add nsw <4 x i64> [[TMP3]], +; CHECK-NEXT:[[TMP6:%.*]] = and <4 x i64> [[TMP4]], +; CHECK-NEXT:[[TMP18:%.*]] = call i64 @llvm.vector.reduce.add.v4i64(<4 x i64> [[TMP6]]) +; CHECK-NEXT:[[TMP16:%.*]] = call i64 @llvm.vector.reduce.add.v4i64(<4 x i64> [[TMP4]]) +; CHECK-NEXT:[[TMP8:%.*]] = insertelement <2 x i64> poison, i64 [[TMP16]], i32 0 +; CHECK-NEXT:[[TMP9:%.*]] = insertelement <2 x i64> [[TMP8]], i64 [[TMP18]], i32 1 +; CHECK-NEXT:[[TMP10:%.*]] = insertelement <2 x i64> , i64 [[TMP17]], i32 0 +; CHECK-NEXT:[[TMP11:%.*]] = add <2 x i64> [[TMP9]], [[TMP10]] +; CHECK-NEXT:[[TMP12:%.*]] = extractelement <2 x i64> [[TMP11]], i32 0 +; CHECK-NEXT:[[TMP13:%.*]] = extractelement <2 x i64> [[TMP11]], i32 1 +; CHECK-NEXT:[[TMP21:%.*]] = add i64 [[TMP12]], [[TMP13]] +; CHECK-NEXT:ret i64 [[TMP21]] +; +entry: + %0 = sext i32 %a to i64 + %1 = add nsw i64 %0, 4294967297 + %2 = sext i32 %a to i64 + %3 = add nsw i64 %2, 4294967297 + %4 = add i64 %3, %1 + %5 = and i64 %3, 1 + %6 = add i64 %4, %5 + %7 = sext i32 %a to i64 + %8 = add nsw i64 %7, 4294967297 + %9 = add i64 %8, %6 + %10 = and i64 %8, 1 + %11 = add i64 %9, %10 + %12 = sext i32 %a to i64 + %13 = add nsw i64 %12, 4294967297 + %14 = add i64 %13, %11 + %15 = and i64 %13, 1 + %16 = add i64 %14, %15 + %17 = sext i32 %a to i64 + %18 = add nsw i64 %17, 4294967297 + %19 = add i64 %18, %16 + %20 = and i64 %18, 1 + %21 = add i64 %19, %20 + ret i64 %21 +} ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
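The effect of demoting the add to 32 bits can be seen with plain integer arithmetic. The constant in the test, 4294967297, is 2^32 + 1, so it does not survive truncation to i32. The sketch below is a scalar model written for illustration, not the vectorizer's output; `termWide`, `termNarrow`, and `srcReference` are hypothetical names.

```cpp
#include <cstdint>

// 4294967297 == 2^32 + 1: the constant added in the test function.
constexpr int64_t K = 4294967297LL;

// What the scalar IR computes for one term: sext(a) + K, done in 64 bits.
int64_t termWide(int32_t a) { return static_cast<int64_t>(a) + K; }

// What a wrongly demoted 32-bit add computes: K is truncated to 1, the add
// wraps in 32 bits, and only then is the result sign-extended to 64 bits.
int64_t termNarrow(int32_t a) {
  uint32_t sum = static_cast<uint32_t>(a) + static_cast<uint32_t>(K);
  return static_cast<int64_t>(static_cast<int32_t>(sum));
}

// Scalar reference for the whole @src function as read from the IR above:
// five copies of the wide term plus four copies of its low bit.
int64_t srcReference(int32_t a) {
  int64_t x = termWide(a);
  return 5 * x + 4 * (x & 1);
}
```

For `a == 0` the wide term is 4294967297 while the narrow term is 1, so each demoted lane loses 2^32. That is consistent with the fixed checks, which sign-extend to `<4 x i64>` before the add.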
[llvm-branch-commits] [llvm] 373180b - [SLP]Fix PR104422: Wrong value truncation
Author: Alexey Bataev Date: 2024-09-13T07:58:38+02:00 New Revision: 373180b440d04dc3cc0f6111b06684d18779d7c8 URL: https://github.com/llvm/llvm-project/commit/373180b440d04dc3cc0f6111b06684d18779d7c8 DIFF: https://github.com/llvm/llvm-project/commit/373180b440d04dc3cc0f6111b06684d18779d7c8.diff LOG: [SLP]Fix PR104422: Wrong value truncation The minbitwidth restrictions can be skipped only for immediate reduced values, for other nodes still need to check if external users allow bitwidth reduction. Fixes https://github.com/llvm/llvm-project/issues/104422 (cherry picked from commit 56140a8258a3498cfcd9f0f05c182457d43cbfd2) Added: llvm/test/Transforms/SLPVectorizer/X86/operand-is-reduced-val.ll Modified: llvm/lib/Transforms/Vectorize/SLPVectorizer.cpp Removed: diff --git a/llvm/lib/Transforms/Vectorize/SLPVectorizer.cpp b/llvm/lib/Transforms/Vectorize/SLPVectorizer.cpp index 2f3d6b27378aee..ab2b96cdc42db8 100644 --- a/llvm/lib/Transforms/Vectorize/SLPVectorizer.cpp +++ b/llvm/lib/Transforms/Vectorize/SLPVectorizer.cpp @@ -15211,7 +15211,8 @@ bool BoUpSLP::collectValuesToDemote( if (any_of(E.Scalars, [&](Value *V) { return !all_of(V->users(), [=](User *U) { return getTreeEntry(U) || - (UserIgnoreList && UserIgnoreList->contains(U)) || + (E.Idx == 0 && UserIgnoreList && + UserIgnoreList->contains(U)) || (!isa(U) && U->getType()->isSized() && !U->getType()->isScalableTy() && DL->getTypeSizeInBits(U->getType()) <= BitWidth); diff --git a/llvm/test/Transforms/SLPVectorizer/X86/operand-is-reduced-val.ll b/llvm/test/Transforms/SLPVectorizer/X86/operand-is-reduced-val.ll new file mode 100644 index 00..5fcac3fbf3bafe --- /dev/null +++ b/llvm/test/Transforms/SLPVectorizer/X86/operand-is-reduced-val.ll @@ -0,0 +1,49 @@ +; NOTE: Assertions have been autogenerated by utils/update_test_checks.py UTC_ARGS: --version 5 +; RUN: opt -S --passes=slp-vectorizer -mtriple=x86_64-unknown-linux < %s -slp-threshold=-10 | FileCheck %s + +define i64 @src(i32 %a) { +; CHECK-LABEL: define i64 @src( 
+; CHECK-SAME: i32 [[A:%.*]]) { +; CHECK-NEXT: [[ENTRY:.*:]] +; CHECK-NEXT:[[TMP17:%.*]] = sext i32 [[A]] to i64 +; CHECK-NEXT:[[TMP1:%.*]] = insertelement <4 x i32> poison, i32 [[A]], i32 0 +; CHECK-NEXT:[[TMP2:%.*]] = shufflevector <4 x i32> [[TMP1]], <4 x i32> poison, <4 x i32> zeroinitializer +; CHECK-NEXT:[[TMP3:%.*]] = sext <4 x i32> [[TMP2]] to <4 x i64> +; CHECK-NEXT:[[TMP4:%.*]] = add nsw <4 x i64> [[TMP3]], +; CHECK-NEXT:[[TMP6:%.*]] = and <4 x i64> [[TMP4]], +; CHECK-NEXT:[[TMP18:%.*]] = call i64 @llvm.vector.reduce.add.v4i64(<4 x i64> [[TMP6]]) +; CHECK-NEXT:[[TMP16:%.*]] = call i64 @llvm.vector.reduce.add.v4i64(<4 x i64> [[TMP4]]) +; CHECK-NEXT:[[TMP8:%.*]] = insertelement <2 x i64> poison, i64 [[TMP16]], i32 0 +; CHECK-NEXT:[[TMP9:%.*]] = insertelement <2 x i64> [[TMP8]], i64 [[TMP18]], i32 1 +; CHECK-NEXT:[[TMP10:%.*]] = insertelement <2 x i64> , i64 [[TMP17]], i32 0 +; CHECK-NEXT:[[TMP11:%.*]] = add <2 x i64> [[TMP9]], [[TMP10]] +; CHECK-NEXT:[[TMP12:%.*]] = extractelement <2 x i64> [[TMP11]], i32 0 +; CHECK-NEXT:[[TMP13:%.*]] = extractelement <2 x i64> [[TMP11]], i32 1 +; CHECK-NEXT:[[TMP21:%.*]] = add i64 [[TMP12]], [[TMP13]] +; CHECK-NEXT:ret i64 [[TMP21]] +; +entry: + %0 = sext i32 %a to i64 + %1 = add nsw i64 %0, 4294967297 + %2 = sext i32 %a to i64 + %3 = add nsw i64 %2, 4294967297 + %4 = add i64 %3, %1 + %5 = and i64 %3, 1 + %6 = add i64 %4, %5 + %7 = sext i32 %a to i64 + %8 = add nsw i64 %7, 4294967297 + %9 = add i64 %8, %6 + %10 = and i64 %8, 1 + %11 = add i64 %9, %10 + %12 = sext i32 %a to i64 + %13 = add nsw i64 %12, 4294967297 + %14 = add i64 %13, %11 + %15 = and i64 %13, 1 + %16 = add i64 %14, %15 + %17 = sext i32 %a to i64 + %18 = add nsw i64 %17, 4294967297 + %19 = add i64 %18, %16 + %20 = and i64 %18, 1 + %21 = add i64 %19, %20 + ret i64 %21 +} ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
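The one-line change to `collectValuesToDemote` in the patch above can be read as a narrowing of the user predicate: membership in `UserIgnoreList` only excuses a user when the entry is the root (`E.Idx == 0`). Below is a minimal sketch of that predicate on toy data, not the LLVM implementation. `ToyUser`, `ToyEntry`, `usersAllowDemotion`, and the `RootOnly` flag are hypothetical names introduced for illustration. The sketch assumes that index 0 denotes the entry holding the immediate reduced values, and it omits the additional `isa`, `isSized`, and `isScalableTy` checks from the real code.

```cpp
#include <cassert>
#include <set>
#include <vector>

// Hypothetical stand-ins for illustration only; these are not the LLVM classes.
struct ToyUser {
  int Id;            // identity used for the ignore-list lookup
  unsigned TypeBits; // bit width of the user's result type
  bool InTree;       // true if the user is itself part of the vectorized tree
};

struct ToyEntry {
  unsigned Idx;               // 0 is assumed to denote the root (reduced values)
  std::vector<ToyUser> Users; // all users of the entry's scalars, flattened
};

// Returns true if every user tolerates narrowing the entry to BitWidth bits.
// RootOnly = false models the check before the patch, true the check after it.
bool usersAllowDemotion(const ToyEntry &E, const std::set<int> *UserIgnoreList,
                        unsigned BitWidth, bool RootOnly) {
  for (const ToyUser &U : E.Users) {
    bool Ignored = UserIgnoreList && UserIgnoreList->count(U.Id) != 0;
    if (RootOnly)
      Ignored = Ignored && E.Idx == 0; // the condition added by the patch
    if (!(U.InTree || Ignored || U.TypeBits <= BitWidth))
      return false;
  }
  return true;
}

// Usage sketch: a non-root entry whose only user is a 64-bit ignored operation.
inline void example() {
  std::set<int> Ignore{7};
  ToyEntry Operand{1, {{7, 64, false}}};
  assert(usersAllowDemotion(Operand, &Ignore, 32, false)); // old check accepts
  assert(!usersAllowDemotion(Operand, &Ignore, 32, true)); // new check rejects
}
```

In this model, the non-root entry is no longer demoted to 32 bits merely because its 64-bit user appears in the ignore list, which matches the commit message's statement that other nodes still need to check whether external users allow bitwidth reduction.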
[llvm-branch-commits] [llvm] release/19.x: [SLP]Fix PR104422: Wrong value truncation (PR #104747)
https://github.com/tru closed https://github.com/llvm/llvm-project/pull/104747
[llvm-branch-commits] [llvm] release/19.x: [SLP]Fix PR104422: Wrong value truncation (PR #104747)
github-actions[bot] wrote: @nikic (or anyone else): if you would like to add a note about this fix to the release notes (completely optional), please reply to this comment with a one- or two-sentence description of the fix. When you are done, please add the release:note label to this PR. https://github.com/llvm/llvm-project/pull/104747
[llvm-branch-commits] [llvm] release/19.x: [MIPS] Optimize sortRelocs for o32 (PR #106008)
https://github.com/tru closed https://github.com/llvm/llvm-project/pull/106008
[llvm-branch-commits] [llvm] release/19.x: [MIPS] Optimize sortRelocs for o32 (PR #106008)
tru wrote: This will have to wait for LLVM 20. I know it's not optimal for Zig, but getting it in this late is tricky, especially since it is ABI-breaking. https://github.com/llvm/llvm-project/pull/106008
[llvm-branch-commits] [clang] release/19.x: [Clang][Concepts] Fix the constraint equivalence checking involving parameter packs (#102131) (PR #106043)
tru wrote: Can this be reviewed, @cor3ntin @mizvekov? https://github.com/llvm/llvm-project/pull/106043
[llvm-branch-commits] [llvm] release/19.x: [AVR] Fix 16-bit LDDs with immediate overflows (#104923) (PR #106993)
tru wrote: Hi, since we are wrapping up LLVM 19.1.0, we are very strict about the fixes we pick at this point. Can you please respond to the following questions to help me understand whether this has to be included in the final release? Is this PR a fix for a regression or a critical issue? What is the risk of accepting this into the release branch? What is the risk of NOT accepting this into the release branch? https://github.com/llvm/llvm-project/pull/106993
[llvm-branch-commits] [llvm] release/19.x: [LoongArch] Eliminate the redundant sign extension of division (#107971) (PR #107990)
tru wrote: Just re-run the cherry-pick comment on the updated SHA. https://github.com/llvm/llvm-project/pull/107990