[clang-tools-extra] [llvm] [compiler-rt] [clang] [IRPGO][ValueProfile] Instrument virtual table address that could be used to do virtual table address comparision for indirect-call-promotion. (PR #668
@@ -1441,6 +1531,9 @@ void OverlapStats::dump(raw_fd_ostream &OS) const { case IPVK_MemOPSize: strncpy(ProfileKindName, "MemOP", 19); break; +case IPVK_VTableTarget: + strncpy(ProfileKindName, "VTable", 6); snehasish wrote: I think this usage of strncpy is incorrect. It will not copy the null terminator from the "VTable" const char * since the null terminator is the 7th char. Since ProfileKindName is uninitialized it could lead to issues when reading it. The existing code should initialize `ProfileKindName[20] = {0}` and then the value of 19 used above makes sense. It will attempt to copy 19 chars unless it encounters the null char in which case the rest will be padded. What do you think? https://github.com/llvm/llvm-project/pull/66825 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[compiler-rt] [llvm] [clang-tools-extra] [clang] [IRPGO][ValueProfile] Instrument virtual table address that could be used to do virtual table address comparision for indirect-call-promotion. (PR #668
@@ -1070,8 +1084,138 @@ void InstrProfiling::maybeSetComdat(GlobalVariable *GV, Function *Fn, GV->setLinkage(GlobalValue::InternalLinkage); } -GlobalVariable *InstrProfiling::setupProfileSection(InstrProfInstBase *Inc, -InstrProfSectKind IPSK) { +static inline bool shouldRecordVTableAddr(GlobalVariable *GV) { + if (!profDataReferencedByCode(*GV->getParent())) +return false; + + if (!GV->hasLinkOnceLinkage() && !GV->hasLocalLinkage() && + !GV->hasAvailableExternallyLinkage()) +return true; + + // This avoids the profile data from referencing internal symbols in + // COMDAT. + if (GV->hasLocalLinkage() && GV->hasComdat()) +return false; + + return true; +} + +// FIXME: Does symbollic relocation from 'getFuncAddrForProfData' matter here? +static inline Constant *getVTableAddrForProfData(GlobalVariable *GV) { + auto *Int8PtrTy = Type::getInt8PtrTy(GV->getContext()); + + // Store a nullptr in __profvt_ if a real address shouldn't be used. + if (!shouldRecordVTableAddr(GV)) +return ConstantPointerNull::get(Int8PtrTy); + + return ConstantExpr::getBitCast(GV, Int8PtrTy); +} + +/// Get the name of a profiling variable for a particular variable. +static std::string getVarName(GlobalVariable *GV, StringRef Prefix) { + StringRef Name = getPGOName(*GV); + return (Prefix + Name).str(); +} + +void InstrProfiling::getOrCreateVTableProfData(GlobalVariable *GV) { + assert(!DebugInfoCorrelate && + "Value profiling is not supported with lightweight instrumentation"); + if (GV->isDeclaration() || GV->hasAvailableExternallyLinkage()) +return; + + if (GV->getName().starts_with("llvm.") || + GV->getName().starts_with("__llvm") || + GV->getName().starts_with("__prof")) +return; + + // VTableProfData already created + auto It = VTableDataMap.find(GV); + if (It != VTableDataMap.end() && It->second) +return; + + GlobalValue::LinkageTypes Linkage = GV->getLinkage(); + GlobalValue::VisibilityTypes Visibility = GV->getVisibility(); + + // This is to keep consistent with per-function profile data + // for correctness. + if (TT.isOSBinFormatXCOFF()) { +Linkage = GlobalValue::InternalLinkage; +Visibility = GlobalValue::DefaultVisibility; + } + + LLVMContext &Ctx = M->getContext(); + Type *DataTypes[] = { +#define INSTR_PROF_VTABLE_DATA(Type, LLVMType, Name, Init) LLVMType, +#include "llvm/ProfileData/InstrProfData.inc" + }; + + auto *DataTy = StructType::get(Ctx, ArrayRef(DataTypes)); + + // Used by INSTR_PROF_VTABLE_DATA MACRO + Constant *VTableAddr = getVTableAddrForProfData(GV); + StringRef VTableName = GV->getName(); + StringRef PGOVTableName = getPGOName(*GV); + // Record the length of the vtable. This is needed since vtable pointers + // loaded from C++ objects might be from the middle of a vtable definition. + uint32_t VTableSizeVal = + M->getDataLayout().getTypeAllocSize(GV->getValueType()); + + Constant *DataVals[] = { +#define INSTR_PROF_VTABLE_DATA(Type, LLVMType, Name, Init) Init, +#include "llvm/ProfileData/InstrProfData.inc" + }; + + auto *Data = + new GlobalVariable(*M, DataTy, false /* constant */, Linkage, + ConstantStruct::get(DataTy, DataVals), + getVarName(GV, getInstrProfVTableVarPrefix())); + + Data->setVisibility(Visibility); + Data->setSection(getInstrProfSectionName(IPSK_vtab, TT.getObjectFormat())); + Data->setAlignment(Align(8)); + + const bool NeedComdat = needsComdatForCounter(*GV, *M); + + // GV is the data structure to record vtable information. + // Place the global variable for per-vtable profile data in a comdat group + // if the associated vtable definition is a COMDAT. This makes sure only one + // copy of the variable for the vtable will be emitted after linking. + auto MaybeSetComdat = [&](GlobalVariable *GV, StringRef GroupName) { +bool UseComdat = (NeedComdat || TT.isOSBinFormatELF()); +if (UseComdat) { + // Create a new comdat group using the name of the global variable as + // opposed to using the comdat group of the vtable. + Comdat *C = M->getOrInsertComdat(GroupName); + // For ELF, when not using COMDAT, put the vtable profile data into a + // nodeduplicate COMDAT which is lowered to a zero-flag zero group. + // This allows -z -start-top-gc to discard the entire group when the snehasish wrote: s/start-top-gc/start-stop-gc https://github.com/llvm/llvm-project/pull/66825 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang-tools-extra] [clang] [compiler-rt] [llvm] [IRPGO][ValueProfile] Instrument virtual table address that could be used to do virtual table address comparision for indirect-call-promotion. (PR #668
@@ -275,18 +282,28 @@ lprofWriteDataImpl(ProfDataWriter *Writer, const __llvm_profile_data *DataBegin, const uint64_t NumBitmapBytes = __llvm_profile_get_num_bitmap_bytes(BitmapBegin, BitmapEnd); const uint64_t NamesSize = __llvm_profile_get_name_size(NamesBegin, NamesEnd); + const uint64_t NumVTables = + __llvm_profile_get_num_vtable(VTableBegin, VTableEnd); + const uint64_t VTableSectionSize = + __llvm_profile_get_vtable_section_size(VTableBegin, VTableEnd); + // Value profiling is not supported when DebugInfoCorrelate is true. snehasish wrote: Is this comment still valid? Or in the right place? https://github.com/llvm/llvm-project/pull/66825 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [clang-tools-extra] [llvm] [compiler-rt] [IRPGO][ValueProfile] Instrument virtual table address that could be used to do virtual table address comparision for indirect-call-promotion. (PR #668
@@ -1070,8 +1084,138 @@ void InstrProfiling::maybeSetComdat(GlobalVariable *GV, Function *Fn, GV->setLinkage(GlobalValue::InternalLinkage); } -GlobalVariable *InstrProfiling::setupProfileSection(InstrProfInstBase *Inc, -InstrProfSectKind IPSK) { +static inline bool shouldRecordVTableAddr(GlobalVariable *GV) { + if (!profDataReferencedByCode(*GV->getParent())) +return false; + + if (!GV->hasLinkOnceLinkage() && !GV->hasLocalLinkage() && + !GV->hasAvailableExternallyLinkage()) +return true; + + // This avoids the profile data from referencing internal symbols in + // COMDAT. + if (GV->hasLocalLinkage() && GV->hasComdat()) +return false; + + return true; +} + +// FIXME: Does symbollic relocation from 'getFuncAddrForProfData' matter here? snehasish wrote: typo: symbolic? https://github.com/llvm/llvm-project/pull/66825 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[llvm] [clang] [clang-tools-extra] [compiler-rt] [IRPGO][ValueProfile] Instrument virtual table address that could be used to do virtual table address comparision for indirect-call-promotion. (PR #668
@@ -489,30 +520,66 @@ class InstrProfSymtab { /// \p IterRange. This interface is used by IndexedProfReader. template Error create(const NameIterRange &IterRange); - /// Update the symtab by adding \p FuncName to the table. This interface - /// is used by the raw and text profile readers. - Error addFuncName(StringRef FuncName) { -if (FuncName.empty()) + /// Create InstrProfSymtab from a set of function names and vtable + /// names iteratable from \p IterRange. This interface is used by + /// IndexedProfReader. + template + Error create(const FuncNameIterRange &FuncIterRange, + const VTableNameIterRange &VTableIterRange); + + Error addSymbolName(StringRef SymbolName) { +if (SymbolName.empty()) return make_error(instrprof_error::malformed, -"function name is empty"); -auto Ins = NameTab.insert(FuncName); +"symbol name is empty"); + +// Insert into NameTab so that MD5NameMap (a vector that will be sorted) +// won't have duplicated entries in the first place. +auto Ins = NameTab.insert(SymbolName); if (Ins.second) { MD5NameMap.push_back(std::make_pair( - IndexedInstrProf::ComputeHash(FuncName), Ins.first->getKey())); + IndexedInstrProf::ComputeHash(SymbolName), Ins.first->getKey())); Sorted = false; } return Error::success(); } + /// The method name is kept since there are many callers. + /// It just forwards to 'addSymbolName'. + Error addFuncName(StringRef FuncName) { return addSymbolName(FuncName); } + + /// Adds VTableName as a known symbol, and inserts it to a map that + /// tracks all vtable names. + Error addVTableName(StringRef VTableName) { +if (Error E = addSymbolName(VTableName)) + return E; + +// Record VTableName. InstrProfWriter uses this map. The comment around +// class member explains why. +VTableNames.insert(VTableName); +return Error::success(); + } + + const StringSet<> &getVTableNames() const { return VTableNames; } + /// Map a function address to its name's MD5 hash. This interface /// is only used by the raw profiler reader. void mapAddress(uint64_t Addr, uint64_t MD5Val) { AddrToMD5Map.push_back(std::make_pair(Addr, MD5Val)); } + /// Map the address range (i.e., [start_address, end_address]) of a variable to + /// its names' MD5 hash. This interface is only used by the raw profile header. snehasish wrote: Did you mean reader instead of header? I didn't follow the last sentence here. https://github.com/llvm/llvm-project/pull/66825 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[llvm] [clang] [compiler-rt] [clang-tools-extra] [IRPGO][ValueProfile] Instrument virtual table address that could be used to do virtual table address comparision for indirect-call-promotion. (PR #668
@@ -12,27 +12,78 @@ #ifndef LLVM_ANALYSIS_INDIRECTCALLVISITOR_H #define LLVM_ANALYSIS_INDIRECTCALLVISITOR_H +#include "llvm/ADT/SetVector.h" #include "llvm/IR/InstVisitor.h" #include namespace llvm { -// Visitor class that finds all indirect call. +// Visitor class that finds indirect calls or instructions that gives vtable +// value, depending on Type. struct PGOIndirectCallVisitor : public InstVisitor { + enum class InstructionType { +kIndirectCall = 0, +kVTableVal = 1, + }; std::vector IndirectCalls; - PGOIndirectCallVisitor() = default; + std::vector ProfiledAddresses; + PGOIndirectCallVisitor(InstructionType Type) : Type(Type) {} void visitCallBase(CallBase &Call) { -if (Call.isIndirectCall()) +if (Call.isIndirectCall()) { IndirectCalls.push_back(&Call); + + if (Type != InstructionType::kVTableVal) +return; + + LoadInst *LI = dyn_cast(Call.getCalledOperand()); + // The code pattern to look for + // + // %vtable = load ptr, ptr %b + // %vfn = getelementptr inbounds ptr, ptr %vtable, i64 1 + // %2 = load ptr, ptr %vfn + // %call = tail call i32 %2(ptr %b) + // + // %vtable is the vtable address value to profile, and + // %2 is the indirect call target address to profile. + if (LI != nullptr) { +Value *Ptr = LI->getPointerOperand(); +Value *VTablePtr = Ptr->stripInBoundsConstantOffsets(); +// This is a heuristic to find address feeding instructions. +// FIXME: Add support in the frontend so LLVM type intrinsics are +// emitted without LTO. This way, added intrinsics could filter +// non-vtable instructions and reduce instrumentation overhead. +// Since a non-vtable profiled address is not within the address +// range of vtable objects, it's stored as zero in indexed profiles. snehasish wrote: Can we drop the ones that don't belong to a vtable add range during merge instead of storing zero? https://github.com/llvm/llvm-project/pull/66825 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[compiler-rt] [clang-tools-extra] [llvm] [clang] [IRPGO][ValueProfile] Instrument virtual table address that could be used to do virtual table address comparision for indirect-call-promotion. (PR #668
@@ -123,19 +137,29 @@ static int needsCounterPadding(void) { COMPILER_RT_VISIBILITY void __llvm_profile_get_padding_sizes_for_counters( uint64_t DataSize, uint64_t CountersSize, uint64_t NumBitmapBytes, -uint64_t NamesSize, uint64_t *PaddingBytesBeforeCounters, -uint64_t *PaddingBytesAfterCounters, uint64_t *PaddingBytesAfterBitmapBytes, -uint64_t *PaddingBytesAfterNames) { +uint64_t NamesSize, uint64_t VTableSize, uint64_t VNameSize, +uint64_t *PaddingBytesBeforeCounters, uint64_t *PaddingBytesAfterCounters, +uint64_t *PaddingBytesAfterBitmapBytes, uint64_t *PaddingBytesAfterNames, +uint64_t *PaddingBytesAfterVTable, uint64_t *PaddingBytesAfterVName) { + // Counter padding is needed only if continuous mode is enabled. if (!needsCounterPadding()) { *PaddingBytesBeforeCounters = 0; *PaddingBytesAfterCounters = __llvm_profile_get_num_padding_bytes(CountersSize); *PaddingBytesAfterBitmapBytes = __llvm_profile_get_num_padding_bytes(NumBitmapBytes); *PaddingBytesAfterNames = __llvm_profile_get_num_padding_bytes(NamesSize); +if (PaddingBytesAfterVTable != NULL) + *PaddingBytesAfterVTable = + __llvm_profile_get_num_padding_bytes(VTableSize); +if (PaddingBytesAfterVName != NULL) + *PaddingBytesAfterVName = __llvm_profile_get_num_padding_bytes(VNameSize); return; } + // Value profiling not supported in continuous mode at profile-write time. + assert(VTableSize == 0 && VNameSize == 0 && snehasish wrote: I think this check will go away in a Release build. We need a different way to alert for the incompatibility here. Also remove the header added for this change. https://github.com/llvm/llvm-project/pull/66825 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [clang-tools-extra] [compiler-rt] [llvm] [IRPGO][ValueProfile] Instrument virtual table address that could be used to do virtual table address comparision for indirect-call-promotion. (PR #668
@@ -453,11 +471,94 @@ Error InstrProfSymtab::create(Module &M, bool InLTO) { if (Error E = addFuncWithName(F, getPGOFuncName(F, InLTO))) return E; } + + SmallVector Types; + for (GlobalVariable &G : M.globals()) { +if (!G.hasName()) + continue; +Types.clear(); +G.getMetadata(LLVMContext::MD_type, Types); +if (!Types.empty()) { + MD5VTableMap.emplace_back(G.getGUID(), &G); +} + } Sorted = false; finalizeSymtab(); return Error::success(); } +/// \c NameStrings is a string composed of one of more possibly encoded +/// sub-strings. The substrings are separated by 0 or more zero bytes. This +/// method decodes the string and calls `NameCallback` for each substring. +static Error +readAndDecodeStrings(StringRef NameStrings, snehasish wrote: Are there any changes here which need to be covered by updates to the unit tests in InstrProfTest.cpp? https://github.com/llvm/llvm-project/pull/66825 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [llvm] [MemProf] Expand optimization scope to internal linakge function (PR #73236)
snehasish wrote: @lifengxiang1025 thanks for flagging this issue. I think it's best to not rely on unique-internal-linkage-name here. Instead we should extend the logic in RawMemProfReader.cpp to include ";" if the function is internal linkage as expected by IRPGOFuncName. Can you try this approach? https://github.com/llvm/llvm-project/blob/c0fe0719df457b0992606b0f2a235719a5bbfd54/llvm/lib/ProfileData/RawMemProfReader.cpp#L509-L510 https://github.com/llvm/llvm-project/pull/73236 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[llvm] [clang] [MemProf] Expand optimization scope to internal linakge function (PR #73236)
snehasish wrote: > change the matcher (the latter may be the easiest). My response assumed that it was already using the IRPGOFuncName for matching. Yes, I think we should make the switch then. > It seems dwarf don't record linkage type. Yes, you're right. As an alternative can we use the symbol table and find `Bind = LOCAL` to add the prefix before hashing? https://github.com/llvm/llvm-project/pull/73236 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[compiler-rt] [clang] [clang-tools-extra] [PGO] Exposing PGO's Counter Reset and File Dumping APIs (PR #76471)
snehasish wrote: @teresajohnson I mentioned the same thing on [discourse](https://discourse.llvm.org/t/pgo-are-the-llvm-profile-functions-stable-c-apis-across-llvm-releases/75832/5) but it seems like linking on AIX does not support this model. https://github.com/llvm/llvm-project/pull/76471 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[compiler-rt] [clang] [clang-tools-extra] [PGO] Exposing PGO's Counter Reset and File Dumping APIs (PR #76471)
@@ -100,12 +103,6 @@ ValueProfNode *__llvm_profile_begin_vnodes(); ValueProfNode *__llvm_profile_end_vnodes(); uint32_t *__llvm_profile_begin_orderfile(); -/*! - * \brief Clear profile counters to zero. - * - */ -void __llvm_profile_reset_counters(void); snehasish wrote: I think removing these methods from the InstrProfiling.h header may break downstream users. Instead can you add a wrapper around these methods in the new header? These wrappers can then be guarded by __LLVM_INSTR_PROFILE_GENERATE . https://github.com/llvm/llvm-project/pull/76471 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang-tools-extra] [compiler-rt] [clang] [PGO] Exposing PGO's Counter Reset and File Dumping APIs (PR #76471)
@@ -0,0 +1,26 @@ +// RUN: %clang_profgen %s -S -emit-llvm -o - | FileCheck %s --check-prefix=PROFGEN +// RUN: %clang_profgen -o %t %s +// RUN: env LLVM_PROFILE_FILE=%t.profraw %run %t +// RUN: llvm-profdata merge -o %t.profdata %t.profraw +// RUN: %clang_profuse=%t.profdata %s -S -emit-llvm -o - | FileCheck %s --check-prefix=PROFUSE +#include "profile/instr_profiling.h" + +__attribute__((noinline)) int bar() { return 4; } + +int foo() { + __llvm_profile_reset_counters(); snehasish wrote: Can you also add a test with the weak linking approach to ensure that is unaffected? (Maybe there exists one already but I'm not sure). https://github.com/llvm/llvm-project/pull/76471 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang-tools-extra] [clang] [compiler-rt] [PGO] Exposing PGO's Counter Reset and File Dumping APIs (PR #76471)
@@ -44,6 +44,7 @@ endif(COMPILER_RT_BUILD_ORC) if (COMPILER_RT_BUILD_PROFILE) set(PROFILE_HEADERS profile/InstrProfData.inc +profile/instr_prof_interface.h snehasish wrote: Can you rename the header to follow the existing CamelCase convention? https://github.com/llvm/llvm-project/pull/76471 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang-tools-extra] [clang] [compiler-rt] [PGO] Exposing PGO's Counter Reset and File Dumping APIs (PR #76471)
@@ -1364,12 +1364,22 @@ static void InitializePredefinedMacros(const TargetInfo &TI, TI.getTargetDefines(LangOpts, Builder); } +static void InitializePGOProfileMacros(const CodeGenOptions &CodeGenOpts, snehasish wrote: There are IR PGO users (e.g. Rust) which don't use the clang frontend. Do we need this macro based checks? I would prefer to just limit this patch to exposing the necessary interfaces. End-users can manage their own macros if they want to exclude the profile APIs from the source. https://github.com/llvm/llvm-project/pull/76471 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang-tools-extra] [clang] [compiler-rt] [PGO] Exposing PGO's Counter Reset and File Dumping APIs (PR #76471)
@@ -100,12 +103,6 @@ ValueProfNode *__llvm_profile_begin_vnodes(); ValueProfNode *__llvm_profile_end_vnodes(); uint32_t *__llvm_profile_begin_orderfile(); -/*! - * \brief Clear profile counters to zero. - * - */ -void __llvm_profile_reset_counters(void); snehasish wrote: They are effectively removed if the frontend does not define __LLVM_INSTR_PROFILE_GENERATE (see my other comment). I would prefer the dependency to be in the other direction, include InstrProfiling.h in the interface `instr_prof_interface.h`. > Are there downstream use cases that pull out > compiler-rt/lib/profile/InstrProfiling.h and use it by itself? Yes, we have one internal user that I can find. https://github.com/llvm/llvm-project/pull/76471 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[compiler-rt] [clang] [clang-tools-extra] [PGO] Exposing PGO's Counter Reset and File Dumping APIs (PR #76471)
@@ -44,6 +44,7 @@ endif(COMPILER_RT_BUILD_ORC) if (COMPILER_RT_BUILD_PROFILE) set(PROFILE_HEADERS profile/InstrProfData.inc +profile/instr_prof_interface.h snehasish wrote: Good point, looks like we treat `interface` filenames with snake case as opposed to headers. Please ignore this comment. https://github.com/llvm/llvm-project/pull/76471 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [clang-tools-extra] [compiler-rt] [PGO] Exposing PGO's Counter Reset and File Dumping APIs (PR #76471)
@@ -100,12 +103,6 @@ ValueProfNode *__llvm_profile_begin_vnodes(); ValueProfNode *__llvm_profile_end_vnodes(); uint32_t *__llvm_profile_begin_orderfile(); -/*! - * \brief Clear profile counters to zero. - * - */ -void __llvm_profile_reset_counters(void); snehasish wrote: Ok, if we don't want to expose the internals then the existing approach seems like the simplest one. I think @davidxl's comment in the discourse thread is about ensuring no control flow changes in profile gen and use in hot code paths. This does not necessarily preclude additional wrappers. > Update: I think the explicit definition needs to be guarded by an #ifndef. Yes, I misread the original as `#ifdef` causing some confusion. This should also be `#undef` afterwards. I was thinking about churn for existing users of InstrProfiling.h but it should be a no-op. https://github.com/llvm/llvm-project/pull/76471 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [clang-tools-extra] [compiler-rt] [PGO] Exposing PGO's Counter Reset and File Dumping APIs (PR #76471)
https://github.com/snehasish edited https://github.com/llvm/llvm-project/pull/76471 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang-tools-extra] [compiler-rt] [clang] [PGO] Exposing PGO's Counter Reset and File Dumping APIs (PR #76471)
@@ -1364,12 +1364,22 @@ static void InitializePredefinedMacros(const TargetInfo &TI, TI.getTargetDefines(LangOpts, Builder); } +static void InitializePGOProfileMacros(const CodeGenOptions &CodeGenOpts, snehasish wrote: Yes, my previous comment about other frontends was based on my (incorrect) understanding header inclusion in InstrProfiling.h. Adding the macros in this PR sgtm. https://github.com/llvm/llvm-project/pull/76471 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [clang-tools-extra] [compiler-rt] [PGO] Exposing PGO's Counter Reset and File Dumping APIs (PR #76471)
https://github.com/snehasish edited https://github.com/llvm/llvm-project/pull/76471 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[compiler-rt] [clang-tools-extra] [clang] [PGO] Exposing PGO's Counter Reset and File Dumping APIs (PR #76471)
https://github.com/snehasish commented: Thanks for your patience with my comments! https://github.com/llvm/llvm-project/pull/76471 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang-tools-extra] [clang] [compiler-rt] [PGO] Exposing PGO's Counter Reset and File Dumping APIs (PR #76471)
@@ -0,0 +1,17 @@ +// RUN: %clang_pgogen -o %t %s +// RUN: not %t +// RUN: %clang -o %t %s +// RUN: %t + +__attribute__((weak)) void __llvm_profile_reset_counters(void); + +__attribute__((noinline)) int bar() { return 4; } +int foo() { + if (__llvm_profile_reset_counters) { +__llvm_profile_reset_counters(); +return 0; + } + return bar(); snehasish wrote: Do we need bar or can we just return 1 here (and remove the subtraction below)? Perhaps the no-inline just needs to be applied to foo then. https://github.com/llvm/llvm-project/pull/76471 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [clang-tools-extra] [compiler-rt] [PGO] Exposing PGO's Counter Reset and File Dumping APIs (PR #76471)
@@ -12,6 +12,19 @@ #include "InstrProfilingPort.h" #include +// Make sure __LLVM_INSTR_PROFILE_GENERATE is always defined before +// including instr_prof_interface.h so the interface functions are +// declared correctly for the runtime. Additionally, make sure +// that __LLVM_INSTR_PROFILE_GENERATE is undefined only when it is +// not explicitly defined somewhere else. +#ifndef __LLVM_INSTR_PROFILE_GENERATE +#define __LLVM_INSTR_PROFILE_GENERATE +#include "profile/instr_prof_interface.h" +#undef __LLVM_INSTR_PROFILE_GENERATE +#else snehasish wrote: Is this `else` for the case where compiler-rt itself is instrumented? I don't think we support this. https://github.com/llvm/llvm-project/pull/76471 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [clang-tools-extra] [compiler-rt] [PGO] Exposing PGO's Counter Reset and File Dumping APIs (PR #76471)
@@ -1364,12 +1364,22 @@ static void InitializePredefinedMacros(const TargetInfo &TI, TI.getTargetDefines(LangOpts, Builder); } +static void InitializePGOProfileMacros(const CodeGenOptions &CodeGenOpts, + MacroBuilder &Builder) { + if (CodeGenOpts.hasProfileInstr()) +Builder.defineMacro("__LLVM_INSTR_PROFILE_GENERATE"); snehasish wrote: We should document these macros in the clang documentation at https://clang.llvm.org/docs/UsersManual.html#profiling-with-instrumentation https://github.com/llvm/llvm-project/pull/76471 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang-tools-extra] [clang] [compiler-rt] [PGO] Exposing PGO's Counter Reset and File Dumping APIs (PR #76471)
https://github.com/snehasish edited https://github.com/llvm/llvm-project/pull/76471 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[compiler-rt] [clang] [clang-tools-extra] [PGO] Exposing PGO's Counter Reset and File Dumping APIs (PR #76471)
https://github.com/snehasish approved this pull request. lgtm but please wait for other reviewers to chime in. Thanks! https://github.com/llvm/llvm-project/pull/76471 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang-tools-extra] [clang] [compiler-rt] [PGO] Exposing PGO's Counter Reset and File Dumping APIs (PR #76471)
@@ -0,0 +1,38 @@ +// RUN: %clang_profgen %s --target=ppc64le-unknown-linux-gnu -S \ +// RUN:-emit-llvm -o - | FileCheck %s --check-prefix=PROFGEN +// RUN: %clang_profgen -o %t %s +// RUN: env LLVM_PROFILE_FILE=%t.profraw %run %t snehasish wrote: I think you need a `REQUIRES: ppc` for this test otherwise it will fail when this line is executed on non-PPC build bots? https://github.com/llvm/llvm-project/pull/76471 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[compiler-rt] [clang-tools-extra] [clang] [PGO] Exposing PGO's Counter Reset and File Dumping APIs (PR #76471)
@@ -0,0 +1,38 @@ +// RUN: %clang_profgen %s --target=ppc64le-unknown-linux-gnu -S \ +// RUN:-emit-llvm -o - | FileCheck %s --check-prefix=PROFGEN +// RUN: %clang_profgen -o %t %s +// RUN: env LLVM_PROFILE_FILE=%t.profraw %run %t snehasish wrote: Good point, can we drop the `--target=ppc64le-unknown-linux-gnu` flag? It shouldn't make a difference to what we want to test here. https://github.com/llvm/llvm-project/pull/76471 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [compiler-rt] [llvm] [InstrFDO][TypeProf] Implement binary instrumentation and profile read/write (PR #66825)
https://github.com/snehasish commented: Responses to a couple of other comments - > * For LLVM IR test coverage, store textual profiles in the repo, and run > `llvm-profdata merge` to convert it to indexed profiles. > * To have test coverage on raw profiles (generate raw profiles or convert it > into other formats), have a compiler-rt test. > > I wonder if that is preferred over the current status (with script and > `.profraw` in the repo). I think we should resort to scripts and profraw in LLVM if we don't have support for textual format. This is the case for memprof for example. If we support vtablenames in the text format, I would prefer the additional `llvm-profdata merge` based testing. > Considering a) and c), if 'InstrProfSymtab' has 'allocator' as a member, it > needs to implement move assignment operator. I think this is the way to go if you choose to adopt the IntervalMap suggestion. It seems cleanest if the allocator and the map are owned by the InstrProfSymtab class. https://github.com/llvm/llvm-project/pull/66825 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [compiler-rt] [llvm] [InstrFDO][TypeProf] Implement binary instrumentation and profile read/write (PR #66825)
https://github.com/snehasish edited https://github.com/llvm/llvm-project/pull/66825 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [compiler-rt] [llvm] [InstrFDO][TypeProf] Implement binary instrumentation and profile read/write (PR #66825)
@@ -560,6 +602,28 @@ Error InstrProfSymtab::addFuncWithName(Function &F, StringRef PGOFuncName) { return Error::success(); } +uint64_t InstrProfSymtab::getVTableHashFromAddress(uint64_t Address) { + finalizeSymtab(); snehasish wrote: Summarizing offline discussion - if we chose to refactor it makes more sense to add appropriate factory methods of cases `addElement` are called. This seems to be the intent of the original implementation with a few different `create` methods. Then we can make `addElement` private. This will enforce what we want through a better API rather than runtime checks. For this patch, we can add a comment and look into the refactor later. https://github.com/llvm/llvm-project/pull/66825 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [compiler-rt] [llvm] [InstrFDO][TypeProf] Implement binary instrumentation and profile read/write (PR #66825)
https://github.com/snehasish edited https://github.com/llvm/llvm-project/pull/66825 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [compiler-rt] [llvm] [InstrFDO][TypeProf] Implement binary instrumentation and profile read/write (PR #66825)
@@ -0,0 +1,145 @@ +// REQUIRES: lld-available + +// RUN: rm -rf %t && split-file %s %t && cd %t snehasish wrote: I think you can skip this RUN line and just use `%s` as the input file in the line pgogen invocation below. https://github.com/llvm/llvm-project/pull/66825 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [compiler-rt] [llvm] [InstrFDO][TypeProf] Implement binary instrumentation and profile read/write (PR #66825)
@@ -0,0 +1,145 @@ +// REQUIRES: lld-available + +// RUN: rm -rf %t && split-file %s %t && cd %t +// +// RUN: %clangxx_pgogen -fuse-ld=lld -O2 -g -fprofile-generate=. -mllvm -enable-vtable-value-profiling test.cpp -o test +// RUN: env LLVM_PROFILE_FILE=test.profraw ./test + +// Show vtable profiles from raw profile. +// RUN: llvm-profdata show --function=main --ic-targets -show-vtables test.profraw | FileCheck %s --check-prefixes=COMMON,RAW snehasish wrote: nit: the show-vtables option has a single hyphen here as opposed to double elsewhere. https://github.com/llvm/llvm-project/pull/66825 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [compiler-rt] [llvm] [InstrFDO][TypeProf] Implement binary instrumentation and profile read/write (PR #66825)
@@ -676,24 +722,66 @@ TEST_P(InstrProfReaderWriterTest, icall_data_read_write) { Expected R = Reader->getInstrProfRecord("caller", 0x1234); ASSERT_THAT_ERROR(R.takeError(), Succeeded()); + + // Test the number of instrumented indirect call sites and the number of + // profiled values at each site. ASSERT_EQ(4U, R->getNumValueSites(IPVK_IndirectCallTarget)); EXPECT_EQ(3U, R->getNumValueDataForSite(IPVK_IndirectCallTarget, 0)); EXPECT_EQ(0U, R->getNumValueDataForSite(IPVK_IndirectCallTarget, 1)); EXPECT_EQ(2U, R->getNumValueDataForSite(IPVK_IndirectCallTarget, 2)); - EXPECT_EQ(1U, R->getNumValueDataForSite(IPVK_IndirectCallTarget, 3)); + EXPECT_EQ(2U, R->getNumValueDataForSite(IPVK_IndirectCallTarget, 3)); + + // Test the number of instrumented vtable sites and the number of profiled + // values at each site. + ASSERT_EQ(2U, R->getNumValueSites(IPVK_VTableTarget)); + EXPECT_EQ(3U, R->getNumValueDataForSite(IPVK_VTableTarget, 0)); + EXPECT_EQ(2U, R->getNumValueDataForSite(IPVK_VTableTarget, 1)); + + // First indirect site. + { +uint64_t TotalC; +std::unique_ptr VD = +R->getValueForSite(IPVK_IndirectCallTarget, 0, &TotalC); + +EXPECT_EQ(3U * getProfWeight(), VD[0].Count); snehasish wrote: Gunit expectations should be written as `EXPECT_EQ(actual, expected)`. Looks like these are reversed? https://github.com/llvm/llvm-project/pull/66825 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [compiler-rt] [llvm] [InstrFDO][TypeProf] Implement binary instrumentation and profile read/write (PR #66825)
@@ -676,24 +722,66 @@ TEST_P(InstrProfReaderWriterTest, icall_data_read_write) { Expected R = Reader->getInstrProfRecord("caller", 0x1234); ASSERT_THAT_ERROR(R.takeError(), Succeeded()); + + // Test the number of instrumented indirect call sites and the number of + // profiled values at each site. ASSERT_EQ(4U, R->getNumValueSites(IPVK_IndirectCallTarget)); EXPECT_EQ(3U, R->getNumValueDataForSite(IPVK_IndirectCallTarget, 0)); EXPECT_EQ(0U, R->getNumValueDataForSite(IPVK_IndirectCallTarget, 1)); EXPECT_EQ(2U, R->getNumValueDataForSite(IPVK_IndirectCallTarget, 2)); - EXPECT_EQ(1U, R->getNumValueDataForSite(IPVK_IndirectCallTarget, 3)); + EXPECT_EQ(2U, R->getNumValueDataForSite(IPVK_IndirectCallTarget, 3)); + + // Test the number of instrumented vtable sites and the number of profiled + // values at each site. + ASSERT_EQ(2U, R->getNumValueSites(IPVK_VTableTarget)); + EXPECT_EQ(3U, R->getNumValueDataForSite(IPVK_VTableTarget, 0)); + EXPECT_EQ(2U, R->getNumValueDataForSite(IPVK_VTableTarget, 1)); + + // First indirect site. + { +uint64_t TotalC; +std::unique_ptr VD = +R->getValueForSite(IPVK_IndirectCallTarget, 0, &TotalC); + +EXPECT_EQ(3U * getProfWeight(), VD[0].Count); +EXPECT_EQ(2U * getProfWeight(), VD[1].Count); +EXPECT_EQ(1U * getProfWeight(), VD[2].Count); +EXPECT_EQ(6U * getProfWeight(), TotalC); + +EXPECT_EQ(StringRef((const char *)VD[0].Value, 7), StringRef("callee3")); snehasish wrote: If you use EXPECT_STREQ then you can drop all the conversions to StringRef. https://github.com/llvm/llvm-project/pull/66825 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [compiler-rt] [llvm] [InstrFDO][TypeProf] Implement binary instrumentation and profile read/write (PR #66825)
https://github.com/snehasish approved this pull request. lgtm https://github.com/llvm/llvm-project/pull/66825 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [compiler-rt] [llvm] [InstrFDO][TypeProf] Implement binary instrumentation and profile read/write (PR #66825)
@@ -1362,8 +1372,8 @@ remapSamples(const sampleprof::FunctionSamples &Samples, BodySample.second.getSamples()); for (const auto &Target : BodySample.second.getCallTargets()) { Result.addCalledTargetSamples(BodySample.first.LineOffset, -MaskedDiscriminator, -Remapper(Target.first), Target.second); +MaskedDiscriminator, Remapper(Target.first), snehasish wrote: nit: This formatting diff seems unrelated to the patch? https://github.com/llvm/llvm-project/pull/66825 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [compiler-rt] [llvm] [InstrFDO][TypeProf] Implement binary instrumentation and profile read/write (PR #66825)
@@ -0,0 +1,145 @@ +// REQUIRES: lld-available + +// RUN: rm -rf %t && split-file %s %t && cd %t +// +// RUN: %clangxx_pgogen -fuse-ld=lld -O2 -g -fprofile-generate=. -mllvm -enable-vtable-value-profiling test.cpp -o test +// RUN: env LLVM_PROFILE_FILE=test.profraw ./test + +// Show vtable profiles from raw profile. +// RUN: llvm-profdata show --function=main --ic-targets -show-vtables test.profraw | FileCheck %s --check-prefixes=COMMON,RAW + +// Generate indexed profile from raw profile and show the data. +// RUN: llvm-profdata merge test.profraw -o test.profdata +// RUN: llvm-profdata show --function=main --ic-targets --show-vtables test.profdata | FileCheck %s --check-prefixes=COMMON,INDEXED + +// Generate text profile from raw and indexed profiles respectively and show the data. +// RUN: llvm-profdata merge --text test.profraw -o raw.proftext +// RUN: llvm-profdata show --function=main --ic-targets --show-vtables --text raw.proftext | FileCheck %s --check-prefix=ICTEXT +// RUN: llvm-profdata merge --text test.profdata -o indexed.proftext +// RUN: llvm-profdata show --function=main --ic-targets --show-vtables --text indexed.proftext | FileCheck %s --check-prefix=ICTEXT + +// Generate indexed profile from text profiles and show the data +// RUN: llvm-profdata merge --binary raw.proftext -o text.profraw +// RUN: llvm-profdata show --function=main --ic-targets --show-vtables text.profraw | FileCheck %s --check-prefixes=COMMON,INDEXED +// RUN: llvm-profdata merge --binary indexed.proftext -o text.profdata +// RUN: llvm-profdata show --function=main --ic-targets --show-vtables text.profdata | FileCheck %s --check-prefixes=COMMON,INDEXED + +// COMMON: Counters: +// COMMON-NEXT: main: +// COMMON-NEXT: Hash: 0x0f9a16fe6d398548 +// COMMON-NEXT: Counters: 2 +// COMMON-NEXT: Indirect Call Site Count: 2 +// COMMON-NEXT: Number of instrumented vtables: 2 +// RAW: Indirect Target Results: +// RAW-NEXT: [ 0, _ZN8Derived15func1Eii,250 ] (25.00%) +// RAW-NEXT: [ 0, test.cpp;_ZN12_GLOBAL__N_18Derived25func1Eii, 750 ] (75.00%) +// RAW-NEXT: [ 1, _ZN8Derived15func2Eii,250 ] (25.00%) +// RAW-NEXT: [ 1, test.cpp;_ZN12_GLOBAL__N_18Derived25func2Eii, 750 ] (75.00%) +// RAW-NEXT: VTable Results: +// RAW-NEXT: [ 0, _ZTV8Derived1,250 ] (25.00%) +// RAW-NEXT: [ 0, test.cpp;_ZTVN12_GLOBAL__N_18Derived2E,750 ] (75.00%) +// RAW-NEXT: [ 1, _ZTV8Derived1,250 ] (25.00%) +// RAW-NEXT: [ 1, test.cpp;_ZTVN12_GLOBAL__N_18Derived2E,750 ] (75.00%) +// INDEXED: Indirect Target Results: +// INDEXED-NEXT: [ 0, test.cpp;_ZN12_GLOBAL__N_18Derived25func1Eii, 750 ] (75.00%) +// INDEXED-NEXT: [ 0, _ZN8Derived15func1Eii,250 ] (25.00%) +// INDEXED-NEXT: [ 1, test.cpp;_ZN12_GLOBAL__N_18Derived25func2Eii, 750 ] (75.00%) +// INDEXED-NEXT: [ 1, _ZN8Derived15func2Eii,250 ] (25.00%) +// INDEXED-NEXT: VTable Results: +// INDEXED-NEXT: [ 0, test.cpp;_ZTVN12_GLOBAL__N_18Derived2E, 750 ] (75.00%) +// INDEXED-NEXT: [ 0, _ZTV8Derived1,250 ] (25.00%) +// INDEXED-NEXT: [ 1, test.cpp;_ZTVN12_GLOBAL__N_18Derived2E, 750 ] (75.00%) +// INDEXED-NEXT: [ 1, _ZTV8Derived1,250 ] (25.00%) +// COMMON: Instrumentation level: IR entry_first = 0 +// COMMON-NEXT: Functions shown: 1 +// COMMON-NEXT: Total functions: 6 +// COMMON-NEXT: Maximum function count: 1000 +// COMMON-NEXT: Maximum internal block count: 250 +// COMMON-NEXT: Statistics for indirect call sites profile: +// COMMON-NEXT: Total number of sites: 2 +// COMMON-NEXT: Total number of sites with values: 2 +// COMMON-NEXT: Total number of profiled values: 4 +// COMMON-NEXT: Value sites histogram: +// COMMON-NEXT: NumTargets, SiteCount +// COMMON-NEXT: 2, 2 +// COMMON-NEXT: Statistics for vtable profile: +// COMMON-NEXT: Total number of sites: 2 +// COMMON-NEXT: Total number of sites with values: 2 +// COMMON-NEXT: Total number of profiled values: 4 +// COMMON-NEXT: Value sites histogram: +// COMMON-NEXT: NumTargets, SiteCount +// COMMON-NEXT: 2, 2 + +// ICTEXT: :ir +// ICTEXT: main +// ICTEXT: # Func Hash: +// ICTEXT: 1124236338992350536 +// ICTEXT: # Num Counters: +// ICTEXT: 2 +// ICTEXT: # Counter Values: +// ICTEXT: 1000 +// ICTEXT: 1 +// ICTEXT: # Num Value Kinds: +// ICTEXT: 2 +// ICTEXT: # ValueKind = IPVK_IndirectCallTarget: +// ICTEXT: 0 +// ICTEXT: # NumValueSites: +// ICTEXT: 2 +// ICTEXT: 2 +// ICTEXT: test.cpp;_ZN12_GLOBAL__N_18Derived25func1Eii:750 +// ICTEXT: _ZN8Derived15func1Eii:250 +// ICTEXT: 2 +// ICTEXT: test.cpp;_ZN12_GLOBAL__N_18Derived25func2Eii:750 +// ICTEXT: _ZN8Derived15func2Eii:250 +// ICTEXT: # ValueKind = IPVK_VTableTarget: +// ICTEXT: 2 +// ICTEXT: # NumValueSites: +// ICTEXT: 2 +// ICTEXT: 2 +// ICTEXT: test.cpp;_ZTVN12_GLOBAL__N_18Derived2E:750 +/
[clang] [compiler-rt] [llvm] [InstrFDO][TypeProf] Implement binary instrumentation and profile read/write (PR #66825)
@@ -0,0 +1,145 @@ +// REQUIRES: lld-available + +// RUN: rm -rf %t && split-file %s %t && cd %t +// +// RUN: %clangxx_pgogen -fuse-ld=lld -O2 -g -fprofile-generate=. -mllvm -enable-vtable-value-profiling test.cpp -o test +// RUN: env LLVM_PROFILE_FILE=test.profraw ./test snehasish wrote: [optional] If you prefix the test.profraw as `%t-test.profraw` then when the test fails and prints out the failed command line its easier to copy paste and run separately to debug. If you want to adopt this then all the generated files should be prefixed with `%t`. https://github.com/llvm/llvm-project/pull/66825 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [compiler-rt] [llvm] [InstrFDO][TypeProf] Implement binary instrumentation and profile read/write (PR #66825)
@@ -431,23 +441,34 @@ class InstrProfSymtab { using AddrHashMap = std::vector>; private: + using AddrIntervalMap = + IntervalMap>; StringRef Data; uint64_t Address = 0; - // Unique name strings. + // Unique name strings. Used to ensure entries in MD5NameMap (a vector that's + // going to be sorted) has unique MD5 keys in the first place. StringSet<> NameTab; + // Records the unique virtual table names. This is used by InstrProfWriter to + // write out an on-disk chained hash table of virtual table names. + // InstrProfWriter stores per function profile data (keyed by function names) + // so it doesn't use a StringSet for function names. + StringSet<> VTableNames; // A map from MD5 keys to function name strings. std::vector> MD5NameMap; snehasish wrote: Might be worth it to simplify this to use DenseMap too in a separate patch. https://github.com/llvm/llvm-project/pull/66825 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] Add option to generate additional debug info for expression dereferencing pointer to pointers. (PR #81545)
snehasish wrote: Any measurements of impact to compile time, memory etc on a reasonable size benchmark (e.g. clang)? https://github.com/llvm/llvm-project/pull/81545 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [compiler-rt] [llvm] [InstrFDO][TypeProf] Implement binary instrumentation and profile read/write (PR #66825)
snehasish wrote: > > Here are some things I noticed, haven't looked at the tests yet. > > @snehasish Thanks for the review! > > Talking about tests, I wonder if it looks better if I change the current LLVM > IR test under `llvm/test/tools/llvm-profdata/` into a compiler-rt test under > `compiler-rt/test/profile/Linux`, demonstrated in > [167d5ce](https://github.com/llvm/llvm-project/commit/167d5ce49e41ca098085cda315687ade194981b1) > My concern with this approach is that compiler-rt is treated as a different project and updating the code within LLVM makes it easy to miss running the test locally for the other project. I think such issues will be caught by the buildbot but having it flagged earlier is better for the developer. What do you think? https://github.com/llvm/llvm-project/pull/66825 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [compiler-rt] [llvm] [InstrFDO][TypeProf] Implement binary instrumentation and profile read/write (PR #66825)
@@ -1237,6 +1259,101 @@ void InstrLowerer::maybeSetComdat(GlobalVariable *GV, Function *Fn, GV->setLinkage(GlobalValue::InternalLinkage); } +static inline bool shouldRecordVTableAddr(GlobalVariable *GV) { + if (!profDataReferencedByCode(*GV->getParent())) +return false; + + if (!GV->hasLinkOnceLinkage() && !GV->hasLocalLinkage() && + !GV->hasAvailableExternallyLinkage()) +return true; + + // This avoids the profile data from referencing internal symbols in + // COMDAT. + if (GV->hasLocalLinkage() && GV->hasComdat()) +return false; + + return true; +} + +// FIXME: Introduce an internal alias like what's done for functions to reduce +// the number of relocation entries. +static inline Constant *getVTableAddrForProfData(GlobalVariable *GV) { + auto *Int8PtrTy = PointerType::getUnqual(GV->getContext()); + + // Store a nullptr in __profvt_ if a real address shouldn't be used. + if (!shouldRecordVTableAddr(GV)) +return ConstantPointerNull::get(Int8PtrTy); + + return ConstantExpr::getBitCast(GV, Int8PtrTy); +} + +void InstrLowerer::getOrCreateVTableProfData(GlobalVariable *GV) { + assert(!DebugInfoCorrelate && + "Value profiling is not supported with lightweight instrumentation"); + if (GV->isDeclaration() || GV->hasAvailableExternallyLinkage()) +return; + + if (GV->getName().starts_with("llvm.") || + GV->getName().starts_with("__llvm") || + GV->getName().starts_with("__prof")) +return; + + // VTableProfData already created + auto It = VTableDataMap.find(GV); + if (It != VTableDataMap.end() && It->second) snehasish wrote: Ok, sounds reasonable to me. https://github.com/llvm/llvm-project/pull/66825 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [compiler-rt] [llvm] [InstrFDO][TypeProf] Implement binary instrumentation and profile read/write (PR #66825)
@@ -560,6 +602,28 @@ Error InstrProfSymtab::addFuncWithName(Function &F, StringRef PGOFuncName) { return Error::success(); } +uint64_t InstrProfSymtab::getVTableHashFromAddress(uint64_t Address) { + finalizeSymtab(); snehasish wrote: Add a comment why its required and note the early return if its sorted? https://github.com/llvm/llvm-project/pull/66825 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [compiler-rt] [llvm] [InstrFDO][TypeProf] Implement binary instrumentation and profile read/write (PR #66825)
@@ -605,6 +703,19 @@ Function* InstrProfSymtab::getFunction(uint64_t FuncMD5Hash) { return nullptr; } +GlobalVariable * +InstrProfSymtab::getGlobalVariable(uint64_t GlobalVariableMD5Hash) { + finalizeSymtab(); snehasish wrote: Why do we need to finalizeSymtab when looking up a global var? Also just using a DenseMap for `MD5VTableMap` means this code can be simplified to a one liner probably. https://github.com/llvm/llvm-project/pull/66825 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [compiler-rt] [llvm] [InstrFDO][TypeProf] Implement binary instrumentation and profile read/write (PR #66825)
@@ -459,6 +472,16 @@ Error InstrProfSymtab::create(Module &M, bool InLTO) { if (Error E = addFuncWithName(F, getPGOFuncName(F, InLTO))) return E; } + + SmallVector Types; + for (GlobalVariable &G : M.globals()) { +if (!G.hasName()) + continue; +Types.clear(); +G.getMetadata(LLVMContext::MD_type, Types); +if (!Types.empty()) snehasish wrote: Can this be replaced with `if(G.hasMetadata(LLVMContext::MD_type))`? https://github.com/llvm/llvm-project/pull/66825 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [compiler-rt] [llvm] [InstrFDO][TypeProf] Implement binary instrumentation and profile read/write (PR #66825)
https://github.com/snehasish commented: Here are some things I noticed, haven't looked at the tests yet. https://github.com/llvm/llvm-project/pull/66825 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [compiler-rt] [llvm] [InstrFDO][TypeProf] Implement binary instrumentation and profile read/write (PR #66825)
https://github.com/snehasish edited https://github.com/llvm/llvm-project/pull/66825 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [compiler-rt] [llvm] [InstrFDO][TypeProf] Implement binary instrumentation and profile read/write (PR #66825)
@@ -16,23 +16,72 @@ #include namespace llvm { -// Visitor class that finds all indirect call. +// Visitor class that finds indirect calls or instructions that gives vtable +// value, depending on Type. struct PGOIndirectCallVisitor : public InstVisitor { + enum class InstructionType { +kIndirectCall = 0, +kVTableVal = 1, + }; std::vector IndirectCalls; - PGOIndirectCallVisitor() = default; + std::vector ProfiledAddresses; + PGOIndirectCallVisitor(InstructionType Type) : Type(Type) {} void visitCallBase(CallBase &Call) { -if (Call.isIndirectCall()) +if (Call.isIndirectCall()) { snehasish wrote: Can we flip this condition and return early to reduce the nesting? https://github.com/llvm/llvm-project/pull/66825 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [compiler-rt] [llvm] [InstrFDO][TypeProf] Implement binary instrumentation and profile read/write (PR #66825)
@@ -1237,6 +1259,101 @@ void InstrLowerer::maybeSetComdat(GlobalVariable *GV, Function *Fn, GV->setLinkage(GlobalValue::InternalLinkage); } +static inline bool shouldRecordVTableAddr(GlobalVariable *GV) { + if (!profDataReferencedByCode(*GV->getParent())) +return false; + + if (!GV->hasLinkOnceLinkage() && !GV->hasLocalLinkage() && + !GV->hasAvailableExternallyLinkage()) +return true; + + // This avoids the profile data from referencing internal symbols in + // COMDAT. + if (GV->hasLocalLinkage() && GV->hasComdat()) +return false; + + return true; +} + +// FIXME: Introduce an internal alias like what's done for functions to reduce +// the number of relocation entries. +static inline Constant *getVTableAddrForProfData(GlobalVariable *GV) { + auto *Int8PtrTy = PointerType::getUnqual(GV->getContext()); + + // Store a nullptr in __profvt_ if a real address shouldn't be used. + if (!shouldRecordVTableAddr(GV)) +return ConstantPointerNull::get(Int8PtrTy); + + return ConstantExpr::getBitCast(GV, Int8PtrTy); +} + +void InstrLowerer::getOrCreateVTableProfData(GlobalVariable *GV) { + assert(!DebugInfoCorrelate && + "Value profiling is not supported with lightweight instrumentation"); + if (GV->isDeclaration() || GV->hasAvailableExternallyLinkage()) +return; + + if (GV->getName().starts_with("llvm.") || + GV->getName().starts_with("__llvm") || + GV->getName().starts_with("__prof")) +return; + + // VTableProfData already created + auto It = VTableDataMap.find(GV); + if (It != VTableDataMap.end() && It->second) +return; + + GlobalValue::LinkageTypes Linkage = GV->getLinkage(); + GlobalValue::VisibilityTypes Visibility = GV->getVisibility(); + + // This is to keep consistent with per-function profile data + // for correctness. + if (TT.isOSBinFormatXCOFF()) { +Linkage = GlobalValue::InternalLinkage; +Visibility = GlobalValue::DefaultVisibility; + } + + LLVMContext &Ctx = M.getContext(); + Type *DataTypes[] = { +#define INSTR_PROF_VTABLE_DATA(Type, LLVMType, Name, Init) LLVMType, +#include "llvm/ProfileData/InstrProfData.inc" + }; snehasish wrote: nit: `#undef` afterwards, same for the usage below. https://github.com/llvm/llvm-project/pull/66825 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [compiler-rt] [llvm] [InstrFDO][TypeProf] Implement binary instrumentation and profile read/write (PR #66825)
@@ -1237,6 +1259,101 @@ void InstrLowerer::maybeSetComdat(GlobalVariable *GV, Function *Fn, GV->setLinkage(GlobalValue::InternalLinkage); } +static inline bool shouldRecordVTableAddr(GlobalVariable *GV) { + if (!profDataReferencedByCode(*GV->getParent())) +return false; + + if (!GV->hasLinkOnceLinkage() && !GV->hasLocalLinkage() && + !GV->hasAvailableExternallyLinkage()) +return true; + + // This avoids the profile data from referencing internal symbols in + // COMDAT. + if (GV->hasLocalLinkage() && GV->hasComdat()) +return false; + + return true; +} + +// FIXME: Introduce an internal alias like what's done for functions to reduce +// the number of relocation entries. +static inline Constant *getVTableAddrForProfData(GlobalVariable *GV) { + auto *Int8PtrTy = PointerType::getUnqual(GV->getContext()); + + // Store a nullptr in __profvt_ if a real address shouldn't be used. + if (!shouldRecordVTableAddr(GV)) +return ConstantPointerNull::get(Int8PtrTy); + + return ConstantExpr::getBitCast(GV, Int8PtrTy); +} + +void InstrLowerer::getOrCreateVTableProfData(GlobalVariable *GV) { + assert(!DebugInfoCorrelate && + "Value profiling is not supported with lightweight instrumentation"); + if (GV->isDeclaration() || GV->hasAvailableExternallyLinkage()) +return; + + if (GV->getName().starts_with("llvm.") || + GV->getName().starts_with("__llvm") || + GV->getName().starts_with("__prof")) +return; + + // VTableProfData already created + auto It = VTableDataMap.find(GV); + if (It != VTableDataMap.end() && It->second) +return; + + GlobalValue::LinkageTypes Linkage = GV->getLinkage(); + GlobalValue::VisibilityTypes Visibility = GV->getVisibility(); + + // This is to keep consistent with per-function profile data + // for correctness. + if (TT.isOSBinFormatXCOFF()) { +Linkage = GlobalValue::InternalLinkage; +Visibility = GlobalValue::DefaultVisibility; + } + + LLVMContext &Ctx = M.getContext(); + Type *DataTypes[] = { +#define INSTR_PROF_VTABLE_DATA(Type, LLVMType, Name, Init) LLVMType, +#include "llvm/ProfileData/InstrProfData.inc" + }; + + auto *DataTy = StructType::get(Ctx, ArrayRef(DataTypes)); + + // Used by INSTR_PROF_VTABLE_DATA MACRO + Constant *VTableAddr = getVTableAddrForProfData(GV); + const std::string PGOVTableName = getPGOName(*GV); + // Record the length of the vtable. This is needed since vtable pointers + // loaded from C++ objects might be from the middle of a vtable definition. + uint32_t VTableSizeVal = + M.getDataLayout().getTypeAllocSize(GV->getValueType()); + + Constant *DataVals[] = { +#define INSTR_PROF_VTABLE_DATA(Type, LLVMType, Name, Init) Init, +#include "llvm/ProfileData/InstrProfData.inc" + }; + + std::string VarName = getInstrProfVTableVarPrefix().str() + PGOVTableName; + auto *Data = + new GlobalVariable(M, DataTy, false /* constant */, Linkage, snehasish wrote: nit: `/*constant=*/` https://github.com/llvm/llvm-project/pull/66825 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [compiler-rt] [llvm] [InstrFDO][TypeProf] Implement binary instrumentation and profile read/write (PR #66825)
@@ -1237,6 +1259,101 @@ void InstrLowerer::maybeSetComdat(GlobalVariable *GV, Function *Fn, GV->setLinkage(GlobalValue::InternalLinkage); } +static inline bool shouldRecordVTableAddr(GlobalVariable *GV) { + if (!profDataReferencedByCode(*GV->getParent())) +return false; + + if (!GV->hasLinkOnceLinkage() && !GV->hasLocalLinkage() && + !GV->hasAvailableExternallyLinkage()) +return true; + + // This avoids the profile data from referencing internal symbols in + // COMDAT. + if (GV->hasLocalLinkage() && GV->hasComdat()) +return false; + + return true; +} + +// FIXME: Introduce an internal alias like what's done for functions to reduce +// the number of relocation entries. +static inline Constant *getVTableAddrForProfData(GlobalVariable *GV) { + auto *Int8PtrTy = PointerType::getUnqual(GV->getContext()); + + // Store a nullptr in __profvt_ if a real address shouldn't be used. + if (!shouldRecordVTableAddr(GV)) +return ConstantPointerNull::get(Int8PtrTy); + + return ConstantExpr::getBitCast(GV, Int8PtrTy); +} + +void InstrLowerer::getOrCreateVTableProfData(GlobalVariable *GV) { + assert(!DebugInfoCorrelate && + "Value profiling is not supported with lightweight instrumentation"); + if (GV->isDeclaration() || GV->hasAvailableExternallyLinkage()) +return; + + if (GV->getName().starts_with("llvm.") || + GV->getName().starts_with("__llvm") || + GV->getName().starts_with("__prof")) +return; + + // VTableProfData already created + auto It = VTableDataMap.find(GV); + if (It != VTableDataMap.end() && It->second) +return; + + GlobalValue::LinkageTypes Linkage = GV->getLinkage(); + GlobalValue::VisibilityTypes Visibility = GV->getVisibility(); + + // This is to keep consistent with per-function profile data + // for correctness. + if (TT.isOSBinFormatXCOFF()) { +Linkage = GlobalValue::InternalLinkage; +Visibility = GlobalValue::DefaultVisibility; + } + + LLVMContext &Ctx = M.getContext(); + Type *DataTypes[] = { +#define INSTR_PROF_VTABLE_DATA(Type, LLVMType, Name, Init) LLVMType, +#include "llvm/ProfileData/InstrProfData.inc" + }; + + auto *DataTy = StructType::get(Ctx, ArrayRef(DataTypes)); + + // Used by INSTR_PROF_VTABLE_DATA MACRO + Constant *VTableAddr = getVTableAddrForProfData(GV); + const std::string PGOVTableName = getPGOName(*GV); + // Record the length of the vtable. This is needed since vtable pointers + // loaded from C++ objects might be from the middle of a vtable definition. + uint32_t VTableSizeVal = + M.getDataLayout().getTypeAllocSize(GV->getValueType()); + + Constant *DataVals[] = { +#define INSTR_PROF_VTABLE_DATA(Type, LLVMType, Name, Init) Init, +#include "llvm/ProfileData/InstrProfData.inc" + }; + + std::string VarName = getInstrProfVTableVarPrefix().str() + PGOVTableName; snehasish wrote: The GlobalVariable constructor accepts a Twine so perhaps use that directly in the args? https://github.com/llvm/llvm-project/pull/66825 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [compiler-rt] [llvm] [InstrFDO][TypeProf] Implement binary instrumentation and profile read/write (PR #66825)
@@ -1237,6 +1259,101 @@ void InstrLowerer::maybeSetComdat(GlobalVariable *GV, Function *Fn, GV->setLinkage(GlobalValue::InternalLinkage); } +static inline bool shouldRecordVTableAddr(GlobalVariable *GV) { + if (!profDataReferencedByCode(*GV->getParent())) +return false; + + if (!GV->hasLinkOnceLinkage() && !GV->hasLocalLinkage() && + !GV->hasAvailableExternallyLinkage()) +return true; + + // This avoids the profile data from referencing internal symbols in + // COMDAT. + if (GV->hasLocalLinkage() && GV->hasComdat()) +return false; + + return true; +} + +// FIXME: Introduce an internal alias like what's done for functions to reduce +// the number of relocation entries. +static inline Constant *getVTableAddrForProfData(GlobalVariable *GV) { + auto *Int8PtrTy = PointerType::getUnqual(GV->getContext()); + + // Store a nullptr in __profvt_ if a real address shouldn't be used. + if (!shouldRecordVTableAddr(GV)) +return ConstantPointerNull::get(Int8PtrTy); + + return ConstantExpr::getBitCast(GV, Int8PtrTy); +} + +void InstrLowerer::getOrCreateVTableProfData(GlobalVariable *GV) { + assert(!DebugInfoCorrelate && + "Value profiling is not supported with lightweight instrumentation"); + if (GV->isDeclaration() || GV->hasAvailableExternallyLinkage()) +return; + + if (GV->getName().starts_with("llvm.") || snehasish wrote: Add a comment to explain this if condition. https://github.com/llvm/llvm-project/pull/66825 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [compiler-rt] [llvm] [InstrFDO][TypeProf] Implement binary instrumentation and profile read/write (PR #66825)
@@ -429,20 +439,36 @@ uint64_t ComputeHash(StringRef K); class InstrProfSymtab { public: using AddrHashMap = std::vector>; + using RangeHashMap = + std::vector, uint64_t>>; snehasish wrote: Can you change the element type to a structure with 3 elements? E.g. `struct { uint64_t Start, End, Md5Value }`; https://github.com/llvm/llvm-project/pull/66825 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [compiler-rt] [llvm] [InstrFDO][TypeProf] Implement binary instrumentation and profile read/write (PR #66825)
@@ -378,6 +384,13 @@ std::string getPGOFuncName(const Function &F, bool InLTO, uint64_t Version) { return getPGOFuncName(F.getName(), GlobalValue::ExternalLinkage, ""); } +std::string getPGOName(const GlobalVariable &V, bool InLTO) { + // PGONameMetadata should be set by compiler at profile use time + // and read by symtab creation to look up symbols corresponding to + // a MD5 hash. + return getIRPGOObjectName(V, InLTO, nullptr /* PGONameMetadata */); snehasish wrote: nit: /*PGONameMetaData=*/ https://github.com/llvm/llvm-project/pull/66825 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [compiler-rt] [llvm] [InstrFDO][TypeProf] Implement binary instrumentation and profile read/write (PR #66825)
@@ -496,32 +532,70 @@ class InstrProfSymtab { /// Create InstrProfSymtab from a set of names iteratable from /// \p IterRange. This interface is used by IndexedProfReader. - template Error create(const NameIterRange &IterRange); - - /// Update the symtab by adding \p FuncName to the table. This interface - /// is used by the raw and text profile readers. - Error addFuncName(StringRef FuncName) { -if (FuncName.empty()) + template + Error create(const NameIterRange &IterRange); + + /// Create InstrProfSymtab from a set of function names and vtable + /// names iteratable from \p IterRange. This interface is used by + /// IndexedProfReader. + template + Error create(const FuncNameIterRange &FuncIterRange, + const VTableNameIterRange &VTableIterRange); + + Error addSymbolName(StringRef SymbolName) { +if (SymbolName.empty()) return make_error(instrprof_error::malformed, -"function name is empty"); -auto Ins = NameTab.insert(FuncName); +"symbol name is empty"); + +// Insert into NameTab so that MD5NameMap (a vector that will be sorted) +// won't have duplicated entries in the first place. +auto Ins = NameTab.insert(SymbolName); if (Ins.second) { MD5NameMap.push_back(std::make_pair( - IndexedInstrProf::ComputeHash(FuncName), Ins.first->getKey())); + IndexedInstrProf::ComputeHash(SymbolName), Ins.first->getKey())); Sorted = false; } return Error::success(); } + /// The method name is kept since there are many callers. + /// It just forwards to 'addSymbolName'. + Error addFuncName(StringRef FuncName) { return addSymbolName(FuncName); } + + /// Adds VTableName as a known symbol, and inserts it to a map that + /// tracks all vtable names. + Error addVTableName(StringRef VTableName) { +if (Error E = addSymbolName(VTableName)) + return E; + +// Record VTableName. InstrProfWriter uses this map. The comment around snehasish wrote: nit: "uses this set". https://github.com/llvm/llvm-project/pull/66825 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [compiler-rt] [llvm] [InstrFDO][TypeProf] Implement binary instrumentation and profile read/write (PR #66825)
@@ -327,6 +327,11 @@ extern cl::opt PGOViewCounts; // Defined in Analysis/BlockFrequencyInfo.cpp: -view-bfi-func-name= extern cl::opt ViewBlockFreqFuncName; +extern cl::opt DebugInfoCorrelate; snehasish wrote: Add a comment for this like the ones above. https://github.com/llvm/llvm-project/pull/66825 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [compiler-rt] [llvm] [InstrFDO][TypeProf] Implement binary instrumentation and profile read/write (PR #66825)
@@ -567,6 +643,21 @@ Error InstrProfSymtab::create(const NameIterRange &IterRange) { return Error::success(); } +template +Error InstrProfSymtab::create(const FuncNameIterRange &FuncIterRange, + const VTableNameIterRange &VTableIterRange) { + for (auto Name : FuncIterRange) snehasish wrote: const auto& to avoid a copy. Same for the loop below. https://github.com/llvm/llvm-project/pull/66825 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [compiler-rt] [llvm] [InstrFDO][TypeProf] Implement binary instrumentation and profile read/write (PR #66825)
@@ -490,6 +591,23 @@ Error InstrProfSymtab::addFuncWithName(Function &F, StringRef PGOFuncName) { return Error::success(); } +uint64_t InstrProfSymtab::getVTableHashFromAddress(uint64_t Address) { + finalizeSymtab(); + auto It = lower_bound( + VTableAddrRangeToMD5Map, Address, snehasish wrote: Perhaps consider using llvm::IntervalMap instead. https://github.com/llvm/llvm-project/pull/66825 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [compiler-rt] [llvm] [InstrFDO][TypeProf] Implement binary instrumentation and profile read/write (PR #66825)
@@ -429,20 +439,36 @@ uint64_t ComputeHash(StringRef K); class InstrProfSymtab { public: using AddrHashMap = std::vector>; + using RangeHashMap = snehasish wrote: Hmm, it's a little strange to have HashMap in the name and the underlying data structure is a vector. How about something like `VTableAddrRanges`? https://github.com/llvm/llvm-project/pull/66825 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [compiler-rt] [llvm] [InstrFDO][TypeProf] Implement binary instrumentation and profile read/write (PR #66825)
@@ -560,6 +602,28 @@ Error InstrProfSymtab::addFuncWithName(Function &F, StringRef PGOFuncName) { return Error::success(); } +uint64_t InstrProfSymtab::getVTableHashFromAddress(uint64_t Address) { + finalizeSymtab(); snehasish wrote: Like the lookup for GlobalVariable, why do we need to call finalizeSymtab here? https://github.com/llvm/llvm-project/pull/66825 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [compiler-rt] [llvm] [InstrFDO][TypeProf] Implement binary instrumentation and profile read/write (PR #66825)
@@ -429,20 +439,36 @@ uint64_t ComputeHash(StringRef K); class InstrProfSymtab { public: using AddrHashMap = std::vector>; + using RangeHashMap = + std::vector, uint64_t>>; private: StringRef Data; uint64_t Address = 0; - // Unique name strings. + // Unique name strings. Used to ensure entries in MD5NameMap (a vector that's + // going to be sorted) has unique MD5 keys in the first place. StringSet<> NameTab; + // Records the unique virtual table names. This is used by InstrProfWriter to + // write out an on-disk chained hash table of virtual table names. + // InstrProfWriter stores per function profile data (keyed by function names) + // so it doesn't use a StringSet for function names. + StringSet<> VTableNames; // A map from MD5 keys to function name strings. std::vector> MD5NameMap; + // A map from MD5 keys to virtual table definitions. Only populated when + // building the Symtab from a module. + std::vector> MD5VTableMap; snehasish wrote: Having `Map` in the name is confusing when the data structure is a vector. In this case I think using a DenseMap is better since the keys will be random and spread apart. So it should have better performance than log(N) operations necessary for lookup in a sorted vector. https://github.com/llvm/llvm-project/pull/66825 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] Support MemProf on darwin (PR #69640)
https://github.com/snehasish requested changes to this pull request. It would be nice to split this patch into at least two parts: 1. Add basic support for Darwin (malloc implementation, build ids, raw format etc) 2. Record and dump references to static data in the binary I think there are a number of places where we assume x86-ELF and I would like to make sure we address any issues before adding more functionality. https://github.com/llvm/llvm-project/pull/69640 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] f1a3ab9 - [clang] Add a command line flag for the Machine Function Splitter.
Author: Snehasish Kumar Date: 2020-09-15T12:41:58-07:00 New Revision: f1a3ab904439a63b21ba1c4521765c46630687c6 URL: https://github.com/llvm/llvm-project/commit/f1a3ab904439a63b21ba1c4521765c46630687c6 DIFF: https://github.com/llvm/llvm-project/commit/f1a3ab904439a63b21ba1c4521765c46630687c6.diff LOG: [clang] Add a command line flag for the Machine Function Splitter. This patch adds a command line flag for the machine function splitter (added in rG94faadaca4e1). -fsplit-machine-functions Split machine functions using profile information (x86 ELF). On other targets an error is emitted. If profile information is not provided a warning is emitted notifying the user that profile information is required. Differential Revision: https://reviews.llvm.org/D87047 Added: clang/test/Driver/fsplit-machine-functions.c Modified: clang/include/clang/Basic/CodeGenOptions.def clang/include/clang/Driver/Options.td clang/lib/CodeGen/BackendUtil.cpp clang/lib/Driver/ToolChains/Clang.cpp clang/lib/Frontend/CompilerInvocation.cpp Removed: diff --git a/clang/include/clang/Basic/CodeGenOptions.def b/clang/include/clang/Basic/CodeGenOptions.def index feb4ed01f6e8..b5da2a9cde1a 100644 --- a/clang/include/clang/Basic/CodeGenOptions.def +++ b/clang/include/clang/Basic/CodeGenOptions.def @@ -162,6 +162,7 @@ CODEGENOPT(NoImplicitFloat , 1, 0) ///< Set when -mno-implicit-float is enable CODEGENOPT(NullPointerIsValid , 1, 0) ///< Assume Null pointer deference is defined. CODEGENOPT(CorrectlyRoundedDivSqrt, 1, 0) ///< -cl-fp32-correctly-rounded-divide-sqrt CODEGENOPT(UniqueInternalLinkageNames, 1, 0) ///< Internal Linkage symbols get unique names. +CODEGENOPT(SplitMachineFunctions, 1, 0) ///< Split machine functions using profile information. /// When false, this attempts to generate code as if the result of an /// overflowing conversion matches the overflowing behavior of a target's native diff --git a/clang/include/clang/Driver/Options.td b/clang/include/clang/Driver/Options.td index f196c1b72d27..5b39ea513b24 100644 --- a/clang/include/clang/Driver/Options.td +++ b/clang/include/clang/Driver/Options.td @@ -1996,6 +1996,9 @@ defm unique_internal_linkage_names : OptInFFlag<"unique-internal-linkage-names", defm unique_section_names : OptOutFFlag<"unique-section-names", "", "Don't use unique names for text and data sections">; +defm split_machine_functions: OptInFFlag<"split-machine-functions", + "Enable", "Disable", " late function splitting using profile information (x86 ELF)">; + defm strict_return : OptOutFFlag<"strict-return", "", "Don't treat control flow paths that fall off the end of a non-void function as unreachable">; diff --git a/clang/lib/CodeGen/BackendUtil.cpp b/clang/lib/CodeGen/BackendUtil.cpp index 472d86ea2e36..5fc80d4fae71 100644 --- a/clang/lib/CodeGen/BackendUtil.cpp +++ b/clang/lib/CodeGen/BackendUtil.cpp @@ -514,6 +514,7 @@ static void initTargetOptions(DiagnosticsEngine &Diags, Options.BBSectionsFuncListBuf = std::move(*MBOrErr); } + Options.EnableMachineFunctionSplitter = CodeGenOpts.SplitMachineFunctions; Options.FunctionSections = CodeGenOpts.FunctionSections; Options.DataSections = CodeGenOpts.DataSections; Options.UniqueSectionNames = CodeGenOpts.UniqueSectionNames; diff --git a/clang/lib/Driver/ToolChains/Clang.cpp b/clang/lib/Driver/ToolChains/Clang.cpp index 40659ebb1395..51056960761d 100644 --- a/clang/lib/Driver/ToolChains/Clang.cpp +++ b/clang/lib/Driver/ToolChains/Clang.cpp @@ -4911,6 +4911,26 @@ void Clang::ConstructJob(Compilation &C, const JobAction &JA, options::OPT_fno_unique_basic_block_section_names, false)) CmdArgs.push_back("-funique-basic-block-section-names"); + if (Arg *A = Args.getLastArg(options::OPT_fsplit_machine_functions, + options::OPT_fno_split_machine_functions)) { +// This codegen pass is only available on x86-elf targets. +if (Triple.isX86() && Triple.isOSBinFormatELF()) { + if (A->getOption().matches(options::OPT_fsplit_machine_functions)) { +// If the flag is enabled but no profile information is available then +// emit a warning. +if (getLastProfileUseArg(Args) || getLastProfileSampleUseArg(Args)) { + A->render(Args, CmdArgs); +} else { + D.Diag(diag::warn_drv_diagnostics_hotness_requires_pgo) + << A->getAsString(Args); +} + } +} else { + D.Diag(diag::err_drv_unsupported_opt_for_target) + << A->getAsString(Args) << TripleStr; +} + } + Args.AddLastArg(CmdArgs, options::OPT_finstrument_functions, options::OPT_finstrument_functions_after_inlining, options::OPT_finstrument_function_entry_bare); diff --git a/clang/lib/Frontend/CompilerInvocation.cpp b/clang/lib/Frontend/CompilerInvocatio
[clang] b86f1af - [clang] Remove profile available check for fsplit-machine-functions.
Author: Snehasish Kumar Date: 2020-09-18T15:08:00-07:00 New Revision: b86f1af423952d9f1dbe105b651b948ce0e1e8d0 URL: https://github.com/llvm/llvm-project/commit/b86f1af423952d9f1dbe105b651b948ce0e1e8d0 DIFF: https://github.com/llvm/llvm-project/commit/b86f1af423952d9f1dbe105b651b948ce0e1e8d0.diff LOG: [clang] Remove profile available check for fsplit-machine-functions. Enforcing a profile available check in the driver does not work with incremental LTO builds where the LTO backend invocation does not include the profile flags. At this point the profiles have already been consumed and the IR contains profile metadata. Instead we always pass through the -fsplit-machine-functions flag on user request. The pass itself contains a check to return early if no profile information is available. Differential Revision: https://reviews.llvm.org/D87943 Added: Modified: clang/lib/Driver/ToolChains/Clang.cpp clang/test/Driver/fsplit-machine-functions.c Removed: diff --git a/clang/lib/Driver/ToolChains/Clang.cpp b/clang/lib/Driver/ToolChains/Clang.cpp index 67c17ee2cbb4..0c03a90b8566 100644 --- a/clang/lib/Driver/ToolChains/Clang.cpp +++ b/clang/lib/Driver/ToolChains/Clang.cpp @@ -4916,16 +4916,8 @@ void Clang::ConstructJob(Compilation &C, const JobAction &JA, options::OPT_fno_split_machine_functions)) { // This codegen pass is only available on x86-elf targets. if (Triple.isX86() && Triple.isOSBinFormatELF()) { - if (A->getOption().matches(options::OPT_fsplit_machine_functions)) { -// If the flag is enabled but no profile information is available then -// emit a warning. -if (getLastProfileUseArg(Args) || getLastProfileSampleUseArg(Args)) { - A->render(Args, CmdArgs); -} else { - D.Diag(diag::warn_drv_diagnostics_hotness_requires_pgo) - << A->getAsString(Args); -} - } + if (A->getOption().matches(options::OPT_fsplit_machine_functions)) +A->render(Args, CmdArgs); } else { D.Diag(diag::err_drv_unsupported_opt_for_target) << A->getAsString(Args) << TripleStr; diff --git a/clang/test/Driver/fsplit-machine-functions.c b/clang/test/Driver/fsplit-machine-functions.c index e126e4d41edb..fc0bb31f0f2b 100644 --- a/clang/test/Driver/fsplit-machine-functions.c +++ b/clang/test/Driver/fsplit-machine-functions.c @@ -1,9 +1,8 @@ // RUN: %clang -### -target x86_64 -fprofile-use=default.profdata -fsplit-machine-functions %s -c 2>&1 | FileCheck -check-prefix=CHECK-OPT %s +// RUN: %clang -### -target x86_64 -fsplit-machine-functions %s -c 2>&1 | FileCheck -check-prefix=CHECK-OPT %s // RUN: %clang -### -target x86_64 -fprofile-use=default.profdata -fsplit-machine-functions -fno-split-machine-functions %s -c 2>&1 | FileCheck -check-prefix=CHECK-NOOPT %s -// RUN: %clang -### -target x86_64 -fsplit-machine-functions %s 2>&1 | FileCheck -check-prefix=CHECK-WARN %s // RUN: not %clang -c -target arm-unknown-linux -fsplit-machine-functions %s 2>&1 | FileCheck -check-prefix=CHECK-TRIPLE %s // CHECK-OPT: "-fsplit-machine-functions" // CHECK-NOOPT-NOT: "-fsplit-machine-functions" -// CHECK-WARN: warning: argument '-fsplit-machine-functions' requires profile-guided optimization information // CHECK-TRIPLE:error: unsupported option '-fsplit-machine-functions' for target ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] 157cd93 - [clang] Disallow fbasic-block-sections on non-ELF, non-x86 targets.
Author: Snehasish Kumar Date: 2020-09-10T00:15:33-07:00 New Revision: 157cd93b48a90f484e9eb2ed9997e0372b9c7ebb URL: https://github.com/llvm/llvm-project/commit/157cd93b48a90f484e9eb2ed9997e0372b9c7ebb DIFF: https://github.com/llvm/llvm-project/commit/157cd93b48a90f484e9eb2ed9997e0372b9c7ebb.diff LOG: [clang] Disallow fbasic-block-sections on non-ELF, non-x86 targets. Basic block sections is untested on other platforms and binary formats apart from x86,elf. This patch emits a warning and drops the flag if the platform and binary format are not compatible. Add a test to ensure that specifying an incompatible target in the driver does not enable the feature. Differential Revision: https://reviews.llvm.org/D87426 Added: Modified: clang/lib/Driver/ToolChains/Clang.cpp clang/test/Driver/fbasic-block-sections.c Removed: diff --git a/clang/lib/Driver/ToolChains/Clang.cpp b/clang/lib/Driver/ToolChains/Clang.cpp index 1680f2ad91ea..40659ebb1395 100644 --- a/clang/lib/Driver/ToolChains/Clang.cpp +++ b/clang/lib/Driver/ToolChains/Clang.cpp @@ -4880,13 +4880,18 @@ void Clang::ConstructJob(Compilation &C, const JobAction &JA, } if (Arg *A = Args.getLastArg(options::OPT_fbasic_block_sections_EQ)) { -StringRef Val = A->getValue(); -if (Val != "all" && Val != "labels" && Val != "none" && -!(Val.startswith("list=") && llvm::sys::fs::exists(Val.substr(5 - D.Diag(diag::err_drv_invalid_value) - << A->getAsString(Args) << A->getValue(); -else - A->render(Args, CmdArgs); +if (Triple.isX86() && Triple.isOSBinFormatELF()) { + StringRef Val = A->getValue(); + if (Val != "all" && Val != "labels" && Val != "none" && + !(Val.startswith("list=") && llvm::sys::fs::exists(Val.substr(5 +D.Diag(diag::err_drv_invalid_value) +<< A->getAsString(Args) << A->getValue(); + else +A->render(Args, CmdArgs); +} else { + D.Diag(diag::err_drv_unsupported_opt_for_target) + << A->getAsString(Args) << TripleStr; +} } if (Args.hasFlag(options::OPT_fdata_sections, options::OPT_fno_data_sections, diff --git a/clang/test/Driver/fbasic-block-sections.c b/clang/test/Driver/fbasic-block-sections.c index 2ff98c94222b..93c7fe9fc069 100644 --- a/clang/test/Driver/fbasic-block-sections.c +++ b/clang/test/Driver/fbasic-block-sections.c @@ -1,9 +1,12 @@ -// RUN: %clang -### -fbasic-block-sections=none %s -S 2>&1 | FileCheck -check-prefix=CHECK-OPT-NONE %s -// RUN: %clang -### -fbasic-block-sections=all %s -S 2>&1 | FileCheck -check-prefix=CHECK-OPT-ALL %s -// RUN: %clang -### -fbasic-block-sections=list=%s %s -S 2>&1 | FileCheck -check-prefix=CHECK-OPT-LIST %s -// RUN: %clang -### -fbasic-block-sections=labels %s -S 2>&1 | FileCheck -check-prefix=CHECK-OPT-LABELS %s +// RUN: %clang -### -target x86_64 -fbasic-block-sections=none %s -S 2>&1 | FileCheck -check-prefix=CHECK-OPT-NONE %s +// RUN: %clang -### -target x86_64 -fbasic-block-sections=all %s -S 2>&1 | FileCheck -check-prefix=CHECK-OPT-ALL %s +// RUN: %clang -### -target x86_64 -fbasic-block-sections=list=%s %s -S 2>&1 | FileCheck -check-prefix=CHECK-OPT-LIST %s +// RUN: %clang -### -target x86_64 -fbasic-block-sections=labels %s -S 2>&1 | FileCheck -check-prefix=CHECK-OPT-LABELS %s +// RUN: not %clang -c -target arm-unknown-linux -fbasic-block-sections=all %s -S 2>&1 | FileCheck -check-prefix=CHECK-TRIPLE %s +// RUN: not %clang -c -target x86_64-apple-darwin10 -fbasic-block-sections=all %s -S 2>&1 | FileCheck -check-prefix=CHECK-TRIPLE %s // -// CHECK-OPT-NONE: "-fbasic-block-sections=none" -// CHECK-OPT-ALL: "-fbasic-block-sections=all" -// CHECK-OPT-LIST: "-fbasic-block-sections={{[^ ]*}}fbasic-block-sections.c" +// CHECK-OPT-NONE: "-fbasic-block-sections=none" +// CHECK-OPT-ALL:"-fbasic-block-sections=all" +// CHECK-OPT-LIST: "-fbasic-block-sections={{[^ ]*}}fbasic-block-sections.c" // CHECK-OPT-LABELS: "-fbasic-block-sections=labels" +// CHECK-TRIPLE: error: unsupported option '-fbasic-block-sections=all' for target ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [IRPGO][ValueProfile] Instrument virtual table address that could be used to do virtual table address comparision for indirect-call-promotion. (PR #66825)
snehasish wrote: > @david-xl It's interesting to know this. How is that going on your end? Currently we are exploring the design space to accommodate the variety of platforms and FDO types we use internally. This is a priority for us though so we should have some updates to share externally by the end of the year. > We've been using similar technique to do memcpy size optimization. It's interesting that you are exploring a profile-guided direction for this. Chatelet et. al (from Google) published a paper on [automatic generation of memcpy](https://conf.researchr.org/details/ismm-2021/ismm-2021/4/automemcpy-A-framework-for-automatic-generation-of-fundamental-memory-operations) which uses PMU based parameter profiling at ISMM'21. The technique does not use Intel DLA instead we use precise sampling on call instructions in the process and filter the functions of interest. We inspect the RDX register to collect the parameter value for size. The data in aggregate was used to auto-generate the memcpy implementation. Section 2.4 has the rationale for not using an FDO approach. @gchatelet is the primary owner for this work. What is the memcpy implementation you are trying to optimize? Do you see context sensitivity or workload specificity an important dimension to consider? > But a common problem here could be how to generalize the AutoFDO profile > format to incorporate both indirect call targets, callsite parameter values > and other types of values. Do you have a plan for that? Maybe we can work > together on this. Extensions to the AutoFDO format to accommodate such hints sounds good. Happy to collaborate on the design which can be leveraged by future work. Perhaps start a separate issue for discussion on Github or a thread on Discourse? https://github.com/llvm/llvm-project/pull/66825 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] e99b5ad - [memprof] Add scripts to automate testdata regeneration.
Author: Snehasish Kumar Date: 2023-03-09T19:54:23Z New Revision: e99b5ad38381ab263820b23a184d217a4112519c URL: https://github.com/llvm/llvm-project/commit/e99b5ad38381ab263820b23a184d217a4112519c DIFF: https://github.com/llvm/llvm-project/commit/e99b5ad38381ab263820b23a184d217a4112519c.diff LOG: [memprof] Add scripts to automate testdata regeneration. The memprof profiles and binaries need to be updated in case of version updates. This change adds three scripts for llvm-profdata, clang and llvm tests where memprof profiles are used as inputs. Also update the tests, profiles and binaries in this change. Change based on the review suggestions in D145023. Reviewed By: tejohnson Differential Revision: https://reviews.llvm.org/D145644 Added: clang/test/CodeGen/Inputs/update_memprof_inputs.sh llvm/test/Transforms/PGOProfile/Inputs/update_memprof_inputs.sh llvm/test/tools/llvm-profdata/Inputs/update_memprof_inputs.sh Modified: clang/test/CodeGen/Inputs/memprof.exe clang/test/CodeGen/Inputs/memprof.memprofraw clang/test/CodeGen/memprof.cpp llvm/test/Transforms/PGOProfile/Inputs/memprof.exe llvm/test/Transforms/PGOProfile/Inputs/memprof.memprofraw llvm/test/Transforms/PGOProfile/Inputs/memprof_pgo.profraw llvm/test/Transforms/PGOProfile/memprof.ll llvm/test/tools/llvm-profdata/Inputs/basic.memprofexe llvm/test/tools/llvm-profdata/Inputs/basic.memprofraw llvm/test/tools/llvm-profdata/Inputs/inline.memprofexe llvm/test/tools/llvm-profdata/Inputs/inline.memprofraw llvm/test/tools/llvm-profdata/Inputs/multi.memprofexe llvm/test/tools/llvm-profdata/Inputs/multi.memprofraw llvm/test/tools/llvm-profdata/Inputs/pic.memprofexe llvm/test/tools/llvm-profdata/Inputs/pic.memprofraw llvm/test/tools/llvm-profdata/memprof-basic.test llvm/test/tools/llvm-profdata/memprof-inline.test llvm/test/tools/llvm-profdata/memprof-merge.test llvm/test/tools/llvm-profdata/memprof-multi.test llvm/test/tools/llvm-profdata/memprof-pic.test Removed: llvm/test/tools/llvm-profdata/Inputs/memprof-inline.exe diff --git a/clang/test/CodeGen/Inputs/memprof.exe b/clang/test/CodeGen/Inputs/memprof.exe index c3b28a8ed9d3d..ad7a0414e899e 100755 Binary files a/clang/test/CodeGen/Inputs/memprof.exe and b/clang/test/CodeGen/Inputs/memprof.exe diff er diff --git a/clang/test/CodeGen/Inputs/memprof.memprofraw b/clang/test/CodeGen/Inputs/memprof.memprofraw index 7ae89bff648af..e64214a51b086 100644 Binary files a/clang/test/CodeGen/Inputs/memprof.memprofraw and b/clang/test/CodeGen/Inputs/memprof.memprofraw diff er diff --git a/clang/test/CodeGen/Inputs/update_memprof_inputs.sh b/clang/test/CodeGen/Inputs/update_memprof_inputs.sh new file mode 100755 index 0..1d51419ed1e7f --- /dev/null +++ b/clang/test/CodeGen/Inputs/update_memprof_inputs.sh @@ -0,0 +1,17 @@ +#!/bin/bash + +if [ -z $1 ]; then + echo "Path to clang required!" + echo "Usage: update_memprof_inputs.sh /path/to/updated/clang" + exit 1 +else + CLANG=$1 +fi + +# Allows the script to be invoked from other directories. +OUTDIR=$(dirname $(realpath -s $0)) + +DEFAULT_MEMPROF_FLAGS="-fuse-ld=lld -Wl,--no-rosegment -gmlt -fdebug-info-for-profiling -fmemory-profile -mno-omit-leaf-frame-pointer -fno-omit-frame-pointer -fno-optimize-sibling-calls -m64 -Wl,-build-id -no-pie" + +${CLANG} ${DEFAULT_MEMPROF_FLAGS} ${OUTDIR}/../memprof.cpp -o ${OUTDIR}/memprof.exe +env MEMPROF_OPTIONS=log_path=stdout ${OUTDIR}/memprof.exe > ${OUTDIR}/memprof.memprofraw diff --git a/clang/test/CodeGen/memprof.cpp b/clang/test/CodeGen/memprof.cpp index b246d1f086942..cb53eaac03e11 100644 --- a/clang/test/CodeGen/memprof.cpp +++ b/clang/test/CodeGen/memprof.cpp @@ -11,16 +11,7 @@ // TODO: Use text profile inputs once that is available for memprof. // -// The following commands were used to compile the source to instrumented -// executables and collect raw binary format profiles: -// -// # Collect memory profile: -// $ clang++ -fuse-ld=lld -no-pie -Wl,--no-rosegment -gmlt \ -// -fdebug-info-for-profiling -mno-omit-leaf-frame-pointer \ -// -fno-omit-frame-pointer -fno-optimize-sibling-calls -m64 -Wl,-build-id \ -// memprof.cpp -o memprof.exe -fmemory-profile -// $ env MEMPROF_OPTIONS=log_path=stdout ./memprof.exe > memprof.memprofraw -// +// To update the inputs below, run Inputs/update_memprof_inputs.sh // RUN: llvm-profdata merge %S/Inputs/memprof.memprofraw --profiled-binary %S/Inputs/memprof.exe -o %t.memprofdata // Profile use: diff --git a/llvm/test/Transforms/PGOProfile/Inputs/memprof.exe b/llvm/test/Transforms/PGOProfile/Inputs/memprof.exe index 6dd5783eb5acc..0ed40163919cc 100755 Binary files a/llvm/test/Transforms/PGOProfile/Inputs/memprof.exe and b/llvm/test/Transforms/PGOProfile/Inputs/memprof.exe diff er diff --git a/llvm/test/Transforms/PGOProfile/Inputs/memprof.memprof
[clang] 287177a - [memprof] Record BuildIDs in the raw profile.
Author: Snehasish Kumar Date: 2023-03-13T19:28:38Z New Revision: 287177a47a396ca6cc0bef7696108cdaa0c68e5f URL: https://github.com/llvm/llvm-project/commit/287177a47a396ca6cc0bef7696108cdaa0c68e5f DIFF: https://github.com/llvm/llvm-project/commit/287177a47a396ca6cc0bef7696108cdaa0c68e5f.diff LOG: [memprof] Record BuildIDs in the raw profile. This patch adds support for recording BuildIds usng the sanitizer ListOfModules API. We add another entry to the SegmentEntry struct and change the memprof raw version. Reviewed By: tejohnson Differential Revision: https://reviews.llvm.org/D145190 Added: llvm/test/tools/llvm-profdata/Inputs/buildid.memprofexe llvm/test/tools/llvm-profdata/Inputs/buildid.memprofraw llvm/test/tools/llvm-profdata/memprof-buildid.test Modified: clang/test/CodeGen/Inputs/memprof.exe clang/test/CodeGen/Inputs/memprof.memprofraw compiler-rt/include/profile/MemProfData.inc compiler-rt/lib/memprof/memprof_allocator.cpp compiler-rt/lib/memprof/memprof_rawprofile.cpp compiler-rt/lib/memprof/memprof_rawprofile.h llvm/include/llvm/ProfileData/MemProfData.inc llvm/lib/ProfileData/RawMemProfReader.cpp llvm/test/Transforms/PGOProfile/Inputs/memprof.exe llvm/test/Transforms/PGOProfile/Inputs/memprof.memprofraw llvm/test/Transforms/PGOProfile/Inputs/memprof_pgo.profraw llvm/test/tools/llvm-profdata/Inputs/basic.memprofexe llvm/test/tools/llvm-profdata/Inputs/basic.memprofraw llvm/test/tools/llvm-profdata/Inputs/inline.memprofexe llvm/test/tools/llvm-profdata/Inputs/inline.memprofraw llvm/test/tools/llvm-profdata/Inputs/multi.memprofexe llvm/test/tools/llvm-profdata/Inputs/multi.memprofraw llvm/test/tools/llvm-profdata/Inputs/pic.memprofexe llvm/test/tools/llvm-profdata/Inputs/pic.memprofraw llvm/test/tools/llvm-profdata/Inputs/update_memprof_inputs.sh llvm/test/tools/llvm-profdata/memprof-basic.test llvm/test/tools/llvm-profdata/memprof-inline.test llvm/test/tools/llvm-profdata/memprof-multi.test Removed: diff --git a/clang/test/CodeGen/Inputs/memprof.exe b/clang/test/CodeGen/Inputs/memprof.exe index ad7a0414e899e..c03b8b65c3b5b 100755 Binary files a/clang/test/CodeGen/Inputs/memprof.exe and b/clang/test/CodeGen/Inputs/memprof.exe diff er diff --git a/clang/test/CodeGen/Inputs/memprof.memprofraw b/clang/test/CodeGen/Inputs/memprof.memprofraw index e64214a51b086..c3e0818e2ed10 100644 Binary files a/clang/test/CodeGen/Inputs/memprof.memprofraw and b/clang/test/CodeGen/Inputs/memprof.memprofraw diff er diff --git a/compiler-rt/include/profile/MemProfData.inc b/compiler-rt/include/profile/MemProfData.inc index c533073da751f..b82a4baf6dd74 100644 --- a/compiler-rt/include/profile/MemProfData.inc +++ b/compiler-rt/include/profile/MemProfData.inc @@ -19,6 +19,7 @@ * synced up. * \*===--===*/ +#include #ifdef _MSC_VER #define PACKED(...) __pragma(pack(push,1)) __VA_ARGS__ __pragma(pack(pop)) @@ -32,7 +33,9 @@ (uint64_t)'o' << 24 | (uint64_t)'f' << 16 | (uint64_t)'r' << 8 | (uint64_t)129) // The version number of the raw binary format. -#define MEMPROF_RAW_VERSION 2ULL +#define MEMPROF_RAW_VERSION 3ULL + +#define MEMPROF_BUILDID_MAX_SIZE 32ULL namespace llvm { namespace memprof { @@ -46,37 +49,40 @@ PACKED(struct Header { uint64_t StackOffset; }); - // A struct describing the information necessary to describe a /proc/maps // segment entry for a particular binary/library identified by its build id. PACKED(struct SegmentEntry { uint64_t Start; uint64_t End; uint64_t Offset; - // This field is unused until sanitizer procmaps support for build ids for - // Linux-Elf is implemented. - uint8_t BuildId[32] = {0}; + uint64_t BuildIdSize; + uint8_t BuildId[MEMPROF_BUILDID_MAX_SIZE] = {0}; - SegmentEntry(uint64_t S, uint64_t E, uint64_t O) : -Start(S), End(E), Offset(O) {} + // This constructor is only used in tests so don't set the BuildId. + SegmentEntry(uint64_t S, uint64_t E, uint64_t O) + : Start(S), End(E), Offset(O), BuildIdSize(0) {} SegmentEntry(const SegmentEntry& S) { Start = S.Start; End = S.End; Offset = S.Offset; +BuildIdSize = S.BuildIdSize; +memcpy(BuildId, S.BuildId, S.BuildIdSize); } SegmentEntry& operator=(const SegmentEntry& S) { Start = S.Start; End = S.End; Offset = S.Offset; +BuildIdSize = S.BuildIdSize; +memcpy(BuildId, S.BuildId, S.BuildIdSize); return *this; } bool operator==(const SegmentEntry& S) const { -return Start == S.Start && - End == S.End && - Offset == S.Offset; +return Start == S.Start && End == S.End && Offset == S.Offset && + BuildIdSize == S.BuildIdSize && + memcmp(BuildId, S.BuildId, S.BuildIdSize) == 0; } }); diff -
[clang] debe80c - Revert "[memprof] Record BuildIDs in the raw profile."
Author: Snehasish Kumar Date: 2023-03-13T20:09:46Z New Revision: debe80cb8d50275849d8d0b9357321eb7985b854 URL: https://github.com/llvm/llvm-project/commit/debe80cb8d50275849d8d0b9357321eb7985b854 DIFF: https://github.com/llvm/llvm-project/commit/debe80cb8d50275849d8d0b9357321eb7985b854.diff LOG: Revert "[memprof] Record BuildIDs in the raw profile." This reverts commit 287177a47a396ca6cc0bef7696108cdaa0c68e5f. Added: Modified: clang/test/CodeGen/Inputs/memprof.exe clang/test/CodeGen/Inputs/memprof.memprofraw compiler-rt/include/profile/MemProfData.inc compiler-rt/lib/memprof/memprof_allocator.cpp compiler-rt/lib/memprof/memprof_rawprofile.cpp compiler-rt/lib/memprof/memprof_rawprofile.h llvm/include/llvm/ProfileData/MemProfData.inc llvm/lib/ProfileData/RawMemProfReader.cpp llvm/test/Transforms/PGOProfile/Inputs/memprof.exe llvm/test/Transforms/PGOProfile/Inputs/memprof.memprofraw llvm/test/Transforms/PGOProfile/Inputs/memprof_pgo.profraw llvm/test/tools/llvm-profdata/Inputs/basic.memprofexe llvm/test/tools/llvm-profdata/Inputs/basic.memprofraw llvm/test/tools/llvm-profdata/Inputs/inline.memprofexe llvm/test/tools/llvm-profdata/Inputs/inline.memprofraw llvm/test/tools/llvm-profdata/Inputs/multi.memprofexe llvm/test/tools/llvm-profdata/Inputs/multi.memprofraw llvm/test/tools/llvm-profdata/Inputs/pic.memprofexe llvm/test/tools/llvm-profdata/Inputs/pic.memprofraw llvm/test/tools/llvm-profdata/Inputs/update_memprof_inputs.sh llvm/test/tools/llvm-profdata/memprof-basic.test llvm/test/tools/llvm-profdata/memprof-inline.test llvm/test/tools/llvm-profdata/memprof-multi.test Removed: llvm/test/tools/llvm-profdata/Inputs/buildid.memprofexe llvm/test/tools/llvm-profdata/Inputs/buildid.memprofraw llvm/test/tools/llvm-profdata/memprof-buildid.test diff --git a/clang/test/CodeGen/Inputs/memprof.exe b/clang/test/CodeGen/Inputs/memprof.exe index c03b8b65c3b5b..ad7a0414e899e 100755 Binary files a/clang/test/CodeGen/Inputs/memprof.exe and b/clang/test/CodeGen/Inputs/memprof.exe diff er diff --git a/clang/test/CodeGen/Inputs/memprof.memprofraw b/clang/test/CodeGen/Inputs/memprof.memprofraw index c3e0818e2ed10..e64214a51b086 100644 Binary files a/clang/test/CodeGen/Inputs/memprof.memprofraw and b/clang/test/CodeGen/Inputs/memprof.memprofraw diff er diff --git a/compiler-rt/include/profile/MemProfData.inc b/compiler-rt/include/profile/MemProfData.inc index b82a4baf6dd74..c533073da751f 100644 --- a/compiler-rt/include/profile/MemProfData.inc +++ b/compiler-rt/include/profile/MemProfData.inc @@ -19,7 +19,6 @@ * synced up. * \*===--===*/ -#include #ifdef _MSC_VER #define PACKED(...) __pragma(pack(push,1)) __VA_ARGS__ __pragma(pack(pop)) @@ -33,9 +32,7 @@ (uint64_t)'o' << 24 | (uint64_t)'f' << 16 | (uint64_t)'r' << 8 | (uint64_t)129) // The version number of the raw binary format. -#define MEMPROF_RAW_VERSION 3ULL - -#define MEMPROF_BUILDID_MAX_SIZE 32ULL +#define MEMPROF_RAW_VERSION 2ULL namespace llvm { namespace memprof { @@ -49,40 +46,37 @@ PACKED(struct Header { uint64_t StackOffset; }); + // A struct describing the information necessary to describe a /proc/maps // segment entry for a particular binary/library identified by its build id. PACKED(struct SegmentEntry { uint64_t Start; uint64_t End; uint64_t Offset; - uint64_t BuildIdSize; - uint8_t BuildId[MEMPROF_BUILDID_MAX_SIZE] = {0}; + // This field is unused until sanitizer procmaps support for build ids for + // Linux-Elf is implemented. + uint8_t BuildId[32] = {0}; - // This constructor is only used in tests so don't set the BuildId. - SegmentEntry(uint64_t S, uint64_t E, uint64_t O) - : Start(S), End(E), Offset(O), BuildIdSize(0) {} + SegmentEntry(uint64_t S, uint64_t E, uint64_t O) : +Start(S), End(E), Offset(O) {} SegmentEntry(const SegmentEntry& S) { Start = S.Start; End = S.End; Offset = S.Offset; -BuildIdSize = S.BuildIdSize; -memcpy(BuildId, S.BuildId, S.BuildIdSize); } SegmentEntry& operator=(const SegmentEntry& S) { Start = S.Start; End = S.End; Offset = S.Offset; -BuildIdSize = S.BuildIdSize; -memcpy(BuildId, S.BuildId, S.BuildIdSize); return *this; } bool operator==(const SegmentEntry& S) const { -return Start == S.Start && End == S.End && Offset == S.Offset && - BuildIdSize == S.BuildIdSize && - memcmp(BuildId, S.BuildId, S.BuildIdSize) == 0; +return Start == S.Start && + End == S.End && + Offset == S.Offset; } }); diff --git a/compiler-rt/lib/memprof/memprof_allocator.cpp b/compiler-rt/lib/memprof/memprof_allocator.cpp index 751e4c40a72b2..6e3fa7f2dc7b1 100644 --- a/compiler-rt/lib/memprof/mempr
[clang] [llvm] [InstrPGO] Instrument sampling profile based cold function (PR #109837)
@@ -1784,6 +1784,12 @@ defm debug_info_for_profiling : BoolFOption<"debug-info-for-profiling", PosFlag, NegFlag>; +def fprofile_generate_cold_function_coverage : Flag<["-"], "fprofile-generate-cold-function-coverage">, snehasish wrote: > also to centralize the configuration for the convenience +1 I think a frontend flag is useful. It allows us to identify incompatibilities early and provide clear error messages instead of more obscure failures down the line. https://github.com/llvm/llvm-project/pull/109837 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [llvm] [InstrPGO] Instrument sampling profile based cold function (PR #109837)
snehasish wrote: > The major motivation is to detect dead functions for the services that are > optimized with sampling PGO. For your use case, can you use [ProfileSymbolList](https://github.com/llvm/llvm-project/blob/32ffc9fdc2cd422c88c926b862adb3de726e3888/llvm/include/llvm/ProfileData/SampleProf.h#L1509-L1512) to identify very cold functions (executed but unsampled) or are you indeed looking for functions that are never executed? Will this be used to guide developers with diagnostics or more aggressive compiler driven optimizations? https://github.com/llvm/llvm-project/pull/109837 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [llvm] [InstrPGO] Instrument sampling profile based cold function (PR #109837)
@@ -1784,6 +1784,12 @@ defm debug_info_for_profiling : BoolFOption<"debug-info-for-profiling", PosFlag, NegFlag>; +def fprofile_generate_cold_function_coverage : Flag<["-"], "fprofile-generate-cold-function-coverage">, snehasish wrote: > Is it that the backend can't actually reliably point the finger at the > specific flag causing the conflict Yes, if we know that this instrumentation mode should not be mixed with e.g. sanitizers or something else we can enforce these checks early. I don't see a particular downside to adding a frontend flag. The convenience of bundling the 2 lld flags and 3 mllvm flags into a single frontend flag seems like a good enough motivation to do so. https://github.com/llvm/llvm-project/pull/109837 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [docs]Fix a typo around '#pragma clang section' (PR #119791)
https://github.com/snehasish approved this pull request. lgtm https://github.com/llvm/llvm-project/pull/119791 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits