Nathan-Huckleberry created this revision. Herald added subscribers: llvm-commits, cfe-commits, arphaman, hiraditya, mgorny. Herald added projects: clang, LLVM. Nathan-Huckleberry edited the summary of this revision.
Instrumenting Clang/LLVM with Perfetto

Overview

Perfetto is an event-based tracer designed to replace chrome://tracing. It allows for fine-grained control over trace data and is currently in use by Chrome and Android. Instrumenting Clang with Perfetto would give nicely formatted traces that are easily shareable by link. Compile-time regression bugs could be filed with Perfetto links that clearly show the regression.

Perfetto exposes a C++ library that allows arbitrary applications to record app-specific events. Trace events can be added to Clang by calling macros exposed by Perfetto. The trace events are sent to an in-process tracing service and are kept in memory until the trace is written to disk. The trace is written as a protobuf and can be opened by the Perfetto trace processor (https://ui.perfetto.dev/).

The Perfetto trace processor allows you to visualize traces as flamegraphs. The view can be scrolled with the "WASD" keys. There is also a query engine built into the processor; queries are run by pressing CTRL+ENTER.

The benefits of Perfetto:

- Shareable Perfetto links
  - Traces can be easily shared without sending the trace file
  - Traces can be easily aggregated with UNIX cat
- Fine-grained tracing control
  - Trace events can span function boundaries (start a trace in one function, end it in another)
  - Finer granularity than the function level that you would see with Linux perf
- Less tracing overhead
  - Trace events are buffered in memory, not sent directly to disk
  - Perfetto macros are optimized to prevent overhead
- Smaller trace sizes
  - Strings and other reused data are interned
  - Traces are stored as protobufs instead of JSON
  - 3x smaller than -ftime-trace traces
- SQL queries for traces
  - The Perfetto UI has a query language built in for data aggregation
- Works on Linux/macOS/Windows

Example Trace

This is an example trace on a Linux kernel source file.
https://ui.perfetto.dev/#!/?s=c7942d5118f3ccfe16f46d166b05a66d077eb61ef8e22184a7d7dfe87ba8ea

This is an example trace on the entire Linux kernel.

https://ui.perfetto.dev/#!/?s=10556b46b46aba46188a51478102a6ce21a9c767c218afa5b8429eac4cb9d4

Recorded with:

  make CC="clang-9" KCFLAGS="-perfetto" -j72
  find /tmp -name "*pftrace" -exec cat {} \; > trace.pftrace

Current Implementation

These changes are behind a CMake flag (-DPERFETTO). When building Clang with the CMake flag enabled, the Perfetto repository is cloned from GitHub into the build folder and linked into any code that uses Perfetto macros. The -ftime-trace and Perfetto trace events have been combined into one macro that expands to trace events for both. The behavior of -ftime-trace is unchanged.

To run a Perfetto trace, pass the flag -perfetto to Clang (built with -DPERFETTO). The trace output file follows the convention set by -ftime-trace and uses the filename passed to -o to determine the trace filename.

For example: `clang -perfetto -c foo.c -o foo.o` would generate foo.pftrace.

Tracing documentation

`LLVM_TRACE_BEGIN(name, detail)`
Begins a tracing slice if Perfetto or -ftime-trace is enabled.

  `name` : constexpr String
  This is what will be displayed on the tracing UI.

  `detail` : StringRef
  Additional detail to add to the trace slice. This expands to a lambda and will be evaluated lazily only if Perfetto or -ftime-trace is enabled.

`LLVM_TRACE_END()`
Ends the most recently started slice.

`LLVM_TRACE_SCOPE(name, detail)`
Begins a tracing slice and initializes an anonymous struct if Perfetto or -ftime-trace is enabled. When the struct goes out of scope, the tracing slice will end.

  `name` : constexpr String
  This is what will be displayed on the tracing UI.

  `detail` : StringRef
  Additional detail to add to the trace slice. This expands to a lambda and will be evaluated lazily only if Perfetto or -ftime-trace is enabled.

Perfetto Documentation: https://perfetto.dev/

FAQs

Why not use Linux Perf?
Perfetto's event-based model allows for much finer-grained control over the trace.

- Linux Perf is only available on Linux.
- Visualization requires post-processing with separate tools.
- Requires kernel-version-specific dependencies.

Why not use -ftime-trace?

Perfetto has almost the same functionality as -ftime-trace, but with a few added benefits.

- Shareable links.
- Traces can be aggregated easily with UNIX cat.
- The query engine for trace analysis.
- The Perfetto UI is browser agnostic and could be used the same way as godbolt.
- The resulting trace files are ~3x smaller.
  - A trace of the Linux kernel is 50MB with Perfetto and 139MB with -ftime-trace.

Extra Notes

Perfetto also has a system mode that interacts with Linux ftrace. It can record things like process scheduling, syscalls, memory usage and CPU usage. This type of trace probably records far more data than we need, but I recorded a sample trace anyway while testing.

https://ui.perfetto.dev/#!/?s=18de7feb4f84ecd29519cb4ac136613ba891e4fd5e88a9e6511412ccfd210

Known Issues

When no-integrated-as is enabled, traces are written to /tmp/. This is a bug with the current implementation of -ftime-trace. When the Perfetto change is applied, the bug also applies to Perfetto.
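For readers who want a feel for the scoped-slice behavior described under "Tracing documentation", here is a minimal standalone sketch. It is not the macro from this patch: it uses neither Perfetto nor llvm::TimeTraceScope, and the names `sketch::ScopedSlice`, `SKETCH_TRACE_SCOPE`, `describe`, and `runTracedScope` are hypothetical. It assumes a simple in-memory recorder and only illustrates three ideas from the summary: the slice ends when the object goes out of scope, the detail is computed lazily, and slices shorter than a granularity threshold are dropped. The sketch keeps slices whose duration is at least the threshold, which may differ from the comparison used in the patch.

```cpp
// Illustrative only: a simplified stand-in for a scoped tracing macro.
// All names here are hypothetical and do not appear in the patch.
#include <cassert>
#include <chrono>
#include <cstdint>
#include <functional>
#include <string>
#include <utility>
#include <vector>

namespace sketch {

struct Slice {
  std::string Name;
  std::string Detail;
  uint64_t DurationNs;
};

// Global tracer state (assumption: single-threaded use).
static bool Enabled = false;
static uint64_t GranularityNs = 0;
static std::vector<Slice> Recorded;

class ScopedSlice {
  using Clock = std::chrono::steady_clock;
  const char *Name;
  std::function<std::string()> Detail; // called only if the slice is kept
  Clock::time_point Begin;
  bool Active;

public:
  ScopedSlice(const char *N, std::function<std::string()> D)
      : Name(N), Detail(std::move(D)), Begin(), Active(Enabled) {
    if (Active)
      Begin = Clock::now();
  }

  ~ScopedSlice() {
    if (!Active)
      return;
    auto Elapsed = std::chrono::duration_cast<std::chrono::nanoseconds>(
        Clock::now() - Begin);
    uint64_t Ns = static_cast<uint64_t>(Elapsed.count());
    // Drop short slices before paying for the detail string.
    if (Ns >= GranularityNs) {
      Slice S{Name, Detail(), Ns};
      Recorded.push_back(std::move(S));
    }
  }

  ScopedSlice(const ScopedSlice &) = delete;
  ScopedSlice &operator=(const ScopedSlice &) = delete;
};

} // namespace sketch

// Give each scope object a unique name based on the line number.
#define SKETCH_CONCAT2(a, b) a##b
#define SKETCH_CONCAT(a, b) SKETCH_CONCAT2(a, b)
// Wrap the detail expression in a lambda so it is evaluated lazily.
#define SKETCH_TRACE_SCOPE(name, detail_expr)                                  \
  sketch::ScopedSlice SKETCH_CONCAT(SketchSlice, __LINE__)(                    \
      name, [&]() { return std::string(detail_expr); })

// Stand-in for an expensive name computation; counts how often it runs.
static std::string describe(int &Calls) {
  ++Calls;
  return "foo";
}

// Runs one traced scope and returns how many times the detail was computed.
static int runTracedScope(bool Enable, uint64_t Granularity) {
  sketch::Enabled = Enable;
  sketch::GranularityNs = Granularity;
  sketch::Recorded.clear();
  int DetailEvaluations = 0;
  {
    SKETCH_TRACE_SCOPE("OptFunction", describe(DetailEvaluations));
    // Work being traced would go here.
  }
  return DetailEvaluations;
}
```

Calling `runTracedScope(true, 0)` records one slice named "OptFunction" with detail "foo". With tracing disabled, or with a threshold far longer than the scope lives, nothing is recorded and `describe` is never called, which is the point of passing the detail lazily.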
Repository:
  rG LLVM Github Monorepo

https://reviews.llvm.org/D82994

Files:
  clang/include/clang/Basic/CodeGenOptions.def
  clang/include/clang/Driver/Options.td
  clang/include/clang/Frontend/FrontendOptions.h
  clang/lib/CodeGen/BackendUtil.cpp
  clang/lib/CodeGen/CGDebugInfo.cpp
  clang/lib/CodeGen/CMakeLists.txt
  clang/lib/CodeGen/CodeGenAction.cpp
  clang/lib/CodeGen/CodeGenModule.cpp
  clang/lib/Driver/Driver.cpp
  clang/lib/Driver/ToolChains/Clang.cpp
  clang/lib/Frontend/CompilerInstance.cpp
  clang/lib/Frontend/CompilerInvocation.cpp
  clang/lib/Lex/CMakeLists.txt
  clang/lib/Parse/CMakeLists.txt
  clang/lib/Parse/ParseAST.cpp
  clang/lib/Parse/ParseDeclCXX.cpp
  clang/lib/Parse/ParseTemplate.cpp
  clang/lib/Sema/Sema.cpp
  clang/lib/Sema/SemaTemplateInstantiate.cpp
  clang/lib/Sema/SemaTemplateInstantiateDecl.cpp
  clang/lib/Serialization/GlobalModuleIndex.cpp
  clang/tools/clang-shlib/CMakeLists.txt
  clang/tools/driver/cc1_main.cpp
  llvm/cmake/config-ix.cmake
  llvm/cmake/modules/AddPerfetto.cmake
  llvm/include/llvm/Support/PerfettoTracer.h
  llvm/include/llvm/Support/Tracing.h
  llvm/lib/Analysis/LoopPass.cpp
  llvm/lib/IR/CMakeLists.txt
  llvm/lib/IR/LegacyPassManager.cpp
  llvm/lib/Support/CMakeLists.txt
  llvm/lib/Support/PerfettoTracer.cpp
Index: llvm/lib/Support/PerfettoTracer.cpp =================================================================== --- /dev/null +++ llvm/lib/Support/PerfettoTracer.cpp @@ -0,0 +1,54 @@ +#ifdef PERFETTO +#include "llvm/Support/Tracing.h" + +using namespace llvm; + +bool PerfettoEnabled = false; +uint64_t PerfettoGranularityNS = 2 * 1000 * 1000; +SmallVector<Entry, 16> PerfettoStack; + +PerfettoProfiler::PerfettoProfiler(std::unique_ptr<raw_pwrite_stream> os) + : OS(std::move(os)) { + perfetto::TracingInitArgs args; + // The backends determine where trace events are recorded. For this example we + // are going to use the in-process tracing service, which only includes in-app + // events. + args.backends = perfetto::kInProcessBackend; + + perfetto::Tracing::Initialize(args); + perfetto::TrackEvent::Register(); + + // The trace config defines which types of data sources are enabled for + // recording. In this example we just need the "track_event" data source, + // which corresponds to the TRACE_EVENT trace points. + perfetto::TraceConfig cfg; + cfg.add_buffers()->set_size_kb(32384); + auto *ds_cfg = cfg.add_data_sources()->mutable_config(); + ds_cfg->set_name("track_event"); + + TracingSession = perfetto::Tracing::NewTrace(); + TracingSession->Setup(cfg); + TracingSession->StartBlocking(); + PerfettoEnabled = true; +} + +PerfettoProfiler::~PerfettoProfiler() { + // Make sure the last event is closed for this example. + perfetto::TrackEvent::Flush(); + // Stop tracing and read the trace data. + TracingSession->StopBlocking(); + std::vector<char> trace_data(TracingSession->ReadTraceBlocking()); + // Write the result into a file. + // Note: To save memory with longer traces, you can tell Perfetto to write + // directly into a file by passing a file descriptor into Setup() above. 
+ OS->write(&trace_data[0], trace_data.size()); + OS->flush(); +} + +bool PerfettoTracingEnabled() { return PerfettoEnabled; } + +uint64_t PerfettoGetGranularity() { return PerfettoGranularityNS; } + +// Reserves internal static storage for our tracing categories. +PERFETTO_TRACK_EVENT_STATIC_STORAGE(); +#endif Index: llvm/lib/Support/CMakeLists.txt =================================================================== --- llvm/lib/Support/CMakeLists.txt +++ llvm/lib/Support/CMakeLists.txt @@ -122,6 +122,7 @@ OptimizedStructLayout.cpp Optional.cpp Parallel.cpp + PerfettoTracer.cpp PluginLoader.cpp PrettyStackTrace.cpp RandomNumberGenerator.cpp Index: llvm/lib/IR/LegacyPassManager.cpp =================================================================== --- llvm/lib/IR/LegacyPassManager.cpp +++ llvm/lib/IR/LegacyPassManager.cpp @@ -27,8 +27,8 @@ #include "llvm/Support/ErrorHandling.h" #include "llvm/Support/ManagedStatic.h" #include "llvm/Support/Mutex.h" -#include "llvm/Support/TimeProfiler.h" #include "llvm/Support/Timer.h" +#include "llvm/Support/Tracing.h" #include "llvm/Support/raw_ostream.h" #include <algorithm> #include <unordered_set> @@ -1463,13 +1463,13 @@ FunctionSize = F.getInstructionCount(); } - llvm::TimeTraceScope FunctionScope("OptFunction", F.getName()); + LLVM_TRACE_SCOPE("OptFunction", F.getName()); for (unsigned Index = 0; Index < getNumContainedPasses(); ++Index) { FunctionPass *FP = getContainedPass(Index); bool LocalChanged = false; - llvm::TimeTraceScope PassScope("RunPass", FP->getPassName()); + LLVM_TRACE_SCOPE("RunPass", FP->getPassName()); dumpPassInfo(FP, EXECUTION_MSG, ON_FUNCTION_MSG, F.getName()); dumpRequiredSet(FP); @@ -1546,7 +1546,7 @@ /// the module, and if so, return true. 
bool MPPassManager::runOnModule(Module &M) { - llvm::TimeTraceScope TimeScope("OptModule", M.getName()); + LLVM_TRACE_SCOPE("OptModule", M.getName()); bool Changed = false; Index: llvm/lib/IR/CMakeLists.txt =================================================================== --- llvm/lib/IR/CMakeLists.txt +++ llvm/lib/IR/CMakeLists.txt @@ -63,3 +63,9 @@ DEPENDS intrinsics_gen ) + +if(PERFETTO) + target_link_libraries(LLVMCore + PRIVATE perfetto + ) +endif() Index: llvm/lib/Analysis/LoopPass.cpp =================================================================== --- llvm/lib/Analysis/LoopPass.cpp +++ llvm/lib/Analysis/LoopPass.cpp @@ -22,8 +22,8 @@ #include "llvm/IR/PassTimingInfo.h" #include "llvm/InitializePasses.h" #include "llvm/Support/Debug.h" -#include "llvm/Support/TimeProfiler.h" #include "llvm/Support/Timer.h" +#include "llvm/Support/Tracing.h" #include "llvm/Support/raw_ostream.h" using namespace llvm; @@ -179,7 +179,7 @@ for (unsigned Index = 0; Index < getNumContainedPasses(); ++Index) { LoopPass *P = getContainedPass(Index); - llvm::TimeTraceScope LoopPassScope("RunLoopPass", P->getPassName()); + LLVM_TRACE_SCOPE("RunLoopPass", P->getPassName()); dumpPassInfo(P, EXECUTION_MSG, ON_LOOP_MSG, CurrentLoop->getHeader()->getName()); Index: llvm/include/llvm/Support/Tracing.h =================================================================== --- /dev/null +++ llvm/include/llvm/Support/Tracing.h @@ -0,0 +1,26 @@ +#ifndef TRACING_H +#define TRACING_H + +#include "llvm/Support/PerfettoTracer.h" +#include "llvm/Support/TimeProfiler.h" + +// Generate a unique variable name with a given prefix. 
+#define INTERNAL_CONCAT2(a, b) a##b +#define INTERNAL_CONCAT(a, b) INTERNAL_CONCAT2(a, b) +#define UID(prefix) INTERNAL_CONCAT(prefix, __LINE__) + +#define LLVM_TRACE_BEGIN(name, detail) \ + if (llvm::timeTraceProfilerEnabled()) \ + llvm::timeTraceProfilerBegin(name, detail); \ + PERFETTO_TRACE_EVENT_BEGIN(name, detail) + +#define LLVM_TRACE_END() \ + if (llvm::timeTraceProfilerEnabled()) \ + llvm::timeTraceProfilerEnd(); \ + PERFETTO_TRACE_EVENT_END() + +#define LLVM_TRACE_SCOPE(name, detail) \ + llvm::TimeTraceScope UID(TimeScope)(name, detail); \ + PERFETTO_TRACE_EVENT_SCOPE(name, detail) + +#endif // TRACING_H Index: llvm/include/llvm/Support/PerfettoTracer.h =================================================================== --- /dev/null +++ llvm/include/llvm/Support/PerfettoTracer.h @@ -0,0 +1,109 @@ +#ifndef PERFETTO_TRACE_CATEGORIES_H +#define PERFETTO_TRACE_CATEGORIES_H + +#ifdef PERFETTO +#include "llvm/ADT/SmallVector.h" +#include "llvm/ADT/StringRef.h" +#include "llvm/Support/raw_ostream.h" +#include <perfetto.h> + +// The set of track event categories that the example is using. 
+PERFETTO_DEFINE_CATEGORIES( + perfetto::Category("frontend").SetDescription("Frontend events"), + perfetto::Category("backend").SetDescription("Backend events"), + perfetto::Category("clang").SetDescription("Trace for main()")); + +class PerfettoProfiler { + std::unique_ptr<perfetto::TracingSession> TracingSession; + std::unique_ptr<llvm::raw_pwrite_stream> OS; + +public: + PerfettoProfiler(std::unique_ptr<llvm::raw_pwrite_stream> os); + ~PerfettoProfiler(); +}; + +struct Entry { + uint64_t BeginTime; + std::string Name; + std::string Detail; + Entry(uint64_t begin, std::string name, std::string detail) + : BeginTime(begin), Name(name), Detail(detail){}; +}; + +bool PerfettoTracingEnabled(); +uint64_t PerfettoGetGranularity(); +extern llvm::SmallVector<Entry, 16> PerfettoStack; + +#define PERFETTO_TRACE_EVENT_BEGIN(name, detail) \ + if (PerfettoTracingEnabled()) { \ + PerfettoStack.push_back( \ + Entry(perfetto::internal::TrackEventInternal::GetTimeNs(), \ + std::string(name), (detail).str())); \ + } + +#define PERFETTO_TRACE_EVENT_END() \ + if (PerfettoTracingEnabled()) { \ + uint64_t EndTime = perfetto::internal::TrackEventInternal::GetTimeNs(); \ + Entry e = PerfettoStack.back(); \ + PerfettoStack.pop_back(); \ + if (EndTime - e.BeginTime > PerfettoGetGranularity()) { \ + TRACE_EVENT_BEGIN("clang", "", e.BeginTime, \ + [&](perfetto::EventContext ctx) { \ + auto event = ctx.event(); \ + event->set_name_iid(0); \ + event->set_name(e.Name); \ + auto debug = event->add_debug_annotations(); \ + debug->set_name("detail"); \ + debug->set_string_value(e.Detail); \ + }); \ + TRACE_EVENT_END("clang", EndTime, [&](perfetto::EventContext ctx) {}); \ + } \ + } + +#define PERFETTO_TRACE_EVENT_SCOPE(name, detail) \ + FREQUENT_TRACE_EVENT_LAMBDA("clang", name, [&](perfetto::EventContext ctx) { \ + auto event = ctx.event(); \ + auto debug = event->add_debug_annotations(); \ + debug->set_name("detail"); \ + debug->set_string_value((detail).str()); \ + }) + +// Custom TRACE_EVENT 
macro for events that are usually high frequency +// and very short. Only records the trace events if the slice is above +// the minimum required theshold. +// +// Can take a constexpr string and an optional non-constexpr string to +// be appended. +#define FREQUENT_TRACE_EVENT_LAMBDA(category, name, lambda) \ + struct PERFETTO_UID(ScopedEvent) { \ + struct EventFinalizer { \ + std::function<void(perfetto::EventContext)> lambda__; \ + uint64_t BeginTimestamp = 0; \ + uint64_t EndTimestamp = 0; \ + EventFinalizer(std::function<void(perfetto::EventContext)> l) { \ + if (PerfettoTracingEnabled()) { \ + BeginTimestamp = \ + perfetto::internal::TrackEventInternal::GetTimeNs(); \ + lambda__ = l; \ + } \ + } \ + ~EventFinalizer() { \ + if (PerfettoTracingEnabled()) { \ + EndTimestamp = perfetto::internal::TrackEventInternal::GetTimeNs(); \ + if (EndTimestamp - BeginTimestamp > PerfettoGetGranularity()) { \ + TRACE_EVENT_BEGIN(category, name, BeginTimestamp, lambda__); \ + TRACE_EVENT_END(category, EndTimestamp, \ + [&](perfetto::EventContext ctx) {}); \ + } \ + } \ + } \ + } finalizer; \ + } PERFETTO_UID(scoped_event){{lambda}}; + +#else +#define PERFETTO_TRACE_EVENT_BEGIN(name, detail) +#define PERFETTO_TRACE_EVENT_END() +#define PERFETTO_TRACE_EVENT_SCOPE(name, detail) +#endif + +#endif // PERFETTO_TRACE_CATEGORIES_H Index: llvm/cmake/modules/AddPerfetto.cmake =================================================================== --- /dev/null +++ llvm/cmake/modules/AddPerfetto.cmake @@ -0,0 +1,28 @@ +if(PERFETTO) + cmake_minimum_required(VERSION 2.8.2) + message(STATUS "Perfetto enabled.") + + find_package(Threads) + + include(ExternalProject) + ExternalProject_Add(perfetto_git + GIT_REPOSITORY https://github.com/google/perfetto.git + GIT_TAG releases/v4.x + PREFIX "${CMAKE_BINARY_DIR}/perfetto" + CONFIGURE_COMMAND "" + BUILD_COMMAND "" + INSTALL_COMMAND "" + LOG_DOWNLOAD ON + LOG_INSTALL ON + BUILD_BYPRODUCTS "perfetto/src/perfetto_git/sdk/perfetto.cc" + BUILD_BYPRODUCTS 
"perfetto/src/perfetto_git/sdk/perfetto.h" + ) + ExternalProject_Get_Property(perfetto_git source_dir) + include_directories(${source_dir}/sdk) + add_library(perfetto STATIC IMPORTED) + set_target_properties(perfetto PROPERTIES + IMPORTED_LOCATION ${source_dir}/sdk/perfetto.cc + ) + add_dependencies(perfetto perfetto_git) + SET(CMAKE_CXX_FLAGS "${CMAKE_CXX_FLAGS} -DPERFETTO=1") +endif() Index: llvm/cmake/config-ix.cmake =================================================================== --- llvm/cmake/config-ix.cmake +++ llvm/cmake/config-ix.cmake @@ -629,6 +629,8 @@ endif() endif() +include(AddPerfetto) + string(REPLACE " " ";" LLVM_BINDINGS_LIST "${LLVM_BINDINGS}") function(find_python_module module) Index: clang/tools/driver/cc1_main.cpp =================================================================== --- clang/tools/driver/cc1_main.cpp +++ clang/tools/driver/cc1_main.cpp @@ -40,8 +40,8 @@ #include "llvm/Support/Signals.h" #include "llvm/Support/TargetRegistry.h" #include "llvm/Support/TargetSelect.h" -#include "llvm/Support/TimeProfiler.h" #include "llvm/Support/Timer.h" +#include "llvm/Support/Tracing.h" #include "llvm/Support/raw_ostream.h" #include "llvm/Target/TargetMachine.h" #include <cstdio> @@ -210,6 +210,22 @@ llvm::timeTraceProfilerInitialize( Clang->getFrontendOpts().TimeTraceGranularity, Argv0); } +#ifdef PERFETTO + std::unique_ptr<PerfettoProfiler> PerfettoSession = nullptr; + if (Clang->getFrontendOpts().Perfetto) { + SmallString<128> Path(Clang->getFrontendOpts().OutputFile); + llvm::sys::path::replace_extension(Path, "pftrace"); + if (auto profilerOutput = + Clang->createOutputFile(Path.str(), + /*Binary=*/false, + /*RemoveFileOnSignal=*/false, "", + /*Extension=*/"json", + /*useTemporary=*/false)) { + + PerfettoSession.reset(new PerfettoProfiler(std::move(profilerOutput))); + } + } +#endif // --print-supported-cpus takes priority over the actual compilation. 
if (Clang->getFrontendOpts().PrintSupportedCPUs) return PrintSupportedCPUs(Clang->getTargetOpts().Triple); @@ -236,7 +252,7 @@ // Execute the frontend actions. { - llvm::TimeTraceScope TimeScope("ExecuteCompiler"); + LLVM_TRACE_SCOPE("ExecuteCompiler", StringRef("")); Success = ExecuteCompilerInvocation(Clang.get()); } Index: clang/tools/clang-shlib/CMakeLists.txt =================================================================== --- clang/tools/clang-shlib/CMakeLists.txt +++ clang/tools/clang-shlib/CMakeLists.txt @@ -36,6 +36,17 @@ set(INSTALL_WITH_TOOLCHAIN INSTALL_WITH_TOOLCHAIN) endif() +if (PERFETTO) +# Avoid issues with PRIVATE library +# There's probably a better fix for this +add_clang_library(clang-cpp + SHARED + ${INSTALL_WITH_TOOLCHAIN} + clang-shlib.cpp + ${_OBJECTS} + LINK_LIBS + ${clang_libs}) +else() add_clang_library(clang-cpp SHARED ${INSTALL_WITH_TOOLCHAIN} @@ -43,3 +54,4 @@ ${_OBJECTS} LINK_LIBS ${_DEPS}) +endif() Index: clang/lib/Serialization/GlobalModuleIndex.cpp =================================================================== --- clang/lib/Serialization/GlobalModuleIndex.cpp +++ clang/lib/Serialization/GlobalModuleIndex.cpp @@ -30,7 +30,7 @@ #include "llvm/Support/MemoryBuffer.h" #include "llvm/Support/OnDiskHashTable.h" #include "llvm/Support/Path.h" -#include "llvm/Support/TimeProfiler.h" +#include "llvm/Support/Tracing.h" #include <cstdio> using namespace clang; using namespace serialization; @@ -135,7 +135,7 @@ "' failed: " + toString(std::move(Err))); }; - llvm::TimeTraceScope TimeScope("Module LoadIndex"); + LLVM_TRACE_SCOPE("Module LoadIndex", StringRef("")); // Read the global index. bool InGlobalIndexBlock = false; bool Done = false; @@ -771,7 +771,7 @@ } using namespace llvm; - llvm::TimeTraceScope TimeScope("Module WriteIndex"); + LLVM_TRACE_SCOPE("Module WriteIndex", StringRef("")); // Emit the file header. 
Stream.Emit((unsigned)'B', 8); Index: clang/lib/Sema/SemaTemplateInstantiateDecl.cpp =================================================================== --- clang/lib/Sema/SemaTemplateInstantiateDecl.cpp +++ clang/lib/Sema/SemaTemplateInstantiateDecl.cpp @@ -26,7 +26,7 @@ #include "clang/Sema/SemaInternal.h" #include "clang/Sema/Template.h" #include "clang/Sema/TemplateInstCallback.h" -#include "llvm/Support/TimeProfiler.h" +#include "llvm/Support/Tracing.h" using namespace clang; @@ -4578,13 +4578,8 @@ return; } - llvm::TimeTraceScope TimeScope("InstantiateFunction", [&]() { - std::string Name; - llvm::raw_string_ostream OS(Name); - Function->getNameForDiagnostic(OS, getPrintingPolicy(), - /*Qualified=*/true); - return Name; - }); + LLVM_TRACE_SCOPE("InstantiateFunction", + StringRef(Function->getQualifiedNameAsString())); // If we're performing recursive template instantiation, create our own // queue of pending implicit instantiations that we will instantiate later, Index: clang/lib/Sema/SemaTemplateInstantiate.cpp =================================================================== --- clang/lib/Sema/SemaTemplateInstantiate.cpp +++ clang/lib/Sema/SemaTemplateInstantiate.cpp @@ -29,7 +29,7 @@ #include "clang/Sema/Template.h" #include "clang/Sema/TemplateDeduction.h" #include "clang/Sema/TemplateInstCallback.h" -#include "llvm/Support/TimeProfiler.h" +#include "llvm/Support/Tracing.h" using namespace clang; using namespace sema; @@ -2596,13 +2596,8 @@ Pattern, PatternDef, TSK, Complain)) return true; - llvm::TimeTraceScope TimeScope("InstantiateClass", [&]() { - std::string Name; - llvm::raw_string_ostream OS(Name); - Instantiation->getNameForDiagnostic(OS, getPrintingPolicy(), - /*Qualified=*/true); - return Name; - }); + LLVM_TRACE_SCOPE("InstantiateClass", + StringRef(Instantiation->getQualifiedNameAsString())); Pattern = PatternDef; Index: clang/lib/Sema/Sema.cpp =================================================================== --- clang/lib/Sema/Sema.cpp 
+++ clang/lib/Sema/Sema.cpp @@ -43,7 +43,7 @@ #include "clang/Sema/TypoCorrection.h" #include "llvm/ADT/DenseMap.h" #include "llvm/ADT/SmallSet.h" -#include "llvm/Support/TimeProfiler.h" +#include "llvm/Support/Tracing.h" using namespace clang; using namespace sema; @@ -113,11 +113,12 @@ SourceManager &SM = S->getSourceManager(); SourceLocation IncludeLoc = SM.getIncludeLoc(SM.getFileID(Loc)); if (IncludeLoc.isValid()) { - if (llvm::timeTraceProfilerEnabled()) { - const FileEntry *FE = SM.getFileEntryForID(SM.getFileID(Loc)); - llvm::timeTraceProfilerBegin( - "Source", FE != nullptr ? FE->getName() : StringRef("<unknown>")); - } + // FIXME: These events are mismatched which will cause problems if + // additional trace events are added. + LLVM_TRACE_BEGIN( + "Source", SM.getFileEntryForID(SM.getFileID(Loc)) != nullptr + ? SM.getFileEntryForID(SM.getFileID(Loc))->getName() + : StringRef("<unknown>")); IncludeStack.push_back(IncludeLoc); S->DiagnoseNonDefaultPragmaPack( @@ -127,8 +128,7 @@ } case ExitFile: if (!IncludeStack.empty()) { - if (llvm::timeTraceProfilerEnabled()) - llvm::timeTraceProfilerEnd(); + LLVM_TRACE_END(); S->DiagnoseNonDefaultPragmaPack( Sema::PragmaPackDiagnoseKind::ChangedStateAtExit, @@ -953,7 +953,7 @@ } { - llvm::TimeTraceScope TimeScope("PerformPendingInstantiations"); + LLVM_TRACE_SCOPE("PerformPendingInstantiations", StringRef("")); PerformPendingInstantiations(); } Index: clang/lib/Parse/ParseTemplate.cpp =================================================================== --- clang/lib/Parse/ParseTemplate.cpp +++ clang/lib/Parse/ParseTemplate.cpp @@ -19,7 +19,7 @@ #include "clang/Sema/DeclSpec.h" #include "clang/Sema/ParsedTemplate.h" #include "clang/Sema/Scope.h" -#include "llvm/Support/TimeProfiler.h" +#include "llvm/Support/Tracing.h" using namespace clang; /// Parse a template declaration, explicit instantiation, or @@ -252,11 +252,10 @@ return nullptr; } - llvm::TimeTraceScope TimeScope("ParseTemplate", [&]() { - return 
std::string(DeclaratorInfo.getIdentifier() != nullptr - ? DeclaratorInfo.getIdentifier()->getName() - : "<unknown>"); - }); + LLVM_TRACE_SCOPE("ParseTemplate", + DeclaratorInfo.getIdentifier() != nullptr + ? DeclaratorInfo.getIdentifier()->getName() + : StringRef("<unknown>")); LateParsedAttrList LateParsedAttrs(true); if (DeclaratorInfo.isFunctionDeclarator()) { Index: clang/lib/Parse/ParseDeclCXX.cpp =================================================================== --- clang/lib/Parse/ParseDeclCXX.cpp +++ clang/lib/Parse/ParseDeclCXX.cpp @@ -10,7 +10,6 @@ // //===----------------------------------------------------------------------===// -#include "clang/Parse/Parser.h" #include "clang/AST/ASTContext.h" #include "clang/AST/DeclTemplate.h" #include "clang/AST/PrettyDeclStackTrace.h" @@ -19,12 +18,13 @@ #include "clang/Basic/OperatorKinds.h" #include "clang/Basic/TargetInfo.h" #include "clang/Parse/ParseDiagnostic.h" +#include "clang/Parse/Parser.h" #include "clang/Parse/RAIIObjectsForParser.h" #include "clang/Sema/DeclSpec.h" #include "clang/Sema/ParsedTemplate.h" #include "clang/Sema/Scope.h" #include "llvm/ADT/SmallString.h" -#include "llvm/Support/TimeProfiler.h" +#include "llvm/Support/Tracing.h" using namespace clang; @@ -3177,11 +3177,11 @@ TagType == DeclSpec::TST_union || TagType == DeclSpec::TST_class) && "Invalid TagType!"); - llvm::TimeTraceScope TimeScope("ParseClass", [&]() { - if (auto *TD = dyn_cast_or_null<NamedDecl>(TagDecl)) - return TD->getQualifiedNameAsString(); - return std::string("<anonymous>"); - }); + LLVM_TRACE_SCOPE( + "ParseClass", + dyn_cast_or_null<NamedDecl>(TagDecl) + ? 
dyn_cast_or_null<NamedDecl>(TagDecl)->getQualifiedNameAsString() + : StringRef("<anonymous>")); PrettyDeclStackTraceEntry CrashInfo(Actions.Context, TagDecl, RecordLoc, "parsing struct/union/class body"); Index: clang/lib/Parse/ParseAST.cpp =================================================================== --- clang/lib/Parse/ParseAST.cpp +++ clang/lib/Parse/ParseAST.cpp @@ -22,7 +22,7 @@ #include "clang/Sema/SemaConsumer.h" #include "clang/Sema/TemplateInstCallback.h" #include "llvm/Support/CrashRecoveryContext.h" -#include "llvm/Support/TimeProfiler.h" +#include "llvm/Support/Tracing.h" #include <cstdio> #include <memory> @@ -151,7 +151,7 @@ bool HaveLexer = S.getPreprocessor().getCurrentLexer(); if (HaveLexer) { - llvm::TimeTraceScope TimeScope("Frontend"); + LLVM_TRACE_SCOPE("Frontend", StringRef("")); P.Initialize(); Parser::DeclGroupPtrTy ADecl; for (bool AtEOF = P.ParseFirstTopLevelDecl(ADecl); !AtEOF; Index: clang/lib/Parse/CMakeLists.txt =================================================================== --- clang/lib/Parse/CMakeLists.txt +++ clang/lib/Parse/CMakeLists.txt @@ -28,3 +28,9 @@ clangLex clangSema ) + +if(PERFETTO) + target_link_libraries(clangParse + PRIVATE perfetto + ) +endif() Index: clang/lib/Lex/CMakeLists.txt =================================================================== --- clang/lib/Lex/CMakeLists.txt +++ clang/lib/Lex/CMakeLists.txt @@ -29,3 +29,9 @@ LINK_LIBS clangBasic ) + +if(PERFETTO) + target_link_libraries(clangLex + PRIVATE perfetto + ) +endif() Index: clang/lib/Frontend/CompilerInvocation.cpp =================================================================== --- clang/lib/Frontend/CompilerInvocation.cpp +++ clang/lib/Frontend/CompilerInvocation.cpp @@ -1860,6 +1860,7 @@ Opts.TimeTrace = Args.hasArg(OPT_ftime_trace); Opts.TimeTraceGranularity = getLastArgIntValue( Args, OPT_ftime_trace_granularity_EQ, Opts.TimeTraceGranularity, Diags); + Opts.Perfetto = Args.hasArg(OPT_perfetto); Opts.ShowVersion = 
Args.hasArg(OPT_version); Opts.ASTMergeFiles = Args.getAllArgValues(OPT_ast_merge); Opts.LLVMArgs = Args.getAllArgValues(OPT_mllvm); Index: clang/lib/Frontend/CompilerInstance.cpp =================================================================== --- clang/lib/Frontend/CompilerInstance.cpp +++ clang/lib/Frontend/CompilerInstance.cpp @@ -47,8 +47,8 @@ #include "llvm/Support/Path.h" #include "llvm/Support/Program.h" #include "llvm/Support/Signals.h" -#include "llvm/Support/TimeProfiler.h" #include "llvm/Support/Timer.h" +#include "llvm/Support/Tracing.h" #include "llvm/Support/raw_ostream.h" #include <time.h> #include <utility> @@ -1051,7 +1051,7 @@ [](CompilerInstance &) {}, llvm::function_ref<void(CompilerInstance &)> PostBuildStep = [](CompilerInstance &) {}) { - llvm::TimeTraceScope TimeScope("Module Compile", ModuleName); + LLVM_TRACE_SCOPE("Module Compile", ModuleName); // Construct a compiler invocation for creating this module. auto Invocation = @@ -1713,7 +1713,7 @@ Timer.init("loading." + ModuleFilename, "Loading " + ModuleFilename, *FrontendTimerGroup); llvm::TimeRegion TimeLoading(FrontendTimerGroup ? &Timer : nullptr); - llvm::TimeTraceScope TimeScope("Module Load", ModuleName); + LLVM_TRACE_SCOPE("Module Load", ModuleName); // Try to load the module file. If we are not trying to load from the // module cache, we don't know how to rebuild modules. 
Index: clang/lib/Driver/ToolChains/Clang.cpp =================================================================== --- clang/lib/Driver/ToolChains/Clang.cpp +++ clang/lib/Driver/ToolChains/Clang.cpp @@ -5317,6 +5317,7 @@ Args.AddLastArg(CmdArgs, options::OPT_ftrapv); Args.AddLastArg(CmdArgs, options::OPT_malign_double); Args.AddLastArg(CmdArgs, options::OPT_fno_temp_file); + Args.AddLastArg(CmdArgs, options::OPT_perfetto); if (Arg *A = Args.getLastArg(options::OPT_ftrapv_handler_EQ)) { CmdArgs.push_back("-ftrapv-handler"); Index: clang/lib/Driver/Driver.cpp =================================================================== --- clang/lib/Driver/Driver.cpp +++ clang/lib/Driver/Driver.cpp @@ -3751,8 +3751,14 @@ /*TargetDeviceOffloadKind*/ Action::OFK_None); } + // We don't need to count the assembler as a job since it doesn't + // cause the memory issue that requires disabling integrated-cc1. + size_t NumJobs = C.getJobs().size(); + if (NumJobs && !C.getDefaultToolChain().useIntegratedAs()) + --NumJobs; + // If we have more than one job, then disable integrated-cc1 for now. 
- if (C.getJobs().size() > 1) + if (NumJobs > 1) for (auto &J : C.getJobs()) J.InProcess = false; Index: clang/lib/CodeGen/CodeGenModule.cpp =================================================================== --- clang/lib/CodeGen/CodeGenModule.cpp +++ clang/lib/CodeGen/CodeGenModule.cpp @@ -61,7 +61,7 @@ #include "llvm/Support/ConvertUTF.h" #include "llvm/Support/ErrorHandling.h" #include "llvm/Support/MD5.h" -#include "llvm/Support/TimeProfiler.h" +#include "llvm/Support/Tracing.h" using namespace clang; using namespace CodeGen; @@ -2869,13 +2869,8 @@ if (!shouldEmitFunction(GD)) return; - llvm::TimeTraceScope TimeScope("CodeGen Function", [&]() { - std::string Name; - llvm::raw_string_ostream OS(Name); - FD->getNameForDiagnostic(OS, getContext().getPrintingPolicy(), - /*Qualified=*/true); - return Name; - }); + LLVM_TRACE_SCOPE("CodeGen Function", + StringRef(FD->getQualifiedNameAsString())); if (const auto *Method = dyn_cast<CXXMethodDecl>(D)) { // Make sure to emit the definition(s) before we emit the thunks. 
Index: clang/lib/CodeGen/CodeGenAction.cpp
===================================================================
--- clang/lib/CodeGen/CodeGenAction.cpp
+++ clang/lib/CodeGen/CodeGenAction.cpp
@@ -39,9 +39,9 @@
 #include "llvm/Pass.h"
 #include "llvm/Support/MemoryBuffer.h"
 #include "llvm/Support/SourceMgr.h"
-#include "llvm/Support/TimeProfiler.h"
 #include "llvm/Support/Timer.h"
 #include "llvm/Support/ToolOutputFile.h"
+#include "llvm/Support/Tracing.h"
 #include "llvm/Support/YAMLTraits.h"
 #include "llvm/Transforms/IPO/Internalize.h"

@@ -272,7 +272,7 @@

     void HandleTranslationUnit(ASTContext &C) override {
       {
-        llvm::TimeTraceScope TimeScope("Frontend");
+        LLVM_TRACE_SCOPE("Frontend", StringRef(""));
         PrettyStackTraceString CrashInfo("Per-file LLVM IR generation");
         if (FrontendTimesIsEnabled) {
           LLVMIRGenerationRefCount += 1;
Index: clang/lib/CodeGen/CMakeLists.txt
===================================================================
--- clang/lib/CodeGen/CMakeLists.txt
+++ clang/lib/CodeGen/CMakeLists.txt
@@ -109,3 +109,9 @@
   clangLex
   clangSerialization
   )
+
+if(PERFETTO)
+  target_link_libraries(clangCodeGen
+    PRIVATE perfetto
+  )
+endif()
Index: clang/lib/CodeGen/CGDebugInfo.cpp
===================================================================
--- clang/lib/CodeGen/CGDebugInfo.cpp
+++ clang/lib/CodeGen/CGDebugInfo.cpp
@@ -46,7 +46,7 @@
 #include "llvm/Support/FileSystem.h"
 #include "llvm/Support/MD5.h"
 #include "llvm/Support/Path.h"
-#include "llvm/Support/TimeProfiler.h"
+#include "llvm/Support/Tracing.h"

 using namespace clang;
 using namespace clang::CodeGen;
@@ -3107,12 +3107,7 @@
   if (Ty.isNull())
     return nullptr;

-  llvm::TimeTraceScope TimeScope("DebugType", [&]() {
-    std::string Name;
-    llvm::raw_string_ostream OS(Name);
-    Ty.print(OS, getPrintingPolicy());
-    return Name;
-  });
+  LLVM_TRACE_SCOPE("DebugType", StringRef(Ty.getAsString(getPrintingPolicy())));

   // Unwrap the type as needed for debug information.
   Ty = UnwrapTypeForDebugInfo(Ty, CGM.getContext());
@@ -3851,14 +3846,12 @@
   if (!D)
     return;

-  llvm::TimeTraceScope TimeScope("DebugFunction", [&]() {
-    std::string Name;
-    llvm::raw_string_ostream OS(Name);
-    if (const NamedDecl *ND = dyn_cast<NamedDecl>(D))
-      ND->getNameForDiagnostic(OS, getPrintingPolicy(),
-                               /*Qualified=*/true);
-    return Name;
-  });
+  LLVM_TRACE_SCOPE(
+      "DebugFunction",
+      dyn_cast_or_null<NamedDecl>(D)
+          ? StringRef(
+                dyn_cast_or_null<NamedDecl>(D)->getQualifiedNameAsString())
+          : StringRef(""));

   llvm::DINode::DIFlags Flags = llvm::DINode::FlagZero;
   llvm::DIFile *Unit = getOrCreateFile(Loc);
@@ -4651,13 +4644,8 @@
   assert(CGM.getCodeGenOpts().hasReducedDebugInfo());
   if (VD->hasAttr<NoDebugAttr>())
     return;
-  llvm::TimeTraceScope TimeScope("DebugConstGlobalVariable", [&]() {
-    std::string Name;
-    llvm::raw_string_ostream OS(Name);
-    VD->getNameForDiagnostic(OS, getPrintingPolicy(),
-                             /*Qualified=*/true);
-    return Name;
-  });
+  LLVM_TRACE_SCOPE("DebugConstGlobalVariable",
+                   StringRef(VD->getQualifiedNameAsString()));

   auto Align = getDeclAlignIfRequired(VD, CGM.getContext());
   // Create the descriptor for the variable.
Index: clang/lib/CodeGen/BackendUtil.cpp
===================================================================
--- clang/lib/CodeGen/BackendUtil.cpp
+++ clang/lib/CodeGen/BackendUtil.cpp
@@ -45,9 +45,9 @@
 #include "llvm/Support/MemoryBuffer.h"
 #include "llvm/Support/PrettyStackTrace.h"
 #include "llvm/Support/TargetRegistry.h"
-#include "llvm/Support/TimeProfiler.h"
 #include "llvm/Support/Timer.h"
 #include "llvm/Support/ToolOutputFile.h"
+#include "llvm/Support/Tracing.h"
 #include "llvm/Support/raw_ostream.h"
 #include "llvm/Target/TargetMachine.h"
 #include "llvm/Target/TargetOptions.h"
@@ -939,7 +939,7 @@

   {
     PrettyStackTraceString CrashInfo("Per-function optimization");
-    llvm::TimeTraceScope TimeScope("PerFunctionPasses");
+    LLVM_TRACE_SCOPE("PerFunctionPasses", StringRef(""));

     PerFunctionPasses.doInitialization();
     for (Function &F : *TheModule)
@@ -950,13 +950,13 @@

   {
     PrettyStackTraceString CrashInfo("Per-module optimization passes");
-    llvm::TimeTraceScope TimeScope("PerModulePasses");
+    LLVM_TRACE_SCOPE("PerModulePasses", StringRef(""));
     PerModulePasses.run(*TheModule);
   }

   {
     PrettyStackTraceString CrashInfo("Code generation");
-    llvm::TimeTraceScope TimeScope("CodeGenPasses");
+    LLVM_TRACE_SCOPE("CodeGenPasses", StringRef(""));
     CodeGenPasses.run(*TheModule);
   }

@@ -1630,7 +1630,7 @@
                               BackendAction Action,
                               std::unique_ptr<raw_pwrite_stream> OS) {

-  llvm::TimeTraceScope TimeScope("Backend");
+  LLVM_TRACE_SCOPE("Backend", StringRef(""));

   std::unique_ptr<llvm::Module> EmptyModule;
   if (!CGOpts.ThinLTOIndexFile.empty()) {
Index: clang/include/clang/Frontend/FrontendOptions.h
===================================================================
--- clang/include/clang/Frontend/FrontendOptions.h
+++ clang/include/clang/Frontend/FrontendOptions.h
@@ -248,6 +248,9 @@
   /// Output time trace profile.
   unsigned TimeTrace : 1;

+  /// Output Perfetto trace profile.
+  unsigned Perfetto : 1;
+
   /// Show the -version text.
   unsigned ShowVersion : 1;

@@ -447,14 +450,15 @@
 public:
   FrontendOptions()
       : DisableFree(false), RelocatablePCH(false), ShowHelp(false),
-        ShowStats(false), ShowTimers(false), TimeTrace(false),
+        ShowStats(false), ShowTimers(false), TimeTrace(false), Perfetto(false),
         ShowVersion(false), FixWhatYouCan(false), FixOnlyWarnings(false),
         FixAndRecompile(false), FixToTemporaries(false),
         ARCMTMigrateEmitARCErrors(false), SkipFunctionBodies(false),
         UseGlobalModuleIndex(true), GenerateGlobalModuleIndex(true),
         ASTDumpDecls(false), ASTDumpLookups(false),
         BuildingImplicitModule(false), ModulesEmbedAllFiles(false),
-        IncludeTimestamps(true), UseTemporary(true), TimeTraceGranularity(500) {}
+        IncludeTimestamps(true), UseTemporary(true), TimeTraceGranularity(500) {
+  }

   /// getInputKindForExtension - Return the appropriate input kind for a file
   /// extension. For example, "c" would return Language::C.
Index: clang/include/clang/Driver/Options.td
===================================================================
--- clang/include/clang/Driver/Options.td
+++ clang/include/clang/Driver/Options.td
@@ -2668,6 +2668,8 @@
 def pass_exit_codes : Flag<["-", "--"], "pass-exit-codes">, Flags<[Unsupported]>;
 def pedantic_errors : Flag<["-", "--"], "pedantic-errors">, Group<pedantic_Group>, Flags<[CC1Option]>;
 def pedantic : Flag<["-", "--"], "pedantic">, Group<pedantic_Group>, Flags<[CC1Option]>;
+def perfetto : Flag<["-"], "perfetto">, Flags<[HelpHidden, CoreOption, CC1Option]>, Group<f_Group>,
+  HelpText<"Enable Perfetto tracing and output trace data to the provided file">;
 def pg : Flag<["-"], "pg">, HelpText<"Enable mcount instrumentation">, Flags<[CC1Option]>;
 def pipe : Flag<["-", "--"], "pipe">,
   HelpText<"Use pipes between commands, when possible">;
Index: clang/include/clang/Basic/CodeGenOptions.def
===================================================================
--- clang/include/clang/Basic/CodeGenOptions.def
+++ clang/include/clang/Basic/CodeGenOptions.def
@@ -241,6 +241,7 @@
 CODEGENOPT(TimePasses , 1, 0) ///< Set when -ftime-report is enabled.
 CODEGENOPT(TimeTrace , 1, 0) ///< Set when -ftime-trace is enabled.
 VALUE_CODEGENOPT(TimeTraceGranularity, 32, 500) ///< Minimum time granularity (in microseconds),
                                                ///< traced by time profiler
+CODEGENOPT(Perfetto , 1, 0) ///< Set when -perfetto is enabled.
 CODEGENOPT(UnrollLoops , 1, 0) ///< Control whether loops are unrolled.
 CODEGENOPT(RerollLoops , 1, 0) ///< Control whether loops are rerolled.
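
The call sites in this diff use LLVM_TRACE_SCOPE from llvm/Support/Tracing.h, whose definition is not part of the hunks shown here. As a rough illustration only, an RAII scope macro of that shape could look like the sketch below. This is not the patch's implementation and it does not call the Perfetto API: ScopedTrace, TraceEvent, traceBuffer, SKETCH_TRACE_SCOPE, qualifiedName and emitFunction are all hypothetical names, and the in-memory vector merely stands in for a tracing backend. The sketch copies the detail string on entry, on the assumption that call sites may pass a view of a temporary (as StringRef(FD->getQualifiedNameAsString()) does) that is destroyed once the statement ends.

```cpp
#include <chrono>
#include <string>
#include <utility>
#include <vector>

// Hypothetical record for one completed scope (not a Perfetto type).
struct TraceEvent {
  std::string Name;
  std::string Detail;
  long long DurationUs;
};

// Hypothetical in-memory buffer standing in for a real tracing backend.
inline std::vector<TraceEvent> &traceBuffer() {
  static std::vector<TraceEvent> Buffer;
  return Buffer;
}

// RAII helper: records the start time on construction and appends an
// event with the elapsed time when the enclosing scope exits.
class ScopedTrace {
public:
  ScopedTrace(std::string Name, std::string Detail)
      : Name(std::move(Name)), Detail(std::move(Detail)),
        Start(std::chrono::steady_clock::now()) {}

  ~ScopedTrace() {
    auto End = std::chrono::steady_clock::now();
    auto Us = std::chrono::duration_cast<std::chrono::microseconds>(End - Start);
    traceBuffer().push_back(
        {Name, Detail, static_cast<long long>(Us.count())});
  }

  ScopedTrace(const ScopedTrace &) = delete;
  ScopedTrace &operator=(const ScopedTrace &) = delete;

private:
  std::string Name;   // owned copy, safe even if the caller passed a temporary
  std::string Detail; // owned copy, same reason
  std::chrono::steady_clock::time_point Start;
};

// Give each scope object a unique variable name so that several scopes can
// coexist in one function.
#define SKETCH_CONCAT_IMPL(A, B) A##B
#define SKETCH_CONCAT(A, B) SKETCH_CONCAT_IMPL(A, B)
#define SKETCH_TRACE_SCOPE(NAME, DETAIL)                                       \
  ScopedTrace SKETCH_CONCAT(SketchTraceScope_, __LINE__)(NAME, DETAIL)

// Hypothetical stand-in for FD->getQualifiedNameAsString().
inline std::string qualifiedName() { return "ns::emitFunction"; }

// Example: an outer scope with a nested inner scope.
inline void emitFunction() {
  SKETCH_TRACE_SCOPE("CodeGen Function", qualifiedName());
  {
    SKETCH_TRACE_SCOPE("DebugType", "int");
  }
}
```

Calling emitFunction() once appends two events to traceBuffer(): the nested "DebugType" scope first, because it closes first, and then "CodeGen Function" with the detail string "ns::emitFunction". A real backend would presumably also check whether tracing is enabled before building the detail string, which the lambda-based llvm::TimeTraceScope calls removed by this diff allowed.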