snehasish wrote:

Thanks for sharing the draft. Here are my thoughts --
1. The Darwin sanitizer changes are numerous but, as you noted, they are a matter of following what asan does today. I don't think there are any fundamental challenges to making it work (though I am not a CMake expert). For the commented-out interceptors, we should be able to achieve the same outcome through the use of `SANITIZER_APPLE` macros. I noted some in the draft that you shared.

2. Raw profile format: I suggest we incorporate the binary address data as a separate section in the existing format without any additional changes to the contents. I think this can be achieved with the following changes --

```
// Format
// ---------- Header
// Magic
// Version              <-- Increment version
// Total Size
// Segment Offset
// MIB Info Offset
// Stack Offset
// MemAddressOffset     <-- New field added to header
// ---------- Segment Info
... Existing Content ...
// ----------- MemAddress
NumMemBlockAddresses    <-- How many u64 fields to read (itself a u64)
u64 MemBlockAddress1;
u64 MemBlockAddress2;
```

The segment entries can be shared. On ELF, the main binary entry is the first one by convention (first match in the PT_LOAD segment), though we can add additional handling for the main ".app" module if necessary.

3. llvm-profdata changes -- The prerequisites of this step are the raw profile and the profiled binary. Here our output requirements are different: we would like to generate an indexed profile which includes static data addresses as part of the PGHO section. How about the following approach --

a. Use llvm-profdata to convert the raw binary access profile to an indexed profile.
b. Use llvm-profdata show to dump the indexed profile in YAML format (including binary data).
c. Use a Python script to convert the YAML output to an order file (not part of LLVM tooling).

The rationale for this approach is that we keep the llvm-profdata usage model the same, i.e. it merges raw profiles to produce indexed profiles and provides utilities to show and convert file formats.
This also avoids teaching llvm-profdata about order file formats.

For (a) above, Mingming has already implemented the ability to specify data access profiles using symbols or content hashes, so it should be straightforward to implement.

4. Shadow memory granularity -- Yes, we should reuse the histogram granularity which was added by @mattweingarten. The "memprof-histogram" flag controls the shadow mapping [during instrumentation](http://google3/third_party/llvm/llvm-project/llvm/lib/Transforms/Instrumentation/MemProfInstrumentation.cpp?l=159). We'll need to refactor it a bit to decouple the shadow mapping from the histogram implementation. Adding a new flag to imply the finer granularity without enabling histogram collection seems like a reasonable approach. Emitting full histogram information results in very large raw profiles, so it's best to keep the two decoupled.

Let me know your thoughts and how you would like to proceed. Thanks!

https://github.com/llvm/llvm-project/pull/142884
_______________________________________________
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
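
To make the layout suggested in point 2 concrete, here is a rough Python sketch of how a reader could locate and decode the new MemAddress section. It is only an illustration under assumptions: the header is modeled as seven little-endian u64 fields in the order listed in the format sketch, and the names `HEADER_FIELDS`, `pack_mem_addresses`, `build_toy_profile` and `read_profile_addresses` are hypothetical helpers, not part of compiler-rt or llvm-profdata. The real raw header may contain different fields or padding.

```python
import struct

# Assumption: the header holds exactly these seven u64 fields, little-endian,
# in the order shown in the format sketch. The real MemProf header may differ.
HEADER_FIELDS = (
    "magic",
    "version",
    "total_size",
    "segment_offset",
    "mib_info_offset",
    "stack_offset",
    "mem_address_offset",  # the newly proposed header field
)
HEADER = struct.Struct("<7Q")


def pack_mem_addresses(addresses):
    """Serialize the MemAddress section: a u64 count, then that many u64s."""
    return struct.pack(f"<Q{len(addresses)}Q", len(addresses), *addresses)


def build_toy_profile(existing_content, addresses, magic=0, version=0):
    """Build a toy buffer: header, opaque existing content, MemAddress section.

    The segment, MIB and stack offsets are left as 0 because this sketch
    treats the existing sections as an opaque blob.
    """
    mem_address_offset = HEADER.size + len(existing_content)
    section = pack_mem_addresses(addresses)
    total_size = mem_address_offset + len(section)
    header = HEADER.pack(magic, version, total_size, 0, 0, 0, mem_address_offset)
    return header + existing_content + section


def read_profile_addresses(buf):
    """Follow MemAddressOffset from the header and decode the address list."""
    header = dict(zip(HEADER_FIELDS, HEADER.unpack_from(buf, 0)))
    offset = header["mem_address_offset"]
    (count,) = struct.unpack_from("<Q", buf, offset)
    return list(struct.unpack_from(f"<{count}Q", buf, offset + 8))


if __name__ == "__main__":
    toy = build_toy_profile(b"\x00" * 16, [0x1000, 0x2008])
    print([hex(a) for a in read_profile_addresses(toy)])
```

Because the new section is reached only through `MemAddressOffset`, the existing sections keep their current encoding, which is the point of adding it as a separate section and bumping the version.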