[clang] [compiler-rt] [llvm] [DRAFT][memprof][darwin] Support memprof on Darwin platform and add binary access profile (PR #142884)

Snehasish Kumar via cfe-commits Fri, 06 Jun 2025 16:29:10 -0700

snehasish wrote:

Thanks for sharing the draft, here are my thoughts --

1. The Darwin sanitizer changes are numerous but, as you noted, a matter of
following what asan does today. I don't think there are any fundamental
challenges to making it work (though I am not a CMake expert). For the
commented-out interceptors, we should be able to achieve the same outcome
through the use of SANITIZER_APPLE macros. I noted some in the draft that you
shared.

2. Raw profile format: I suggest we incorporate the binary address data as a
separate section in the existing format without any additional changes to the
contents. I think this can be achieved with the following changes ---

```
// Format
// ---------- Header
// Magic
// Version            <-- Increment version
// Total Size
// Segment Offset
// MIB Info Offset
// Stack Offset
// MemAddressOffset   <-- New field added to header
// ---------- Segment Info
// ... Existing Content ...
// ---------- MemAddress
// u64 NumMemBlockAddresses;   <-- How many u64 fields to read
// u64 MemBlockAddress1;
// u64 MemBlockAddress2;
// ...
```

The segment entries can be shared. On ELF, the main binary entry is the first
one by convention (first match in the PT_LOAD segment) though we can add
additional handling for the main ".app" module if necessary.
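To make the proposed layout concrete, here is a minimal Python sketch of a reader for the new section. It assumes that every header field is a little-endian u64 in the order listed above and that the section is a u64 count followed by that many u64 addresses. The function and field names are hypothetical, and `build_example` only fabricates a toy buffer for illustration; it does not produce a real memprof raw profile.

```python
import struct

# Assumed layout: seven little-endian u64 header fields, in the order listed
# in the format sketch. This is an illustration, not the real profile reader.
HEADER = struct.Struct("<7Q")
HEADER_FIELDS = (
    "magic", "version", "total_size", "segment_offset",
    "mib_offset", "stack_offset", "mem_address_offset",
)


def read_mem_block_addresses(buf: bytes) -> list[int]:
    """Return the u64 addresses stored in the proposed MemAddress section."""
    header = dict(zip(HEADER_FIELDS, HEADER.unpack_from(buf, 0)))
    offset = header["mem_address_offset"]
    # The section starts with a u64 count, followed by that many u64 values.
    (count,) = struct.unpack_from("<Q", buf, offset)
    return list(struct.unpack_from(f"<{count}Q", buf, offset + 8))


def build_example(addresses: list[int]) -> bytes:
    """Build a toy buffer: header, no existing content, MemAddress section."""
    section = struct.pack(f"<Q{len(addresses)}Q", len(addresses), *addresses)
    total = HEADER.size + len(section)
    # Magic and version are placeholders (0). All offsets point just past the
    # header because the toy buffer has no segment, MIB, or stack data.
    header = HEADER.pack(0, 0, total, HEADER.size, HEADER.size,
                         HEADER.size, HEADER.size)
    return header + section
```

Since the existing sections are located through their own header offsets, a reader that ignores `mem_address_offset` should still be able to find them, which is the point of adding the data as a separate section.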

3. llvm-profdata changes -- The prerequisites of this step are the raw profile
and the profiled binary. Here our output requirements are different. We would
like to generate an indexed profile which includes static data addresses as
part of the PGHO section. How about the following approach --
a. Use llvm-profdata to convert the raw binary access profile to an indexed
profile
b. Use llvm-profdata show to dump the indexed profile in YAML format (incl.
binary data)
c. Use a Python script to convert the YAML output to an order file (not part of
LLVM tooling)
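
As a rough idea of what the script in step (c) could look like, here is a minimal sketch. The YAML schema for binary data is not defined yet, so the keys `records`, `symbol`, and `access_count` are purely hypothetical placeholders. The input is the dict that a YAML loader (for example PyYAML's `yaml.safe_load`) would return for the dumped profile.

```python
def to_order_file(profile: dict) -> str:
    """Emit one symbol per line, hottest first, skipping duplicates.

    All key names used here are hypothetical; the real names would come from
    whatever schema `llvm-profdata show` ends up emitting.
    """
    records = profile.get("records", [])
    # sorted() is stable, so records with equal counts keep their input order.
    ranked = sorted(records, key=lambda r: r["access_count"], reverse=True)
    seen = set()
    lines = []
    for record in ranked:
        symbol = record["symbol"]
        if symbol not in seen:
            seen.add(symbol)
            lines.append(symbol)
    return "".join(symbol + "\n" for symbol in lines)
```

Keeping this outside of LLVM means the ordering policy (sort key, deduplication, filtering) can change without touching llvm-profdata.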

The rationale for this approach is that we keep the llvm-profdata usage model
the same, i.e. it merges raw profiles to produce indexed profiles and provides
utilities to show and convert file formats. This also avoids teaching
llvm-profdata about order file formats. For (a) above, Mingming has already
implemented the ability to specify data access profiles using symbols or
content hashes so it should be straightforward to implement.

4. Shadow memory granularity -- Yes, we should reuse the histogram granularity
which was added by @mattweingarten. The "memprof-histogram" flag controls the
shadow mapping [during
instrumentation](http://google3/third_party/llvm/llvm-project/llvm/lib/Transforms/Instrumentation/MemProfInstrumentation.cpp?l=159).
We'll need to refactor it a bit to decouple the shadow mapping from the
histogram implementation. Adding a new flag to imply the finer granularity
without enabling histogram collection seems like a reasonable approach.
Emitting full histogram information results in very large raw profiles so it's
best to keep them decoupled.
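
To illustrate the decoupling, here is a small Python sketch of the idea. The granularity values (64 bytes by default, 8 bytes in histogram mode) are my assumption for this example, and `fine_shadow` stands in for the new flag, which does not exist yet. The shadow base offset is left out, so this only models which addresses share a shadow cell.

```python
DEFAULT_GRANULARITY = 64  # assumed default bytes per shadow cell
FINE_GRANULARITY = 8      # assumed histogram-mode bytes per shadow cell


def pick_granularity(histogram: bool, fine_shadow: bool = False) -> int:
    """Use the finer mapping if histograms or the hypothetical flag is on."""
    if histogram or fine_shadow:
        return FINE_GRANULARITY
    return DEFAULT_GRANULARITY


def shadow_cell(addr: int, granularity: int) -> int:
    """Index of the shadow cell covering addr (shadow base offset omitted)."""
    return addr // granularity
```

With these assumed values, two globals at 0x1000 and 0x1010 land in the same cell at the coarse granularity but in different cells at the fine one, so only the fine mapping can tell their accesses apart. `pick_granularity(histogram=False, fine_shadow=True)` models getting that precision without paying for histogram output.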

Let me know your thoughts and how you would like to proceed. Thanks!

https://github.com/llvm/llvm-project/pull/142884
_______________________________________________
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
