================
@@ -9427,3 +9427,37 @@ diagnostics with code like:
   __attribute__((nonstring)) char NotAStr[3] = "foo"; // Not diagnosed
   }];
 }
+
+def ModularFormatDocs : Documentation {
+  let Category = DocCatFunction;
+  let Content = [{
+The ``modular_format`` attribute can be applied to a function that bears the
+``format`` attribute (or standard library functions) to indicate that the
+implementation is modular on the format string argument. When the format string
+for a given call is constant, the compiler may redirect the call to the symbol
+given as the first argument to the attribute (the modular implementation
+function).
+
+The second argument is a implementation name, and the remaining arguments are
+aspects of the format string for the compiler to report. If the compiler does
+not understand a aspect, it must summarily report that the format string has
+that aspect.
+
+The compiler reports an aspect by issing a relocation for the symbol
+``<impl_name>_<aspect>``. This arranges for code and data needed to support the
+aspect of the implementation to be brought into the link to satisfy weak
+references in the modular implemenation function.
+
+For example, say ``printf`` is annotated with
+``modular_format(__modular_printf, __printf, float)``. Then, a call to
+``printf(var, 42)`` would be untouched. A call to ``printf("%d", 42)`` would
+become a call to ``__modular_printf`` with the same arguments, as would
----------------
mysterymath wrote:

> My concern is more about dispatching in ways the user may not anticipate and 
> getting observably different behavior. e.g., the user calls `printf("%I64d", 
> 0LL)` and they were getting the MSVC CRT `printf` call which supported that 
> modifier but now calls `__modular_printf` which doesn't know about the 
> modifier. What happens in that kind of situation?

Ah, if I understand what you're getting at, that can't happen; it's explicitly 
out of scope for the feature.

The `modular_format` attribute exists to advertise to compiler that is 
compiling calls to a function that the implementation can be split by 
redirecting calls and emitting relocs to various symbols. The only plausible 
mechanism to do so would be a header file, and that means that the header would 
need to be provided by and intrinsically tied to a specific version of the 
implementation. Otherwise, it would be impossible to determine what aspects the 
implementation requires to be emitted to function correctly.

Accordingly, this feature would primarily be useful for cases where libc is 
statically linked in and paired with its own headers. (llvm-libc, various 
embedded libcs, etc.) I suppose it's technically possible to break out printf 
implementation parts into a family of individual dynamic libraries, but even 
then, any libc header set that required that the libc implementation be 
dynamically replaceable would not be able to include `modular_format`.

So, for implementations that use this feature, `printf` and `__modular_printf` 
would always be designed together. To avoid ever introducing two full `printf` 
implementations into the link, `printf` would be a thin wrapper around 
`__modular_printf` that also requests every possible aspect of the 
implementation. This would mean that the two could never diverge.


As an aside, this is my first time landing a RFC across so many components of 
LLVM. I wasn't sure how much detail to include in each change; my intuition was 
to try to provide links to the RFC instead. I don't want the above reasoning to 
 get buried, and it gives me pause that it wasn't readily accessible. But I'm 
also not entirely sure where it should live going forward. Advice would be 
appreciated.


https://github.com/llvm/llvm-project/pull/147431
_______________________________________________
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits

Reply via email to