On Mon, Sep 08, 2025 at 04:36:57PM -0700, Andrew Pinski wrote: > On Mon, Sep 8, 2025 at 4:24 PM Kees Cook <k...@kernel.org> wrote: > > > > On Tue, Sep 09, 2025 at 12:13:19AM +0200, Martin Uecker wrote: > > > Sorry, example should have been this: > > > > > > typedef int arr_t[]; > > > typedef int arr3_t[3]; > > > > > > void f(arr3_t*); > > > > > > void g(void (*fp)(arr_t*)) { > > > > > > int a[3]; > > > (*fp)(&a); // this call would fail? > > > } > > > > > > int h() > > > { > > > g(&f); > > > } > > > > Yes, that would fail. It's a pointer to an array of 3 ints, and that's > > fine for distinguishing mismatched callers: > > > > void(arr3_t*) -> void(int(*)[3]) -> _ZTSFvPA3_iE > > void(arr_t*) -> void(int(*)[]) -> _ZTSFvPA_iE > > > > How should I adjust this patch's description to say the mangling may > > follow a stricter type system? > > Maybe it is time to step back for a second and figure out other things > are not going to work with this mangling scheme? At the bare minimum > write down limitations to the user documentation (and write up a bug > report to LLVM for them to do the same).
Since the mangler is (now) designed to be only used by KCFI, where should I write this up? Putting type examples in invoke.texi seems out of place. Should it be comments in mangle.cc? > Or even better step back and fix the hashing scheme to fit better with > C rather than reusing IA64 C++ ABI mangling (which is not designed for > C). If we need a hard ABI break due to limitations, it is good to do > it now rather than having 2 different implementations that need to > change upstream; right now you only need to break LLVM's ABI. One thing for sure we need to settle on is a common hash and (AIUI) LLVM would like to drop the hash KCFI is currently using (KCFI is the last user of it). As I mentioned elsewhere, the hash doesn't need to be cryptographically secure, so FNV-1a could very well be sufficient. As for https://github.com/llvm/llvm-project/issues/106593, my GCC implementation actually doesn't exhibit this behavior since it uses the typedef name instead already (I had other places where the behavior was mismatched and the logic I ended up with turned out to be slightly more generalized than LLVM's). (I will need to fix this in LLVM.) I test for it in the type mangling tests already: /* Anonymous struct via typedef - should get typedef name as struct name. */ typedef struct { int anon_member_1; } anon_typedef_1; typedef struct { int anon_member_2; } anon_typedef_2; extern void func_anon_typedef_1(anon_typedef_1 *param); /* Should use typedef name. */ extern void func_anon_typedef_2(anon_typedef_2 *param); /* Should be different from anon_typedef_1 */ ... /* { dg-final { scan-tree-dump {KCFI type ID: mangled='_ZTSFvP14anon_typedef_1E' typeid=0x55475a23} kcfi0 } } */ /* { dg-final { scan-tree-dump {KCFI type ID: mangled='_ZTSFvP14anon_typedef_2E' typeid=0x454f8fb8} kcfi0 } } */ And wow would I like a better way to test this stuff. It's really painfully doing it via forcing KCFI address-taking and then finding the typeid in the dump file. -Kees -- Kees Cook