On Tue, Sep 09, 2025 at 08:01:59AM +0200, Martin Uecker wrote:
> Other things this approach may break are using an enum on the one
> side and the integer type which it should be compatible to on the
> other.

This should be a KCFI mismatch: both sides need to use the enum. In
fact, while working on this, I found a recent mistake in Linux:
https://lore.kernel.org/linux-hardening/20250829190721.it.373-k...@kernel.org/

> Maybe also check whether calling functions via pointers to incomplete
> structure types vs function where the struct is complete or vice versa.

We want the name of the struct, so incomplete structure types should be
fine.

> > One thing for sure we need to settle on is a common hash and (AIUI)
> > LLVM would like to drop the hash KCFI is currently using (KCFI is the
> > last user of it). As I mentioned elsewhere, the hash doesn't need to be
> > cryptographically secure, so FNV-1a could very well be sufficient.
> >
> > As for https://github.com/llvm/llvm-project/issues/106593, my GCC
> > implementation actually doesn't exhibit this behavior since it uses the
> > typedef name instead already (I had other places where the behavior was
> > mismatched and the logic I ended up with turned out to be slightly more
> > generalized than LLVM's). (I will need to fix this in LLVM.) I test for
> > it in the type mangling tests already:
>
> Note that is also not entirely correct for C. It is allowed
> to call this via different typedef names across TU,
>
> It would be possible to normalize the type before (or during) mangling
> in a way that it works correctly for all C at the cost of precision.
>
> I think this would be desirable because otherwise this could cause
> surprises for users, and with it conforming to C, it could be used
> more easily also by other projects.

We want to be able to distinguish separate function types, which may be
more strict than "conforming C" (e.g. enum vs int: we _want_ this to be
a mismatch).

> The cost is that is a bit less precise.

This isn't a feature someone just slaps in place and runs away from. :)
We spent literal years getting Linux's function prototypes sorted out.
There was a lot of mess that needed cleaning up.

> One issue is also that future C changes could break this again,
> but this is an even bigger risk with the current scheme.
>
> >
> > /* Anonymous struct via typedef - should get typedef name as struct name. */
> > typedef struct { int anon_member_1; } anon_typedef_1;
> > typedef struct { int anon_member_2; } anon_typedef_2;
> > extern void func_anon_typedef_1(anon_typedef_1 *param); /* Should use typedef name. */
> > extern void func_anon_typedef_2(anon_typedef_2 *param); /* Should be different from anon_typedef_1 */
> > ...
> > /* { dg-final { scan-tree-dump {KCFI type ID: mangled='_ZTSFvP14anon_typedef_1E' typeid=0x55475a23} kcfi0 } } */
> > /* { dg-final { scan-tree-dump {KCFI type ID: mangled='_ZTSFvP14anon_typedef_2E' typeid=0x454f8fb8} kcfi0 } } */
> >
> > And wow would I like a better way to test this stuff. It's really
> > painful doing it via forcing KCFI address-taking and then finding the
> > typeid in the dump file.
>
> Why not have a builtin that returns the hash for a type?

If this would be acceptable, sure. I would like to be able to examine
the _string_, though, since the hash is just a way to smash the string
into a u32.

-Kees

-- 
Kees Cook