On Tue, Sep 09, 2025 at 08:01:59AM +0200, Martin Uecker wrote:
> Other things this approach may break are using an enum on the one
> side and the integer type it should be compatible with on the
> other.

This should be a KCFI mismatch: both sides need to use the enum. In
fact, while working on this, I found a recent mistake in Linux:
https://lore.kernel.org/linux-hardening/20250829190721.it.373-k...@kernel.org/

> Maybe also check calling functions via pointers to incomplete
> structure types vs. functions where the struct is complete, or vice versa.

We want the name of the struct, so incomplete structure types should be
fine.

> > One thing for sure we need to settle on is a common hash and (AIUI)
> > LLVM would like to drop the hash KCFI is currently using (KCFI is the
> > last user of it). As I mentioned elsewhere, the hash doesn't need to be
> > cryptographically secure, so FNV-1a could very well be sufficient.
> > 
> > As for https://github.com/llvm/llvm-project/issues/106593, my GCC
> > implementation actually doesn't exhibit this behavior since it uses the
> > typedef name instead already (I had other places where the behavior was
> > mismatched and the logic I ended up with turned out to be slightly more
> > generalized than LLVM's). (I will need to fix this in LLVM.) I test for
> > it in the type mangling tests already:
> 
> Note that is also not entirely correct for C.  It is allowed
> to call this via different typedef names across TU,
> 
> 
> It would be possible to normalize the type before (or during) mangling
> in a way that it works correctly for all C at a cost of precision.
> 
> I think this would be desirable because otherwise this could cause
> surprises for users, and with it conforming to C, it could be used
> more easily also by other projects.

We want to be able to distinguish separate function types which may be
more strict than "conforming C" (e.g. enum vs int: we _want_ this to be
a mismatch.)
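
To make the enum vs int case concrete, here is a minimal sketch. The
names (enum color, takes_enum, enum_cb, and so on) are hypothetical and
only for illustration. The mangled strings in the comments are my
assumption of what an Itanium-style mangling like the one in the quoted
tests would produce, not output from either compiler.

```c
#include <stdio.h>

/* Hypothetical enum, used only for this illustration. */
enum color { RED, GREEN };

/* Two callees that differ only in enum vs int for the parameter. */
static int takes_enum(enum color c) { return (int)c + 1; }
static int takes_int(int v) { return v + 1; }

/*
 * Two function pointer types.  Under the stricter-than-C scheme
 * described above they would get different KCFI type IDs, assumed to
 * be derived from strings along the lines of "_ZTSFi5colorE" and
 * "_ZTSFiiE" respectively.
 */
typedef int (*enum_cb)(enum color);
typedef int (*int_cb)(int);

/* Matching calls: prototype and pointer type agree on both sides. */
int call_enum_cb(enum_cb cb, enum color c) { return cb(c); }
int call_int_cb(int_cb cb, int v) { return cb(v); }

/*
 * The case we want KCFI to reject is casting takes_enum to int_cb and
 * calling it through call_int_cb: the type ID at the call site would
 * not match the one emitted for the callee.
 */
```

The matching calls, call_enum_cb(takes_enum, GREEN) and
call_int_cb(takes_int, 41), are the pattern both sides need to follow.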

> The cost is that it is a bit less precise.

This isn't a feature someone just slaps in place and runs away from. :)
We spent literal years getting Linux's function prototypes sorted out.
There was a lot of mess that needed cleaning up.

> One issue is also that future C changes could break this again,
> but this is an even bigger risk with the current scheme.
> 
> > 
> > /* Anonymous struct via typedef - should get typedef name as struct name.  */
> > typedef struct { int anon_member_1; } anon_typedef_1;
> > typedef struct { int anon_member_2; } anon_typedef_2;
> > extern void func_anon_typedef_1(anon_typedef_1 *param); /* Should use typedef name.  */
> > extern void func_anon_typedef_2(anon_typedef_2 *param); /* Should be different from anon_typedef_1 */
> > ...
> > /* { dg-final { scan-tree-dump {KCFI type ID: mangled='_ZTSFvP14anon_typedef_1E' typeid=0x55475a23} kcfi0 } } */
> > /* { dg-final { scan-tree-dump {KCFI type ID: mangled='_ZTSFvP14anon_typedef_2E' typeid=0x454f8fb8} kcfi0 } } */
> > 
> > And wow would I like a better way to test this stuff. It's really
> > painful to do it by forcing KCFI address-taking and then finding the
> > typeid in the dump file.
> 
> Why not have a builtin that returns the hash for a type?

If this would be acceptable, sure. I would like to be able to examine
the _string_, though, since the hash is just a way to smash the string
into a u32.
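
As a rough sketch of what "smash the string into a u32" could look like
with FNV-1a (the candidate mentioned earlier in the thread), assuming
the standard 32-bit FNV-1a parameters. The function names fnv1a32 and
sketch_typeid are hypothetical, and this is not the hash that produced
the typeid values in the quoted tests, so the results will not match
those.

```c
#include <stddef.h>
#include <stdint.h>

/* Standard 32-bit FNV-1a parameters. */
#define FNV1A32_OFFSET_BASIS 0x811c9dc5u
#define FNV1A32_PRIME        0x01000193u

/* Hash a NUL-terminated string with 32-bit FNV-1a. */
uint32_t fnv1a32(const char *s)
{
	uint32_t hash = FNV1A32_OFFSET_BASIS;

	for (; *s; s++) {
		hash ^= (uint8_t)*s;	/* xor in the next byte first ... */
		hash *= FNV1A32_PRIME;	/* ... then multiply (mod 2^32) */
	}
	return hash;
}

/*
 * Hypothetical helper: map a mangled type string such as
 * "_ZTSFvP14anon_typedef_1E" to a 32-bit type ID.
 */
uint32_t sketch_typeid(const char *mangled)
{
	return fnv1a32(mangled);
}
```

Being able to see the mangled string is what makes a mismatch
debuggable: two different strings can be compared directly, while two
different u32 values only say that something differs.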

-Kees

-- 
Kees Cook
