On Mon, Sep 08, 2025 at 04:36:57PM -0700, Andrew Pinski wrote:
> On Mon, Sep 8, 2025 at 4:24 PM Kees Cook <k...@kernel.org> wrote:
> >
> > On Tue, Sep 09, 2025 at 12:13:19AM +0200, Martin Uecker wrote:
> > > Sorry, example should have been this:
> > >
> > > typedef int arr_t[];
> > > typedef int arr3_t[3];
> > >
> > > void f(arr3_t*);
> > >
> > > void g(void (*fp)(arr_t*)) {
> > >
> > >       int a[3];
> > >       (*fp)(&a);      // this call would fail?
> > > }
> > >
> > > int h()
> > > {
> > >   g(&f);
> > > }
> >
> > Yes, that would fail. It's a pointer to an array of 3 ints, and that's
> > fine for distinguishing mismatched callers:
> >
> >   void(arr3_t*) -> void(int(*)[3]) -> _ZTSFvPA3_iE
> >   void(arr_t*)  -> void(int(*)[])  -> _ZTSFvPA_iE
> >
> > How should I adjust this patch's description to say the mangling may
> > follow a stricter type system?
> 
> Maybe it is time to step back for a second and figure out other things
> are not going to work with this mangling scheme? At the bare minimum
> write down limitations to the user documentation (and write up a bug
> report to LLVM for them to do the same).

Since the mangler is (now) designed to be only used by KCFI, where
should I write this up? Putting type examples in invoke.texi seems out
of place. Should it be comments in mangle.cc?

> Or even better step back and fix the hashing scheme to fit better with
> C rather than reusing IA64 C++ ABI mangling (which is not designed for
> C). If we need a hard ABI break due to limitations, it is good to do
> it now rather than having 2 different implementations that need to
> change upstream; right now you only need to break LLVM's ABI.

One thing for sure we need to settle on is a common hash and (AIUI)
LLVM would like to drop the hash KCFI is currently using (KCFI is the
last user of it). As I mentioned elsewhere, the hash doesn't need to be
cryptographically secure, so FNV-1a could very well be sufficient.

As for https://github.com/llvm/llvm-project/issues/106593, my GCC
implementation actually doesn't exhibit this behavior since it uses the
typedef name instead already (I had other places where the behavior was
mismatched and the logic I ended up with turned out to be slightly more
generalized than LLVM's). (I will need to fix this in LLVM.) I test for
it in the type mangling tests already:

/* Anonymous struct via typedef - should get typedef name as struct name.  */
typedef struct { int anon_member_1; } anon_typedef_1;
typedef struct { int anon_member_2; } anon_typedef_2;
extern void func_anon_typedef_1(anon_typedef_1 *param); /* Should use typedef 
name.  */
extern void func_anon_typedef_2(anon_typedef_2 *param); /* Should be different 
from anon_typedef_1 */
...
/* { dg-final { scan-tree-dump {KCFI type ID: 
mangled='_ZTSFvP14anon_typedef_1E' typeid=0x55475a23} kcfi0 } } */
/* { dg-final { scan-tree-dump {KCFI type ID: 
mangled='_ZTSFvP14anon_typedef_2E' typeid=0x454f8fb8} kcfi0 } } */

And wow would I like a better way to test this stuff. It's really
painfully doing it via forcing KCFI address-taking and then finding the
typeid in the dump file.

-Kees

-- 
Kees Cook

Reply via email to