On 13 Nov 2023, at 03:10, Bruno Haible <br...@clisp.org> wrote:
>
> Jessica Clarke wrote:
>> I can see in your patches that you're using __CHERI__ as your ABI
>> detection macro. Unfortunately, this currently isn't quite right.
>> ...
>> we also have a hybrid ABI, which is
>> binary-compatible with non-CHERI code, treating all pointers as
>> traditional integer addresses, but with the ability to qualify them with
>> __capability to opt into them being capabilities (and (u)intcap_t for
>> the (u)intptr_t case).
>> ...
>> the
>> __CHERI__ macro is for detecting this latter case, i.e. just that you
>> have CHERI ISA features available (cc -march=morello -mabi=aapcs for
>> Morello)
>> ...
>> What you instead want is the (rather
>> cumbersome) __CHERI_PURE_CAPABILITY__ macro, which is specifically
>> testing for the pure-capability ABI, so hybrid code where pointers are
>> plain integer addresses continues to use the old code paths rather than
>> the new capability ones.
>
> Thanks for the advice. Indeed, I can see 3 ABIs on cfarm240:
>
> (1) "clang" - the traditional aarch64 ABI.
> (2) "clang -march=morello" == "clang -march=morello -mabi=aapcs"
>     - what you call "hybrid" mode.
> (3) "clang -march=morello -mabi=purecap" == "cc"
>
> Predefined symbol differences between (1) and (2):
>
> $ diff <(:|clang -E -dM -|LC_ALL=C sort) <(:|clang -march=morello -E -dM -|LC_ALL=C sort) | grep '^[<>]'
> > #define __ARM_CAP_PERMISSION_BRANCH_SEALED_PAIR__ 256
> > #define __ARM_CAP_PERMISSION_COMPARTMENT_ID__ 128
> > #define __ARM_CAP_PERMISSION_EXECUTIVE__ 2
> > #define __ARM_CAP_PERMISSION_MUTABLE_LOAD__ 64
> > #define __ARM_FEATURE_ATOMICS 1
> > #define __ARM_FEATURE_CRC32 1
> > #define __ARM_FEATURE_DOTPROD 1
> > #define __ARM_FEATURE_FP16_SCALAR_ARITHMETIC 1
> > #define __ARM_FEATURE_FP16_VECTOR_ARITHMETIC 1
> > #define __ARM_FEATURE_QRDMX 1
> > #define __CHERI_ADDRESS_BITS__ 64
> > #define __CHERI_CAPABILITY_WIDTH__ 128
> > #define __CHERI_CAP_PERMISSION_ACCESS_SYSTEM_REGISTERS__ 512
> > #define __CHERI_CAP_PERMISSION_GLOBAL__ 1
> > #define __CHERI_CAP_PERMISSION_PERMIT_EXECUTE__ 32768
> > #define __CHERI_CAP_PERMISSION_PERMIT_LOAD_CAPABILITY__ 16384
> > #define __CHERI_CAP_PERMISSION_PERMIT_LOAD__ 131072
> > #define __CHERI_CAP_PERMISSION_PERMIT_SEAL__ 2048
> > #define __CHERI_CAP_PERMISSION_PERMIT_STORE_CAPABILITY__ 8192
> > #define __CHERI_CAP_PERMISSION_PERMIT_STORE_LOCAL__ 4096
> > #define __CHERI_CAP_PERMISSION_PERMIT_STORE__ 65536
> > #define __CHERI_CAP_PERMISSION_PERMIT_UNSEAL__ 1024
> > #define __CHERI__ 1
> > #define __INTCAP_MAX__ 9223372036854775807L
> > #define __SIZEOF_CHERI_CAPABILITY__ 16
> > #define __SIZEOF_INTCAP__ 16
> > #define __SIZEOF_UINTCAP__ 16
> > #define __UINTCAP_MAX__ 18446744073709551615UL
>
> Predefined symbol differences between (2) and (3):
>
> $ diff <(:|clang -march=morello -E -dM -|LC_ALL=C sort) <(:|clang -march=morello -mabi=purecap -E -dM -|LC_ALL=C sort) | grep '^[<>]'
> > #define __ARM_FEATURE_C64 1
> > #define __CHERI_CAPABILITY_TABLE__ 3
> > #define __CHERI_CAPABILITY_TLS__ 1
> > #define __CHERI_PURE_CAPABILITY__ 2
> > #define __CHERI_SANDBOX__ 4
> < #define __INTPTR_FMTd__ "ld"
> < #define __INTPTR_FMTi__ "li"
> > #define __INTPTR_FMTd__ "Pd"
> > #define __INTPTR_FMTi__ "Pi"
> < #define __INTPTR_TYPE__ long int
> < #define __INTPTR_WIDTH__ 64
> > #define __INTPTR_TYPE__ __intcap
> > #define __INTPTR_WIDTH__ 128
> < #define __POINTER_WIDTH__ 64
> > #define __PIC__ 1
> > #define __POINTER_WIDTH__ 128
> < #define __SIZEOF_POINTER__ 8
> > #define __SIZEOF_POINTER__ 16
> < #define __UINTPTR_FMTX__ "lX"
> < #define __UINTPTR_FMTo__ "lo"
> < #define __UINTPTR_FMTu__ "lu"
> < #define __UINTPTR_FMTx__ "lx"
> > #define __UINTPTR_FMTX__ "PX"
> > #define __UINTPTR_FMTo__ "Po"
> > #define __UINTPTR_FMTu__ "Pu"
> > #define __UINTPTR_FMTx__ "Px"
> < #define __UINTPTR_TYPE__ long unsigned int
> < #define __UINTPTR_WIDTH__ 64
> > #define __UINTPTR_TYPE__ unsigned __intcap
> > #define __UINTPTR_WIDTH__ 128
> > #define __pic__ 1
>
> So, if I understood it correctly, in hybrid mode (2), programs (especially
> memory allocators) _can_ use <cheri.h> and its functions, but it's not
> necessary since the programs will also work without it?

You can, but they take in a void * __capability, which is not the same in hybrid as void * (your standard integer). You can cast between the two with (__cheri_tocap void * __capability)/(__cheri_fromcap void *), where the former uses DDC (the “default data capability”, null in purecap code but set to cover the whole address space by default for hybrid code) as the capability authority and the latter just extracts the address.

But that won’t help your allocator that’s void *(size_t); you’d have to make it be void * __capability(size_t) and then annotate every single place those pointers flow with __capability otherwise you’d go back to dealing with traditional addresses and use DDC’s bounds for all your loads and stores.

This need to annotate all uses (including any third-party library APIs you use with them), which ends up spreading throughout entire codebases, is why hybrid is awkward to use at scale and only really works in highly-constrained environments. For most userspace code there’s very little benefit to using it and a massive engineering effort needed to adopt it (though those who can drive Coccinelle can probably do fancy things to automate a lot of it, and you could imagine tools that iteratively propagate the annotations via static analysis).

So it’s not really a case of not being necessary, it’s a case of “are you using void * or using void * __capability everywhere to opt into capability use?” Unless that’s what you mean by necessary?

Jess
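
To make the macro distinction in this thread concrete, here is a minimal C sketch, not taken from any of the patches under discussion. It selects a code path with __CHERI_PURE_CAPABILITY__ first and only then falls back to __CHERI__, and it wraps the two hybrid casts mentioned above. The macros and the __cheri_tocap/__cheri_fromcap cast syntax come from the messages above; the names abi_kind, detect_abi, abi_name, to_cap and from_cap are hypothetical, and the hybrid-only branch assumes a CHERI-aware clang.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical names for illustration; not part of <cheri.h> or any library. */
enum abi_kind { ABI_TRADITIONAL, ABI_HYBRID, ABI_PURECAP };

/* Test the pure-capability macro first: __CHERI__ is also defined in
   purecap mode, so checking it alone cannot tell the two ABIs apart. */
enum abi_kind detect_abi(void)
{
#if defined(__CHERI_PURE_CAPABILITY__)
    return ABI_PURECAP;      /* every pointer is a capability */
#elif defined(__CHERI__)
    return ABI_HYBRID;       /* CHERI ISA available, pointers are addresses */
#else
    return ABI_TRADITIONAL;  /* no CHERI features at all */
#endif
}

const char *abi_name(enum abi_kind kind)
{
    switch (kind) {
    case ABI_TRADITIONAL: return "traditional";
    case ABI_HYBRID:      return "hybrid";
    case ABI_PURECAP:     return "pure-capability";
    }
    assert(0 && "unknown ABI kind");
    return "unknown";
}

#if defined(__CHERI__) && !defined(__CHERI_PURE_CAPABILITY__)
/* Hybrid only: build a capability from a plain address, with DDC as the
   authority. Assumes a CHERI-aware compiler; skipped everywhere else. */
void * __capability to_cap(void *p)
{
    return (__cheri_tocap void * __capability)p;
}

/* Hybrid only: drop back to the plain address, discarding the bounds. */
void *from_cap(void * __capability c)
{
    return (__cheri_fromcap void *)c;
}
#endif
```

Calling abi_name(detect_abi()) reports which branch was compiled in. On the three configurations listed in the quoted message one would expect "traditional" for (1), "hybrid" for (2) and "pure-capability" for (3), going by the predefined macros shown in the diffs.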