[asan/hwasan co-author here, with clearly biased opinions]

On Android, HWASAN is already a fully usable testing tool. We apply it to the kernel, user space system libraries, and select apps. A phone with an HWASAN-ified system is fully usable (I have carried one as my primary device since March 2019). HWASAN has discovered over 120 bugs by now (heap-use-after-free, heap/stack buffer overflows, stack-use-after-return, double free). Many of the bugs were discovered during everyday use (as opposed to testing in the lab). The overhead is low enough that on a top-tier CPU the user will rarely notice any slowdown (the increased battery drain *is* noticeable - compiler instrumentation is not a substitute for hardware). HWASAN has also helped discover 4 instances of future incompatibility with MTE, all fixed.

The main benefit of HWASAN over ASAN is, as Matthew correctly explains, the memory usage. On embedded devices, this is often the difference between "can't deploy" and "can deploy" because, unlike in server land, you can't install more RAM. The other, more subtle benefit is that HWASAN is more sensitive to some types of bugs, such as buffer-overflow-far-from-bounds or use-after-long-ago-free.

MTE hardware is years away. Even once we have it in major new devices, many smaller devices will still be running on Arm v8 for a decade or two.

As with ASAN/TSAN/UBSAN, having this sanitizer implemented in GCC will vastly extend its user base and applicability and thus contribute to the overall code quality and security.

Whether HWASAN should intercept libc functions or libc itself should support HWASAN... My strong opinion is that today the interception approach can only be seen as a way to prototype. ASAN, implemented in 2011, had to use interception because we needed to get a new idea working fast. However, over these 9 years, interception has caused an enormous amount of complexity and user dissatisfaction. The Android implementation of HWASAN (with hooks in the Bionic libc and no interceptors) is many times simpler, more robust, and more complete. We need to do the same for other libcs eventually, but we don't have to do it immediately.

--kcc

On Wed, Jan 8, 2020 at 3:26 AM Matthew Malcomson <matthew.malcom...@arm.com> wrote:
>
> Hi everyone,
>
> I'm writing this email to summarise & publicise the state of this patch
> series, especially the difficulties around approval for GCC 10 mentioned
> on IRC.
>
> The main obstacle seems to be that no maintainer feels they have enough
> knowledge about hwasan and justification that it's worthwhile to approve
> the patch series.
>
> Similarly, Martin has given a review of the parts of the code he can
> (thanks!), but doesn't feel he can do a deep review of the code related
> to the RTL hooks and stack expansion -- hence that part is as yet not
> reviewed in-depth.
>
> The questions around justification raised on IRC are mainly that it
> seems like a proof-of-concept for MTE rather than a stand-alone useable
> sanitizer. Especially since in the GNU world hwasan instrumented code
> is not really ready for production since we can only use the
> less-"interceptor ABI" rather than the "platform ABI". This restriction
> is because there is no version of glibc with the required modifications
> to provide the "platform ABI".
>
> (n.b. that since https://reviews.llvm.org/D69574 the code-generation for
> these ABIs is the same).
>
> From my perspective the reasons that make HWASAN useful in itself are:
>
> 1) Much less memory usage.
>
> From a back-of-the-envelope calculation based on the hwasan paper's
> table of memory overhead from over-alignment
> https://arxiv.org/pdf/1802.09517.pdf I guess hwasan instrumented code
> has an overhead of about 1.1x (~4% from overalignment and ~6.25% from
> shadow memory), while asan seems to have an overhead somewhere in the
> range 1.5x - 3x.
>
> Maybe there's some data out there comparing total overheads that I
> haven't found? (I'd appreciate a reference if anyone has that info).
>
> 2) Available on more architectures than MTE.
>
> HWASAN only requires TBI, which is a feature of all AArch64 machines,
> while MTE will be an optional extension and only available on certain
> architectures.
>
> 3) This enables using hwasan in the kernel.
>
> While instrumented user-space applications will be using the
> "interceptor ABI" and hence are likely not production-quality, the
> biggest aim of implementing hwasan in GCC is to allow building the Linux
> kernel with tag-based sanitization using GCC.
>
> Instrumented kernel code uses hooks in the kernel itself, so this ABI
> distinction is no longer relevant, and this sanitizer should produce a
> production-quality kernel binary.
>
> I'm hoping I can find a maintainer willing to review and ACK this patch
> series -- especially with stage3 coming to a close soon. If there's
> anything else I could do to help get someone willing up-to-speed then
> please just ask.
>
> Cheers,
> Matthew
>
> On 07/01/2020 15:14, Martin Liška wrote:
> > On 12/12/19 4:18 PM, Matthew Malcomson wrote:
> >
> > Hello.
> >
> > I've just sent a few comments that are related to the v3 of the patch set.
> > Based on my (limited) HWASAN knowledge the patch seems reasonable to me.
> > I haven't looked much at the newly introduced RTL-hooks.
> > But these seem to me isolated to the aarch64 port.
> >
> > I can also verify that the patchset works on my aarch64 linux machine and
> > hwasan.exp and asan.exp tests succeed.
> >
> >> I haven't gotten ASAN_MARK to print as HWASAN_MARK when using memory
> >> tagging, since I'm not sure the way I found to implement this would be
> >> acceptable. The inlined patch below works but it requires a special
> >> declaration instead of just an ~#include~.
> >
> > Knowing that, I would not bother with the printing of HWASAN_MARK.
> >
> > Thanks for the series,
> > Martin
>
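
For readers new to tag-based checking, here is a rough sketch of the idea behind the memory figures and the "far-from-bounds" remark in this thread. It is a toy simulation only, not the HWASAN runtime or the GCC instrumentation: every name in it (`make_ptr`, `tag_region`, `check_access`, `estimated_overhead`, the 4 KiB toy heap) is hypothetical. It assumes 16-byte granules with one shadow byte each, which is where the ~6.25% (1/16) shadow figure comes from, and it takes the ~4% over-alignment estimate from the email as an input.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed toy parameters: 16-byte granules, a 4 KiB simulated heap. */
enum { GRANULE = 16, HEAP_BYTES = 4096 };

/* One shadow byte per granule, i.e. 1/16 = 6.25% extra memory. */
static uint8_t shadow[HEAP_BYTES / GRANULE];

/* A toy "pointer": an offset into the simulated heap with a tag stored in
   bits 56-63, the byte that TBI lets real AArch64 pointers carry. */
static uint64_t make_ptr(uint64_t offset, uint8_t tag) {
    return ((uint64_t)tag << 56) | offset;
}

/* Give every granule covering [offset, offset + size) the same tag. */
static void tag_region(uint64_t offset, size_t size, uint8_t tag) {
    assert(size > 0 && offset % GRANULE == 0 && offset + size <= HEAP_BYTES);
    for (uint64_t g = offset / GRANULE; g <= (offset + size - 1) / GRANULE; g++)
        shadow[g] = tag;
}

/* Return 1 if the pointer's tag matches the tag of the granule it hits. */
static int check_access(uint64_t ptr) {
    uint8_t ptr_tag = (uint8_t)(ptr >> 56);
    uint64_t offset = ptr & 0x00FFFFFFFFFFFFFFull;
    assert(offset < HEAP_BYTES);
    return shadow[offset / GRANULE] == ptr_tag;
}

/* Back-of-the-envelope total: base + over-alignment + shadow memory. */
static double estimated_overhead(double overalign_fraction) {
    return 1.0 + overalign_fraction + 1.0 / GRANULE;
}
```

In this toy model, tagging two 64-byte objects with different tags and then dereferencing a pointer to the first one 1024 bytes past its start fails the check, even though it lands inside another live object. A redzone-only scheme would miss that access once it jumps past the redzone. With random 8-bit tags on real allocations the detection is probabilistic rather than guaranteed, since two objects can receive the same tag. `estimated_overhead(0.04)` evaluates to 1.1025, matching the "about 1.1x" estimate quoted above.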