On 15.03.2025 01:36, Volodymyr Babchuk wrote:
> LibAFL, which is part of the AFL++ project, is a tool that allows
> us to fuzz bare-metal code (the Xen hypervisor in this case)
> using QEMU as an emulator. It uses QEMU's ability to create
> snapshots to run many tests relatively quickly: system state is saved
> right before executing a new test and restored after the test is
> finished.
>
> This patch adds all the plumbing necessary to run an aarch64 build of
> Xen inside the LibAFL-QEMU fuzzer. From the Xen perspective we need to
> do the following:
>
> 1. Be able to communicate with the LibAFL-QEMU fuzzer. This is done by
> executing special opcodes that only LibAFL-QEMU can handle.
>
> 2. Use the interface from item 1 to tell the fuzzer about the Xen code
> section, so the fuzzer knows which part of the code to track and
> gather coverage data for.
>
> 3. Report crashes to the fuzzer. This is done in the panic() function.
>
> 4. Prevent the test harness from shooting itself in the foot.
>
> Right now the test harness is an external component, because we want
> to test external Xen interfaces, but it is possible to fuzz internal
> code if we want to.
>
> The test harness is implemented as XTF-based test case(s). As the test
> harness can issue a hypercall that shuts itself down, the Kconfig option
> CONFIG_LIBAFL_QEMU_FUZZER_PASS_BLOCKING was added. It tells the
> fuzzer that the test completed successfully if Dom0 tries to shut
> itself (or the whole machine) down.
>
> Signed-off-by: Volodymyr Babchuk <[email protected]>
>
> ---
>
> I tried to fuzz the vGIC emulator and hypercall interface. While vGIC
> fuzzing didn't yield any interesting results, hypercall fuzzing found a
> way to crash the hypervisor from Dom0 on aarch64, using
> "XEN_SYSCTL_page_offline_op" with the "sysctl_query_page_offline" sub-op,
> because it leads to a page_is_ram_type() call, which is marked
> UNREACHABLE on ARM.
>
> In v2:
>
> - Moved to XTF-based test harness
> - Substantially reworked the fuzzer itself. It now has a user-friendly
> command-line interface and is capable of running in CI, as it now
> returns an appropriate error code if any faults were found
> - Also found, debugged and fixed a nasty bug in the LibAFL-QEMU fork,
> which crashed the whole fuzzer.
>
> Right now the fuzzer is located in the Xen Troops repo:
>
> https://github.com/xen-troops/xen-fuzzer-rs
>
> But I believe that it is ready to be included into
> gitlab.com/xen-project/
>
> The XTF-based harness is at
>
> https://gitlab.com/vlad.babchuk/xtf/-/tree/mr_libafl
>
> and there is a corresponding MR for including it into
>
> https://gitlab.com/xen-project/fusa/xtf/-/tree/xtf-arm
>
> So, to sum up: all components are basically ready for initial
> inclusion. There will be smaller, integration-related changes
> later. For example, we will need to update the URLs for various
> components after they are moved to their final locations.
> ---
> docs/hypervisor-guide/fuzzing.rst | 90 ++++++++++++
> xen/arch/arm/Kconfig.debug | 26 ++++
> xen/arch/arm/Makefile | 1 +
> xen/arch/arm/include/asm/libafl_qemu.h | 54 +++++++
> xen/arch/arm/include/asm/libafl_qemu_defs.h | 37 +++++
> xen/arch/arm/libafl_qemu.c | 152 ++++++++++++++++++++
> xen/arch/arm/psci.c | 13 ++
> xen/common/sched/core.c | 17 +++
> xen/common/shutdown.c | 7 +
> xen/drivers/char/console.c | 8 ++
> 10 files changed, 405 insertions(+)
> create mode 100644 docs/hypervisor-guide/fuzzing.rst
> create mode 100644 xen/arch/arm/include/asm/libafl_qemu.h
> create mode 100644 xen/arch/arm/include/asm/libafl_qemu_defs.h
> create mode 100644 xen/arch/arm/libafl_qemu.c

This looks to be about Arm only; it would be nice if that was visible
right from the subject.

Also, nit: New files' names are to use dashes in preference to underscores.

> --- a/xen/common/sched/core.c
> +++ b/xen/common/sched/core.c
> @@ -47,6 +47,10 @@
> #define pv_shim false
> #endif
>
> +#ifdef CONFIG_LIBAFL_QEMU_FUZZER
> +#include <asm/libafl_qemu.h>
> +#endif
> +
> /* opt_sched: scheduler - default to configured value */
> static char __initdata opt_sched[10] = CONFIG_SCHED_DEFAULT;
> string_param("sched", opt_sched);
> @@ -1452,6 +1456,10 @@ static long do_poll(const struct sched_poll *sched_poll)
> if ( !guest_handle_okay(sched_poll->ports, sched_poll->nr_ports) )
> return -EFAULT;
>
> +#ifdef CONFIG_LIBAFL_QEMU_FUZZER_PASS_BLOCKING
> + libafl_qemu_end(LIBAFL_QEMU_END_OK);
> +#endif
> +
> set_bit(_VPF_blocked, &v->pause_flags);
> v->poll_evtchn = -1;
> set_bit(v->vcpu_id, d->poll_mask);
> @@ -1904,12 +1912,18 @@ ret_t do_sched_op(int cmd, XEN_GUEST_HANDLE_PARAM(void) arg)
> {
> case SCHEDOP_yield:
> {
> +#ifdef CONFIG_LIBAFL_QEMU_FUZZER_PASS_BLOCKING
> + libafl_qemu_end(LIBAFL_QEMU_END_OK);
> +#endif
> ret = vcpu_yield();
> break;
> }
>
> case SCHEDOP_block:
> {
> +#ifdef CONFIG_LIBAFL_QEMU_FUZZER_PASS_BLOCKING
> + libafl_qemu_end(LIBAFL_QEMU_END_OK);
> +#endif
> vcpu_block_enable_events();
> break;
> }
> @@ -1924,6 +1938,9 @@ ret_t do_sched_op(int cmd, XEN_GUEST_HANDLE_PARAM(void) arg)
>
> TRACE_TIME(TRC_SCHED_SHUTDOWN, current->domain->domain_id,
> current->vcpu_id, sched_shutdown.reason);
> +#ifdef CONFIG_LIBAFL_QEMU_FUZZER_PASS_BLOCKING
> + libafl_qemu_end(LIBAFL_QEMU_END_OK);
> +#endif
> ret = domain_shutdown(current->domain, (u8)sched_shutdown.reason);
>
> break;

If I were a scheduler maintainer, I'd likely object to this kind of #ifdef-ary.

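A rough sketch of one way to keep such conditionals out of the call sites,
assuming a hypothetical helper named fuzzer_pass() (not part of the patch)
that compiles to nothing when the option is off. The enumerator values, the
libafl_qemu_end() stub and demo_yield() are stand-ins so that the fragment
builds outside the Xen tree; they are not the patch's definitions:

```c
/* Stand-ins for what <asm/libafl_qemu.h> provides; the values are made up. */
enum { LIBAFL_QEMU_END_OK, LIBAFL_QEMU_END_CRASH };

static int last_end_status = -1; /* records the last reported status */

static void libafl_qemu_end(int status)
{
    last_end_status = status;
}

/* Normally set by Kconfig; defined here so the sketch is deterministic. */
#define CONFIG_LIBAFL_QEMU_FUZZER_PASS_BLOCKING 1

/* Hypothetical helper: the only place that needs the conditional. */
#ifdef CONFIG_LIBAFL_QEMU_FUZZER_PASS_BLOCKING
static inline void fuzzer_pass(void)
{
    libafl_qemu_end(LIBAFL_QEMU_END_OK);
}
#else
static inline void fuzzer_pass(void) {}
#endif

/* Stand-in for a call site such as the SCHEDOP_yield case. */
static int demo_yield(void)
{
    fuzzer_pass();
    return 0; /* the real code would return vcpu_yield() */
}
```

With something along these lines, do_poll(), do_sched_op() and
hwdom_shutdown() would each carry a single unconditional call.
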
> --- a/xen/common/shutdown.c
> +++ b/xen/common/shutdown.c
> @@ -11,6 +11,10 @@
> #include <xen/kexec.h>
> #include <public/sched.h>
>
> +#ifdef CONFIG_LIBAFL_QEMU_FUZZER
> +#include <asm/libafl_qemu.h>
> +#endif
> +
> /* opt_noreboot: If true, machine will need manual reset on error. */
> bool __ro_after_init opt_noreboot;
> boolean_param("noreboot", opt_noreboot);
> @@ -32,6 +36,9 @@ static void noreturn reboot_or_halt(void)
>
> void hwdom_shutdown(unsigned char reason)
> {
> +#ifdef CONFIG_LIBAFL_QEMU_FUZZER_PASS_BLOCKING
> + libafl_qemu_end(LIBAFL_QEMU_END_OK);
> +#endif
> switch ( reason )
> {
> case SHUTDOWN_poweroff:

It's not as bad here and ...

> --- a/xen/drivers/char/console.c
> +++ b/xen/drivers/char/console.c
> @@ -40,6 +40,9 @@
> #ifdef CONFIG_SBSA_VUART_CONSOLE
> #include <asm/vpl011.h>
> #endif
> +#ifdef CONFIG_LIBAFL_QEMU_FUZZER
> +#include <asm/libafl_qemu.h>
> +#endif
>
> /* console: comma-separated list of console outputs. */
> static char __initdata opt_console[30] = OPT_CONSOLE_STR;
> @@ -1289,6 +1292,11 @@ void panic(const char *fmt, ...)
>
> kexec_crash(CRASHREASON_PANIC);
>
> + #ifdef CONFIG_LIBAFL_QEMU_FUZZER
> + /* Tell the fuzzer that we crashed */
> + libafl_qemu_end(LIBAFL_QEMU_END_CRASH);
> + #endif

... here, but still.

Also, pre-processor directives want their # to live at the beginning of the
line.

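For illustration, a minimal sketch of that layout: the directives start in
column 0 while the guarded statement keeps the surrounding indentation.
panic_tail(), the enumerator values and the libafl_qemu_end() stub are
hypothetical stand-ins so that the fragment builds outside the Xen tree:

```c
/* Stand-ins so the fragment builds outside the Xen tree; values are made up. */
enum { LIBAFL_QEMU_END_OK, LIBAFL_QEMU_END_CRASH };
#define CONFIG_LIBAFL_QEMU_FUZZER 1

static int reported = -1; /* records the last reported status */

static void libafl_qemu_end(int status)
{
    reported = status;
}

/* Hypothetical stand-in for the tail of panic(). */
static void panic_tail(void)
{
    /* kexec_crash(CRASHREASON_PANIC) would precede this in the real code. */

#ifdef CONFIG_LIBAFL_QEMU_FUZZER
    /* Tell the fuzzer that we crashed */
    libafl_qemu_end(LIBAFL_QEMU_END_CRASH);
#endif
}
```
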
Jan