On Wed, Apr 13, 2016 at 11:43:56PM +0200, Marc-André Lureau wrote:
> On Wed, Apr 13, 2016 at 7:32 PM, Yuanhan Liu
> <[email protected]> wrote:
> >>
> >> > I'm asking because I sometimes hit a seg fault,
> >> > because opaque is NULL.
> >
> > Oh, I was wrong, it's u being NULL, but not opaque.
> >> >
> >>
> >> I would be interested to see the backtrace or have a reproducer.
> >
> > These are normal test steps: start a vhost-user switch (I'm using the
> > DPDK vhost-switch example), kill it, and wait for a while (10s or
> > more); then I saw a seg fault:
> >
> > (gdb) p dev
> > $4 = (struct vhost_dev *) 0x555556571bf0
> > (gdb) p u
> > $5 = (struct vhost_user *) 0x0
> > (gdb) where
> > #0 0x0000555555798612 in slave_read (opaque=0x555556571bf0)
> > at /home/yliu/qemu/hw/virtio/vhost-user.c:539
> > #1 0x0000555555a343a4 in aio_dispatch (ctx=0x55555655f560) at
> > /home/yliu/qemu/aio-posix.c:327
> > #2 0x0000555555a2738b in aio_ctx_dispatch (source=0x55555655f560,
> > callback=0x0, user_data=0x0)
> > at /home/yliu/qemu/async.c:233
> > #3 0x00007ffff51032a6 in g_main_context_dispatch () from
> > /lib64/libglib-2.0.so.0
> > #4 0x0000555555a3239e in glib_pollfds_poll () at
> > /home/yliu/qemu/main-loop.c:213
> > #5 0x0000555555a3247b in os_host_main_loop_wait (timeout=29875848) at
> > /home/yliu/qemu/main-loop.c:258
> > #6 0x0000555555a3252b in main_loop_wait (nonblocking=0) at
> > /home/yliu/qemu/main-loop.c:506
> > #7 0x0000555555846e35 in main_loop () at /home/yliu/qemu/vl.c:1934
> > #8 0x000055555584e6bf in main (argc=31, argv=0x7fffffffe078,
> > envp=0x7fffffffe178)
> > at /home/yliu/qemu/vl.c:4658
> >
>
> This patch set doesn't try to handle crashes from backend. This would
> require a much more detailed study of the existing code path. A lot of
> places assume the backend is fully working as expected. I think
> handling backend crashes should be a different, later, patch set.

Oh, sorry for not making that clear. I actually killed it with "ctrl-c".
The signal is then caught and a SLAVE_SHUTDOWN request is sent. So I would
say it's a normal quit.
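
To illustrate the failure mode in the backtrace above, here is a minimal
sketch of the kind of guard that would avoid the NULL dereference. It
assumes that u is taken from dev->opaque and that the read handler can
still fire after the backend state has been torn down. The structs and the
function name slave_read_guarded are hypothetical stand-ins, not the real
QEMU definitions, and this is not a proposed fix.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical stand-ins for illustration; not the real QEMU structs. */
struct vhost_user {
    int slave_fd;
};

struct vhost_dev {
    void *opaque; /* assumed to point to struct vhost_user while the backend is up */
};

/*
 * Hypothetical guarded handler: returns 0 when the event is handled,
 * -1 when the per-backend state is already gone (u == NULL).
 */
int slave_read_guarded(void *opaque)
{
    struct vhost_dev *dev = opaque;
    struct vhost_user *u;

    assert(dev != NULL);
    u = dev->opaque;

    if (u == NULL) {
        /* Backend state was cleaned up: ignore the stale event. */
        return -1;
    }

    printf("handling slave request on fd %d\n", u->slave_fd);
    return 0;
}
```

With dev->opaque set, the call returns 0; once it is cleared (the state
seen in the gdb session, where u is 0x0), it returns -1 instead of
crashing. Unregistering the fd handler at cleanup time would be the
alternative approach.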
--yliu