On 2017-12-08 14:51, Kevin Wolf wrote:
> On 08.12.2017 at 14:39, Max Reitz wrote:
>> On 2017-12-06 10:12, Kevin Wolf wrote:
>>> On 06.12.2017 at 08:28, Kangjie Xi wrote:
>>>> Hi,
>>>>
>>>> I encountered a qemu-nbd segfault, finally I found it was caused by
>>>> NULL bs->drv, which is located in block/io.c function bdrv_co_flush
>>>> line 2377:
>>>>
>>>> https://git.qemu.org/?p=qemu.git;a=blob;f=block/io.c;h=4fdf93a0144fa4761a14b8cc6b2a9a6b6e5d5bec;hb=d470ad42acfc73c45d3e8ed5311a491160b4c100#l2377
>>>>
>>>> It is before the patch at line 2402, so the patch needs to be updated
>>>> to fix NULL bs->drv at line 2377.
>>>>
>>>> https://lists.gnu.org/archive/html/qemu-devel/2017-11/msg03425.html
>>>
>>> Can you please post a full backtrace? Do you see any error message
>>> on stderr before the process crashes?
>>>
>>> I don't see at the moment how this can happen, except the case that Max
>>> mentioned where bs->drv = NULL is set when an image corruption is
>>> detected - this involves an error message, though.
>>>
>>> We check bdrv_is_inserted() as the first thing, which includes a NULL
>>> check for bs->drv. So it must have been non-NULL at the start of the
>>> function and then become NULL. I suppose this can theoretically happen
>>> in qemu_co_queue_wait() if another flush request detects image
>>> corruption.
>>>
>>> Max: I think bs->drv = NULL in the middle of a request was a stupid
>>> idea. In fact, it's already a stupid idea to have any BDS with
>>> bs->drv = NULL. Maybe it would be better to schedule a BH that replaces
>>> the qcow2 node with a dummy node (null-co?) and properly closes the
>>> qcow2 one.
>>
>> Yes, that is an idea John had, too. It sounded good to me (we'd just
>> need to add a new flag to null-co so it would respond with -ENOMEDIUM to
>> all requests or something)... The only issue I had is how that would
>> work together with the GRAPH_MOD op blocker.
>
> In order to answer this question, I'd first have to understand what
> GRAPH_MOD is even supposed to mean and which operations it needs to
> protect. There aren't currently any users of GRAPH_MOD.

That is exactly the reason why we could not come to a conclusion. :-)

Max
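
For anyone following the thread, the scenario Kevin describes (bs->drv is non-NULL when the flush starts, then gets cleared while the request is waiting) can be sketched in standalone C. Nothing below is QEMU code: FakeBDS, FakeDriver, fake_flush and the yield_hook callback are hypothetical stand-ins for a BDS, its driver, bdrv_co_flush and the wait in qemu_co_queue_wait(). Returning -ENOMEDIUM from the second check is only an assumption about what a fix might return, not what any actual patch does.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

#ifndef ENOMEDIUM
#define ENOMEDIUM 123   /* fallback for platforms whose errno.h lacks it */
#endif

/* Hypothetical stand-ins; these are NOT QEMU types. */
typedef struct FakeDriver {
    int (*flush)(void);
} FakeDriver;

typedef struct FakeBDS {
    FakeDriver *drv;                        /* plays the role of bs->drv */
    void (*yield_hook)(struct FakeBDS *bs); /* what runs while we "wait" */
} FakeBDS;

static int driver_flush(void)
{
    return 0;
}

static FakeDriver fake_driver = { driver_flush };

/* Simulates another request detecting corruption and clearing drv. */
static void corruption_detected(FakeBDS *bs)
{
    bs->drv = NULL;
}

/* Simulates a wait during which nothing touches the node. */
static void nothing_happens(FakeBDS *bs)
{
    (void)bs;
}

static int fake_flush(FakeBDS *bs)
{
    assert(bs != NULL);

    /* Entry check, analogous to the NULL check done first thing. */
    if (bs->drv == NULL) {
        return -ENOMEDIUM;
    }

    /* Yield point: other requests may run here and clear bs->drv. */
    bs->yield_hook(bs);

    /* Re-check after the wait; without it the call below would
     * dereference NULL whenever the hook cleared drv. */
    if (bs->drv == NULL) {
        return -ENOMEDIUM;
    }

    return bs->drv->flush();
}
```

With yield_hook set to nothing_happens the flush returns 0; with corruption_detected it returns -ENOMEDIUM instead of crashing. This only illustrates why a single check at function entry is not enough once the function can yield. It says nothing about the dummy-node replacement or the GRAPH_MOD question discussed above.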
