On 14/01/2013 09:16, Liu Yuan wrote:
> Hi List,
>   This problem can be reproduced by:
>   1. start a sheepdog cluster and create a volume 'test'*
>   2. attach 'test' to a bootable image like
>      $ qemu -hda image -drive if=virtio,file=sheepdog:test
>   3. pkill sheep # create a half-closed situation
>
> I have straced it that QEMU is busy doing nonsense read/write() after
> select() in os_host_main_loop_wait(). I have no knowledge of
> glib_select_xxx, so someone please help fix it.

read/write() is not done by os_host_main_loop_wait(). It must be done by
qemu_co_send()/qemu_co_recv() after the handler has reentered the coroutine.

> Another unexpected behavior is that qemu_co_send() will send data
> successfully for the half-closed situation, even the other end is
> completely down. I think the *expected* behavior is that we get notified
> by a HUP and close the affected sockfd, then qemu_co_send() will not
> send any data, then the caller of qemu_co_send() can handle error case.

qemu_co_send() should get an EPIPE or similar error. The first time it
will report a partial send, the second time it will report the error
directly to the caller.

Please check if this isn't a bug in the Sheepdog driver.

Paolo

> I don't know which one I should Cc, so I only include Stefan in.
>
> * You can easily start up a one node sheepdog cluster as following:
>  $ git clone https://github.com/collie/sheepdog.git
>  $ cd sheepdog
>  $ apt-get install liburcu-dev
>  $ ./autogen.sh; ./configure --disable-corosync;make
>  #start up a one node sheep cluster
>  $ mkdir store;./sheep/sheep store -c local
>  $ collie/collie cluster format -c 1
>  #create a volume named test
>  $ collie/collie vdi create test 1G
>
> Thanks,
> Yuan
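
For reference, the EPIPE behaviour described in the reply above can be
observed outside QEMU with plain TCP sockets. This is only a rough sketch,
not QEMU or Sheepdog code: it uses Python's standard socket module on
loopback, closing the accepted socket stands in for "pkill sheep", and the
function name send_until_error is made up for illustration. The exact errno
(EPIPE or ECONNRESET) and the number of sends that succeed first depend on
the OS and on timing, so the script just reports what it saw.

```python
import errno
import socket
import time


def send_until_error(max_attempts=50):
    """Send to a TCP peer that has already closed.

    Returns (number_of_successful_sends, errno_of_first_failure_or_None).
    Hypothetical helper for illustration; not part of QEMU.
    """
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.bind(("127.0.0.1", 0))  # port 0: let the kernel pick a free port
    listener.listen(1)

    client = socket.create_connection(listener.getsockname())
    server_side, _ = listener.accept()

    # Stand-in for "pkill sheep": the peer goes away, our end stays open.
    server_side.close()
    listener.close()

    ok_sends = 0
    err = None
    try:
        for _ in range(max_attempts):
            try:
                client.send(b"x" * 512)
                ok_sends += 1
            except OSError as e:
                # Python ignores SIGPIPE by default, so the failure
                # surfaces here as an exception carrying the errno.
                err = e.errno
                break
            time.sleep(0.05)  # give the peer's reset time to arrive
    finally:
        client.close()
    return ok_sends, err


if __name__ == "__main__":
    ok_sends, err = send_until_error()
    name = errno.errorcode.get(err, "none")
    print(f"{ok_sends} send(s) succeeded before the error: {name}")
```

A caller that treats the first successful send as proof that the peer is
alive will miss the failure; the error only shows up on a later send, which
is why the driver's error path needs to handle it at that point.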
