On Wed, Oct 13, 2021 at 8:51 AM John Snow <[email protected]> wrote:

>
>
> On Wed, Oct 13, 2021 at 4:45 AM Hanna Reitz <[email protected]> wrote:
>
>> On 13.10.21 00:34, John Snow wrote:
>> > Based-on: <[email protected]>
>> >            [PULL 00/10] Python patches
>> > GitLab:
>> https://gitlab.com/jsnow/qemu/-/commits/python-aqmp-iotest-wrapper
>> > CI: https://gitlab.com/jsnow/qemu/-/pipelines/387210591
>> >
>> > Hiya,
>> >
>> > This series continues where the last two AQMP series left off and adds a
>> > synchronous 'legacy' wrapper around the new AQMP interface, then drops
>> > it straight into iotests to prove that AQMP is functional and totally
>> > cool and fine. The disruption and churn to iotests is pretty minimal.
>> >
>> > In the event that a regression happens and I am not physically proximate
>> > to inflict damage upon, one may set the QEMU_PYTHON_LEGACY_QMP variable
>> > to any non-empty string as it pleases you to engage the QMP machinery
>> > you are used to.
>> >
>> > I'd like to try and get this committed early in the 6.2 development
>> > cycle to give ample time to smooth over any possible regressions. I've
>> > tested it locally and via gitlab CI, across Python versions 3.6 through
>> > 3.10, and "worksforme". If something bad happens, we can revert the
>> > actual switch-flip very trivially.
>>
>> So running iotests locally, I got one failure:
>>
>> $ TEST_DIR=/tmp/vdi-tests ./check -c writethrough -vdi 300
>> [...]
>> 300                             fail       [10:28:06] [10:28:11]
>> 5.1s                 output mismatch (see 300.out.bad)
>> --- /home/maxx/projects/qemu/tests/qemu-iotests/300.out
>> +++ 300.out.bad
>> @@ -1,4 +1,5 @@
>> -.......................................
>> +..............ERROR:qemu.aqmp.qmp_client.qemu-b-222963:Task.Reader:
>> ConnectionResetError: [Errno 104] Connection reset by peer
>> +.........................
>>   ----------------------------------------------------------------------
>>   Ran 39 tests
>> [...]
>>
>>
> Oh, unfortunate.
>
>
>>
>> I’m afraid I can’t really give a reproducer or anything.  It feels like
>>
>
> Thank you for the report!
>
>
>> just some random spurious timing-related error.  Although then again,
>> 300 does have an `except machine.AbnormalShutdown` clause at one
>> point...  So perhaps that’s the culprit, and we need to disable logging
>> there.
>>
>>
> I'll investigate!
>

Unfortunately, even in a loop some 150 times I couldn't reproduce this one.
As you point out, it appears to be just a failure caused by logging. The
test logic itself completes as expected.

Still, I would expect, on a "clean" shutdown of the destination host (where
the destination process fails to load the migration stream and voluntarily
exits with an error code) to end with a FIN/ACK for TCP or ... uh, whatever
happens for a UNIX socket. Where's the Connection Reset coming from? Did
the destination VM process *crash*?

I'm not so sure that I *should* silence this error, but I also can't
reproduce it at all to answer these questions, so uh. uhhh. I guess I will
just hammer it on a loop a few hundred times more and see if I get lucky.

Reply via email to