Cédric Le Goater <[email protected]> writes:
> On 1/5/24 19:04, Fabiano Rosas wrote:
>> The migration tests have support for being passed two QEMU binaries to
>> test migration compatibility.
>>
>> Add a CI job that builds the latest release of QEMU and another job
>> that uses that version plus an already-present build of the current
>> version and runs the migration tests with the two, both as source and
>> destination. I.e.:
>>
>> old QEMU (n-1) -> current QEMU (development tree)
>> current QEMU (development tree) -> old QEMU (n-1)
>>
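(In migration-test terms, as in the script section of the diff below,
that corresponds roughly to:

  # old -> new: previous release as migration source
  QTEST_QEMU_BINARY_SRC=../build-previous/qemu-system-x86_64 \
      QTEST_QEMU_BINARY=./qemu-system-x86_64 ./tests/qtest/migration-test

  # new -> old: previous release as migration destination
  QTEST_QEMU_BINARY_DST=../build-previous/qemu-system-x86_64 \
      QTEST_QEMU_BINARY=./qemu-system-x86_64 ./tests/qtest/migration-test

with x86_64 standing in for whichever target is being tested.)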
>> The purpose of this CI job is to ensure the code we're about to merge
>> will not cause a migration compatibility problem when migrating the
>> next release (which will contain that code) to/from the previous
>> release.
>>
>> I'm leaving the jobs as manual for now because using an older QEMU in
>> tests could hit bugs that have already been fixed in the current
>> development tree, and we need to handle those case by case.
>>
>> Note: for user forks, the version tags need to be pushed to gitlab,
>> otherwise the job won't be able to check out a different version.
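
(Concretely, that would be something like

  git push gitlab v8.2.0

assuming a remote called "gitlab" pointing at the fork, and with
v8.2.0 as an example of the previous-release tag.)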
>>
>> Signed-off-by: Fabiano Rosas <[email protected]>
>> ---
>> .gitlab-ci.d/buildtest.yml | 53 ++++++++++++++++++++++++++++++++++++++
>> 1 file changed, 53 insertions(+)
>>
>> diff --git a/.gitlab-ci.d/buildtest.yml b/.gitlab-ci.d/buildtest.yml
>> index 91663946de..81163a3f6a 100644
>> --- a/.gitlab-ci.d/buildtest.yml
>> +++ b/.gitlab-ci.d/buildtest.yml
>> @@ -167,6 +167,59 @@ build-system-centos:
>>         x86_64-softmmu rx-softmmu sh4-softmmu nios2-softmmu
>>      MAKE_CHECK_ARGS: check-build
>>  
>> +build-previous-qemu:
>> +  extends: .native_build_job_template
>> +  artifacts:
>> +    when: on_success
>> +    expire_in: 2 days
>> +    paths:
>> +      - build-previous
>> +    exclude:
>> +      - build-previous/**/*.p
>> +      - build-previous/**/*.a.p
>> +      - build-previous/**/*.fa.p
>> +      - build-previous/**/*.c.o
>> +      - build-previous/**/*.c.o.d
>> +      - build-previous/**/*.fa
>> +  needs:
>> +    job: amd64-opensuse-leap-container
>> +  variables:
>> +    QEMU_JOB_OPTIONAL: 1
>> +    IMAGE: opensuse-leap
>> +    TARGETS: x86_64-softmmu aarch64-softmmu
>> +  before_script:
>> +    - export QEMU_PREV_VERSION="$(sed 's/\([0-9.]*\)\.[0-9]*/v\1.0/' VERSION)"
>> +    - git checkout $QEMU_PREV_VERSION
>> +  after_script:
>> +    - mv build build-previous
>> +
>> +.migration-compat-common:
>> +  extends: .common_test_job_template
>> +  needs:
>> +    - job: build-previous-qemu
>> +    - job: build-system-opensuse
>> +  allow_failure: true
>> +  variables:
>> +    QEMU_JOB_OPTIONAL: 1
>> +    IMAGE: opensuse-leap
>> +    MAKE_CHECK_ARGS: check-build
>> +  script:
>> +    - cd build
>> +    - QTEST_QEMU_BINARY_SRC=../build-previous/qemu-system-${TARGET}
>> +        QTEST_QEMU_BINARY=./qemu-system-${TARGET} ./tests/qtest/migration-test
>> +    - QTEST_QEMU_BINARY_DST=../build-previous/qemu-system-${TARGET}
>> +        QTEST_QEMU_BINARY=./qemu-system-${TARGET} ./tests/qtest/migration-test
>> +
>> +migration-compat-aarch64:
>> +  extends: .migration-compat-common
>> +  variables:
>> +    TARGET: aarch64
>> +
>> +migration-compat-x86_64:
>> +  extends: .migration-compat-common
>> +  variables:
>> +    TARGET: x86_64
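
(Aside: the sed in before_script derives the previous release tag from
the development tree's VERSION file by replacing the micro version
with .0, e.g.:

  $ echo 8.2.50 | sed 's/\([0-9.]*\)\.[0-9]*/v\1.0/'
  v8.2.0

where 8.2.50 is an example of what VERSION contains during the
development cycle.)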
>
>
> What about the other archs, s390x and ppc? Do you lack the resources
> or are there any problems to address?
Currently s390x and ppc are only tested on KVM, which means they are
not tested at all unless someone runs migration-test on a custom
runner. The same is true for this test.

The TCG tests have been disabled:

  /*
   * On ppc64, the test only works with kvm-hv, but not with kvm-pr and TCG
   * is touchy due to race conditions on dirty bits (especially on PPC for
   * some reason)
   */

  /*
   * Similar to ppc64, s390x seems to be touchy with TCG, so disable it
   * there until the problems are resolved
   */

It would be great if we could figure out what these issues are and fix
them so we can at least test with TCG like we do for aarch64.
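
With those guards removed, a TCG run amounts to something like (using
ppc64 as the example, on a host without the respective KVM support):

  cd build
  QTEST_QEMU_BINARY=./qemu-system-ppc64 ./tests/qtest/migration-test

which is what the numbers below come from.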
Doing a TCG run of migration-test with both archs (one binary only, not
this series):

- ppc survived one run, taking 6 minutes longer than x86_64/aarch64;
- s390x survived one run, taking 40s less than x86_64/aarch64.

I'll leave them enabled on my machine and do some runs here and there
to see if I spot something. If nothing comes up, we can consider
re-enabling them once we figure out why ppc takes so long.