On Mon, 18 May 2020 09:38:24 +0000 "Zhang, Chen" <[email protected]> wrote:
> > -----Original Message-----
> > From: Lukas Straub <[email protected]>
> > Sent: Monday, May 11, 2020 8:27 PM
> > To: qemu-devel <[email protected]>
> > Cc: Alberto Garcia <[email protected]>; Dr. David Alan Gilbert
> > <[email protected]>; Zhang, Chen <[email protected]>
> > Subject: [PATCH 0/5] colo: Introduce resource agent and test suite/CI
> >
> > Hello Everyone,
> > These patches introduce a resource agent for fully automatic management of
> > colo and a test suite building upon the resource agent to extensively test
> > colo.
> >
> > Test suite features:
> > -Tests failover with peer crashing and hanging and failover during
> >  checkpoint
> > -Tests network using ssh and iperf3
> > -Quick test requires no special configuration
> > -Network test for testing colo-compare
> > -Stress test: failover all the time with network load
> >
> > Resource agent features:
> > -Fully automatic management of colo
> > -Handles many failures: hanging/crashing qemu, replication error, disk
> >  error, ...
> > -Recovers from hanging qemu by using the "yank" oob command
> > -Tracks which node has up-to-date data
> > -Works well in clusters with more than 2 nodes
> >
> > Run times on my laptop:
> > Quick test: 200s
> > Network test: 800s (tagged as slow)
> > Stress test: 1300s (tagged as slow)
> >
> > The test suite needs access to a network bridge to properly test the
> > network, so some parameters need to be given to the test run. See
> > tests/acceptance/colo.py for more information.
> >
> > I wonder how this integrates in existing CI infrastructure. Is there a
> > common CI for qemu where this can run or does every subsystem have to run
> > their own CI?
>
> Wow~ Very happy to see this series.
> I have checked the "how to" in tests/acceptance/colo.py,
> But it looks not enough for users, can you write an independent document for
> this series?
> Include test Infrastructure ASC II diagram, test cases design , detailed how
> to and more information for pacemaker cluster and resource agent..etc ?

Hi,
I quickly created a more complete howto for configuring a pacemaker cluster
and using the resource agent, I hope it helps:
https://wiki.qemu.org/Features/COLO/Managed_HOWTO

Regards,
Lukas Straub

> Thanks
> Zhang Chen
>
> >
> > Regards,
> > Lukas Straub
> >
> >
> > Lukas Straub (5):
> >   block/quorum.c: stable children names
> >   colo: Introduce resource agent
> >   colo: Introduce high-level test suite
> >   configure,Makefile: Install colo resource-agent
> >   MAINTAINERS: Add myself as maintainer for COLO resource agent
> >
> >  MAINTAINERS                              |    6 +
> >  Makefile                                 |    5 +
> >  block/quorum.c                           |   20 +-
> >  configure                                |   10 +
> >  scripts/colo-resource-agent/colo         | 1429 ++++++++++++++++++++++
> >  scripts/colo-resource-agent/crm_master   |   44 +
> >  scripts/colo-resource-agent/crm_resource |   12 +
> >  tests/acceptance/colo.py                 |  689 +++++++++++
> >  8 files changed, 2209 insertions(+), 6 deletions(-)
> >  create mode 100755 scripts/colo-resource-agent/colo
> >  create mode 100755 scripts/colo-resource-agent/crm_master
> >  create mode 100755 scripts/colo-resource-agent/crm_resource
> >  create mode 100644 tests/acceptance/colo.py
> >
> > --
> > 2.20.1
